In this article, Data Handling using Pandas-I, you will learn about Series, one of the data structures of the Python pandas library.
Introduction to Python Libraries
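A Python library is a collection of modules that provides ready-made functionality for tasks such as system-related operations, I/O operations, and data analysis. Some libraries ship with Python as part of the standard library, while others, such as pandas, are installed separately. The pandas library is used for data analysis.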
Introduction to Data handling using Pandas-I
Important points for pandas:
- The name pandas is derived from "panel data".
- It has become a popular library for data analysis.
- It offers highly optimized performance, with the critical parts of its back-end source code written in C or Cython.
- It makes data analysis simple and easy.
Pandas offers two basic data structures:
- Series
- DataFrame
To work with pandas, import the library using one of these statements:
import pandas
import pandas as pd
from pandas import Series, DataFrame
Data handling using Pandas-I Series
Series is an important data structure of pandas. It represents a one-dimensional labelled array whose values can be of any NumPy data type. A series has two main components:
- An Array (Values)
- An index associated with an array (labels)
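The two components can be inspected separately. The following is a minimal sketch; the student names and marks are made-up data used only for illustration.

```python
import pandas as pd

# Hypothetical data: marks of three students, labelled by name.
marks = pd.Series([78, 92, 65], index=['Asha', 'Ravi', 'Meena'])

print(marks.values)  # the array of data
print(marks.index)   # the labels associated with the data
```

Each value is paired with the label at the same position, so 78 belongs to 'Asha', 92 to 'Ravi', and 65 to 'Meena'.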
Task 1 Creating Series
The Series() function from the pandas module is used to create a series.
Example:
import pandas as pd
ser1=pd.Series()
In pandas versions before 2.0, an empty series created this way has the float64 data type; from pandas 2.0 onward it has the object data type.
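Because the default data type of an empty series has differed between pandas versions, one option is to state the data type explicitly. This is a small sketch of that approach; the variable name empty is chosen only for illustration.

```python
import pandas as pd

# Passing dtype explicitly keeps the result the same across pandas versions.
empty = pd.Series(dtype='float64')

print(empty.dtype)  # float64
print(empty.empty)  # True, because the series has no elements
```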
Creating non-empty series
For a non-empty series, the data and, optionally, the index are supplied when the series is created. The data can be any one of these:
- A python sequence
- An ndarray
- A dictionary
- A scalar value
- Equi-space Elements
- Repeated list
Creating series with a python sequence (Data handling using Pandas-I)
The range() function can be used to supply a sequence of integers as the data for a series.
import pandas as pd
s=pd.Series(range(5))
print(s)
A series can also be created from a list of float numbers. Observe this code:
import pandas as pd
s=pd.Series([3.5,6.5,7.5,4.5,8.0])
print(s)
A series can also be created using a list of characters. Observe this series of vowels.
import pandas as pd
s=pd.Series(['a','e','i','o','u'])
print(s)
A series can also be created using a list of words/names. Let’s have a look at this code:
import pandas as pd
s=pd.Series(['alpha','beta','gamma'])
print(s)
Creating Series with ndarray (Data handling using Pandas-I)
Here a series is created from an ndarray named nda. The array holds the odd numbers between 1 and 10 and is created through the arange() function.
import pandas as pd
import numpy as np
nda=np.arange(1,10,2)
s=pd.Series(nda)
print(s)
The arange() function is also useful for creating a series with decimal step values such as 0.5, 0.8, etc. Observe this code:
import pandas as pd
import numpy as np
s=pd.Series(np.arange(0.5,5.5,0.5))
print(s)
Creating a series with a dictionary
Here a series is created from a dictionary object that stores the first three days of the week. When a series is created using a dictionary, the dictionary keys become the index of the series by default.
import pandas as pd
d={'Monday':1,'Tuesday':2,'Wednesday':3}
s=pd.Series(d)
print(s)
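When an index is passed along with a dictionary, pandas looks up each index label among the dictionary keys. The sketch below illustrates this; the label 'Sunday' is deliberately absent from the dictionary to show what happens to a missing key.

```python
import pandas as pd

d = {'Monday': 1, 'Tuesday': 2, 'Wednesday': 3}

# The index selects and orders the keys; 'Tuesday' is left out on purpose.
s = pd.Series(d, index=['Wednesday', 'Monday', 'Sunday'])

# 'Sunday' is not a key in d, so its value is NaN.
print(s)
```

Since NaN is a float value, the resulting series has a float data type even though the dictionary values are integers.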
Creating a series with a scalar value
Here a series is created with the scalar value 5. To repeat a scalar value across several elements, specify an index; the value is then repeated once for each index label.
import pandas as pd
s=pd.Series(5,index=range(1,5))
print(s)
Task 2 Specifying NaN values in the series
If the data value for a series element is unknown, pandas stores it as NaN (Not a Number). There are two ways to specify NaN in the sequence: np.nan and None. In the code below, NaN is specified at index 1 and index 3.
import pandas as pd
import numpy as np
#Method 1: np.nan (the np.NaN alias was removed in NumPy 2.0)
s1=pd.Series([1,np.nan,4,np.nan,8,9])
print(s1)
#Method 2: None
s2=pd.Series([1,None,4,None,8,9])
print(s2)
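Once a series contains NaN values, they can be located and counted. The following sketch uses the isna() method and the hasnans attribute for this; it mixes both ways of writing a missing value in one list.

```python
import pandas as pd
import numpy as np

s = pd.Series([1, np.nan, 4, None, 8, 9])

print(s.dtype)         # float64, because NaN is a float value
print(s.isna())        # True wherever the value is missing
print(s.isna().sum())  # number of missing values
print(s.hasnans)       # True if at least one value is missing
```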
Task 3 Creating series and specifying index
In the example below, two lists are created for train numbers and train names. The train number list is assigned as the data and the train name list as the index.
import pandas as pd
tr_no=[19708,14708,19115,14155]
tr_name=['Aravali Express','Ranakpur Express','Sayaji Nagari Express','Kutch Express']
s=pd.Series(data=tr_no,index=tr_name)
print(s)
Task 4 Creating series using arithmetic operation
In this example, a series is created with l * 3 as the data, where l is an array that also serves as the index.
import pandas as pd
import numpy as np
l=np.arange(25,50,5)
s=pd.Series(index=l,data=l*3)
print(s)
Task 5 Creating series with equispaced elements
To create a series with equispaced elements, the linspace() function of the numpy module is used. Let’s have a look:
import pandas as pd
import numpy as np
s=pd.Series(np.linspace(31,91,5))
print(s)
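The difference between linspace() and arange() is easy to miss: linspace() takes the number of elements and includes the end value, while arange() takes the step and excludes the end value. A short sketch comparing the two, with variable names chosen only for illustration:

```python
import numpy as np
import pandas as pd

# linspace: 5 equally spaced values from 31 to 91, with 91 included.
by_count = pd.Series(np.linspace(31, 91, 5))

# arange: values from 31 in steps of 15, stopping before 91.
by_step = pd.Series(np.arange(31, 91, 15))

print(by_count)  # 31.0, 46.0, 61.0, 76.0, 91.0
print(by_step)   # 31, 46, 61, 76
```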
Task 6 Creating series using the repeated list
A list can be replicated or repeated to create a series. This can be done using two methods:
- Using the tile() function of the numpy module
- Using the replication operator (*) on a list
Just have a look at the following code:
Method 1
import pandas as pd
import numpy as np
s=pd.Series(np.tile([33,44,55],3))
print(s)
Method 2
import pandas as pd
l=[33,44,55]
s=pd.Series(l*3)
print(s)
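Two look-alike operations are worth telling apart from tile(): numpy's repeat() function repeats each element in turn, and the * operator on an ndarray (unlike on a list) multiplies every element. The sketch below shows all three side by side; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = [33, 44, 55]

tiled = pd.Series(np.tile(values, 2))      # 33, 44, 55, 33, 44, 55
repeated = pd.Series(np.repeat(values, 2)) # 33, 33, 44, 44, 55, 55
scaled = pd.Series(np.array(values) * 2)   # 66, 88, 110 (multiplication)

print(tiled)
print(repeated)
print(scaled)
```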
Data handling using Pandas-I Common Series attributes
Attribute | Description |
Series.index | Returns the index (labels) of the series |
Series.values | Returns the series data as an ndarray |
Series.dtype | Returns the data type of the series |
Series.shape | Returns the shape of the series as a tuple (number of elements,) |
Series.nbytes | Returns the number of bytes occupied by the data |
Series.ndim | Returns the number of dimensions (always 1 for a series) |
Series.size | Returns the number of elements |
Series.hasnans | Returns True if there are any NaN values, else False |
Series.empty | Returns True if the series is empty, else False |
Common series attribute Example
import pandas as pd
d={'Jan':133,'Feb':145,'Mar':165,'Apr':126,'May':176}
s=pd.Series(d)
print("Index:",s.index)
print("Values:",s.values)
print("Shape:",s.shape)
print("Bytes:",s.nbytes)
print("Dimension:",s.ndim)
print("Size:",s.size)
print("Contains NaN Items?:",s.hasnans)
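The dtype and empty attributes from the table can be checked in the same way. In this sketch the data type is set to int64 explicitly, an assumption made so that the byte count is predictable.

```python
import pandas as pd

# dtype is given explicitly so that each element occupies 8 bytes.
s = pd.Series({'Jan': 133, 'Feb': 145, 'Mar': 165}, dtype='int64')

print("Data type:", s.dtype)
print("Is empty?:", s.empty)
print("Bytes:", s.nbytes)  # 3 elements x 8 bytes = 24
```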
Accessing elements from series
Elements of a series can be accessed by their position. Because this series has string labels, the positional index is used through iloc; writing a plain integer such as s[2] on a string-labelled series is deprecated since pandas 2.1.
Observe the following code, which accesses the 3rd and the last element using a positional index.
import pandas as pd
d={'Jan':133,'Feb':145,'Mar':165,'Apr':126,'May':176}
s=pd.Series(d)
print("Element 3:",s.iloc[2])
print("Last Element:",s.iloc[-1])
The series can also be accessed using its label (index). Observe this code:
import pandas as pd
d={'Jan':133,'Feb':145,'Mar':165,'Apr':126,'May':176}
s=pd.Series(d)
print("January:",s['Jan'])
print("April:",s['Apr'])
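When the labels are themselves integers, position and label can be confused. The loc and iloc accessors make the intent explicit: loc works with labels and iloc with positions. A minimal sketch, using made-up integer labels 10, 20 and 30:

```python
import pandas as pd

s = pd.Series([133, 145, 165], index=[10, 20, 30])

print(s.loc[20])   # by label: 145
print(s.iloc[0])   # by position: 133
print(s.iloc[-1])  # last position: 165
```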
Modifying series elements
An element of a series can be modified by assigning a new value to its index. In pandas, series objects are value-mutable, i.e. the values can be changed, but size-immutable, i.e. the size can’t be changed.
Example:
import pandas as pd
s=pd.Series([34,56,78,21,90])
s[3]=23
print(s)
Slicing in Series (Data handling using Pandas-I)
Slicing is also one of the methods to select/access or modify data from a series. Observe this code:
import pandas as pd
s=pd.Series([34,56,78,21,90])
print(s[1:])
print(s[2:5])
print(s[0::2])
s[0:4]=[23,32,87,19]
print(s)
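Slicing behaves differently for positions and labels: a positional slice excludes the stop position, while a label slice includes the stop label. The sketch below shows both on a small series with month labels, written with iloc and loc to keep the two cases apart.

```python
import pandas as pd

s = pd.Series({'Jan': 133, 'Feb': 145, 'Mar': 165, 'Apr': 126})

by_position = s.iloc[0:2]      # Jan, Feb (position 2 is excluded)
by_label = s.loc['Jan':'Mar']  # Jan, Feb, Mar ('Mar' is included)

print(by_position)
print(by_label)
```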
head() and tail() function in series (Data handling using Pandas-I)
The head() function returns the first n elements of the series. If no value is passed as the parameter, it returns 5 elements from the top by default. Similarly, the tail() function returns n elements from the bottom.
Observe this code and the output will return the first 5 elements from series:
import pandas as pd
s=pd.Series([34,56,78,21,90,88,58,95,97])
print(s.head())
The following code will return the last 5 elements from a series:
import pandas as pd
s=pd.Series([34,56,78,21,90,88,58,95,97])
print(s.tail())
The following code displays n elements from the top and from the bottom, with n taken as 3.
import pandas as pd
s=pd.Series([34,56,78,21,90,88,58,95,97])
print(s.head(3))
import pandas as pd
s=pd.Series([34,56,78,21,90,88,58,95,97])
print(s.tail(3))
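Both functions also accept a negative n, which is less well known: head(-n) returns everything except the last n elements, and tail(-n) returns everything except the first n elements. A short sketch on a five-element series:

```python
import pandas as pd

s = pd.Series([34, 56, 78, 21, 90])

all_but_last_two = s.head(-2)   # 34, 56, 78
all_but_first_two = s.tail(-2)  # 78, 21, 90

print(all_but_last_two)
print(all_but_first_two)
```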
Vector and arithmetic operations on series
Arithmetic and comparison operators work on a whole series at once (vector operations), without an explicit loop. A comparison returns a series of True/False values, which can be used inside s[...] to display only the matching values.
Observe this code:
import pandas as pd
s=pd.Series([34,56,78,21,90])
print("Add 5 to each value:")
print(s+5)
print("Display values more than 80:")
print(s[s>80])
print("Display values less than 70:")
print(s[s<70])
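Arithmetic between two series matches elements by index label rather than by position. The following sketch, with made-up monthly figures, shows that labels present in only one series produce NaN, and that the add() method with fill_value can be used when a missing entry should count as 0.

```python
import pandas as pd

q1 = pd.Series({'Jan': 10, 'Feb': 20, 'Mar': 30})
q2 = pd.Series({'Feb': 5, 'Mar': 5, 'Apr': 5})

total = q1 + q2                    # Jan and Apr become NaN
filled = q1.add(q2, fill_value=0)  # missing entries are treated as 0

print(total)
print(filled)
```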
drop() method – (Data handling using Pandas-I)
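drop(): Returns a new series with the entry for the given index label removed. The original series is not changed unless the result is assigned back to it.
Observe this code to understand:
import pandas as pd
s=pd.Series([34,56,78,21,90])
print("After dropping the entry at index 2:")
s=s.drop(2)
print(s)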
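The drop() method also accepts a list of labels, which removes several entries in one call. The sketch below stores the result under a new name (trimmed, chosen for illustration) to show that the original series keeps all its elements.

```python
import pandas as pd

s = pd.Series([34, 56, 78, 21, 90])

# Remove the entries labelled 0 and 4; s itself is left untouched.
trimmed = s.drop([0, 4])

print(trimmed)  # labels 1, 2, 3 remain
print(len(s))   # 5
```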
Thank you for reading the article. Feel free to ask any doubt in the comment section and share this article with your friends and classmates.