This article discusses importing and exporting data between CSV files and pandas DataFrames. You already know how to create a DataFrame in pandas; here you will learn how to save DataFrame data to a CSV file for future use and how to load it back. So here we go!
import export data csv to dataframes
Importing and exporting CSV data with DataFrames is an important part of the Python syllabus. When working with Python you often need to save DataFrame data from the output screen to a file and load it back again. I have already written a detailed post introducing data files.
Let's start with exporting data from a DataFrame to a CSV file.
Consider the following example:
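import pandas as pd

# Employee data; None marks a missing value
emp_dict = {'Name': ['Sagar', 'Mohit', 'Arjun', 'Manav', 'Malayketu'],
            'age': [21, 24, None, 20, 25],
            'Salary': [25000, 35000, None, 27000, 30000]}
df = pd.DataFrame(emp_dict)
# Export the DataFrame to a CSV file in the current folder
df.to_csv('mydata.csv')
# Read the file back to check what was written
f = open('mydata.csv', 'r')
data = f.read()
print(data)
f.close()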
In the above example:
- Employee data is stored in a dictionary: emp_dict
- A DataFrame is created using pandas: df
- The data is exported to the mydata.csv file using the to_csv() function: df.to_csv()
- The CSV file is opened in read mode with the open() function: f = open('mydata.csv','r')
- The data is read with the read() function: data = f.read()
- The data is printed with the print() function: print(data)
- The opened file must be closed so that it is released properly: f.close()
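The last step above, closing the file, can also be handled automatically. The following is a minimal sketch, assuming a temporary folder is used for the file so that it runs on any machine; the shortened employee data and the csv_path name are illustrative only.

```python
import os
import tempfile

import pandas as pd

emp_dict = {'Name': ['Sagar', 'Mohit', 'Arjun'],
            'age': [21, 24, None]}
df = pd.DataFrame(emp_dict)

# Assumption: write into a temporary folder so the sketch runs anywhere.
csv_path = os.path.join(tempfile.mkdtemp(), 'mydata.csv')
df.to_csv(csv_path)

# The with block closes the file automatically, even if an error occurs.
with open(csv_path, 'r') as f:
    data = f.read()

print(data)
print(f.closed)  # True: no explicit f.close() is needed
```

With this form there is no risk of forgetting f.close().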
Now have a look at the commonly used parameters of to_csv().
Recommended parameters of the to_csv() function
- path_or_buf: This argument receives a file path or a file-like buffer. If no path is provided, to_csv() does not create a file and returns the data as a CSV-formatted string instead. The path can be absolute or relative. In the above example a relative path is given to the file. Observe this code:
import pandas as pd
player_stats = {'Player_Name': ['Rohit', 'Shikhar', 'Virat', 'Shreyas', 'Rahul'],
                'Matches_Player': [200, 190, 156, 89, 110],
                'Runs': [6790, 5678, 8901, 2356, 4321]}
df = pd.DataFrame(player_stats)
# Raw string keeps the backslashes literal; the Export_CSV folder must already exist
df.to_csv(r"D:\Export_CSV\player_data.csv")
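The two other uses of path_or_buf, omitting it and passing a buffer, can be sketched as follows. This is a small illustration with shortened player data; io.StringIO stands in for any file-like object.

```python
import io

import pandas as pd

df = pd.DataFrame({'Player_Name': ['Rohit', 'Virat'],
                   'Runs': [6790, 8901]})

# No path given: to_csv() returns the CSV text as a string.
csv_text = df.to_csv()
print(csv_text)

# A file-like buffer can be passed in place of a path.
buffer = io.StringIO()
df.to_csv(buffer)
print(buffer.getvalue() == csv_text)  # True: same content either way
```

Returning a string is handy for quick checks, because nothing is written to disk.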
- sep: It specifies the character used to separate the data columns. By default it is a comma. In the example below the '#' symbol is used as the separator. Look at this code:
import pandas as pd
player_stats = {'Player_Name': ['Rohit', 'Shikhar', 'Virat', 'Shreyas', 'Rahul'],
                'Matches_Player': [200, 190, 156, 89, 110],
                'Runs': [6790, 5678, 8901, 2356, 4321]}
df = pd.DataFrame(player_stats)
# Raw string keeps the backslashes literal; the Export_CSV folder must already exist
df.to_csv(r"D:\Export_CSV\player_data.csv", sep='#')
f = open(r"D:\Export_CSV\player_data.csv", 'r')
data = f.read()
print(data)
f.close()
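A file written with a custom separator has to be read back with the same separator. The sketch below assumes an in-memory buffer instead of a file on disk, and uses index=False so that the row numbers are not written.

```python
import io

import pandas as pd

df = pd.DataFrame({'Player_Name': ['Rohit', 'Virat'],
                   'Runs': [6790, 8901]})

# Write with '#' as the separator into an in-memory buffer.
buffer = io.StringIO()
df.to_csv(buffer, sep='#', index=False)

# Rewind the buffer and read it back with the same separator.
buffer.seek(0)
df_back = pd.read_csv(buffer, sep='#')
print(df_back)
```

If sep is left out of read_csv() here, each line is read as a single column, because no comma is found.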
- na_rep: It specifies the text written in place of missing values (NaN). The default is an empty string (''). The following code demonstrates the use of na_rep:
import pandas as pd
# Players missing from a dictionary get NaN in that row
player_stats = [{'Rohit': 56, 'Shikhar': 77, 'Virat': 42, 'Shreyas': 45, 'Rahul': 32},
                {'Rohit': 65, 'Shikhar': 23, 'Virat': 82, 'Shreyas': 52},
                {'Rohit': 16, 'Shikhar': 17, 'Virat': 122}]
df = pd.DataFrame(player_stats)
df.to_csv("player_data.csv", sep='#', na_rep='Not_Bat')
f = open("player_data.csv", 'r')
data = f.read()
print(data)
f.close()
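When such a file is imported again, the placeholder text can be turned back into NaN. The sketch below assumes the na_values parameter of read_csv() for this purpose and uses a shortened version of the data.

```python
import io

import pandas as pd

player_stats = [{'Rohit': 56, 'Shreyas': 45},
                {'Rohit': 65}]
df = pd.DataFrame(player_stats)

# Missing values are written as the text 'Not_Bat'.
csv_text = df.to_csv(index=False, na_rep='Not_Bat')
print(csv_text)

# na_values tells read_csv() to treat 'Not_Bat' as a missing value again.
df_back = pd.read_csv(io.StringIO(csv_text), na_values=['Not_Bat'])
print(df_back['Shreyas'].isna().tolist())  # [False, True]
```

Without na_values, the Shreyas column would be read as text instead of numbers.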
- float_format: This option specifies the format used for floating-point numbers written to the CSV file. By default every float is written with all of its decimal digits; a format string such as '%.2f' writes only two digits after the decimal point. Observe this code:
import pandas as pd
player_stats = {'Rohit': 56.52478, 'Shikhar': 37.1323464,
                'Virat': 42.85444, 'Shreyas': 45.547899}
# With scalar values an index is required; each value is repeated in all four rows
df = pd.DataFrame(player_stats, index=range(4))
df.to_csv("player_data.csv", sep='#', float_format='%.2f')
f = open("player_data.csv", 'r')
data = f.read()
print(data)
f.close()
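The effect of float_format is easiest to see by comparing the output with and without it. This is a small sketch with a single row of the same averages, written to strings instead of files.

```python
import pandas as pd

df = pd.DataFrame({'Rohit': [56.52478], 'Shikhar': [37.1323464]})

# Default: floats are written with all of their digits.
default_text = df.to_csv(index=False)
print(default_text.splitlines()[1])  # 56.52478,37.1323464

# '%.2f' rounds each float to two decimal places in the file only.
rounded_text = df.to_csv(index=False, float_format='%.2f')
print(rounded_text.splitlines()[1])  # 56.52,37.13
```

The values inside the DataFrame itself are not changed; only the written text is rounded.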
- header: It controls whether the column names are written to the CSV file. It can be True or False, or a list of strings to use in place of the column names. By default it is True. Observe this code:
import pandas as pd
player_stats = {'Rohit': 56.52478, 'Shikhar': 37.1323464,
                'Virat': 42.85444, 'Shreyas': 45.547899}
df = pd.DataFrame(player_stats, index=range(4))
# header=False leaves out the line with the column names
df.to_csv("player_data.csv", sep='#', float_format='%.2f', header=False)
f = open("player_data.csv", 'r')
data = f.read()
print(data)
f.close()
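Besides True and False, header also accepts a list of replacement names. The sketch below shows both cases; the alias names Player_1 and Player_2 are hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Rohit': [56], 'Virat': [42]})

# header=False: only the data rows are written.
no_header = df.to_csv(index=False, header=False)
print(no_header)

# A list of strings is written in place of the column names.
renamed = df.to_csv(index=False, header=['Player_1', 'Player_2'])
print(renamed)
```

The list must contain one name for each column that is written.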
- columns: It specifies which columns are written to the CSV file, in the given order. By default it is None, which writes all columns. Observe this code:
import pandas as pd
player_stats = {'Rohit': 56.52478, 'Shikhar': 37.1323464, 'Virat': 42.85444, 'Shreyas': 45.547899}
df = pd.DataFrame(player_stats, index=range(4))
# Only the Virat and Rohit columns are written, in this order
df.to_csv("player_data.csv", sep='#', float_format='%.2f', columns=['Virat', 'Rohit'])
f = open("player_data.csv", 'r')
data = f.read()
print(data)
f.close()
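The same selection can be checked quickly without creating a file. A minimal sketch with whole-number scores, assuming the result is returned as a string:

```python
import pandas as pd

df = pd.DataFrame({'Rohit': [56], 'Shikhar': [77], 'Virat': [42]})

# Only the listed columns are written, in the listed order.
subset = df.to_csv(index=False, columns=['Virat', 'Rohit'])
print(subset)
```

Shikhar is left out of the output, and Virat now comes before Rohit.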
- index: It controls whether the row labels (index) are written. It takes True or False; by default it is True. Look at this code:
import pandas as pd
player_stats = [{'Rohit': 56, 'Shikhar': 77, 'Virat': 42, 'Shreyas': 45, 'Rahul': 32},
                {'Rohit': 65, 'Shikhar': 23, 'Virat': 82, 'Shreyas': 52},
                {'Rohit': 16, 'Shikhar': 17, 'Virat': 122}]
df = pd.DataFrame(player_stats)
# index=False leaves out the row numbers 0, 1, 2
df.to_csv("player_data.csv", sep='#', float_format='%.2f', index=False)
f = open("player_data.csv", 'r')
data = f.read()
print(data)
f.close()
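The difference between the two index settings can be compared side by side. The sketch below also uses index_label, a further to_csv() parameter not covered above, to name the index column; the label Match is a hypothetical choice.

```python
import pandas as pd

df = pd.DataFrame({'Rohit': [56, 65], 'Virat': [42, 82]})

# Default (index=True): row numbers are written in an unnamed first column.
with_index = df.to_csv()
print(with_index)

# index=False: the row numbers are left out.
without_index = df.to_csv(index=False)
print(without_index)

# index_label gives the index column a name in the header line.
labelled = df.to_csv(index_label='Match')
print(labelled)
```

Writing index=False is usually the better choice when the row numbers carry no meaning.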
Exporting Data into text files – import export data csv to dataframes
Data can also be exported to a text file using simple write operations. To write the data of a DataFrame to a text file, follow these steps:
- Create a dataframe.
- Create a text file with ‘w’ mode.
- Convert data into str using str() function and use write function.
- Read data to check the output.
The following program applies these steps:
import pandas as pd
player_stats = {'Player_Name': ['Rohit', 'Shikhar', 'Virat', 'Shreyas', 'Rahul'],
                'Matches_Player': [200, 190, 156, 89, 110],
                'Runs': [6790, 5678, 8901, 2356, 4321]}
df = pd.DataFrame(player_stats)
print(df)
# Write the printed form of the DataFrame to a text file
f = open("runs.txt", 'w')
f.write(str(df))
f.close()
# Read the file back to check the output
f = open("runs.txt", 'r')
data = f.read()
print(data)
f.close()
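For the text-file export above, str(df) gives the same text that print(df) shows, which may be shortened for large DataFrames. A possible alternative, sketched here as an assumption and not as the method used in this article, is the to_string() method, which renders the whole DataFrame. The temporary folder is used only so that the sketch runs anywhere.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'Player_Name': ['Rohit', 'Shikhar', 'Virat'],
                   'Runs': [6790, 5678, 8901]})

# Assumption: a temporary folder stands in for the working folder.
txt_path = os.path.join(tempfile.mkdtemp(), 'runs.txt')

# to_string() renders the full DataFrame; index=False drops the row numbers.
with open(txt_path, 'w') as f:
    f.write(df.to_string(index=False))

# Read the file back to check the output.
with open(txt_path, 'r') as f:
    data = f.read()

print(data)
```

A file written this way is meant for reading by people; for loading the data back into pandas, to_csv() remains the better fit.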
Import Data through files – import export data csv to dataframes
The read_csv() function is used to import data. It reads delimited text from a file, such as a .csv or .txt file, and returns it as a DataFrame.
Consider this example:
import pandas as pd
d=pd.read_csv("file.txt")
print(d)
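The read_csv() function has the following commonly used parameters:
- filepath_or_buffer: The file path or file-like object to read from. It is similar to the path_or_buf parameter of to_csv().
- sep: The separator character. It is similar to the sep parameter of to_csv().
- index_col: Uses the passed column as the index (row labels).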
import pandas as pd
d=pd.read_csv("file.txt")
print(d)
d1=pd.read_csv("file.txt",index_col='State')
print(d1)
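Since the contents of file.txt are not shown, the sketch below assumes a small hypothetical data set with a State column and reads it from an in-memory string, so that the effect of index_col can be tried without any file.

```python
import io

import pandas as pd

# Hypothetical contents standing in for file.txt.
csv_text = "State,Capital\nGujarat,Gandhinagar\nKerala,Thiruvananthapuram\n"

# Default: rows are numbered 0, 1, ... and State is an ordinary column.
d = pd.read_csv(io.StringIO(csv_text))
print(d)

# index_col='State': the State values become the row labels.
d1 = pd.read_csv(io.StringIO(csv_text), index_col='State')
print(d1)
print(d1.loc['Kerala', 'Capital'])  # Thiruvananthapuram
```

With State as the index, a row can be looked up directly by its state name using loc.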
- header: The row number to use as the column names; rows above it are skipped and the data starts on the next row.
import pandas as pd
d=pd.read_csv("file.txt")
print(d)
d1=pd.read_csv("file.txt",header=1)
print(d1)
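A typical use of header=1 is a file whose first line is a title and not the column names. The sketch below assumes such a hypothetical file, again read from an in-memory string.

```python
import io

import pandas as pd

# Hypothetical contents: a title line, then the real column names, then data.
csv_text = "State capitals report\nState,Capital\nGujarat,Gandhinagar\n"

# header=1: the second line (row number 1) supplies the column names,
# and the title line above it is skipped.
d1 = pd.read_csv(io.StringIO(csv_text), header=1)
print(d1)
```

Row numbers start at 0, so header=0 means the first line of the file, which is also the default behaviour.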
You can use a CSV file as well. I have used a file with the .txt extension, but its data is in comma-separated format.
Some websites provide open data sources that you can use to practise importing data from CSV files. Kaggle is one of them.
I hope this article will help you to learn the concepts. Feel free to comment your views and if you have any doubts or queries regarding this article ask them in the comment section as well.
Thanks for reading!