Python Parse CSV File – A Brief Introduction
Data is exchanged on the web in several different file-based formats. Among them, CSV is considered to be one of the fastest and most convenient to work with. Moreover, you do not have to write a CSV parser from scratch.
If you are using the Python programming language, its standard library includes a csv module for parsing CSV files. You can also use the Pandas library for this purpose. But before you start working with either, you will need to know more about CSV parsing and how it works.
CSV File – What is It Really?
A CSV (Comma Separated Values) file is a plain text file that uses a simple structure to arrange tabular data. As it is a text-based file, it cannot contain anything but text data. Thus, you will only find text characters in it, such as ASCII or Unicode characters.
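One practical consequence of CSV being text-only is that every value comes back from the parser as a string, even when it looks like a number. The following is a minimal sketch of that behaviour; the sample data and variable names are made up for illustration, and an in-memory buffer stands in for a real file.

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for a file on disk.
sample = "item,quantity\napple,3\npear,12\n"

rows = list(csv.reader(io.StringIO(sample)))
header, data = rows[0], rows[1:]

# Every parsed value is a str, even when it looks like a number.
all_strings = all(isinstance(value, str) for row in data for value in row)
print(all_strings)

# Convert explicitly wherever a number is needed.
total = sum(int(quantity) for _item, quantity in data)
print(total)
```

If you need numbers, dates, or other types, you have to convert the strings yourself after reading them.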
The structure of a CSV file is quite straightforward and can be understood from its name. It usually uses a comma to separate one data value from the next. The structure usually looks like this:
column 1 name, column 2 name, column 3 name
first row data 1, first row data 2, first row data 3
second row data 1, second row data 2, second row data 3
The character that separates one value from another is known as a delimiter. It can be a comma, colon, tab, semicolon, etc. Hence, if you are thinking about parsing the data of a CSV file in Python, you will need to learn how delimiters are used.
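To make the role of the delimiter concrete, here is a small sketch that parses the same made-up data written with three different delimiters. It assumes the standard library's csv.reader and its delimiter parameter; the sample strings are hypothetical and held in memory rather than read from files.

```python
import csv
import io

# Hypothetical sample data: the same two rows written with three delimiters.
samples = {
    ",": "name,department\nJohn Smith,Accounting\n",
    ";": "name;department\nJohn Smith;Accounting\n",
    "\t": "name\tdepartment\nJohn Smith\tAccounting\n",
}

parsed = {}
for delimiter, text in samples.items():
    # io.StringIO stands in for an opened text file.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    parsed[delimiter] = list(reader)

# Each variant yields the same rows once the matching delimiter is given.
for rows in parsed.values():
    print(rows)
```

If the delimiter passed to the reader does not match the one used in the file, each line is returned as a single unsplit value, so it is worth checking the file before parsing it.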
How to Read a CSV File?
You can read from a CSV file through the csv module’s “reader” object. For that, you first need to open the CSV file as a text file, which can be done with Python’s built-in open() function. This gives you a file object, which you then pass to the reader. Let’s take an example of parsing a CSV file in Python to learn more about the procedure.
The following is the content of a text file named employee_birthday.txt:
name,department,birthday month
John Smith,Accounting,November
Erica Meyers,IT,March
So, if you want to read it, you can use the following code:
import csv

# Open the file in text mode; the with block closes it automatically.
with open('employee_birthday.txt') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # The first row holds the column names.
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            # Every other row holds one employee's data.
            print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[2]}.')
            line_count += 1
    print(f'Processed {line_count} lines.')
Running this code produces the following output:
Column names are name, department, birthday month
    John Smith works in the Accounting department, and was born in November.
    Erica Meyers works in the IT department, and was born in March.
Processed 3 lines.
So, as you can see, the “reader” object has returned each row as a list of strings, with the delimiter removed. The first row contains the column names. Each of the following rows holds one employee’s name, department, and birth month.
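Since the first row holds the column names, one alternative is to let the csv module use it as keys. The sketch below is not part of the example above; it assumes the standard library's csv.DictReader and keeps a copy of the employee data in memory (written without spaces after the commas) instead of reading employee_birthday.txt.

```python
import csv
import io

# In-memory copy of the employee data, assumed to have no spaces after commas.
data = (
    "name,department,birthday month\n"
    "John Smith,Accounting,November\n"
    "Erica Meyers,IT,March\n"
)

# DictReader takes the first row as field names and yields one mapping per data row.
records = list(csv.DictReader(io.StringIO(data)))

for record in records:
    print(f"{record['name']} works in {record['department']}, born in {record['birthday month']}.")
```

Looking values up by column name instead of by position can keep the code readable if the column order in the file changes later.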
Due to their simplicity and speed, CSV files remain a popular choice for storing and exchanging tabular data. So, consider using them wherever they suit your purpose.