A short guide on multiple options for renaming columns in a pandas dataframe

Photo by Giulio Gabrieli on Unsplash

Ensuring that dataframe columns are appropriately named is essential for understanding what data they contain, especially when we pass our data on to others. In this short article, we will cover a number of ways to rename columns within a pandas dataframe.

But first, what is pandas? Pandas is a powerful, fast, and widely used Python library for carrying out data analytics. The name is often expanded as “Python Data Analysis Library”, and according to Wikipedia it originates from the term “panel data”. It allows data to be loaded in from a number of file formats (CSV, XLS, XLSX, Pickle, etc.) and stored within table-like structures. These tables (dataframes) can be manipulated, analyzed, and visualized using a variety of functions that are available within pandas.

The first steps involve importing the pandas library and creating some dummy data that we can use to illustrate the process of column renaming.

import pandas as pd

We can create the dummy data by calling pd.DataFrame(). Here we will create three columns with the names A, B, and C.

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [101, 102, 103, 104, 105],
                   'C': [42, 42, 42, 42, 42]})
Starting dataframe created using pd.DataFrame()

An alternative method for creating the dataframe would be to load the data from an existing file, such as a CSV or XLSX file. When we load a CSV file with pd.read_csv(), we can change the names of the columns using the names argument. When we do this, we need to set header=0 so that the file's existing header row is replaced rather than read in as a row of data.

df = pd.read_csv('data.csv',
                 names=['ColA', 'ColB', 'ColC'],
                 header=0)
Pandas dataframe after renaming the columns during the loading of a csv file.
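
To see what header=0 is doing, here is a small sketch that reads the same CSV text twice. The CSV contents are made up for illustration and are supplied through io.StringIO in place of data.csv, which is assumed here to have a header row of A, B, C.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for data.csv (an assumption for this sketch)
csv_text = "A,B,C\n1,101,42\n2,102,42\n3,103,42\n"

# header=0 marks the first row as the existing header, so names= replaces it
replaced = pd.read_csv(io.StringIO(csv_text),
                       names=['ColA', 'ColB', 'ColC'],
                       header=0)

# Without header=0, the original header row is read in as a row of data
kept = pd.read_csv(io.StringIO(csv_text),
                   names=['ColA', 'ColB', 'ColC'])

print(replaced.shape)  # (3, 3)
print(kept.shape)      # (4, 3)
```

In the second case the old header ends up as the first data row, which also forces every column to be read as text.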

The first method we will look at is the .rename() method. Here we can pass a dictionary to the columns keyword argument. The dictionary provides a mapping between each old column name and the new one that we want.

We will also set the inplace argument to True so that the changes are made to the dataframe, df, directly rather than being returned as a new dataframe.

df.rename(columns={'A': 'Z', 'B': 'Y', 'C': 'X'}, inplace=True)
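
For comparison, here is a minimal sketch of the same call without inplace=True, using a shortened version of the dummy data. The variable name renamed is only illustrative. It also shows that the dictionary does not have to mention every column.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [101, 102, 103],
                   'C': [42, 42, 42]})

# Without inplace=True, rename returns a new dataframe and leaves df untouched.
# Columns missing from the dictionary (here 'C') keep their original names.
renamed = df.rename(columns={'A': 'Z', 'B': 'Y'})

print(list(df.columns))       # ['A', 'B', 'C']
print(list(renamed.columns))  # ['Z', 'Y', 'C']
```

Assigning the result to a new variable keeps the original dataframe available, which can be handy when experimenting.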

An alternative version of this is to pass the dictionary as the first argument and specify axis=1. However, this is less readable, and it may not be clear what the axis argument is doing compared to using the columns argument.

df.rename({'A': 'Z', 'B': 'Y', 'C': 'X'}, inplace=True, axis=1)
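
As a rough check that the two spellings are equivalent, the sketch below applies the same mapping in three ways and also shows what happens when axis is left out. The variable names are illustrative, and the calls return new dataframes rather than using inplace=True so the results can be compared side by side.

```python
import pandas as pd

mapping = {'A': 'Z', 'B': 'Y', 'C': 'X'}
df = pd.DataFrame({'A': [1, 2], 'B': [101, 102], 'C': [42, 42]})

by_keyword = df.rename(columns=mapping)            # explicit columns keyword
by_axis_number = df.rename(mapping, axis=1)        # positional mapping with axis=1
by_axis_name = df.rename(mapping, axis='columns')  # same, with the axis spelled out

# With no axis given, the mapping is applied to the row index instead.
# No row label matches 'A', 'B' or 'C', so the column names stay as they were.
no_axis = df.rename(mapping)

print(list(by_axis_number.columns))  # ['Z', 'Y', 'X']
print(list(no_axis.columns))         # ['A', 'B', 'C']
```

The last case is one reason the columns keyword is easier to read: forgetting axis does not raise an error, it just renames nothing.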

When we…

Continue reading: https://towardsdatascience.com/how-to-rename-columns-in-pandas-a-quick-guide-a934aa977bd5?source=rss—-7f60cf5620c9—4

Source: towardsdatascience.com