Adding Columns to Pandas DataFrame

Updated on Oct 3, 2023 11:55 IST

Learn how to effortlessly expand your Pandas DataFrame’s functionality by mastering the art of adding new columns. Explore strategies for inserting, transforming, and populating new columns to suit your analysis needs.

Pandas DataFrames are tabular data structures that store data in rows and columns, much like an Excel sheet or a CSV file. This article covers adding columns to a Pandas DataFrame.

During analysis, you perform several operations on a DataFrame using the functions provided in Pandas. We have already learned how to append rows to a Pandas DataFrame. In this article, we will learn how to add columns to a Pandas DataFrame using direct assignment and four methods – assign(), insert(), concat() and apply().

We are going to cover the following sections:

Adding a column using List
Adding a column using Pandas Series
Adding columns using assign()
Adding a column using insert()
Adding a column using concat()
Adding a column using apply()
Adding an empty column
Adding a column with a constant value
Endnotes

For our purpose today, let’s create a sample DataFrame as shown below:

    # Importing the Pandas library
    import pandas as pd

    # Creating a sample DataFrame
    data = pd.DataFrame({
        'id': [101, 123, 139, 112, 133],
        'age': [10, 12, 13, 11, 12],
        'gender': ['M', 'F', 'F', 'M', 'M'],
        'group': ['first', 'second', 'first', 'third', 'third'],
        'math_score': [41.5, 43, 38, 47, 29.5]
    })

    data

Our dummy dataset comprises 5 columns – ‘id’, ‘age’, ‘gender’, ‘group’, and ‘math_score’. As you can observe, it contains both numerical and categorical variables.

Let’s see how we perform the operations to add column(s) to this dataset.

Adding a column using List

The simplest way to add a column to an existing DataFrame is to create a list and assign it to a new column. The list must contain exactly one value per row of the DataFrame.

    values = [40, 38, 32.5, 27, 30]

    data['science_score'] = values
    data
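The length requirement for list assignment can be checked with a small experiment. The sketch below uses a hypothetical three-row frame named df (not the article's dataset) and a list that is one value short, and captures the error Pandas raises.

```python
import pandas as pd

# Hypothetical three-row frame used only for this illustration
df = pd.DataFrame({'id': [101, 123, 139]})

too_short = [40, 38]  # two values for three rows

try:
    df['science_score'] = too_short
    length_error = None
except ValueError as exc:
    # Pandas refuses a list whose length differs from the number of rows
    length_error = str(exc)

print(length_error)
```

If the list came from another source, comparing len(values) with len(df) before assigning is a cheap guard.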

Adding a column using Pandas Series

A single column of a DataFrame is a Pandas Series, that is, a one-dimensional labeled array holding values of a single data type.

You can assign the values of a Series to a new column of the existing DataFrame. Passing series.values hands over the underlying array, so the values are placed by position and the Series index is ignored:

    series = pd.Series([40, 38, 32.5, 27, 30], index=[0, 1, 2, 3, 4])

    data['science_score'] = series.values
    data

Note that if you assign the Series itself, Pandas aligns it with the DataFrame on the index. Rows whose index label is missing from the Series receive NaN, and Series values whose label does not exist in the DataFrame are dropped:

    data['science_score'] = pd.Series([40, 38, 32.5, 27, 30],
                                      index=[1, 2, 3, 4, 5])
    print(data)
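To make the alignment behavior concrete, here is a minimal sketch that places the two assignment styles side by side. The frame df, the Series shifted, and the column names aligned and by_position are hypothetical names chosen for this illustration.

```python
import pandas as pd

# Hypothetical frame with the default index 0, 1, 2
df = pd.DataFrame({'id': [101, 123, 139]})

# Series whose labels are shifted by one relative to df
shifted = pd.Series([40, 38, 32.5], index=[1, 2, 3])

# Assigning the Series aligns on index labels: row 0 has no match
df['aligned'] = shifted

# Assigning the raw array ignores labels and fills by position
df['by_position'] = shifted.to_numpy()

print(df)
```

Which form is appropriate depends on whether the index carries meaning. When the labels identify records, alignment protects against silently mismatched rows.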

Adding columns using assign()

You can use the assign() method to add several new columns to a DataFrame in one call. It is useful when:

The index of the new column can be ignored
The values of an existing column need to be overwritten

This method returns a new DataFrame object, a copy of the original that contains all the original columns along with the new ones. The original DataFrame is left unchanged, so store the result in a variable if you want to keep it.

    s1 = pd.Series([40.5, 38.5, 33, 28, 31], index=[0, 1, 2, 3, 4])
    s2 = pd.Series([48.5, 42, 41, 37, 43], index=[0, 1, 2, 3, 4])

    data.assign(science_score=s1.values, english_score=s2.values)
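Because assign() returns a copy, it is easy to lose the new columns by not capturing the result. The sketch below, on a hypothetical two-row frame df, shows this and also uses a callable, which assign() evaluates against the DataFrame to derive a column from existing ones. The column name total_score is only an example.

```python
import pandas as pd

# Hypothetical two-row frame used only for this illustration
df = pd.DataFrame({'math_score': [41.5, 43.0],
                   'science_score': [40.0, 38.0]})

# The lambda receives the DataFrame and returns the new column's values
with_total = df.assign(
    total_score=lambda d: d['math_score'] + d['science_score']
)

print(df.columns.tolist())          # df itself is unchanged
print(with_total.columns.tolist())  # the copy carries the new column
```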

Adding a column using insert()

You can use the insert() method when you need to place a new column at a specific position. Unlike assign(), insert() modifies the DataFrame in place and returns None.

    # Using the Series s2 created above
    data.insert(len(data.columns), 'english_score', s2.values)
    print(data)

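What if you wanted to insert english_score before math_score? In the original DataFrame, math_score sits at position 4, so pass 4 as the location. This assumes english_score has not been added yet; if it has, insert() raises a ValueError, as explained under “Adding duplicate columns using insert()”.

    # Using the Series s2 created above
    data.insert(4, 'english_score', s2.values)
    print(data)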
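Hard-coding the position works only while the column order stays fixed. As a sketch of one alternative, the position can be looked up with columns.get_loc(), which returns an integer when the column name is unique. The frame df below is a hypothetical stand-in for the article's dataset.

```python
import pandas as pd

# Hypothetical frame used only for this illustration
df = pd.DataFrame({
    'id': [101, 123],
    'group': ['first', 'second'],
    'math_score': [41.5, 43.0],
})

# Find where math_score currently sits instead of hard-coding the number
position = df.columns.get_loc('math_score')

# Insert the new column at that position, pushing math_score to the right
df.insert(position, 'english_score', [48.5, 42.0])

print(df.columns.tolist())
```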

Adding duplicate columns using insert()

The allow_duplicates parameter of insert() is False by default, so insert() raises a ValueError if the new column name already exists in the DataFrame. Set allow_duplicates=True to add the column anyway:

    s3 = pd.Series([43, 34, 33.5, 29, 47], index=[0, 1, 2, 3, 4])

    data.insert(5, 'english_score', s3.values, allow_duplicates=True)
    print(data)

As you can observe, there are two english_score columns in the above DataFrame.
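Both behaviors can be seen in a short sketch. It uses a hypothetical single-column frame df: the first insert() relies on the default and fails, the second opts in to duplicates. It also shows a side effect worth knowing about: selecting a duplicated name returns a DataFrame rather than a Series.

```python
import pandas as pd

# Hypothetical frame that already has an english_score column
df = pd.DataFrame({'english_score': [48.5, 42.0]})

try:
    # allow_duplicates defaults to False, so this call is rejected
    df.insert(1, 'english_score', [43.0, 34.0])
    error = None
except ValueError as exc:
    error = exc

# Opting in lets the second column share the name
df.insert(1, 'english_score', [43.0, 34.0], allow_duplicates=True)

# Selecting the duplicated name returns every matching column
both = df['english_score']

print(df.columns.tolist())
print(type(both).__name__)
```

Duplicate names make later column selection ambiguous, so a distinct name is usually easier to work with.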

Adding a column using concat()

You can concatenate a new column to an existing DataFrame with pd.concat() by setting axis=1. The output is a new DataFrame that contains the concatenated column; rows are matched on their index labels.
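A minimal sketch of this approach, using a hypothetical three-row frame df and a named Series english (the Series name becomes the column label):

```python
import pandas as pd

# Hypothetical frame used only for this illustration
df = pd.DataFrame({'id': [101, 123, 139],
                   'math_score': [41.5, 43.0, 38.0]})

# Naming the Series gives the new column its label
english = pd.Series([48.5, 42.0, 41.0], name='english_score')

# axis=1 places the objects side by side, matching rows on the index
combined = pd.concat([df, english], axis=1)

print(combined)
```

Like assign(), pd.concat() leaves df untouched, so the result has to be stored.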

Adding a column using apply()

When performing data manipulation, you might need to add a new column based on the values in existing column(s). For this, the apply() method can be used as shown:

    data['avg_score'] = data.apply(
        lambda row: (row.math_score + row.science_score) / 2,
        axis=1)
    data

As shown in the above DataFrame, we have calculated the average score based on the math_score and science_score columns using the lambda function.

Setting axis=1 makes apply() call the function once per row, so the lambda can read values from several columns of that row.
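For simple arithmetic such as an average, the same column can also be built from whole-column operations, without a row-by-row function. The sketch below compares the two on a hypothetical three-row frame df; it is offered as an alternative, not as the method used in this article.

```python
import pandas as pd

# Hypothetical frame used only for this illustration
df = pd.DataFrame({'math_score': [41.5, 43.0, 38.0],
                   'science_score': [40.0, 38.0, 32.5]})

# Row-wise version, as in the apply() example
via_apply = df.apply(
    lambda row: (row.math_score + row.science_score) / 2, axis=1)

# Column-wise version: arithmetic on whole Series at once
via_columns = (df['math_score'] + df['science_score']) / 2

df['avg_score'] = via_columns
print(df)
```

apply() remains the more flexible option when the per-row logic involves branching or calls to ordinary Python functions.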

Adding an empty column

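You can also add an empty column to the DataFrame by assigning pd.NaT to a new column. Let’s add an empty column to our original DataFrame:

    # If avg_score already exists, this overwrites it
    data['avg_score'] = pd.NaT
    data

pd.NaT (“Not a Time”) is the Pandas marker for missing datetime-like values. For numeric columns, np.nan from NumPy is the more usual placeholder.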
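The choice of placeholder affects the data type of the empty column. As a sketch, the example below adds one column filled with np.nan and one filled with pd.NaT to a hypothetical frame df; the column names remarks and exam_date are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame used only for this illustration
df = pd.DataFrame({'id': [101, 123, 139]})

df['remarks'] = np.nan      # numeric placeholder, stored as float64
df['exam_date'] = pd.NaT    # placeholder for datetime-like values

# Inspect the resulting column types
print(df.dtypes)
```

Picking the placeholder that matches the values you plan to fill in later avoids a type conversion at that point.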

Adding a column with a constant value

You can assign a single value to all elements in a new column, as shown:

    data['total_score'] = 50
    data

Endnotes

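When adding new columns to your Pandas DataFrame, pick the method that best suits your requirement: direct assignment for a quick in-place addition, assign() for a modified copy, insert() for a specific position, concat() for joining existing objects, and apply() for values derived from other columns. Pandas is a very powerful data processing tool and provides a rich set of functions to process and manipulate data for analysis. If you seek to learn the basics and various functions of Pandas, you can explore the related articles.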

———————————————————————————————————————

Contributed by – Prerna Singh

About the Author

Shiksha Online

This is a collection of insightful articles from domain experts in the fields of Cloud Computing, DevOps, AWS, Data Science, Machine Learning, AI, and Natural Language Processing. The range of topics caters to upski...
