Series vs. DataFrame in Pandas – Shiksha Online

Updated on Apr 10, 2023 16:17 IST

In this tutorial, we are going to learn the two most common data structures in Pandas – Series and DataFrame.

Pandas is a popular open-source Python library that provides tools for data analysis. It is mainly used for data pre-processing tasks such as cleaning, manipulation, and transformation, which makes it a handy tool for data scientists and analysts. This article covers the two most common data structures in Pandas, Series and DataFrame, and how they differ.

We will cover the following sections:

Installing and Importing Pandas
Data Structures in Pandas
Pandas Series
Pandas DataFrame

Installing and Importing Pandas

First, let’s install the pandas library in your working environment. Execute the following command in your terminal:

pip install pandas

Now let’s import the libraries we’re going to need today:

import pandas as pd
import numpy as np

Data Structures in Pandas

A data structure is a specialized way of organizing, processing, and storing data so that specific operations can be applied to it.

Pandas has two main data structures, which differ in their dimensionality:

Series: 1D labeled array
DataFrame: 2D labeled tabular structure

Series vs DataFrame

Let’s summarize the difference between the two structures in a table:

Pandas Series	Pandas DataFrame
One-dimensional	Two-dimensional
Homogeneous – all elements of a Series share one dtype.	Heterogeneous – each column of a DataFrame can have its own dtype.
Usually treated as size-immutable – operations such as drop() return a new Series rather than resizing the original.	Size-mutable – columns can be added to or dropped from an existing DataFrame.
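
The differences in the table can be checked directly. The following is a minimal sketch; the names temps and table and the sample values are made up for illustration and are not part of the original tutorial.

```python
import pandas as pd

# Hypothetical sample data: a Series has one axis, a DataFrame has two.
temps = pd.Series([22, 23, 21], index=["Mon", "Tues", "Wed"], name="temp_c")
table = pd.DataFrame(
    {"temp_c": [22, 23, 21], "sky": ["clear", "rain", "cloud"]},
    index=["Mon", "Tues", "Wed"],
)

print(temps.ndim, table.ndim)  # number of dimensions of each object
print(temps.dtype)             # a Series carries a single dtype
print(table.dtypes)            # a DataFrame carries one dtype per column

# Assigning a new column grows the existing DataFrame.
table["humid"] = [40, 85, 60]
print(table.shape)
```

Here the numeric column and the text column of table end up with different dtypes, while temps has only one.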

Now that we have a fair idea about Series and DataFrame, let’s see how we create them in Python, shall we?

Pandas Series

Creating a Pandas Series Using Dictionary
Creating a Pandas Series Using ndarray
Creating a Pandas Series Using Scalar Values

As stated above, a Pandas Series is a one-dimensional labeled array of fixed length. You can also see it as the primary building block of a DataFrame: each column of a DataFrame is a Series.

Following is the basic method to create a Series:

#pandas.Series
series = pd.Series(data=None, index=None, dtype=None, name=None)

Index	Data
0	element 1
1	element 2
2	element 3
3	element 4

The data parameter can take any of the following data types:

Python dictionary (dict)
ndarray
A scalar value

The index parameter accepts a list (or other array-like) of labels for the index axis.

The dtype parameter sets the data type of the Series.

The name parameter allows you to name your Series.
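
As a rough illustration of the four parameters used together, here is a small sketch. The student names and scores are invented for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three exam scores labeled by student name.
scores = pd.Series(
    data=np.array([91, 78, 85]),
    index=["asha", "ben", "chen"],  # labels for the index axis
    dtype="float64",                # stored data type
    name="exam_score",              # name of the Series
)

print(scores)
print(scores.index.tolist())
print(scores.dtype)
print(scores.name)
```

Because dtype is set to float64, the integer inputs are stored as floating-point values.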

Creating a Pandas Series Using Dictionary

If data is of dict type and index is not specified, the dict keys will be the index labels.

#Creating a Series from dict
data = {'Mon': 22, 'Tues': 23, 'Wed': 23, 'Thurs': 24, 'Fri': 23, 'Sat': 22, 'Sun': 21}
series = pd.Series(data=data, name='series_from_dict')
print(series)
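
If an index is passed together with a dict, Pandas looks up each index label among the dict keys instead of using the keys as they are. The sketch below uses a shortened, hypothetical version of the temperature data to show this.

```python
import pandas as pd

data = {"Mon": 22, "Tues": 23, "Wed": 23}

# "Thurs" is not a key in the dict, so its value is missing (NaN).
# "Mon" is a key but not in the index, so it is left out.
picked = pd.Series(data=data, index=["Wed", "Tues", "Thurs"])
print(picked)
```

The result follows the order of the index that was passed, not the order of the dict.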

Creating a Pandas Series Using ndarray

If data is an ndarray, the index must be of the same length as the array. If index is not specified, it will be created automatically with values: [0, …, len(data) – 1].

NumPy's random.randn() function returns an ndarray of samples drawn from the standard normal distribution; let's use it here:

#Creating a Series from ndarray
data = np.random.randn(5)
series = pd.Series(data=data, 
                   index=['one', 'two', 'three', 'four', 'five'],
                   name='series_from_ndarray')
print(series)
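
The two index rules for ndarray input can be sketched as follows. The array values are fixed (not random) so the behavior is easy to follow; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

values = np.array([1.5, 2.5, 3.5])

# No index given: Pandas labels the values 0, 1, 2.
auto = pd.Series(values)
print(auto.index.tolist())

# Two labels for three values: Pandas raises a ValueError.
error_message = None
try:
    pd.Series(values, index=["a", "b"])
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```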

Creating a Pandas Series Using Scalar Values

The data can be assigned a single value. The index has to be provided in this case. The given value will be repeated up to the length of the index.

#Creating a Series from a scalar value
series = pd.Series(data=7.3, 
                   index=['a', 'b', 'c', 'd'],
                   name='series_from_scalar')
print(series)

Pandas DataFrame

Pandas DataFrame, on the other hand, is a two-dimensional structure with columns and rows whose size can be changed. You can also think of it as a dictionary of Series objects.

Creating a Pandas DataFrame Using a Dictionary of Pandas Series
Creating a Pandas DataFrame Using a Dictionary of Lists or ndarrays
Creating a Pandas DataFrame Using a List of Dictionaries
Creating a Pandas DataFrame Using a Series

Following is the basic method to create a DataFrame:

#pandas.DataFrame
df = pd.DataFrame(data=None, index=None, columns=None, dtype=None)

Index	Column 1	Column 2
0	element 1	element a
1	element 2	element b
2	element 3	element c
3	element 4	element d

The data parameter can take any of the following data types:

Dictionary (dict) of – 1D ndarray, lists, or Series
2D ndarray
Pandas Series
Another Pandas DataFrame

The index parameter can be passed optionally, and it accepts row labels.

The columns parameter can also be passed optionally, and it accepts column labels.

The dtype parameter sets the data type of the DataFrame.
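
The 2D ndarray case from the list above is not demonstrated elsewhere in this tutorial, so here is a minimal sketch of it using all four parameters. It reuses the class averages from the later examples; the variable name grid is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Each inner list is one row of the table.
grid = np.array([[22, 45, 32],
                 [33, 28, 41],
                 [38, 36, 47]])

df = pd.DataFrame(
    data=grid,
    index=["math avg", "science avg", "english avg"],  # row labels
    columns=["Class 1", "Class 2", "Class 3"],         # column labels
    dtype="float64",                                   # applied to every column
)

print(df)
print(df.shape)
```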

Creating a Pandas DataFrame Using a Dictionary of Pandas Series

When the data is a dictionary of Series, the resulting index is the union of the indexes of the individual Series. If an index is passed, only the rows with those labels are kept, and labels not found in a Series are filled with NaN.

#Creating a DataFrame from a dictionary of Series
data = pd.DataFrame({
    "Class 1": pd.Series([22, 33, 38], index=["math avg", "science avg",  "english avg"]),
    "Class 2": pd.Series([45, 28, 36], index=["math avg", "science avg",  "english avg"]),
    "Class 3": pd.Series([32, 41, 47], index=["math avg", "science avg",  "english avg"])
})
 
data
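
When the Series in the dictionary do not share the same labels, Pandas aligns them by label and fills the gaps with NaN. The following sketch uses a cut-down, hypothetical version of the class data to show the alignment.

```python
import pandas as pd

df = pd.DataFrame({
    "Class 1": pd.Series([22, 33], index=["math avg", "science avg"]),
    "Class 2": pd.Series([28, 36], index=["science avg", "english avg"]),
})

# The row labels are the union of both indexes; cells with no
# matching label in a given Series hold NaN.
print(df)
```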

Creating a Pandas DataFrame Using a Dictionary of Lists or ndarrays

The ndarrays must all be of the same length. The index must be the same length as the arrays. If the index is not specified, the result will be range(n), where n is the array length.

Let’s create the same DataFrame, but this time using lists/ndarrays:

#Creating a DataFrame from a dictionary of lists
data = {
    "Class 1": [22, 33, 38],
    "Class 2": [45, 28, 36], 
    "Class 3": [32, 41, 47]
}
df = pd.DataFrame(data=data, index=['math avg', 'science avg', 'english avg'])
 
df

Creating a Pandas DataFrame Using a List of Dictionaries

#Creating a DataFrame from a list of dictionaries
data = [{"col1": 1, "col2": 2}, {"col1": 5, "col2": 10, "col3": 20}]
 
pd.DataFrame(data)
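
In a list of dictionaries, each dictionary becomes one row and the keys become column names. A key that is missing from a dictionary leaves a NaN in that row. The sketch below repeats the data above and adds row labels, which are an assumption made for illustration.

```python
import pandas as pd

rows = [{"col1": 1, "col2": 2}, {"col1": 5, "col2": 10, "col3": 20}]

# The first dictionary has no "col3" key, so that cell is NaN.
df = pd.DataFrame(rows, index=["first", "second"])
print(df)
print(df["col3"].isna().tolist())
```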

Creating a Pandas DataFrame Using a Series

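When you create a DataFrame from a Series, the resulting DataFrame has a single column. If a named Series is passed directly, the column takes the name of the Series; in the example below, the Series is wrapped in a dictionary, so the key "Col1" becomes the column name: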

#Creating a DataFrame from a Series
data = pd.DataFrame({"Col1": pd.Series([22, 33, 38])})
data
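
For comparison, here is a sketch of passing a named Series directly, without a dictionary. The Series name "Class 1" and the column name "Math" are chosen only for this example.

```python
import pandas as pd

marks = pd.Series([22, 33, 38], name="Class 1")

# Passing the Series directly: the column takes the Series name.
df_direct = pd.DataFrame(marks)
print(df_direct.columns.tolist())

# Series.to_frame() builds the same one-column DataFrame and can
# optionally give the column a different name.
df_renamed = marks.to_frame(name="Math")
print(df_renamed.columns.tolist())
```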

After the creation of a DataFrame, you can query it and select, add, or delete columns from it, i.e., perform Data Manipulation.

A Pandas DataFrame can be queried in multiple ways, for example with the .loc[] and .iloc[] indexers: .iloc[] selects by integer position, and .loc[] selects by index and column labels.
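
A short sketch of these operations follows, assuming a two-class version of the earlier example. The "Mean" column and the variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"Class 1": [22, 33, 38], "Class 2": [45, 28, 36]},
    index=["math avg", "science avg", "english avg"],
)

# Label-based lookup with .loc, position-based lookup with .iloc.
by_label = df.loc["science avg", "Class 2"]
by_position = df.iloc[1, 1]  # second row, second column
print(by_label, by_position)

# Selecting one column returns a Series.
class_1 = df["Class 1"]

# Add a column computed from existing ones.
df["Mean"] = (df["Class 1"] + df["Class 2"]) / 2

# drop() returns a new DataFrame; df itself keeps all its columns.
trimmed = df.drop(columns=["Class 2"])
print(trimmed.columns.tolist())
```

Both lookups point at the same cell, once by its labels and once by its position.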

Endnotes

Pandas is a powerful data processing tool for Python. It offers a rich set of functions to import and process various file formats from multiple data sources. The Pandas library is especially useful for data scientists working on data cleaning and analysis. We hope this article on Series vs DataFrame helped you understand the concept better.

About the Author

Shiksha Online

This is a collection of insightful articles from domain experts in the fields of Cloud Computing, DevOps, AWS, Data Science, Machine Learning, AI, and Natural Language Processing. The range of topics caters to upski...
