7 Real-Life Examples of Python's pivot_table

A Powerful Tool for Summarising and Aggregating Data

Sponsored
AiNexaVerse NewsWeekly AI Tools in your email!

The pivot_table() function in Pandas is a powerful tool for summarising and aggregating data. It allows you to create pivot tables, similar to those in spreadsheet software, to analyze your data efficiently. In this post, we'll explore seven real-life examples of how to use the pivot_table() function, with detailed explanations for each code snippet.

 

What is a Pivot Table?

A pivot table is a data summarisation tool commonly used in data analysis and reporting. It allows you to reorganize and summarise selected columns and rows of data to obtain a desired report. Pivot tables are especially useful for exploring and analyzing large datasets, enabling you to see patterns and trends at a glance.

 

Understanding the pivot_table() Function

The pivot_table() function in Pandas has several parameters that you can use to customise your pivot table. Here are some of the key parameters:

 

  • data: The DataFrame to use for creating the pivot table.

  • values: The column(s) to aggregate.

  • index: The column(s) to group by in the rows.

  • columns: The column(s) to group by in the columns.

  • aggfunc: The aggregate function to apply to the grouped data (e.g., sum, mean, count).

 

Dataset Overview

We'll use a sample dataset containing sales transactions from a retail store. The dataset includes the following columns:

 

  • TransactionID: The unique identifier for each transaction.

  • Date: The date of the transaction.

  • Store: The store where the transaction took place.

  • Product: The product sold.

  • Quantity: The quantity of the product sold.

  • Price: The price of the product.

 

Here is the sample data for sales_data.csv:

 

TransactionID,Date,Store,Product,Quantity,Price

1,2024-01-01,Store A,Laptop,1,1000

2,2024-01-01,Store A,Tablet,2,500

3,2024-01-02,Store B,Laptop,3,1000

4,2024-01-02,Store B,Tablet,4,500

5,2024-01-03,Store A,Smartphone,5,700

6,2024-01-03,Store A,Tablet,6,500

7,2024-01-04,Store B,Smartphone,7,700

8,2024-01-04,Store B,Tablet,8,500

9,2024-01-05,Store A,Laptop,9,1000

10,2024-01-05,Store A,Smartphone,10,700

 

Let's load this data into a Pandas DataFrame and explore how to use the pivot_table() function.

 

import pandas as pd

 

# Load the data into a DataFrame

df = pd.read_csv('sales_data.csv')

print(df)

 

Example 1: Total Quantity Sold by Store and Product

 

Let's create a pivot table to calculate the total quantity of each product sold in each store.

 

# Create a pivot table for total quantity sold by store and product

pivot_quantity = pd.pivot_table(df, values='Quantity', index='Store', columns='Product', aggfunc='sum')

print(pivot_quantity)

 

Explanation:

  • values='Quantity': We're interested in the Quantity column.

  • index='Store': We're grouping by the Store column in the rows.

  • columns='Product': We're grouping by the Product column in the columns.

  • aggfunc='sum': We're summing the quantities.

 

Output:

Example 2: Average Price by Store and Product

 

Let's create a pivot table to calculate the average price of each product in each store.

 

# Create a pivot table for average price by store and product

pivot_price = pd.pivot_table(df, values='Price', index='Store', columns='Product', aggfunc='mean')

print(pivot_price)

 

Explanation:

  • values='Price': We're interested in the Price column.

  • index='Store': We're grouping by the Store column in the rows.

  • columns='Product': We're grouping by the Product column in the columns.

  • aggfunc='mean': We're calculating the mean price.

 

Output:

 

Example 3: Total Sales by Store and Product

 

Let's create a pivot table to calculate the total sales (revenue) for each product in each store. The total sales for each transaction can be calculated by multiplying the Quantity by the Price.

 

# Add a new column for total sales

df['TotalSales'] = df['Quantity'] * df['Price']

 

# Create a pivot table for total sales by store and product

pivot_sales = pd.pivot_table(df, values='TotalSales', index='Store', columns='Product', aggfunc='sum')

print(pivot_sales)

 

Explanation:

  • df['TotalSales'] = df['Quantity'] * df['Price']: Adds a new column TotalSales to the DataFrame by multiplying Quantity and Price.

  • values='TotalSales': We're interested in the TotalSales column.

  • index='Store': We're grouping by the Store column in the rows.

  • columns='Product': We're grouping by the Product column in the columns.

  • aggfunc='sum': We're summing the total sales.

 

Output

 

Example 4: Count of Transactions by Store and Product

 

Let's create a pivot table to count the number of transactions for each product in each store.

 

# Create a pivot table for the count of transactions by store and product

pivot_count = pd.pivot_table(df, values='TransactionID', index='Store', columns='Product', aggfunc='count')

print(pivot_count)

 

Explanation:

  • values='TransactionID': We're counting the TransactionID column.

  • index='Store': We're grouping by the Store column in the rows.

  • columns='Product': We're grouping by the Product column in the columns.

  • aggfunc='count': We're counting the number of transactions.

 

Output

 

Example 5: Maximum Quantity Sold by Store and Product

 

Let's create a pivot table to find the maximum quantity sold for each product in each store.

 

# Create a pivot table for the maximum quantity sold by store and product

pivot_max_quantity = pd.pivot_table(df, values='Quantity', index='Store', columns='Product', aggfunc='max')

print(pivot_max_quantity)

 

Explanation:

  • values='Quantity': We're interested in the Quantity column.

  • index='Store': We're grouping by the Store column in the rows.

  • columns='Product': We're grouping by the Product column in the columns.

  • aggfunc='max': We're finding the maximum quantity sold.

 

Output:

 

Example 6: Minimum Price by Store and Product

 

Let's create a pivot table to find the minimum price of each product in each store.

 

# Create a pivot table for the minimum price by store and product

pivot_min_price = pd.pivot_table(df, values='Price', index='Store', columns='Product', aggfunc='min')

print(pivot_min_price)

 

Explanation:

  • values='Price': We're interested in the Price column.

  • index='Store': We're grouping by the Store column in the rows.

  • columns='Product': We're grouping by the Product column in the columns.

  • aggfunc='min': We're finding the minimum price.

 

Output:

Example 7: Average Total Sales by Store and Product

 

Let's create a pivot table to calculate the average total sales for each product in each store.

 

# Add a new column for total sales

df['TotalSales'] = df['Quantity'] * df['Price']

# Create a pivot table for the average total sales by store and product

pivot_avg_sales = pd.pivot_table(df, values='TotalSales', index='Store', columns='Product', aggfunc='mean')

print(pivot_avg_sales)

 

Explanation:

  • values='TotalSales': We're interested in the TotalSales column.

  • index='Store': We're grouping by the Store column in the rows.

  • columns='Product': We're grouping by the Product column in the columns.

  • aggfunc='mean': We're calculating the mean of total sales.

 

Output:

Summary

The pivot_table() function in Pandas is a versatile tool for summarising and aggregating data. In this post, we demonstrated seven real-life examples of how to use the pivot_table() function on a dataset of retail sales transactions. These examples included calculating total quantities sold, average prices, total sales, count of transactions, maximum and minimum quantities, and average total sales, among others.

 

By mastering the pivot_table() function, you can effectively analyze and gain insights from your data, making it an essential skill for any data analyst.

 

Ready for More Python Fun?

Subscribe to our newsletter now and get a free Python cheat sheet! Dive deeper into Python programming with more exciting projects and tutorials designed just for beginners.

Keep exploring, keep coding, and enjoy your journey into data analytics with Pandas!