import pandas as pd
orders = pd.DataFrame(
{
"order_id": [1001, 1002, 1003, 1004, 1005, 1006],
"customer_id": [1, 2, 1, 3, 2, 4],
"channel": ["web", "mobile", "web", "web", "mobile", "partner"],
"category": ["books", "books", "course", "course", "books", "course"],
"amount": [20, 15, 120, 95, 30, 210],
}
)
customers = pd.DataFrame(
{
"customer_id": [1, 2, 3, 4],
"segment": ["individual", "individual", "startup", "enterprise"],
"region": ["US", "IN", "DE", "US"],
}
)
orders10 Pandas Transformations Practice
This notebook focuses on grouping, aggregation, reshaping, and joining tables.
10.1 Exercise 1
Compute the total amount by category.
# TODO: Group by category and sum the amount column.
revenue_by_category = orders.groupby("____")["amount"].____()
revenue_by_category10.2 Exercise 2
Compute count, mean, and max amount by channel.
# TODO: Aggregate the amount column with count, mean, and max.
channel_summary = orders.groupby("channel")["amount"].agg(["____", "____", "____"])
channel_summary10.3 Exercise 3
Create a pivot table with channel as rows, category as columns, and the sum of amount as values.
# TODO: Fill in the pivot arguments.
pivoted = orders.pivot_table(
index="____",
columns="____",
values="____",
aggfunc="sum",
)
pivoted10.4 Exercise 4
Merge orders with customers using customer_id.
# TODO: Merge the two tables on customer_id.
orders_with_customers = orders.merge(customers, on="___________", how="left")
orders_with_customers10.5 Exercise 5
After the merge, compute total revenue by segment.
# TODO: Reuse the merged table and summarize amount by segment.
segment_revenue = orders_with_customers.groupby("____")["amount"].sum()
segment_revenue10.6 Reflection
Which operation feels closest to SQL, and which feels distinctively pandas-native?