10  Pandas Transformations Practice

This notebook focuses on grouping, aggregation, reshaping, and joining tables.

import pandas as pd

orders = pd.DataFrame(
    {
        "order_id": [1001, 1002, 1003, 1004, 1005, 1006],
        "customer_id": [1, 2, 1, 3, 2, 4],
        "channel": ["web", "mobile", "web", "web", "mobile", "partner"],
        "category": ["books", "books", "course", "course", "books", "course"],
        "amount": [20, 15, 120, 95, 30, 210],
    }
)

customers = pd.DataFrame(
    {
        "customer_id": [1, 2, 3, 4],
        "segment": ["individual", "individual", "startup", "enterprise"],
        "region": ["US", "IN", "DE", "US"],
    }
)

orders

10.1 Exercise 1

Compute the total amount by category.

# TODO: Group by category and sum the amount column.
revenue_by_category = orders.groupby("____")["amount"].____()
revenue_by_category

10.2 Exercise 2

Compute count, mean, and max amount by channel.

# TODO: Aggregate the amount column with count, mean, and max.
channel_summary = orders.groupby("channel")["amount"].agg(["____", "____", "____"])
channel_summary

10.3 Exercise 3

Create a pivot table with channel as rows, category as columns, and the sum of amount as values.

# TODO: Fill in the pivot arguments.
pivoted = orders.pivot_table(
    index="____",
    columns="____",
    values="____",
    aggfunc="sum",
)
pivoted

10.4 Exercise 4

Merge orders with customers using customer_id.

# TODO: Merge the two tables on customer_id.
orders_with_customers = orders.merge(customers, on="___________", how="left")
orders_with_customers

10.5 Exercise 5

After the merge, compute total revenue by segment.

# TODO: Reuse the merged table and summarize amount by segment.
segment_revenue = orders_with_customers.groupby("____")["amount"].sum()
segment_revenue

10.6 Reflection

Which operation feels closest to SQL, and which feels distinctively pandas-native?