Predictive Analytics Case Study

Forecasting Logistics Yard Moves Across Multiple Campuses with Snowflake and Prophet

This project turns operational movement history into a short-horizon planning signal. The workflow pulls historical yard move data from Snowflake with Snowpark, anonymizes campus names, trains a per-site Prophet model, and produces a 7-day forecast with executive-friendly visualizations.

Python · Jupyter Notebook · Snowflake Snowpark · Prophet · Matplotlib · GitHub Pages

Business Problem

Yard operations create a high volume of repetitive logistics moves that vary by day of week, site profile, and local demand patterns. Planning against raw history is noisy, especially when multiple campuses need to be monitored at once.

The goal was to create a lightweight forecasting workflow that operations leaders could use to anticipate the next 7 days of movement volume, compare likely demand across campuses, and identify where attention or staffing might be needed.

Sensitive fields were masked before publication. Real campus names are transformed into neutral labels such as Site A, Site B, and Site C.

Architecture & Tools

The notebook follows a straightforward pipeline: query Snowflake, convert the result to a pandas DataFrame, train one Prophet model per campus, then consolidate predictions into a single reporting dataset used for downstream charts.

Snowflake + Snowpark

Source of historical yard move features and active warehouse execution context.

Pandas

Tabular wrangling, field renaming, concatenation, and summary statistics.

Prophet

Per-campus forecasting with weekly seasonality enabled for short-horizon operational demand.

Matplotlib

Presentation layer for forecast grids, comparison bars, and a top-campus heatmap.

The Code

The notebook is structured in four logical blocks: data extraction, model training, forecast consolidation, and visual storytelling. For a portfolio page, it is better to show the most decision-relevant slices of code rather than every notebook cell.

Data extraction and anonymization
import pandas as pd
import string
from prophet import Prophet
from snowflake.snowpark.context import get_active_session

# Reuse the active Snowpark session and set the execution context
session = get_active_session()
session.sql("USE DATABASE ANALYTICS_DB").collect()
session.sql("USE SCHEMA DEV_MARTS").collect()

# Pull the daily yard-move feature table into a local pandas DataFrame
df_snow = session.table("ANALYTICS_DB.DEV_MARTS.FACT_DAILY_YARD_MOVE_FEATURES")
df = df_snow.to_pandas()

# Replace real campus names with neutral labels (Site A, Site B, ...)
unique_campuses = sorted(df["CAMPUSNAME"].unique())
campus_mask = {
    name: f"Site {string.ascii_uppercase[i]}" if i < 26 else f"Site {i + 1}"
    for i, name in enumerate(unique_campuses)
}
df["CAMPUSNAME"] = df["CAMPUSNAME"].map(campus_mask)
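To make the masking behavior concrete, here is how the mapping resolves with a few hypothetical campus names (none of these appear in the real data): labels are assigned alphabetically, so the same input always produces the same neutral label.

```python
import string

# Hypothetical campus names, for illustration only
unique_campuses = sorted(["North Yard", "East Yard", "Central Yard"])

campus_mask = {
    name: f"Site {string.ascii_uppercase[i]}" if i < 26 else f"Site {i + 1}"
    for i, name in enumerate(unique_campuses)
}

print(campus_mask)
# Sorted order is Central Yard, East Yard, North Yard,
# so labels assign as Site A, Site B, Site C.
```

Sorting before enumeration is the important detail: it keeps the label assignment deterministic across notebook runs.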

Campus-level forecasting loop
all_predictions = []
campuses = df["CAMPUSNAME"].unique()

for campus in campuses:
    campus_df = df[df["CAMPUSNAME"] == campus].copy()
    # Prophet expects the date column named ds and the target named y
    campus_df = campus_df.rename(columns={"MOVE_DATE": "ds", "TOTAL_MOVES": "y"})

    # Weekly seasonality only: daily and yearly components add noise
    # at a 7-day operational horizon
    model = Prophet(
        daily_seasonality=False,
        yearly_seasonality=False,
        weekly_seasonality=True,
    )
    model.fit(campus_df)

    # Extend 7 days past the history and keep only the new rows
    future = model.make_future_dataframe(periods=7)
    forecast = model.predict(future)
    forecast = forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(7)
    forecast["CAMPUSNAME"] = campus
    all_predictions.append(forecast)

final_df = pd.concat(all_predictions, ignore_index=True)

Output shaping for reporting
final_df = final_df.rename(columns={
    "ds": "PREDICTED_DATE",
    "yhat": "FORECASTED_MOVES",
    "yhat_lower": "LOWER_BOUND",
    "yhat_upper": "UPPER_BOUND",
})

final_df["FORECASTED_MOVES"] = final_df["FORECASTED_MOVES"].round(0)
final_df["LOWER_BOUND"] = final_df["LOWER_BOUND"].round(0)
final_df["UPPER_BOUND"] = final_df["UPPER_BOUND"].round(0)

print(f"Success! Forecasted 7 days for {len(campuses)} campuses.")

Visualizations

The reporting layer combines overview and prioritization views. The forecast grid shows the full network, the comparison bars simplify executive scanning, and the heatmap highlights where the largest operational load is concentrated over the next week.

A campus-by-campus view of the next 7 days, including confidence intervals and daily forecast labels.
A comparative view of total forecasted moves and average daily demand by campus.
A dense ranking view that helps spot where operational load is concentrated by day.
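The comparison view described above can be sketched as follows. This is a minimal, hedged reconstruction, not the notebook's actual plotting cell: it assumes the consolidated reporting frame produced earlier (CAMPUSNAME, PREDICTED_DATE, FORECASTED_MOVES), and the numbers here are synthetic placeholders rather than real forecasts.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for scripted runs
import matplotlib.pyplot as plt

# Synthetic stand-in for the final_df produced by the forecasting loop
final_df = pd.DataFrame({
    "CAMPUSNAME": ["Site A"] * 7 + ["Site B"] * 7,
    "PREDICTED_DATE": pd.date_range("2024-06-01", periods=7).tolist() * 2,
    "FORECASTED_MOVES": [120, 135, 110, 140, 150, 90, 80,
                         60, 70, 65, 75, 80, 50, 45],
})

# Total and average daily demand per campus, sorted for executive scanning
summary = (
    final_df.groupby("CAMPUSNAME")["FORECASTED_MOVES"]
    .agg(total="sum", daily_avg="mean")
    .sort_values("total", ascending=False)
)

fig, ax = plt.subplots(figsize=(8, 4))
summary["total"].plot(kind="bar", ax=ax, color="steelblue")
ax.set_ylabel("Forecasted moves (next 7 days)")
ax.set_title("Total forecasted moves by campus")
fig.tight_layout()
fig.savefig("campus_comparison.png")
```

Aggregating first and plotting the summary, rather than plotting raw daily rows, is what keeps the comparison readable as the number of campuses grows.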

Conclusion

This case study demonstrates an end-to-end analytics pattern that is common in modern data teams: operational data access, lightweight predictive modeling, and business-facing visualization. The implementation is intentionally pragmatic, favoring quick interpretability and repeatability over unnecessary modeling complexity.

For a portfolio audience, the strongest signal is not only that a model was trained, but that the output was structured in a way that supports decisions. That is the core story this project tells.