Fan Chen

Project Case

Shanghai Housing Price Forecast

An explainable forecasting model for Shanghai monthly housing-price growth.

Role

Data collection / Feature engineering / XGBoost modeling / SHAP interpretation

Stage

Master thesis research, model iteration, and market interpretation

Key Outcome

Cut RMSE by more than 98% versus the initial baseline (18,222.69 to 320.06) and identified momentum, liquidity, and sentiment as the main readable signals.

Strategy Snapshot

BUSINESS QUESTION

Can housing movement be read before it becomes obvious?

The case turns a broad real-estate topic into a monthly forecasting problem, so the model can support earlier market judgment instead of only explaining past price changes.

REPORT EVIDENCE

98%+ error reduction with interpretable drivers

XGBoost reduced RMSE from 18,222.69 to 320.06, while SHAP identified price momentum, M2 liquidity, stock-market signals, and consumer confidence as readable market drivers.

STRATEGY SIGNAL

Use it as an early-warning framework

The output is strongest when translated into a dashboard for momentum, liquidity, sentiment, and policy direction, helping analysts discuss market timing and risk.

01

Background

This thesis reframes Shanghai housing-price analysis as a monthly growth forecasting problem. Instead of only describing long-term market trends, it builds a model that can capture short-term movements using housing prices, macro-financial indicators, policy signals, stock-market variables, and consumer sentiment.

02

Problem

A black-box forecast offers little support for market judgment. The project had to answer two questions at once: whether monthly housing-price movements could be predicted more accurately than with a traditional linear model, and which variables were driving the model's predictions.

03

Approach

I collected and aligned multi-source data from 2000 to 2024, then transformed annual and cumulative indicators into monthly features. I built lagged variables for policy, LPR (Loan Prime Rate), M2, stock-market, and sentiment signals, trained XGBoost models iteratively, compared them with linear regression, and used SHAP to interpret feature contributions.
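The two feature-engineering steps above can be sketched in plain Python. This is a minimal illustration, not the thesis code: it assumes "cumulative" means a year-to-date series that resets every January, and the helper names `ytd_to_monthly` and `add_lags`, the column prefix `m2`, and the sample values are all hypothetical.

```python
from typing import Dict, List, Optional


def ytd_to_monthly(ytd: List[float]) -> List[float]:
    """Turn a year-to-date cumulative series into monthly flows.

    Assumption: the list starts in January, has no missing months,
    and the cumulative total resets at the start of each year.
    """
    monthly = []
    for i, value in enumerate(ytd):
        if i % 12 == 0:
            # January: the cumulative value already is the monthly value.
            monthly.append(value)
        else:
            # Other months: subtract the previous month's cumulative value.
            monthly.append(value - ytd[i - 1])
    return monthly


def add_lags(
    series: List[float], name: str, lags: List[int]
) -> Dict[str, List[Optional[float]]]:
    """Build lagged copies of a series.

    The first `lag` rows have no history, so they are filled with None
    and would be dropped before training.
    """
    n = len(series)
    features: Dict[str, List[Optional[float]]] = {}
    for lag in lags:
        padding: List[Optional[float]] = [None] * min(lag, n)
        features[f"{name}_lag{lag}"] = padding + series[: max(n - lag, 0)]
    return features


# Made-up year-to-date values for three months, for illustration only.
m2_ytd = [10.0, 25.0, 45.0]
m2_monthly = ytd_to_monthly(m2_ytd)
lagged = add_lags(m2_monthly, "m2", [1, 2])
print(m2_monthly)
print(lagged)
```

Lagging matters here because a forecast for month t may only use information that was available before month t; shifting each signal by at least one month keeps the features free of look-ahead.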

04

Key Evidence

FINAL MODEL

RMSE 320.06 / MAE 176.82

The final XGBoost version tracked Shanghai monthly housing prices from 2021 to 2024 with much lower error than the baseline.

ERROR REDUCTION

98%+

Compared with the initial baseline RMSE of 18,222.69, the final model achieved a substantial improvement after target redesign and feature engineering.

TOP SIGNAL

Price momentum

SHAP results showed that previous-month growth was the most influential predictor, indicating market inertia in short-term price movement.

MACRO DRIVER

M2 liquidity

Money supply emerged as an important macro-financial predictor, linking liquidity conditions to housing-market expectations.

Model Performance Snapshot

The final model sharply reduced error after target redesign and feature engineering.

Baseline RMSE: 18,222.69
Final XGBoost RMSE: 320.06
Final XGBoost MAE: 176.82
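For readers who want to check the headline figure, the sketch below recomputes the relative RMSE reduction from the two reported values and shows the standard definitions of RMSE and MAE. The function names are illustrative, and the short lists passed to `rmse` and `mae` are toy inputs, not thesis data.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: penalizes large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: the average size of a miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline: float, final: float) -> float:
    """Share of the baseline error that the final model removed."""
    return (baseline - final) / baseline


# RMSE values as reported on this page.
reduction = relative_reduction(18222.69, 320.06)
print(f"RMSE reduction: {reduction:.2%}")

# Toy inputs to show how the two metrics differ on the same errors.
toy_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
toy_mae = mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

The reduction works out to roughly 98.2%, consistent with the "98%+" figure. Note that the comparison is only meaningful when both RMSE values are measured on the same target and scale.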

Market Driver Map

The model was presented as a market-explanation framework, not only a prediction engine.

Momentum

Previous-month growth

The strongest short-term predictor in SHAP analysis.

Liquidity

M2 money supply

A macro-financial signal linked to market expectations.

Sentiment

Consumer confidence / stock market

Signals investor mood and capital reallocation pressure.
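A driver map like this is usually derived by ranking features on their mean absolute SHAP value across all predictions. The sketch below shows that ranking step only, in plain Python. The feature names and the SHAP matrix are invented for illustration and are not results from the thesis; in practice the matrix would come from a SHAP explainer applied to the trained model.

```python
from typing import List, Sequence, Tuple


def rank_by_mean_abs_shap(
    shap_rows: Sequence[Sequence[float]], feature_names: Sequence[str]
) -> List[Tuple[str, float]]:
    """Rank features by mean |SHAP| over all rows (one row per prediction)."""
    n_rows = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n_rows
        for j, name in enumerate(feature_names)
    }
    # Largest average contribution first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature names and made-up SHAP values (3 predictions x 3 features).
names = ["growth_lag1", "m2_lag1", "consumer_confidence_lag1"]
toy_shap = [
    [0.8, -0.3, 0.1],
    [-0.6, 0.2, -0.1],
    [0.7, -0.1, 0.2],
]
ranking = rank_by_mean_abs_shap(toy_shap, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging is the key design choice: a feature that pushes forecasts strongly up in some months and strongly down in others would otherwise average out to zero and look unimportant.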

05

Decision Logic

Use the model as an early-warning lens, not an automatic investment rule

The strongest value is not a single price forecast, but a repeatable framework for detecting when momentum, liquidity, sentiment, and policy signals begin to move in the same direction.
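One way to make "signals moving in the same direction" concrete is a simple alignment flag. The sketch below is a hypothetical illustration, not part of the thesis: it assumes each signal has been expressed as a month-over-month change oriented so that a positive value is supportive of prices (for policy, +1 for easing and -1 for tightening), and all values are made up.

```python
from typing import Dict, List


def direction(change: float, tolerance: float = 0.0) -> int:
    """Map a change to +1 (up), -1 (down), or 0 (flat within tolerance)."""
    if change > tolerance:
        return 1
    if change < -tolerance:
        return -1
    return 0


def aligned_months(changes: Dict[str, List[float]]) -> List[int]:
    """Return indices of months where every signal moves the same non-flat way."""
    n_months = len(next(iter(changes.values())))
    flagged = []
    for t in range(n_months):
        directions = {direction(series[t]) for series in changes.values()}
        # Exactly one direction across all signals, and it is not "flat".
        if len(directions) == 1 and 0 not in directions:
            flagged.append(t)
    return flagged


# Made-up month-over-month changes for four months.
signals = {
    "momentum": [0.5, -0.2, 0.3, -0.1],
    "liquidity": [0.1, 0.4, 0.2, -0.3],
    "sentiment": [0.2, 0.1, -0.1, -0.2],
    "policy": [1.0, 1.0, 1.0, -1.0],
}
flags = aligned_months(signals)
print(flags)
```

A flag of this kind is a prompt for discussion rather than a trading rule: it marks the months worth a closer look, and the SHAP breakdown then explains which signal is carrying the forecast.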

Translate SHAP output into market narratives

For stakeholders, the model should be presented as a structured explanation of what is driving the market, so analysts can connect technical output with housing-policy and investment discussions.

06

Outcome

The final workflow produced an accurate and explainable model for short-term Shanghai housing-price dynamics. It showed that price momentum, liquidity conditions, seasonality, financial-market indicators, and sentiment variables can jointly improve forecasting and interpretation.

07

Reflection

For a job-facing portfolio, this is my strongest data case: it shows that I can move from messy real-world data through model design, performance evaluation, and interpretability to business-facing market judgment.

PDF

Original Deliverable

This page is a concise case summary. The full report contains the research process and modeling details.
