
Self-directed · MovieLens dataset

Movie Recommender

Collaborative filtering, evaluated honestly.

Year: 2026
Role: Full-stack + data
Status: Shipped
Type: Self-directed

A full-stack recommender on MovieLens (610 users, 100K+ ratings) comparing user-based and item-based collaborative filtering. Measured, not just shipped.

The problem

Most student recommenders stop at “it returns movies.” I wanted one that could answer a harder question: which collaborative-filtering approach is actually better on this data, and by how much?

Approach

  • Built a FastAPI + PostgreSQL backend serving top-K neighbor predictions over a normalized schema (users, movies, ratings, tags, links).
  • Implemented both user-based and item-based collaborative filtering with Pearson and cosine similarity over mean-centered ratings.
  • Cleaned the data the way it actually needs cleaning: deduplicated rows, kept only the latest rating on repeats, filtered out low-activity users and items, and used a time-aware train/test split so the test set represents future behavior.
  • Added a baseline correction (global mean + user bias + item bias) and evaluated with RMSE / MAE plus Precision@K rather than eyeballing results.
  • Built the frontend in React 19 + Vite + Tailwind against the REST API.
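The cleaning and baseline steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes a pandas DataFrame with the MovieLens `ratings.csv` columns (`userId`, `movieId`, `rating`, `timestamp`), a single global time cutoff for the split, and unregularized biases. The function names and the 20% test fraction are hypothetical, and the low-activity filter is left out because its thresholds are not stated.

```python
import pandas as pd


def keep_latest(ratings: pd.DataFrame) -> pd.DataFrame:
    """Keep only the most recent rating for each (userId, movieId) pair."""
    return (
        ratings.sort_values("timestamp")
        .drop_duplicates(subset=["userId", "movieId"], keep="last")
        .reset_index(drop=True)
    )


def time_aware_split(ratings: pd.DataFrame, test_frac: float = 0.2):
    """Global temporal split: the newest test_frac of ratings form the test set.

    Assumption: one cutoff for all users; a per-user cutoff is another option.
    """
    ordered = ratings.sort_values("timestamp").reset_index(drop=True)
    cut = len(ordered) - int(round(len(ordered) * test_frac))
    return ordered.iloc[:cut], ordered.iloc[cut:]


def fit_baseline(train: pd.DataFrame):
    """Estimate global mean, then user biases, then item biases on the residual."""
    mu = train["rating"].mean()
    user_bias = (train["rating"] - mu).groupby(train["userId"]).mean()
    resid = train["rating"] - mu - train["userId"].map(user_bias)
    item_bias = resid.groupby(train["movieId"]).mean()
    return mu, user_bias, item_bias


def predict_baseline(mu, user_bias, item_bias, user_id, movie_id) -> float:
    """Baseline prediction; users or items unseen in training get a bias of 0."""
    return float(mu + user_bias.get(user_id, 0.0) + item_bias.get(movie_id, 0.0))
```

Fitting the biases on the training portion only keeps the test set a fair stand-in for future behavior, since nothing rated after the cutoff leaks into the baseline.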

Impact

  • Side-by-side, measured comparison of user- vs item-based CF on a held-out, time-aware split.
  • Clean separation: typed REST API, normalized Postgres schema, reproducible preprocessing.
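For illustration, the measured comparison might rest on pieces like the ones below. This is a hedged sketch under stated assumptions rather than the project's implementation: ratings are held in a dense NumPy matrix with `NaN` for unrated cells, the similarity is a simplified cosine over mean-centered rows (norms use all of a row's ratings, not only co-rated ones), and all function names are hypothetical.

```python
import numpy as np


def centered_cosine(R: np.ndarray) -> np.ndarray:
    """Cosine similarity between rows of R after subtracting each row's mean.

    Assumes every row has at least one rating; NaN marks unrated cells.
    """
    centered = np.nan_to_num(R - np.nanmean(R, axis=1, keepdims=True))
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # rows with no variance get similarity 0, not NaN
    unit = centered / norms
    return unit @ unit.T


def rmse(y_true, y_pred) -> float:
    """Root mean squared error between true and predicted ratings."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))


def mae(y_true, y_pred) -> float:
    """Mean absolute error between true and predicted ratings."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(err)))


def precision_at_k(recommended, relevant, k: int) -> float:
    """Number of relevant items among the top-k recommendations, divided by k."""
    top_k = list(recommended)[:k]
    if not top_k:
        return 0.0
    return sum(item in relevant for item in top_k) / k
```

With a users-by-items matrix `R`, `centered_cosine(R)` gives user-user similarities and `centered_cosine(R.T)` gives item-item similarities, so both variants can be scored with the same metrics on the same held-out ratings.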

Stack

Frontend

React 19 · Vite · Tailwind

Backend

FastAPI · PostgreSQL · REST

Data / ML

NumPy · Pandas · Collaborative filtering · RMSE / MAE / P@K
