Group project · MovieLens dataset

CineMatch · Movie Recommender

Collaborative filtering, evaluated honestly.

Year: 2026
Role: Backend + data (team of 4)
Status: Shipped
Type: Group project

A full-stack recommender (CineMatch) built on MovieLens (610 users, 9,742 movies, 100K+ ratings) that compares user-based and item-based collaborative filtering on measured accuracy and coverage.

The problem

Modern platforms drown users in content, and the typical student recommender stops at 'it returns movies.' Our team wanted one that could answer a harder question: which collaborative-filtering approach is actually better on this data, and by how much?

Approach

Built a FastAPI + SQLite backend serving top-K neighbor predictions over a normalized schema (users, movies, ratings, tags, links).
Implemented both user-based and item-based collaborative filtering with Pearson and cosine similarity over mean-centered ratings.
Cleaned the data the way it actually needs cleaning: deduplicated rows, kept the latest rating on repeats, filtered low-activity users and items, parsed the release year out of titles, normalized genres, and used a time-aware train/test split so the test set represents future behavior.
Added a baseline correction (global mean + user bias + item bias) and evaluated with RMSE, MAE, Precision@K, and coverage: measured, not eyeballed.
Built the frontend in React + Vite + Tailwind against the REST API.
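
The split and item-based prediction steps above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the toy rows, the function names, the single global timestamp cut, the choice to compute cosine over co-rating users only, and the positive-similarity filter are all assumptions. It is written in plain Python to stay self-contained, and it centers on user means only, leaving out the global-mean and bias baseline.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy rows: (user, item, rating, timestamp). Not MovieLens data.
ROWS = [
    ("u2", "A", 5.0, 1), ("u2", "B", 2.0, 2), ("u2", "C", 5.0, 3),
    ("u3", "A", 1.0, 4), ("u3", "B", 3.0, 5), ("u3", "C", 2.0, 6),
    ("u1", "A", 5.0, 7), ("u1", "B", 4.0, 8),
    ("u1", "C", 5.0, 9), ("u3", "D", 4.0, 10),
]

def time_aware_split(rows, test_fraction=0.2):
    """Train on the oldest ratings and hold out the newest ones."""
    ordered = sorted(rows, key=lambda r: r[3])
    n_test = max(1, round(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]

def user_means(rows):
    """Average rating per user, used for mean-centering."""
    totals, counts = defaultdict(float), defaultdict(int)
    for user, _item, rating, _ts in rows:
        totals[user] += rating
        counts[user] += 1
    return {u: totals[u] / counts[u] for u in totals}

def centered_item_vectors(rows, means):
    """Map item -> {user: rating minus that user's mean}."""
    vectors = defaultdict(dict)
    for user, item, rating, _ts in rows:
        vectors[item][user] = rating - means[user]
    return dict(vectors)

def cosine(a, b):
    """Cosine similarity over co-rating users; 0.0 when undefined."""
    common = a.keys() & b.keys()
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(user, item, vectors, means, k=2):
    """Predicted rating from the top-k similar items, or None (a coverage miss)."""
    if user not in means or item not in vectors:
        return None
    neighbors = [
        (cosine(vectors[item], vec), vec[user])
        for other, vec in vectors.items()
        if other != item and user in vec
    ]
    # Assumption: keep only positively similar neighbors.
    top = sorted((n for n in neighbors if n[0] > 0), reverse=True)[:k]
    if not top:
        return None
    weighted = sum(sim * dev for sim, dev in top)
    return means[user] + weighted / sum(sim for sim, _ in top)

train, test = time_aware_split(ROWS)
means = user_means(train)
vectors = centered_item_vectors(train, means)
for user, item, actual, _ts in test:
    print(user, item, actual, predict(user, item, vectors, means))
```

On this toy data the two newest ratings are held out. The first gets a prediction of about 5.0; the second returns None because item D never appears in training, which is exactly the kind of miss that a coverage metric counts.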

Impact

Side-by-side, measured comparison on a held-out time-aware split: item-based reached RMSE ≈ 0.891 / MAE ≈ 0.666 at 81.5% coverage; user-based reached RMSE ≈ 0.905 / MAE ≈ 0.689 at 90.8% coverage. Item-based is slightly more accurate and user-based covers more, which is the accuracy-vs-coverage trade-off you only see if you actually measure it.
Clean separation: typed REST API, normalized SQLite schema, reproducible preprocessing pipeline.
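
For reference, the metrics reported here could be computed along these lines. This is a hedged sketch rather than the project's evaluation code: the function names are hypothetical, and coverage is assumed to mean the share of test pairs for which the model produced any prediction.

```python
from math import sqrt

def evaluate(pairs):
    """pairs: list of (actual_rating, predicted_rating_or_None)."""
    scored = [(a, p) for a, p in pairs if p is not None]
    coverage = len(scored) / len(pairs) if pairs else 0.0
    if not scored:
        return {"rmse": None, "mae": None, "coverage": coverage}
    errors = [p - a for a, p in scored]
    return {
        # RMSE penalizes large misses more heavily than MAE does.
        "rmse": sqrt(sum(e * e for e in errors) / len(errors)),
        "mae": sum(abs(e) for e in errors) / len(errors),
        # Assumed definition: fraction of test pairs that got a prediction.
        "coverage": coverage,
    }

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top = recommended[:k]
    return sum(1 for item in top if item in relevant) / k if k > 0 else 0.0

# Hypothetical example values, not results from the project.
metrics = evaluate([(4.0, 3.5), (2.0, 3.0), (5.0, None), (3.0, 3.0)])
p_at_3 = precision_at_k(["A", "B", "C", "D"], {"A", "C", "E"}, k=3)
print(metrics, p_at_3)
```

Error metrics are computed only over the pairs that received a prediction, so RMSE and MAE should always be read alongside coverage: a model can look more accurate simply by declining to predict the hard cases.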

Stack

Frontend

React · Vite · Tailwind

Backend

FastAPI · SQLite · REST

Data / ML

NumPy · Pandas · Collaborative filtering · RMSE / MAE / P@K / coverage

GitHub ↗
