1,212 Lines Deleted, Zero Lines Missed
Yesterday I had a new intent detection system and a plan. Today I have a code-complete full-stack refactor and nothing left to delete.
Three phases completed in one session: rewired the extension, deleted the old code, updated the entire Python backend. The 15-category system that classified half of all browsing as “unknown” is gone. In its place: five behavioral intents, flowing from the browser extension through the sync layer into Supabase and the topic registry, end to end.
The Joy of Deletion
The final count: 1,212 lines removed across four files.
- category-data.ts — 555 lines of hand-mapped domain-to-category lookups
- topic-categorizer.ts — 475 lines of five-layer classification orchestra
- domain_classifier.py — 110 lines of server-side Claude API calls to classify domains
- routers/domains.py — 72 lines of endpoints that no longer have a reason to exist
The replacement code already existed from yesterday’s session. The 310-line intent system was sitting there, waiting to be plugged in. Today was the plugging-in.
There’s a particular satisfaction in deleting a file that you spent an entire session building a month ago. category-data.ts was Session 6. I remember painstakingly mapping subreddits to categories, debating whether r/explainlikeimfive was “education” or “reference.” That file did its job — it got MindCap to the point where I could see what was wrong with the approach. And now it’s gone.
The Bug That Wasn’t Mine
The most interesting discovery today was a pre-existing bug in the pattern detector. Two functions — _detect_unanswered_question and _detect_learning_style — queried the visits table like this:
```python
# The broken query: visits has no user_id column to filter on.
result = self.supabase.table("visits").select(
    "search_query, engagement_score"
).eq("user_id", user_id).execute()
```
The visits table has no user_id column. Visits belong to sessions; sessions belong to users. The query was silently returning nothing, which meant those two patterns never fired.
The fix is a join through sessions:
```python
# Resolve the user's sessions first, then filter visits by session_id.
sessions = self.supabase.table("sessions").select("id").eq("user_id", user_id).execute()
session_ids = [s["id"] for s in (sessions.data or [])]

result = self.supabase.table("visits").select(
    "search_query, engagement_score"
).in_("session_id", session_ids).execute()
```
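For what it’s worth, PostgREST can express the same thing in one round trip if the foreign key from visits.session_id to sessions.id is declared in Postgres. A sketch, not a claim about MindCap’s schema:

```python
# Single-query alternative: PostgREST resolves the join through the
# declared foreign key, and !inner drops visits whose session doesn't
# belong to this user.
result = self.supabase.table("visits").select(
    "search_query, engagement_score, sessions!inner(user_id)"
).eq("sessions.user_id", user_id).execute()
```

The two-query version above is easier to read at a glance, so this is an option rather than an upgrade.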
This is the kind of bug that hides because it fails silently. No error, no crash — just an empty result set that makes a pattern detector conclude “no patterns found.” And an empty set is an unremarkable answer in a Supabase app: it’s exactly what Row Level Security hands back when rows are out of reach, so nothing about the response ever looked suspicious.
I only found it because I was reading every line of the pattern detector for the intent refactor. Refactoring is free code review.
Code Complete Is Not Done
The refactor is code complete. Every file is updated. The migration is applied. The types align from TypeScript to Pydantic to PostgreSQL.
But I haven’t run it yet.
This is the gap that junior developers underestimate and senior developers learn to respect. “It compiles” means almost nothing for a system with this many moving parts. The extension captures data, transforms it, stores it locally, batches it, syncs it to an API, which validates it, stores it in Supabase, and feeds it to a topic registry that aggregates it across time. A field renamed in the wrong place, a null where a default was expected, a type coercion that works in TypeScript but fails in Python — any of these breaks the chain silently.
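To make the failure mode concrete, here’s a minimal sketch with hypothetical field names (url, intent, engagement_score are stand-ins, not MindCap’s actual schema): a Pydantic model at the API boundary rejecting a payload from a client that still speaks the old category vocabulary.

```python
from pydantic import BaseModel, ValidationError

# Hypothetical shape of one synced visit; the names are illustrative.
class VisitIn(BaseModel):
    url: str
    intent: str              # the new field; old clients still send "category"
    engagement_score: float  # None here is a type error, not a default

# A payload from an extension build that predates the refactor.
stale = {"url": "https://example.com", "category": "reference", "engagement_score": None}

try:
    VisitIn(**stale)
except ValidationError as err:
    # Fails loudly at the boundary: "intent" is missing and
    # engagement_score=None is not a float.
    print(err)
```

Validation at the API edge turns a silent downstream break into a loud 400.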
So I wrote a testing plan. Eight tests, progressing through the pipeline:
- Intent detection across site types
- Keyword extraction quality
- Re-detection on multi-intent domains
- Visit relationship tracking
- Session closure metrics
- Sync payload construction
- Supabase storage verification
- Topic registry end-to-end
Each test has specific URLs to visit, expected values, and SQL queries to verify the data landed correctly. The plan exists because “browse around and see if it works” is not a test — it’s a hope.
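To give a flavor of what those checks look like, here’s a sketch of the Supabase storage verification step. Every specific in it is hypothetical: the column names and the expected intent value (“learn”) are stand-ins for whatever the real schema uses.

```python
import os
from supabase import create_client

# After visiting a known URL in the browser, confirm the visit synced
# and carries the expected intent.
supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

rows = (
    supabase.table("visits")
    .select("url, intent, created_at")
    .ilike("url", "%docs.python.org%")
    .order("created_at", desc=True)
    .limit(5)
    .execute()
)

assert rows.data, "visit never made it through the sync pipeline"
assert all(r["intent"] == "learn" for r in rows.data), "unexpected intent"
```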
The Two-Hour Gradient War
Unrelated to the refactor, but too funny not to document: I spent over two hours today fighting with a CSS background gradient on this website. The same gradient I’ve used on every page. The one that just works. Except today it didn’t, and I could not figure out why, and I kept tweaking Tailwind classes like a person adjusting a thermostat one degree at a time expecting the weather to change.
I refactored an entire full-stack data pipeline in less time than it took me to make a background go from pink to blue. Software humbles you on a schedule.
The Refactor in Numbers
| Metric | Before | After |
|---|---|---|
| Classification categories | 15 | 5 intents |
| Lines of classification code (extension) | 1,030 | 310 |
| Lines of server-side classification | 182 | 0 |
| API calls per domain classification | 1 (Claude Haiku) | 0 |
| Detection layers | 5 (content-blind) | 6 (content-aware) |
| Spread metric | Integer category count (0–14) | Continuous Jaccard distance (0–4) |
| “Unknown” classifications | 51.3% | TBD after testing |
That last row is the one that matters. The whole refactor exists to push that number down. I’ll find out tomorrow.
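For the curious, the core of a Jaccard-based spread is simple. This is illustrative only; how MindCap scales the distance onto its 0–4 range isn’t shown here.

```python
# Jaccard distance between two sets of intent keywords:
# 0.0 means identical interests, 1.0 means no overlap at all.
def jaccard_distance(a: set[str], b: set[str]) -> float:
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

print(jaccard_distance({"rust", "wasm"}, {"sourdough", "hydration"}))  # 1.0
print(jaccard_distance({"rust", "wasm"}, {"rust", "tokio"}))           # ~0.67
```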
What I’m Reading
Between sessions, I picked up Peter Pirolli’s Information Foraging Theory. It’s the foundational academic account of how people navigate information online — and it turns out MindCap is essentially building an instrument to measure what Pirolli describes theoretically. His concept of “information scent” maps almost exactly to what the intent detector is doing: inferring the user’s goal from environmental cues (titles, URLs, domain patterns) rather than asking them directly.
Nicholas Carr’s The Shallows is the complementary read — where Pirolli describes the mechanism, Carr describes the consequence. MindCap sits in between: measuring the mechanism to help users understand the consequence.
MindCap is a personal project — a browser extension that maps curiosity patterns. Built with Plasmo, Dexie.js, FastAPI, Supabase, and Claude. Session 13 of development. You can follow the development on this blog.