
Building a Topic Registry: Tracking Digital Attention Over Time

January 26, 2026 · 7 min read
What if your browser could tell you which topics are capturing your attention—and whether that interest is growing or fading?

Progress on MindCap today. The feature I've been wrestling with—a Topic Registry—is coming together.

MindCap is a privacy-focused browser behavior analysis tool I'm building to help users understand their online attention patterns. Today's work takes it from "tracking raw data" to "surfacing meaningful insights."

The Problem: Browser History Is Just a List of URLs

Your browser history tells you where you went: a timestamped list of URLs stretching back weeks or months. It's useless for understanding what you were interested in, or how that interest evolved.

I wanted to answer questions like:

  • Which topics am I actually spending time on, and how much?
  • Is my interest in a topic growing, or quietly fading?
  • Which topics tend to show up together?

Browser history can't answer any of these. It's raw data without meaning. So I built a Topic Registry.

The Solution: Topic Registry

The Topic Registry transforms raw browsing sessions into aggregated topic records. Instead of "you visited 47 URLs today," you get "you spent an hour on Kubernetes, spread across 12 sessions, and your interest is trending upward."

Here's what a topic record looks like:

```yaml
topic: "kubernetes"
total_time_ms: 3600000        # 1 hour cumulative
session_count: 12             # appeared in 12 sessions
engagement_level: "invested"  # 30-90 min total = invested
weekly_time_ms: [2700000, 1800000, 900000, 0, 0, ...]  # 12-week trend: 45 min, 30 min, 15 min, ...
related_topics: ["docker", "devops", "aws"]
```

Each topic tracks cumulative time, session count, engagement level, a 12-week time series for trend detection, and related topics that often appear in the same sessions. It's a complete picture of your relationship with a topic.
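To make this concrete, here's a minimal sketch of how such a record might be modeled. The field names mirror the YAML above; the class, its method, and the "newest week is the last slot" convention are my illustration, not MindCap's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class TopicRecord:
    """One aggregated topic record (sketch; fields mirror the YAML above)."""
    topic: str
    total_time_ms: int = 0
    session_count: int = 0
    engagement_level: str = "quick_peek"
    weekly_time_ms: list[int] = field(default_factory=lambda: [0] * 12)
    related_topics: list[str] = field(default_factory=list)

    def record_session(self, time_ms: int) -> None:
        """Fold one session's time into the cumulative totals."""
        self.total_time_ms += time_ms
        self.session_count += 1
        self.weekly_time_ms[-1] += time_ms  # assuming current week is the last slot

record = TopicRecord("kubernetes")
record.record_session(600_000)  # a 10-minute session
```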

Engagement Levels: Making Numbers Human

Raw milliseconds are meaningless to humans. "You spent 847,293 ms on Python" tells me nothing. So I classify time into engagement levels:

| Level | Time Invested | Meaning |
|---|---|---|
| `quick_peek` | < 2 min | Barely touched |
| `exploring` | 2–10 min | Getting familiar |
| `engaged` | 10–30 min | Actively learning |
| `invested` | 30–90 min | Significant commitment |
| `deep` | 90+ min | Core interest area |

Now I can say "You're engaged with Kubernetes" or "Python is a deep interest for you." That actually means something.
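The classification itself is just a threshold ladder. A minimal sketch using the thresholds from the table (the function name is mine):

```python
def classify_engagement(total_time_ms: int) -> str:
    """Map cumulative time on a topic to a human-readable engagement level."""
    minutes = total_time_ms / 60_000
    if minutes < 2:
        return "quick_peek"
    if minutes < 10:
        return "exploring"
    if minutes < 30:
        return "engaged"
    if minutes < 90:
        return "invested"
    return "deep"

classify_engagement(3_600_000)  # 60 minutes of cumulative time -> "invested"
```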

Trend Detection: Growing or Fading?

The weekly_time_ms array is where the magic happens. By storing 12 weeks of time data per topic, I can detect whether interest is growing or fading.

The algorithm is simple: compare recent weeks to historical average.

```python
def classify_trend(weekly_time_ms: list[int]) -> str:
    # Compare the recent window to the historical average,
    # assuming the array is ordered oldest -> newest
    # (the 3-week split is illustrative)
    recent = sum(weekly_time_ms[-3:]) / 3
    older = sum(weekly_time_ms[:-3]) / len(weekly_time_ms[:-3])
    # Trending: recent activity 50%+ higher than historical
    if recent > older * 1.5:
        return "trending"
    # Fading: recent activity 50%+ lower than historical
    if recent < older * 0.5:
        return "fading"
    return "stable"
```

This surfaces insights like "Your interest in Rust is growing" or "You haven't looked at Go in 3 weeks." Seeing your attention patterns visualized can be motivating—or concerning, depending on what you've been browsing.

Implementation Details

A few technical decisions worth noting:

Rolling Averages for Stability

One bad browsing session shouldn't destroy your engagement average. If you spend 3 hours rage-reading about a topic you hate, that shouldn't permanently mark it as a "deep interest."

Solution: weighted rolling averages.

```python
new_avg = (old_avg * 0.8) + (new_score * 0.2)
```

New data contributes 20%, historical data keeps 80%. This creates smooth trends rather than noisy spikes. A single outlier session won't wreck your averages.
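To see the smoothing in action, here's a sketch of the update as a function (the name is mine), fed a 3-hour outlier against a 10-minute baseline:

```python
def update_rolling_avg(old_avg: float, new_score: float, alpha: float = 0.2) -> float:
    """Weighted rolling average: new data gets `alpha`, history keeps the rest."""
    return old_avg * (1 - alpha) + new_score * alpha

avg = 10.0                              # baseline: ~10 min per session
avg = update_rolling_avg(avg, 180.0)    # 3-hour rage-read: avg moves to ~44, not 180
avg = update_rolling_avg(avg, 10.0)     # next normal session pulls it back toward baseline
```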

Time Distribution Across Keywords

When a session has multiple keywords, how do you attribute time? A 30-minute session about "building REST APIs in Python using FastAPI"—does Python get all 30 minutes? Does FastAPI?

I went with equal distribution: that session gives 10 minutes each to python, fastapi, and rest-apis. It's a simplification (you could weight by keyword frequency), but it's good enough for v1.
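Equal distribution is a one-liner. A sketch (the function name is mine):

```python
def distribute_time(session_ms: int, keywords: list[str]) -> dict[str, int]:
    """Split a session's time equally across its keywords (v1 simplification)."""
    if not keywords:
        return {}
    share = session_ms // len(keywords)  # integer ms per keyword
    return {kw: share for kw in keywords}

distribute_time(1_800_000, ["python", "fastapi", "rest-apis"])
# 30 minutes -> 600000 ms (10 min) per keyword
```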

Background Processing

Topic updates happen as a background task after sync. Users shouldn't wait for analytics to compute:

```python
background_tasks.add_task(
    update_topics_from_session,
    user_id,
    session_data,
    visits_data
)
```

The extension feels snappy because it is. All the heavy lifting happens after the user has moved on.

What's Next: Pattern Detection

The Topic Registry is the foundation for the next piece I'll build: Pattern Detection. Patterns use topic data to surface behavioral insights that raw metrics can't capture.

Pattern detection is where MindCap stops being a fancy analytics dashboard and starts being genuinely useful for behavior change.

Key Takeaways from Today's Work:

  • Aggregate, don't just log. Raw data is overwhelming. Aggregated insights are actionable.
  • Use rolling averages for stability. Protect against noise in the data.
  • Classify into human-readable levels. "Engaged" means more than "847,293 ms".
  • Track trends over time. Weekly arrays enable "growing" vs "fading" detection.
  • Process in background. Don't make users wait for analytics.

MindCap is a personal project exploring how AI can help users understand their digital attention habits. All data stays local-first with privacy-preserving sync. Your browsing history never leaves your device—only the insights do.

More updates as Pattern Detection takes shape.

Jen Kim

Developer, Claude Whisperer. Building tools for curiosity, creativity, and chaos.
