I'm going to start keeping a running dev log as LDBD evolves. This is the first entry.
This entry started with one product question: what would give people a reason to come back to LDBD?
LDBD shouldn't be just another site where people make predictions and get ranked. The more interesting version is a place where humans and AI bots share the reasoning behind their calls side by side, so visitors can compare different views on the same asset and form their own.
So I spent a day with Claude Code going through LDBD page by page. It wasn't a day for building new screens from scratch. It was closer to reopening pages that already worked and asking, “What would a visitor want to see here that's still missing?” The previous two posts (No Users Yet, but Operations Have Already Started and Branding a SaaS Without a Designer) had a similar theme. This time, I tried to do the same kind of pass in a single day.
The common thread was simple. The data was already there. Visitors just couldn't see it in the right places.
The leaderboard didn't explain why #1 was #1. The asset page didn't show who was good at a specific ticker. The profile didn't show whether someone's score was trending up or sliding down. So the goal wasn't to bolt on more features — it was to make the data LDBD already had readable enough to give people a reason to come back.
I split the day into six chunks. In the codebase, each one became its own branch (a branch is just a self-contained bundle of code changes for one feature): scoring fairness on the leaderboard, the asset page's community features, brand color application, the profile chart, per-asset specialists, and achievement badges.
More interesting than any single feature was the pattern that kept showing up across all six branches. Claude built a first draft fast, and sometimes came back with a product decision one step better than my first idea. In return, small bugs kept slipping through, and direction-setting stayed firmly on my side.
0. The day's work list
Here are all six at a glance. The rest of the post walks through each one.
| # | Branch | Surface | What it solved |
|---|---|---|---|
| 1 | scoring-fairness | Leaderboard | Closing the gap where a 1-hit user who stopped predicting could top the board |
| 2 | asset-discovery | Asset search + asset page | A channel for sharing views & reasoning + copycat protection |
| 3 | brand-color-pass | Site-wide | Turning the black default buttons into brand green |
| 4 | profile-insights | Profile page | From four numbers to one line |
| 5 | specialist-leaderboard | Asset page | “Who's actually good at this asset?” |
| 6 | streaks-achievements | Profile page | Encouraging streaks & milestones |
1. The 1-hit, wait 90 days, become #1 problem
The average-score leaderboard had a small loophole. The threshold for being listed was “30 resolved predictions OR 90 days since signup”, and that OR clause was producing surprising results.
Suppose a brand-new user signs up, throws in one or two predictions, gets lucky on one of them, and then stops predicting and waits 90 days. The moment the 90th day passes, that single hit's average gets surfaced on the leaderboard. The structure was effectively manufacturing a brand-new 100%-accuracy, +1.0 average #1. That user would sit next to people who had resolved hundreds of predictions over a year and built up a +0.3 average — and the comparison made no sense.
When I asked Claude “I want to stop users with low sample sizes from popping to the top,” it came back with three things at once.
- Bayesian smoothing: sort by total_score / (resolved_count + 20). Put plainly: don't trust the average of a user with few predictions, and instead pull it toward zero. As the sample grows, the value converges to the real average.
- Timeframe-weighted count: every prediction used to count the same, whether it was a 1-day or a 1-year call. Weights: 1d=1, 1w=2, 1m=5, 6m=12, 1y=24. Someone who only calls 1y now reaches the leaderboard threshold after about two predictions.
- Closing the calibration loophole: remove the “OR 90 days” branch entirely and only check weighted count. The scenario above, where one lucky hit gets quietly elevated by waiting, gets closed automatically.
These three only made sense together. Just smoothing without weights, and yearly predictors disappear forever. Just weights without smoothing, and new users still spike to the top. All three needed to land at once to balance.
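A minimal TypeScript sketch of how the three pieces could fit together. This is an illustration, not LDBD's actual code: the `scoreDelta` field and the function names are hypothetical, and it assumes the old threshold of 30 was carried over to the weighted count (which matches "about two" yearly predictions at weight 24).

```typescript
// Timeframe weights as listed above.
const TIMEFRAME_WEIGHT = { "1d": 1, "1w": 2, "1m": 5, "6m": 12, "1y": 24 } as const;
type Timeframe = keyof typeof TIMEFRAME_WEIGHT;

interface ResolvedPrediction {
  timeframe: Timeframe;
  scoreDelta: number; // hypothetical field: score earned by this resolved prediction
}

const SMOOTHING_K = 20;        // smoothing constant from the sort formula
const MIN_WEIGHTED_COUNT = 30; // assumption: the old "30 resolved" threshold, now weighted

// Sum of timeframe weights; replaces the plain resolved count for listing.
function weightedCount(preds: ResolvedPrediction[]): number {
  return preds.reduce((sum, p) => sum + TIMEFRAME_WEIGHT[p.timeframe], 0);
}

// total_score / (resolved_count + k): small samples get pulled toward zero.
function adjustedAverage(preds: ResolvedPrediction[], k: number = SMOOTHING_K): number {
  const total = preds.reduce((sum, p) => sum + p.scoreDelta, 0);
  return total / (preds.length + k);
}

// No "OR 90 days" branch: only the weighted count decides who is listed.
function isListed(preds: ResolvedPrediction[]): boolean {
  return weightedCount(preds) >= MIN_WEIGHTED_COUNT;
}

// One lucky daily hit versus 300 daily predictions averaging +0.3.
const luckyNewcomer: ResolvedPrediction[] = [{ timeframe: "1d", scoreDelta: 1.0 }];
const veteran: ResolvedPrediction[] = Array.from(
  { length: 300 },
  (): ResolvedPrediction => ({ timeframe: "1d", scoreDelta: 0.3 }),
);

console.log(isListed(luckyNewcomer), adjustedAverage(luckyNewcomer));
console.log(isListed(veteran), adjustedAverage(veteran));
```

In this sketch the newcomer is not listed at all, and even if they were, their adjusted average (1 / 21) would sit well below the veteran's (roughly 90 / 320).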
After merging, the top five settled into “people who'd actually built up a real track record.” A small side bug came along with it — the displayed value and the sort key were different (sort used the adjusted value, display showed the raw average). I unified both on the adjusted value in a follow-up patch.
2. Rebuilding the asset page — community, not just a game
This was the heaviest piece of the day, and the place where the bigger-picture direction from the intro got its most concrete form. On the surface, the result is just a distribution bar and a list of reasoning on the asset page. Underneath, it was a decision about what kind of service LDBD should be — moving from a prediction game toward a perspective-sharing site.
The first step of that vision was the asset page itself. I added an /explore search page, a search input in the header, and the new prediction panel on the asset detail page (the up/down distribution bar plus a list of reasoning).
Exposing all the reasoning openly created a problem, though. If someone posts a sharp call with strong reasoning, another user can just copy the same call. Copycat. My first suggestion to Claude was simple — “if a viewer hasn't predicted on this asset themselves, hide the reasoning entirely.”
Claude's answer was a step ahead.
“On Explore, when you search an asset, show the reasoning but hide who predicted.” An asymmetric split like that keeps the community value alive and still blocks copycats. Hide the who, keep the why.
Only when you've submitted your own prediction on the same asset, timeframe, and date does the identity behind a prediction unlock. Otherwise you see a “Human” or “AI” category label and a lock icon. The reasoning body itself is readable as soon as you log in.
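The unlock rule can be sketched in a few lines of TypeScript. Everything named here (`PredictionRow`, `maskForViewer`, the field names) is hypothetical and only illustrates the rule as described: reasoning and the Human/AI label are always returned, while the handle is returned only when the viewer has a prediction with the same asset, timeframe, and date. The sketch assumes `rows` includes the viewer's own predictions.

```typescript
interface PredictionRow {
  identityId: string;
  handle: string;
  kind: "human" | "ai";
  assetId: string;
  timeframe: string;
  predictionDate: string; // YYYY-MM-DD
  reasoning: string;
}

interface VisiblePrediction {
  reasoning: string;
  kind: "human" | "ai";
  handle: string | null; // null means locked: show the category label and a lock icon
}

// A viewer unlocks identities per (asset, timeframe, date) combination.
function unlockKey(p: Pick<PredictionRow, "assetId" | "timeframe" | "predictionDate">): string {
  return `${p.assetId}|${p.timeframe}|${p.predictionDate}`;
}

function maskForViewer(rows: PredictionRow[], viewerIdentityId: string): VisiblePrediction[] {
  // Keys the viewer has unlocked by submitting their own prediction.
  const unlocked = new Set(
    rows.filter((r) => r.identityId === viewerIdentityId).map((r) => unlockKey(r)),
  );
  return rows.map((r) => ({
    reasoning: r.reasoning, // the "why" is always readable
    kind: r.kind,
    handle: unlocked.has(unlockKey(r)) ? r.handle : null, // the "who" is conditional
  }));
}
```

In a real service this masking would need to happen on the server, so that locked handles never reach the browser in the first place.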
The whole decision can be summed up in one line. Reasoning is a learning asset; the handle is the copycat trigger. What invites copying isn't the call itself; it's knowing who made the call. People follow a known top predictor. They don't follow an anonymous block of reasoning; they read it and form their own view, which is just learning.
If my first suggestion had gone in as-is, the community value would have been killed entirely. I was building the asset page to be a place to share perspectives, and yet the very first version would have hidden every perspective. Claude's one-step-ahead answer caught that contradiction. The smartest collaboration moment of the day.
3. One CSS variable, and the whole site changed color
The brand logo is emerald green and gold, but the signup button in the header was black. So was the unread dot on the notification icon, and the active state on the language toggle. “Where is the brand color, exactly?” was probably a fair first impression for anyone landing on the site.
The culprit was shadcn/ui's default theme. The CSS variable --primary was set to near-black gray (oklch(0.205 0 0)), and every default Button and Badge was inheriting from it. Brand emerald was only showing up in places where I'd explicitly typed “use this color here.”
When I told Claude “I want brand green to show up more,” it laid out a list of candidate spots to change, and then said “the single highest-impact change is to point --primary at the brand color itself.” Change one line, and every default Button, Badge, and bg-primary usage across the site flips to emerald at once.
/* before */
--primary: oklch(0.205 0 0);
/* after */
--primary: oklch(0.696 0.17 162.48); /* emerald-500 */
The thought that “something unrelated might also turn emerald” nagged a bit, but the dev server told a different story. Every affected spot turned out to be an active or selected state (the unread dot, the selected tab, the active toggle): exactly the places that need to be visually emphasized, where the brand color belongs anyway.
Gold (amber) got handled more carefully. I reserved gold for “rare milestones” in branch #6 — a 6-month hit, a 1-year hit, an asset specialist, 100 resolved. Gold loses its signal value the moment it becomes common.
4. From four numbers to one line
The profile page used to surface four numbers in boxes: avg score / 1-year avg / prediction count / accuracy. It told you whether the profile was good. It didn't tell you which direction they were heading. Recent trajectory is the deciding signal — for the owner looking at their own page, and for a visitor deciding whether to follow.
I pulled 365 days of resolved predictions in a single query and let the client toggle between 7d / 30d / 1y windows on a time-series chart. The Y axis was, at first, cumulative score_delta. “Line goes up = score is accumulating” felt like the most intuitive shape.
Looking at the chart, something felt off. The ScoreCell next to it showed average score, but my chart's Y axis was cumulative. Two different metrics were sitting on the same page, and they didn't line up. It blurred which one to trust.
I switched the Y axis to the running Bayesian-adjusted average — the same metric the leaderboard ranks on. Now the chart shows how my position on the leaderboard has actually moved over time, as one readable line.
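As a rough sketch of what that Y value means, here is a running adjusted average in TypeScript. The names are hypothetical, k=20 is taken from the leaderboard formula, and the sketch assumes the running total starts at the first prediction passed in (whether the real chart starts from all-time history or from the start of the 365-day window isn't something this sketch settles).

```typescript
interface Resolved {
  resolvedAt: string; // ISO timestamp; ISO strings sort chronologically
  scoreDelta: number; // hypothetical field: score earned by this prediction
}

interface SeriesPoint {
  resolvedAt: string;
  value: number; // adjusted average after this prediction resolved
}

// After each resolution, recompute total / (count + k) over everything so far.
function runningAdjustedAverage(preds: Resolved[], k: number = 20): SeriesPoint[] {
  const sorted = [...preds].sort((a, b) => a.resolvedAt.localeCompare(b.resolvedAt));
  let total = 0;
  return sorted.map((p, i) => {
    total += p.scoreDelta;
    return { resolvedAt: p.resolvedAt, value: total / (i + 1 + k) };
  });
}
```

Unlike a cumulative sum, this line can fall after a miss and rise only slowly after a hit, which is the same behavior the leaderboard ranking has.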
Next problem: the X axis. When several predictions resolved on the same day, the same date appeared two or three times in a row on the X axis. And days with no resolutions dropped out entirely, so the points under the 30-day toggle could end up stretched across something like 90 calendar days.
The fix had two parts. First, group predictions by resolution date so each calendar day shows as a single point. Second, generate 7 / 30 / 365 calendar-day points for the visible window and forward-fill the average on days with no resolutions. The 7d toggle now actually shows the most recent 7 days, exactly as the label promises.
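Both parts of the fix can be sketched in TypeScript. The function names are hypothetical, dates are handled in UTC for simplicity, and the input series is assumed to be in chronological order. Days before the first known value stay `null` unless a starting value is supplied.

```typescript
interface DayPoint {
  date: string;         // YYYY-MM-DD
  value: number | null; // null until a first value is known
}

// Part 1: collapse same-day resolutions into one point (the last value of that day wins).
function lastValuePerDay(series: { resolvedAt: string; value: number }[]): Map<string, number> {
  const byDay = new Map<string, number>();
  for (const p of series) {
    byDay.set(p.resolvedAt.slice(0, 10), p.value); // later entries overwrite earlier ones
  }
  return byDay;
}

// Part 2: emit exactly `windowDays` calendar days ending at `endDate`,
// carrying the previous value forward on days with no resolutions.
function fillCalendarDays(
  byDay: Map<string, number>,
  endDate: string, // YYYY-MM-DD
  windowDays: number,
  startingValue: number | null = null,
): DayPoint[] {
  const end = new Date(`${endDate}T00:00:00Z`);
  const points: DayPoint[] = [];
  let carried = startingValue;
  for (let offset = windowDays - 1; offset >= 0; offset--) {
    const day = new Date(end.getTime() - offset * 86400000);
    const key = day.toISOString().slice(0, 10);
    const todays = byDay.get(key);
    if (todays !== undefined) carried = todays;
    points.push({ date: key, value: carried });
  }
  return points;
}
```

Calling `fillCalendarDays(byDay, today, 7)` always returns seven points, one per calendar day, which is what makes the toggle label honest.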
One small chart was enough to confirm a fact I've seen repeatedly with Claude. The first version is almost never the final version. Mine wasn't. Claude's wasn't. Iterating is the point.
5. “Who's actually good at this asset?”
Once a visitor reads the distribution bar and the reasoning on an asset page, the next natural question is “so who's actually good at calling this asset?” Jumping to the global leaderboard doesn't answer that, because the global view doesn't tell you anything asset-specific.
I added a card listing the top five identities for the asset, by Bayesian-adjusted average over the past year. Each row shows a medal, a Human/AI label, accuracy, and the adjusted score.
There was one small decision inside this feature. The global leaderboard uses smoothing constant k=20, but per-asset sample sizes are inherently smaller. Reusing k=20 would have left almost every asset card empty. Claude suggested “use a smaller k for per-asset, something like 5,” and that value landed well. Plus a minimum of three resolved predictions on the asset before anyone gets listed.
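A minimal sketch of the per-asset ranking in TypeScript, using the numbers from the text (k=5, at least three resolved predictions, top five). The `AssetStats` shape and function name are hypothetical, and the input is assumed to be already restricted to the past year.

```typescript
interface AssetStats {
  handle: string;
  resolvedCount: number; // resolved predictions on this asset in the past year
  totalScore: number;    // sum of scores for those predictions
}

const PER_ASSET_K = 5;           // smaller smoothing constant than the global k=20
const MIN_RESOLVED_ON_ASSET = 3; // listing floor per asset
const TOP_N = 5;

function topSpecialists(stats: AssetStats[]): { handle: string; adjusted: number }[] {
  return stats
    .filter((s) => s.resolvedCount >= MIN_RESOLVED_ON_ASSET)
    .map((s) => ({
      handle: s.handle,
      adjusted: s.totalScore / (s.resolvedCount + PER_ASSET_K),
    }))
    .sort((a, b) => b.adjusted - a.adjusted)
    .slice(0, TOP_N);
}
```

To see why the constant matters: someone with 4 resolved predictions and a total score of 3.2 gets about 0.36 with k=5, but only about 0.13 with k=20, so small per-asset samples are flattened much harder under the global constant.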
The specialist card sits directly under the predictions panel, so the whole flow stays on one page: why they saw it that way (reasoning) → who's actually good at this asset (specialist) → is this someone worth following.
6. Gamification that doesn't punish — and what I chose not to build
Last one: streaks (consecutive correct calls) and eight achievement badges. You could have hit five correct in a row and the profile wouldn't tell you “you're on a roll right now.” That bothered me.
What I consciously decided not to build mattered just as much.
- Consecutive misses aren't shown. Encouragement-only.
- No negative milestones like “0 resolved.”
- No new database table. I skipped the migration and the Edge Function trigger, and instead derive everything from the raw prediction data at the moment the profile loads. Caching absorbs the cost.
The last bullet was the surprising one. When I asked Claude “how should I design the achievement system?”, the textbook answer was “create an identity_achievements table and trigger unlocks in the resolve Edge Function.” That's the heavy path, though: a migration, a deployed Edge Function update, and a backfill every time the achievement definitions change. At this stage, calculating on the fly when the profile page renders was much lighter and more than enough.
Claude's reassurance — “you can always promote this to a materialized view or a trigger later if it scales” — sealed it. Only as much as this stage needs. Promote later if you need to.
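Here is a minimal TypeScript sketch of the "derive at load time" approach. It is not the actual badge set: the badge ids and the five-in-a-row threshold are hypothetical, and only a few of the milestones named in this post are shown. Everything is computed from resolved predictions passed in, with no achievements table.

```typescript
interface Outcome {
  resolvedAt: string; // ISO timestamp
  correct: boolean;
  timeframe: string;  // e.g. "1d", "6m", "1y"
}

// Consecutive correct calls counted back from the most recent resolution.
// Misses only end the count; a losing streak is never computed or shown.
function currentStreak(outcomes: Outcome[]): number {
  const newestFirst = [...outcomes].sort((a, b) => b.resolvedAt.localeCompare(a.resolvedAt));
  let streak = 0;
  for (const o of newestFirst) {
    if (!o.correct) break;
    streak++;
  }
  return streak;
}

// Badges are a pure function of the raw outcomes, so changing a definition
// needs no migration or backfill, only a redeploy of this function.
function deriveBadges(outcomes: Outcome[]): string[] {
  const badges: string[] = [];
  if (outcomes.length >= 100) badges.push("100-resolved");
  if (outcomes.some((o) => o.correct && o.timeframe === "6m")) badges.push("6m-hit");
  if (outcomes.some((o) => o.correct && o.timeframe === "1y")) badges.push("1y-hit");
  if (currentStreak(outcomes) >= 5) badges.push("on-a-roll"); // hypothetical threshold
  return badges;
}
```

Because both functions are pure, their results can be cached per profile and recomputed whenever a new prediction resolves.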
Gold, as decided in branch #3, only got used for rare milestones: 6-month hits, 1-year hits, asset specialists, 100 resolved. Everything else stays emerald. Even without design training, it's obvious that gold loses its meaning the moment it becomes common.
Three collaboration patterns from the day
Going through six branches in a single day made the shape of “coding with Claude” more concrete.
The biggest value was “one step ahead”
When I throw out a simple idea, Claude often comes back with a smarter one. The copycat decision in #2 is the clearest case: my “hide all reasoning” became “hide only the identity, keep the reasoning.” When I say “how about this?”, Claude doesn't just implement it as-is. It tends to push back with “hiding the reasoning kills the community value, what about this asymmetric version instead?” Those moments shaped the structure of the day's work.
Small bugs come with the territory
Recharts v3 changed the Tooltip formatter signature in a subtle way from v2, and that surfaced as a type error. The chart bug where the same date appeared twice on the X axis only got caught when I looked at the rendered chart. “The first cut is almost never clean” is one of the things I've confirmed repeatedly when coding with Claude. The loop is fast: ship a first version, catch small bugs by eye, push again. Each cycle runs about 30 minutes.
Direction-setting doesn't get delegated
I own the direction. Claude owns the implementation. When that division blurs — for example when I say “just pick a good direction and run with it” — Claude tends to default to the most orthodox, heaviest path. New table, new migration, new Edge Function. The clearest example was #6, where I cut in with “let's compute it lazily — calculate it when the page renders, don't add a new table.” One small direction call halved the work.
Closing — Claude Writes the Code; Product Judgment Stays with Me
Going through this day made my understanding of working with Claude Code a bit sharper.
Claude builds a first draft fast. Sometimes it pushes my simple suggestion one step further into a better product decision. But that first draft has small bugs more often than not, and the bigger the feature gets, the more direction-setting matters.
It turned out my job wasn't to write every line by hand. My job was to keep looking. Is this score fair? Is this information visible to the user? Is this feature needed at this stage? Is this design too heavy for now?
That was the conclusion of the day: the faster Claude moves on implementation, the more often I have to make direction calls.
If you want to put your own bot or your own predictions on LDBD, head to /settings to create an identity and (if needed) an API key. Sign up on the home page; using LDBD is free.