Scoring method

How we score a thread for pain and willingness to pay

Anecdotes don’t rank. Here is the fixed schema that turns hundreds of messy threads into numbers you can sort, compare, and act on.

Why score at all

Reading Reddit by hand gives you stories, and stories don’t rank — you remember the most dramatic thread, not the most common problem. To decide anything, you need every thread reduced to the same handful of comparable fields.

So every thread the pipeline keeps is sent to a language model with one job: read the post and its comments and return a fixed set of structured fields. No free-form essays — a schema.

The two numbers that carry the most weight

pain_signal is a 0–100 score for how acute the frustration is. A casual “would be nice” sits low; “I waste an hour on this every single day” sits high.

wtp_tier buckets willingness to pay into high, medium, low, or none — based on whether people describe paying for a fix, asking for one, or merely grumbling. Those two fields are what the report sorts on.
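As a rough illustration of that sort, here is a minimal sketch. It assumes the tiers rank high > medium > low > none, and that wtp_tier is the primary key with pain_signal as the tiebreaker. The source names the two fields but not the key order, so treat both as assumptions.

```python
# Assumed tier ranking; the key order below is illustrative, not confirmed.
WTP_RANK = {"high": 3, "medium": 2, "low": 1, "none": 0}


def sort_threads(threads):
    """Order scored threads by willingness to pay, then pain, strongest first."""
    return sorted(
        threads,
        key=lambda t: (WTP_RANK[t["wtp_tier"]], t["pain_signal"]),
        reverse=True,
    )


# Made-up example rows, not real results.
threads = [
    {"id": "a", "pain_signal": 82, "wtp_tier": "low"},
    {"id": "b", "pain_signal": 40, "wtp_tier": "high"},
    {"id": "c", "pain_signal": 91, "wtp_tier": "high"},
]
ranked_ids = [t["id"] for t in sort_threads(threads)]
```

Swapping the two elements of the key tuple would put pain first and willingness to pay second, if that suits the decision better.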

The full field set

Beyond those two, each thread also carries:

  • tools_mentioned[] — the products named in the discussion
  • sentiment_toward_tools — positive / negative / mixed / neutral, per tool
  • primary_use_case — market research, lead gen, brand monitoring, content ideation, or other
  • relevance_score — 0–10 match between the thread and your claim
  • key_quotes, best_quote_from_OP, best_quote_from_top_reply — verbatim, with source links
  • summary — a one-line plain-English gist
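One way to picture the field set above is as a typed record that rejects anything outside the fixed vocabulary. This is a sketch, not the pipeline's actual code: the class names, the Quote shape, the integer types, and the validation rules are assumptions layered on the field names listed above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Allowed values, taken from the field descriptions above.
WTP_TIERS = {"high", "medium", "low", "none"}
SENTIMENTS = {"positive", "negative", "mixed", "neutral"}
USE_CASES = {"market research", "lead gen", "brand monitoring",
             "content ideation", "other"}


@dataclass
class Quote:
    # Hypothetical shape: a verbatim quote plus its source link.
    text: str
    source_url: str


@dataclass
class ThreadScore:
    pain_signal: int                        # 0-100
    wtp_tier: str                           # one of WTP_TIERS
    tools_mentioned: List[str]
    sentiment_toward_tools: Dict[str, str]  # tool name -> one of SENTIMENTS
    primary_use_case: str                   # one of USE_CASES
    relevance_score: int                    # 0-10
    key_quotes: List[Quote]
    best_quote_from_OP: Optional[Quote]
    best_quote_from_top_reply: Optional[Quote]
    summary: str

    def __post_init__(self) -> None:
        # Fail loudly on out-of-range numbers or off-vocabulary labels.
        if not 0 <= self.pain_signal <= 100:
            raise ValueError(f"pain_signal out of range: {self.pain_signal}")
        if not 0 <= self.relevance_score <= 10:
            raise ValueError(f"relevance_score out of range: {self.relevance_score}")
        if self.wtp_tier not in WTP_TIERS:
            raise ValueError(f"unknown wtp_tier: {self.wtp_tier}")
        if self.primary_use_case not in USE_CASES:
            raise ValueError(f"unknown primary_use_case: {self.primary_use_case}")
        bad = sorted(set(self.sentiment_toward_tools.values()) - SENTIMENTS)
        if bad:
            raise ValueError(f"unknown sentiment value(s): {bad}")


# Made-up example record, not a real scored thread.
example = ThreadScore(
    pain_signal=85,
    wtp_tier="medium",
    tools_mentioned=["F5Bot"],
    sentiment_toward_tools={"F5Bot": "mixed"},
    primary_use_case="brand monitoring",
    relevance_score=8,
    key_quotes=[Quote("I waste an hour on this every single day",
                      "https://example.com/thread")],
    best_quote_from_OP=None,
    best_quote_from_top_reply=None,
    summary="Daily manual monitoring is eating an hour a day.",
)
```

Validating at construction time means a model response with a stray label such as "maybe" fails immediately instead of quietly skewing the tallies later.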

Why a fixed vocabulary beats free text

The temptation is to let the model write whatever it notices. The problem shows up at aggregation: a free-form “top advice” field comes back as a unique sentence every time, so nothing tallies.

Constraining fields to a fixed enum — high/medium/low/none, positive/negative/mixed/neutral — is what lets the report say “34% medium willingness to pay” instead of listing 300 one-off opinions. Even tool names get case-normalised before counting, so “F5Bot”, “F5bot”, and “f5bot” don’t split into three.
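The tallying this enables can be sketched in a few lines. The function names and the input shape (a list of dicts) are assumptions for illustration, and "ExampleTool" is a placeholder name; only the enum values and the case-normalisation idea come from the text above.

```python
from collections import Counter


def tier_shares(threads):
    """Percentage of threads in each wtp_tier, rounded to whole numbers."""
    counts = Counter(t["wtp_tier"] for t in threads)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tier: round(100 * n / total) for tier, n in counts.items()}


def tool_counts(threads):
    """Count tool mentions with case folded, so spelling variants merge."""
    return Counter(
        name.strip().casefold()
        for t in threads
        for name in t["tools_mentioned"]
    )


# Made-up example rows, not real results.
threads = [
    {"wtp_tier": "medium", "tools_mentioned": ["F5Bot"]},
    {"wtp_tier": "medium", "tools_mentioned": ["F5bot", "ExampleTool"]},
    {"wtp_tier": "high", "tools_mentioned": ["f5bot"]},
    {"wtp_tier": "none", "tools_mentioned": []},
]
shares = tier_shares(threads)
tools = tool_counts(threads)
```

Here the three spellings of F5Bot collapse into a single key with a count of 3, and the tier shares come out as plain percentages that can go straight into a report line.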

The cost lever is depth, not breadth

Scoring is the only paid step. At 5 comments per thread, a run costs around $0.13; pulling 50 to 100 comments for richer quotes pushes it to $0.30–$0.40.

A top-N cap lets you decide exactly how many threads get the full treatment, so cost is a dial you set rather than a surprise you get.
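A minimal sketch of the two dials, thread count and comment depth, might look like this. How the pipeline actually picks its top N is not stated here, so ranking by upvotes is purely an assumption, as are the function name and the field names.

```python
def select_for_scoring(threads, top_n, max_comments):
    """Keep the top_n threads and trim each one to max_comments comments.

    Ranking by upvotes is a stand-in; any cheap pre-score would do.
    """
    ranked = sorted(threads, key=lambda t: t["upvotes"], reverse=True)
    return [
        {**t, "comments": t["comments"][:max_comments]}
        for t in ranked[:top_n]
    ]


# Made-up example rows, not real data.
threads = [
    {"id": "a", "upvotes": 12, "comments": ["c1", "c2", "c3"]},
    {"id": "b", "upvotes": 90, "comments": ["c1"]},
    {"id": "c", "upvotes": 40, "comments": ["c1", "c2", "c3", "c4"]},
]
batch = select_for_scoring(threads, top_n=2, max_comments=2)

# The number of comments sent to the model is a rough proxy for spend.
comments_sent = sum(len(t["comments"]) for t in batch)
```

Because both limits are applied before anything reaches the model, the amount of text being paid for is known up front.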

See it end to end

Scoring is one stage of a six-step run, from hypothesis to report.

Read the full methodology

Validate what people actually say, not what you wish they would.