Jiddu

How Jiddu works

Technical walkthrough of every pipeline. Nothing is hidden — the source code is open under AGPL-3.0 at github.com/rafaehlers/jiddu, so anything described below can be cross-checked against the actual implementation.

1. The big picture

Jiddu runs three independent analyses on the same input text: a fallacy detector, a fact-check pipeline (extraction + Sonar-based verdict), and a neutrality assessment. They are separate routes in the app (`/`, `/factcheck`, `/neutrality`) and separate database tables. None of them is required by the others.

Each analysis is a streaming request to an LLM via OpenRouter. The model is told to call a function (OpenAI-style tool use) whose JSON Schema we control. We parse the tool-call arguments incrementally with partial-json so findings show up in the UI while the model is still writing.
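As a rough illustration of the tool-call setup described above, the sketch below builds an OpenAI-style streaming request body that forces a single function call. The tool name `report_findings`, the field names inside the schema and the `buildRequestBody` helper are assumptions made for this example, not Jiddu's actual schema.

```typescript
// Hypothetical tool definition: names and fields are illustrative only.
interface ToolDefinition {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>; // JSON Schema for the arguments
  };
}

const reportFindings: ToolDefinition = {
  type: "function",
  function: {
    name: "report_findings",
    description: "Report every fallacy found in the text.",
    parameters: {
      type: "object",
      properties: {
        findings: {
          type: "array",
          items: {
            type: "object",
            properties: {
              fallacyId: { type: "string" },
              quote: { type: "string" },
              explanation: { type: "string" },
              severity: { type: "string", enum: ["low", "high"] },
            },
            required: ["fallacyId", "quote", "explanation", "severity"],
          },
        },
      },
      required: ["findings"],
    },
  },
};

// Build a streaming chat-completions body that forces the model to call the tool.
function buildRequestBody(model: string, systemPrompt: string, text: string) {
  return {
    model,
    stream: true,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: text },
    ],
    tools: [reportFindings],
    tool_choice: {
      type: "function",
      function: { name: reportFindings.function.name },
    },
  };
}
```

Because the arguments arrive as a growing JSON string, a partial parser can surface each finding in the `findings` array as soon as it is complete.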

All output is in the user's UI language. The same prompt is used regardless of which model the user picks — the language difference comes from a single `OUTPUT LANGUAGE:` line in the system prompt.

2. Input: ingestion and segmentation

**URL.** The fetch sends real-browser headers (current Chrome User-Agent, Accept-Language pt-BR, sec-ch-ua, sec-fetch-*) so naive bot-blockers don't reject it. The hostname of every redirect hop is DNS-resolved, and the request is rejected if the resolved IP falls inside RFC1918, loopback, link-local, CGNAT, multicast or reserved IPv4/IPv6 ranges. That closes the SSRF door, including the AWS metadata IP (169.254.169.254) and the local Nginx Proxy Manager (NPM) admin UI (port 81 on the VPS). The response body is capped at 5 MB. The HTML then passes through @mozilla/readability for article extraction.
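The IP check behind that SSRF guard can be sketched as follows. This is a minimal IPv4-only version with hypothetical helper names (`ipv4ToInt`, `isBlockedIPv4`). The real implementation also covers IPv6, and the check has to run on the resolved address of every redirect hop, not only the first URL.

```typescript
// Convert dotted-quad IPv4 to a number, or null if the string is not valid IPv4.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Blocked ranges as [base address, prefix length]. IPv6 is omitted for brevity.
const BLOCKED_CIDRS: Array<[string, number]> = [
  ["0.0.0.0", 8],      // "this network"
  ["10.0.0.0", 8],     // RFC1918
  ["100.64.0.0", 10],  // CGNAT
  ["127.0.0.0", 8],    // loopback
  ["169.254.0.0", 16], // link-local, includes 169.254.169.254
  ["172.16.0.0", 12],  // RFC1918
  ["192.168.0.0", 16], // RFC1918
  ["224.0.0.0", 4],    // multicast
  ["240.0.0.0", 4],    // reserved
];

function isBlockedIPv4(ip: string): boolean {
  const addr = ipv4ToInt(ip);
  if (addr === null) return true; // fail closed on anything unparseable
  return BLOCKED_CIDRS.some(([base, bits]) => {
    const start = ipv4ToInt(base) as number;
    const size = 2 ** (32 - bits);
    return addr >= start && addr < start + size;
  });
}
```

Failing closed on unparseable input is a design choice made for this sketch: an address the guard cannot classify is treated as unsafe.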

**PDF.** Parsed in-process with pdfjs-dist — never sent to a remote service. The model only ever sees plain text.

**Paste.** Used as-is, only whitespace-normalized.

The extracted text is then split into sentences with an abbreviation-aware splitter (`src/lib/sentences.ts`) — it recognizes Portuguese abbreviations (`sr.`, `dr.`, `prof.`), keeps initialisms like `U.S.` intact, and doesn't break in the middle of numbers like `3,4%`. Sentences are grouped into chunks of ~12k characters (`src/lib/chunker.ts`) so long documents fit the model context window without losing sentence boundaries.
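The chunking step can be pictured as a greedy grouping of already-split sentences. The function below is a sketch under that assumption, with a hypothetical name and signature. It is not the code in `src/lib/chunker.ts`.

```typescript
// Greedily pack whole sentences into chunks of at most maxChars characters.
// A single sentence longer than maxChars becomes its own oversized chunk,
// so a sentence is never cut in the middle.
function chunkSentences(sentences: string[], maxChars = 12000): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current === "" ? sentence : current + " " + sentence;
    if (candidate.length > maxChars && current !== "") {
      chunks.push(current); // close the chunk before it overflows
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

// Small limit to make the behaviour visible:
const demoChunks = chunkSentences(["aaaa.", "bbbb.", "cccc."], 11);
// demoChunks is ["aaaa. bbbb.", "cccc."]
```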

3. Models and fallback

Two models are exposed in the picker: 🇺🇸 GPT-5.4 mini (default) and 🇨🇳 MiniMax M2.7. They were picked from opposite geopolitical backgrounds so a user worried about training-data bias can run the same text through both and compare.

If the user-picked model fails *before emitting any event* (rate-limit from upstream, 5xx, etc.), the request falls through to the other model automatically — client-side retry, no UI flicker. Once the model has emitted at least one finding/claim, we don't retry: we accept the partial output and let the partial-parse layer salvage it.
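The fallback rule can be expressed as a small wrapper around the event stream. The sketch below assumes each model run is exposed as an async iterable of events. `ModelStream` and `streamWithFallback` are hypothetical names for this example.

```typescript
// A function that starts a streaming run against one model.
type ModelStream = (model: string) => AsyncIterable<string>;

// Retry on the other model only if the first one failed before emitting anything.
async function* streamWithFallback(
  primary: string,
  fallback: string,
  run: ModelStream,
): AsyncGenerator<string> {
  let emitted = false;
  try {
    for await (const event of run(primary)) {
      emitted = true;
      yield event;
    }
    return; // primary finished cleanly
  } catch {
    // After at least one event, keep the partial output and stop here.
    if (emitted) return;
  }
  // Nothing was emitted: hand the whole request to the other model.
  yield* run(fallback);
}
```

If the fallback model fails as well, its error propagates to the caller, since there is no third model to try.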

Sonar Reasoning Pro is used exclusively for fact-check verdicts (because it has built-in web search). It is never exposed in the model picker.

4. Fallacy detector

Catalog of 28 fallacies covering relevance, ambiguity, presumption, emotional appeals, causal reasoning and evidence. Each fallacy has a stable id and a distinct color (golden-angle hue distribution) used to highlight passages. The full list is at /fallacies.

The system prompt instructs the model to be rigorous and conservative — flag only when there is a clear, defensible fallacy. Two specific rules tighten this: a **disambiguation** clause (don't flag a sentence with multiple plausible readings unless context resolves it) and a **standalone-explanation** clause (each explanation must name the implied premise and conclusion so a reader doesn't need to re-read the source). Both clauses are distilled from the Claimify paper (Microsoft Research 2025).

Each finding has a binary severity (`low` / `high`). The overall fallacy score (0-100) is a weighted average across chunks, weighted by character count. The score range labels (`clean reasoning` / `minor issues` / `several fallacies` / `highly fallacious` / `propagandistic`) are spelled out in the prompt itself so the model has the same calibration the UI uses.
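The character-weighted average is simple arithmetic. A sketch follows, with a hypothetical `ChunkScore` shape and rounding to an integer assumed for display:

```typescript
interface ChunkScore {
  score: number; // 0-100 score the model returned for this chunk
  chars: number; // chunk length in characters, used as the weight
}

// Weighted average of chunk scores, weighted by character count.
function overallScore(chunks: ChunkScore[]): number {
  const totalChars = chunks.reduce((sum, c) => sum + c.chars, 0);
  if (totalChars === 0) return 0;
  const weighted = chunks.reduce((sum, c) => sum + c.score * c.chars, 0);
  return Math.round(weighted / totalChars);
}

// A long, mostly clean chunk outweighs a short, fallacy-heavy one:
// (20 * 12000 + 80 * 4000) / 16000 = 35
const demoScore = overallScore([
  { score: 20, chars: 12000 },
  { score: 80, chars: 4000 },
]);
```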

5. Fact-check — claim extraction

Modelled on Claimify (Metropolitansky & Larson, MSR 2025). The prompt asks the model to walk every sentence through four stages: **selection** (keep only sentences with a verifiable claim — opinions, value judgements, rhetorical questions get dropped), **disambiguation** (drop sentences with multiple plausible readings when context doesn't resolve them), **decomposition** (split multi-fact sentences into separate claims without chasing infinite atomicity), and **decontextualization** (every emitted claim must stand alone — resolve pronouns, add the year for a number that depends on it, attribute a quote to its source).

Each emitted claim is typed: `numeric`, `date`, `quote`, `causal`, or `categorical`. The type informs how the verifier later searches for evidence.

Hard cap of 50 claims per extraction (admin-tunable). The cap exists because each claim later costs Sonar credits to verify; an uncapped extraction on a 30-page PDF could rack up serious money before the user notices.

6. Fact-check — per-claim verdict

Verdict is opt-in. Extraction alone is cheap (one LLM call). Verification is on demand: the user clicks **Verify claims** in the viewer and each claim is sent to Perplexity Sonar Reasoning Pro, one call per claim, with bounded concurrency (6 parallel by default).
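Bounded concurrency of this kind is commonly implemented as a fixed pool of workers pulling from a shared index. The helper below is a generic sketch of that pattern with a hypothetical name. The `worker` callback stands in for the per-claim Sonar call.

```typescript
// Run worker over items with at most `limit` calls in flight at once.
// Results keep the order of the input array.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++; // claim an item synchronously, then await
      results[index] = await worker(items[index]);
    }
  }

  const poolSize = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: poolSize }, () => runner()));
  return results;
}
```

With the documented default, a call would look like `mapWithConcurrency(claims, 6, verifyClaim)`, where `verifyClaim` is a placeholder for whatever function performs one verification.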

Sonar returns a JSON payload with the verdict (`supported` / `contradicted` / `mixed` / `unverified`), a confidence between 0 and 1, a rationale, and a list of cited sources collected from Sonar's `annotations` channel. If confidence is below 0.5, the verdict is forced to `unverified` — better to surface 'not enough evidence' than to ship a confident wrong call.
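The confidence gate amounts to a one-line rule. In the sketch below the `SonarVerdict` shape and the `gateVerdict` name are assumptions. Only the four verdict labels and the 0.5 threshold come from the description above.

```typescript
type Verdict = "supported" | "contradicted" | "mixed" | "unverified";

interface SonarVerdict {
  verdict: Verdict;
  confidence: number; // between 0 and 1
  rationale: string;
}

const MIN_CONFIDENCE = 0.5;

// Downgrade low-confidence verdicts to "unverified"; leave the rest untouched.
function gateVerdict(raw: SonarVerdict): SonarVerdict {
  if (raw.confidence < MIN_CONFIDENCE) {
    return { ...raw, verdict: "unverified" };
  }
  return raw;
}
```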

Every cited source is classified into one of six tiers (per Goldfarb et al. 2025): `primary` (official data agencies, court rulings, peer-reviewed papers), `scholarly`, `think_tank`, `journalistic`, `commercial_content`, `informal`. A separate boolean flag marks `state-controlled` outlets (RT, CGTN, Xinhua, etc. — but not BBC/NPR/Deutsche Welle, which are state-FUNDED but editorially independent).

Verdict language is intentionally cautious about named people: never 'X is lying' or 'this is false' — always 'evidence does not support' / 'evidence contradicts the figure' / 'we could not find evidence that supports'. This is a hard rule in the system prompt.

How accurate is this in practice? We benchmarked the pipeline against PolitiFact's human-labeled corpus. On 200 claims drawn from PolitiFact's four polar buckets (true / mostly-true / false / pants-on-fire, 2020-onward), Jiddu's verdict matched the human polar verdict in 67.5% of cases overall and 81.3% on the unambiguous true / false / pants-on-fire buckets. Strict polar disagreement — Jiddu calling a claim the opposite of the human — happened in only 4.5% of cases (9 of 200). The remaining ~28% were Jiddu returning `mixed` or `unverified` instead of a polar verdict, concentrated on PolitiFact's `mostly_true` bucket (where 58% became `mixed` — a structural overlap, not a miss, since both labels describe partial truth). Full methodology, confusion matrix and disagreement analysis at the link.

7. Fact-check — verdict cache

Each verified claim is hashed and stored. The hash is `sha256(lang + ':' + normalize(statement))` where `normalize` lowercases, strips combining-mark accents, collapses whitespace and removes trailing punctuation. That means copies of the same claim that differ only in case, accents, spacing or trailing punctuation collide, while genuinely reworded claims and the same claim in another language do not.
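A sketch of the cache key, using Node's built-in `crypto` module. The exact normalization rules (for example, which characters count as trailing punctuation) are assumptions for this example. Only the overall `sha256(lang + ':' + normalize(statement))` shape is taken from the description.

```typescript
import { createHash } from "node:crypto";

// Lowercase, strip accents, collapse whitespace, drop trailing punctuation.
function normalize(statement: string): string {
  return statement
    .toLowerCase()
    .normalize("NFD")                // split letters from their combining marks
    .replace(/[\u0300-\u036f]/g, "") // remove the combining marks
    .replace(/\s+/g, " ")            // collapse runs of whitespace
    .trim()
    .replace(/[\s.,;:!?]+$/, "");    // assumed set of trailing punctuation
}

function cacheKey(lang: string, statement: string): string {
  return createHash("sha256")
    .update(lang + ":" + normalize(statement))
    .digest("hex");
}

// Same key: the two inputs differ only in case, accents, spacing and the final period.
const keyA = cacheKey("pt", "A inflação  foi de 4,5%.");
const keyB = cacheKey("pt", "a inflacao foi de 4,5%");
```

Because the language code is part of the hashed string, an identical statement submitted under a different UI language produces a different key.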

TTLs are per claim type: `numeric` claims expire after 30 days (statistical agencies such as IBGE revise their series), `date` and `categorical` claims after 365 days (historical facts and definitions are stable), and `quote` and `causal` claims after 90 days (re-examinable but unlikely to flip). A best-effort sweep deletes expired rows on the cache-miss path, rate-limited to once every 30 minutes so the hot path stays cheap.
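The TTL table and the sweep throttle could look roughly like this. Function names and the use of millisecond timestamps are assumptions for the sketch. The durations are the ones listed above.

```typescript
type ClaimType = "numeric" | "date" | "quote" | "causal" | "categorical";

const DAY_MS = 24 * 60 * 60 * 1000;

const TTL_DAYS: Record<ClaimType, number> = {
  numeric: 30,
  date: 365,
  categorical: 365,
  quote: 90,
  causal: 90,
};

// True when a cached verdict is older than the TTL for its claim type.
function isExpired(type: ClaimType, verifiedAt: number, now: number): boolean {
  return now - verifiedAt > TTL_DAYS[type] * DAY_MS;
}

const SWEEP_INTERVAL_MS = 30 * 60 * 1000;
let lastSweepAt = -Infinity; // no sweep has run yet

// Called on the cache-miss path: returns true at most once per 30 minutes.
function shouldSweep(now: number): boolean {
  if (now - lastSweepAt < SWEEP_INTERVAL_MS) return false;
  lastSweepAt = now;
  return true;
}
```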

The user can bypass the cache with a per-claim **refresh** button (forces a fresh Sonar call). That counts against an admin-tunable refresh quota — default is 2 refreshes per claim. After that, the button hides and the API returns 429.

8. Neutrality

Distinct from fallacy detection: a text can be fallacy-free and factually accurate while still using loaded framing, asymmetric attribution or one-sided sourcing. Neutrality surfaces those patterns.

Eight issue types are flagged at the sentence level: `loaded_language`, `asymmetric_attribution`, `false_equivalence`, `selective_omission`, `source_asymmetry`, `both_sidesing`, `steering`, `genetic_framing`. The model also classifies the document into one of seven text types (informational / opinion / political speech / advocacy / analysis / marketing / social_media) which informs how strict the audit is.

The document-level verdict is one of `neutral` / `partisan_lean` / `explicitly_partisan` / `manipulation`. Jiddu **deliberately does not classify partisan direction** (left/right, pro-X/anti-Y). The Forum AI paper that inspired the methodology requires a bipartisan expert panel for that step; without one, partisan labels are unreliable and harmful. We surface specific patterns the reader can audit instead.

9. Storage, privacy and reviewers

Persistence is a single SQLite file on a Hostinger VPS, mounted as a Docker volume. There are no user accounts. Each analysis gets a short id and lives under `/a/<id>`, `/f/<id>` or `/n/<id>`. Anyone with the URL can view.

The submitted text is sent to OpenRouter, which routes it to whichever upstream serves the picked model. We don't relay it anywhere else. We log the request IP for rate-limiting and abuse signalling and we don't sell or share that.

Each analysis has a `reviewedAt` timestamp. By default it's null, meaning the analysis is automated and has not been spot-checked by an operator. The admin (at `/admin`, password-gated) can mark individual analyses as reviewed.

10. Cost and abuse controls

Every cost-incurring route has an in-memory token-bucket rate limit per IP. Defaults: 30 fallacy analyses / hour, 30 fact-check extractions / hour, 10 bulk verifications / hour, 20 per-claim refreshes / hour, 20 feedback flags / hour. All values are admin-tunable at runtime via `/admin/settings` — no redeploy needed. When a bucket empties, the API returns 429 with a `Retry-After` header.
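A per-IP token bucket fits in a few lines. The class below is a generic sketch of the technique with hypothetical names. It refills continuously over the window and returns the number of seconds to put in `Retry-After` when the bucket is empty.

```typescript
interface Bucket {
  tokens: number;
  updatedAt: number; // ms timestamp of the last refill
}

class RateLimiter {
  private buckets = new Map<string, Bucket>();
  private capacity: number;
  private msPerToken: number;

  constructor(capacity: number, windowMs: number) {
    this.capacity = capacity;
    this.msPerToken = windowMs / capacity;
  }

  // Returns 0 when the request is allowed, otherwise the Retry-After value in seconds.
  take(ip: string, now: number): number {
    const bucket = this.buckets.get(ip) ?? { tokens: this.capacity, updatedAt: now };
    const refill = (now - bucket.updatedAt) / this.msPerToken;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + refill);
    bucket.updatedAt = now;
    this.buckets.set(ip, bucket);
    if (bucket.tokens >= 1) {
      bucket.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - bucket.tokens) * this.msPerToken) / 1000);
  }
}

// 30 requests per hour, as in the fallacy-analysis default.
const limiter = new RateLimiter(30, 60 * 60 * 1000);
```

Since the state lives in a `Map` in process memory, buckets reset on restart, which is the usual trade-off of an in-memory limiter.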

The fact-check extraction caps at 50 claims per document. The per-claim refresh caps at 2 calls per claim. Both are admin-configurable. These are the structural defences against a single user racking up an unexpected bill.

11. Feedback loop

Every fact-check claim card has a **🚩 Report wrong verdict** button. Clicking it captures an optional free-text reason and POSTs to a feedback endpoint, which snapshots the current statement + verdict (so the admin sees what the user was looking at, even if the verdict later gets re-verified).

Submitted flags land in an admin queue at `/admin/feedback` with status `open`. The operator can mark each as `resolved` (concur with the user, plan to fix) or `ignored` (the verdict was correct, or the flag was spam), or delete it. The admin home shows a red badge with the open count.

There is currently no automatic action on the verdict when flags accumulate — every entry is reviewed manually. Future improvement: a 'disputed' badge in the public viewer when N flags pile up on the same claim.

12. Live mode (beta)

A separate pipeline at /live: the page captures audio from the user's microphone or, on Chromium browsers, from a browser tab via `getDisplayMedia`. The audio is streamed *directly from the browser to OpenAI's Realtime API* over WebRTC and never passes through the Jiddu server. The transcription model is `gpt-realtime-whisper`, and the transcription language follows the i18n language selector in the page header (EN / PT / ES).

How the wiring works: when the user presses **Start**, the browser calls `POST /api/live/token` to mint a short-lived ephemeral client secret (the Jiddu server holds the actual `OPENAI_API_KEY`; the browser never sees it). The browser then opens the WebRTC peer-connection to OpenAI using that secret and starts receiving transcript deltas. Every ~500 new characters (or every 30 seconds, whichever comes first), a separate SSE call to `POST /api/live/analyze` ships the cumulative transcript and runs the same fallacy prompt used everywhere else — findings stream back into the UI in near-real-time. On **Stop**, a final `finalize: true` call commits the session as a normal `/a/<id>` analysis, indistinguishable downstream from a pasted-text analysis.
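The "500 characters or 30 seconds" rule can be sketched as a small stateful trigger. The class name is hypothetical, and skipping the timed call when the transcript has not grown at all is an assumption of this example, not something the description states.

```typescript
// Decides when the cumulative transcript should be sent for analysis.
class AnalyzeTrigger {
  private lastLength = 0; // transcript length at the last analysis
  private lastAt: number; // ms timestamp of the last analysis
  private minChars: number;
  private maxWaitMs: number;

  constructor(startedAt: number, minChars = 500, maxWaitMs = 30000) {
    this.lastAt = startedAt;
    this.minChars = minChars;
    this.maxWaitMs = maxWaitMs;
  }

  shouldAnalyze(transcriptLength: number, now: number): boolean {
    const newChars = transcriptLength - this.lastLength;
    const enoughText = newChars >= this.minChars;
    // Assumption: a timed call is only worth making if something new was said.
    const waitedLongEnough = newChars > 0 && now - this.lastAt >= this.maxWaitMs;
    if (!enoughText && !waitedLongEnough) return false;
    this.lastLength = transcriptLength;
    this.lastAt = now;
    return true;
  }
}
```

The browser would call `shouldAnalyze` on every transcript delta and, when it returns true, send the cumulative transcript to the analyze endpoint.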

Cost is bounded by a hard 60-minute session cap (≈ $1 of Realtime + LLM analysis cost per session). The raw audio is never stored; only the transcript and the detected findings are persisted. The page is intentionally not linked from the nav while in beta, so reach it by typing `/live` in the URL bar. Tab-audio capture works in Chrome / Edge only because Safari and Firefox don't expose tab audio through `getDisplayMedia`; the microphone source works everywhere.

13. Open source and methodology

Source code is at github.com/rafaehlers/jiddu under the AGPL-3.0-or-later license. AGPL closes the 'SaaS loophole' that GPL leaves open: anyone running a modified version as a network service must offer the modified source to the users of that service.

Two papers ground the design: Claimify (Metropolitansky & Larson — Microsoft Research, 2025) for the four-stage claim extraction; and Distilling Expert Judgment at Scale (Goldfarb, Hall, Fisher, Salam, Wilde — Forum AI / Stanford, 2025) for the verdict assessment, the 6-tier source taxonomy and the neutrality framework. Jiddu is not affiliated with either group.

The full disclaimer about what Jiddu does and doesn't claim is at /legal.