How Jiddu works

Technical walkthrough of every pipeline. Nothing is hidden — the source code is open under AGPL-3.0 at github.com/rafaehlers/jiddu, so anything described below can be cross-checked against the actual implementation.

1. The big picture

Jiddu runs six independent analyses: a fallacy detector, a fact-check pipeline (extraction + Sonar-based verdict), a neutrality assessment, a paragraph-by-paragraph paper explainer, an AI-slop detector, and an adversarial review of one standalone paper with a published-literature check. They are separate routes in the app (/, /factcheck, /neutrality, /explain, /slop, /review) and separate database tables. None of them depends on the others.

The five text pipelines stream structured tool-call output from an LLM via OpenRouter; their JSON Schemas are controlled by Jiddu and partial-json lets findings appear while the model is still writing. Paper review also uses SSE, but to report five coarse stages; each stage completes and validates a full structured artifact before the next begins.

All output is in the user's UI language. The same prompt is used regardless of which model or language the user picks; the only language-specific part is a single OUTPUT LANGUAGE: line in the system prompt.

2. Input: ingestion and segmentation

URL. The fetch sends real-browser headers (current Chrome User-Agent, Accept-Language pt-BR, sec-ch-ua, sec-fetch-*) so naive bot-blockers don't reject it. The hostname of every redirect hop is DNS-resolved, and the hop is rejected if the IP falls inside RFC 1918, loopback, link-local, CGNAT, multicast or reserved IPv4/IPv6 ranges. That closes the SSRF door, including the AWS metadata IP (169.254.169.254) and the local Nginx Proxy Manager admin UI (port 81 on the VPS). The response body is capped at 5 MB. The HTML then passes through @mozilla/readability for article extraction.

PDF. Parsed locally with LiteParse as the primary parser and pdfjs-dist as fallback. The five text pipelines use LiteParse's fast mode; adversarial review uses academic mode to preserve page-scoped Markdown, spatial reading order, selective OCR, complexity signals and screenshots. The raw PDF is not uploaded as a file, but review sends selected rendered page images through OpenRouter when visual inspection is relevant.

Paste. Used as-is, only whitespace-normalized.
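
The redirect-hop check described for URL ingestion can be sketched for IPv4 as follows. This is a minimal illustration, not Jiddu's implementation: the names BLOCKED_V4, ipv4ToInt and isBlockedIPv4 are hypothetical, the range list is an assumption based on the categories named above, and IPv6 handling is left out.

```typescript
// Hypothetical sketch: names and the exact range list are assumptions.
// Each entry is [network address, prefix length] of an IPv4 block to refuse.
const BLOCKED_V4: Array<[string, number]> = [
  ["0.0.0.0", 8],      // "this network"
  ["10.0.0.0", 8],     // RFC 1918
  ["100.64.0.0", 10],  // CGNAT
  ["127.0.0.0", 8],    // loopback
  ["169.254.0.0", 16], // link-local, contains 169.254.169.254
  ["172.16.0.0", 12],  // RFC 1918
  ["192.168.0.0", 16], // RFC 1918
  ["224.0.0.0", 4],    // multicast
  ["240.0.0.0", 4],    // reserved
];

// Convert dotted-quad text to an unsigned 32-bit number; null when malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when the address is malformed or falls inside any blocked range.
function isBlockedIPv4(ip: string): boolean {
  const addr = ipv4ToInt(ip);
  if (addr === null) return true; // fail closed on anything unparseable
  return BLOCKED_V4.some(([base, prefix]) => {
    const baseInt = ipv4ToInt(base) as number;
    const size = Math.pow(2, 32 - prefix);
    return addr >= baseInt && addr < baseInt + size;
  });
}
```

The check has to run on the resolved address of every hop, not only the first URL, because a public host can redirect to an internal one.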

The extracted text is then split into sentences with an abbreviation-aware splitter (src/lib/sentences.ts) — it recognizes Portuguese abbreviations (sr., dr., prof.), keeps initialisms like U.S. intact, and doesn't break in the middle of numbers like 3,4%. Sentences are grouped into chunks of ~12k characters (src/lib/chunker.ts) so long documents fit the model context window without losing sentence boundaries.
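
A minimal sketch of sentence-preserving chunking, assuming a simple greedy packing strategy. The function name chunkSentences and the handling of oversized sentences are illustrative assumptions; the actual logic lives in src/lib/chunker.ts and may differ.

```typescript
// Greedy packing: whole sentences are appended to the current chunk until the
// next sentence would push it past the character budget.
function chunkSentences(sentences: string[], maxChars: number = 12000): string[][] {
  const chunks: string[][] = [];
  let current: string[] = [];
  let size = 0;
  for (const sentence of sentences) {
    if (current.length > 0 && size + sentence.length > maxChars) {
      chunks.push(current);
      current = [];
      size = 0;
    }
    // A sentence longer than the budget still gets a chunk of its own
    // rather than being cut in the middle (an assumption of this sketch).
    current.push(sentence);
    size += sentence.length;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Because the unit of packing is the sentence, a chunk boundary can never fall inside a sentence, which is the property the paragraph above describes.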

3. Models and fallback

Two models are exposed in the picker: 🇺🇸 GPT-5.4 mini (default) and 🇨🇳 MiniMax M3. They were picked from opposite geopolitical backgrounds so a user worried about training-data bias can run the same text through both and compare.

If the user-picked model fails *before emitting any event* (rate-limit from upstream, 5xx, etc.), the request falls through to the other model automatically — client-side retry, no UI flicker. Once the model has emitted at least one finding/claim, we don't retry: we accept the partial output and let the partial-parse layer salvage it.
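
The fallback rule can be sketched as a wrapper around an async stream. The Finding type, the ModelStream signature and the function name streamWithFallback are hypothetical; the sketch only illustrates the "retry only if nothing was emitted" rule.

```typescript
// Hypothetical types: a finding and a function that streams findings from one model.
type Finding = { id: string };
type ModelStream = (model: string) => AsyncIterable<Finding>;

async function* streamWithFallback(
  run: ModelStream,
  primary: string,
  secondary: string,
): AsyncGenerator<Finding> {
  let emitted = false;
  try {
    for await (const finding of run(primary)) {
      emitted = true;
      yield finding;
    }
    return; // primary finished normally
  } catch {
    // After the first event there is no retry: the partial output is kept.
    if (emitted) return;
  }
  // Primary failed before emitting anything: fall through to the other model.
  yield* run(secondary);
}
```

The emitted flag is what separates the two cases: a failure before the first event is invisible to the user, while a failure afterwards would duplicate or reorder findings if retried.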

Sonar Reasoning Pro is used exclusively for fact-check verdicts (because it has built-in web search). It is never exposed in the model picker.

4. Fallacy detector

Catalog of 46 fallacies covering relevance, ambiguity, presumption, emotional appeals, causal reasoning and evidence. Each fallacy has a stable id and a distinct color (golden-angle hue distribution) used to highlight passages. The full list is at /fallacies.
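
A golden-angle hue distribution can be sketched as below. The function names and the saturation and lightness values are assumptions for illustration; only the golden-angle idea comes from the description above.

```typescript
// The golden angle in degrees, roughly 360 * (1 - 1/phi).
const GOLDEN_ANGLE = 137.508;

// Each successive index advances the hue by the golden angle, wrapping at 360.
function hueForIndex(index: number): number {
  return (index * GOLDEN_ANGLE) % 360;
}

// Saturation and lightness are fixed, assumed values.
function colorForIndex(index: number): string {
  return "hsl(" + hueForIndex(index).toFixed(1) + ", 70%, 50%)";
}
```

Stepping by the golden angle keeps neighbouring catalog entries far apart on the colour wheel, and adding a new entry does not shift the hue of any existing one.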

The system prompt instructs the model to be rigorous and conservative — flag only when there is a clear, defensible fallacy. Two specific rules tighten this: a disambiguation clause (don't flag a sentence with multiple plausible readings unless context resolves it) and a standalone-explanation clause (each explanation must name the implied premise and conclusion so a reader doesn't need to re-read the source). Both clauses are distilled from the Claimify paper (Microsoft Research 2025).

Each finding has a binary severity (low / high). The overall fallacy score (0-100) is a weighted average across chunks, weighted by character count. The score range labels (clean reasoning / minor issues / several fallacies / highly fallacious / propagandistic) are spelled out in the prompt itself so the model has the same calibration the UI uses.
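
The character-weighted average can be written out as a short function. The ChunkScore shape, the function name and the rounding to an integer are assumptions of this sketch.

```typescript
// Hypothetical shape: one score per chunk plus the chunk's character count.
type ChunkScore = { chars: number; score: number };

// Overall score = sum(score * chars) / sum(chars), rounded to an integer.
function overallScore(chunks: ChunkScore[]): number {
  const totalChars = chunks.reduce((sum, c) => sum + c.chars, 0);
  if (totalChars === 0) return 0; // nothing to score
  const weighted = chunks.reduce((sum, c) => sum + c.score * c.chars, 0);
  return Math.round(weighted / totalChars);
}
```

For example, a 12,000-character chunk scoring 20 and a 4,000-character chunk scoring 60 give (240,000 + 240,000) / 16,000 = 30, not the unweighted mean of 40.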

5. Slop detector

Same architecture as the fallacy detector — same ingestion, chunking, streaming, model fallback and char-weighted 0-100 score — with a different catalog and prompt, persisted to its own table (results at /s/:id, same margin-comment viewer, print view and demos at /s/demo-en|pt|es).

The catalog covers 18 AI-slop writing patterns in five families: stock wording, manufactured drama, templated structure, hollow substance, and format tics. The full list with definitions and examples is at /slop-patterns. The pattern selection is based on Peter Yang's MIT-licensed no-ai-slop editing skill; the catalog wording, trilingual content and detection prompt are Jiddu's own.

The prompt detects patterns, not authorship — it is forbidden from claiming a text was or wasn't written by AI (humans write slop too). Every finding quotes the exact span, names the pattern, and says in a few words what a cleaner version would do (state the point, name the source, give the number) without rewriting anything. Score bands: 0-15 reads human, 16-35 light seasoning, 36-60 patterned, 61-80 heavily templated, 81-100 assembly-line prose.
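
The score bands translate directly into a lookup. The names SLOP_BANDS and slopBand are hypothetical, and the clamping and rounding of out-of-range input is an assumption of this sketch.

```typescript
// Upper bound (inclusive) and label for each band, in ascending order.
const SLOP_BANDS: Array<[number, string]> = [
  [15, "reads human"],
  [35, "light seasoning"],
  [60, "patterned"],
  [80, "heavily templated"],
  [100, "assembly-line prose"],
];

function slopBand(score: number): string {
  // Round and clamp into 0-100 before looking up the band.
  const clamped = Math.min(100, Math.max(0, Math.round(score)));
  for (const [upper, label] of SLOP_BANDS) {
    if (clamped <= upper) return label;
  }
  return "assembly-line prose"; // only reached when score is NaN
}
```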

6. Fact-check — claim extraction

Modelled on Claimify (Metropolitansky & Larson, MSR 2025). The prompt asks the model to walk every sentence through four stages: selection (keep only sentences with a verifiable claim — opinions, value judgements, rhetorical questions get dropped), disambiguation (drop sentences with multiple plausible readings when context doesn't resolve them), decomposition (split multi-fact sentences into separate claims without chasing infinite atomicity), and decontextualization (every emitted claim must stand alone — resolve pronouns, add the year for a number that depends on it, attribute a quote to its source).

Each emitted claim is typed: numeric, date, quote, causal, or categorical. The type informs how the verifier later searches for evidence.

Hard cap of 50 claims per extraction (admin-tunable). The cap exists because each claim later costs Sonar credits to verify; an uncapped extraction on a 30-page PDF could rack up serious money before the user notices.

7. Fact-check — per-claim verdict

Verdict is opt-in. Extraction alone is cheap (one LLM call). Verification is on demand: the user clicks Verify claims in the viewer and each claim is sent to Perplexity Sonar Reasoning Pro, one call per claim, with bounded concurrency (6 parallel by default).
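
Bounded concurrency of this kind is commonly built as a small worker pool. The sketch below is one such construction under that assumption; mapWithConcurrency is a hypothetical helper, and the worker callback stands in for the per-claim Sonar call.

```typescript
// Runs worker over all items with at most `limit` calls in flight at once.
// Results are returned in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With 50 extracted claims and a limit of 6, at most six verification calls are open at any moment, and a new one starts as soon as any of them finishes. In this sketch a rejected worker call rejects the whole batch; per-claim error handling would go inside the worker.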

Sonar returns a JSON payload with the verdict (supported / contradicted / mixed / unverified), a confidence between 0 and 1, a rationale, and a list of cited sources collected from Sonar's annotations channel. If confidence is below 0.5, the verdict is forced to unverified — better to surface 'not enough evidence' than to ship a confident wrong call.
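
The confidence gate amounts to one comparison. The SonarResult shape and the function name gateVerdict are hypothetical; the field names are assumptions based on the payload described above.

```typescript
type Verdict = "supported" | "contradicted" | "mixed" | "unverified";

// Hypothetical shape of the parsed verifier payload (sources omitted).
interface SonarResult {
  verdict: Verdict;
  confidence: number; // between 0 and 1
  rationale: string;
}

// Downgrades any verdict to "unverified" unless confidence reaches the threshold.
// The negated comparison also downgrades a NaN confidence.
function gateVerdict(result: SonarResult, minConfidence: number = 0.5): SonarResult {
  if (!(result.confidence >= minConfidence)) {
    return { ...result, verdict: "unverified" };
  }
  return result;
}
```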

Every cited source is classified into one of six tiers (per Goldfarb et al. 2025): primary (official data agencies, court rulings, peer-reviewed papers), scholarly, think_tank, journalistic, commercial_content, informal. A separate boolean flag marks state-controlled outlets (RT, CGTN, Xinhua, etc. — but not BBC/NPR/Deutsche Welle, which are state-FUNDED but editorially independent).

Verdict language is intentionally cautious about named people: never 'X is lying' or 'this is false' — always 'evidence does not support' / 'evidence contradicts the figure' / 'we could not find evidence that supports'. This is a hard rule in the system prompt.

How accurate is this in practice? We benchmarked the pipeline against PolitiFact's human-labeled corpus. On 200 claims drawn from PolitiFact's four polar buckets (true / mostly-true / false / pants-on-fire, 2020 onward), Jiddu's verdict matched the human polar verdict in 67.5% of cases overall and 81.3% on the unambiguous true / false / pants-on-fire buckets. Strict polar disagreement, where Jiddu calls a claim the opposite of the human label, happened in only 4.5% of cases (9 of 200). In the remaining 28%, Jiddu returned mixed or unverified instead of a polar verdict; these were concentrated in the mostly-true bucket, where 58% became mixed (a structural overlap, not a miss, since both labels describe partial truth). The full methodology, confusion matrix and disagreement analysis are published with the benchmark.

8. Fact-check — verdict cache

Each verified claim is hashed and stored. The hash is sha256(lang + ':' + normalize(statement)), where normalize lowercases, strips combining-mark accents, collapses whitespace and removes trailing punctuation. Copies of the same claim that differ only in case, accents, spacing or trailing punctuation therefore share a cache entry; a genuine rewording, or the same claim in another language, does not.

TTLs are per claim type: numeric 30 days (statistical agencies such as IBGE revise their series), date and categorical 365 days (historical facts and definitions are stable), quote and causal 90 days (re-examinable but unlikely to flip). A best-effort sweep deletes expired rows on the cache-miss path, rate-limited to once every 30 minutes so the hot path stays cheap.
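
The cache key input and the per-type TTL check can be sketched as follows. All names are hypothetical, the set of trailing punctuation characters is an assumption, and the sha256 step is omitted to keep the sketch dependency-free: the stored key would be the SHA-256 digest of the string returned by cacheKeyInput.

```typescript
type ClaimType = "numeric" | "date" | "quote" | "causal" | "categorical";

// TTLs in days, as listed above.
const TTL_DAYS: Record<ClaimType, number> = {
  numeric: 30,
  date: 365,
  categorical: 365,
  quote: 90,
  causal: 90,
};

// Lowercase, strip combining accents, collapse whitespace, drop trailing punctuation.
function normalizeStatement(statement: string): string {
  return statement
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "") // combining diacritical marks
    .replace(/\s+/g, " ")
    .trim()
    .replace(/[.,;:!?]+$/, "") // assumed punctuation set
    .trim();
}

// The string that would be fed to sha256.
function cacheKeyInput(lang: string, statement: string): string {
  return lang + ":" + normalizeStatement(statement);
}

// True when a cached verdict of this type is older than its TTL.
function isExpired(type: ClaimType, storedAtMs: number, nowMs: number): boolean {
  return nowMs - storedAtMs > TTL_DAYS[type] * 24 * 60 * 60 * 1000;
}
```

Including lang in the key input is what keeps a Portuguese and an English verdict for the same claim in separate cache entries.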

The user can bypass the cache with a per-claim refresh button (forces a fresh Sonar call). That counts against an admin-tunable refresh quota — default is 2 refreshes per claim. After that, the button hides and the API returns 429.

9. Neutrality

Distinct from fallacy detection: a text can be fallacy-free and factually accurate while still using loaded framing, asymmetric attribution or one-sided sourcing. Neutrality surfaces those patterns.

Eight issue types are flagged at the sentence level: loaded_language, asymmetric_attribution, false_equivalence, selective_omission, source_asymmetry, both_sidesing, steering, genetic_framing. The model also classifies the document into one of seven text types (informational / opinion / political speech / advocacy / analysis / marketing / social_media) which informs how strict the audit is.

The document-level verdict is one of neutral / partisan_lean / explicitly_partisan / manipulation. Jiddu deliberately does not classify partisan direction (left/right, pro-X/anti-Y). The Forum AI paper that inspired the methodology requires a bipartisan expert panel for that step; without one, partisan labels are unreliable and harmful. We surface specific patterns the reader can audit instead.

10. Storage, privacy and reviewers

Persistence is a single SQLite file on a Hostinger VPS, mounted as a Docker volume. There are no user accounts. Results get short ids under /a, /f, /n, /e, /s or /r; anyone with the share URL can view the final result. Paper-review ledgers, research and draft judgements remain private intermediate artifacts.

Submitted content goes through OpenRouter to the upstream serving the chosen model. Paper review includes selected rendered page images, not only extracted text. Its literature stage searches public scholarly sources using technical claims while prompts prohibit queries by paper title, author name or other manuscripts by the same authors. Venue autocomplete sends only the typed venue query to DBLP. We log the request IP for rate-limiting and abuse signalling and do not sell it.

Each analysis has a reviewedAt timestamp. By default it's null, meaning the analysis is automated and has not been spot-checked by an operator. The admin (at /admin, password-gated) can mark individual analyses as reviewed.

11. Cost and abuse controls

Every cost-incurring route has an in-memory token-bucket rate limit per IP. Defaults include 30 fallacy analyses / hour, 30 fact-check extractions / hour, 10 bulk verifications / hour, 20 per-claim refreshes / hour and a deliberately conservative 4 paper reviews / hour with burst capacity 2. All values are admin-tunable at runtime via /admin/settings; exhausted buckets return 429 with Retry-After.
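
A per-IP token bucket can be sketched as below, using the paper-review defaults (4 per hour, burst 2) as the example. The class and function names are hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow; a real limiter would read the current time itself.

```typescript
class TokenBucket {
  private tokens: number;
  private last: number;
  private readonly capacity: number;
  private readonly msPerToken: number;

  constructor(capacity: number, msPerToken: number, nowMs: number) {
    this.capacity = capacity;
    this.msPerToken = msPerToken;
    this.tokens = capacity; // start full so a burst is allowed immediately
    this.last = nowMs;
  }

  // Returns 0 when the request is allowed, otherwise the Retry-After value in seconds.
  take(nowMs: number): number {
    const refill = (nowMs - this.last) / this.msPerToken;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.last = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) * this.msPerToken) / 1000);
  }
}

// In-memory state: one bucket per client IP.
const buckets = new Map<string, TokenBucket>();

// Burst capacity 2, one new token every 15 minutes (4 per hour).
function checkPaperReviewLimit(ip: string, nowMs: number): number {
  let bucket = buckets.get(ip);
  if (!bucket) {
    bucket = new TokenBucket(2, 3600000 / 4, nowMs);
    buckets.set(ip, bucket);
  }
  return bucket.take(nowMs);
}
```

A non-zero return value maps to a 429 response with that number in the Retry-After header. Because the state is in memory, the buckets reset when the process restarts.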

Fact-check extraction caps at 50 claims per document and per-claim refresh at 2 calls. Paper review accepts exactly one PDF, refuses to silently truncate past 300,000 extracted characters, limits visual inspection to selected pages and splits the costly work into auditable stages. These controls prevent accidental unbounded fan-out and bills.

12. Feedback loop

Every fact-check claim card has a 🚩 Report wrong verdict button. Clicking it captures an optional free-text reason and POSTs to a feedback endpoint, which snapshots the current statement + verdict (so the admin sees what the user was looking at, even if the verdict later gets re-verified).

Submitted flags land in an admin queue at /admin/feedback with status open. The operator can mark each as resolved (concur with the user, plan to fix) or ignored (the verdict was correct, or the flag was spam), or delete it. The admin home shows a red badge with the open count.

There is currently no automatic action on the verdict when flags accumulate — every entry is reviewed manually. Future improvement: a 'disputed' badge in the public viewer when N flags pile up on the same claim.

13. Live mode (beta)

A separate pipeline at /live: the page captures audio from the user's microphone or, on Chromium browsers, from a browser tab via getDisplayMedia. The audio is streamed *directly from the browser to OpenAI's Realtime API* over WebRTC and never passes through the Jiddu server. The transcription model is gpt-realtime-whisper, and the transcription language follows the language selector in the page header (EN / PT / ES).

How the wiring works: when the user presses Start, the browser calls POST /api/live/token to mint a short-lived ephemeral client secret (the Jiddu server holds the actual OPENAI_API_KEY; the browser never sees it). The browser then opens the WebRTC peer-connection to OpenAI using that secret and starts receiving transcript deltas. Every ~500 new characters (or every 30 seconds, whichever comes first), a separate SSE call to POST /api/live/analyze ships the cumulative transcript and runs the same fallacy prompt used everywhere else — this analysis step is a different call from the OpenAI transcription above, routed via OpenRouter with the same GPT-5.4 mini / MiniMax M3 picker as the text-analysis pipelines. Findings stream back into the UI in near-real-time. On Stop, a final finalize: true call commits the session as a normal /a/<id> analysis, indistinguishable downstream from a pasted-text analysis.
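
The "every ~500 new characters or every 30 seconds" rule can be sketched as a small trigger object. The class name and the decision to skip a timed call when no new text has arrived are assumptions of this sketch, not documented behaviour.

```typescript
class AnalyzeTrigger {
  private lastLength = 0;
  private lastAt: number;
  private readonly minNewChars: number;
  private readonly maxWaitMs: number;

  constructor(startMs: number, minNewChars: number = 500, maxWaitMs: number = 30000) {
    this.lastAt = startMs;
    this.minNewChars = minNewChars;
    this.maxWaitMs = maxWaitMs;
  }

  // Call on every transcript delta; true means "send the cumulative transcript now".
  shouldAnalyze(transcriptLength: number, nowMs: number): boolean {
    const newChars = transcriptLength - this.lastLength;
    const waited = nowMs - this.lastAt;
    const enoughText = newChars >= this.minNewChars;
    // Assumption: a timed call is skipped when nothing new was transcribed.
    const timedOut = waited >= this.maxWaitMs && newChars > 0;
    if (enoughText || timedOut) {
      this.lastLength = transcriptLength;
      this.lastAt = nowMs;
      return true;
    }
    return false;
  }
}
```

The character threshold keeps fast speech from waiting for the timer, while the timer keeps slow speech from waiting indefinitely for 500 characters.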

Cost is bounded by a hard 60-minute session cap (≈ $1 of Realtime + LLM analysis cost per session). The raw audio is never stored; only the transcript and the detected findings are persisted. The page is intentionally not linked from the nav while in beta, so reach it by typing /live in the URL bar. Tab-audio capture works in Chrome / Edge only because Safari and Firefox don't expose tab audio through getDisplayMedia; the microphone source works everywhere.

14. Explain — paper explainer

The Explain pipeline (/explain, results at /e/:id) takes an academic paper or technical document and produces a paragraph-by-paragraph plain-language explanation. Each paragraph gets a role tag (Background / Objective / Method / Result / Interpretation / Limitation / Conclusion) and an explanation written at the chosen audience level. A whole-document overview is generated first.

Audience levels: Child (ELI5, everyday analogies, no jargon), High school (clear language, a few technical terms explained), Undergraduate (standard academic register, assumes field familiarity), Expert (concise, peer-level, full terminology retained). The level controls prose complexity only — it does not filter or validate content.

Role tagging is automated. The model reads each paragraph in context and assigns the role that best describes its function in the argument. Interdisciplinary papers or documents with non-standard structure may receive incorrect tags.

Unlike the general-text pipelines, Explain is optimised for structured academic papers. Results on blog posts, legal filings, or unstructured prose will be lower quality.

15. Adversarial paper review

The review pipeline (/review, results at /r/:id) accepts exactly one final PDF as a standalone manuscript, plus target venue, edition, track, paper type and optional review date. It does not combine supplements or multiple files. The public result includes the gatekeeping recommendation, summary, strengths, localized weaknesses, questions, reproducibility checklist, references and the exact venue standard applied; Markdown and print/PDF exports use the same stored final review.

It runs five private stages: ledger (page-tagged claims, evidence, experiments and visual-page inspection), published-literature research, structured synthesis, adversarial self-critique, and deterministic finalization. The self-critique must challenge unsupported novelty claims, missed evidence, severity inflation or understatement and inconsistency with the target venue before any draft becomes final.

Venue autocomplete combines account-free DBLP discovery with versioned, verified 2026 profiles for NeurIPS, ICML, ICLR, AAAI, IJCAI and CVPR. A verified profile injects its official criteria and requirements into both synthesis and self-critique and is persisted with source URLs. A DBLP-only or unmatched venue uses a clearly labeled general scholarly rubric; edition, track or paper-type mismatches are marked unverified rather than guessed.

Every substantive weakness must carry a manuscript page or section and a MAJOR, MODERATE or MINOR severity. The review is still automated: it can miss problems, invent comparisons or misapply a criterion, and it is not an official peer review or acceptance prediction.

16. Public API & MCP (for agents)

Beyond the web app, the five text pipelines are exposed as a public REST API for agents and scripts: POST /api/v1/detect-fallacies, /fact-check, /assess-neutrality, /explain, /detect-slop. These are synchronous — one JSON request in ({ text }, 40–2000 chars, plus optional lang and, for explain, level), one JSON response out. They drive the same generators to completion over a single chunk and never persist a row.

Requests authenticate with an X-Jiddu-Key header. Keys are admin-issued at /admin/api-keys (stored only as a sha256 hash, shown once) and rate-limited per key, separately from the web app's per-IP limits. Fact-check via the API both extracts and verifies each claim, capped by a per-request claim limit so one call can't fan out unboundedly to the paid Sonar verifier.
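
A client-side sketch of preparing a request for one of the text endpoints. The helper name prepareTextRequest, the baseUrl parameter, the Content-Type header and the use of string length for the 40-2000 character check are assumptions; the response shape is not shown because it is documented at /api-access rather than here.

```typescript
type TextTool =
  | "detect-fallacies"
  | "fact-check"
  | "assess-neutrality"
  | "explain"
  | "detect-slop";

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Validates the text length locally and builds the URL, headers and JSON body.
function prepareTextRequest(
  baseUrl: string, // assumed: the origin where Jiddu is hosted
  tool: TextTool,
  apiKey: string,
  text: string,
  lang?: string,
): PreparedRequest {
  if (text.length < 40 || text.length > 2000) {
    throw new RangeError("text must be 40-2000 characters");
  }
  const payload: { text: string; lang?: string } = { text };
  if (lang !== undefined) payload.lang = lang;
  return {
    url: baseUrl.replace(/\/+$/, "") + "/api/v1/" + tool,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", "X-Jiddu-Key": apiKey },
      body: JSON.stringify(payload),
    },
  };
}
```

The returned object can be passed straight to fetch(request.url, request.init). Checking the length before sending avoids spending a rate-limited call on a request the server would reject anyway.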

Adversarial review is exposed as a persisted asynchronous job: POST /api/v1/adversarial-reviews accepts one uploaded PDF or a public pdf_url and returns 202; GET /api/v1/adversarial-reviews/{id} reports the current stage and eventually returns the structured review. This keeps the multi-minute, five-stage workflow out of the synchronous text contract.

The hosted MCP server at POST /api/mcp exposes the five text tools plus start_adversarial_review and get_adversarial_review. Any MCP client can connect by URL through mcp-remote; the same tools also ship in the standalone jiddu-mcp package. Setup snippets and machine-oriented docs are at /api-access.

Quality is tracked by a fixed eval suite (15–16 cases per tool) committed in the repo and scored against expected results; the published report lives at evals/report.md.

17. Open source and methodology

Source code is at github.com/rafaehlers/jiddu under the AGPL-3.0-or-later license. AGPL closes the 'SaaS loophole' that GPL leaves open: anyone running a modified version as a network service must publish the modified source.

Claimify and Distilling Expert Judgment at Scale ground the fact-check and neutrality design. Paper review additionally uses LiteParse's layout-aware local extraction, DBLP venue discovery and edition-specific criteria transcribed from official venue pages. Jiddu is not affiliated with those groups, DBLP or the venues.

The full disclaimer about what Jiddu does and doesn't claim is at /legal.