Causal Performance Analyzer
Documentation

How to read the report

A guide to every panel, score, and metric in Causal Performance Analyzer.

Overview

Causal Performance Analyzer is a six-phase forensics pipeline that takes a URL, runs Google PageSpeed Insights against it, and then systematically works out why the page is slow — distinguishing root causes from symptoms and assigning a confidence score to each diagnosis.

The pipeline runs through these phases in order:

Phase 1 · PSI: Fetch raw Lighthouse + CrUX data
Phase 2 · Facts: Parse into typed evidence objects
Phase 3 · Hypotheses: AI proposes causal explanations
Phase 4 · Reasoning: Deterministic confidence scoring
Phase 5 · Causal Graph: Map causes to symptoms
Phase 6 · Report: Actionable verdict + fixes

Phases 4–6 are fully deterministic — no AI is involved after hypothesis generation. The same facts will always produce the same confidence scores and the same report.
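As a rough sketch, the phase sequence and the deterministic boundary could be modelled like this (the names are illustrative, not the engine's actual identifiers):

```typescript
// Hypothetical sketch of the six-phase pipeline as a typed sequence.
type Phase =
  | "psi" | "facts" | "hypotheses"            // data + AI phases
  | "reasoning" | "causal-graph" | "report";  // deterministic phases

const PIPELINE: Phase[] = [
  "psi", "facts", "hypotheses", "reasoning", "causal-graph", "report",
];

// Everything from "reasoning" onward is deterministic: the same facts
// always produce the same scores and the same report.
const DETERMINISTIC_FROM = PIPELINE.indexOf("reasoning");

function isDeterministic(phase: Phase): boolean {
  return PIPELINE.indexOf(phase) >= DETERMINISTIC_FROM;
}
```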

Lighthouse Report

This panel shows the raw data returned by Google PageSpeed Insights before any causal analysis is applied. It is the ground truth for everything that follows.

Performance score

The large circle shows Lighthouse's overall performance score (0–100). It is a weighted average of scores derived from the six lab metrics below and is colour-coded:

90–100 = Good
50–89 = Needs work
0–49 = Poor

The score reflects a throttled mobile simulation (or desktop if you switched the strategy). It may differ from your site's real-world experience — that is exactly what the Field Data section below captures.
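The colour bands above can be expressed as a small helper (a sketch; the band names come from this page, the function name is an assumption):

```typescript
// Maps a Lighthouse performance score (0-100) to the colour band
// described above: 90-100 Good, 50-89 Needs work, 0-49 Poor.
type Band = "Good" | "Needs work" | "Poor";

function scoreBand(score: number): Band {
  if (score >= 90) return "Good";
  if (score >= 50) return "Needs work";
  return "Poor";
}
```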

Core Web Vitals · Lab

Six metrics measured in the simulated Lighthouse environment:

FCP — First Contentful Paint
Time until the first text or image appears. High FCP often points to render-blocking resources or a slow server.
LCP — Largest Contentful Paint
Time until the largest visible element (hero image, heading) is rendered. The primary user-perceived load metric.
SI — Speed Index
How quickly the page visually fills in. A high SI with a normal LCP suggests the page loads piecemeal.
TTI — Time to Interactive
When the page becomes reliably interactive. Elevated TTI indicates heavy JavaScript execution.
TBT — Total Blocking Time
Sum of all main-thread blocking time between FCP and TTI. The lab proxy for INP / responsiveness.
CLS — Cumulative Layout Shift
Measures visual instability — elements shifting around after initial render. A score > 0.1 is user-visible.
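For orientation, the six lab metrics form a simple typed record. This shape is illustrative only; the engine's actual field names and types may differ:

```typescript
// Illustrative shape for the six lab metrics. Times are in
// milliseconds; CLS is a unitless score.
interface LabMetrics {
  fcp: number;  // First Contentful Paint (ms)
  lcp: number;  // Largest Contentful Paint (ms)
  si: number;   // Speed Index (ms)
  tti: number;  // Time to Interactive (ms)
  tbt: number;  // Total Blocking Time (ms)
  cls: number;  // Cumulative Layout Shift (unitless)
}

// Example values; a CLS below 0.1 is not user-visible.
const example: LabMetrics = {
  fcp: 1200, lcp: 2800, si: 3400, tti: 4100, tbt: 250, cls: 0.08,
};
```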

Field Data (CrUX P75)

Real-user measurements from Chrome UX Report at the 75th percentile — meaning 75% of real users experienced this metric or better. When field data differs significantly from lab data, it usually means real users are on slower networks or devices than the simulated environment.

CrUX categories map to ratings: FAST = Good, AVERAGE = Needs improvement, SLOW = Poor.

Field data may be absent for low-traffic URLs (insufficient CrUX data) or for pages behind login walls. The engine notes this as a gap in the report.
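The category-to-rating mapping, including the missing-data case, might look like this (function and type names are assumptions for illustration):

```typescript
// Hypothetical mapping from CrUX categories to the ratings shown in
// the UI: FAST = Good, AVERAGE = Needs improvement, SLOW = Poor.
type CruxCategory = "FAST" | "AVERAGE" | "SLOW";
type Rating = "Good" | "Needs improvement" | "Poor";

const CRUX_RATING: Record<CruxCategory, Rating> = {
  FAST: "Good",
  AVERAGE: "Needs improvement",
  SLOW: "Poor",
};

// Field data may be absent entirely for low-traffic URLs.
function fieldRating(category?: CruxCategory): Rating | "No field data" {
  return category ? CRUX_RATING[category] : "No field data";
}
```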

Opportunities to improve

Lighthouse audits that scored below 0.9, sorted by severity (worst first). Each shows the audit title and the potential savings. These are the raw signals that feed into the hypothesis generation step.
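The selection rule described above (score below 0.9, worst first) can be sketched as follows; the Audit shape is an assumption, not the engine's real type:

```typescript
// Illustrative audit record; savingsMs is the audit's estimated
// potential savings in milliseconds.
interface Audit { id: string; title: string; score: number; savingsMs: number; }

// Keep only failing audits (score < 0.9) and sort worst-first.
function opportunities(audits: Audit[]): Audit[] {
  return audits
    .filter((a) => a.score < 0.9)
    .sort((a, b) => a.score - b.score); // lowest score = worst = first
}
```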

Hypotheses & Symptoms

What is a hypothesis?

A hypothesis is a proposed root cause for the observed performance problems. It is not a measurement — it is a causal claim: "the reason the LCP is slow is X". The AI generates several competing hypotheses at once so the reasoning engine can evaluate and score them against the evidence.

Detected symptoms

Before generating hypotheses the system detects which performance symptoms are present based on the facts:

slow-lcp / very-slow-lcp
LCP exceeds 2.5 s / 4 s.
slow-fcp
FCP exceeds 1.8 s.
high-tbt / very-high-tbt
TBT exceeds 200 ms / 600 ms.
poor-cls / very-poor-cls
CLS exceeds 0.1 / 0.25.
high-tti
TTI exceeds 3.8 s.
slow-server
TTFB exceeds 200 ms.
render-blocking-detected
Render-blocking resources found.
third-party-blocking
Third-party scripts block the main thread.
excessive-unused-bytes
Large amounts of unused JS or CSS.
field-worse-than-lab
Real users experience worse metrics than the lab.
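Using the thresholds listed above, symptom detection reduces to simple comparisons. This is a sketch: whether the paired symptoms (e.g. slow-lcp vs. very-slow-lcp) stack or escalate is an assumption here (escalate), and the function name is illustrative:

```typescript
// Threshold-based symptom detection. All inputs in ms except cls.
function detectSymptoms(m: {
  lcp: number; fcp: number; tbt: number; cls: number; tti: number; ttfb: number;
}): string[] {
  const s: string[] = [];
  if (m.lcp > 4000) s.push("very-slow-lcp");
  else if (m.lcp > 2500) s.push("slow-lcp");
  if (m.fcp > 1800) s.push("slow-fcp");
  if (m.tbt > 600) s.push("very-high-tbt");
  else if (m.tbt > 200) s.push("high-tbt");
  if (m.cls > 0.25) s.push("very-poor-cls");
  else if (m.cls > 0.1) s.push("poor-cls");
  if (m.tti > 3800) s.push("high-tti");
  if (m.ttfb > 200) s.push("slow-server");
  return s;
}
```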

Hypothesis fields

Cause
The proposed root cause. A fixed vocabulary of 13 values (e.g. render-blocking-resources, server-latency, third-party-scripts).
Plausibility
The AI's initial assessment before scoring: high, medium, or low. This is an estimate only — the reasoning engine may rank a "low" hypothesis above a "high" one after applying evidence.
Rationale
The AI's plain-English explanation for why it proposed this cause. Read this to understand the reasoning, not just the verdict.
Explains
Which detected symptoms this hypothesis accounts for. A hypothesis that explains more symptoms is generally preferred.
Missing evidence
What additional data would make this diagnosis more or less certain. Use these as a checklist for further investigation.
Evidence IDs
The fact IDs from the evidence collection that support this hypothesis.
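Taken together, the fields above suggest a record shape like the following. The property names are assumptions; the engine's actual types may differ:

```typescript
// Illustrative shape mirroring the hypothesis fields described above.
interface Hypothesis {
  cause: string;                            // one of the 13 fixed cause values
  plausibility: "high" | "medium" | "low";  // AI's initial estimate
  rationale: string;                        // plain-English explanation
  explains: string[];                       // detected symptom ids
  missingEvidence: string[];                // data that would firm up the diagnosis
  evidenceIds: string[];                    // supporting fact ids
}
```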

Validation

Every hypothesis is validated before being shown. Hypotheses that are logically impossible given the evidence are rejected automatically — for example, a lazy-lcp-image hypothesis is rejected if the LCP element does not have loading="lazy". Rejected hypotheses are not shown in the UI.
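The lazy-LCP-image rule from the example above could be sketched like this (the fact shape and function name are hypothetical):

```typescript
// One validation rule: a lazy-lcp-image hypothesis is only possible
// if the LCP element actually carries loading="lazy".
interface LcpFact { lcpElementLoading?: "lazy" | "eager"; }

function isPossible(cause: string, fact: LcpFact): boolean {
  if (cause === "lazy-lcp-image") {
    return fact.lcpElementLoading === "lazy";
  }
  return true; // other causes: no hard rule in this sketch
}
```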

Reasoning Engine

The reasoning engine takes the validated hypotheses and scores them using a set of deterministic rules against the fact evidence. No AI is involved.

Confidence score

Each hypothesis starts at a base confidence of 0.40 (40%). Rules then apply boosts (positive evidence) and penalties (contradicting evidence) to produce a final confidence score.

≥ 65% = Strong signal
40–64% = Moderate signal
< 40% = Weak signal

A hypothesis is declared the dominant cause if it reaches ≥ 50% confidence and leads all other hypotheses by at least 15 percentage points. If no hypothesis clears this bar the verdict is inconclusive.
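A minimal sketch of this scheme, assuming signals are signed confidence deltas (the function names and the clamping to [0, 1] are assumptions):

```typescript
// Base confidence of 0.40, adjusted by signed signal deltas
// (boosts positive, penalties negative), clamped to [0, 1].
const BASE_CONFIDENCE = 0.4;

function score(signals: number[]): number {
  const raw = signals.reduce((acc, d) => acc + d, BASE_CONFIDENCE);
  return Math.min(1, Math.max(0, raw));
}

// Dominant cause: >= 50% confidence AND a lead of >= 15 percentage
// points over every other hypothesis; otherwise inconclusive.
function dominant(confidences: number[]): number | null {
  const sorted = [...confidences].sort((a, b) => b - a);
  const [best, second = 0] = sorted;
  if (best >= 0.5 && best - second >= 0.15) {
    return confidences.indexOf(best); // index of the dominant hypothesis
  }
  return null; // inconclusive
}
```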

Boosts and penalties

Each small tag under the confidence bar shows one scoring signal. Green tags are boosts (evidence supports this cause); red tags are penalties (evidence contradicts it). Hover any tag to read the full rationale.

A hypothesis with many small boosts can outscore one with a single large boost if the evidence is consistent. Look at the full signal set, not just the total.

Elimination

Hypotheses can be eliminated — removed from contention — by hard rules such as missing a required fact, or by scoring so low that no evidence can recover them. Eliminated causes are listed with their reason at the bottom of the panel.

Unexplained symptoms

If a detected symptom is not explained by any non-eliminated hypothesis, it is flagged as unexplained. This usually means the root cause is outside the 13 known causes, or that key diagnostic data is missing.

Causal Graph

The causal graph is a bipartite view of the relationships between causes and symptoms. It is built directly from the reasoning result — no additional AI calls.

Reading the graph

Causes (left column)
Non-eliminated hypotheses with confidence ≥ 20%. The dominant cause has a green border. The bar shows confidence percentage.
Symptoms (right column)
Performance symptoms detected from the evidence. Unexplained symptoms have an amber border and a ⚠ label.
Arrows (→)
An edge from cause A to symptom B means: if A is the root cause, it explains symptom B. Edge weight equals the cause's confidence.
← attribution
Under each symptom, you can see which causes claim to explain it. Multiple causes explaining the same symptom indicates ambiguity.
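The edge list behind this view can be derived mechanically from the reasoning result; a sketch, with illustrative type names:

```typescript
// Bipartite graph: one edge per (cause, explained symptom) pair,
// weighted by the cause's confidence. Causes below 20% are not shown.
interface Cause { id: string; confidence: number; explains: string[]; }
interface Edge { from: string; to: string; weight: number; }

function buildEdges(causes: Cause[]): Edge[] {
  return causes
    .filter((c) => c.confidence >= 0.2)
    .flatMap((c) =>
      c.explains.map((symptom) => ({ from: c.id, to: symptom, weight: c.confidence }))
    );
}
```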

Engineering Report

The engineering report is a deterministic summary derived entirely from the reasoning result. It is the primary deliverable for a developer or engineering team.

Verdict

The verdict is either identified or inconclusive.

Identified: one hypothesis cleared the 50% confidence threshold and leads all others by at least 15 percentage points. The summary names the dominant cause and its confidence level.

Inconclusive: no hypothesis reached the threshold. The summary shows the highest confidence achieved and suggests collecting more telemetry (HAR files, RUM data, field traces).

Evidence chain

An ordered list of the scoring signals that contributed most to the dominant cause's score. Green tokens are boosts; red tokens are penalties. The number is the confidence delta in percentage points. This is the audit trail — it shows exactly why the engine reached its conclusion.

Alternatives

Causes that scored ≥ 20% but were not dominant. These are not ruled out — they may be contributing factors alongside the primary cause. Investigate them if fixing the dominant cause does not fully resolve the performance issue.

Gaps & Uncertainties

Missing evidence
Data that would increase diagnostic confidence. Typically HAR files, server timing headers, RUM session traces, or WebPageTest results.
Unexplained symptom
A performance symptom that no hypothesis accounts for. Indicates an unknown contributing factor.
Low confidence
The overall diagnostic confidence is too low to act on. Collect additional telemetry before making changes.

Recommendations

Actionable fixes for the dominant cause, tagged by priority:

HIGH
Fix this first. Direct cause of the bottleneck.
MED
Significant improvement, lower effort.
LOW
Nice-to-have. Address after HIGH and MED items.

PDF Download

The ↓ Download PDF Report button (shown after hypothesis generation) opens a new browser tab containing a formatted print-ready document and triggers the browser's native Print dialog. Use Print → Save as PDF to save it.

Allow pop-ups from this site for the download to work.

The PDF contains all eight sections of the analysis:

  1. Cover — URL, performance score, verdict, and date
  2. Core Web Vitals — lab metrics grid, field data table, failing audits
  3. Detected symptoms with rationale
  4. AI-generated hypotheses with full rationale and missing evidence
  5. Confidence scoring — bars and signal breakdown for each hypothesis
  6. Verdict and evidence chain
  7. Actionable recommendations by priority
  8. Alternative causes, gaps, and appendix with all raw facts

The PDF is entirely self-contained — no internet connection is required to view it and it does not depend on any external fonts or CDN resources.

Configuration

All user-facing strings and defaults are in src/config.ts. Edit that file to rebrand the app or change defaults without touching any component or engine code.

name
Product name shown in the header, browser tab, and PDF cover.
tagline
Short subheading shown below the product name.
defaultUrl
URL pre-filled in the analysis input on first load.
defaultStrategy
Mobile or desktop — the pre-selected Lighthouse strategy.
docsUrl
Where the "How to read this →" link in the header points. Use '/docs' for this page or replace with an external URL.
Restart the dev server after editing src/config.ts; for production, rebuild so the changes are baked into the bundle.
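A minimal src/config.ts might look like the following; the values shown are examples, not the defaults shipped with the app:

```typescript
// Illustrative config object; exported from src/config.ts in the
// real app. All values here are placeholders.
const config = {
  name: "Causal Performance Analyzer",   // header, tab title, PDF cover
  tagline: "Root-cause analysis for slow pages",
  defaultUrl: "https://example.com",     // pre-filled on first load
  defaultStrategy: "mobile" as "mobile" | "desktop",
  docsUrl: "/docs",                      // "How to read this →" target
};
```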