
Factagora
Jun 14, 2024
A Source-of-Truth–Driven Approach Leveraging SEC Filings and Financial Statements to Deliver Reliable Insights into Insurance Underwriting Performance
Overview
Factagora is transforming how financial professionals work with complex data.
By combining FactBlock and Source of Truth (SoT) technologies, we’ve turned generative AI into a faster, more accurate, and fully verifiable analysis tool.
In a real-world case study in the U.S. insurance industry, we structured five years of SEC filings and macroeconomic indicators into FactBlocks, aligned them with NAIC’s core underwriting metrics, and established a domain-specific analytical framework.
Powered by DeepVerify, the system generated AI outputs that were source-backed, explainable, and trustworthy.
The result: a Z-score of +1.31 across accuracy, relevance, and effectiveness—significantly outperforming GPT-4o and GPT-4o-mini.
Factagora helps high-trust industries move faster—with confidence.
The Problem
Public financial documents such as SEC filings present significant challenges for investors. A single 10-K report can exceed 200 pages, making it difficult to navigate. These filings are also highly complex—filled with domain-specific language and key metrics dispersed across multiple sections.
Even advanced general-purpose large language models (LLMs), such as GPT-4o, often struggle to interpret them accurately. They frequently generate plausible but incorrect or fabricated answers—so-called “hallucinations”—which undermine trust and compromise the quality of financial decision-making.
The Solution
Factagora addresses these challenges through a dedicated fact-checking layer built on three components: FactBlock, Source of Truth (SoT), and DeepVerify. By integrating this layer into any LLM-based AI system, organizations can generate outputs that are not only accurate, but also traceable, explainable, and grounded in verifiable data.
At the foundation, FactBlock extracts and structures key data points from SEC filings and macroeconomic sources—such as 10-K, 10-Q, 8-K, and FRED—turning unstructured text into modular, machine-verifiable knowledge units. These FactBlocks are compiled into a Source of Truth (SoT): a reusable, explainable knowledge base that provides consistent grounding for future queries and analysis.
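To make the idea concrete, a FactBlock can be pictured as a small typed record keyed by company, metric, and period, with the Source of Truth acting as an index over those records. The sketch below is purely illustrative: the field names, the `SourceOfTruth` class, and the sample figure are assumptions made for this post, not Factagora's actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FactBlock:
    """One machine-verifiable knowledge unit extracted from a filing."""
    company: str    # e.g. "Progressive"
    metric: str     # e.g. "combined_ratio"
    period: str     # fiscal period, e.g. "2023-FY"
    value: float    # the extracted data point
    source: str     # originating document type, e.g. "10-K"
    citation: str   # locator pointing back into the source text

class SourceOfTruth:
    """A Source of Truth is simply an indexed collection of FactBlocks."""
    def __init__(self):
        self._index = {}

    def add(self, fb: FactBlock):
        self._index[(fb.company, fb.metric, fb.period)] = fb

    def lookup(self, company, metric, period):
        # Returns the grounding fact, or None if nothing was extracted.
        return self._index.get((company, metric, period))

sot = SourceOfTruth()
sot.add(FactBlock("Progressive", "combined_ratio", "2023-FY",
                  94.9, "10-K", "Item 7, MD&A"))  # value is hypothetical
fb = sot.lookup("Progressive", "combined_ratio", "2023-FY")
print(fb.value)  # 94.9
```

Because every FactBlock carries its own citation, any answer assembled from the index can be traced back to the exact passage it came from.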
Built on this SoT, DeepVerify acts as a factual guardrail. It validates AI-generated responses against the SoT to ensure that each output is based on real data—not assumptions. DeepVerify supports transparent, explainable answers with traceable citations, reasoning paths, and confidence scores, and allows teams to define custom domain-specific verification logic—enabling financial analysts, credit assessors, and institutional decision-makers to maintain rigor, trust, and control within automated workflows.
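In spirit, the guardrail check reduces to three steps: retrieve the grounding fact, compare it with the figure the model stated, and return a verdict plus a citation. The minimal sketch below uses a toy dictionary as the SoT; `verify_claim`, its tolerance parameter, and the Allstate figure are hypothetical illustrations, not the DeepVerify API.

```python
# Toy SoT: (company, metric, period) -> (value, citation)
sot = {
    ("Allstate", "loss_ratio", "2023-FY"): (75.2, "10-K Item 7"),  # hypothetical
}

def verify_claim(sot, key, claimed_value, tol=0.05):
    """Check an AI-stated figure against the Source of Truth."""
    fact = sot.get(key)
    if fact is None:
        # No grounding fact exists, so the claim cannot be supported.
        return {"verdict": "unsupported", "citation": None}
    value, citation = fact
    verdict = "verified" if abs(value - claimed_value) <= tol else "contradicted"
    return {"verdict": verdict, "citation": citation}

print(verify_claim(sot, ("Allstate", "loss_ratio", "2023-FY"), 75.2))
# {'verdict': 'verified', 'citation': '10-K Item 7'}
```

The important design point is that the check is conservative: a claim with no matching FactBlock is flagged as unsupported rather than assumed correct.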
Case Study
Factagora’s approach was applied to a real-world case study in the U.S. insurance industry.
The project focused on the underwriting performance of major U.S. insurance companies. Eight firms—including Allstate, MetLife, and Progressive—were selected, and five years of data were extracted from SEC filings (10-K, 10-Q, 8-K) and structured into FactBlocks. Macroeconomic indicators from FRED—such as unemployment rate, real GDP, CPI, and the 10-year Treasury yield—were also incorporated to quantitatively link company performance with broader economic conditions. Together, these elements formed a domain-specific Source of Truth (SoT).
Underwriting performance was evaluated using five core metrics defined by the National Association of Insurance Commissioners (NAIC):
Combined Ratio
Loss Ratio
Loss Reserves
Expense Ratio
Underwriting Standards
Factagora aligned the FactBlock schema and analytical structure with these metrics to ensure consistent, standards-driven evaluation.
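For readers less familiar with these metrics: the loss ratio divides incurred losses by earned premiums, the expense ratio divides underwriting expenses by premiums, and the combined ratio is their sum, with values below 100 indicating an underwriting profit. A minimal sketch with illustrative numbers (not drawn from any actual filing):

```python
def loss_ratio(incurred_losses, earned_premiums):
    """Incurred losses as a percentage of earned premiums."""
    return incurred_losses / earned_premiums * 100

def expense_ratio(underwriting_expenses, premiums):
    """Underwriting expenses as a percentage of premiums."""
    return underwriting_expenses / premiums * 100

def combined_ratio(lr, er):
    """Below 100 -> underwriting profit; above 100 -> underwriting loss."""
    return lr + er

lr = loss_ratio(680, 1000)     # 68.0
er = expense_ratio(270, 1000)  # 27.0
print(combined_ratio(lr, er))  # 95.0 -> underwriting profit
```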
Next, a large language model (LLM) was used to automatically generate analysis questions based on the SoT. For each metric, both general and specific questions were created. For example, regarding the Combined Ratio:
“How has this ratio evolved over the past year, and what does it imply about underwriting profitability?”
“How has reinsurance accounting affected the loss and combined ratios?”
These questions enabled both qualitative and quantitative AI-driven analysis grounded in the SoT.
Finally, Factagora used DeepVerify to evaluate whether the generated responses were truly grounded in source data and met expectations for explainability and domain-specific reliability.
The Result
To test response quality, Factagora’s DeepVerify was compared with a pure LLM (e.g., GPT-4o) using the same question:
“How has the combined ratio evolved over the past year, and what does this indicate about the company’s underwriting profitability?”
A comparative LLM-as-judge model scored each response on factual accuracy, explainability, and relevance:
Pure LLM: Z-score –0.7 (below average)
DeepVerify: Z-score +1.4 (well above average)
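A Z-score simply expresses each model's judge score in standard deviations from the mean of the compared group, so +1.4 means well above the group average and a negative score means below it. A minimal sketch, with hypothetical raw judge scores:

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Standardize scores: (x - mean) / population standard deviation."""
    mu, sigma = mean(raw_scores), pstdev(raw_scores)
    return [round((x - mu) / sigma, 2) for x in raw_scores]

# Hypothetical judge scores for (DeepVerify, GPT-4o, GPT-4o-mini)
print(z_scores([9.0, 6.0, 5.4]))  # [1.4, -0.51, -0.89]
```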
In a broader evaluation, DeepVerify, GPT-4o, and GPT-4o-mini were assessed across three key metrics, with the following Z-score results:
| Metric | DeepVerify | GPT-4o | GPT-4o-mini |
|---|---|---|---|
| Accuracy | +1.35 | -0.45 | -0.80 |
| Effectiveness | +1.33 | -0.42 | -0.75 |
| Relevance | +1.24 | -0.41 | -0.78 |
| Total | +1.31 | -0.44 | -0.76 |
DeepVerify consistently outperformed baseline LLMs across all metrics—achieving Z-scores above +1.0 in every category. This demonstrates that Factagora’s fact-checking layer enables far more accurate, explainable, and domain-aligned analysis than general-purpose models can achieve on their own.
This case study provides strong empirical evidence that Factagora delivers a trustworthy, scalable, and high-precision AI analysis framework, purpose-built for complex, high-stakes domains like financial reporting and risk evaluation.