Automating Insurance Underwriting Without an LLM in the Loop

Underwriting is where an insurer decides which risks to take on and at what price. Part of that work is mechanical: does the application meet the eligibility rules, do the figures reconcile, are the required documents present and signed. Part of it is genuine judgment that belongs to a person. The mechanical part is slow, repetitive, and a strong candidate for automation. The judgment part is not, and the hard problem is keeping the two cleanly apart.

A large language model blurs that line, because it will happily do both. Hand it an application and it will extract the data, weigh the risk, and return a decision in a single pass. That looks like the whole job done. In a regulated line of business it is closer to a liability, because you now hold a decision about a person that you cannot reliably reproduce, that you cannot fully explain, and that you reached by sending their medical and financial history to a model you do not control.

There is a version of this that works. AI does the heavy lifting while you build the workflow, and frozen, auditable code makes every actual decision when the workflow runs. I've spent more than twenty years building systems for insurers, so the rest of this is specific about where that line sits and why.

Why Underwriting Is a Harder Case Than Most

Elsewhere, we've made the general argument for why non-deterministic agents stall before production. Underwriting raises the stakes on that argument for two reasons specific to the work.

Every output is a decision about a person. A decline, a rated offer, or a higher premium is not an internal number. It carries legal weight. When a decision is based even in part on a consumer report, the Fair Credit Reporting Act requires an adverse action notice. Many states layer their own adverse underwriting decision rules on top, drawn from the NAIC Insurance Information and Privacy Protection Model Act, and those require you to state the specific reasons on request. A reason like "the model scored the application at 0.34" is not a reason a regulator accepts. You have to be able to point at the rule that fired.

The data is the most sensitive an applicant holds. Underwriting files contain medical history, prescription records, income, and credit detail. Routing that through a third-party model to reach a decision creates exactly the exposure we wrote about in keeping sensitive data out of third-party LLMs. A no-training clause in a vendor contract does not make a subpoena go away, and it does not put the data back on your own infrastructure.

Where AI Genuinely Earns Its Place

None of this means AI has no role in underwriting. It means the role is at design time, while you are building the workflow, not at runtime while it decides.

The richest example is the rulebook itself. Underwriting guidelines are long, full of thresholds, knockout conditions, and exceptions that have built up over years. Translating those into code by hand is slow and error-prone. Describing them in plain English and having AI generate the code that encodes them is fast, and it produces something you can read line by line before you trust it. That is the core of how Build Studio works: you describe the logic, AI writes the Python, and you review it.

The point is the timing. The model helps you author the rules. Once you publish, the rules are frozen into versioned code, and that code is what evaluates every application after that. The applicant never meets the model. They meet the rules your team approved.
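To make that concrete, here is a minimal sketch of what a single reviewed rule might look like once it is frozen. The rule name, the version string, and the 18 to 75 age band are illustrative assumptions. They are not any carrier's guideline, and this is not a representation of what Build Studio actually generates.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical version tag; in practice it would come from your release process.
RULESET_VERSION = "2026.06.1"

# Illustrative thresholds only, not a real carrier's guideline.
MIN_ISSUE_AGE = 18
MAX_ISSUE_AGE = 75


@dataclass(frozen=True)
class RuleResult:
    rule_name: str
    fired: bool
    reason: Optional[str]  # plain-language reason, set only when the rule fires


def issue_age_knockout(age: int) -> RuleResult:
    """Knock out applicants whose age falls outside the issue-age band."""
    if age < MIN_ISSUE_AGE or age > MAX_ISSUE_AGE:
        return RuleResult(
            rule_name="issue_age_knockout",
            fired=True,
            reason=(
                f"Applicant age {age} is outside the issue-age band "
                f"{MIN_ISSUE_AGE}-{MAX_ISSUE_AGE}."
            ),
        )
    return RuleResult(rule_name="issue_age_knockout", fired=False, reason=None)
```

A reviewer can check a function like this against the written guideline in a minute, and the reason string is something you could put in front of an applicant or an examiner.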

What About Reading the Documents?

Honesty matters here, because intake is the one stage where a model may still run against live data. Applications arrive as PDFs, scanned attending physician statements, and ACORD forms, and pulling structured fields out of messy documents is a real strength of modern models. If you use one for that, treat its output as data to be checked, never as a decision. Validate every extracted field against the source, flag low-confidence extractions for a human to confirm, and keep the model strictly upstream of the logic. Extraction can be probabilistic. The decision that follows it should not be.
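One way to keep extraction upstream of the logic is a small gate between the two. The sketch below assumes the extractor reports a 0 to 1 confidence score and the snippet of text each value was read from. Both of those, along with the field names and the 0.90 floor, are assumptions for illustration rather than features of any particular extraction tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Illustrative cutoff; the right value depends on your extractor and documents.
CONFIDENCE_FLOOR = 0.90


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed 0..1 score reported by the extractor
    source_text: str   # snippet of the document the value was read from


def triage_fields(fields: List[ExtractedField]) -> Tuple[Dict[str, str], List[str]]:
    """Split extracted fields into accepted values and names held for human review."""
    accepted: Dict[str, str] = {}
    held: List[str] = []
    for field in fields:
        # The value must literally appear in the text it was supposedly read from.
        found_in_source = field.value.strip() != "" and field.value in field.source_text
        if field.confidence >= CONFIDENCE_FLOOR and found_in_source:
            accepted[field.name] = field.value
        else:
            held.append(field.name)
    return accepted, held
```

Only the accepted values move on to the rules. Anything held waits for a person to confirm it, so a shaky read never turns into a decision on its own.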

A Worked Example

Here is how a term life application might flow through a deterministic workflow. The numbers and thresholds below are illustrative, chosen to show the shape of the process rather than any real carrier's guidelines.

1. Intake. The application and its attachments land in the workflow. Structured fields come straight through. Unstructured documents go through extraction, with low-confidence fields held for review.
2. Normalize and validate. Frozen code checks that required fields are present, dates are coherent, and the figures reconcile. Anything incomplete is returned with a specific list of what is missing, not a vague rejection.
3. Apply the filed rules. Eligibility and knockout conditions run exactly as written: age bands, coverage limits, occupation classes, the conditions that require evidence of insurability. Each rule that fires is recorded by name.
4. Score against the rating engine. The application is rated using the same tables the carrier filed. Same inputs, same rate class, every time.
5. Route. Clear accepts and clear declines are decided. Anything in between is referred to an underwriter with the file, the rules that triggered, and the reason it could not be settled automatically.
6. Generate the decision. The output letter is built from reason codes that map directly to the filed rules, so an adverse decision arrives with reasons a regulator will recognize.
7. Log everything. Inputs, the code version that ran, the rules that fired, and the output are written to an immutable record.
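The validate, apply-rules, and route steps above can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions: the field names, thresholds, and reason codes are all hypothetical, and rating, letter generation, and logging are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Every threshold and reason code here is an illustrative assumption.
REQUIRED_FIELDS = ("age", "face_amount", "occupation_class")
MIN_AGE, MAX_AGE = 18, 75
MAX_FACE_AMOUNT = 5_000_000    # above this, the product is not offered
AUTO_ISSUE_LIMIT = 1_000_000   # above this, an underwriter reviews the file
REFERRED_OCCUPATIONS = frozenset({"hazardous"})

Rule = Tuple[str, Callable[[Dict], bool]]  # (reason code, predicate)

KNOCKOUT_RULES: Tuple[Rule, ...] = (
    ("K001_AGE_OUTSIDE_BAND", lambda a: not (MIN_AGE <= a["age"] <= MAX_AGE)),
    ("K002_FACE_ABOVE_MAX", lambda a: a["face_amount"] > MAX_FACE_AMOUNT),
)
REFERRAL_RULES: Tuple[Rule, ...] = (
    ("F001_FACE_ABOVE_AUTO_LIMIT", lambda a: a["face_amount"] > AUTO_ISSUE_LIMIT),
    ("F002_OCCUPATION_REVIEW", lambda a: a["occupation_class"] in REFERRED_OCCUPATIONS),
)


@dataclass(frozen=True)
class Decision:
    outcome: str                        # "incomplete", "decline", "refer", or "accept"
    reason_codes: Tuple[str, ...] = ()  # every rule that fired, by name
    missing: Tuple[str, ...] = ()       # what the applicant still has to supply


def missing_items(app: Dict) -> Tuple[str, ...]:
    """List absent required fields, plus the signature if the form is unsigned."""
    missing = [name for name in REQUIRED_FIELDS if app.get(name) is None]
    if app.get("signed") is not True:
        missing.append("signature")
    return tuple(missing)


def fired(rules: Tuple[Rule, ...], app: Dict) -> Tuple[str, ...]:
    """Return the reason code of every rule whose predicate is true."""
    return tuple(code for code, predicate in rules if predicate(app))


def decide(app: Dict) -> Decision:
    """Validate first, then knockouts, then referrals; accept only if nothing fired."""
    missing = missing_items(app)
    if missing:
        return Decision("incomplete", missing=missing)
    knockouts = fired(KNOCKOUT_RULES, app)
    if knockouts:
        return Decision("decline", reason_codes=knockouts)
    referrals = fired(REFERRAL_RULES, app)
    if referrals:
        return Decision("refer", reason_codes=referrals)
    return Decision("accept")
```

Keeping the rules as named entries in a table, rather than burying them in branching logic, is what lets the decision letter and the log cite the same reason codes the code actually evaluated.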

The property that makes this auditable is the same one that makes it boring to operate: feed the identical application through next quarter and you get the identical decision, because the code did not change and no model was in the loop to drift. When a decision is questioned, the answer is not a reconstruction of what a model was probably thinking. It is the exact rule, the exact version, and the exact inputs, sitting in the log. This is the same pattern we walked through for a finance process, a regulated report automated end to end, applied here to a decision instead of a report.
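A log entry of that kind can be sketched with nothing beyond the standard library. The record layout and function names below are assumptions made for illustration. Note that a content hash only makes tampering detectable. Making the record truly immutable depends on where you store it, for example on write-once storage.

```python
import hashlib
import json
from typing import Dict, List


def canonical_json(obj) -> str:
    """Serialize with sorted keys so the same content always yields the same text."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


def build_audit_record(
    application: Dict, code_version: str, rules_fired: List[str], outcome: str
) -> Dict:
    """Assemble a log entry and seal it with a SHA-256 hash of its contents."""
    record = {
        "inputs": application,
        "code_version": code_version,
        "rules_fired": list(rules_fired),
        "outcome": outcome,
    }
    # The hash covers everything above; it is added to the record afterwards.
    record["record_hash"] = hashlib.sha256(
        canonical_json(record).encode("utf-8")
    ).hexdigest()
    return record


def verify_audit_record(record: Dict) -> bool:
    """Recompute the hash over the record body and compare it to the stored one."""
    body = {key: value for key, value in record.items() if key != "record_hash"}
    expected = hashlib.sha256(canonical_json(body).encode("utf-8")).hexdigest()
    return expected == record.get("record_hash")
```

The sketch deliberately leaves out a timestamp, so identical inputs, version, and outcome produce an identical hash. A real log would record when the decision ran as well, most likely alongside the hashed body.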

How This Lines Up With the Rules You Answer To

Underwriting automation is being watched closely, and the requirements are converging on a short list of things regulators want to see.

The NAIC Model Bulletin on the Use of Artificial Intelligence Systems by Insurers, finalized at the end of 2023 and since adopted by a long list of states, expects insurers to govern these systems, test them for unfairly discriminatory outcomes, document how they work, and be able to explain individual decisions. Colorado, acting under SB 21-169, has gone further for life insurers that use external consumer data and predictive models, requiring a documented governance framework and quantitative testing for disparate impact. In the EU, the AI Act classifies risk assessment and pricing in life and health insurance as high-risk, which brings obligations around documentation, logging, and human oversight.

Read those together and the common thread is reproducibility, explainability, and a record. Frozen code answers all three without extra machinery. The rule that decided a case is in the source. The reasons on the letter map to that rule. The log shows which version ran on which inputs. When an examiner asks how a class of applicants was treated, you can rerun the exact code over the exact files and show them, rather than explaining the behavior of a model at a moment that has since passed. Our compliance checklist for regulated industries goes through the auditor questions in more detail.
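Rerunning the code over the files is simple enough to sketch. The example below assumes each logged record is a dictionary with `inputs`, `code_version`, and `outcome` keys, and that the decision logic is exposed as one function. Both are illustrative assumptions, and `toy_decide` is a stand-in rather than a real rule set.

```python
from typing import Callable, Dict, Iterable, List


def replay(
    records: Iterable[Dict], decide: Callable[[Dict], str], code_version: str
) -> List[Dict]:
    """Rerun logged inputs through `decide` and return every record that disagrees."""
    mismatches: List[Dict] = []
    for record in records:
        if record["code_version"] != code_version:
            continue  # only compare against the version that originally ran
        rerun_outcome = decide(record["inputs"])
        if rerun_outcome != record["outcome"]:
            mismatches.append(
                {
                    "inputs": record["inputs"],
                    "logged": record["outcome"],
                    "rerun": rerun_outcome,
                }
            )
    return mismatches


def toy_decide(app: Dict) -> str:
    """Stand-in decision function with an illustrative age band."""
    return "decline" if not (18 <= app["age"] <= 75) else "accept"


logged = [
    {"inputs": {"age": 40}, "code_version": "2026.06.1", "outcome": "accept"},
    {"inputs": {"age": 80}, "code_version": "2026.06.1", "outcome": "decline"},
]
print(replay(logged, toy_decide, "2026.06.1"))  # an empty list means every decision reproduced
```

With deterministic code the expected result is an empty list. A non-empty one means the code or the record changed, which is exactly what an examiner would want surfaced.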

What Stays With the Underwriter

This is not a pitch to replace underwriters, and it would be a poor one. The cases that need real judgment, the unusual risk, the borderline file, the situation the guidelines never anticipated, are exactly the cases a deterministic workflow should refer rather than force. The goal is to settle the clear-cut volume automatically and hand the underwriter the genuinely hard files with the context already assembled. That is where their expertise is worth the most, and it is the part of the job that does not reduce to rules.

There is also a cost dimension worth a glance. A model in the runtime loop bills on every application, and an insurer's volume only grows. Frozen code has no per-decision token cost at all, which is the same point we made in the real cost of AI at runtime.

Bottom Line

The fastest path to automated underwriting is to let a model read the application and return a verdict. In a regulated line it is also the path that leaves you unable to reproduce the decision, unable to explain it cleanly, and holding sensitive data where it should not be. The durable path uses AI where it shines, encoding the rulebook while you build and parsing intake documents upstream of the logic, and then freezes the decision into code your team approved and an examiner can audit.

If underwriting is your world, our insurance solutions page covers how carriers put this to work. You can also watch the demo, build your first workflow, and try it against a process you know well, or download the free Community edition and run it on your own servers.

This article describes regulatory frameworks in general terms as of June 2026 and is not legal advice. Confirm the rules that apply in the states and lines you write before deploying any automated underwriting process.

Sukesh Shetty

Founder of Dittah. 20+ years building mission-critical systems for financial services and insurance.