Why Local AI Matters for Insurance Professionals
As an insurance professional, you handle some of the most sensitive personal and commercial data in any industry. Policyholder information, claims details, medical records, financial data, and proprietary underwriting models all demand strict confidentiality. Regulatory frameworks like HIPAA, state insurance data security laws, and NAIC model regulations point to one simple truth: you must control where sensitive data goes.
The Problem with Cloud AI
When you paste a policy document, a claims file, or policyholder details into a cloud-based AI tool like ChatGPT or Claude, that data leaves your machine and travels to a third-party server. Even with enterprise agreements, you are relying on someone else's infrastructure, someone else's data retention policies, and someone else's security team. For many types of confidential insurance work — especially involving protected health information or proprietary loss data — that is an unacceptable risk.
The Local AI Solution
A local AI runs entirely on your computer. No internet connection is needed, and no data is transmitted anywhere: the model loads into your machine's memory, processes your input, and generates its response, all on-device. Your documents never leave your desk.
Related reading: For a deeper look at the risks of putting confidential information into cloud AI tools, see What Not to Do #2 — Don't Paste Confidential Information Into AI Tools Without Safeguards.
What is LM Studio?
LM Studio is a free, cross-platform desktop application that lets you download and run open-source large language models (LLMs) directly on your computer. Think of it as having your own private ChatGPT — but one that runs offline and keeps everything local.
Completely Private
Your data never leaves your computer. No telemetry, no cloud sync, no API calls. Everything stays local.
Works Offline
After the initial model download, no internet connection is required. Use it on a plane, at a client site, or in a secure facility.
Cross-Platform
Available for Windows, macOS, and Linux. Runs on laptops with 16GB RAM or more.
Thousands of Models
Browse and download from thousands of open-source models on Hugging Face. Find the right one for your task and hardware.
Step-by-Step Installation
Follow these five steps to go from nothing to a working local AI. Most insurance professionals complete this in 20-30 minutes, depending on internet speed for the model download.
Download LM Studio
Go to lmstudio.ai and download the installer for your operating system: Windows, macOS, or Linux.
The download is approximately 400-500 MB. The application itself is free with no account required.
Install and Launch
Run the installer. On Windows, double-click the .exe file. On macOS, drag the app to your Applications folder. On Linux, follow the package instructions on the site.
When you launch LM Studio for the first time, you will see a clean interface with a search bar and a chat window. No configuration is needed yet.
Download a Model
Click the Discover tab (magnifying glass icon) in the left sidebar. Search for a model by name. For your first model, we recommend:
Recommended First Model
Llama 3.1 8B Instruct — Search for "llama 3.1 8b instruct" in the Discover tab. This is a strong general-purpose model that runs well on most modern laptops with 16GB of RAM.
The "8B" means 8 billion parameters. Larger models (70B) are more capable but require significantly more hardware — typically 64GB+ RAM or a dedicated GPU. Start small.
Click the download button next to the model. LM Studio will show you the recommended quantization (file size variant). The default is usually fine. The download will be 4-6 GB for an 8B model — this is a one-time download.
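The quoted file sizes follow from a simple rule of thumb, sketched below. This is a rough approximation, not an official LM Studio formula: file size is roughly parameter count times bytes per weight, ignoring per-model overhead.

```python
def approx_size_gb(params_billion: float, bits_per_weight: int) -> float:
    """Rough model file size: parameters x bytes per weight.
    Ignores overhead, so treat the result as a lower bound."""
    return params_billion * (bits_per_weight / 8)

# An 8B model at the common 4-bit quantization (e.g. Q4 variants):
print(round(approx_size_gb(8, 4), 1))   # 4.0 GB, the low end of the 4-6 GB range

# The same model unquantized at 16 bits per weight is about four times larger:
print(round(approx_size_gb(8, 16), 1))  # 16.0 GB
```

The same arithmetic explains why 70B models need workstation-class hardware: even at 4-bit quantization they weigh in around 35 GB before overhead.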
Load the Model and Start Chatting
Go to the Chat tab (speech bubble icon). At the top of the chat window, click the model selector dropdown and choose the model you just downloaded. LM Studio will load the model into memory — this takes 10-30 seconds depending on your hardware.
Once loaded, you can start typing prompts just like you would with ChatGPT. Try something simple first: "Summarize the key coverages and exclusions in this commercial general liability policy." You should see the model generate a response in real time.
Use the Local Server (Advanced)
LM Studio includes a built-in local server feature. Click the Developer tab (code icon) and start the server. This creates an OpenAI-compatible API endpoint at http://localhost:1234 on your machine.
This lets other applications on your computer (note-taking apps, coding tools, document processors) connect to your local AI using the same interface they would use for cloud APIs — but all traffic stays on your machine. This is optional and not required for basic use.
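As a sketch of what "OpenAI-compatible" means in practice, the snippet below builds a standard chat-completion request aimed at the local endpoint. The model name, system prompt, and helper function are illustrative assumptions, not LM Studio requirements; sending the request needs an HTTP client and a running server.

```python
import json

# Default LM Studio server address; the chat endpoint lives under /v1.
BASE_URL = "http://localhost:1234/v1/chat/completions"

def build_request(prompt: str, model: str = "llama-3.1-8b-instruct",
                  temperature: float = 0.2) -> dict:
    """Build an OpenAI-style chat-completion payload for the local server.
    Model name and system prompt here are placeholders."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are an insurance policy analyst."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
    }

payload = build_request("List the exclusions in the attached policy text.")
# To actually send it (requires the `requests` package and a running server):
#   import requests
#   reply = requests.post(BASE_URL, json=payload, timeout=120).json()
#   print(reply["choices"][0]["message"]["content"])
print(json.dumps(payload, indent=2))
```

Because the payload shape matches the cloud APIs, any tool that accepts a custom base URL can usually be pointed at the local server with no other changes.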
Recommended Models for Insurance Work
Not all models are equal. Here are four models we recommend for insurance professionals, ranging from lightweight to powerful. Start with the first one and upgrade as your hardware and confidence allow.
Llama 3.1 8B Instruct
Recommended
The best all-around choice for most insurance professionals. Strong general reasoning, good at following instructions, and capable of summarization, drafting, and analysis tasks. Ideal for policy review, claims summarization, and generating first-draft communications.
Mistral 7B Instruct
Fast and efficient. Mistral 7B punches above its weight class for summarization and text generation. If your primary need is quickly summarizing claims files or generating first drafts of policyholder communications, this is an excellent option that responds quickly.
Phi-3 Mini
Microsoft's compact model. At only 3.8 billion parameters, it runs on older or less powerful hardware. Good for basic tasks: simple policy summarization, Q&A, and quick drafting. Not as capable as larger models for complex coverage analysis, but a solid entry point if your machine has limited resources.
Llama 3.1 70B
Power Users
The heavyweight. 70 billion parameters deliver substantially better reasoning, nuance, and accuracy. Closer to cloud model quality. But it requires serious hardware: 64GB+ RAM or a dedicated GPU with 48GB+ VRAM. Best for insurance professionals with workstation-class machines or IT-managed infrastructure.
Practical Use Cases
Once you have LM Studio running, here are the most valuable ways insurance professionals are using local AI today. Each of these keeps your data completely on your machine.
Analyzing Policies
Paste full policy text into the chat and ask the model to identify key coverages, exclusions, conditions, and endorsements. Compare terms against standard forms or identify unusual provisions. All analysis happens locally.
Summarizing Claims Files
Feed in claims documentation, adjuster notes, medical records, or repair estimates and ask for structured summaries, chronologies, or key fact extraction for reserve reviews.
Drafting from Confidential Data
Provide the model with proprietary loss data, actuarial findings, or internal performance metrics and ask it to draft reports, executive summaries, or board presentations. The data never leaves your machine, so confidentiality is maintained.
Extracting Key Dates & Obligations
Ask the model to extract all deadlines, proof-of-loss requirements, notice periods, subrogation timelines, and policy effective dates from claims files and policies into a structured table.
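One way to phrase that request is a reusable prompt template. The wording and column names below are illustrative assumptions, not a standard format; adapt them to your own workflow:

```python
# Hypothetical extraction prompt; adjust the columns to your reserve-review needs.
EXTRACTION_PROMPT = """\
From the claims file below, extract every date-sensitive item into a
markdown table with columns: Item | Date or Deadline | Source Section.
Cover proof-of-loss requirements, notice periods, subrogation timelines,
and policy effective dates. Omit rows for items the file does not mention.

Claims file:
{document_text}
"""

prompt = EXTRACTION_PROMPT.format(document_text="(paste claims file text here)")
print(prompt[:70])
```

Keeping the instructions in a fixed template makes results more consistent across files and makes it easy to spot when the model invents a row that has no source section.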
Comparing Policy Versions
Paste two versions of a policy from different renewal years and ask the model to identify all coverage changes, modified exclusions, adjusted limits, and new endorsements. Useful for renewal review when you need the analysis to stay confidential.
Limitations to Keep in Mind
Local AI is a powerful tool, but it is not a replacement for cloud models in every situation. Understand these trade-offs so you can choose the right tool for each task.
Smaller Models, Lower Capability
An 8B parameter model running locally is substantially less capable than GPT-4o or Claude 3.5 Sonnet running in the cloud. Expect simpler reasoning, occasional errors, and less nuanced output. Always verify AI-generated analysis against the actual policy language and claims documentation.
No Internet Access
Local models cannot search the web, access insurance databases, or retrieve current regulatory information. They work only with what you provide in the prompt and their training data (which has a knowledge cutoff date).
Slower Response Times
Local inference is slower than cloud APIs, especially on consumer hardware. A response that takes 2 seconds from ChatGPT might take 15-30 seconds from a local model. This is acceptable for careful analysis but less ideal for rapid iteration.
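The gap is easy to estimate from generation throughput. The figures below are illustrative assumptions, not benchmarks: consumer laptops often sustain tens of tokens per second locally, while cloud endpoints can be an order of magnitude faster.

```python
def response_seconds(answer_tokens: int, tokens_per_second: float) -> float:
    """Time to generate an answer at a given sustained throughput."""
    return answer_tokens / tokens_per_second

# A ~400-token answer at an assumed ~20 tok/s locally vs ~200 tok/s in the cloud:
print(response_seconds(400, 20))   # 20.0 seconds locally
print(response_seconds(400, 200))  # 2.0 seconds via a cloud API
```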
Output Still Requires Verification
A local model can hallucinate just like a cloud model. It can misinterpret policy language, invent coverage provisions, or produce plausible-sounding but incorrect analysis. The verification obligation is the same regardless of where the model runs — always check AI output against the source documents.
The Smart Approach: Use Both
The most effective insurance professionals use cloud AI for non-confidential work (general research, learning, template creation) and local AI for confidential work (policyholder documents, claims files, proprietary loss data). This gives you the best of both worlds: maximum capability when privacy is less critical, and maximum privacy when it matters most.
Keep Building Your AI Skills
Now that you have a local AI running, learn how to get the best results from it. Our prompt engineering guide and quick wins work just as well with local models as they do with cloud tools.
Ready for structured learning? Explore the Learning Program →