Local LLM

Experiment

Building a privacy-first analytics assistant that converts raw GA4 data into actionable product insights using a locally hosted language model.

Local LLM started as an experiment to understand whether small, locally hosted language models could generate useful product insights without sending data to external services.

The goal was simple.

I wanted an automated way to analyse Google Analytics data, identify opportunities and generate recommendations without manually opening dashboards every day.

More importantly, I wanted the entire workflow to remain private. All processing, analysis and inference would happen on my own machine.

The Problem

Google Analytics provides an enormous amount of data.

The challenge is not accessing the information.

The challenge is understanding which metrics actually require attention.

Most dashboards show numbers.

Very few explain what those numbers mean, why users behave a certain way or what actions should be taken next.

I wanted to build a lightweight analytics assistant that could automatically identify high-impact pages and explain what was happening from a product perspective.

Data Collection

The workflow begins with Google Analytics 4.

A Python script called fetch_ga4_data.py connects to the Analytics Data API using a service account and retrieves page-level metrics for the previous thirty days.

The data includes:

Page title
Views
Active users
Event count
Bounce rate

The results are stored locally as a CSV file.

input-data/top_pages.csv

At this stage, the data is useful but still too raw for meaningful analysis.
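A minimal sketch of what the fetch step could look like, assuming the official google-analytics-data Python client and a service account key exposed through the GOOGLE_APPLICATION_CREDENTIALS environment variable. The function names, the CSV header names and the choice of "yesterday" as the end date are illustrative assumptions, not details taken from fetch_ga4_data.py.

```python
import csv
from pathlib import Path

# Header names are illustrative; the original script's column names are not shown.
COLUMNS = ["page_title", "views", "active_users", "event_count", "bounce_rate"]


def fetch_top_pages(property_id: str) -> list[dict]:
    """Query the GA4 Data API for page-level metrics over the last thirty days."""
    # Imported lazily so write_csv() works without the client library installed.
    from google.analytics.data_v1beta import BetaAnalyticsDataClient
    from google.analytics.data_v1beta.types import (
        DateRange,
        Dimension,
        Metric,
        RunReportRequest,
    )

    # The client reads the service account key from GOOGLE_APPLICATION_CREDENTIALS.
    client = BetaAnalyticsDataClient()
    request = RunReportRequest(
        property=f"properties/{property_id}",
        dimensions=[Dimension(name="pageTitle")],
        metrics=[
            Metric(name="screenPageViews"),
            Metric(name="activeUsers"),
            Metric(name="eventCount"),
            Metric(name="bounceRate"),
        ],
        date_ranges=[DateRange(start_date="30daysAgo", end_date="yesterday")],
    )
    response = client.run_report(request)

    rows = []
    for row in response.rows:
        # One dimension (page title) followed by the four metrics, in request order.
        values = [row.dimension_values[0].value]
        values += [metric.value for metric in row.metric_values]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


def write_csv(rows: list[dict], path: str) -> None:
    """Write the fetched rows to a local CSV file, creating the folder if needed."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling write_csv(fetch_top_pages("YOUR_PROPERTY_ID"), "input-data/top_pages.csv") would produce the file described above; the property ID is a placeholder.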

Transforming Metrics Into Signals

A second script called page_leak_analysis.py processes the exported data using pandas.

The script cleans the dataset, converts columns into numeric values and calculates additional metrics designed to reveal behavioural patterns.

Three custom metrics are generated:

Leak Score

Estimates how much traffic leaves a page without progressing further, by weighting bounce rate by view volume.

Views × Bounce Rate

Engagement Efficiency

Measures how much interaction occurs relative to traffic volume.

Events / Views

Intent Depth

Measures how deeply users engage with a page.

Events / Active Users

These metrics provide a clearer picture of user behaviour than raw analytics data alone.
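The three formulas can be sketched in pandas as follows. The column names are hypothetical, bounce rate is assumed to be a fraction between 0 and 1, and treating zero denominators as missing values is a design choice of this sketch rather than something the original script is known to do.

```python
import pandas as pd

# Hypothetical column names for the exported CSV.
NUMERIC_COLUMNS = ["views", "active_users", "event_count", "bounce_rate"]


def add_signals(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the three derived metrics added."""
    out = df.copy()

    # Coerce text values to numbers; anything unparseable becomes NaN.
    for column in NUMERIC_COLUMNS:
        out[column] = pd.to_numeric(out[column], errors="coerce")

    # Zero denominators become NaN so the ratios never divide by zero.
    views = out["views"].where(out["views"] > 0)
    users = out["active_users"].where(out["active_users"] > 0)

    out["leak_score"] = out["views"] * out["bounce_rate"]      # Views × Bounce Rate
    out["engagement_efficiency"] = out["event_count"] / views  # Events / Views
    out["intent_depth"] = out["event_count"] / users           # Events / Active Users
    return out
```

Pages with no views or no active users end up with missing ratios instead of infinite ones, which keeps them from dominating any later sorting.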

Classifying Pages

Once the metrics are calculated, each page is assigned one of three labels.

True Leak

Pages where users arrive but quickly leave without meaningful engagement.

Growth Opportunity

Pages that attract strong interest but currently convert or monetise poorly.

Healthy

Pages performing as expected with no immediate action required.

The classified output is saved as:

output-data/page_leak_classified.csv

This step transforms analytics data into prioritised product problems.
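The exact classification rules are not given here, so the following is only an illustration of how a rule-based labeller might work. The thresholds, the Thresholds class and the classify_page function are all hypothetical placeholders, and the real script may combine the metrics quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Placeholder cut-offs, not values from the original script.
    high_leak: float = 500.0      # leak score at or above this is "a lot of lost traffic"
    low_efficiency: float = 1.5   # fewer events per view than this is "little engagement"
    high_depth: float = 3.0       # events per active user at or above this is "strong interest"


def classify_page(leak_score, engagement_efficiency, intent_depth, limits=Thresholds()):
    """Assign one of the three labels from the derived metrics."""
    # Heavy traffic loss combined with weak interaction: users arrive and leave.
    if leak_score >= limits.high_leak and engagement_efficiency < limits.low_efficiency:
        return "True Leak"
    # Engaged visitors on a page that is not flagged as a leak.
    if intent_depth >= limits.high_depth:
        return "Growth Opportunity"
    return "Healthy"
```

Keeping the cut-offs in one small object makes it easy to tune them against real traffic, or to replace fixed numbers with percentiles later.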

Preparing Data for the Model

Sending every page to a language model would create unnecessary noise.

Instead, a third script called prepare_llm_input.py filters only the most important pages.

Specifically:

True Leak pages
Growth Opportunity pages

The filtered dataset is then converted into structured JSON using pandas and Python's built-in json module.

output-data/llm_priority_pages.json

The resulting file contains page names, classifications and supporting metrics in a format that is easy for a language model to understand.
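One way this filtering and conversion could be written is sketched below. The "classification" column name and both function names are assumptions made for illustration.

```python
import json

import pandas as pd

PRIORITY_LABELS = ["True Leak", "Growth Opportunity"]


def build_priority_records(df: pd.DataFrame) -> list[dict]:
    """Keep only the priority pages and return them as JSON-ready dictionaries."""
    # "classification" is a hypothetical column holding the page label.
    priority = df[df["classification"].isin(PRIORITY_LABELS)]
    # Round-trip through to_json so NumPy values become plain JSON types.
    return json.loads(priority.to_json(orient="records"))


def save_priority_pages(df: pd.DataFrame, path: str) -> list[dict]:
    """Write the priority pages to a JSON file and return the records."""
    records = build_priority_records(df)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)
    return records
```

Each record is a flat dictionary of page name, label and metrics, which keeps the eventual prompt compact.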

Generating Product Recommendations

The next stage is where the local language model becomes useful.

A script called generate_recommendations.py reads the structured JSON and creates a detailed prompt.

The model is instructed to behave like a product analyst and answer questions such as:

Why are users dropping off?
What behavioural patterns are visible?
What UX issues might exist?
Which experiments should be prioritised?
What A/B tests should be considered?

Rather than asking the model to analyse raw analytics exports, the model receives carefully structured context designed for reasoning.
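A prompt along these lines could be assembled as in the sketch below. The wording of the instructions and the build_prompt name are illustrative; only the five questions come from the list above.

```python
import json

QUESTIONS = [
    "Why are users dropping off?",
    "What behavioural patterns are visible?",
    "What UX issues might exist?",
    "Which experiments should be prioritised?",
    "What A/B tests should be considered?",
]


def build_prompt(pages: list[dict]) -> str:
    """Combine a role instruction, the questions and the page data into one prompt."""
    question_lines = "\n".join(
        f"{number}. {question}" for number, question in enumerate(QUESTIONS, start=1)
    )
    return (
        "You are a product analyst reviewing page-level analytics.\n"
        "For each page in the JSON below, answer these questions:\n"
        f"{question_lines}\n\n"
        "Page data:\n"
        f"{json.dumps(pages, indent=2)}\n"
    )
```

Putting the questions before the data means the model has read its task by the time it reaches the metrics.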

Running Mistral Locally

For inference, I use Mistral running locally through Ollama.

The script communicates with the model using:

ollama run mistral

The prompt and JSON data are passed directly into the model through a subprocess call.

Because everything runs locally:

No analytics data leaves the machine once it has been fetched from GA4.
No external inference APIs are required.
No usage costs are incurred.
Complete control is maintained over the workflow.

The model then generates human-readable recommendations based on the classified data.
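The subprocess call could be sketched as follows, assuming the prompt is piped to the Ollama CLI on standard input. The run_model name, the overridable command parameter and the timeout value are assumptions added for illustration and testing.

```python
import subprocess


def run_model(prompt: str, command=("ollama", "run", "mistral"), timeout=600) -> str:
    """Send the prompt to a local model process on stdin and return its reply."""
    result = subprocess.run(
        list(command),
        input=prompt,         # the prompt is written to the process's stdin
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,
        encoding="utf-8",
        check=True,           # raise CalledProcessError if the command fails
        timeout=timeout,      # seconds; local inference on a large prompt can be slow
    )
    return result.stdout.strip()
```

Because the command is a parameter, the same function can be pointed at a different local model or at a stand-in process during testing.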

Output

The final recommendations are stored in:

output-data/pm_recommendations.txt

Typical outputs include:

Pages suffering from trust issues.
Potential UX friction points.
User drop-off explanations.
Conversion opportunities.
Suggested experiments.
Recommended A/B tests.

The output is designed for decision-making rather than reporting.

Automating The Entire Workflow

The final piece of the system is automation.

A master script called run_pipeline.py orchestrates every step in sequence.

The workflow looks like this:

Fetch GA4 Data
        ↓
Calculate Metrics
        ↓
Classify Pages
        ↓
Generate JSON
        ↓
Run Mistral
        ↓
Generate Recommendations

With a single command, the entire pipeline runs end-to-end.

The system can also be scheduled using cron jobs, allowing insights to be generated automatically without manual intervention.
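An orchestrator of this kind might look like the sketch below. The script names come from the steps described earlier; the run_step and run_pipeline functions, and the decision to stop at the first failing step, are assumptions rather than the actual contents of run_pipeline.py.

```python
import subprocess
import sys

# The four scripts described above, in execution order.
STEPS = [
    "fetch_ga4_data.py",
    "page_leak_analysis.py",
    "prepare_llm_input.py",
    "generate_recommendations.py",
]


def run_step(script: str) -> int:
    """Run one script with the current Python interpreter and return its exit code."""
    return subprocess.run([sys.executable, script]).returncode


def run_pipeline(steps=STEPS, runner=run_step) -> bool:
    """Run each step in order, stopping at the first failure."""
    for script in steps:
        print(f"Running {script}")
        if runner(script) != 0:
            # Later steps depend on this step's output, so do not continue.
            print(f"{script} failed; stopping.")
            return False
    return True
```

A crontab entry such as `0 7 * * * cd /path/to/project && python3 run_pipeline.py` would then run the pipeline every morning at 07:00; the path and the time are placeholders, and the script would need to call run_pipeline() when executed.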

What I Learned

The most interesting lesson was that the choice of model was not the biggest factor in the quality of the insights.

Data preparation mattered far more.

A relatively small model such as Mistral 7B produced surprisingly useful insights when provided with structured context and clear objectives.

Rather than replacing product analysis, the system acts as an assistant that surfaces opportunities and accelerates decision-making.

Current Status

This remains an ongoing experiment.

I continue to explore better classification methods, retrieval-based approaches and larger local models.

The long-term goal is not to build another dashboard.

The goal is to build systems that transform raw data into actionable product insights while remaining entirely private and locally controlled.