This guide explains how to validate AI Analyst results and establish a reliable workflow for using AI-generated analytics in your organization.
Why Validation Matters
Generic AI tools (ChatGPT, Gemini, platform chatbots) don’t query your actual data warehouse. When you ask them analytics questions, you can run into the following issues:
| Issue | What Happens | Impact |
|---|---|---|
| No data access | Tool uses estimates or asks you to provide data | Answer is a guess, not a fact |
| Wrong data model | Tool doesn’t understand your attribution logic | Metrics don’t match your actual definitions |
| Hallucinated joins | AI invents relationships between tables | Numbers look plausible but are fabricated |
| Outdated context | Tool trained on old data patterns | Doesn’t reflect your current business |
How SourceMedium AI Analyst Is Different
Unlike generic tools, the SourceMedium AI Analyst works directly against your BigQuery warehouse:
- Queries your actual data — Every answer comes from SQL executed against BigQuery
- Uses your data model — Understands SourceMedium’s attribution logic and metric definitions
- Shows its work — You see the exact SQL query, making results auditable
- Validated schema — Queries only tables and columns that exist in your warehouse
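If you want to spot-check this yourself, you can list the columns BigQuery actually exposes for a table the AI references. A minimal sketch, using the same placeholder project and dataset names as the validation example later in this guide:

```sql
-- List the columns that actually exist for a table referenced in AI-generated SQL.
-- `your_project` and `sm_experimental` are the placeholders used elsewhere in this guide.
SELECT
  column_name,
  data_type
FROM `your_project.sm_experimental.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'rpt_ad_attribution_performance_daily'
ORDER BY ordinal_position;
```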
Validation Process
Use this framework to validate AI Analyst results before relying on them for decisions.
Step 1: Define Your Key Questions
Work with your data point of contact to identify 10-15 business questions that matter most to your organization. These should be questions where getting the wrong answer would lead to bad decisions.
Example key questions:
- What was our blended ROAS last month?
- What’s our 90-day LTV by acquisition channel?
- Which products have the highest repeat purchase rate?
- What’s our customer acquisition cost by channel?
- How does subscription revenue compare to one-time revenue?
Step 2: Run Questions Through AI Analyst
Ask each question to the AI Analyst and collect:
- The natural language answer
- The SQL query generated
- The raw data returned
- Any visualization produced
Step 3: Independent Validation
Your data point of contact should independently validate each result:
-- Example: Validate the AI's ROAS calculation
-- 1. Review the AI-generated SQL for logical correctness
-- 2. Run the query manually in BigQuery
-- 3. Cross-check against your dashboard or known benchmarks
-- 4. Verify the time ranges and filters match your intent
SELECT
SAFE_DIVIDE(SUM(sm_last_touch_revenue), SUM(ad_spend)) AS roas
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
AND date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE()
Focus validation on the SQL logic, not just the final number. A query can return a plausible-looking number while using the wrong joins or filters.
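One common example of a date-range mismatch: the key question asks about "last month," while the generated query above uses a trailing 30-day window. A minimal variant, reusing the same placeholder table and columns, that pins the range to the prior calendar month so you can compare the two definitions:

```sql
-- Same ROAS calculation, but over the prior calendar month instead of a trailing 30-day window.
SELECT
  SAFE_DIVIDE(SUM(sm_last_touch_revenue), SUM(ad_spend)) AS roas_prior_calendar_month
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
  AND date BETWEEN DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH)
            AND LAST_DAY(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH))
```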
Step 4: Score and Document
For each question, score the result:
| Score | Meaning | Action |
|---|---|---|
| ✅ Correct | SQL logic is sound, number matches validation | Approved for org-wide use |
| ⚠️ Partially correct | Right approach, minor issues | Document the caveat, refine the question |
| ❌ Incorrect | Wrong logic or significant error | Report via feedback, do not use |
Document your validated questions as your organization’s canonical question set. These become the questions everyone should use the AI Analyst for.
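One lightweight documentation format is to keep the approved SQL itself, with a short comment header recording the question, the score, and who validated it. A minimal sketch (the header fields are a suggested convention, not a SourceMedium requirement; the date is a placeholder):

```sql
-- Canonical question: What was our blended ROAS last month?
-- Score: ✅ Correct (validated by the data point of contact on YYYY-MM-DD)
-- Source: AI Analyst-generated SQL, reviewed and re-run manually in BigQuery
SELECT
  SAFE_DIVIDE(SUM(sm_last_touch_revenue), SUM(ad_spend)) AS roas
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
  AND date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE()
```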
Step 5: Establish Organizational Policy
Once your core questions are validated, establish clear guidelines:
- Verified numbers must be traceable to the warehouse — Any metrics cited in reports, decisions, or external communications should come from SourceMedium dashboards or validated AI Analyst queries (both query BigQuery)
- Exploratory analysis can use any tool — Quick hypothesis generation with Sidekick or ChatGPT is fine for exploration
- Cite your source — When reporting numbers, indicate whether they came from a validated AI Analyst query or exploratory analysis
Suggested policy: Metrics cited in business decisions should come from SourceMedium dashboards or validated AI Analyst queries. Generic AI tools are useful for exploration but should not be cited as data sources.
Recommended Workflow
- Explore — Use any AI tool for quick hypothesis generation
- Verify — Confirm findings with AI Analyst (review the SQL)
- Cite — Use verified numbers in reports; keep the SQL for reproducibility
Ongoing Validation
Regular Re-validation
- Monthly: Re-run your canonical question set to catch any drift
- After major changes: Re-validate when you change attribution windows, add new data sources, or modify business logic
Feedback
Every AI Analyst response includes a feedback button. Use it to report incorrect answers or flag issues—we review all feedback and use it to improve the system. See Feedback for details.
Automate Re-validation Today (DIY)
If you want automated evaluation now, you can build a lightweight workflow using existing tools:
- Save your canonical question set — Copy the validated SQL from AI Analyst into a shared doc or version-controlled repo (one query per question).
- Schedule re-runs — Use BigQuery Scheduled Queries (or your existing orchestrator) to run those queries daily/weekly and write outputs into a small `ai_analyst_eval_runs` table (a minimal sketch follows this list).
- Review drift — Track changes across runs and re-validate when results move materially (often caused by upstream data backfills, attribution configuration changes, or logic updates).
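A minimal sketch of step 2, assuming the results table lives in a dataset you control (here `your_project.analytics_ops`, a placeholder name): a small table plus the body of a BigQuery scheduled query that re-runs one canonical question and appends the result.

```sql
-- One-time setup: a small results table (dataset name is a placeholder).
CREATE TABLE IF NOT EXISTS `your_project.analytics_ops.ai_analyst_eval_runs` (
  run_at TIMESTAMP,
  question STRING,
  metric_name STRING,
  metric_value FLOAT64
);

-- Body of a scheduled query (daily or weekly): re-run one canonical query and append the result.
INSERT INTO `your_project.analytics_ops.ai_analyst_eval_runs` (run_at, question, metric_name, metric_value)
SELECT
  CURRENT_TIMESTAMP(),
  'What was our blended ROAS last month?',
  'roas',
  SAFE_DIVIDE(SUM(sm_last_touch_revenue), SUM(ad_spend))
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
  AND date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE();
```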
The goal of automation is not to “prove correctness forever,” but to catch drift early so your team re-validates before numbers make it into reports.
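To review drift (step 3), comparing each metric's latest run against its previous run is often enough. A minimal sketch against the same hypothetical `ai_analyst_eval_runs` table, flagging moves larger than an arbitrary 5% threshold:

```sql
-- Flag canonical metrics whose latest value moved materially versus the previous run.
WITH ranked AS (
  SELECT
    question,
    metric_name,
    metric_value,
    run_at,
    ROW_NUMBER() OVER (PARTITION BY question, metric_name ORDER BY run_at DESC) AS rn
  FROM `your_project.analytics_ops.ai_analyst_eval_runs`
)
SELECT
  latest.question,
  latest.metric_name,
  previous.metric_value AS previous_value,
  latest.metric_value AS latest_value,
  SAFE_DIVIDE(latest.metric_value - previous.metric_value, previous.metric_value) AS pct_change
FROM ranked AS latest
JOIN ranked AS previous
  ON latest.question = previous.question
  AND latest.metric_name = previous.metric_name
WHERE latest.rn = 1
  AND previous.rn = 2
  AND ABS(SAFE_DIVIDE(latest.metric_value - previous.metric_value, previous.metric_value)) > 0.05  -- tune the threshold to your tolerance
```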
Summary
| Tool Type | Best For | Trust Level for Reporting |
|---|---|---|
| Generic AI assistants (ChatGPT, Gemini, platform chatbots) | Quick exploration, brainstorming | ❌ Do not cite |
| BI dashboards | Curated views of validated metrics | ✅ Cite (queries warehouse) |
| SourceMedium AI Analyst | Ad-hoc questions with SQL transparency | ✅ Cite after validation |
The goal isn’t to avoid other AI tools—they’re useful for exploration. The goal is to ensure numbers that inform decisions are traceable to your warehouse. Both SourceMedium dashboards and the AI Analyst provide this traceability.
Next Steps