This guide explains how to validate AI Analyst results and establish a reliable workflow for using AI-generated analytics in your organization.

Why Validation Matters

Generic AI tools (ChatGPT, Gemini, platform chatbots) don’t query your actual data warehouse. When you ask them analytics questions, the following failure modes are common:
| Issue | What Happens | Impact |
| --- | --- | --- |
| No data access | Tool uses estimates or asks you to provide data | Answer is a guess, not a fact |
| Wrong data model | Tool doesn’t understand your attribution logic | Metrics don’t match your actual definitions |
| Hallucinated joins | AI invents relationships between tables | Numbers look plausible but are fabricated |
| Outdated context | Tool trained on old data patterns | Doesn’t reflect your current business |

How SourceMedium AI Analyst Is Different

SourceMedium’s AI Analyst is built to avoid these failure modes:
  1. Queries your actual data — Every answer comes from SQL executed against BigQuery
  2. Uses your data model — Understands SourceMedium’s attribution logic and metric definitions
  3. Shows its work — You see the exact SQL query, making results auditable
  4. Validated schema — Queries only tables and columns that exist in your warehouse
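
You can spot-check the schema constraint yourself. A minimal sketch using BigQuery’s INFORMATION_SCHEMA, assuming the sm_experimental dataset and table referenced later in this guide (substitute your own project ID):

-- List the columns that actually exist in the attribution table,
-- i.e., the only columns the AI Analyst should reference.
SELECT
  column_name,
  data_type
FROM `your_project.sm_experimental.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'rpt_ad_attribution_performance_daily'
ORDER BY ordinal_position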

Validation Process

Use this framework to validate AI Analyst results before relying on them for decisions.

Step 1: Define Your Key Questions

Work with your data point of contact to identify 10-15 business questions that matter most to your organization. These should be questions where getting the wrong answer would lead to bad decisions.

Example key questions:
  • What was our blended ROAS last month?
  • What’s our 90-day LTV by acquisition channel?
  • Which products have the highest repeat purchase rate?
  • What’s our customer acquisition cost by channel?
  • How does subscription revenue compare to one-time revenue?
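
Before asking the AI, it can help to sketch what a correct answer should look like in SQL so you have a baseline to compare against. A hypothetical sketch for the CAC question, reusing the attribution table shown later in this guide; the channel and new_customers columns are illustrative assumptions, so check them against your actual schema:

-- Hypothetical: customer acquisition cost (CAC) by channel, last calendar month.
-- 'channel' and 'new_customers' are assumed column names; verify them
-- against your warehouse before relying on this query.
SELECT
  channel,
  SAFE_DIVIDE(SUM(ad_spend), SUM(new_customers)) AS cac
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
  AND date >= DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH)
  AND date < DATE_TRUNC(CURRENT_DATE(), MONTH)
GROUP BY channel
ORDER BY cac

Even if the AI structures its query differently, you now have a concrete baseline for comparing both its logic and its number.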

Step 2: Run Questions Through AI Analyst

Ask each question to the AI Analyst and collect:
  • The natural language answer
  • The SQL query generated
  • The raw data returned
  • Any visualization produced

Step 3: Independent Validation

Your data point of contact should independently validate each result:
-- Example: Validate the AI's ROAS calculation
-- 1. Review the AI-generated SQL for logical correctness
-- 2. Run the query manually in BigQuery
-- 3. Cross-check against your dashboard or known benchmarks
-- 4. Verify the time ranges and filters match your intent

SELECT
  SAFE_DIVIDE(SUM(sm_last_touch_revenue), SUM(ad_spend)) AS roas
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
  AND date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE()
Focus validation on the SQL logic, not just the final number. A query can return a plausible-looking number while using the wrong joins or filters.
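
One way to make the cross-check concrete is to decompose the aggregate into components you can compare against individual dashboard tiles. A sketch using the same table as above:

-- Break the 30-day ROAS into daily revenue and spend so each component
-- can be eyeballed against the corresponding dashboard chart.
SELECT
  date,
  SUM(sm_last_touch_revenue) AS revenue,
  SUM(ad_spend) AS spend,
  SAFE_DIVIDE(SUM(sm_last_touch_revenue), SUM(ad_spend)) AS daily_roas
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
  AND date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE()
GROUP BY date
ORDER BY date

If the daily components match the dashboard but the blended figure does not, the discrepancy is in the aggregation logic rather than the underlying data.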

Step 4: Score and Document

For each question, score the result:
| Score | Meaning | Action |
| --- | --- | --- |
| ✅ Correct | SQL logic is sound, number matches validation | Approved for org-wide use |
| ⚠️ Partially correct | Right approach, minor issues | Document the caveat, refine the question |
| ❌ Incorrect | Wrong logic or significant error | Report via feedback, do not use |
Document your validated questions as your organization’s canonical question set. These become the questions everyone should use the AI Analyst for.
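
If you prefer the canonical set to live in the warehouse rather than a shared doc, a minimal sketch follows; the analytics_meta dataset and all column names are suggestions, not a SourceMedium convention:

-- One row per validated question; the stored SQL is the canonical artifact.
CREATE TABLE IF NOT EXISTS `your_project.analytics_meta.canonical_questions` (
  question STRING NOT NULL,       -- the business question, verbatim
  validated_sql STRING NOT NULL,  -- the approved AI Analyst query
  score STRING,                   -- 'correct', 'partial', or 'incorrect'
  caveats STRING,                 -- documented limitations, if any
  validated_by STRING,            -- who signed off on the result
  validated_at DATE               -- date of the most recent validation
)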

Step 5: Establish Organizational Policy

Once your core questions are validated, establish clear guidelines:
  1. Verified numbers must be traceable to the warehouse — Any metrics cited in reports, decisions, or external communications should come from SourceMedium dashboards or validated AI Analyst queries (both query BigQuery)
  2. Exploratory analysis can use any tool — Quick hypothesis generation with Sidekick or ChatGPT is fine for exploration
  3. Cite your source — When reporting numbers, indicate whether they came from a validated AI Analyst query or exploratory analysis
Suggested policy: Metrics cited in business decisions should come from SourceMedium dashboards or validated AI Analyst queries. Generic AI tools are useful for exploration but should not be cited as data sources. In practice, this means a three-step workflow:
  1. Explore — Use any AI tool for quick hypothesis generation
  2. Verify — Confirm findings with the AI Analyst (review the SQL)
  3. Cite — Use verified numbers in reports; keep the SQL for reproducibility

Ongoing Validation

Regular Re-validation

  • Monthly: Re-run your canonical question set to catch any drift
  • After major changes: Re-validate when you change attribution windows, add new data sources, or modify business logic

Feedback

Every AI Analyst response includes a feedback button. Use it to report incorrect answers or flag issues—we review all feedback and use it to improve the system. See Feedback for details.

Automate Re-validation Today (DIY)

If you want automated evaluation now, you can build a lightweight workflow using existing tools:
  1. Save your canonical question set — Copy the validated SQL from AI Analyst into a shared doc or version-controlled repo (one query per question).
  2. Schedule re-runs — Use BigQuery Scheduled Queries (or your existing orchestrator) to run those queries daily/weekly and write outputs into a small ai_analyst_eval_runs table (sketched below).
  3. Review drift — Track changes across runs and re-validate when results move materially (often caused by upstream data backfills, attribution configuration changes, or logic updates).
The goal of automation is not to “prove correctness forever,” but to catch drift early so your team re-validates before numbers make it into reports.
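
A minimal sketch of steps 2 and 3, assuming an ai_analyst_eval_runs table in an analytics_meta dataset that you create yourself (the names and the 10% threshold are illustrative):

-- Step 2 sketch: append today's value for one canonical question.
-- Paste a statement like this into a BigQuery Scheduled Query.
INSERT INTO `your_project.analytics_meta.ai_analyst_eval_runs`
  (run_date, question, metric_value)
SELECT
  CURRENT_DATE(),
  'blended_roas_last_30d',
  SAFE_DIVIDE(SUM(sm_last_touch_revenue), SUM(ad_spend))
FROM `your_project.sm_experimental.rpt_ad_attribution_performance_daily`
WHERE sm_store_id = 'your-sm_store_id'
  AND date BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE();

-- Step 3 sketch: flag runs whose value moved more than 10% relative
-- to the previous run for the same question.
SELECT question, run_date, metric_value, prior_value
FROM (
  SELECT
    question,
    run_date,
    metric_value,
    LAG(metric_value) OVER (PARTITION BY question ORDER BY run_date) AS prior_value
  FROM `your_project.analytics_meta.ai_analyst_eval_runs`
)
WHERE prior_value IS NOT NULL
  AND ABS(SAFE_DIVIDE(metric_value - prior_value, prior_value)) > 0.10;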

Summary

| Tool Type | Best For | Trust Level for Reporting |
| --- | --- | --- |
| Generic AI assistants (ChatGPT, Gemini, platform chatbots) | Quick exploration, brainstorming | ❌ Do not cite |
| BI dashboards | Curated views of validated metrics | ✅ Cite (queries warehouse) |
| SourceMedium AI Analyst | Ad-hoc questions with SQL transparency | ✅ Cite after validation |
The goal isn’t to avoid other AI tools—they’re useful for exploration. The goal is to ensure numbers that inform decisions are traceable to your warehouse. Both SourceMedium dashboards and the AI Analyst provide this traceability.

Next Steps