AI Visibility Tools Are Lying to You

#ai #llm #ethics #news

A post titled "Every AI Visibility Tool Is Lying to You" appeared on Hacker News and drew 13 points with 2 comments. The linked analysis at canonry.ai argues that commercial dashboards reporting LLM citations and brand mentions contain systematic overcounts.

What the Post Actually Shows

The article demonstrates that tools scrape a narrow set of prompts, then extrapolate to claim broad visibility. It lists repeated cases where reported citations did not appear when the same prompts were run directly in the target models.

Evidence from the HN Thread

Early comments noted the absence of prompt sampling methodology and lack of timestamped verification. One thread participant asked for raw prompt lists; none were supplied by the tool vendors mentioned.

Common Measurement Errors

Most tools share three recurring flaws:

Single-run prompt tests treated as longitudinal data
Failure to account for model version drift
Inclusion of partial string matches as full citations

These flaws produce inflated percentages that fall by 40-70% when the same prompts are retested in fresh sessions.
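To make the partial-match and single-run problems concrete, here is a minimal Python sketch. The brand name "Acme", the function names, and the whole-word rule are illustrative assumptions, not how any particular vendor counts citations.

```python
import re


def is_partial_match(output: str, brand: str) -> bool:
    """Loose substring check: the kind of match that can inflate counts."""
    return brand.lower() in output.lower()


def is_full_citation(output: str, brand: str) -> bool:
    """Stricter check (an assumption): the brand must appear as a whole word or phrase."""
    pattern = r"(?<!\w)" + re.escape(brand) + r"(?!\w)"
    return re.search(pattern, output, flags=re.IGNORECASE) is not None


def citation_rate(outputs: list[str], brand: str) -> float:
    """Share of repeated runs that contain a full citation, not a single yes/no."""
    if not outputs:
        return 0.0
    hits = sum(is_full_citation(text, brand) for text in outputs)
    return hits / len(outputs)


if __name__ == "__main__":
    # Hypothetical outputs from four runs of the same prompt.
    runs = [
        "Try Acme today",
        "Acmeology is a growing field",
        "No brands mentioned here",
        "acme is fine for small teams",
    ]
    print(citation_rate(runs, "Acme"))
```

Counting over several runs of the same prompt, rather than one, is what turns a single observation into something closer to a rate.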

How to Run Your Own Checks

Use a fixed prompt set of 50 queries across three models. Record exact output strings and dates. Store results in a simple spreadsheet rather than a paid dashboard. Re-run the same set monthly to track changes.
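The logging step can be sketched in Python as follows. The `ask` callable is a hypothetical stand-in for whatever client you use to query each model, and the column names and version labels are assumptions. The CSV file opens directly in any spreadsheet application.

```python
import csv
from datetime import date
from pathlib import Path
from typing import Callable, Iterable, Mapping, Optional

# Assumed column layout; adjust to taste.
FIELDS = ["run_date", "model", "model_version", "prompt", "output"]


def record_run(
    prompts: Iterable[str],
    models: Mapping[str, str],        # model name -> version label you noted down
    ask: Callable[[str, str], str],   # hypothetical: (model, prompt) -> output text
    path: str,
    run_date: Optional[str] = None,
) -> int:
    """Append one row per (model, prompt) pair and return the number of rows written."""
    run_date = run_date or date.today().isoformat()
    target = Path(path)
    is_new = not target.exists()
    prompt_list = list(prompts)
    written = 0
    with target.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # header only on first use of the file
        for model, version in models.items():
            for prompt in prompt_list:
                writer.writerow({
                    "run_date": run_date,
                    "model": model,
                    "model_version": version,
                    "prompt": prompt,
                    "output": ask(model, prompt),  # exact output string, unedited
                })
                written += 1
    return written
```

Appending to the same file each month keeps every run side by side, and the version column makes model drift visible when results change.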

Tool Claims vs Direct Testing

Approach	Reported Visibility	Verified on Retest	Cost
Commercial AI visibility platforms	65-85%	25-40%	$99+/mo
Manual prompt sampling	N/A	25-40%	Free
Google Search Console + logs	Exact URL data	Matches logs	Free
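As a rough arithmetic check on the first row of the table, the gap between reported and verified visibility can be expressed as a relative drop. This is a sketch that assumes the percentages are comparable shares of the same prompt set; the function name is illustrative.

```python
def relative_drop(reported: float, verified: float) -> float:
    """Fraction by which the verified figure falls short of the reported one."""
    if reported <= 0:
        raise ValueError("reported must be positive")
    return (reported - verified) / reported


if __name__ == "__main__":
    # Smallest and largest gaps implied by the ranges 65-85% and 25-40%.
    print(f"smallest gap: {relative_drop(65, 40):.0%}")
    print(f"largest gap:  {relative_drop(85, 25):.0%}")
```

Under that reading, the two ends of the table's ranges give relative drops of roughly 38% and 71%.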

Who Should Skip Paid Tools

Teams running fewer than 200 brand queries per month gain nothing from subscription dashboards. Researchers needing reproducible citation counts should maintain their own prompt corpus instead.

Practical Next Steps

Export your current tool's prompt list if available. Replicate the top 20 queries in ChatGPT, Claude, and Gemini within 24 hours. Compare outputs against the vendor report. Discrepancies above 30% indicate the tool is not reliable for decision-making.
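One way to run that comparison is sketched below in Python. It assumes "discrepancy" means the share of shared prompts where the vendor's cited/not-cited flag disagrees with what you observed. That definition, the function names, and the data shape are assumptions, since the post does not specify how the 30% is calculated.

```python
from typing import Mapping


def discrepancy_rate(vendor: Mapping[str, bool], observed: Mapping[str, bool]) -> float:
    """Share of shared prompts where vendor and direct test disagree on citation."""
    shared = vendor.keys() & observed.keys()
    if not shared:
        raise ValueError("no prompts in common between vendor report and direct test")
    mismatches = sum(vendor[p] != observed[p] for p in shared)
    return mismatches / len(shared)


def is_reliable(
    vendor: Mapping[str, bool],
    observed: Mapping[str, bool],
    threshold: float = 0.30,  # the 30% cutoff from the text
) -> bool:
    """True when the discrepancy does not exceed the threshold."""
    return discrepancy_rate(vendor, observed) <= threshold


if __name__ == "__main__":
    # Hypothetical data: the vendor claims a citation on all 10 prompts,
    # direct testing finds one on only 6 of them.
    vendor_report = {f"prompt {i}": True for i in range(10)}
    direct_test = {f"prompt {i}": i < 6 for i in range(10)}
    print(discrepancy_rate(vendor_report, direct_test))
    print(is_reliable(vendor_report, direct_test))
```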

Bottom line: Direct prompt sampling remains the only method that matches actual model outputs.

Commercial visibility platforms will continue to sell smoothed aggregates until buyers demand raw prompt logs and version-specific results.

PromptZone - Leading AI Community for Prompt Engineering and AI Enthusiasts
