CoreCMO

Channels & Execution


Reviews & Social Proof

Review platforms are the most influential channel at the bottom of the funnel. Late in their evaluation, prospects open a category-relevant review site — G2, Gartner Peer Insights, TrustRadius, or a vertical equivalent. If your presence there is weak, you lose deals you'll never hear about.


The framework — strategy first


Reviews & Social Proof — the strategic foundation.

STRATEGY & PROCESS

Review platforms are the most influential channel at the bottom of the funnel. Late in their evaluation, prospects open a category-relevant review site — G2, Gartner Peer Insights, or TrustRadius in horizontal B2B SaaS; KLAS Connect in healthcare; AdvisoryHQ in financial services; PeerSpot or SourceForge for IT and developer tools. If your presence there is weak, you lose deals you'll never hear about.

WHY REVIEW PLATFORMS MATTER

Most B2B buyers build a shortlist before talking to sales. Gartner's recurring CSO research has tracked this at 70–83% over the last several years, with the share trending up as self-serve research matures. The mechanism is the same regardless of the exact percentage: prospects on a review platform (G2, Gartner Peer Insights, TrustRadius, or a vertical equivalent) are at decision stage, which makes them the highest-intent audience marketing can reach. Review-site badges (G2 Leader, Gartner Customers' Choice, TrustRadius Top Rated, KLAS Best in KLAS, PeerSpot Tech Leader) act as third-party trust signals that close deals: buyers tell us in win/loss interviews that the badges resolved the final shortlist tie.

AI CHAT IS #1, REVIEW SITES ARE #2 (G2 BUYER RESEARCH, 2025–2026)

The shortlist source mix is shifting fast. In G2-published buyer research from 2025–2026, AI chat (ChatGPT, Perplexity, Claude) is now the #1 source buyers use to build software shortlists. Review sites are #2 at 54%. Vendor websites have dropped to #3 at 43%. The full ranking buyers reported using:

SHORTLIST SOURCE | % OF BUYERS USING (G2 BUYER RESEARCH, 2025–2026)
AI chat (ChatGPT, Perplexity, etc.) | #1 (highest-ranked source)
Software review sites | 54%
Vendor websites | 43%
Research analyst firms | 36%
Community forums | 32%
Peers & colleagues | 28%
Thought-leader content | 22%
Internal supplier portal | 22%
Vendor salesperson | 17%

What this means for review strategy: reviews still matter — 54% is enormous, and review-platform authority feeds the AI-chat layer above (the LLMs read G2, Gartner Peer Insights, and TrustRadius when answering "what's the best [category] software"). But the senior-operator move is to optimize for both surfaces: real reviews across the platforms your buyers consult — typically two of (G2, Gartner Peer Insights, TrustRadius) plus a vertical equivalent — with structured data the AI models can ingest. Reviews are no longer the destination; they're now also the supply chain for the AI inference layer above them.

The G2 syndication chain — one program, five surfaces

The single highest-leverage tactical move in B2B SaaS review strategy: invest in G2 review velocity, harvest five downstream surfaces automatically.

G2 reviews auto-syndicate across the marketplace surfaces buyers also consult during evaluation. The chain, as of 2026:

  • G2 itself: primary destination; LLMs cite verbatim
  • AWS Marketplace: software listings inherit G2 ratings; buyers comparing cloud-procured software see your G2 score next to your AWS listing
  • Azure Marketplace: same syndication pattern; enterprises with Microsoft commit dollars see your score
  • Capterra: now owned by G2 (acquired from Gartner Digital Markets); review syndication is rolling out, depending on category
  • LLM training and retrieval corpus: G2 is the #1 cited review source in ChatGPT, Perplexity, Claude, and Gemini responses to "what are the best tools for X"

The implication for the operator: a quarter of focused G2 review-velocity investment compounds across cloud marketplaces, the AI inference layer, AND the comparison-shopping surface. Treat G2 as the upstream — every other surface inherits from it. Build the program around G2 first; the syndication is the multiplier, not a separate workstream.

The review velocity formula — what actually moves a category ranking.

Review platforms are usually treated as a passive presence — your team responds to the reviews that come in, hopes the rating stays above 4.5, and lobbies for a Leader badge each quarter. The teams that move rankings treat reviews as a channel with its own velocity, its own funnel, and its own dedicated owner. The math:

The review velocity formula

(Net new reviews / 90 days) × (Average rating in those reviews) × (Recency weighting) = category momentum score.

Every major review platform — G2, Gartner Peer Insights, TrustRadius — weights recent reviews far more heavily than older ones. A 5-star review from this month moves your rank more than ten 5-star reviews from 18 months ago. The implication: review programs need to run continuously, not in bursts before badge season.
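
The formula above can be sketched in a few lines. This is a minimal illustration, not any platform's actual ranking algorithm: the `Review` type, the 90-day window default, and especially the exponential half-life used for the recency weighting are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Review:
    submitted: date
    rating: float  # 1.0 to 5.0 stars


def momentum_score(reviews, today, window_days=90, half_life_days=45.0):
    """(net new reviews / window) x (average rating) x (recency weighting)."""
    recent = [r for r in reviews if 0 <= (today - r.submitted).days < window_days]
    if not recent:
        return 0.0
    velocity = len(recent) / window_days  # reviews per day in the window
    avg_rating = sum(r.rating for r in recent) / len(recent)
    # Assumed recency weighting: mean exponential-decay weight. A review
    # submitted today weighs 1.0; one that is half_life_days old weighs 0.5.
    weights = [0.5 ** ((today - r.submitted).days / half_life_days) for r in recent]
    recency = sum(weights) / len(weights)
    return velocity * avg_rating * recency
```

With these assumptions, nine 5-star reviews submitted today score twice as high as the same nine reviews submitted 45 days ago, which is the "continuous, not bursts" point in numeric form.
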

Quarterly review campaigns — the cadence that wins Leader status

The pattern that gets a Leader badge on any major platform (G2, Gartner Peer Insights, TrustRadius): 20–30 net-new reviews per quarter, average rating 4.6+, every review responded to within 5 business days, 100% profile field completion. The math: 80–120 reviews per year, 40–60 of them in the trailing 6 months (the primary recency window for most platform algorithms), with response velocity that signals an engaged vendor to both the algorithm and prospects.

SOURCE | TARGET # PER QUARTER | CONVERSION RATE | OWNER
Net Promoter Score (NPS) promoters (8–10 score) | 10–15 | ~30% of asks convert | CS Ops
Quarterly Business Review (QBR) completions | 5–8 | ~40% of asks convert | Customer Success Managers (CSMs)
Event capture (QR codes at summit/booth) | 3–5 | ~15% of scans convert | Field Marketing
Post-close (30 days after activation) | 4–6 | ~25% of asks convert | Marketing Ops automation
In-product prompts (after milestone) | 2–4 | ~5% of triggers convert | Product + Marketing
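
To turn the table above into a workload, divide each target by its conversion rate. The sketch below does that with integer arithmetic; the targets and rates are copied from the table, while the names `SOURCES`, `asks_needed`, and `quarterly_plan` are illustrative.

```python
# (source, low target, high target, conversion rate in percent), from the table above.
SOURCES = [
    ("NPS promoters", 10, 15, 30),
    ("QBR completions", 5, 8, 40),
    ("Event capture", 3, 5, 15),
    ("Post-close", 4, 6, 25),
    ("In-product prompts", 2, 4, 5),
]


def asks_needed(target_reviews, conversion_pct):
    """Smallest number of asks whose expected yield meets the target (ceiling division)."""
    return (target_reviews * 100 + conversion_pct - 1) // conversion_pct


def quarterly_plan(sources=SOURCES):
    """Map each source to the (low, high) number of asks it needs per quarter."""
    return {
        name: (asks_needed(low, pct), asks_needed(high, pct))
        for name, low, high, pct in sources
    }


def total_target(sources=SOURCES):
    """Sum of the low and high review targets across all sources."""
    return sum(s[1] for s in sources), sum(s[2] for s in sources)
```

Under these rates the NPS source needs roughly 34–50 asks per quarter and in-product prompts need 40–80 triggers, and the five sources together target 24–38 reviews per quarter.
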

The review-site paid layer — buyer intent at decision stage

Review-site buyer-intent data is the most underrated paid investment in B2B SaaS. You're not paying for impressions; you're paying for the names of accounts actively comparing you to competitors on the platform right now. The three programs that ship this signal: G2 Buyer Intent, TrustRadius Intent Signals, and Gartner Peer Insights review activity. The ROI math justifies the spend for any company at $10M+ ARR: at $30K–$50K/year per platform for the buyer-intent feed plus category sponsorship, you typically surface 50–150 active comparison accounts/month. If 10% convert to opportunity and average Annual Contract Value (ACV) is over $25K, the unit economics are unbeatable. The senior-operator pattern: pick the one platform your buyers actually use (often G2 for SMB/mid-market, Gartner Peer Insights for enterprise, TrustRadius for IT-heavy buying) and run it deep before adding a second.
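
The ROI arithmetic in the paragraph above can be checked directly. The sketch uses only the figures stated there (accounts per month, 10% opportunity rate, $25K ACV, annual platform cost). It computes pipeline, not closed revenue, because the text gives no win rate; the function name is illustrative.

```python
def intent_pipeline(accounts_per_month, opp_rate_pct, acv, annual_cost):
    """Return (opportunities per year, pipeline per year, pipeline per dollar spent)."""
    # Whole opportunities only: integer division rounds down.
    opps_per_year = accounts_per_month * 12 * opp_rate_pct // 100
    pipeline = opps_per_year * acv
    return opps_per_year, pipeline, pipeline / annual_cost


# Conservative end: 50 accounts/month at the $50K price point.
low = intent_pipeline(50, 10, 25_000, 50_000)
# Optimistic end: 150 accounts/month at the $30K price point.
high = intent_pipeline(150, 10, 25_000, 30_000)
```

At the conservative end that is 60 opportunities and $1.5M of pipeline per year, or $30 of pipeline per dollar spent; at the optimistic end it is 180 opportunities and $4.5M.
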

Your Review Program

The platforms you target, your current state, your quarterly cadence. Every reviews / customer-marketing / sales-enablement prompt on the site uses these.

Saved as [PRIMARY REVIEW PLATFORM], [SECONDARY REVIEW PLATFORMS], [REVIEW QUARTERLY TARGET], [BUYER INTENT PROGRAM].

Review Platform Priority

Lead with the function — a review platform exists to deliver high-intent buyer-shortlist visibility plus third-party badge credibility. Then map platforms to audience. The senior-operator pattern: pick two horizontal platforms plus one vertical-specific platform; don't try to be everywhere.

PLATFORM | AUDIENCE | PRIORITY | KEY METRIC
G2 | SMB to mid-market buyers, IT and HR/Operations | Horizontal, common primary | Rating, review count, category badges
Gartner Peer Insights | Enterprise buyers, C-suite, procurement | Horizontal, common primary for enterprise | Rating, Customers' Choice recognition
TrustRadius | Mid-market and enterprise, IT-heavy buyers | Horizontal, common primary for IT-led buys | TrScore, vendor responses, Top Rated badge
Capterra (now owned by G2) | SMB, first-time buyers | Secondary | Rating, category rank
PeerSpot | IT/security buyers, mid-market to enterprise | Secondary | Tech Leader badge, peer-to-peer depth
SourceForge | Developer tools, open-source-adjacent buyers | Secondary | Category rank, downloads
KLAS Connect | Healthcare buyers (provider IT, EHR) | Vertical primary for healthcare | KLAS rating, Best in KLAS
AdvisoryHQ / vertical equivalents | Financial services, vertical-specific buyers | Vertical primary by sector | Vertical-specific metric

Review Generation Program

Program Architecture

Quarterly review campaigns: coordinate with Customer Success to identify promoters (NPS 8–10) and request reviews at high-satisfaction moments.

In-product prompts: trigger a review request after a customer achieves a milestone or completes a successful QBR.

Event capture: deploy QR code review stations at hosted events and customer summits.

CSM-led requests: each CSM should request 2 new reviews per quarter from their book of business.

Post-close emails: trigger a review request 30 days after a deal closes.

Review Request Best Practices

Link directly to the specific platform profile URL — never generic instructions.

Personalize the ask: reference a specific outcome the customer achieved.

Never incentivize reviews in ways that violate platform policies.

Respond to every review — positive and negative — within 5 business days. Responses are public.
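
The 5-business-day response window is easy to miscount across a weekend. A minimal sketch of the due-date calculation follows, assuming a Monday-to-Friday work week and ignoring public holidays; `response_due` is an illustrative name.

```python
from datetime import date, timedelta


def response_due(received, business_days=5):
    """Date by which a review received on `received` should have a response.

    Counts Monday to Friday only; public holidays are not modeled.
    """
    due = received
    remaining = business_days
    while remaining > 0:
        due += timedelta(days=1)
        if due.weekday() < 5:  # 0=Monday ... 4=Friday
            remaining -= 1
    return due
```

A review that lands on a Monday is due the following Monday; one that lands on a Friday is due the Friday after.
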

Review Volume Targets

PLATFORM | QUARTERLY GOAL | ANNUAL TARGET | OWNER
Horizontal primary (G2, Gartner Peer Insights, or TrustRadius) | [X] net new reviews/quarter | [Y] reviews/year | Marketing + CS
Horizontal secondary (the other two of G2/Gartner Peer Insights/TrustRadius) | [X] net new reviews/quarter | [Y] reviews/year | Marketing + CS
Vertical platform (KLAS, AdvisoryHQ, or sector equivalent) | [X] net new reviews/quarter | [Y] reviews/year | Marketing
Long-tail (Capterra, PeerSpot, SourceForge) | [X] net new reviews/quarter | [Y] reviews/year | Marketing

Profile Optimization — the work that applies to every platform

The optimization checklist below uses G2 as the worked example because it's the most-used platform. Every line applies equally to Gartner Peer Insights, TrustRadius, KLAS, and any other category-relevant review surface — the field names differ, the discipline is the same.

Complete 100% of profile fields: description, screenshots, videos, pricing, integrations, FAQs.

Update screenshots and product videos every 6 months to reflect current UI.

Feature 3–5 customer quotes directly on each platform profile (rotate quarterly).

Enable the platform's comparison widget on your website to intercept in-evaluation prospects (G2 ships this; TrustRadius and Gartner Peer Insights have equivalents).

Display active badges on homepage, pricing page, and all landing pages — G2 Leader, Gartner Customers' Choice, TrustRadius Top Rated, KLAS Best in KLAS, whichever you've earned.

If budget allows: enable the buyer-intent program on your primary platform (G2 Buyer Intent, TrustRadius Intent Signals, or Gartner Peer Insights activity tracking) to alert Sales when target accounts research your category.

Social Proof Asset Inventory

ASSET TYPE | USE CASES | REFRESH CADENCE
Customer logos (recognizable brands) | Homepage, pitch decks, proposals | Update when major logos land
Case studies with quantified metrics | Sales cycle, website, events | 2+ new per month
Video testimonials | Website, LinkedIn, events | 1 per month
ROI / business impact data | All channels | Quarterly from CS team
Review-platform badges (G2 Leader, Gartner Customers' Choice, TrustRadius Top Rated, KLAS Best in KLAS, PeerSpot Tech Leader) | All marketing channels | Update when new awards issued
Analyst coverage / quotes | PR, sales materials | As generated

The prompt pack


Paste-ready prompts for Reviews & Social Proof.

Each prompt is a deliverable named for what it does. Run each one against your Operator Brief.

Three copy-paste prompts. Open ChatGPT, Claude, or Gemini. Paste a prompt. Run it. The output of one prompt feeds into the next.

READ THIS ONCE BEFORE ANY PROMPT IN THIS BOOK

These prompts assume you've populated your Operator Brief (the worksheet that lives in /Operator-Brief-Worksheet.docx). When a prompt asks for OPERATOR BRIEF, paste the relevant Brief sections rather than typing context from scratch.

Your output then arrives in your voice, against your buyers, using your differentiators. Not [BRACKETED] generics. The Brief is the difference between an LLM helper and a tool that sounds like you.

Prompt 1

Review request sequence

A 3-touch review-request email program with subject, body, and exit logic — works for G2, Gartner Peer Insights, TrustRadius, or any platform-equivalent.

Write a 3-touch review-request program.
PLATFORM: [G2 / Gartner Peer Insights / TrustRadius / KLAS / other; name yours]
SEGMENT: [happy customers: NPS ≥ 9, tenure ≥ 6 months]
INCENTIVE: [platform-compliant, e.g., G2 standard $25 gift card; Gartner Peer Insights = none allowed]
3 emails: (T0) ask, (T+5) reminder with the question they'll answer, (T+12) break-up.
Subject + body + CTA + the auto-suppression criterion.
Honor each platform's incentive rules.
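
Once the emails are written, the sequence needs send dates and exit logic. The sketch below schedules the three touches at the offsets named in the prompt. The suppression rule (stop once the customer has reviewed or opted out) is an assumption, since the prompt leaves the criterion for the model to propose; all names are illustrative.

```python
from datetime import date, timedelta

# Day offsets taken from the prompt: T0 ask, T+5 reminder, T+12 break-up.
TOUCHES = [(0, "ask"), (5, "reminder"), (12, "break-up")]


def remaining_touches(start, today, review_submitted=False, opted_out=False):
    """Touches still to be sent on or after `today`.

    Assumed auto-suppression criterion: stop the sequence as soon as the
    customer has submitted a review or opted out.
    """
    if review_submitted or opted_out:
        return []
    schedule = [(start + timedelta(days=offset), name) for offset, name in TOUCHES]
    return [(send_on, name) for send_on, name in schedule if send_on >= today]
```

For a sequence started on 2 March, only the break-up email (14 March) remains by 8 March, and nothing remains once a review is in.
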

Prompt 2

Social-proof asset inventory

An audit of social proof assets we have and what's missing.

Audit our social-proof assets.
ASSETS WE HAVE: [list: testimonials, case studies, video, logos, review-platform badges (G2 / Gartner Peer Insights / TrustRadius / vertical), awards]
TARGET PERSONAS: [3]
For each persona: which assets resonate, which are missing, the one asset to build first.
Output as table.

Prompt 3

Review response framework

A response framework for 5-star, 3-star, and 1-star reviews.

Write our review-response framework.
BRAND VOICE: [3 adjectives]
COMMON CONCERNS: [list 3]
Outputs: response template for 5★, 3★, 1★. Decision tree for escalations. The do-not-respond conditions.

The agent spec


The agent for Reviews & Social Proof.

How to install this agent

Five steps from spec to running agent.

  1. System prompt — copy the system prompt block below into your AI tool's system prompt field (Claude Project instructions, Cowork Skill instructions, custom GPT config, or your agent platform's equivalent).
  2. Inputs — wire the inputs as the agent's reference files. The Operator Brief is always input #1; the other inputs vary by agent.
  3. Outputs — the output schema tells you what the agent produces. Use it as a structured-output instruction in the system prompt, or as the format you expect to see back.
  4. Evals — before publishing any output, score it against the eval criteria. Don't ship anything that doesn't pass.
  5. Cadence — set the run cadence on your calendar (or your agent platform's scheduler). Log every run in your wins log.

Reviews Agent

Operates review-site presence as a program — G2 first (it’s the upstream that auto-syndicates to AWS Marketplace, Azure Marketplace, Capterra, and the LLM citation corpus), TrustRadius / Gartner Peer Insights secondary. Drafts customer review-request sequences tied to lifecycle moments (onboarding, QBR, NPS-promoter, post-renewal), monitors review velocity AND syndication health, surfaces sentiment shifts, drafts vendor responses to public reviews.

Who is this agent
Identity card
Name | Reviews Agent
Role | Review-site presence operations: the social-proof velocity layer
Owner | Director of Customer Marketing (with Head of PMM partnership)
Reports to | VP Marketing + CRO
Version | v0.5 (supervised)
Surface | Replit + Postgres (review corpus + sentiment history) + integration with review APIs
Output target | /reviews/per-platform/, /reviews/sentiment-tracker.md, /reviews/request-queue/
Review cadence | Daily new-review watch; weekly request-sequence health; quarterly platform-by-platform strategy review
Mission
Run review-site presence as a program, not a Slack-channel reminder. Treat G2 as the upstream investment — it’s the #1 cited review source in LLM responses to "what are the best tools for [category]" AND it auto-syndicates to AWS Marketplace, Azure Marketplace, and Capterra (now owned by G2). One G2 program = five downstream surfaces. Draft customer review-request sequences with the right contacts at the right moments (NPS-promoters at 8-10, post-QBR satisfaction, onboarding completion, expansion-stage advocates). Monitor review velocity (target: 20-30 net-new reviews per quarter, 4.6+ average rating, every review responded to within 5 business days). Watch sentiment shifts. Watch LLM citation share-of-voice for category-relevant prompts (do LLMs cite our G2 profile when answering "alternatives to [competitor]"?). Draft thoughtful vendor responses. Maintain the social-proof flywheel without burning the customer relationship.
Goals & KPIs the agent moves
Leading indicators — the agent controls these
Time-to-vendor-response on new public reviews | < 48 hours
Review-request sends to qualified closed-won customers within 30 days of go-live | ≥ 80%
Lagging indicators: downstream outcomes with review triggers
Net new reviews per quarter on the primary category platform. Trigger: 2 consecutive quarters below 75% of the agreed quarterly target pages the Head of Customer Marketing for advocacy-program review. | Quarterly target set per platform
Average review rating across primary platforms. Trigger: drop below 4.2 / 5 on a rolling 90-day window pages the VP Marketing and Head of Product for real-customer-feedback review (this is a product signal, not a review-volume signal). | ≥ 4.5 / 5
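
The two lagging-indicator triggers above are simple threshold checks. A minimal sketch, assuming quarterly counts arrive oldest-first and ratings are already filtered to the rolling 90-day window; both function names are illustrative.

```python
def volume_trigger(quarterly_counts, quarterly_target, floor_pct=75):
    """True when the two most recent quarters are both below 75% of target."""
    if len(quarterly_counts) < 2:
        return False
    # Integer comparison avoids floating-point edge cases at the threshold.
    return all(count * 100 < floor_pct * quarterly_target
               for count in quarterly_counts[-2:])


def rating_trigger(ratings_last_90_days, floor=4.2):
    """True when the rolling 90-day average rating is below 4.2 / 5."""
    if not ratings_last_90_days:
        return False
    return sum(ratings_last_90_days) / len(ratings_last_90_days) < floor
```

With a target of 25 reviews per quarter, counts of 18 and then 17 fire the volume trigger because both sit under the 18.75 floor; a single weak quarter does not.
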
What it does
Task list
  1. Daily: Scan G2 / TrustRadius / Capterra / Gartner Peer Insights for new reviews. Pull rating + verbatim + reviewer profile.
  2. Daily: Draft vendor response for any new public review, positive (acknowledgment) or critical (thoughtful engagement). Submit to Director Customer Marketing.
  3. Daily: Watch sentiment trends: average rating, common praise themes, common complaint themes. Surface emerging patterns.
  4. Weekly: Build the review-request queue. Per-platform request sequences targeting customers with right tenure + right satisfaction signal + right persona-platform fit.
  5. Weekly: Compile weekly Reviews digest: new reviews, sentiment trends, response coverage, top-line themes.
  6. Weekly: Cross-reference Win/Loss themes against review themes. Surface alignment (customer story matches lost-deal pattern) + divergence (reviews say one thing, lost deals say another).
  7. Monthly: Platform-level audit: per-platform velocity vs. category-leader benchmark. Surface platforms where we’re losing share-of-voice.
  8. Quarterly: Strategy review per platform. Are we investing in the right ones? Do platform-level dynamics support our positioning?
  9. Event: When Customer Marketing Agent surfaces a new advocate, queue them for review-request consideration.
  10. Event: When a critical review (< 3 stars) lands, page Director Customer Marketing + Director Customer Success within 4 hours.
Schedule grid
Task | Frequency | Duration | Output goes to
Daily new-review scan | Daily 08:00 | ~20 min | Director Customer Marketing + Director CS
Daily vendor-response drafting | Daily 09:00 | ~30 min | Director Customer Marketing (approval)
Daily sentiment trend watch | Daily 17:00 | ~10 min | Director Customer Marketing
Weekly review-request queue | Weekly Tue 10:00 | ~60 min | Director Customer Marketing + assigned CSMs
Weekly Reviews digest | Weekly Fri 14:00 | ~30 min | Director Customer Marketing + Head of PMM + VP Marketing
Weekly Win/Loss cross-reference | Weekly Fri 14:30 | ~20 min | Win/Loss Agent + Head of PMM
Monthly platform audit | Monthly 1st | ~60 min | Director Customer Marketing + Head of PMM
Quarterly strategy review | Quarterly Q-1 days | ~2 hours | VP Marketing + Director Customer Marketing
Triggers

Scheduled (cron-style):

Schedule | What it runs
0 8 * * * | Daily new-review scan
0 9 * * * | Daily vendor-response drafting
0 17 * * * | Daily sentiment watch
0 10 * * 2 | Weekly review-request queue
0 14 * * 5 | Weekly Reviews digest
0 9 1 * * | Monthly platform audit
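
To sanity-check the cron expressions above against the schedule grid, here is a deliberately small matcher. It only understands plain numbers and `*` in each of the five fields, which is all this table uses; it is not a general cron parser, and `matches` and `due_tasks` are illustrative names.

```python
from datetime import datetime

SCHEDULE = {
    "0 8 * * *": "Daily new-review scan",
    "0 9 * * *": "Daily vendor-response drafting",
    "0 17 * * *": "Daily sentiment watch",
    "0 10 * * 2": "Weekly review-request queue",
    "0 14 * * 5": "Weekly Reviews digest",
    "0 9 1 * *": "Monthly platform audit",
}


def matches(expr, when):
    """True if a 5-field cron expression (numbers or '*' only) fires at `when`."""
    fields = expr.split()  # minute, hour, day-of-month, month, day-of-week
    cron_dow = (when.weekday() + 1) % 7  # cron: 0=Sunday; Python: 0=Monday
    actual = [when.minute, when.hour, when.day, when.month, cron_dow]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


def due_tasks(when):
    """Names of the scheduled tasks that fire at exactly `when`."""
    return [task for expr, task in SCHEDULE.items() if matches(expr, when)]
```

The check confirms that day-of-week 2 is Tuesday and 5 is Friday, matching the "Weekly Tue 10:00" and "Weekly Fri 14:00" rows in the schedule grid.
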

Event-driven:

Event | What it runs
New public review lands on any tracked platform | Draft vendor response within 24 hours; submit to Director Customer Marketing
Critical review (< 3 stars) | Page Director Customer Marketing + Director CS within 4 hours
Customer Marketing Agent surfaces a new advocate | Queue for review-request consideration at next refresh
Review velocity drops > 30% MoM on a primary platform | Page Director Customer Marketing + Head of PMM
Average rating drops < 4.0 on any platform | Page Director Customer Marketing + CRO + CPO (this is a product / CS signal, not a review-tactics signal)
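
The event-driven rules above reduce to a handful of comparisons. A minimal sketch follows, with hypothetical event labels; the thresholds (under 3 stars, more than 30% month-over-month, under 4.0 average) come from the table.

```python
def velocity_dropped(prev_month_reviews, this_month_reviews, threshold_pct=30):
    """True when new reviews fell by more than threshold_pct month over month."""
    if prev_month_reviews <= 0:
        return False
    drop = prev_month_reviews - this_month_reviews
    # Integer comparison, so a drop of exactly 30% does not fire.
    return drop * 100 > threshold_pct * prev_month_reviews


def fired_events(new_review_stars, prev_month_reviews, this_month_reviews, avg_rating):
    """List the actions to run; pass new_review_stars=None if no review landed."""
    events = []
    if new_review_stars is not None:
        events.append("draft_vendor_response")
        if new_review_stars < 3:
            events.append("page_critical_review")
    if velocity_dropped(prev_month_reviews, this_month_reviews):
        events.append("page_velocity_drop")
    if avg_rating < 4.0:
        events.append("page_low_average_rating")
    return events
```

A 2-star review arriving in a month where volume fell from 10 to 6 fires three actions at once: the response draft, the critical-review page, and the velocity page.
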
Who it works with
Inputs
Source | Type | Cadence | Required?
Operator Brief (Sections 6, 8) | Markdown | Read every run | Required
Review platform APIs (G2 / TrustRadius / Capterra / Gartner Peer Insights) | API | Daily | Required if platforms in use
Customer Success engagement + advocate signal | Gainsight / ChurnZero API | Daily | Required
Customer Marketing Agent advocate roster | Markdown | Continuous | Required
Win/Loss Agent themes | Markdown | Per-interview | Required
Account Intel Hub per-account state | JSON | Live query | Required for request targeting
Brand Voice Agent scoring API | Inline call | Per-response-draft | Required
Outputs
Output | Format | Target path | Audience
Per-platform review feed + sentiment tracker | Markdown + JSON | /reviews/per-platform/<platform>-tracker.md | Director Customer Marketing + Head of PMM
Vendor response drafts | Markdown | /reviews/responses/<platform>-<review-id>.md | Director Customer Marketing (approval)
Review-request queue | Markdown | /reviews/request-queue/YYYY-WW.md | CSMs + Director Customer Marketing
Weekly Reviews digest | Markdown | /reviews/digests/YYYY-WW.md | Director Customer Marketing + Head of PMM + VP Marketing
Monthly platform audit | Markdown + chart | /reviews/audits/YYYY-MM.md | Director Customer Marketing + Head of PMM
Quarterly strategy recommendations | Markdown | /reviews/strategy/Q<n>.md | VP Marketing + Director Customer Marketing
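
The target paths above use date-stamped filenames. The sketch below builds them, assuming that `YYYY-WW` means ISO year plus zero-padded ISO week number; the spec does not say which week convention is intended, so treat that as an assumption. Function names are illustrative.

```python
from datetime import date


def weekly_path(kind, day):
    """Path for a weekly artifact such as the digest or the request queue.

    Assumes YYYY-WW means ISO year and ISO week number, zero-padded.
    """
    iso_year, iso_week, _ = day.isocalendar()
    return f"/reviews/{kind}/{iso_year}-{iso_week:02d}.md"


def monthly_audit_path(day):
    """Path for the monthly platform audit (YYYY-MM)."""
    return f"/reviews/audits/{day.year}-{day.month:02d}.md"


def response_path(platform, review_id):
    """Path for a vendor-response draft, keyed by platform and review id."""
    return f"/reviews/responses/{platform}-{review_id}.md"
```

Using the ISO year matters at year boundaries: 1 January 2027 falls in ISO week 53 of 2026, so a digest written that day would be filed under 2026-53 rather than a calendar-year label.
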
↑ Upstream — agents/sources that feed this one
  • Customer Marketing Agent. Surfaces advocate roster for review-request targeting.
  • Account Intel Hub. Per-account engagement + satisfaction signal informs which customers to ask + when.
  • Win/Loss Agent. Themes to cross-reference against review patterns.
  • Operator Brief (human-maintained). Voice + brand pillars ground every vendor response.
  • Brand Voice Agent. Scores every vendor-response draft.
↓ Downstream — agents/humans that consume its output
  • Director Customer Marketing (human). Approves every vendor response + reviews request queue.
  • CSMs (humans). Receive review-request drafts; send personalized requests to assigned customers.
  • Director CS (human). Receives critical-review alerts.
  • Web Operations Agent. Receives review-velocity data for landing-page social-proof updates.
  • Account Intel Hub. Receives reviewer-engagement signals for the per-account record.
  • PR Comms Agent. Receives review-volume + sentiment data for analyst briefings.
Human escalation paths
Trigger condition | Escalate to | Within
Critical review (< 3 stars) lands | Director Customer Marketing + Director CS | < 4 hours
Average platform rating drops < 4.0 | Director Customer Marketing + CRO + CPO | Same business day
Review velocity drops > 30% MoM on a primary platform | Director Customer Marketing + Head of PMM | < 7 days
Vendor response > 48 hours unposted | Director Customer Marketing | Same business day
Review-request send-rate hits Comms Governance cap | Director Customer Marketing + Director Lifecycle | Same business day
How to build it
System prompt
You are the Reviews Agent for [COMPANY].

YOUR JOB
Run review-site presence as a program. Operate G2 / TrustRadius / Capterra / Gartner Peer Insights as distinct surfaces. Draft request sequences. Monitor velocity + sentiment. Draft vendor responses. Maintain the social-proof flywheel without burning the customer relationship.

INPUTS (always read in this order)
1. /operator-brief.md (Sections 6, 8)
2. /reviews/platforms.yaml - tracked platforms + per-platform strategy
3. /reviews/corpus/ - historical review corpus + sentiment history
4. Today's review platform API output
5. /customer-marketing/advocate-roster.md (from Customer Marketing Agent)

OUTPUTS
- /reviews/per-platform/<platform>-tracker.md
- /reviews/responses/<platform>-<review-id>.md (draft per new review)
- /reviews/request-queue/YYYY-WW.md (weekly)
- /reviews/digests/YYYY-WW.md (weekly)

RULES
1. Every vendor response is in the Brief Section 8 voice. Brand Voice Agent scores BEFORE Director approval.
2. Vendor response within 48h on every new review (positive or critical).
3. Critical (<3 star) reviews escalate to Director CS in <4h - they often indicate a product / CS problem, not a tactical review-response problem.
4. Review-request sequences honor Comms Governance Agent caps.
5. Never request a review from an unhappy customer - check CS health signal.
6. Never offer incentives that violate platform Terms of Service.
7. Per-platform strategy varies: G2 prioritizes recency + verified-buyer; Gartner Peer Insights prioritizes enterprise-tier reviewers; TrustRadius prioritizes feature-level depth.

ESCALATION
- Critical review: Director Customer Mktg + Director CS within 4h.
- Avg rating <4.0: Director + CRO + CPO same business day.
Tools & integrations
Platform / tool | Used for | Required?
Replit + Postgres (review corpus + sentiment history) | State + audit | Required
G2 API + Vendor Dashboard | G2 presence operations | Required if G2 active
TrustRadius / Capterra / Gartner Peer Insights APIs or portals | Per-platform operations | Required per active platform
Gainsight / ChurnZero API | Customer health signal (don’t request from unhappy customers) | Required if CS platform in use
Comms Governance Agent API | Per-request gating | Required
Brand Voice Agent API | Per-response scoring | Required
Slack API | Daily digest + critical-review alerts | Required
Guardrails — what it must not do
  • Never request a review from a customer with red CS health signal.
  • Never offer incentives that violate platform Terms of Service (G2 has explicit rules here).
  • Never publish vendor responses without Director Customer Marketing approval + Brand Voice Agent scoring.
  • Honor Comms Governance Agent send-rate caps on review-request sequences.
  • Never share critical-review content publicly without Legal review.
  • Never use review-platform data to score CSM performance — that’s a CS leadership decision.
  • Honor customer anonymity preferences when known.
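
Two of the guardrails above (no requests to red-health customers, and the send-rate cap) can be enforced as a gate in front of the request queue. A minimal sketch: the `Customer` fields, the health labels, and the default cap of one request per quarter are hypothetical, since the actual cap is set by the Comms Governance Agent.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    cs_health: str  # "green", "yellow", or "red" (hypothetical labels)
    requests_sent_this_quarter: int


def may_request_review(customer, send_cap=1):
    """Return (allowed, reason) for a review request under the guardrails."""
    if customer.cs_health == "red":
        return False, "red CS health signal"
    if customer.requests_sent_this_quarter >= send_cap:
        return False, "send-rate cap reached"
    return True, "ok"
```

Returning the reason alongside the decision keeps the weekly request queue auditable: every skipped customer carries the guardrail that excluded them.
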
Evals + hallucination defense

Evals — output quality checks:

  1. Vendor response latency. Weekly: p99 time from new-review-detected to response posted. Target < 48 hours.
  2. Request-to-review conversion. Monthly: % of review requests that result in a review. Target ≥ 25%.
  3. Sentiment-shift early warning. Quarterly: was a downward sentiment shift surfaced before a public investor / press conversation? Target 100%.
  4. Voice-fidelity. Monthly: Brand Voice Agent score across vendor responses. Target ≥ 4.2.
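
The first two evals above are straightforward to compute from run logs. The sketch below uses the nearest-rank method for p99, which is one common percentile definition and is an assumption here, as are the function names and the shape of the inputs.

```python
def p99_hours(latencies_hours):
    """99th-percentile response latency using the nearest-rank method."""
    if not latencies_hours:
        raise ValueError("no responses to score")
    ordered = sorted(latencies_hours)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n), 1-based
    return ordered[rank - 1]


def eval_report(latencies_hours, requests_sent, reviews_received):
    """Score latency (target < 48h) and request-to-review conversion (target >= 25%)."""
    p99 = p99_hours(latencies_hours)
    conversion_pct = reviews_received * 100 / requests_sent
    return {
        "p99_hours": p99,
        "latency_pass": p99 < 48,
        "conversion_pct": conversion_pct,
        "conversion_pass": conversion_pct >= 25,
    }
```

With small weekly samples the p99 is effectively the slowest response, so a single response that sat for three days fails the latency eval for that week.
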

Hallucination defense — specific checkpoints:

  • Review counts + ratings must trace to platform API exports with timestamp.
  • Sentiment claims must cite verbatim review excerpts, not synthesized summaries.
  • Per-platform benchmarks must cite source (platform analytics, public reports, manual research).
  • Customer-health claims must trace to CS platform signal, not inferred.
  • When platform API is silent for > 48 hours, surface the data gap.
Maturity curve + first-run checklist
v0.1 (Manual-assist) | Agent compiles weekly digest. Vendor responses drafted on-request. Useful from day 1.
v0.5 (Supervised) | Daily scan + response drafting + weekly digest. Director Customer Marketing approves every response. Default ship state.
v1.0 (Semi-autonomous) | After 90 days clean evals + voice-fidelity ≥ 4.2, agent can auto-publish vendor responses to 4+ star reviews. Critical reviews always require Director approval.
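
The v1.0 promotion rule can be written as a single gate. A minimal sketch, assuming the version is tracked as a string and the eval history is summarized as a count of consecutive clean days; `can_auto_publish` is an illustrative name.

```python
def can_auto_publish(agent_version, stars, clean_eval_days, voice_score):
    """Auto-publish only at v1.0, only for 4+ star reviews, only once the gate is met."""
    gate_met = clean_eval_days >= 90 and voice_score >= 4.2
    return agent_version == "v1.0" and gate_met and stars >= 4
```

Everything that fails the gate, including every review under 4 stars, stays on the Director-approval path.
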

First-run checklist — 5 steps from spec to running agent:

  1. Author per-platform strategy YAML with Director Customer Marketing + Head of PMM.
  2. Wire platform APIs + CS health signal + Comms Governance + Brand Voice Agent.
  3. Run in shadow mode for 2 weeks. Director approves every response.
  4. Turn on live. Subscribe Director Customer Marketing + Head of PMM to weekly digest.
  5. Schedule quarterly strategy review. Log every run.