Synthetic People
Test Studio
Our Test Studio reruns real-world surveys on the Synthetic People platform and measures how closely outcomes align with human results.
The benchmarks are transparent. The methodology is visible. We measure ourselves publicly.
Total studies executed across scenarios: 3
Average similarity between human and synthetic results: 80.2%
Alignment of directional trends across outputs: 94.9%
Scenarios tested in the suite: 3
Industries represented across all studies: 3
One built on stated responses. One built on behavior. Both answering the same questions.
Estimated Cost: $1,000-$1,600 (estimated at $5-$8 per response across the 200-person sample)
Estimated Time: 1-2 weeks (typical fieldwork duration based on sample size)
Estimated Effort: 80-120 hours (general time required for cleaning, analysis, and reporting)
Population Calibrated: 200
Behavior Signals: 18,932 actions from relevant people
Contextual Threads: 62 conversations inferred
Knowledge Bank: 31 sources analyzed
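Taken together, these four inputs form the calibration profile behind the synthetic panel. As a minimal sketch, assuming a simple schema (the class and field names below are hypothetical, not the platform's actual data model), they might be represented like this:

```python
# Hypothetical sketch of the calibration inputs listed above.
# Class and field names are assumptions, not the platform's actual schema.
from dataclasses import dataclass

@dataclass
class CalibrationProfile:
    population_calibrated: int  # people the synthetic panel is calibrated to
    behavior_signals: int       # observed actions from relevant people
    contextual_threads: int     # conversations inferred
    knowledge_sources: int      # sources analyzed

# Values taken from the study above.
study_one = CalibrationProfile(
    population_calibrated=200,
    behavior_signals=18_932,
    contextual_threads=62,
    knowledge_sources=31,
)
```

Each study below reports the same four inputs, so the same structure applies across scenarios.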
Estimated Cost: $2,499 (aligned to human survey coverage, same depth, no fieldwork)
Estimated Time: 3-4 hrs (weeks of survey cycles compressed into hours, same baseline)
Estimated Effort: 1-2 hrs (same rigor as human analysts, without the operational drag)
Avg. Similarity: 84.6%
How closely synthetic responses match human distributions across all questions, calibrated to the same survey structure, audience, and context.
Directional Alignment: 97.0%
How often both systems point to the same conclusion, even when the exact numbers differ.
Prediction Accuracy: 97.1%
How reliably synthetic outputs anticipate the dominant human choice across questions within this study.
Relationship Strength: 96.2%
How consistently patterns between options hold across both datasets, not just individual answers, but how they move together.
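To ground these four definitions, here is a minimal sketch of how such scores could be computed from paired answer-share matrices. The formulas chosen (distribution overlap for similarity, top-choice agreement for directional alignment and prediction accuracy, correlation for relationship strength) and all function names are illustrative assumptions, not the platform's published methodology.

```python
# Illustrative sketch only: formulas and names are assumptions,
# not the platform's published methodology.
import numpy as np

def avg_similarity(human: np.ndarray, synthetic: np.ndarray) -> float:
    """Mean per-question overlap between two response distributions
    (rows = questions, columns = answer-option shares summing to 1)."""
    return float(np.minimum(human, synthetic).sum(axis=1).mean())

def prediction_accuracy(human: np.ndarray, synthetic: np.ndarray) -> float:
    """Share of questions where both datasets have the same dominant option;
    directional alignment could be scored similarly at the conclusion level."""
    return float((human.argmax(axis=1) == synthetic.argmax(axis=1)).mean())

def relationship_strength(human: np.ndarray, synthetic: np.ndarray) -> float:
    """Pearson correlation of option shares across the two datasets."""
    return float(np.corrcoef(human.ravel(), synthetic.ravel())[0, 1])

# Two hypothetical questions with three answer options each.
human = np.array([[0.50, 0.30, 0.20], [0.40, 0.40, 0.20]])
synthetic = np.array([[0.45, 0.35, 0.20], [0.50, 0.30, 0.20]])
print(avg_similarity(human, synthetic))         # ~0.93
print(prediction_accuracy(human, synthetic))    # 1.0
print(relationship_strength(human, synthetic))  # ~0.84
```

Whatever the production implementation, each headline metric above reduces to some comparison between two answer-share matrices built from the same questionnaire.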
A signal-level view of where the synthetic persona mirrors human response patterns.
Among the major behavioral patterns evaluated, Sales Adoption (90.4%), Hiring Impact (89.0%), HR Adoption (87.0%), and Customer Support Adoption (86.1%) showed the strongest convergence with human survey outcomes.
The largest variation appears in emotionally framed questions about excitement toward AI adoption and in future investment expectation questions, which fall within the Preferences Shift and Lower Consistent Patterns ranges at 73.5% and 80.0% similarity, respectively.
Variation is most visible in future-oriented sentiment, investment intent, and emotional framing questions, where responses are naturally influenced by internal positioning, executive signaling, recent market exposure, and rationalization in the moment.

Go beyond the summary. Inspect every question, every distribution, and every gap, side by side.
One built on stated responses. One built on behavior. Both answering the same questions.
Estimated Cost: $41,250-$66,000 (estimated at $5-$8 per response across the 8,250-person sample)
Estimated Time: 4-5 weeks (typical fieldwork duration based on sample size)
Estimated Effort: 80-120 hours (general time required for cleaning, analysis, and reporting)
Population Calibrated: 8,250
Behavior Signals: 57,271 actions from relevant people
Contextual Threads: 119 conversations inferred
Knowledge Bank: 45 sources analyzed
Estimated Cost: $2,499 (aligned to human survey coverage, same depth, no fieldwork)
Estimated Time: 3-4 hrs (weeks of survey cycles compressed into hours, same baseline)
Estimated Effort: 1-2 hrs (same rigor as human analysts, without the operational drag)
Avg. Similarity: 79.4%
How closely synthetic responses match human distributions across all questions, calibrated to the same survey structure, audience, and context.
Directional Alignment: 93.1%
How often both systems point to the same conclusion, even when the exact numbers differ.
Prediction Accuracy: 96.8%
How reliably synthetic outputs anticipate the dominant human choice across questions within this study.
Relationship Strength: 76.1%
How consistently patterns between options hold across both datasets, not just individual answers, but how they move together.
A signal-level view of where the synthetic persona mirrors human response patterns.
Among the five signal categories evaluated, Emotion (88%), Reason / Explanation (82%), and Behavior (79%) showed the strongest convergence with human survey outcomes.
The largest variation appears in purchase-stage preference and future price expectation questions, which fall within the Preferences Shift range at 61.0% and 67.7% similarity, respectively.
Variation is most visible in timing, future outlook, and purchase-stage intent questions, where responses are naturally influenced by framing, recent market exposure, and rationalization in the moment.

Go beyond the summary. Inspect every question, every distribution, and every gap, side by side.
One built on stated responses. One built on behavior. Both answering the same questions.
Estimated Cost: $5,155-$8,248 (estimated at $5-$8 per response across the 1,031-person sample)
Estimated Time: 1-2 weeks (typical fieldwork duration based on sample size)
Estimated Effort: 80-120 hours (general time required for cleaning, analysis, and reporting)
Population Calibrated: 1,031
Behavior Signals: 421,872 actions from relevant people
Contextual Threads: 74 conversations inferred
Knowledge Bank: 37 sources analyzed
Estimated Cost: $2,499 (aligned to human survey coverage, same depth, no fieldwork)
Estimated Time: 3-4 hrs (weeks of survey cycles compressed into hours, same baseline)
Estimated Effort: 1-2 hrs (same rigor as human analysts, without the operational drag)
Avg. Similarity: 76.7%
How closely synthetic responses match human distributions across all questions, calibrated to the same survey structure, audience, and context.
Directional Alignment: 94.6%
How often both systems point to the same conclusion, even when the exact numbers differ.
Prediction Accuracy: 97.2%
How reliably synthetic outputs anticipate the dominant human choice across questions within this study.
Relationship Strength: 88.8%
How consistently patterns between options hold across both datasets, not just individual answers, but how they move together.
A signal-level view of where the synthetic persona mirrors human response patterns.
Among the six signal categories evaluated, Emotion (91.4%), Attribute (89.2%), Reasoning (86.4%), and Behavior (84.0%) showed the strongest convergence with human survey outcomes.
The largest variation appears in future food consumption intent and climate-related behavioral reporting, which fall within the Emerging Difference range at 54.3% and 63.8% similarity, respectively.
Variation is most visible in future-oriented consumption questions and sustainability-related self-reporting, where responses are naturally influenced by aspiration, framing, perceived social responsibility, and rationalization in the moment.

Go beyond the summary. Inspect every question, every distribution, and every gap, side by side.