
Sound Advice: What's it like to talk to an AI survey agent?

Terac Team

"Hi! I'm a research assistant calling on behalf of a client. Is now a good time to talk?"

"Yep, I can talk."

A few days ago, the inaugural class of Terac.com voice agents made their first calls to consumers worldwide. To ask some critical questions. About monkeys.

The idea might seem a little bananas, but testing conversational agents is all about stress. If an AI interviewer can spark a genuine dialogue about a quirky topic, it can handle any survey. When an AI voice agent calls you, it doesn't play a pre-recorded script or march through a prescribed list of questions; it engages, listens, adapts, and learns in real time. Survey companies report that the typical response rate for phone surveys is 18% (Williams, 2025), so they can innovate in two ways: (i) increase response rates, or (ii) make those 18% of potential interviews more engaging and valuable. On both fronts, AI survey agents promise dramatic improvements on the participation, completion, and quality problems that have long plagued traditional telephone research.

Traditional Telephone Surveys

Traditional telephone surveys are a declining paradigm. Recent public opinion polling shows response rates as low as 7%, driven in part by the roughly 3.4 billion robocalls placed every month (Kennedy and Hartig, 2019). Refusals are driven by:

  • Caller-ID and "spam" labeling, which deflects up to 30% of calls to voicemail.
  • Robocall fatigue
  • Intrusive perceptions: Over 80% of potential respondents cite time constraints or privacy concerns as reasons for refusal.

Another study found that, among households that chose to respond, reaching some required as many as 12 call attempts, though most responded after three or four calls.

Figure: Number of Calls to Households that Complete (McGuckin et al., 2025)

AI Voice Agents

AI-powered voice agents transform surveys into conversational interactions, combining text-to-speech, automatic speech recognition, and large language models to produce human-like dialogue. Early deployments report:

  • 60% higher completion rates compared to traditional forms.
  • 15% reduction in drop-offs, as conversational flow and natural follow-up questions maintain engagement.
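As a hedged illustration of the pipeline described above, the turn-taking logic an agent runs between speech recognition and speech synthesis might look like the following minimal Python sketch. The class, method names, and decision rules here are hypothetical stand-ins for illustration, not Terac's actual system; a production agent would call an LLM where this sketch applies a simple length heuristic.

```python
# Minimal sketch of a survey agent's turn loop (assumed design, not Terac's API).
# Input text is assumed to already be transcribed by ASR; the returned prompt
# would be handed to a TTS engine to speak.

from dataclasses import dataclass, field


@dataclass
class SurveyAgent:
    questions: list                               # ordered survey items
    history: list = field(default_factory=list)   # (speaker, text) turns
    index: int = 0                                # next question to ask

    def next_prompt(self, respondent_text: str) -> str:
        """Record the respondent's transcribed answer and return what to say next."""
        self.history.append(("respondent", respondent_text))
        # A real agent would query an LLM here to choose between a natural
        # follow-up and the next scripted item; this sketch just treats very
        # short answers to a survey question as needing a follow-up.
        if len(respondent_text.split()) < 3 and self.index > 0:
            prompt = "Could you say a bit more about that?"
        elif self.index < len(self.questions):
            prompt = self.questions[self.index]
            self.index += 1
        else:
            prompt = "That's all my questions -- thank you for your time!"
        self.history.append(("agent", prompt))
        return prompt
```

The key design point is that the flow is stateful and adaptive: the agent decides each turn from the conversation so far, rather than replaying a fixed script, which is what sustains the engagement behind the completion numbers above.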

This is partly because our respondents expected us to reach out. When people sign up to participate in a survey with Terac, or agree to complete a customer feedback report with a Terac agent, they can schedule the conversation at a time that suits them. They also know we are not robocallers; many are enthusiastic about receiving our calls, not least because of the payment incentive.

Large-Scale Evaluation of an LLM-Based Telephone Survey System

Lang & Eskenazi (2025) conducted one of the first large-scale field tests of an LLM-based telephone survey agent in two populations:

  • United States: 75 participants invited via web links and reached by direct AI calls.
  • Peru: 2,739 randomly sampled mobile numbers contacted by AI agent outreach.

The study found that structured-item accuracy approached human-interviewer benchmarks, with closed-ended question agreement rates above 90%. Other findings include:

  • Open-ended response quality on par with human interviewers: nearly 85% of AI-collected verbatim answers met researchers' relevance and coherence standards. When a response is misunderstood, advanced agents can adapt and rephrase questions in ways that elicit high-quality feedback.
  • No significant difference in completion times.
  • Operational efficiency gains: eliminating interviewer training, scheduling, and deployment cut total survey time by 60%.

Similar studies have concluded that AI agents can deliver consistent, scalable, and cost-effective telephone surveys without sacrificing data quality. In some cases, AI survey agents can significantly boost quality by adapting to a respondent's background and experience.
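To make the closed-ended "agreement rate" metric above concrete, here is a small Python sketch of one way it can be computed: the share of respondents who give the same fixed-choice answer under two survey modes (e.g. AI agent vs. human interviewer). The function name and data layout are illustrative assumptions, not the exact procedure from Lang & Eskenazi (2025).

```python
# Illustrative agreement-rate computation between two survey modes.
# Keys are respondent IDs, values are their closed-ended answers.

def agreement_rate(mode_a: dict, mode_b: dict) -> float:
    """Fraction of respondents present in both modes whose answers match."""
    shared = set(mode_a) & set(mode_b)
    if not shared:
        return 0.0
    matches = sum(1 for rid in shared if mode_a[rid] == mode_b[rid])
    return matches / len(shared)


ai_answers = {"r1": "yes", "r2": "no", "r3": "yes"}
human_answers = {"r1": "yes", "r2": "no", "r3": "no"}
rate = agreement_rate(ai_answers, human_answers)  # 2 of 3 shared answers match
```

On toy data like this, two matching answers out of three shared respondents gives a rate of about 0.67; the 90%+ figures reported in the study correspond to near-total agreement on structured items.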

Figure: LLM Survey System Performance Results (Lang & Eskenazi, 2025)

User Trust and Acceptance: Quantitative Insights

Trust and user experience have a significant impact on the adoption of conversational agents. In healthcare contexts, systematic reviews find:

  • Successful task completion rates of 90-97%, with recognition accuracy exceeding 90% for health-related conversational agents.
  • Improved clinical outcomes, including reduced depressive symptoms in an RCT of a chatbot intervention.
  • User satisfaction averaged 85% in patient-facing trials, indicating strong acceptance of AI interactions despite initial skepticism.

By embedding standardization protocols and iterative performance monitoring, organizations can harness AI survey agents responsibly and effectively (Laranjo et al., 2018).

How Terac's AI Survey Agents Stand Out: Our First Trial

Terac's AI agents leverage advanced techniques and brand quality to deliver superior survey experiences, as demonstrated in our inaugural "monkey" survey trial.

Brand Trust & Incentive-Driven Uptake

In consumer feedback surveys, respondents found Terac clients to be "reliable" and "professional," resulting in a 25% higher initial response rate compared to typical calls. Coupled with a transparent incentive structure tailored to each client based on consumer feedback, Terac achieved a completion rate of over 80%, far above the 18% commonly expected from phone interviews.

Stress-Tested Conversational Resilience

During the monkey-opinion "stress test," agents maintained topic coherence through humorous tangents and re-contextualization techniques, keeping 95% of participants engaged through challenging or unexpected questions. This demonstrates applicability well beyond niche topics, from market research polls to customer-service dialogues.

Key Takeaways from our First Trial:

  • Brand transparency fosters trust. Clear disclosure of the AI interviewer, together with gifted digital vouchers, encouraged respondents to proceed, translating into an 82% completion rate.
  • Iterative feedback loops create value. Rapid incorporation of survey insights, from dialogue adjustments to incentive tweaks, ensured continuous refinement. By adapting question logic, prompts, and strategies to respondent background and sensitivity, we optimize for data quality and respondent satisfaction.
| Metric | AI Voice Agent | Human Interviewer |
| --- | --- | --- |
| Call initiation | Instant call with branded ID and pre-vetted background interests; uptake increases for consumer feedback surveys, as consumers can schedule Terac calls online through web invites | Manual dialing, with potential misdials, caller-ID blocks, or spam filtering |
| Ambient noise handling | Real-time noise suppression and voice isolation (98% accuracy) | Dependent on interviewer's environmental awareness |
| Question consistency | Scripted alongside the client, but adaptable with dynamic machine-learned follow-ups | Human paraphrasing: potential for variation and misunderstanding |
| Response accuracy | 90-95% closed-ended agreement with advanced models | About 90% expected, but subject to interviewer interpretation and bias |
| Engagement rate | Upwards of 80% | Industry average of about 18% |
| Speed of deployment | Live within minutes of client signup | Weeks to onboard and train |
| Data quality | Automated timestamping, verbatim transcripts, emotion tagging, etc. | Manual note-taking; transcription and analysis errors possible |
| Scalability | Concurrently handles thousands of calls | Limited by interviewer headcount and scheduling |
| Cost efficiency | 60% higher per-call efficiency at scale | Higher labor and overhead costs |
| Adaptability | Learns new modules in real time | New scripts require retraining and scheduling |

Works Cited

Kennedy, Courtney, and Hannah Hartig. "Response Rates in Telephone Surveys Have Resumed Their Decline." Pew Research Center, 27 Feb. 2019, www.pewresearch.org/short-reads/2019/02/27/response-rates-in-telephone-surveys-have-resumed-their-decline/.

Lang, Max M., and Sol Eskenazi. "Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale." ArXiv.org, 2025, arxiv.org/abs/2502.20140.

Laranjo, Liliana, et al. "Conversational Agents in Healthcare: A Systematic Review." Journal of the American Medical Informatics Association, vol. 25, no. 9, 2018, academic.oup.com/jamia/article/25/9/1248/5052181?login=true.

McGuckin, Nancy, et al. "Hang Ups: Looking at Non-Response in Telephone Surveys." Federal Highway Administration, 2025, www.fhwa.dot.gov/ohim/hang_ups.htm.

Williams, Kate. "2025 Survey Response Rates Benchmarks: Are You below Industry Standards?" SurveySparrow, 2025, surveysparrow.com/blog/survey-response-rate-benchmarks/.