Datasets

Verified human data for frontier AI

Explore and license production-ready datasets sourced from paid, consenting experts, or brief us on a bespoke build we scope from scratch. Every record is rights-cleared and reviewed for PII before delivery, ready to drop into your training and evaluation pipeline.

Video, Sensor, Enterprise

Computer-Use Workflows

Continuous screen recordings of real practitioners working in 500+ professional desktop and web applications, with action context and end-to-end workflow boundaries.

Video, Sensor, Enterprise

General Egocentric Video

First-person recordings spanning 20,000+ unique tasks across households, factories, shops, and more, with synchronised IMU data and roughly 95% hand visibility.

Code, Enterprise

GitHub Repositories

Real repositories from verified expert engineers, with full commit history, pull-request review threads, and license and authorship provenance on every repo.

Video

Diverse Egocentric POV Video

First-person dexterous hand activity across 50+ real-world environments.

Video, Sensor, Custom

Egocentric Vision for Accessibility AI

First-person video from accessibility users with rich metadata.

Video

Egocentric Gemstone Carving & Lapidary Video

First-person POV footage of gemstone cutting, polishing, faceting, and lapidary craftsmanship.

Image, Custom

Cancer Medical Imagery

Close-up melanoma, carcinoma, and keratosis images with rich metadata.

Text, Enterprise

Expert Preference & RLHF Data

Pairwise comparisons, ratings, and rankings from verified domain experts, with written rationales, for RLHF and reward modeling.

Text, Enterprise

Expert Reasoning Traces

Step-by-step solutions to hard problems from verified domain experts, with intermediate work, for process supervision and RL.

Text, Code, Enterprise

Long-Horizon Agent Trajectories

Full traces of verified experts completing long, multi-tool tasks, with actions, tool calls, and outcomes.

Audio, Text, Enterprise

AI-Moderated Interview Transcripts

Transcribed AI-moderated voice interviews with verified participants, including audio, speaker labels, and screener metadata.

Text, Custom

Red-Team Safety Evaluations

Adversarial prompts and expert safety judgments across risk categories, with severity labels, for red-teaming and safety evals.

Audio, English, Enterprise

English Conversational Speech

Stereo multi-speaker English dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, French, Enterprise

French Conversational Speech

Stereo multi-speaker French dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, German, Enterprise

German Conversational Speech

Stereo multi-speaker German dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, Japanese, Custom

Japanese Conversational Speech

Stereo multi-speaker Japanese dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, Spanish, Custom

Spanish Conversational Speech

Stereo multi-speaker Spanish dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, Telugu, Custom

Telugu Conversational Speech

Stereo multi-speaker Telugu dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, Hindi, Custom

Hindi Conversational Speech

Stereo multi-speaker Hindi dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, Tamil, Custom

Tamil Conversational Speech

Stereo multi-speaker Tamil dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, Marathi, Custom

Marathi Conversational Speech

Stereo multi-speaker Marathi dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.

Audio, English, Enterprise

English Monologue Speech

Professional single-speaker English recordings with word-level timestamps and emotion annotations.

Audio, French, Enterprise

French Monologue Speech

Professional single-speaker French recordings with word-level timestamps and emotion annotations.

Audio, German, Enterprise

German Monologue Speech

Professional single-speaker German recordings with word-level timestamps and emotion annotations.

Audio, Japanese, Enterprise

Japanese Monologue Speech

Professional single-speaker Japanese recordings with word-level timestamps and emotion annotations.

Audio, Hindi, Custom

Hindi Monologue Speech

Professional single-speaker Hindi recordings with word-level timestamps and emotion annotations.

Audio, Tamil, Custom

Tamil Monologue Speech

Professional single-speaker Tamil recordings with word-level timestamps and emotion annotations.

Audio, Marathi, Custom

Marathi Monologue Speech

Professional single-speaker Marathi recordings with word-level timestamps and emotion annotations.

Audio, Chinese Mandarin, Enterprise

Chinese Mandarin Speech

Professional Mandarin Chinese recordings for ASR, TTS, and voice-AI training.

Audio, Telugu, Custom

Telugu Expressive TTS Voice

Natural, expressive Telugu speech recordings from native speakers across major regions.

Audio, English, Urdu, Custom

Doctor-Patient Consultation

Clinical consultation dialogues between doctors and patients in English and Urdu.

Audio, Spanish, Custom

Spanish Finance Conversation

Customer-service conversations in Spanish across finance and banking contexts.

Audio, Spanish, Custom

Spanish Customer Support Conversations

Stereo role-play customer-service dialogues in Spanish with left/right speaker separation.

Audio, Spanish, English, Custom

Spanish-English Contact Center ASR

Bilingual Spanish-English contact-center conversations with natural code-switching.

Audio, English, Custom

Nighttime Traffic Audio Narrations

Urban nighttime audio narrations with ambient noise profiling.

Audio, Enterprise

Music Library

A large-scale, professionally produced music collection across modern genres, with full construction kits and production toolkits.

Bespoke collection

Need something bespoke? Let's scope it together.

Whether you are training speech models, vision systems, or multimodal agents, our verified contributor network and QA pipeline plug directly into your roadmap. Tell us what you need and we will source, vet, and deliver it.

  • End-to-end sourcing, QA, legal, and delivery handled by our team
  • Flexible licensing: flat fee, per-unit, or recurring access
  • A dedicated team for enterprises with ongoing data needs

Security and compliance

Data is encrypted in transit and at rest, access is scoped and audited, and every delivery is reviewed for PII against a signed-consent trail.

SOC 2 Type II certified, GDPR compliant

ISO 27001 and CCPA compliance in progress. Request our security documentation for the latest status.

© 2026 All Rights Reserved by Terac