Computer-Use Workflows
Continuous screen recordings of real practitioners working in 500+ professional desktop and web applications, with action context and end-to-end workflow boundaries.
General Egocentric Video
First-person recordings spanning 20,000+ unique tasks across households, factories, shops, and more, with synchronised IMU data and roughly 95% hand visibility.
GitHub Repositories
Real repositories from verified expert engineers, with full commit history, pull-request review threads, and license and authorship provenance on every repo.
Diverse Egocentric POV Video
First-person dexterous hand activity across 50+ real-world environments.
Expert Preference & RLHF Data
Pairwise comparisons, ratings, and rankings from verified domain experts, with written rationales, for RLHF and reward modeling.
Expert Reasoning Traces
Step-by-step solutions to hard problems from verified domain experts, with intermediate work, for process supervision and RL.
Long-Horizon Agent Trajectories
Full traces of verified experts completing long, multi-tool tasks, with actions, tool calls, and outcomes.
AI-Moderated Interview Transcripts
Transcribed AI-moderated voice interviews with verified participants, including audio, speaker labels, and screener metadata.
Red-Team Safety Evaluations
Adversarial prompts and expert safety judgments across risk categories, with severity labels, for red-teaming and safety evals.
English Conversational Speech
Stereo multi-speaker English dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.
French Conversational Speech
Stereo multi-speaker French dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.
German Conversational Speech
Stereo multi-speaker German dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.
Spanish Conversational Speech
Stereo multi-speaker Spanish dialogue with left/right speaker separation, native-reviewed transcripts, and emotion annotations.
English Monologue Speech
Professional single-speaker English recordings with word-level timestamps and emotion annotations.
French Monologue Speech
Professional single-speaker French recordings with word-level timestamps and emotion annotations.
German Monologue Speech
Professional single-speaker German recordings with word-level timestamps and emotion annotations.
Bespoke collection
Need something bespoke? Let's scope it together.
Whether you are training speech models, vision systems, or multimodal agents, our verified contributor network and QA pipeline plug directly into your roadmap. Tell us what you need and we will source, vet, and deliver it.
- End-to-end sourcing, QA, legal, and delivery handled by our team
- Flexible licensing: flat fee, per-unit, or recurring access
- A dedicated team for enterprises with ongoing data needs
Security and compliance
Data is encrypted in transit and at rest, access is scoped and audited, and every delivery is reviewed for PII against a signed-consent trail.
ISO 27001 and CCPA compliance are in progress. Request our security documentation for the latest status.




