German (Germany) remote speech dataset
This high-resolution mono audio dataset (48 kHz) features more than 4,500 scripted prompts recorded by 100 native German speakers from diverse regions across Germany. It is well suited to training automatic speech recognition (ASR) models and voice AI applications for multilingual and regional markets.

Specifications
- Modalities: Audio
- Language: German (Germany) [de-DE]
- Licensable: Yes
- Total prompts: 4,550
- Total audio length: 7 h 46 min
- Average recording length: 6.15 s
- Participants: 100
- Group: Adults
- Task category: Scripted prompts
- Data type: Remote speech
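As a quick sanity check on the figures above, the total audio length can be estimated from the prompt count and the average recording length. The sketch below assumes one recording per prompt, which the listing does not state explicitly.

```python
# Figures copied from the specifications above.
total_prompts = 4550
avg_recording_sec = 6.15

# Estimated total duration, assuming one recording per prompt (an assumption).
estimated_sec = total_prompts * avg_recording_sec
hours, remainder = divmod(int(estimated_sec), 3600)
minutes = remainder // 60

print(f"Estimated total: {hours}:{minutes:02d} h")  # prints "Estimated total: 7:46 h"
```

Under that assumption, the estimate agrees with the listed total audio length.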
Accelerate model development & training processes
Diverse regional accent coverage
Leverage LLM-style commands and natural-language prompts designed to simulate real-world voice AI interactions, recorded by native speakers from across Germany to capture authentic German speech.
Formatted for multilingual model development
Part of a broader multilingual dataset collection with standardized scripts, audio specifications, and quality controls, which simplifies integration and benchmarking.
High-quality audio optimized for easy integration
Delivered as 48 kHz mono audio with standardized silence padding and consistent remote recording conditions, so the data is clean and ready to use.
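Before feeding recordings into a training pipeline, it can be useful to confirm that each file matches the stated audio specification. The following is a minimal sketch using Python's standard `wave` module. It assumes the recordings are delivered as PCM WAV files, which the listing does not confirm; the helper names `check_wav_spec` and `write_silent_wav` and the 16-bit sample width are hypothetical choices made for illustration.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 48_000  # from the dataset description
EXPECTED_CHANNELS = 1      # mono, from the dataset description


def check_wav_spec(path):
    """Return (matches_spec, duration_sec) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration_sec = wav.getnframes() / rate
    matches = rate == EXPECTED_RATE_HZ and channels == EXPECTED_CHANNELS
    return matches, duration_sec


def write_silent_wav(path, seconds, rate=EXPECTED_RATE_HZ, channels=EXPECTED_CHANNELS):
    """Write a silent file, used here only as a stand-in for a real recording."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples (an assumption; bit depth is not stated)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * int(rate * seconds))


# Demonstration on a synthetic file rather than real dataset audio.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.wav")
    write_silent_wav(sample, seconds=2)
    ok, duration = check_wav_spec(sample)
    print(ok, duration)  # prints "True 2.0"
```

In practice, `check_wav_spec` would be run over every delivered file, and any file that fails the check would be flagged for resampling or review before training.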

Explore our success stories
Evaluating a conversational AI model with a highly complex multimodal STEM dataset
4,485 physics prompt-response pairs
9,606 math prompt-response pairs
Improving large language model logic and reasoning with a specialized fine-tuning dataset
50K STEM-based prompt-response pairs created
300 highly skilled contributors
Access the German (Germany) remote speech dataset
Connect with our experts for pricing and samples.
