Arabic (Saudi) in-studio speech dataset
Featuring 1,392 prompts, this dataset supports wake word detection and command phrase recognition. Recorded in a studio environment by voice actors and native speakers of Arabic (Saudi dialect), it delivers mono-channel audio at 44.1 kHz and 24-bit depth with clear, consistent quality.

Specifications
- Modalities: Audio
- Language: Arabic (Saudi) [ar-SA]
- Total prompts: 1,392
- Total audio length: 1 h 04 min
- Average recording length: 2.76 sec
- Participants: 29
- Group: Adults
- Task category: Scripted prompts
- Data type: In-studio speech
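As a quick sanity check on the figures above, the sketch below multiplies the prompt count by the average recording length to confirm the listed total duration, then estimates the raw audio size from the stated sample rate, bit depth, and channel count. It assumes one recording per prompt and uncompressed PCM storage; neither is stated on this page, and the function names are illustrative only.

```python
# Figures taken from the specifications and description on this page.
TOTAL_PROMPTS = 1392
AVG_RECORDING_SEC = 2.76   # rounded average, so results are approximate
SAMPLE_RATE_HZ = 44_100
BIT_DEPTH = 24
CHANNELS = 1               # mono


def total_duration_sec(prompts: int, avg_sec: float) -> float:
    """Total audio length, assuming one recording per prompt (an assumption)."""
    return prompts * avg_sec


def hours_minutes(seconds: float) -> tuple[int, int]:
    """Split a duration into whole hours and whole minutes (seconds dropped)."""
    total_minutes = int(seconds // 60)
    return divmod(total_minutes, 60)


def raw_pcm_bytes(seconds: float, sample_rate: int, bit_depth: int, channels: int) -> int:
    """Size of uncompressed PCM audio, excluding any container headers."""
    return round(seconds * sample_rate * (bit_depth // 8) * channels)


duration = total_duration_sec(TOTAL_PROMPTS, AVG_RECORDING_SEC)
hours, minutes = hours_minutes(duration)
size_bytes = raw_pcm_bytes(duration, SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)

print(f"Total duration: {hours} h {minutes:02d} min")
print(f"Estimated raw PCM size: {size_bytes / 1e6:.0f} MB")
```

The computed duration comes to 1 hour 4 minutes, which agrees with the listed total audio length. The size estimate is only a rough planning figure, since the actual file format and any compression are not specified here.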
Accelerate model development & training processes
Still searching for the right dataset? We can help.
Reach out and we’ll guide you to the right solution.


Explore our success stories
Evaluating a conversational AI model with a highly complex multimodal STEM dataset
Discover how our off-the-shelf science, technology, engineering and mathematics (STEM) dataset contributed to enhancing scientific reasoning and visual processing capabilities in a chatbot model crafted by a leading-edge tech and AI company.
- 4,485 physics prompt-response pairs

Improving large language model logic and reasoning with a specialized fine-tuning dataset
Explore how TELUS Digital created an off-the-shelf dataset to advance the capabilities of large language models (LLMs).
- 50K STEM-based prompt-response pairs created

Access the Arabic (Saudi) in-studio speech dataset
Connect with our experts for pricing and samples.
Request the dataset


