Visual question answering dataset

Updated May 7, 2025

This dataset of more than 1,000 multi-domain visual question-answer pairs is designed to improve the multimodal capabilities of AI models. It includes image-based questions, detailed answers and expert explanations across science, business, health and medicine.

Specifications

Modalities: Image, text
Language: English
Licensable: Yes
Volume: 1,000+
Average tokens per prompt-response pair (PRP): 286
Number of tokens: 286,000
Task category: Visual question answering
Domain: Science, business, health and medicine
Source: Expert-generated
Complexity: Three levels, ranging from moderate to very hard
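
The token figures above are consistent with one another: roughly 1,000 pairs at an average of 286 tokens each gives about 286,000 tokens. A minimal sketch of that arithmetic follows. The function and constant names are illustrative only, and 1,000 is treated as a lower bound because the volume is listed as "1000+".

```python
# Figures taken from the specifications above; all names are illustrative.
PAIR_COUNT_LOWER_BOUND = 1_000  # volume is listed as "1000+", so this is a minimum
AVG_TOKENS_PER_PAIR = 286       # average tokens per prompt-response pair


def estimated_total_tokens(pair_count: int, avg_tokens: int) -> int:
    """Estimate total tokens as pair count multiplied by average tokens per pair."""
    return pair_count * avg_tokens


total = estimated_total_tokens(PAIR_COUNT_LOWER_BOUND, AVG_TOKENS_PER_PAIR)
print(f"{total:,}")  # 286,000
```

The same function can be used to budget training or evaluation runs for a different number of pairs, assuming the average token count holds for the subset.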

Accelerate model development & training processes

  • Quality-focused curation

    The dataset is curated to provide high-quality visual question-answer pairs that can improve multimodal AI training and benchmarking.

  • Comprehensive topic coverage

    The visual question-answer set spans diverse subject matter, with a strong focus on science, business, health and medicine, to add depth and complexity to multimodal understanding.

  • Created by qualified experts

    Our datasets are created by verified subject matter experts, including individuals with master's degrees and PhDs, and pass through a stringent quality process to ensure accuracy and domain-relevant explanations.

Still searching for the right dataset? We can help.

Reach out and we’ll guide you to the right solution.

Case Studies

Explore our success stories

  • Evaluating a conversational AI model with a highly complex multimodal STEM dataset

    Discover how our off-the-shelf science, technology, engineering and mathematics (STEM) dataset helped improve the scientific reasoning and visual processing capabilities of a chatbot model built by a leading technology and AI company.

    • 4,485 physics prompt-response pairs
    • 9,606 math prompt-response pairs

  • Improving large language model logic and reasoning with a specialized fine-tuning dataset

    Explore how TELUS Digital created an off-the-shelf dataset to advance the capabilities of large language models (LLMs).

    • 50K STEM-based prompt-response pairs created
    • 300 highly skilled contributors


Access the visual question answering dataset

Connect with our experts for pricing and samples.