Why AI's path to AGI runs through math reasoning

Key takeaways

  • Mathematical reasoning is essential for artificial general intelligence (AGI) because it moves models toward discrete compositional reasoning, the ability to generalize abstract concepts and apply logic across diverse domains.
  • While current AI can solve high-level competition problems, it often lacks a deep understanding of mathematical axioms. Progress depends on shifting to rules-based learning that mimics human incremental problem-solving.
  • The development of reliable chain-of-thought (CoT) architectures needs high-fidelity data that prioritizes the process over the final answer, rewarding sound logic and step-by-step verification.
  • To ensure AI is truly reasoning, training requires proprietary, unpublished datasets crafted by experts. These novel problems prevent data contamination and force models to demonstrate genuine mathematical understanding.

One area where human intelligence still distinctly excels over neural models is discrete compositional reasoning: the ability to think algebraically and generalize across abstract concepts. This capability is fundamentally different from pattern-based tasks like language translation and represents the kind of structured, logical thinking that is foundational for artificial general intelligence.

Cognitive neuroscience research consistently demonstrates that learning to solve mathematical problems enhances general reasoning abilities in humans, promoting logical thinking, abstract reasoning and transferable problem-solving strategies. Incorporating mathematical reasoning data into AI training could help large language models (LLMs) develop more complex and versatile reasoning abilities, particularly since mathematical problem-solving is one of the few domains where large volumes of long and intricate CoT data can be generated or synthesized.
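To make that concrete, a single synthesized CoT record can pair a problem with an explicit derivation and an exactly verifiable answer. The sketch below uses illustrative field names of our own choosing, not a standard or TELUS Digital schema.

```python
# A minimal sketch of what one synthesized chain-of-thought (CoT) training
# record for math reasoning could look like. Field names are illustrative.
cot_record = {
    "problem": "A train travels 180 km in 2.5 hours. At the same speed, "
               "how far does it travel in 4 hours?",
    "reasoning_steps": [
        "Compute the speed: 180 km / 2.5 h = 72 km/h.",
        "Apply the speed to the new duration: 72 km/h * 4 h = 288 km.",
        "Sanity-check: 4 h is 1.6x the original time, and 288 km is 1.6x 180 km.",
    ],
    "final_answer": "288 km",
    "verifiable": True,  # the answer can be checked exactly, step by step
}

# Long, intricate CoT data of this shape can be generated programmatically by
# varying the numbers and re-deriving each step, then verifying in code:
assert 180 / 2.5 * 4 == 288.0
```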

The challenge of AI mathematical reasoning

AI's ability to reason mathematically still depends heavily on the breadth and quality of its training data. Current LLMs already handle many math problems at the high school and even college level, where knowledge is relatively structured and the types of problems are predictable, and they can solve some complex problems, even at math Olympiad level, by generalizing from patterns.

Recent breakthroughs demonstrate the rapid progress in this domain. AlphaProof, a reinforcement learning-based system for formal math reasoning, reached silver-medal standard at the 2024 International Mathematical Olympiad (IMO) alongside AlphaGeometry 2, and in 2025 Google DeepMind followed up with an advanced version of Gemini Deep Think that reached gold-medal standard at the IMO. AlphaProof also substantially improved state-of-the-art (SOTA) results on historical mathematics competition problems, showing that AI systems can compete at the highest levels of mathematical problem-solving.
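For readers unfamiliar with formal math reasoning: systems like AlphaProof construct proofs in the Lean proof assistant, where every inference must type-check against the axioms and library lemmas. The toy Lean 4 proofs below are our own illustration of that artifact format, not AlphaProof output.

```lean
-- Toy Lean 4 proofs illustrating the kind of machine-checkable artifact a
-- formal reasoning system must produce (our illustration, not AlphaProof output).

-- A general statement closed by citing an existing library lemma.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete claim the kernel verifies by direct computation.
theorem power_check : 2 ^ 10 = 1024 := by decide
```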

Despite these impressive achievements, LLMs work by recognizing and replicating patterns, not by understanding underlying mathematical laws and axioms. Humans, by contrast, do not come to understand and solve mathematical problems primarily through accumulated experience and evidence, but by inferring, learning and exploiting laws, axioms and symbol-manipulation rules.

Understanding mathematical reasoning in LLMs

Early approaches in AI focused on building machines that solve a problem "at once," generating a complete solution in a single step. But this is not how people tackle these challenges. We use intuition, break complex problems into component parts and look for ways to make incremental progress. Consider the difference between "brute-force" and "genius" learning: with enough practice, most people can solve difficult problems, while geniuses grasp deep patterns quickly. Most high performers combine both: extensive exposure and rapid internalization. Similarly, LLMs need far more training data than humans to achieve comparable results on a single task.

The most effective approach to training mathematical reasoning mirrors how humans learn: through diverse problem-answer pairs across varied complexities and sub-domains. Rather than memorizing solutions, models need exposure to different variations of core concepts, similar to how a teacher designs new problem variations to teach students fundamental principles. This increases both the volume and diversity of training data, enhancing the model's generalization and adaptability.
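A minimal sketch of that variation-based approach, with function names and parameter ranges that are purely illustrative: a single templated concept (solving a linear equation) is instantiated with fresh parameters, so every generated item exercises the same principle without repeating a memorized solution.

```python
import random

def linear_equation_variant(rng: random.Random) -> dict:
    """One variation of a core concept: solve a*x + b = c for x.

    The template fixes the underlying principle; freshly sampled parameters
    make each instance new. (A hypothetical sketch, not a production pipeline.)
    """
    a = rng.randint(2, 12)
    x = rng.randint(-9, 9)      # hidden solution
    b = rng.randint(0, 20)      # kept nonnegative to simplify the rendered text
    c = a * x + b               # guarantees an exact integer answer
    return {
        "problem": f"Solve for x: {a}x + {b} = {c}",
        "steps": [
            f"Subtract {b} from both sides: {a}x = {c - b}",
            f"Divide both sides by {a}: x = {(c - b) // a}",
        ],
        "answer": x,
    }

rng = random.Random(0)
for item in (linear_equation_variant(rng) for _ in range(3)):
    print(item["problem"], "->", item["answer"])
```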

Designing effective math reasoning datasets

To understand and measure progress in artificial intelligence, we need carefully designed benchmarks that can assess how well AI systems engage in complex scientific reasoning. The following are some of the essential characteristics of math reasoning datasets that can challenge SOTA models:

Novel, unpublished problems that models haven't encountered during pre-training

Public benchmarks like GSM8K and MATH serve important roles in measuring progress, but they face data contamination risks since models may have been exposed to these problems. Proprietary datasets with guaranteed novel problems enable more accurate assessment of genuine reasoning capabilities.
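One simple, widely used guard against contamination, sketched here under our own assumptions rather than as any benchmark's actual process, is to screen candidate problems for word-level n-gram overlap against public corpora before accepting them.

```python
def ngrams(text: str, n: int = 8) -> set:
    """Word-level n-grams used as a cheap fingerprint of a problem statement."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_contaminated(candidate: str, public_corpus: list[str],
                       n: int = 8, threshold: float = 0.2) -> bool:
    """Flag a candidate problem whose n-grams overlap heavily with public text.

    A heuristic screen only: passing it does not prove novelty, but failing it
    is a strong signal the problem (or a close paraphrase) is already public.
    """
    cand = ngrams(candidate, n)
    if not cand:
        return False
    for doc in public_corpus:
        overlap = len(cand & ngrams(doc, n)) / len(cand)
        if overlap >= threshold:
            return True
    return False
```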

Expert-level problem creation and validation

The problems should span diverse mathematical domains, from computationally intensive challenges in number theory and real analysis to abstract questions in algebraic geometry and category theory. Each problem should demand creative insight, connecting disparate concepts and sophisticated reasoning rather than routine textbook exercises.

Problems must be "guess-proof" with definite, verifiable answers

Random attempts or trivial brute-force approaches should have a negligible chance of success. This ensures models must engage in genuine reasoning rather than gaming the evaluation system.
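As a rough illustration of what "guess-proof" means in practice (the numbers and helper functions below are hypothetical): if a problem's answer is an exact value drawn from a large space, blind guessing almost never succeeds, so a correct submission carries real evidence of reasoning.

```python
import random

def verify(submitted: int, expected: int) -> bool:
    """A guess-proof problem admits an exact, machine-checkable answer."""
    return submitted == expected

def random_guess_success_rate(expected: int, answer_range: range,
                              trials: int = 100_000) -> float:
    """Estimate how often blind guessing over a plausible answer range succeeds."""
    hits = sum(verify(random.choice(answer_range), expected) for _ in range(trials))
    return hits / trials

# Example: a problem whose answer is a specific 6-digit integer. The true
# guess-success probability is 1 in 900,000, so this estimate almost
# certainly prints 0.0 even after 100,000 random attempts.
print(random_guess_success_rate(expected=142_857,
                                answer_range=range(100_000, 1_000_000)))
```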

The dataset sourcing challenge

Creating high-quality mathematical reasoning datasets presents unique challenges that call for specialized expertise and methodology. The process involves designing entirely new problems that test genuine understanding rather than recall.

At TELUS Digital, we’ve found that the most effective approach to developing training datasets is to have mathematics experts, including master's graduates, Ph.D. holders and industry professionals, craft each question, answer and explanation from scratch. This expert-in-the-loop validation ensures that every problem undergoes peer review to verify correctness, check for ambiguities and assign an appropriate difficulty rating.

The sourcing methodology should prioritize process over final answers. Rather than scoring only whether a model reaches the correct solution, effective datasets enable evaluation of reasoning chains step by step, rewarding sound logic even if minor arithmetic errors occur. Each solution should demonstrate the complete thought process from first principles: problem decomposition, strategy selection, intermediate steps and verification.
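A minimal sketch of such process-level scoring, with a rubric and weights that are our own assumptions rather than an established standard: each step is judged separately for logical soundness and arithmetic accuracy, so a chain with sound logic but one arithmetic slip still earns most of the credit.

```python
from dataclasses import dataclass

@dataclass
class StepJudgement:
    step: str
    logically_sound: bool      # does the step follow from what came before?
    arithmetic_correct: bool   # are the numbers in the step right?

def process_score(steps: list[StepJudgement], final_answer_correct: bool) -> float:
    """Reward the reasoning process, not just the final answer.

    Logic errors are penalized more heavily than arithmetic slips, and the
    final answer contributes only part of the total. Weights are illustrative.
    """
    if not steps:
        return 1.0 if final_answer_correct else 0.0
    per_step = sum(
        (0.7 if s.logically_sound else 0.0) + (0.3 if s.arithmetic_correct else 0.0)
        for s in steps
    ) / len(steps)
    return 0.7 * per_step + 0.3 * (1.0 if final_answer_correct else 0.0)

# Example: sound decomposition and strategy, one arithmetic slip in step 2.
chain = [
    StepJudgement("Let the speed be d/t = 180/2.5 km/h.", True, True),
    StepJudgement("So the speed is 75 km/h.", True, False),  # slip: should be 72
    StepJudgement("Distance in 4 h is speed * 4.", True, True),
]
print(process_score(chain, final_answer_correct=False))  # 0.63, not 0
```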

Another critical consideration is dynamic evolution. Like standardized exams for human learners, AI benchmarks should evolve over time, retiring problems once models master them and introducing fresh challenges. Static datasets quickly become obsolete as models improve and potentially memorize solutions that leak into training data.
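A sketch of that retirement policy, with an illustrative threshold and data layout: once tracked models solve a problem reliably, it rotates out of the evaluation pool and a freshly authored problem takes its place.

```python
def refresh_benchmark(problems: list[dict], fresh_pool: list[dict],
                      retire_at: float = 0.9) -> list[dict]:
    """Retire problems that models now solve reliably and backfill with new ones.

    Each problem dict carries a 'solve_rate' measured across tracked models.
    The 0.9 retirement threshold is an illustrative assumption.
    """
    keep = [p for p in problems if p["solve_rate"] < retire_at]
    retired = len(problems) - len(keep)
    return keep + fresh_pool[:retired]

benchmark = [
    {"id": "nt-017", "solve_rate": 0.95},  # mastered by current models: retire
    {"id": "ag-204", "solve_rate": 0.40},  # still discriminative: keep
]
fresh = [{"id": "ct-001", "solve_rate": 0.0}]
print([p["id"] for p in refresh_benchmark(benchmark, fresh)])  # ['ag-204', 'ct-001']
```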

Building datasets that advance the field

The ultimate goal of mathematical reasoning datasets extends beyond improving benchmark scores. These resources should push models toward genuine mathematical understanding: the ability to independently apply fundamental principles to problems they have never encountered.

This requires datasets with sufficient diversity across problem types and mathematical domains. A model that performs well across a broad range of challenges, including problems requiring generalization to new contexts, provides stronger evidence of algebraic reasoning capabilities than one that excels only on narrow problem categories.

TELUS Digital's off-the-shelf math reasoning datasets represent the frontier of AI training resources for mathematical reasoning. Developed by expert mathematicians and validated against SOTA models, our datasets provide the high-quality, diverse and challenging data needed to push LLMs toward genuine mathematical understanding.

Contact our experts today to learn more about our off-the-shelf datasets and how they can accelerate your AI development journey.
