Report Overview
"A Blueprint for Building National Compute Capacity for Artificial Intelligence" is a comprehensive policy report published by the OECD in February 2023. The report provides the first blueprint for policy makers to help assess and plan for the national AI compute capacity needed to enable productivity gains and capture AI's full economic potential.
Key Insight: Artificial intelligence (AI) is transforming economies and promising new opportunities for productivity, growth, and resilience. However, no country today has sufficient data on, or a targeted plan for, national AI compute capacity. This policy blind-spot may jeopardize domestic economic goals.
Key Insights Summary
AI Compute Divides Are Emerging
An imbalance of AI compute resources risks reinforcing socioeconomic divides, creating further differences in competitive advantage and productivity gains. Private sector initiatives increasingly benefit from state-of-the-art AI compute resources to a greater extent than public research institutes and academia.
Exponential Growth in Compute Demands
The computational capabilities required to train modern machine learning systems have multiplied hundreds of thousands of times over since 2012, despite algorithmic and software improvements that reduce computing power needs.
Specialized Hardware Requirements
ML systems are predominantly trained on specialized processors, such as Graphics Processing Units (GPUs), Tensor Processing Units (TPUs), and Neural Processing Units (NPUs), whose hardware is optimized for certain types of operations.
Industry Dominance in AI Model Training
Industry, rather than academia, is increasingly providing and using the compute capacity and specialized labor required for state-of-the-art ML research and training large AI models.
Complex Supply Chain Challenges
Securing specialized infrastructure and hardware purpose-built for AI can be challenging due to complex supply chains, as illustrated by bottlenecks in the semiconductor industry.
Measurement Gaps Persist
Standardized measures of national AI compute capacity remain a policy gap. Such measures would give OECD and partner economies a greater understanding of AI compute and its relationship to the diffusion of AI.
Content Overview
Abstract
Artificial intelligence (AI) is transforming economies and promising new opportunities for productivity, growth, and resilience. Countries are responding with national AI strategies to capitalize on these transformations. However, no country today has sufficient data on, or a targeted plan for, national AI compute capacity. This policy blind-spot may jeopardize domestic economic goals.
This report provides the first blueprint for policy makers to help assess and plan for the national AI compute capacity needed to enable productivity gains and capture AI's full economic potential. It provides guidance for policy makers on how to develop a national AI compute plan along three dimensions: capacity (availability and use), effectiveness (people, policy, innovation, access), and resilience (security, sovereignty, sustainability).
The report also defines AI compute, takes stock of indicators, datasets, and proxies for measuring national AI compute capacity, and identifies obstacles to measuring and benchmarking national AI compute capacity across countries.
Executive Summary
Many countries have developed national AI strategies without fully assessing whether they have sufficient domestic AI compute infrastructure and software to realize their goals. Other AI enablers, like data, algorithms, and skills, receive significant attention in policy circles, but the hardware, software, and related infrastructure that make AI advances possible have received comparatively less attention.
Demand for AI compute has grown dramatically for machine learning systems, especially deep learning and neural networks. According to research, the computational capabilities required to train modern machine learning systems, measured in number of mathematical operations (i.e., floating-point operations per second, or FLOPS), have multiplied hundreds of thousands of times over since 2012, despite algorithmic and software improvements that reduce computing power needs.
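To give a sense of scale, the growth described above can be converted into an implied doubling time. The sketch below is illustrative only: the growth factors and the ten-year window are assumed inputs chosen to match the phrase "hundreds of thousands of times since 2012", not figures taken from the report, and the function name is hypothetical.

```python
import math


def implied_doubling_time_months(growth_factor: float, years: float) -> float:
    """Months per doubling, assuming constant exponential growth.

    growth_factor: total multiplication of training compute over the period.
    years: length of the period in years.
    """
    if growth_factor <= 1 or years <= 0:
        raise ValueError("growth_factor must exceed 1 and years must be positive")
    doublings = math.log2(growth_factor)  # number of times compute doubled
    return years * 12 / doublings


# Assumed inputs for illustration; the report gives only an order of magnitude.
ASSUMED_YEARS = 10
for factor in (100_000, 300_000, 900_000):
    months = implied_doubling_time_months(factor, ASSUMED_YEARS)
    print(f"{factor:>7,}x over {ASSUMED_YEARS} years: doubling about every {months:.1f} months")
```

Under these assumptions, any factor in the hundreds of thousands implies a doubling time of well under a year, which is the point of the passage: demand has outpaced the efficiency gains from better algorithms and software.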
As governments invest in developing cutting-edge AI, compute divides can emerge or deepen. An imbalance of such compute resources risks reinforcing socioeconomic divides, creating further differences in competitive advantage and productivity gains. Over the past decade, private-sector-led initiatives within countries have increasingly benefitted from state-of-the-art AI compute resources, particularly from commercial cloud service providers, to a greater extent than public research institutes and academia.
This report offers a blueprint for policy makers to develop national AI compute plans aligned with national AI strategies and domestic needs. It takes stock of existing and proposed indicators, datasets, and proxies for measuring national AI compute capacity. Policy makers can assess technology needs and develop national AI compute plans by considering compute's capacity (availability and use), effectiveness (people, policy, innovation, access), and resilience (security, sovereignty, sustainability).
Introduction
Artificial intelligence (AI) is transforming economies and societies, bringing opportunities for increased economic productivity, inclusive growth, and breakthroughs in addressing global challenges. Understanding countries' capacity and readiness to embrace this fast-evolving transition is essential, including the availability of relevant infrastructure enabling computation for AI at scale.
The creation and use of AI relies on key elements, such as a skilled workforce, enabling public policies, regulations and legal frameworks, access to data, and sufficient computing resources – commonly referred to as "compute". For machine learning (ML) based AI systems, there are two key steps involved in their development and use that are enabled by compute: (1) training, meaning the creation or selection of models/algorithms and their calibration, and (2) inferencing, meaning using the AI system to determine an output.
While other key enablers have received significant attention in policy circles, the hardware, software, and related compute infrastructure that make AI advances possible receive comparatively less attention. Ensuring countries have sufficient AI compute to meet their needs is critical to capturing AI's full economic potential.
Evolving Trends in Compute
The report analyzes trends in supercomputer performance and AI compute requirements. Analysis of the Top500 list shows that few economies have supercomputers ranking as top computing systems, with emerging economies sparsely represented. As of November 2022, the United States had the highest share of total compute performance on the list (44%), followed by Japan (13%) and China (11%).
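A share of total compute performance like the one quoted above can be computed by summing each economy's system performance and dividing by the list total. The following is a minimal sketch, assuming each entry is an (economy, performance) pair in a common unit, for example the Rmax value that the Top500 list reports for each system. The function name and the sample data are hypothetical and are not Top500 figures.

```python
from collections import defaultdict


def performance_share_by_economy(systems):
    """Return each economy's fraction of total listed performance.

    systems: iterable of (economy, performance) pairs in one common unit.
    """
    totals = defaultdict(float)
    for economy, performance in systems:
        totals[economy] += performance
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total performance must be positive")
    return {economy: total / grand_total for economy, total in totals.items()}


# Made-up entries for illustration only (not real Top500 data).
sample_systems = [
    ("Economy A", 40.0),
    ("Economy A", 10.0),
    ("Economy B", 30.0),
    ("Economy C", 20.0),
]
shares = performance_share_by_economy(sample_systems)
for economy, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{economy}: {share:.0%}")
```

A share computed this way reflects only systems that were submitted to the list, so it is a proxy for national capacity rather than a full measure of it.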
State-of-the-art AI systems increasingly depend on high-performance compute. Researchers estimate that the computational capabilities required to train modern ML systems have grown hundreds of thousands of times over since 2012. This growth is likely driven by the increasing capabilities of large, compute-intensive AI systems.
AI compute is not well understood beyond specialized technical and policy communities. Awareness of the importance of national policies for AI compute is growing, but the topic's technical nature limits wider understanding.
Securing specialized hardware for AI involves complex supply chains, as illustrated by bottlenecks in the semiconductor industry. The prominence of deep learning dramatically increased the size of machine learning systems and their compute demands.
A compute divide can emerge and worsen between the public and private sectors because, increasingly, public sector entities do not have the resources to train cutting-edge AI models. Industry, rather than academia, is increasingly providing and using the compute capacity and specialized labor required for state-of-the-art ML research and training large AI models.
Measuring AI Compute
The report defines AI compute as "one or more stacks of hardware and software used to support specialized AI workloads and applications in an efficient manner" with requirements varying significantly according to the user's needs.
AI compute covers a range of different technologies, from chips to data servers to cloud computing. AI compute enables AI systems' training and inferencing. AI compute can be located and accessed in several ways: centrally in data centers, centrally in the cloud, or at the edge on decentralized devices.
Measuring AI compute capacity and needs is particularly challenging. At present, very few tools and indicators exist to measure AI compute. Literature on AI compute typically focuses on the performance measurement of compute systems, such as application performance benchmarks like MLPerf or throughput benchmarks like the Top500 list.
What qualifies as "domestic" AI compute may vary by country; for example, it may mean compute that is subject to domestic laws and regulations and physically located within a national jurisdiction. Another measurement challenge is that compute can be general-purpose, meaning that the same infrastructure can be used for both AI and non-AI workloads.
Blueprint for Developing a National AI Compute Plan
The report provides a comprehensive blueprint for policy makers to develop national AI compute plans. A national AI compute plan should align with existing national AI strategies and center on three fundamental questions:
- How much AI compute does the country have?
- How much AI compute does the country need? Is current domestic AI compute capacity sufficient to support national AI strategy objectives?
- How does it compare to other countries?
To answer these questions, policy makers can consider three overarching categories as part of a national AI compute plan:
- Capacity (availability and use): Measuring current AI compute capacity and needs, estimating future capacity, and ongoing monitoring.
- Effectiveness (people, policy, innovation, access): Ensuring effective use through skilled labor, enabling policies, R&D innovation, and accessible compute resources.
- Resilience (security, sovereignty, sustainability): Building secure, sovereign, and sustainable AI compute infrastructure with robust supply chains.
The report provides detailed frameworks and considerations for each of these components, including specific questions policy makers should address when developing their national AI compute plans.
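The three fundamental questions above (how much AI compute a country has, how much it needs, and how it compares with others) can be pictured as a simple gap calculation. The sketch below is a hypothetical illustration, not a method from the report: the class, its fields, the unit of measurement, and all figures are assumptions, and the report itself stresses that standardized measures do not yet exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeAssessment:
    """Hypothetical record of one economy's AI compute position."""
    economy: str
    available: float  # current capacity, in any single agreed unit
    needed: float     # capacity implied by national AI strategy goals

    @property
    def gap(self) -> float:
        """Shortfall (positive) or surplus (negative) against stated needs."""
        return self.needed - self.available

    @property
    def coverage(self) -> float:
        """Fraction of stated needs met by current capacity."""
        if self.needed <= 0:
            raise ValueError("needed must be positive")
        return self.available / self.needed


def rank_by_coverage(assessments):
    """Order economies from best to worst coverage of their own needs."""
    return sorted(assessments, key=lambda a: a.coverage, reverse=True)


# Made-up figures for illustration only.
sample = [
    ComputeAssessment("Economy A", available=80.0, needed=100.0),
    ComputeAssessment("Economy B", available=30.0, needed=120.0),
    ComputeAssessment("Economy C", available=50.0, needed=40.0),
]
for a in rank_by_coverage(sample):
    print(f"{a.economy}: gap={a.gap:+.1f}, coverage={a.coverage:.0%}")
```

Ranking by coverage rather than raw capacity compares each economy against its own stated needs, which matches the report's emphasis on aligning compute plans with national AI strategies. It covers only the capacity dimension; effectiveness and resilience would need separate, largely qualitative assessment.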
AI Compute in National Policy Initiatives
The report documents various national and regional policy initiatives related to AI compute, including:
- High-performance computing initiatives in countries like Canada, Chile, Colombia, France, Germany, Japan, Korea, Slovenia, Spain, the United Kingdom, and the United States
- Cloud-based services such as the EU's GAIA-X initiative and European Data Infrastructure
- Supply chain initiatives including semiconductor supply chain security efforts in Korea, Spain, the United States, and the European Union
These initiatives demonstrate different approaches countries are taking to provide the digital infrastructure and access required for the development and use of artificial intelligence.
Gap Analysis and Preliminary Findings
The report identifies several critical gaps in existing measurement tools and policy approaches:
- AI policy initiatives need to take AI compute capacity into account: National AI policy initiatives do not include detailed measures of AI compute capacity and corresponding national needs.
- National and regional data collection and measurement standards need to expand: Data collection should be expanded to measure current national AI compute capacity and needs.
- Policy makers need insights into the compute demands of AI systems: Further insights are needed into the compute demands for both the training and inferencing stages of an AI system's lifecycle.
- AI-specific measurements should be differentiated from general-purpose compute: Identifying the differences between AI compute and general-purpose compute is challenging but necessary.
- Workers need access to AI compute related skills and training: AI compute hardware alone is not sufficient; users need specific skills to efficiently utilize HPC clusters.
- AI compute supply chains and inputs need to be mapped and analyzed: As countries scale up AI compute capacity, demand for various inputs along AI compute supply chains could increase.
Conclusion
AI is a general-purpose technology impacting nearly every facet of the global economy, prompting governments to formulate and publish national AI strategies. The successful implementation of national AI strategies could become one of the factors defining a country's ability to deliver innovation, productivity gains, and long-term growth.
However, many countries have developed AI plans without a full assessment of whether they have sufficient domestic AI compute capacity to realize these goals. Concerns are growing about reinforcing divides between those who have the resources to create and use complex AI models to generate competitive advantage and productivity gains, and those who do not.
Without data on national compute capacity and the needs of AI ecosystems, decision makers might not be able to effectively implement and leverage strategic national AI investments and plans for economic growth and competitiveness.
A better understanding of AI compute and its relationship to the diffusion of AI across OECD and partner economies can improve the implementation of national AI strategies and guide future policymaking and investments. Countries should consider systematically taking stock of existing national compute capacity and reviewing the current and emerging needs of their AI ecosystem.
Note: The above is only a summary of the report content. The complete document contains extensive data, charts, and detailed analysis. We recommend downloading the full PDF for in-depth reading.