Methodology details
The problem

Industrial strategy design is only as good as the empirics and analysis that underpin it. Yet economic analysis methodologies tend to ‘look where the light is’ rather than directing the light toward the concrete, granular data required to build meaningful, actionable policy insights. There are pragmatic reasons for this: data providers, such as governments and multilateral organizations, generally publish low-resolution, aggregated data structured in standardized datasets that cater to a maximally generic audience (for example, at its highest resolution, trade data is reported in 6-digit HS codes, each of which can contain hundreds, if not thousands, of technologically distinct products). Consequently, datasets are almost never fit for specific analytical purposes, and analysts are forced to compromise because of the substantial effort (millions of human labor hours) that would be required to create bespoke datasets from scratch. These limitations hinder the accuracy and precision of academic research and often become fatal bottlenecks in industrial policy design and implementation, which require finer-grained data to identify current, latent, potential, and strategic productive capabilities. Our methodology papers, Estevez, Chang, and Schollmeyer (2025) and Schollmeyer et al. (2025), discuss how these limitations constrain the utility of common frameworks and tools for industrial policy and trade analysis.

To overcome these long-standing bottlenecks, our Data Lab leverages AI to illuminate granular, concrete, real-world information and construct datasets specifically designed for implementation-ready industrial policy analysis. Recent advances in large language models (LLMs) have made it possible to design and deploy legions of AI researchers that mimic highly trained human researchers to assemble and structure vast amounts of information. Using these tools, our data team has developed a first-of-its-kind auto-research algorithm that draws on publications, intellectual property rights records, business-to-business price data, targeted web search, and more to discover and characterize productive capabilities at a sufficiently fine level of detail to move from rough analysis to practical policy design and implementation. This includes fine-grained information about productive capabilities, technological dependencies, and value-adding processes (firm-level capacities, talent availability, value added, cost and profit data, resource use, etc.).
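As a rough illustration of what such an auto-research loop can look like, here is a minimal sketch in Python. Everything in it (the CapabilityRecord structure, the source list, and the ask_llm stub) is a hypothetical stand-in, not our actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class CapabilityRecord:
    """Structured description of one productive capability."""
    name: str
    evidence: dict = field(default_factory=dict)  # source type -> extracted facts


# Hypothetical source types, mirroring those named above.
SOURCES = ["publications", "ip_records", "b2b_prices", "web_search"]


def ask_llm(prompt: str) -> str:
    """Stand-in for a call to a large language model; a real pipeline
    would retrieve documents first and prompt the model to extract
    structured facts from them."""
    return f"[structured facts extracted for: {prompt}]"


def research_capability(name: str) -> CapabilityRecord:
    """One 'AI researcher' pass: query each source type and merge the
    extracted facts into a single structured record."""
    record = CapabilityRecord(name=name)
    for source in SOURCES:
        prompt = f"Characterize the productive capability '{name}' using {source}."
        record.evidence[source] = ask_llm(prompt)
    return record


print(research_capability("PERC solar cell metallization"))
```

In a real pipeline, ask_llm would be replaced by per-source retrieval and extraction steps, with the model's output validated against a common schema before being merged into the record.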

Our tools are already being piloted in industrial value chain targeting projects with the World Bank for the Romanian government, and in large-scale company surveys with the European Commission and the World Bank: interview-based surveys of 1,964 SMEs across 34 countries, conducted in their local languages using our Verbatim tool.

Visualization dashboard

Our visualization dashboard uses the solar PV value chain as an illustrative example to offer a glimpse into our methodologies for value chain decomposition, technology analysis, trade code mapping, company mapping, competitiveness proxies, and trade balances along the value chain.

Visualization dashboard preview
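To give a hypothetical sense of how the trade code mapping and the trade balance views connect, the sketch below maps value chain segments to HS-6 codes and sums each segment's net balance. The HS codes shown are real solar-PV-related codes, but the segment mapping is simplified and the flow figures are invented.

```python
# Simplified segment-to-HS-6 mapping for the solar PV chain (illustrative).
SEGMENT_TO_HS6 = {
    "polysilicon": ["280461"],            # silicon >= 99.99% pure
    "wafers": ["381800"],                 # doped wafers for electronics
    "cells_and_modules": ["854142", "854143"],  # PV cells / modules (HS 2022)
}

# Exports and imports in USD millions by HS-6 code for one country
# (made-up numbers for illustration only).
TRADE_FLOWS = {
    "280461": {"exports": 120.0, "imports": 40.0},
    "381800": {"exports": 15.0, "imports": 90.0},
    "854142": {"exports": 5.0, "imports": 60.0},
    "854143": {"exports": 2.0, "imports": 30.0},
}


def segment_balance(segment: str) -> float:
    """Net trade balance of a value chain segment: sum of
    (exports - imports) over all HS-6 codes mapped to that segment."""
    return sum(
        TRADE_FLOWS[code]["exports"] - TRADE_FLOWS[code]["imports"]
        for code in SEGMENT_TO_HS6[segment]
    )


for segment in SEGMENT_TO_HS6:
    print(segment, segment_balance(segment))
```

Summing exports minus imports over all codes mapped to a segment is what turns raw HS-6 trade flows into a trade-balances-along-the-value-chain view.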

The 12,800 components and types are fully navigable (see the GIF below). By clicking on any component or type, you can access detailed information about, for example, its function and mechanism, as well as its 6-digit trade code. Moreover, you can navigate upstream to explore the inputs required to produce a specific type. While still experimental and incomplete, this mapping is the first of its kind, and we aim to make these detailed breakdowns available for as many value chains as possible in the near future for the benefit of policymakers and researchers. Try it yourself.
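Navigating upstream through inputs amounts to walking a component tree. The toy sketch below shows the idea; the Component structure and the three example nodes are illustrative stand-ins for the actual 12,800-entry mapping.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One node in the value chain: a component or type, its HS-6 code,
    and the upstream inputs needed to produce it."""
    name: str
    hs6: str
    inputs: list = field(default_factory=list)


def upstream(component: Component, depth: int = 0) -> None:
    """Recursively print everything required to produce a component."""
    print("  " * depth + f"{component.name} (HS {component.hs6})")
    for item in component.inputs:
        upstream(item, depth + 1)


# Three illustrative nodes from the solar PV chain.
polysilicon = Component("solar-grade polysilicon", "280461")
wafer = Component("monocrystalline wafer", "381800", inputs=[polysilicon])
cell = Component("PERC solar cell", "854142", inputs=[wafer])

upstream(cell)  # walks upstream: cell -> wafer -> polysilicon
```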
