Published on Development Impact

The Future of Surveys: Six Takeaways from the Inaugural Conference on Innovations in Survey Measurement in the Age of AI


Advances in data and measurement are reshaping how researchers and policymakers understand jobs, livelihoods, and economic resilience in low- and middle-income countries. But improving measurement is far from straightforward. How surveys ask questions, how responses are coded, and how new technologies are integrated can all affect the quality of data.

These challenges were at the center of the inaugural conference “Better Data for Better Jobs and Lives: Innovations in Survey Measurement in the Age of AI,” held at the World Bank in Washington, DC in December 2025, and organized by the World Bank Survey Unit’s Living Standards Measurement Study (LSMS) program and the Global Poverty Research Lab at Northwestern University, in collaboration with the World Bank Data Academy. An overview of all sessions and their recordings is available here.

Building on that momentum, we are happy to announce that the next installment of the Conference on Innovations in Survey Measurement in the Age of AI will take place on December 8 and 9, 2026, at the World Bank Headquarters in Washington, DC. Stay tuned for the Call for Papers.

A recurring theme throughout the inaugural conference was that household surveys remain the backbone of development data, even in the age of AI. Surveys provide the ground truth that underpins evidence on poverty, jobs, food security, and resilience, and they play an essential role in calibrating and validating emerging data sources such as geospatial analytics and machine learning systems. At the same time, new technologies are opening exciting frontiers for measurement. Realizing their potential will require careful testing, validation, and integration with existing survey systems.

Here are six key takeaways from the research and discussions at the conference and some ideas on what’s next.

1. How we measure shapes what we learn

Across a wide range of topics, from small business activity to poverty and consumption to childcare, presentations showed that survey design choices often have first-order consequences for empirical results.

Choices such as interview mode, question wording, respondent selection, recall periods, and sampling approach can produce differences in estimates that rival the effects researchers are trying to measure (Markhof[1]; Markhof[2]; Glazerman; Amankwah; Yacoubou Djima; Baron; Ambel; Mahler). For example, welfare and livelihoods indicators collected over the phone differ by up to 63% from those collected in person in Nigeria (Markhof[1]). And household surveys in urban Ghana count over twice as many informal enterprises as those conducted using area-based sampling frames (Amankwah).

Small survey errors can have outsized consequences. A three percent misclassification rate in employment surveys can triple estimated labor market transition rates (Prinsloo); measures of gender gaps in socioemotional skills and occupational bias are substantially affected by whether they are assessed by self-reports or behavioral measures (Das; Delavallade; Donald); poverty estimates for women and children are sensitive to the consumption goods from which intra-household resource allocation is estimated (Palacios-Lopez); short-form childcare modules miss caregiving duties carried out by fathers, grandparents, or siblings (Contreras); and census enumeration form design can lead to systematic undercounting of the poorest populations (Radu).
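To see why a small misclassification rate can inflate transition rates so much, a back-of-the-envelope calculation helps. The sketch below is our own illustration, not the method used in the cited presentation. It assumes two employment states, two survey waves, and a symmetric error that flips the recorded status independently in each wave. The 3% true transition rate is a hypothetical value chosen for illustration.

```python
def observed_transition_rate(true_rate: float, error_rate: float) -> float:
    """Share of respondents whose recorded status differs between two waves.

    Assumptions (illustrative only): two states, and a symmetric
    misclassification error that is independent across waves.
    """
    # A true transition is still recorded if neither wave is misclassified
    # or if both are (the two errors swap both statuses).
    true_change_recorded = (1 - error_rate) ** 2 + error_rate ** 2
    # A spurious transition is recorded when exactly one wave is misclassified.
    spurious_change = 2 * error_rate * (1 - error_rate)
    return true_rate * true_change_recorded + (1 - true_rate) * spurious_change


true_rate = 0.03   # hypothetical true transition rate between waves
error_rate = 0.03  # three percent misclassification in each wave

observed = observed_transition_rate(true_rate, error_rate)
print(f"observed transition rate: {observed:.4f}")         # 0.0847
print(f"ratio to true rate:       {observed / true_rate:.2f}")  # 2.82
```

Under these assumptions, most recorded transitions are spurious: nearly all respondents are at risk of a one-wave error, while only a few truly change status. The lower the true transition rate, the larger the inflation.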

2. Measuring jobs and livelihoods in LMICs requires survey innovations built for informal labor markets

In high-income countries, the archetype of employment is relatively straightforward: a single, formal job with regular hours, a defined employer, and a clear occupational category. In most low- and middle-income settings, this description fits a minority of workers. Instead, individuals combine farming with wage work, run informal enterprises from their homes, move in and out of self-employment across seasons, and participate in supply chain networks that blur the boundary between employee and entrepreneur.

Throughout the conference, several presentations explored the unique challenges that informal and diverse labor markets pose for accurate measurement. Researchers investigated topics such as home-based businesses that often escape the scope of standard enterprise surveys (Kagy; Amankwah), a cost-effective approach (‘aggregate relational data’) to mapping the trading, credit, and information networks connecting small firms (Chawla), shifts in agricultural labor as off-farm opportunities become more prevalent and their implications for data collection (Stevenson), and an index measuring ‘poor quality employment’, not just who has a job and their wage (Sehnbruch).

3. Higher frequency data and new data sources are broadening the scope of what surveys can capture and analyze

Traditionally, surveys have been infrequent, constrained to periodic snapshots, and limited to respondent-reported data. Several contributions at the conference demonstrated that this is no longer the case.

Nimble survey designs and innovations in data collection are changing the frequency at which survey data can be collected. High-frequency data reveal greater transitions and fluctuations in labor supply, income, time use, and occupations that standard annual surveys often miss (McGavock; Safari; Gonzalez).

Surveys can be complemented with technology. Sensor-based and remote-sensing technologies — including weather stations, soil scanners, and satellite imagery — are increasingly being deployed alongside surveys and greatly expand what surveys can measure (Josephson; Gourlay; Dickinson). This allows researchers to observe economic and environmental conditions with far greater precision than ever before.

Lastly, geospatial data and machine learning methods are also being used to fill gaps where traditional survey data are sparse: estimating poverty from satellite imagery and big data (Marty), improving sub-national human capital estimates through small area estimation (Newhouse), and building causal, shock-sensitive resilience indices (Montalbano).

4. AI is already improving how survey data are processed and analyzed

Large language models, machine learning, and related tools are already being used to classify occupations (Rossow), code open-ended responses (Brueckmann), and analyze qualitative survey answers (Burstein; Tekleselassie).

These tools can automate tasks that traditionally required large teams of human coders, often reducing costs and processing time while maintaining high levels of accuracy. They also allow researchers to extract insights from forms of data that surveys already collect but rarely analyze, such as free-text responses and narrative descriptions (Burstein; Losira), or to harmonize data across different sources and survey designs (Mahler).
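One common way to check whether automated coding maintains accuracy is to compare it against a human-coded subsample. The sketch below is a generic illustration and not a procedure described by the presenters. It computes raw agreement and Cohen's kappa, which corrects for the agreement expected by chance. The function name and the toy occupation labels are hypothetical.

```python
from collections import Counter


def agreement_stats(human: list[str], machine: list[str]) -> tuple[float, float]:
    """Return (raw agreement, Cohen's kappa) for two codings of the same items."""
    if not human or len(human) != len(machine):
        raise ValueError("codings must be non-empty and of equal length")
    n = len(human)
    # Share of items where the two codings assign the same category.
    observed = sum(h == m for h, m in zip(human, machine)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    human_counts, machine_counts = Counter(human), Counter(machine)
    expected = sum(human_counts[c] * machine_counts[c] for c in human_counts) / n**2
    # If chance agreement is already 1, kappa is undefined; report 1.0.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return observed, kappa


# Toy example: four responses coded by a human and by an automated tool.
human_codes = ["farm", "farm", "trade", "trade"]
machine_codes = ["farm", "trade", "trade", "trade"]

accuracy, kappa = agreement_stats(human_codes, machine_codes)
print(accuracy, kappa)  # 0.75 0.5
```

Reporting kappa alongside raw agreement matters when one category dominates, since high raw agreement can then arise largely by chance.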

In this sense, AI is beginning to augment survey workflows, helping researchers extract more value from existing data and improve the efficiency of large-scale survey operations. In addition to streamlining data processing to move more quickly from data collection to analysis, LLMs have the potential to bridge qualitative and quantitative work in new ways. The high costs and implicit subjectivity in coding qualitative data are greatly reduced by LLMs (Burstein; Brueckmann; Sayouti), but how researchers put guardrails in place and train new models remains a methodological research area that requires substantial investment if we aim to leverage new LLM technology. The potential to hear from respondents in their own words is exciting.

5. The AI frontier: separating promise from hype

Beyond these practical applications, the conference also explored more ambitious AI-enabled measurement approaches that are currently being piloted. These include conversational agents that may one day augment or replace human interviewers (Amar; Koca), automated speech-based quality control (Sayouti), and voice biomarkers for mental health screening (Pougué Biyong).

These possibilities are genuinely exciting, but the evidence base remains limited. Most applications are still at the prototype or early pilot stage, and many questions remain about reliability, bias, and how respondents interact with AI systems.

A key message from the keynote (Rothschild) and panel discussions was that AI innovations should be evaluated against the same standards applied to any survey method. Demonstrated accuracy, known failure modes, and transparent reporting are essential before these technologies can be adopted at scale.

6. Turning innovation into practice is the real challenge

Across the conference, a challenge that emerged repeatedly was that, while methodological innovation is advancing rapidly, translating new ideas into widely adopted survey practice remains difficult. Strengthening survey systems therefore requires not only new methods, but also new skills that help institutions adopt and scale them.

For the foreseeable future, high-quality primary data collected through household surveys remain irreplaceable, not because alternatives lack promise, but because no existing technology can yet replicate the breadth, depth, and contextual richness that well-designed household surveys provide. However, the innovations showcased at the conference demonstrate great promise in augmenting surveys and strengthening development data.

Ultimately, the goal is simple: better measurement that leads to better data.


Kathleen Beegle

Lead Economist, Poverty, Inequality and Human Development, Development Economics

Andrew Dillon

Guest blogger / Clinical Associate Professor of Development Economics within Kellogg’s Public-Private Interface Initiative (KPPI); Director of Research Methods Cluster in the Global Poverty Research Lab

Talip Kilic

Acting Manager & Senior Program Manager, Living Standards Measurement Study (LSMS), World Bank

Yannick Markhof

Postdoc, ETH Zurich & Associated Postdoc, ETH AI Center

Amparo Palacios-Lopez

Senior Economist, Living Standards Measurement Study (LSMS), World Bank

Philip Randolph Wollburg

Senior Economist, Living Standards Measurement Study (LSMS), World Bank
