AI and Statistics for the SDGs

Data Literacy, Official Statistics, and Big Data for Measurements that Matter

Why AI and Statistics for the SDGs

Though we are inundated by oceans of data and surrounded by innovative, powerful technologies that hold great promise to help us accomplish and measure the Sustainable Development Goals, access to and mastery of these resources are highly constrained and unequal, both across and within countries, in ways that reflect and reinforce the weakening of global democratic principles and processes. DPA believes that equitable access to these resources is imperative to ensuring the best future. Under this Program, we equip individuals and organizations with data skills, systems, and standards to navigate the ‘age of data’ and develop measurement methodologies for policy and the SDGs.

“90% of business leaders cite data literacy as key to company success, but only 25% of workers feel confident in their data skills. Not only that, but some estimates suggest that nearly 9 in 10 data science professionals are White, and just 18% are women.” (HBR, 2021).

Methods

DPA uses the following methods to implement projects under this program: in-person and online courses, advanced data modeling, and AI-based analysis for measuring the unmeasured.

Products

Product 1

Tailored Data Trainings

Product 2

SDGs Assessments and Measurement

Product 4

Human AI Research

OPAL

Tailored Data Trainings

DPA developed a four-week training on qualitative and quantitative analysis and tools for the members of the Mexican CSO EQUIS Justicia para las Mujeres. The training comprised online and in-person guided tutorials conducted by DPA facilitators. In addition to showcasing different techniques and tools for data analysis, the sessions drew on use cases from the organization's real projects to discuss specific topics. The content was divided into two workshops: the first focused on practical techniques and tutorials on data collection, storage, and processing; the second focused on how to incorporate basic notions of statistics, database reading, and the use of visualizations.

This project, developed with the support of the Spanish Agency for International Development Cooperation (AECID), strengthened the technical capacities of government officials in Latin America and the Caribbean to take advantage of Big Data for sustainable development and official statistics. During the first phase of the project, through an exploratory study (see Publication below), we analyzed the current state of the infrastructure, institutional framework, regulatory framework, capacities, and use cases of Big Data for the generation of public policies in five LAC countries: Bolivia, Dominican Republic, El Salvador, Guatemala, and Peru.

The second phase focused on developing four capacity building workshops between June 2022 and March 2023.

  • Introduction to Big Data for Sustainable Development
  • Big Data and Poverty Analysis for Sustainable Development
  • Big Data and Health Analysis for Sustainable Development
  • Big Data, Security and Violence for Sustainable Development

This training itinerary gave participants comprehensive knowledge of the key concepts, necessary tools, and main challenges of Big Data for sustainable development, with a special emphasis on the applicability of these data sources for statistical purposes.

This two-day online workshop with government officials from the National Institute of Statistics and Informatics (INEI) in Peru and other public agencies sought to strengthen institutional capacities for the improvement of the national statistical system, particularly by leveraging non-traditional sources of data in administrative reports to measure different indicators of the Sustainable Development Goals (SDGs) towards the achievement of the 2030 Agenda.

EmpoderaData built upon the success of the “Quantitative Step” (Q-Step) program, which was developed as a strategic response to the shortage of quantitatively skilled social science graduates in the United Kingdom. Together, the University of Manchester and Data-Pop Alliance expanded upon the program’s excellent results, exploring this model in the Global South (specifically in Mexico, Brazil, and Colombia) as the “EmpoderaData Project”. This initiative aimed to promote a virtuous cycle of social transformation by fostering data literacy skills applied to addressing our society’s most pressing issues within the framework of the Sustainable Development Goals (SDGs).

This series of workshops organized by Data-Pop Alliance, in coordination with GIZ and GIZ Colombia, addressed key terms, necessary tools, and challenges in the Big Data and sustainable development landscape, focusing on the applicability of these information sources in projects related to climate change adaptation and mitigation. This in-person workshop provided an introduction to the “3 C’s of Big Data” as a basis for the development of a Project Lab, enabling participants to incorporate new sources of information in the development of climate change projects.

In November 2019, Data-Pop Alliance, in partnership with ECLAC and the Brazilian Institute of Geography and Statistics (IBGE), conducted its first technical workshop in Rio de Janeiro, tailored specifically to the needs of the staff at IBGE. The goal was to help them build and strengthen internal capacities to leverage web data collection and analysis in their projects, specifically by delving into web scraping and API interaction techniques. The workshop emerged as part of the broader training program carried out with ECLAC in the Latin America and Caribbean region: “Big Data for Measuring the Digital Economy”.

In partnership with the United Nations Economic Commission for Latin America and the Caribbean (ECLAC), DPA offered a series of workshops focused on “Big Data and the Digital Economy” in the Latin American and Caribbean region. These workshops were designed for development practitioners, policymakers, and researchers. Five editions were delivered: Santiago de Chile (March 2016); São Paulo (September 2017), in partnership with Cetic.br; Mexico City (October 2017), in collaboration with the National Digital Strategy (EDN) program and the MIT Sloan School of Management; Santo Domingo (April 2019); and Bogotá (May 2019), in partnership with DANE.

Carried out in partnership with the United Nations System Staff College (UNSSC), this series of courses aimed to help practitioners and policymakers develop and implement Big Data innovation projects, policies, and partnerships in support of sustainable development objectives. The content was structured into three main modules: contexts and concepts; methods and tools; and strategy and conception/ethics and engagement. The workshops were delivered in Cambridge at MIT (June 2016), Bogotá (December 2016), Nairobi (June 2017), Dakar (March 2018), Bangkok (March 2018), and the MIT Media Lab (October 2018). The same workshop was also conducted in Tunisia (April 2019) with support from UN Tunisie.

SDGs Assessments and Measurement

DPA provided comprehensive support to Equatorial Guinea in developing its Second Voluntary National Review (VNR). Over a six-week sprint, DPA undertook multiple tasks, including reviewing and adding information, conducting research, and creating presentations, to support the finalization of the VNR. Key contributions by DPA included:

(1) Document review and improvement through revision and suggestion of improvements for how the country presented its public policies and programs related to the SDGs;
(2) Budget alignment assessment and innovative use of LLMs, by conducting the first assessment of the alignment between the SDGs and the national budget. The team used large language models (LLMs) to classify budget items by SDG, demonstrating that such technology can facilitate complex and systematic assessments;
(3) Presentation and preparation of content used by the Minister of Planning at the High-Level Political Forum (HLPF) in New York City;
(4) Data compilation and analysis of information related to more than 36 SDG indicators; and
(5) Design of the entire VNR document.
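
As a rough illustration of what a budget alignment assessment computes, the sketch below tags each budget line with an SDG and reports the share of total spending per goal. It is not DPA's method: the keyword lookup in `classify_line` is a hypothetical stand-in for the LLM classifier, and the keyword map, function names, and sample budget lines are all invented for this example.

```python
from collections import defaultdict

# Hypothetical keyword map standing in for an LLM classifier (assumption,
# not part of the original project). Only three goals are shown.
SDG_KEYWORDS = {
    "SDG 3": ("health", "hospital", "vaccine"),
    "SDG 4": ("school", "education", "teacher"),
    "SDG 6": ("water", "sanitation"),
}


def classify_line(description):
    """Return the first SDG whose keywords appear in the description."""
    text = description.lower()
    for sdg, keywords in SDG_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return sdg
    return "Unclassified"


def budget_shares(lines):
    """Map each label to its share of the total budget.

    `lines` is an iterable of (description, amount) pairs.
    """
    totals = defaultdict(float)
    for description, amount in lines:
        totals[classify_line(description)] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {label: amount / grand_total for label, amount in totals.items()}


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = [
        ("Rural hospital construction", 600.0),
        ("Primary school teachers", 300.0),
        ("Road maintenance", 100.0),
    ]
    for label, share in sorted(budget_shares(sample).items()):
        print(f"{label}: {share:.0%}")
```

In practice the classification step is the hard part, which is where a language model would replace the keyword lookup; the aggregation into shares stays the same.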

DPA, in collaboration with the UN System in Tunisia and under the implementation leadership of UNDP Tunisia, provided technical support to the National Statistics Council (NSC) and the National Institute of Statistics (INS) in the development of the first National Strategy for the Development of Statistics (NSDS) in Tunisia. The process of the NSDS development included:

(1) Identifying the strengths, weaknesses, and needs of the Tunisian National Statistical System (NSS) through a comprehensive strategic diagnostic of the NSS;
(2) Developing a 5-year National Strategy for the Development of Statistics that facilitates the timely production of reliable statistics to contribute to informed public policymaking aligned with the SDGs;
(3) Developing an advocacy and communication plan to accompany the NSDS implementation; and
(4) Developing three-year action and financing plans to effectively implement the strategic orientations defined in the NSDS.

Equatorial Guinea adopted the 2030 Agenda for Sustainable Development and its Sustainable Development Goals (SDGs) with the aim of eradicating poverty, protecting the planet, and enabling all people to live in peace and prosperity. Undertaken in partnership with the Government of Equatorial Guinea and UNDP Equatorial Guinea, the work contained in this report presents the progress made towards the 2030 Agenda and how to mobilize the resources necessary for its implementation through partnerships and inclusive planning. Based on this commitment, DPA contributed to the first VNR in the country in order to present a self-assessment of progress towards achieving the SDGs.

DPA, in support of Haiti’s Integrated National Financing Framework (INFF), and in close collaboration with UNDP Haiti and the Ministry of Planning and External Cooperation (MPCE), conducted a consultation with the objective of establishing a performance assessment of the achievement of the MDGs, deviations from targets, their justifications, and lessons learned. Additionally, the study measured progress towards the achievement of the SDGs and proposed priority provisions and mechanisms for Haiti’s advancement towards them.

In partnership with Southern Voice, this research project aimed to rebuild the discourse on development effectiveness from a Southern perspective by developing a new methodological framework. The key objective was to create a methodological guide that leverages innovative data sources and techniques. The study provided an overview of potential open-access and public data, referencing case studies relevant to development effectiveness assessment. It conducted a comparative review of the use of innovative data and techniques—including geospatial information and Big Data—in developing countries, highlighting their advantages, limitations, and associated challenges such as privacy issues. Ultimately, the study proposed alternative methodologies applicable at the country level for measuring development effectiveness, relating these methodologies to inputs, processes, and outputs.

In partnership with the United Nations Development Programme (UNDP), DPA provided support to the Europe and Central Asia Regional Hub in Istanbul for the project “Measuring the Unmeasured” to contribute to SDG measurement and achievement. The effective use of data for public policy was of critical importance to the UN in its efforts to strengthen evidence-based programming and policy development. In particular, generating, analyzing, presenting, and using data was vital to global and regional efforts to monitor and promote the Sustainable Development Goals (SDGs). Our project aimed to scope, develop, and test different methods for measuring Tier III indicators of high SDG priorities for 11 countries in the Arab States, Europe and Central Asia, and Asia Pacific, with the main goal of utilizing this information in policy responses.

Human AI Research

How data are shared and used will determine, to a large extent, the future of democracy and human progress. In this context, the authors of the paper “Sharing is Caring” described four key requirements that had to inform European efforts to ensure that private data were shared and used for the public good in a safe, ethical, and sustainable manner. This paper, co-published with the Vodafone Institute for Society and Communications, is part of the “Digitising Europe” initiative and a series of discussion papers focused on the challenges of European digital policy.

This paper, developed in cooperation with the Vodafone Institute for Society and Communications, highlighted overall takeaways and recommendations in the areas of privacy protection, responsible data governance, transparency, and accountability for unleashing big data-driven innovation. These included:

(1) Putting ‘privacy by design’ into action through privacy-preserving technical procedures and standards for data sharing and use;
(2) Focusing on responsibility in data use by establishing internal responsible data governance standards; and
(3) Keeping transparency, trust, and user control at the center by engaging all data stakeholders.

How can we make full use of data analytics in a responsible and human-centered manner? Which forms of data use should be excluded, and who should set the rules? As the first event paper in the “Digitising Europe” series (published with the Vodafone Institute for Society and Communications), this publication captured the key themes that emerged from the initiative’s events in Berlin in November 2015. These events explored the possibilities and pitfalls of the data revolution and served to identify key insights and practical solutions for facilitating the use of data while protecting the privacy of citizens.

This white paper, developed in partnership with and funded by The World Bank, examined Call Detail Records (CDRs) and their expanding role in providing insight into human behavior, movements, and social interactions. After providing additional contextual elements (Part 1), the paper summarized current legal frameworks (Part 2), explored structural socio-political parameters and incentives for sharing CDRs (Part 3), proposed guiding ethical principles (Part 4), and discussed operational options and requirements (Part 5).

OPAL

“OPAL for Public Data and Good” seeks to combine different privacy-enhancing techniques (PETs), such as federated learning, differential privacy, and negative databases, so that trusted third parties such as researchers or official institutions can analyze microdata from censuses and national surveys produced by national statistical offices (NSOs), as well as other administrative records, and derive indicators from these data while avoiding privacy risks. A pilot is expected to be conducted in Mexico, and DPA plans to expand to additional NSOs and other public data holders in the future.
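
To give a sense of how one of these techniques works, the sketch below applies the Laplace mechanism, a standard building block of differential privacy, to a counting query over survey microdata. It is a minimal illustration and not OPAL's implementation: the function name `dp_count`, the record fields, and the sample data are hypothetical, and a production system would also need to track a privacy budget across queries and guard against floating-point attacks.

```python
import random


def dp_count(records, predicate, epsilon, rng=None):
    """Return a differentially private count of records matching predicate.

    Adding or removing one person changes a count by at most 1, so the
    query has sensitivity 1 and Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    rate = epsilon / sensitivity  # rate = 1 / scale
    # The difference of two i.i.d. exponential draws with this rate
    # follows a Laplace distribution with scale 1/rate.
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


if __name__ == "__main__":
    # Invented microdata for illustration only.
    households = [
        {"region": "north", "has_piped_water": True},
        {"region": "north", "has_piped_water": False},
        {"region": "south", "has_piped_water": True},
        {"region": "south", "has_piped_water": True},
    ]
    noisy = dp_count(households, lambda h: h["has_piped_water"], epsilon=0.5)
    print(f"Noisy count of households with piped water: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger privacy; a larger one gives more accurate indicators at a higher privacy cost. The analyst only ever sees the noisy result, never the underlying records.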