“How to use Big Data? Leading experts’ roadmap to data-driven innovation projects”
This position paper highlights overall takeaways and recommendations in the areas of privacy protection, responsible data governance, transparency and accountability for unleashing big data-driven innovation, including: (1) putting ‘privacy by design’ into action: privacy-preserving technical procedures and standards for data sharing and use; (2) focusing on responsibility in data use: establishing internal responsible data governance standards; and (3) keeping transparency, trust and user control at the centre: engaging all data stakeholders.
“Algorithmic accountability – Applying the concept to different country contexts”
Drawing from interviews with global experts, topic workshops and content research, this scoping paper aims to provide the reader with an understanding of algorithmic decision-making processes and the challenges they pose to our existing understanding of accountability across different contexts. It offers a map of existing technical and governance mechanisms for both identifying and addressing algorithmic harms and bias, as well as a set of recommendations and entry points for the Web Foundation and other stakeholders to contribute to this emerging field most effectively.
“Big Data – Predicting and preventing climate-related shocks”
Big Data as a socio-technological phenomenon has the potential to generate new insights on the functioning and interaction of human and natural ecosystems. In particular, Big Data can improve our understanding of how societies deal with shocks related to climate change, and inform policies and actions to foster adaptive mechanisms. However, such positive effects will not occur automatically, and investments to address the technological, human, and ethical barriers of Big Data will be necessary. This article analyzes these factors and makes a series of recommendations on the potential for leveraging Big Data for climate change resilience in Latin America and the Caribbean (LAC), the impediments to doing so, and the requirements for it to be effective.
“Fair, Transparent and Accountable Algorithmic Decision-making Processes”
The combination of increased availability of large amounts of fine-grained human behavioral data and advances in machine learning is driving a growing reliance on algorithms to address complex societal problems. Algorithmic decision-making processes might lead to more objective and thus potentially fairer decisions than those made by humans, who may be influenced by greed, prejudice, fatigue, or hunger. However, algorithmic decision-making has been criticized for its potential to enhance discrimination, information and power asymmetry, and opacity. In this paper, we provide an overview of available technical solutions to enhance fairness, accountability, and transparency in algorithmic decision-making.
“Mining Case Law to Improve Countries’ Accountability To Universal
The United Nations (UN) Universal Periodic Review (UPR) is a process established by the Human Rights Council aiming to monitor and improve the human rights situation in each UN member state. In this study, we hypothesize that leveraging text mining and machine-learning algorithms is a viable strategy for monitoring gender discrimination in sentencing practices of Fiji’s judiciary system, which has been the object of recommendations from Norway and Belgium in the UPR cycles of 2010 and 2015, respectively.
“Oportunidades y requerimientos para aprovechar el uso de Big Data para las estadísticas oficiales y los Objetivos de Desarrollo Sostenible en América Latina”
This document was produced as part of a project supported by the World Bank and implemented by Data-Pop Alliance in partnership with Colombia’s Departamento Administrativo Nacional de Estadística (DANE). Data-Pop Alliance is a coalition on Big Data and development jointly created by the Harvard Humanitarian Initiative, the MIT Media Lab and the Overseas Development Institute (ODI) to promote a people-centered Big Data revolution.
“Leveraging Algorithms for Positive Disruption: On data, democracy, society and statistics”
The main objective of this paper is to discuss whether and how the future of algorithms can be crafted such that their development and deployment (from their design to their use, including control, evaluation, auditing and governance) are based on and foster core democratic values such as accountability, transparency, participation, and collaboration. In doing so, we will focus on algorithms affecting public life and policies to maximize benefit for citizens, or ‘public good algorithms’, but the discussion aims to have broader applicability.
“Big Data and Climate Change Resilience”
This paper (in progress) outlines Data-Pop Alliance’s ongoing research on Big Data, climate change and environmental resilience. The paper dives deeply into the conceptualization of climate change resilience, both specific and general; addresses Big Data contributions to understanding the components of climate risk; and identifies gaps and challenges in applying Big Data to climate resilience decision-making. Finally, the authors offer suggestions for individual and community engagement in building resilience.
Event Paper – Big Data and Privacy: Understanding the Possibilities and Pitfalls of the Data Revolution in Germany
The first event paper in the Digitising Europe series, it captures the key themes emerging from our events in Berlin in November 2015.
The Berlin workshop and public forum focused on the possibilities and pitfalls of using Big Data analytics for economic growth and public good. Bringing together German academic institutions, think tanks, businesses and other thought-leaders, the expert workshop focused on the ongoing political discourse in Berlin surrounding the elements framing the GDPR and EU legislation on data protection.
“Correcting for Sample Bias with Application to the Case of Senegal”
This paper sets out to explain how to model and correct sample bias in Call Detail Records (CDRs). A proper understanding of sample bias is key to producing useful estimates derived from CDRs: such calculations rely heavily on a good understanding of how the sample (cell-phone users) relates to the larger population it is drawn from. Correcting for this bias could have major applications in crisis monitoring and response, as in the case of flood vulnerability predictions. Data-Pop Alliance uses both statistical and machine learning approaches, relying on data from Orange’s D4D challenges, official censuses and Demographic and Health Survey (DHS) program data.
“Beyond Data Literacy: Reinventing Community Engagement and Empowerment in the Age of Data”
This paper attempts to delineate the broad contours of “data literacy” through analysis of its history, definition, expectations, application, and promotion. The paper thus defines “data literacy” as “the desire and ability to constructively engage in society through and about data” and argues that promotion of “data literacy” should be via and for social inclusion.
“Big Data for Climate Change and Disaster Resilience: Realising the Benefits for Developing Countries”
This paper evaluates the opportunities, challenges and required steps for leveraging the new ecosystem of Big Data to monitor and detect hazards, mitigate their effects, and assist in relief efforts as poor communities become more vulnerable to natural hazards. There have been increasing calls to make disaster risk reduction a core development concern and to build resilience so that vulnerable communities and countries as complex human ecosystems not only ‘bounce back’ but also learn to adapt to maintain equilibrium in the face of natural hazards.
“Moves on the Street: Classifying Crime Hotspots Using Aggregated Anonymized Data on People Dynamics”
This paper highlights the potential societal benefits derived from big data applications with a focus on citizen safety and crime prevention. The authors detail a case study tackling the problem of crime hotspot classification, that is, classifying which areas in a city are more likely to witness crimes based on past data. In the proposed approach, demographic information is used along with human mobility characteristics derived from anonymized and aggregated mobile network data.
“Group Privacy in the Age of Big Data”
This paper attempts to define what is a group and what is privacy in order to determine how a privacy right might attach to groups distinctly from the individual privacy rights of its members, and what might be the content of such a group privacy right. The challenge faced by group privacy is to enable the positive uses of Big Data while restricting the oppressive uses to the extent possible. This cannot be done by legislation or stakeholders alone; it also requires improving awareness and data literacy, and harnessing technology itself to improve data security and accountability for breaches.
“Big Data and Development:
This paper describes the fundamental nature of Big Data as an ecosystem and how it engages with society. Although Big Data has promising applications to real-world problems, it comes with warnings and risks, the most severe being the risk to individual privacy, identity and security. In response to these challenges and risks, the paper explores the future of Big Data and how it will be shaped by academic research, legal and technical frameworks for ethical use of data, and larger societal demands for greater accountability and participation.
“The Law, Politics and Ethics of Cell Phone Data Analytics”
This paper examines Call Detail Records (CDRs) and their expanding role in providing insight into human behavior, movements, and social interactions. As a result of their growing application, certain ethical and legal questions need to be addressed. The paper summarizes current legal frameworks, explores structural socio-political parameters and incentives structuring the sharing of CDRs, proposes guiding ethical principles and discusses operational options and
“Quantifying the Data Deluge and the Data Drought”
This paper investigates how the world’s Big Data capacity can be understood in terms of the world’s storage capacity and the telecommunication capacity to access this storage (‘the cloud’). It follows the methodology of what has become the standard reference in estimating the world’s technological information capacity: Hilbert and López (2011).
“Official Statistics, Big Data, and Human Development”
This paper aims to contribute to the ongoing and future debate about the relationships between Big Data, official statistics and development. This paper argues that Big Data needs to be seen as an entirely new ecosystem comprising new data, new tools and methods, and new actors motivated by their own incentives. The emergence of this new ecosystem provides both a historical opportunity, and a political and democratic obligation for official statistical systems to recall, retain or regain their primary role as the legitimate custodian of knowledge and creator of a deliberative public space for and about societies.
Big Data & SDGs note for Global Sustainable Development Report
This paper focuses on the intersection of Big Data and the Sustainable Development Goals (SDGs) and the spectrum of ways and channels through which Big Data, as an entirely new ecosystem, could impact (contribute to or hamper) human progress as called for and measured by the SDGs. Applications of Big Data to the SDGs have the potential to advocate for causes, shape incentives and inform policies. This paper argues that Big Data contributions to the SDGs should expand beyond monitoring: Big Data must contribute directly to the SDGs, which will require a data-educated citizenry.
AFD Paper: CDRs & Poverty and Population Analysis – Côte d’Ivoire
This paper considers Big Data’s potential to partly fill some key data gaps and complement or even replace official statistics. Data-Pop Alliance offers the specific case of Côte d’Ivoire, using Call Detail Records (CDRs) from Orange in conjunction with two other datasets: the WorldPop dataset, which provides population data derived from satellite imagery, and the recently released 2013 Demographic and Health Survey (DHS). The paper intends to predict multidimensional poverty at the sous-prefecture and sub-national levels, and to predict the population of the 11 sub-national regions of Côte d’Ivoire and its 255 sous-prefectures (sub-districts).
“Big Data and Mobility: Migration and Transportation”
This paper (in progress) discusses the linkages between Big Data and mobility—specifically migration and transportation. Its main objective is to give its readers—World Bank staff, policymakers, researchers, development project managers and other professionals—an overview of the main features and parameters of this nexus, as well as provide examples and discuss key considerations—technical, ethical, institutional, etc.—for developing projects, programs and other activities in the field.