Resources & Methodologies
The Big Data available in the digital ecosystem comprise a volume of information that exceeds the human ability to capture, curate, manage, and process it within a tolerable span of time. There are three types of Big Data: unstructured (usually human-generated content on the internet, such as text, pictures, and audio); semi-structured (data generated by the interaction between a person and a machine, for example tags in a website or database); and structured (clean data in a database, usually the result of an automated process).
For the purpose of tracking SDGs and measuring specific Indicators, the main focus is on unstructured data, as production of this type of data is increasing in developing countries. Although the volume of Big Data is rising, analyzing it requires a set of techniques and technologies that can reveal insights from low-volume signals hidden within diverse, complex, massive-scale datasets.
The main data sources for working with Big Data are:
Social networking platforms, which provide semi-public profiles of users and can be used for data collection and analysis.
Open projects like Wikipedia and OpenStreetMap, which provide a valuable source of data for analysis.
Call records and financial transactions stored in private databases, which can be a valuable source of information about users.
Locations can be traced by means of the data collected via GPS and mobile phone usage.
Digital traces left by web navigation, such as IP addresses and cookies, which can be used for tracking users’ behavior.
For more resources and methodologies for using Big Data in development, refer to the Data-Pop Alliance Toolkit.
The main methodologies for working with Big Data are listed below:
An operation that consists of comparing experimental and control data sets. It is common in research and can be performed by means of machine learning and natural language processing.
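As a rough illustration of comparing an experimental data set with a control data set, the sketch below runs a simple permutation test on the difference in means. This is only one possible way to make such a comparison and is not taken from the source. The function name `permutation_test`, its parameters, and the sample numbers are hypothetical and chosen purely for illustration.

```python
import random
from statistics import mean


def permutation_test(control, experimental, n_permutations=10_000, seed=0):
    """Compare two groups; return (difference in means, two-sided p-value).

    Hypothetical helper for illustration only. It asks how often a random
    relabelling of the pooled observations yields a difference in means at
    least as large as the one actually observed.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = mean(experimental) - mean(control)

    pooled = list(control) + list(experimental)
    n_control = len(control)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign observations to groups
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Made-up measurements, not real data.
    control_group = [12, 15, 11, 14, 13, 12]
    experimental_group = [16, 18, 15, 17, 19, 16]
    difference, p = permutation_test(control_group, experimental_group)
    print(f"difference in means: {difference:.2f}, p-value: {p:.4f}")
```

A small p-value suggests that the difference between the two groups is unlikely to come from chance alone. In practice, the values being compared might themselves be produced by machine learning or natural language processing, for example sentiment scores extracted from text.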
Developed and delivered in partnership with the UNSSC Knowledge Centre for Sustainable Development, as a part of its SD Talks Special Series initiative, this webinar series aims to examine the critical role that data can play in achieving sustainable development.