Big Data and Disability, Part 1

Gabriel Pestre Blog


This is the first in a series of blog posts on our ongoing work exploring the applications and implications of Big Data and disability. This post serves as an initial scoping of how Big Data can contribute to various research areas related to disability. It will form the basis of a White Paper (forthcoming) exploring the ability of the Big Data ecosystem to monitor and understand the state of research about persons with disabilities and their environments.

A large number of persons with disabilities face barriers to actively participating in society: they are denied their rights to “be included in the general school system, to be employed, to live independently in the community, to move freely, to vote, to participate in sport and cultural activities, to enjoy social protection, to access justice, to choose medical treatment and to enter freely into legal commitments such as buying and selling property,” according to the UN Human Rights Office of the High Commissioner. They are commonly labeled “invisible” and are continually sidelined in their communities.

In the past ten years, there has been a revolutionary and global shift in the approach by UN member states to ensure that persons with disabilities receive and enjoy the same rights, equality, and dignity as everyone else. The United Nations Convention on the Rights of Persons with Disabilities of 2008 called for the promotion, protection, and enjoyment of all human rights and freedoms by all persons with disabilities, and for respect for their inherent dignity. Yet persons with disabilities are continually denied their rights to education, employment, healthcare, and accessible sanitation facilities. In order to effectively protect and promote the rights of this marginalized group, a knowledge base of data concerning the situations of persons with disabilities must be built; currently, however, there is a lack of data and information on the intersection of persons with disabilities and their environments.

“Big Data” has shown real value and potential as a new data source for understanding the situation of persons with disabilities. We conceptualize Big Data not just as large datasets, but as a new socio-technological phenomenon resulting from the emergence and development of an ecosystem, characterized by the union of 3 Cs: Big Data “Crumbs” about human behaviors and beliefs; the “Capacities” of digital devices, ever more powerful computing, and analytics tools to collect, aggregate, and analyze data; and a vibrant “Community” of actors involved in generating, governing, and using data.

Our work on Big Data and Disability takes root in the following questions:

  1. What sorts of data exist about persons with disabilities and their environments?
  2. In what ways can that data be used (and how has it been used so far) to understand the situation of persons with disabilities?
  3. What are some key areas in which data is lacking, and what kind of interventions might close those gaps?
  4. What are the advantages of using Big Data to understand disability?

The Categories of (Big) Data on Disability

The table below summarizes different categories and types of data, provides examples, and discusses possible opportunities for using (big) data on disability. In general, digital content (category 2) is the category of data that lends itself most readily to studying disability, because information about disability status or accessibility can be more directly/explicitly linked to each record. In the case of exhaust data (category 1), data records are linked to a user profile that does not typically include information on disability status – for instance, financial transactions are associated with an account number, which says nothing about whether the account holder has a disability or not. Opportunities in this area are typically linked to accessibility, but the fact that a service is accessible or has accessible options doesn’t necessarily reveal anything about the actual users. The exception is exhaust data from services that are specifically intended for use by Persons with Disabilities (PwD), such as GPS data from access-a-ride vehicles, or Call Detail Records from TTY/TDD services. Similarly, sensing data (category 3) has few use cases because it is rare to find data in this category that reveals something about disability/accessibility status. Thus, sensing data cannot usually be disaggregated by disability status without combining it with data from other sources/categories.
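
The point about disaggregation can be made concrete with a minimal sketch. All of the data here is hypothetical: `trips` stands in for exhaust data keyed only by an account ID, and `survey` stands in for a second, opt-in source that records disability status for some of those accounts. Only by combining the two can the trip records be grouped by status, and accounts missing from the second source stay in an "unknown" group.

```python
from collections import defaultdict

# Hypothetical exhaust data: each record carries an account ID but
# nothing about the account holder's disability status.
trips = [
    {"account_id": "A1", "trip_minutes": 30},
    {"account_id": "A2", "trip_minutes": 45},
    {"account_id": "A1", "trip_minutes": 20},
    {"account_id": "A3", "trip_minutes": 10},
]

# Hypothetical second source (for example, an opt-in survey) mapping
# some account IDs to a self-reported disability status.
survey = {"A1": True, "A2": False}


def mean_minutes_by_status(trips, survey):
    """Average trip length per disability status (None = status unknown)."""
    totals = defaultdict(lambda: [0, 0])  # status -> [sum of minutes, count]
    for trip in trips:
        status = survey.get(trip["account_id"])  # None if not in the survey
        totals[status][0] += trip["trip_minutes"]
        totals[status][1] += 1
    return {status: total / count for status, (total, count) in totals.items()}


result = mean_minutes_by_status(trips, survey)
```

In this toy case the trips from account `A3` cannot be attributed to either group, which mirrors the limitation described above: without the second source, every trip would fall into the unknown group.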

Table 1. Taxonomy and Examples of (Big) Data Sources Used Across Disability Research
| Types | Examples | Opportunities |
|---|---|---|
| Category 1: Exhaust data | | |
| Mobile-based | Call Details Records (CDRs); GPS (Fleet tracking, Bus AVL) | |
| Financial transactions | Electronic ID; E-licenses (e.g. insurance); Transportation cards (including airplane fidelity cards); Credit/debit cards; Online marketplaces (Airbnb, TaskRabbit, UberRush) | Using transaction data to compare cost, availability, and use of services that offer accessible options (for example, accessible Airbnb listings). |
| Transportation | GPS (Fleet tracking, Bus AVL); EZ passes; E-hailing and rideshare services (Uber, etc) | Using fleet tracking data from access-a-ride services to study how PwD move around a city and evaluate coverage of the transit network. |
| Online traces | Cookies; IP addresses | Using browser tracking to study how PwD interact with digital content (for example, with e-books). |
| Category 2: Digital Content | | |
| Social media | Tweets (Twitter API); Check-ins (Foursquare); Facebook content; YouTube videos | Using social media data to represent PwD as a network of interactions (for example, using certain Twitter hashtags). |
| Crowd-sourced/online content | Mapping (Open Street Map, Google Maps, Yelp); Monitoring/Reporting (uReport) | Using crowd-sourcing to map the locations of accessible businesses and public places. |
| Category 3: Sensing data | | |
| Physical | Smart meters; Speed/weight trackers; USGS seismometers | |
| Remote | Satellite imagery (NASA TRMM, LandSat); Unmanned Aerial Vehicles (UAVs) | |

Four Functions of (Big) Data on Disability

Below we propose a possible taxonomy for discussing the functions of (big) data in studying problems and proposing solutions to issues stemming from disability. What these different functions actually mean in context will depend on what sort of data we are working with.

To illustrate these functions in context, here we use the example of data on location of businesses that are accessible to people with disabilities affecting mobility.

  1. Descriptive: describing and representing the collected information -- for example, representing the locations of accessible businesses in a city, or using apps or social media to collect information on the measures that businesses have taken to ensure accessibility.
  2. Predictive: making inferences based on collected information (such as forecasting) -- for example, showing trends in the growth of the number of accessible businesses in certain parts of the city.
  3. Prescriptive (or diagnostic): going beyond description and inference to establish causal relations and make recommendations on that basis -- for example, showing that the addition of a school or hospital increases the number of accessible businesses in a neighborhood (if such is the case).
  4. Discursive (or engagement): spurring and shaping dialogue within and between communities and with key stakeholders through communication of data -- for example, using data on accessible business locations in public discourse about the needs and resources of persons with disabilities, and what steps society can take to achieve goals that have been set forth.
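
As a rough illustration of the first two functions, the sketch below works through the accessible-business example on made-up data. Everything in it is an assumption for illustration: the `businesses` records, the `yearly_counts` figures, and the choice of a simple least-squares line as the forecasting method. The descriptive step counts accessible businesses per neighborhood; the predictive step fits a linear trend to yearly counts and extrapolates one year ahead.

```python
from collections import Counter

# Hypothetical records of businesses and whether they are accessible.
businesses = [
    {"name": "Cafe A", "neighborhood": "North", "accessible": True},
    {"name": "Shop B", "neighborhood": "North", "accessible": False},
    {"name": "Gym C", "neighborhood": "South", "accessible": True},
    {"name": "Bank D", "neighborhood": "North", "accessible": True},
]

# Hypothetical yearly counts of accessible businesses in one neighborhood.
yearly_counts = {2019: 10, 2020: 12, 2021: 14, 2022: 16}


def accessible_by_neighborhood(businesses):
    """Descriptive: count accessible businesses in each neighborhood."""
    return Counter(b["neighborhood"] for b in businesses if b["accessible"])


def linear_trend(xs, ys):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def forecast(yearly_counts, year):
    """Predictive: extrapolate the fitted linear trend to a given year."""
    slope, intercept = linear_trend(list(yearly_counts), list(yearly_counts.values()))
    return slope * year + intercept


counts = accessible_by_neighborhood(businesses)
projected_2023 = forecast(yearly_counts, 2023)
```

A straight-line fit is only one of many possible forecasting choices, and neither step says anything about why the counts change; that causal question belongs to the prescriptive function.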

Seven Areas of Research on (Big) Data and Disability

Given the broad range of definitions and types of disability, as well as the numerous functions that traditional data and Big Data can have (as described above), many interesting research questions emerge at the intersection of (Big) Data and Disability. The uses of such data -- and potential research questions -- range from descriptive applications to discursive applications, depending on the topic area. Indeed, data can be used in a variety of ways to identify and study issues and propose actions and solutions to some of the challenges faced by persons with disabilities. This document pulls together some articles and thoughts on those opportunities and challenges, and is meant as a starting point for identifying promising uses of data and potentially useful datasets. In looking at the types of data that exist and the areas in which data can be used regarding disability, several key themes emerge:

  1. Voting & Representation: Beyond the obvious descriptive uses of data (how are PwD distributed geographically within the electorate, are there trends in how and where they vote, etc.), we can identify some key uses that are more discursive (does anything in the data, or lack of data, point to under-representation or disenfranchisement of PwD among voters).
  2. Employment: Descriptive uses of data include mapping the availability and location of employment opportunities or workplaces that are accessible/open/useful to PwD; or monitoring compliance with legislation or policies on hiring practices. Predictive and prescriptive uses, which include studying the causes and consequences of trends in employment of PwD, can help evaluate existing policies and shape new ones. This area is particularly interesting for comparisons across countries.
  3. Community & Social Media: Descriptive uses of data can help study PwD as a network, in relation to their peers, thought leaders, political representatives, etc. Social media can also be used as a tool for people who are receiving medical treatment to be in contact with physicians and other patients, in order to receive proper support and follow-up care, etc.
  4. Accessibility: Descriptive uses of data include crowd-sourcing and mapping locations of public places, businesses, lodging, and transportation that are accessible to PwD. On the discursive side, data on compliance with accessibility standards can be used as a tool to promote more inclusive economies, cities, and societies.
  5. National and International Programs: Uses of data include evaluating where data does and doesn't exist, comparing countries based on their implementation of national and global targets, and setting new global targets or creating metrics to track their implementation.
  6. Education: There are a variety of descriptive uses of data, in particular measuring the proportion of children with disabilities who are included in the education system (either in specially designed programs or integrated into other programs), observing what opportunities exist for them to receive education, and identifying gaps and issues in the education system. Data can also be used for creating education material, such as using innovative approaches to digitize books or studying the effectiveness of various teaching tools and methods.
  7. Awareness & Advocacy: The topics covered above each present ways to raise awareness about certain causes, in particular through their discursive functions. More generally, tools (such as standardized metrics and visualization techniques) can be used to collect and communicate data across all the fields described above.

Table 2. Examples of the Functions of (Big) Data in Various Areas of Research
| Area | Descriptive | Predictive | Prescriptive | Discursive |
|---|---|---|---|---|
| Voting & Representation | Using location data to map accessible polling stations. | Using polling data to compare voting patterns among PwD versus other voters. | Using location and polling data to demonstrate areas where changes are needed or impact is most effective. | Using data to show that PwD may be underrepresented because of inaccessible polling stations. |
| Employment | Using location data to map workplaces that are accessible to PwD. | Using job data to find trends in employment of PwD. | Using data to study the impact of potential legislation changes on rates of employment of PwD. | Using data to show which countries meet international targets on employment opportunity for PwD. |
| Community & Social Media | Using social media data to represent PwD as a network of interactions. | Using data on hashtags and interactions to study how ideas and movements regarding disability are spread on social media. | Using social networks as a tool to keep people receiving treatment/follow-up in contact with physicians and other patients to improve healing. | Using social media as a tool to communicate data about disability and promote awareness and advocacy. |
| Accessibility | Using crowd-sourcing to map the locations of accessible businesses and public places. | Using data on participation in sharing economies to identify and measure losses caused by businesses and marketplaces that aren’t accessible. | Using data to demonstrate the impact of increased accessibility on participation and inclusion, cost, economic gain, etc. | Using data on compliance with accessibility standards as a tool to promote more inclusive economies, cities, and societies. |
| National & International Programs | Using data to evaluate where data does and doesn't exist on meeting national and international targets on disability. | Using existing data to define and build new metrics for measuring implementation of targets on disability across countries. | Using data to study the impact of countries’ programs and legislation for meeting international targets on disability. | Using data to show which countries are making progress towards meeting their targets. |
| Education | Using data on inclusion of students with disabilities in the education system to map participation and progress. | Using digitization and processing techniques like optical character recognition and captchas to create educational material. | Using analytics data from educational platforms to evaluate and promote effective learning techniques. | Using data to raise awareness and increase understanding of disability by the general public. |
| Awareness & Advocacy | Using visualization tools to communicate data about challenges and opportunities. | Using analytical tools to identify and communicate trends in (big) data on disability. | Using data to show the impact of various programs, actions, legislation, etc. in the area of (big) data and disability. | Using data as a tool to communicate and solve problems and build on existing opportunities in the area of (big) data and disability. |