Who Is Visible in Data and Who Is Not

Data is often presented as a neutral reflection of reality. Governments, companies, and institutions rely on data to measure populations, allocate resources, predict risks, and guide decision making. In contemporary digital governance, visibility within data systems increasingly determines who receives attention, services, protection, and opportunity. Scholars in critical data studies have argued that data systems are never fully neutral because they are shaped by institutional assumptions, political priorities, and historical inequalities (Kitchin, 2014).

Yet visibility in data is never evenly distributed.

Some individuals and communities are intensely monitored, categorized, and analyzed, while others remain partially invisible or entirely absent. Data systems do not simply observe society. They actively shape how society is understood and governed. Lyon (2018) explains that modern data infrastructures function as systems of social sorting, where individuals are categorized and evaluated through digital processes that influence life opportunities and institutional treatment.

The question is therefore not only what data shows, but who becomes visible through data and who does not.

Data as a System of Recognition

To be represented in data is, in many contexts, to be recognized by institutions.

Governments rely on data to determine eligibility for welfare, healthcare, education, taxation, and legal identity. Companies use data to identify consumers, target advertisements, evaluate risks, and optimize services. Digital platforms collect behavioral data to personalize interaction and predict future actions. Van Dijck (2014) argues that datafication has transformed social life by converting human activity into quantifiable information that can be monitored and managed.

Visibility within these systems often determines whether individuals are acknowledged as participants within social, economic, and political processes.

However, recognition through data is selective rather than universal.

Data systems prioritize what can be measured, categorized, and standardized. Experiences that are difficult to quantify are frequently ignored. Communities with limited digital access may generate less data, while marginalized populations may appear only through systems of surveillance or risk assessment. Bowker and Star (1999) demonstrate that classification systems are never merely technical tools because they reflect institutional values and shape social inclusion and exclusion.

As a result, visibility is shaped not merely by existence, but by institutional priorities.

The Politics of Absence

Absence from data can produce serious consequences.

Individuals who are not adequately represented in datasets may be excluded from services, policy considerations, or technological development. In many countries, populations lacking formal documentation struggle to access healthcare, banking, education, or social protection because institutional systems depend on data verification. According to the World Bank (2021), an estimated 850 million people globally still lack official identification, limiting their access to public and financial services.

At the same time, invisibility is not always accidental. Some forms of exclusion emerge because systems are designed around dominant populations and assumptions.

Safiya Umoja Noble (2018), in Algorithms of Oppression, demonstrates how digital systems can reproduce existing inequalities by prioritizing dominant representations while marginalizing others. Search engines and algorithmic systems do not simply organize information neutrally. They reflect broader social and economic structures that shape whose experiences become visible and whose do not (Noble, 2018).

This means that absence in data is often connected to broader structures of inequality.

Hypervisibility and Surveillance

While some communities remain invisible, others experience a different condition: hypervisibility.

Marginalized populations are frequently subjected to disproportionate levels of data collection and monitoring. Welfare recipients, migrants, low-income communities, and racial minorities are often more heavily surveilled through administrative systems, predictive analytics, and algorithmic risk assessments. Gilliom (2001) observed that welfare systems often function through intensive monitoring practices that disproportionately target vulnerable populations.

Virginia Eubanks (2018), in Automating Inequality, examines how automated systems in public services disproportionately monitor and discipline poor communities. Rather than creating neutral efficiency, these systems can intensify existing inequalities by transforming vulnerable populations into objects of continuous scrutiny (Eubanks, 2018).

In this context, visibility becomes a form of control rather than empowerment.

Being visible to data systems does not necessarily mean being protected or represented fairly. Visibility can also mean exposure to intervention, suspicion, and restriction. Foucault’s concept of surveillance remains relevant in understanding how visibility can operate as a mechanism of discipline and governance (Foucault, 1977).

Data and Unequal Representation

Data systems frequently reproduce existing social biases because they are built from unequal social realities.

Historical discrimination, unequal access to technology, and institutional bias become embedded within datasets. When algorithms learn from historical data, they may reproduce past inequalities while appearing objective. O’Neil (2016) argues that many algorithmic systems amplify inequality because they inherit patterns from biased social environments while maintaining an appearance of mathematical neutrality.
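The mechanism can be illustrated with a deliberately simplified sketch. It assumes a hypothetical decision history for two made-up groups, "north" and "south", and a toy "model" that only learns each group's past approval rate. Real systems are far more complex, but the point carries over: a rule fitted to unequal past decisions hands the same inequality back as a prediction.

```python
from collections import defaultdict

def fit_rate_model(history):
    """Learn, per group, the historical share of positive decisions."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, decision in history:
        totals[group] += 1
        positives[group] += decision
    return {g: positives[g] / totals[g] for g in totals}

def predict(model, group, threshold=0.5):
    """Approve when the learned historical rate meets the threshold."""
    return model[group] >= threshold

# Hypothetical history (illustrative only): "north" was approved
# 8 times out of 10, "south" 3 times out of 10.
history = (
    [("north", 1)] * 8 + [("north", 0)] * 2
    + [("south", 1)] * 3 + [("south", 0)] * 7
)

model = fit_rate_model(history)
for group in ("north", "south"):
    print(group, model[group], predict(model, group))
```

Nothing in the code refers to merit or need. The different outcomes for the two groups come entirely from the historical record the rule was fitted to.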

Ruha Benjamin (2019) further explains that technological systems often reinforce social hierarchies under the appearance of innovation and efficiency. In Race After Technology, she shows how emerging technologies can deepen racial inequality even when framed as progressive or objective (Benjamin, 2019).

This creates an important paradox.

Data-driven systems are often trusted because they appear scientific and impartial. Yet the categories, assumptions, and measurements used within these systems are shaped by human choices, institutional histories, and political priorities. As Crawford (2021) notes, artificial intelligence systems are ultimately built upon social, political, and economic structures rather than existing independently from them.

Objectivity in data is therefore never entirely separate from power.

Digital Inequality and Data Production

Visibility in data is also shaped by unequal participation in digital environments.

Individuals with reliable internet access, digital literacy, smartphones, and financial connectivity generate large volumes of data through everyday activity. Others may remain only partially represented because they lack stable access to digital infrastructures. Couldry and Mejias (2019) describe the underlying dynamic as “data colonialism,” in which human life increasingly becomes a source of extractable data within digital economies.

This creates forms of digital inequality that extend beyond access alone.

Those who generate more data often receive more personalized services, financial opportunities, and institutional attention. Meanwhile, populations with limited digital traces may become statistically invisible in policy planning and technological development. Van Deursen and Helsper (2015) argue that digital inequality increasingly reflects differences in skills, usage, and outcomes rather than simple internet access alone.

At the same time, digital participation itself is unevenly structured. Platform economies, gig work systems, and social media environments collect extensive behavioral data from users while offering limited transparency regarding how that data is used. Zuboff (2019) characterizes this condition as surveillance capitalism, where behavioral data is extracted and monetized at massive scale.

Data extraction becomes concentrated among populations with the least bargaining power.

The Limits of Quantification

Not all aspects of human life can be meaningfully translated into data.

Experiences such as dignity, vulnerability, fear, trust, exclusion, and social belonging are difficult to quantify fully. Yet institutions increasingly depend on metrics and predictive systems to guide decisions about individuals and communities. Muller (2018) warns that excessive dependence on metrics can distort institutional priorities and oversimplify human realities.

This reliance creates limitations.

Quantitative systems simplify reality in order to make it manageable. Complex social conditions become categories, scores, probabilities, and risk indicators. While this simplification can improve administrative efficiency, it can also erase context and nuance. Amoore (2020) argues that algorithmic systems often reduce uncertainty by converting complex human behavior into manageable forms of prediction.

People become represented through data profiles rather than lived experiences.

This reduction is particularly significant in automated governance systems where decisions are increasingly shaped by predictive models rather than direct human engagement.

Who Controls Visibility

Visibility within data systems is shaped by institutions that determine what data is collected, how categories are defined, and how information is interpreted.

Governments establish census categories and administrative classifications. Technology companies design platforms that influence behavioral tracking. Financial systems determine what counts as measurable economic activity. Artificial intelligence systems prioritize specific forms of data while excluding others. Beer (2016) explains that metrics and data infrastructures increasingly shape institutional authority and social organization.

These decisions are not merely technical.

They reflect political and economic priorities regarding whose lives are considered important, measurable, or profitable.

Data governance therefore becomes inseparable from questions of accountability and justice.

Who decides what is visible? Who benefits from visibility? Who bears the risks of exposure? And who remains excluded from recognition altogether?

These are fundamentally political questions.

A Data Justice Perspective

A data justice perspective shifts attention from technical performance alone toward broader questions of fairness, representation, and power.

Taylor (2017) argues that data justice concerns the ways people are made visible, represented, and treated through digital systems. Three dimensions are useful for examining this: representation, distribution, and governance.

Representation concerns whose experiences are included within datasets and whose are excluded.

Distribution examines how the benefits and harms of data systems are allocated across different populations.

Governance focuses on who controls the infrastructures, institutions, and rules that shape data collection and use.

From this perspective, the problem is not simply inaccurate data. The deeper issue is unequal visibility within systems that increasingly influence access to rights, resources, and opportunities.

Data justice therefore requires more than improving algorithms. It requires examining the social structures embedded within technological systems themselves.

Toward More Inclusive Data Systems

Creating more equitable data systems requires recognizing the limitations of existing approaches.

At the institutional level, greater transparency is needed regarding how data is collected, categorized, and used in decision making. Communities affected by data-driven governance should have opportunities to participate in shaping these systems. The European Commission’s High-Level Expert Group on Artificial Intelligence (2019) emphasizes transparency, accountability, and human oversight as central principles of trustworthy AI governance.

At the technical level, datasets and models must be evaluated for bias, exclusion, and unequal outcomes rather than treated as automatically objective.
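As one possible starting point, an evaluation of this kind might compare each group's share of a dataset with its share of a reference population, and then compare outcome rates across groups. The sketch below is a minimal illustration, not an established auditing method. The field names (`group`, `approved`), the toy records, and the reference shares are all assumptions made for the example.

```python
from collections import Counter

def representation_gap(records, population_shares, group_key="group"):
    """Return dataset share minus reference population share for each group."""
    counts = Counter(r[group_key] for r in records)
    total = sum(counts.values())
    return {
        g: counts.get(g, 0) / total - share
        for g, share in population_shares.items()
    }

def positive_rates(records, group_key="group", outcome_key="approved"):
    """Return the fraction of positive outcomes within each group."""
    totals, positives = Counter(), Counter()
    for r in records:
        totals[r[group_key]] += 1
        positives[r[group_key]] += int(r[outcome_key])
    return {g: positives[g] / totals[g] for g in totals}

# Hypothetical toy data: 8 records from group "A", 2 from group "B".
records = (
    [{"group": "A", "approved": True}] * 6
    + [{"group": "A", "approved": False}] * 2
    + [{"group": "B", "approved": True}] * 1
    + [{"group": "B", "approved": False}] * 1
)
# Assumed reference shares: each group is half of the real population.
population_shares = {"A": 0.5, "B": 0.5}

gaps = representation_gap(records, population_shares)
rates = positive_rates(records)
print("representation gap:", gaps)
print("positive outcome rate:", rates)
```

In this toy case group "B" is underrepresented relative to the assumed population and also receives positive outcomes at a lower rate. Numbers like these cannot say why a gap exists. They only indicate where closer, contextual examination is needed.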

At the societal level, policymakers and institutions must acknowledge that visibility itself is unevenly distributed. Some communities require greater recognition within public systems, while others require stronger protections from excessive surveillance.

Equity cannot emerge from data systems that reproduce unequal visibility.

Conclusion

Data increasingly shapes how societies recognize individuals and govern populations.

Yet visibility within data systems is deeply unequal. Some groups remain absent from institutional recognition, while others become hypervisible through surveillance and risk management. Data does not merely describe social reality. It actively participates in producing it.

The question of who is visible in data and who is not is therefore a question about power.

Understanding this inequality is essential for building systems that are not only technologically efficient, but socially accountable and just.

In the digital age, visibility determines more than representation. It shapes access, protection, opportunity, and belonging itself.

References

Amoore, L. (2020). Cloud Ethics: Algorithms and the Attributes of Ourselves and Others. Duke University Press.

Beer, D. (2016). Metric Power. Palgrave Macmillan.

Benjamin, R. (2019). Race After Technology: Abolitionist Tools for the New Jim Code. Polity Press.

Bowker, G. C., & Star, S. L. (1999). Sorting Things Out: Classification and Its Consequences. MIT Press.

Couldry, N., & Mejias, U. A. (2019). The Costs of Connection: How Data Is Colonizing Human Life and Appropriating It for Capitalism. Stanford University Press.

Crawford, K. (2021). Atlas of AI. Yale University Press.

Eubanks, V. (2018). Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin’s Press.

European Commission High-Level Expert Group on AI. (2019). Ethics Guidelines for Trustworthy AI.

Foucault, M. (1977). Discipline and Punish: The Birth of the Prison. Pantheon Books.

Gilliom, J. (2001). Overseers of the Poor: Surveillance, Resistance, and the Limits of Privacy. University of Chicago Press.

Kitchin, R. (2014). The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. Sage.

Lyon, D. (2018). The Culture of Surveillance: Watching as a Way of Life. Polity Press.

Muller, J. Z. (2018). The Tyranny of Metrics. Princeton University Press.

Noble, S. U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.

O’Neil, C. (2016). Weapons of Math Destruction. Crown Publishing.

Taylor, L. (2017). “What Is Data Justice? The Case for Connecting Digital Rights and Freedoms Globally.” Big Data & Society, 4(2).

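Van Deursen, A., & Helsper, E. (2015). “The Third-Level Digital Divide: Who Benefits Most from Being Online?” Communication and Information Technologies Annual (Studies in Media and Communications, Vol. 10), 29-52. Emerald Group Publishing.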

Van Dijck, J. (2014). “Datafication, Dataism and Dataveillance.” Surveillance & Society, 12(2), 197-208.

World Bank. (2021). Identification for Development (ID4D) Global Dataset.

Zuboff, S. (2019). The Age of Surveillance Capitalism. PublicAffairs.
