Getting Started
Computers are often seen as objective and logical tools, making decisions based purely on data. However, since computer systems are designed by humans and often learn from data created by human society, they can inherit, reflect, and even amplify existing human prejudices. This chapter explores how this "computing bias" arises and the significant ethical and societal impacts it can have.
What You Should Be Able to Do
Define computing bias and its relationship to human bias.
Explain the two primary ways bias is introduced into a computer system: through data and through algorithm design.
Analyze the societal and ethical impacts of a biased computing innovation.
Identify examples of bias in real-world applications like facial recognition, targeted advertising, or criminal justice software.
Key Concepts & Application
The Core Idea
In everyday language, a bias is a prejudice in favor of or against one thing, person, or group compared with another, usually in a way considered to be unfair. In computing, this same concept applies. A system is biased when it systematically and unfairly discriminates against certain individuals or groups in favor of others. This isn't a bug or a random error; it is a repeatable, predictable prejudice embedded within the system's logic.
Imagine a company that historically only hired graduates from a single university. If you were to build an algorithm—a finite set of instructions used to perform a task—to screen new resumes, you might train it on the data of all past successful employees. The algorithm would quickly "learn" that a key feature of a good employee is having attended that specific university. It would then start automatically rejecting qualified candidates from all other schools, not because of malice, but because the data it was trained on contained a hidden, historical bias. Computing bias works the same way, but on a massive, automated scale.
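The hiring scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `past_hires`, `learn_preferred_universities`, and `screen`, and the schools "State U" and "City College", are hypothetical, and the "training" step is a deliberately naive stand-in for a real learning algorithm.

```python
# Hypothetical historical hires: every past "successful" employee
# attended the same school, so the data carries a hidden bias.
past_hires = [
    {"name": "A", "university": "State U"},
    {"name": "B", "university": "State U"},
    {"name": "C", "university": "State U"},
]


def learn_preferred_universities(hires):
    """Naive 'training': remember every university seen among past hires."""
    return {hire["university"] for hire in hires}


def screen(candidate, preferred):
    """Accept only candidates whose university appeared in the training data."""
    return candidate["university"] in preferred


preferred = learn_preferred_universities(past_hires)

# A strong candidate from a school that never appears in the history.
qualified_outsider = {
    "name": "D",
    "university": "City College",
    "years_experience": 10,
}

print(screen(qualified_outsider, preferred))  # False: rejected despite experience
```

Nothing in the code mentions prejudice, yet the candidate's experience is never consulted: the rule was learned entirely from who was hired before.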
Logic & Application
Bias can enter a system in two main places: the data used to teach the system or the rules the human programmer designed for it.
Sources of Computing Bias
| Source of Bias | Explanation & Example |
| --- | --- |
| Biased Data Sets | The data used to train a system contains prejudices. If a facial recognition system is trained primarily on images of light-skinned faces, it will be less accurate at identifying dark-skinned faces. The system isn't "racist," but it reflects the bias present in its training data. |
| Biased Algorithm Design | The rules of the algorithm itself contain bias. A programmer might create a rule in a loan application system that lowers the eligibility score of applicants from certain zip codes, based on a personal (and unfair) assumption that those areas are less financially stable. This bias is explicitly coded into the system's logic. |
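One simple way to look for the first source of bias is to measure how much of a data set each group makes up. The sketch below assumes a hypothetical training set whose images are tagged "lighter" or "darker"; the function name `group_shares` and the 0.3 cutoff are illustrative choices, not an established standard.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the data set as a fraction between 0 and 1."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}


# Hypothetical training set: 90 images tagged "lighter", 10 tagged "darker".
training_labels = ["lighter"] * 90 + ["darker"] * 10

shares = group_shares(training_labels)
print(shares)  # {'lighter': 0.9, 'darker': 0.1}

# Flag any group below an (assumed) minimum share of 30%.
underrepresented = [group for group, share in shares.items() if share < 0.3]
print(underrepresented)  # ['darker']
```

A check like this does not prove a system is fair, but a heavily skewed result is a warning that accuracy may differ between groups.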
Annotated Pseudocode Example: Biased Loan Approval
This pseudocode shows a simplified loan approval process where bias is introduced through the algorithm's design.
PROCEDURE checkLoanEligibility(applicant)
{
    // Start with a base score
    score <- 500

    // Add points for a high income
    IF (applicant.income > 75000)
    {
        score <- score + 100
    }

    // This is the biased rule. It penalizes applicants from a
    // specific zip code, regardless of their individual merit.
    IF (applicant.zipCode = 90010)
    {
        score <- score - 150
    }

    // Check if the final score meets the approval threshold
    IF (score >= 550)
    {
        DISPLAY("Loan Approved")
    }
    ELSE
    {
        DISPLAY("Loan Denied")
    }
}
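For readers who want to run the procedure, here is one possible Python translation. It is a sketch rather than an official version: the function name `check_loan_eligibility` and the choice to return the score together with the decision (instead of displaying a message) are assumptions made so the result is easy to inspect.

```python
def check_loan_eligibility(income, zip_code):
    """Return (score, approved) using the same rules as the pseudocode."""
    # Start with a base score
    score = 500

    # Add points for a high income
    if income > 75000:
        score += 100

    # The biased rule: penalize one zip code regardless of individual merit
    if zip_code == 90010:
        score -= 150

    # Approve only if the final score meets the threshold
    approved = score >= 550
    return score, approved


print(check_loan_eligibility(80000, 12345))  # (600, True)
print(check_loan_eligibility(80000, 90010))  # (450, False)
```

The two calls differ only in the zip code, which makes the effect of the biased rule easy to see.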
Tracing & Analysis
Logic Trace
Let's trace the checkLoanEligibility procedure for two different applicants.
Applicant A: income = 80000, zipCode = 12345
- score starts at 500.
- income is greater than 75000, so score becomes 500 + 100 = 600.
- zipCode is not 90010, so the biased rule is skipped.
- Final score is 600, which is >= 550.
- Result: "Loan Approved"
Applicant B: income = 80000, zipCode = 90010
- score starts at 500.
- income is greater than 75000, so score becomes 500 + 100 = 600.
- zipCode is 90010, so the biased rule is applied. score becomes 600 - 150 = 450.
- Final score is 450, which is not >= 550.
- Result: "Loan Denied"
Societal Impact
The trace shows that two equally qualified applicants receive different outcomes based on a factor—their zip code—that may correlate with race or socioeconomic status but does not determine their ability to repay a loan. When deployed at scale, such a system can create or reinforce economic inequality, making it harder for people in entire communities to secure loans, buy homes, and build wealth. This is a harmful effect that goes far beyond a single "unfair" decision.
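To see the effect at scale, one option is to compare approval rates across zip codes for applicants with matching incomes. The sketch below is a rough audit under stated assumptions: the applicant list is invented for illustration, and `is_approved` and `approval_rate_by_zip` are hypothetical helper names that reuse the scoring rules from the pseudocode.

```python
def is_approved(income, zip_code):
    """Apply the scoring rules from the pseudocode and return the decision."""
    score = 500
    if income > 75000:
        score += 100
    if zip_code == 90010:  # the biased rule
        score -= 150
    return score >= 550


def approval_rate_by_zip(applicants):
    """Return the fraction of applicants approved in each zip code."""
    totals = {}
    approvals = {}
    for income, zip_code in applicants:
        totals[zip_code] = totals.get(zip_code, 0) + 1
        if is_approved(income, zip_code):
            approvals[zip_code] = approvals.get(zip_code, 0) + 1
    return {z: approvals.get(z, 0) / totals[z] for z in totals}


# Invented applicants: the same three incomes appear in both zip codes.
applicants = [
    (80000, 12345), (60000, 12345), (90000, 12345),
    (80000, 90010), (60000, 90010), (90000, 90010),
]

rates = approval_rate_by_zip(applicants)
print(rates)  # about 0.67 for zip 12345, 0.0 for zip 90010
```

Because the incomes are identical in both groups, the gap in approval rates can only come from the zip code rule.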
Core Concepts & Terminology
Bias: A prejudice in favor of or against one thing, person, or group compared with another, usually in a way considered to be unfair.
Algorithm: A finite set of instructions that can be performed by a computer to accomplish a task. Algorithms are the building blocks of all software.
Biased Data Sets: When the data used to train a system does not accurately represent the world or reflects existing human prejudices, the system will learn and perpetuate those biases.
Biased Algorithm Design: When the explicit rules or logic of an algorithm are written in a way that produces unfair or prejudicial outcomes for certain groups.
Selection Logic: The part of an algorithm that makes a decision. Biased rules are often implemented using selection statements.
IF (condition)
{
    // Code to run if condition is true
}
ELSE
{
    // Code to run if condition is false
}
This structure is where a programmer could insert a rule like IF (applicant.gender = "female"), creating an explicitly biased system.
Core Skill Check
Logic Tracing: An algorithm gives a hiring bonus point if yearsOfExperience > 5 and subtracts a point if attendedCommunityCollege = true. Assuming the score starts at 0, what is the final score for a candidate with 6 years of experience who attended community college?
Bias Identification: A system that recommends medical treatments was trained using data from only one hospital. What is a potential source of bias in this system?
Application: Describe a real-world example of how targeted advertising could exhibit bias and cause a negative societal impact.
Common Misconceptions & Clarifications
"Computers are objective and can't be biased."
- Clarification: Computers are tools created by people and trained on data from the world. They reflect the biases present in their creators and their data sources.
"Bias is always intentional."
- Clarification: Bias is often unintentional. It can emerge from unconscious assumptions made by developers or from using historical data that contains systemic, unexamined prejudices.
"More data will always fix a biased system."
- Clarification: Not if the new data is also biased. Adding more data that underrepresents a certain group will only make the system more confident in its biased conclusions. The key is better, more representative data.
"Bias is just a technical problem with a technical solution."
- Clarification: While technical fixes can help, computing bias is fundamentally a social and ethical issue. It requires diverse teams, ethical reviews, and a focus on fairness to address properly.
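The "more data" misconception above can be illustrated with simple arithmetic. In this sketch the group sizes are invented and `minority_share` is a hypothetical helper; it shows only that collecting more data with the same skew leaves the underrepresented group's share unchanged.

```python
def minority_share(majority_count, minority_count):
    """Return the underrepresented group's fraction of the whole data set."""
    return minority_count / (majority_count + minority_count)


# Original data set: 900 examples from one group, 100 from another.
before = minority_share(900, 100)

# Ten times more data collected the same way keeps the same skew.
after_more_data = minority_share(9000, 1000)

# A smaller but more representative data set changes the balance.
after_better_data = minority_share(900, 900)

print(before, after_more_data, after_better_data)  # 0.1 0.1 0.5
```

Only the third data set changes what the system sees, which is why representativeness matters more than volume.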
Summary
Computing systems are not inherently neutral. They can contain biases that lead to unfair and harmful outcomes for specific groups of people. This bias is not a random error; it is a systemic prejudice introduced either through biased data used to train the system or through biased rules in the algorithm's design. Recognizing, analyzing, and working to mitigate these biases is a critical responsibility in computer science, ensuring that technology promotes fairness rather than reinforcing existing societal inequalities. Examples in facial recognition, criminal justice, and advertising show that the consequences of biased systems are significant and real.