AP Computer Science A Practice Quiz: Ethical and Social Issues Around Data Collection
Written by AP Content Team, Verified for 2026 AP Exams, Last updated: May 2026
Test your understanding with short quizzes. This quiz has 16 questions to check your progress.
A) The data quality might be poor, containing typos.
B) A breach of the system could expose users' private information.
C) The dataset might be inappropriate for answering other questions.
D) The data could be used to create algorithmic bias.
Correct Answer: B
Content point 1 explains the 'risks to privacy from collecting and storing personal data on computer systems.' A security breach that exposes personal information is a primary example of such a risk.
A) To sell the data to the highest bidder to fund the application.
B) To collect as much data as possible for potential future use.
C) To attempt to safeguard the personal privacy of the user.
D) To ensure the data is only used to answer one specific question.
Correct Answer: C
Content point 4 explicitly states, 'When developing new programs, programmers should attempt to safeguard the personal privacy of the user.'
A) A data privacy breach.
B) A data quality issue.
C) Algorithmic bias.
D) An inappropriate dataset selection.
Correct Answer: C
Content point 5 defines algorithmic bias as 'systemic and repeated errors in a program that create unfair outcomes for a specific group of users,' which matches the scenario described.
A) The dataset contains personal information, creating a privacy risk.
B) The dataset likely has poor data quality and missing values.
C) The contents of the dataset are not related to the question being asked.
D) The dataset could introduce algorithmic bias into the car model analysis.
Correct Answer: C
Content point 6 explains that the 'contents of a data set might be related to a specific question or topic and might not be appropriate to give correct answers or extrapolate information for a different question or topic.' Rainfall data is unrelated to car sales.
A) High-quality data is always free from algorithmic bias.
B) Poor data quality can lead to inaccurate conclusions or solutions.
C) All datasets are created to answer one specific question.
D) Recognizing data quality is the programmer's only privacy safeguard.
Correct Answer: B
Content point 2 emphasizes the 'importance of recognizing data quality and potential issues when using a data set.' Analysis built on poor-quality data (for example, data with typos or missing values) can produce flawed results and inaccurate conclusions.
A) A dataset of property tax records for all homeowners in the city.
B) A dataset showing the locations of all existing businesses.
C) A dataset containing census information on the number of households with children by neighborhood.
D) A dataset of traffic flow patterns on major city roads.
Correct Answer: C
Content point 3 discusses the need to 'identify an appropriate data set to use in order to solve a problem or answer a specific question.' The census data directly addresses where children live, which is most relevant to the council's question.
A) It is a one-time error that affects a single user.
B) It is caused by users providing incorrect information.
C) It involves systemic and repeated errors creating unfair outcomes.
D) It only occurs when personal data is stored on insecure systems.
Correct Answer: C
Content point 5 explicitly defines algorithmic bias as 'systemic and repeated errors in a program that create unfair outcomes for a specific group of users.'
A) The data might be used to answer a question it wasn't intended for.
B) The records may contain typos, representing a data quality issue.
C) Unauthorized access to the server could expose sensitive patient information.
D) An algorithm analyzing the data might unfairly prioritize certain patients.
Correct Answer: C
Content point 1 explains the 'risks to privacy from collecting and storing personal data on computer systems.' Unauthorized access to a centralized server of sensitive data is a key example of this risk.
A) A necessary privacy safeguard implemented by the programmer.
B) A potential data quality issue that could affect analysis.
C) An appropriate dataset for determining customer income levels.
D) Algorithmic bias against a specific age group.
Correct Answer: B
Content point 2 highlights the importance of recognizing 'potential issues when using a data set.' A large amount of missing data is a significant quality issue that can skew the results of any analysis performed on it.
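To make the missing-data point concrete, here is a minimal Java sketch of how a programmer might measure how much of a field is missing before trusting an analysis of it. The class name, method names, and sample ages are all hypothetical and are not taken from the quiz; null is assumed to mark a skipped entry.

```java
// Hypothetical example: names and sample data are illustrative only.
class MissingDataCheck {

    // Returns the fraction of entries that are missing (null), from 0.0 to 1.0.
    static double missingFraction(Integer[] values) {
        if (values.length == 0) {
            return 0.0; // an empty field has nothing missing
        }
        int missing = 0;
        for (Integer v : values) {
            if (v == null) {
                missing++;
            }
        }
        return (double) missing / values.length;
    }

    // Averages only the values that are present; returns NaN if none are.
    static double averageOfPresent(Integer[] values) {
        int sum = 0;
        int count = 0;
        for (Integer v : values) {
            if (v != null) {
                sum += v;
                count++;
            }
        }
        return count == 0 ? Double.NaN : (double) sum / count;
    }

    public static void main(String[] args) {
        // Assumed sample: customer ages, with null marking a skipped field.
        Integer[] ages = {34, null, 51, null, null, 27, null, 45};
        System.out.println("Fraction missing: " + missingFraction(ages));
        System.out.println("Average of present ages: " + averageOfPresent(ages));
    }
}
```

In this sample, half of the entries are missing, so the average describes only the customers who answered. If those who skipped the field differ from those who filled it in, the result is skewed, which is why the missing fraction is worth checking first.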
A) Storing user passwords in plain, unencrypted text for easy recovery.
B) Requiring users to provide their social security number to create an account.
C) Automatically sharing user purchase history with third-party advertisers.
D) Implementing encryption for stored credit card information.
Correct Answer: D
Content point 4 states that 'programmers should attempt to safeguard the personal privacy of the user.' Encrypting sensitive financial data like credit card numbers is a fundamental method for safeguarding that information from unauthorized access.
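As one possible illustration of encrypting stored data, the sketch below uses AES in GCM mode from Java's standard javax.crypto library, which goes beyond what the AP exam requires. The class name, method names, and the dummy card number are assumptions made for this example. Real systems also need secure key storage, which is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch: not a complete or production-ready design.
class CardEncryptionSketch {
    private static final int IV_BYTES = 12;  // commonly recommended IV size for GCM
    private static final int TAG_BITS = 128; // authentication tag length in bits

    // Generates a fresh 128-bit AES key (key storage is out of scope here).
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the text and returns the random IV followed by the ciphertext.
    static byte[] encrypt(String plainText, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv); // a new IV for every encryption
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] cipherText = cipher.doFinal(plainText.getBytes(StandardCharsets.UTF_8));
            byte[] stored = new byte[iv.length + cipherText.length];
            System.arraycopy(iv, 0, stored, 0, iv.length);
            System.arraycopy(cipherText, 0, stored, iv.length, cipherText.length);
            return stored;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the front of the stored bytes and decrypts the rest.
    static String decrypt(byte[] stored, SecretKey key) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String dummyCard = "4111111111111111"; // dummy value, not a real card
        byte[] stored = encrypt(dummyCard, key);
        System.out.println("Stored bytes: " + stored.length);
        System.out.println("Round trip matches: " + decrypt(stored, key).equals(dummyCard));
    }
}
```

The point for the quiz is that the bytes written to storage are unreadable without the key, so a breach of the stored data alone does not directly reveal the card number.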
A) The data collection creates a significant privacy risk for smartphone users.
B) The dataset's topic and time frame are not appropriate for the question being asked.
C) The dataset is likely to be systematically biased against certain phone brands.
D) The quality of the sales data from 2015 is guaranteed to be poor.
Correct Answer: B
Content point 6 states that a dataset for one topic (or time period) 'might not be appropriate to give correct answers or extrapolate information for a different question or topic.' Data from 2015 is outdated and about hardware sales, not current software usage, making it inappropriate.
A) A data quality problem due to incorrect GPS coordinates.
B) A privacy risk from collecting location data.
C) Algorithmic bias creating an unfair outcome for a specific group.
D) Using an inappropriate dataset to calculate fares.
Correct Answer: C
This scenario fits the definition in content point 5, where 'systemic and repeated errors in a program...create unfair outcomes for a specific group of users' (in this case, residents of low-income neighborhoods).
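One simple way to look for the kind of systemic, repeated unfairness described in content point 5 is to compare how often an unfavorable outcome occurs for each group. The following Java sketch does this with parallel arrays. The group labels, the data, and all names are hypothetical, and a gap between groups is only a signal to investigate, not proof of bias.

```java
// Hypothetical example: labels and outcomes are made up for illustration.
class GroupOutcomeCheck {

    // Returns the fraction of records in the given group that had an
    // unfavorable outcome; returns 0.0 if the group has no records.
    static double unfavorableRate(String[] groups, boolean[] unfavorable, String group) {
        int total = 0;
        int hits = 0;
        for (int i = 0; i < groups.length; i++) {
            if (groups[i].equals(group)) {
                total++;
                if (unfavorable[i]) {
                    hits++;
                }
            }
        }
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        // Assumed sample: each index is one user; true means an unfavorable outcome.
        String[] groups = {"A", "A", "A", "A", "B", "B", "B", "B"};
        boolean[] unfavorable = {false, false, true, false, true, true, true, false};

        double rateA = unfavorableRate(groups, unfavorable, "A");
        double rateB = unfavorableRate(groups, unfavorable, "B");
        System.out.println("Group A rate: " + rateA);
        System.out.println("Group B rate: " + rateB);
        System.out.println("Gap: " + (rateB - rateA));
    }
}
```

In this sample, group B receives the unfavorable outcome three times as often as group A. A one-time error would not produce a consistent gap like this across many records, which is what separates algorithmic bias from an isolated mistake.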
A) A national survey of sleep patterns among adults.
B) The academic transcripts of all students from the high school.
C) Anonymous survey data from students at that school detailing their sleep hours and their corresponding test scores.
D) A dataset of the school's budget and teacher salaries.
Correct Answer: C
Following the principle in content point 3, one must 'identify an appropriate data set to use in order to solve a problem or answer a specific question.' The dataset in C directly contains the two variables (sleep hours, test scores) for the specific population (students at that school) needed to answer the question.
A) Computer systems are always connected to the internet and are easily accessible.
B) Data on computer systems can be copied and distributed to millions of people almost instantly.
C) Data stored on computers is more likely to contain systemic bias.
D) Computer systems cannot store data for long periods.
Correct Answer: B
Content point 1 discusses the 'risks to privacy from collecting and storing personal data on computer systems.' Digital data can be copied and distributed rapidly and at scale, so a single breach can have a massive and immediate impact.
A) A data privacy breach during the collection of the images.
B) A data quality issue where the images were low resolution.
C) The dataset being inappropriate for its intended broad use, leading to algorithmic bias.
D) A programmer intentionally writing code to misidentify people.
Correct Answer: C
This question combines multiple concepts. The dataset was not appropriate for the diverse population it was applied to (Content point 6), which resulted in 'systemic and repeated errors...that create unfair outcomes for a specific group' (Content point 5, algorithmic bias).
A) Risks to privacy from storing personal data.
B) Potential issues with data quality.
C) Algorithmic bias.
D) Using a dataset to answer a completely unrelated question.
Correct Answer: D
The scenario illustrates privacy risks (point 1, stolen data), data quality issues (point 2, missing information), and algorithmic bias (point 5, unfair outcome). However, the dataset of past employees is directly related to the task of hiring new employees. The issue is not that the dataset is for a 'different question or topic' (point 6), but that it is a biased and flawed sample for the same topic.