AP Computer Science Principles Practice Quiz: Extracting Information from Data
Written by AP Content Team, Verified for 2026 AP Exams, Last updated: May 2026
Test your understanding with short quizzes. This quiz has 16 questions to check your progress.
A) Information is the raw input, and data is the processed output.
B) Data and information are interchangeable terms for the same concept.
C) Information consists of the facts and patterns extracted from data.
D) Data is a visualization of information.
Correct Answer: C
According to the provided text, 'Information is the collection of facts and patterns extracted from data.' This defines information as the result of processing or analyzing data.
A) The marketing emails definitively cause an increase in website traffic.
B) Increased website traffic causes the company to send marketing emails.
C) There is no relationship between marketing emails and website traffic.
D) A relationship between the emails and traffic is observed, but a causal link is not proven.
Correct Answer: D
The text states, 'A correlation found in data does not necessarily indicate that a causal relationship exists. Additional research is needed to understand the exact nature of the relationship.' The observation is a correlation, not proof of causation.
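As a rough illustration of what "observing a relationship" can look like, the sketch below computes a Pearson correlation coefficient for two made-up series. The variable names and weekly figures are hypothetical and are not taken from the quiz. A value near 1 shows that the two series rise together; it does not show that one causes the other.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical weekly figures, invented for illustration only.
emails_sent = [1, 2, 3, 4, 5]
site_visits = [110, 135, 160, 170, 205]

r = pearson(emails_sent, site_visits)
print(f"correlation: {r:.2f}")  # strong positive correlation, not proof of cause
```

A seasonal sale, for example, could raise both numbers at once, which is why the text calls for additional research before claiming causation.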
A) The number of people smiling in the photograph.
B) The colors of the objects in the photograph.
C) The file size of the image.
D) The main subject of the photograph.
Correct Answer: C
The provided content defines metadata as 'data about data' and gives the example, 'the metadata may include the date of creation or the file size of the image.' The other options describe the primary data (the content of the image itself).
A) Incomplete data, as not all students will respond.
B) Bias, because the source of the data is not representative of all students.
C) Invalid data, as students may not answer truthfully.
D) A need for parallel systems, because the dataset is too large.
Correct Answer: B
The text states, 'Problems of bias are often created by the type or source of data being collected.' By only surveying computer science majors, the data source is biased and not representative of the entire student population's career preferences.
A) Data cleaning
B) Data collection
C) Metadata extraction
D) Parallel processing
Correct Answer: A
This scenario is an example of non-uniform data collected from users. The text explains, 'Cleaning data is a process that makes the data uniform without changing their meaning (e.g., replacing all equivalent abbreviations, spellings, and capitalizations with the same word).'
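A minimal sketch of this kind of cleaning, assuming a hand-written lookup table of equivalent spellings. The table, the `clean` helper, and the sample entries are all hypothetical. Each entry is mapped to one canonical word, so the values become uniform while their meaning stays the same.

```python
# Hypothetical mapping of equivalent spellings to one canonical form.
CANONICAL = {
    "ca": "California",
    "calif.": "California",
    "california": "California",
    "ny": "New York",
    "new york": "New York",
}

def clean(value):
    """Return the canonical spelling, or the trimmed input if it is unknown."""
    key = value.strip().lower()
    return CANONICAL.get(key, value.strip())

# Non-uniform entries as users might type them (invented examples).
raw_entries = ["CA", " california", "Calif.", "NY", "new york ", "Texas"]
cleaned = [clean(v) for v in raw_entries]
print(cleaned)
```

Unknown values such as "Texas" pass through unchanged rather than being guessed at, which keeps the cleaning step from altering meaning.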
A) The data is likely to be biased and cannot be processed.
B) A single computer cannot clean non-uniform data.
C) Large data sets may require parallel systems for processing.
D) Metadata for large datasets cannot be read by single computers.
Correct Answer: C
The content explicitly states, 'Large data sets are difficult to process using a single computer and may require parallel systems.' The size of the dataset is the key factor that necessitates a distributed or parallel computing approach.
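The split-then-combine pattern behind parallel processing can be sketched as follows. This is only an illustration: a thread pool on one machine stands in for the separate computers of a real parallel system, and the word list, chunk size, and helper names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def count_long_words(chunk):
    """Work done independently on one piece of the data."""
    return sum(1 for word in chunk if len(word) > 5)

# Stand-in for a large data set (invented for illustration).
words = ["data", "information", "metadata", "bias", "scalability", "clean"] * 1000
chunks = chunked(words, 500)

# Each worker handles some chunks; the partial results are combined at the end.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_counts = list(pool.map(count_long_words, chunks))

total = sum(partial_counts)
print(len(chunks), "chunks,", total, "long words")
```

The key idea is that each chunk can be processed without knowing about the others, so the work can be spread across as many workers as are available.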
A) It corrupts the primary data.
B) It appends the new date to the primary data.
C) It does not change the primary data.
D) It reorganizes the primary data based on the new date.
Correct Answer: C
The 'last modified' date is an example of metadata. The text clearly states, 'Changes and deletions made to metadata do not change the primary data.'
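One way to see this on a real file system is sketched below, under the assumption that a file's bytes are its primary data and its modification time is metadata. The file name and contents are placeholders. The script changes the "last modified" timestamp and checks that a hash of the contents stays the same.

```python
import hashlib
import os
import tempfile

def sha256_of(path):
    """Hash the file's bytes, i.e. its primary data."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "photo.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"pretend these bytes are image data")

    digest_before = sha256_of(path)
    # Set the access and modification times to a fixed timestamp (metadata only).
    os.utime(path, (1_000_000_000, 1_000_000_000))
    digest_after = sha256_of(path)
    mtime_after = os.path.getmtime(path)

print("contents unchanged:", digest_before == digest_after)
```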
A) Collect a much larger dataset on traffic congestion from the same source.
B) Use a parallel system to process the existing data faster.
C) Combine the traffic data with data from other sources, like public transit schedules or local event calendars.
D) Delete the metadata to simplify the dataset.
Correct Answer: C
The text indicates that 'Often, a single source does not contain the data needed to draw a conclusion. It may be necessary to combine data from a variety of sources to formulate a conclusion.' Combining the traffic data with data from other sources is therefore the logical next step for finding connections.
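Combining sources might look like the sketch below, which joins two small tables on a shared date key. The dates, delay values, and event names are hypothetical and exist only to show the mechanics.

```python
# Hypothetical records keyed by date; all values are invented.
traffic_delay_minutes = {"2026-03-01": 12, "2026-03-02": 41, "2026-03-03": 15}
local_events = {"2026-03-02": "stadium concert"}

# Attach the event (if any) from the second source to each traffic record.
combined = [
    {"date": date, "delay": delay, "event": local_events.get(date)}
    for date, delay in sorted(traffic_delay_minutes.items())
]

busiest = max(combined, key=lambda row: row["delay"])
print(busiest)
```

Neither source alone shows that the worst delay fell on the day of an event. The combined table suggests a connection, though it is still only a correlation that would need further study.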
A) Data cleaning
B) Causation
C) Metadata
D) Scalability
Correct Answer: D
The text states, 'Scalability of systems is an important consideration when working with data sets, as the computational capacity of a system affects how data sets can be processed and stored.' The anticipated growth directly relates to the need for a scalable system.
A) By correcting errors and bias within the primary data.
B) By providing additional information for finding, organizing, and managing the data.
C) By reducing the file size of the primary data for faster processing.
D) By automatically identifying causal relationships within the data.
Correct Answer: B
According to the text, 'Metadata are used for finding, organizing, and managing information' and 'can increase the effective use of data or data sets by providing additional information.' It helps structure and manage data, not change or analyze it.
A) The need for parallel systems.
B) The requirement for massive storage capacity.
C) The need to clean incomplete or invalid data.
D) The difficulty of processing on a single computer.
Correct Answer: C
The text specifies that 'Data sets pose challenges regardless of size, such as: the need to clean data, incomplete data, invalid data...'. The other options are challenges typically associated with large data sets.
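Even a four-row data set can have these problems. The sketch below sorts rows into complete, incomplete, and invalid groups. The rows, the `classify` helper, and the 0 to 120 age range are assumptions made for the example.

```python
# Hypothetical survey rows; None marks a missing answer.
rows = [
    {"name": "Ana", "age": 17},
    {"name": "Ben", "age": None},  # incomplete: no age given
    {"name": "Cy", "age": -4},     # invalid: an age cannot be negative
    {"name": "Di", "age": 16},
]

def classify(row):
    """Label a row as 'ok', 'incomplete', or 'invalid' (assumed rules)."""
    if row["age"] is None:
        return "incomplete"
    if not 0 <= row["age"] <= 120:
        return "invalid"
    return "ok"

report = {}
for row in rows:
    report.setdefault(classify(row), []).append(row["name"])
print(report)
```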
A) Collect significantly more data using the same flawed method.
B) Use a more powerful, parallel system to process the biased data.
C) Modify the data collection method to be more inclusive.
D) Clean the data by removing all entries from the over-represented groups.
Correct Answer: C
The text states that 'Bias is not eliminated by simply collecting more data,' which rules out A. Because 'Problems of bias are often created by the type or source of data being collected,' the collection method itself must change. The text does not name C explicitly, but C is the only option that addresses the source of the bias.
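A small simulation can make this concrete. It is a sketch under invented assumptions: a hypothetical school of 1,000 students in which 90% of computer science majors and 20% of everyone else prefer a tech career. Surveying only computer science majors gives a badly skewed estimate, and a larger sample from the same source stays just as skewed.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = prefers a tech career, 0 = does not.
cs_majors = [1] * 90 + [0] * 10        # 100 students, 90% prefer tech
other_majors = [1] * 180 + [0] * 720   # 900 students, 20% prefer tech
population = cs_majors + other_majors

true_rate = sum(population) / len(population)

def estimate(source, n):
    """Survey n students drawn at random (with replacement) from `source`."""
    sample = rng.choices(source, k=n)
    return sum(sample) / n

biased_small = estimate(cs_majors, 100)      # flawed method, small sample
biased_large = estimate(cs_majors, 10_000)   # same flawed method, more data
fair_large = estimate(population, 10_000)    # more inclusive method

print(f"true rate:     {true_rate:.2f}")
print(f"biased, n=100: {biased_small:.2f}")
print(f"biased, n=10k: {biased_large:.2f}")
print(f"fair,   n=10k: {fair_large:.2f}")
```

Only the last estimate, which changes who is surveyed, lands near the true rate. Collecting more responses from the same group just measures that group more precisely.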
A) The size of the data set exclusively.
B) The absence of any bias in the data.
C) The capabilities of the users and their tools.
D) The quality of the metadata.
Correct Answer: C
This is a direct reference from the text: 'The ability to process data depends on the capabilities of the users and their tools.'
A) To extract meaningful patterns and trends from the data.
B) To make data uniform without altering its meaning.
C) To add metadata to a dataset for better organization.
D) To prove a causal relationship between two variables.
Correct Answer: B
The text defines this process directly: 'Cleaning data is a process that makes the data uniform without changing their meaning'.
A) Automatically creating metadata.
B) Identifying trends and making connections.
C) Ensuring the scalability of a system.
D) Eliminating the need for additional research.
Correct Answer: B
The text states, 'Data provide opportunities for identifying trends, making connections, and addressing problems.'
A) The size of a data set has no effect on the amount of information that can be extracted.
B) Smaller data sets always provide more accurate information than larger ones.
C) The size of a data set affects the amount of information that can be extracted.
D) Only large data sets can be used to find correlations.
Correct Answer: C
The text makes a direct statement on this relationship: 'The size of a data set affects the amount of information that can be extracted from it.' It does not specify that more is always better, just that size is a factor.