Study Guides
Big Picture
Statistics is about analyzing the results of an experiment. Before the results can be analyzed, the experiment must be conducted. There are many ways to conduct an experiment, but you have to be careful to eliminate sources of bias and confounding variables.
Key Terms
Population Data: Data gathered from every individual of interest.
Sample Data: Data gathered from some of the individuals of interest.
Experiment: The process of taking a measurement or making an observation.
Variable: Characteristic of an individual that is being measured or observed.
Numerical Variable: Characteristic where the quantity is the most important.
Categorical Variable: Characteristic that can be placed into well-defined categories that do not depend on order.
Bias: A non-representative sample.
Probability & Statistics
Sampling
Population vs. Sample
Population Data/Census:
• Every unit in the population being studied is measured or surveyed
• Has the potential to be destructive to the population being studied
• Usually is very costly and unrealistic to conduct
Sample Data/Survey:
• Is a representative subset of the population
• Is much more practical yet is subject to bias if not collected properly
• Only a carefully planned sample could be a true representation of the population
To gather information about every individual in a population, the United States government conducts a census once every ten years. Information from a smaller group is usually gathered by conducting a survey.
Variables
Statistical experiments usually determine the effect a treatment has on a sample or population by observing a possible change in the variable being measured. The variable can be numerical (or quantitative) or categorical (or qualitative). The effect is not always clear due to lurking and confounding variables.
Lurking variable:
• A variable that is not included in a study but may still have an effect on the other variables involved
Confounding variable:
• A variable that affects the response variable (variable being measured) and is also related to the explanatory variable (variable that explains the response variable)
• Is observed but cannot be distinguished from the other explanatory variables
Errors in Sampling
Sampling, or the selection of a group of individuals from within a population to represent the whole population, is much more practical than conducting a census. Depending on the methods used in selecting the sample, there may be a sampling bias. There are many possible sources of sampling bias.
Types of Sampling
Convenience sampling: surveying people with similar interests such as family members
• Example: Surveying the first 20 people you see at a mall
Judgment sampling: an individual or organization chooses the group to be sampled
• Example: A teacher chooses specific students to be surveyed
Neither method will result in a sample that is representative of the true population.
Incorrect Sampling Frame
• The list from which the sample is chosen does not accurately reflect the characteristics of the population
• Example: Only boys are surveyed at a school with both girls and boys
Size Bias
• One particular subgroup in a population is more or less represented due to its size
• Example: If a state to be surveyed is chosen by randomly pointing at a state on a map, the bigger states have a higher probability of being chosen
This guide was created by Lizhi Fan and Jin Yu. To learn more about the student authors, visit http://www.ck12.org/about/ck-12-interns/.
v1.1.9.2012
Disclaimer: this study guide was not created to replace your textbook and is for classroom or individual use only.
Errors in Sampling (cont.)
Response Bias
• Problems that result from the ways in which the survey is presented to the individuals in the sample
Types of response bias:
• Voluntary Response Bias: Only individuals with strong opinions respond to a survey
• Non-Response Bias: Individuals refuse to respond, so their opinions are not represented
• Questionnaire/Wording Bias: The way a survey is worded influences the response given by the individual
• Incorrect Response Bias: Individuals respond untruthfully to a survey
Ways to Reduce Bias
Randomization
• Simple random sample (SRS): a sample of size n chosen so that every possible sample of size n, and therefore every subject, has the same chance of being chosen
Systematic Sampling
• A systematic way of selecting subjects.
• Example: If you want a sample of 10 out of a population of 100 people, you assign a number to each member of the population. You then pick a random number from 1 to 10; suppose it is 4. You would include the 4th person and every 10th person after that. So you would include: 4, 14, 24, 34, ... 94
• Easier to conduct than SRS, but not every possible sample of size n has an equal chance of being chosen
Cluster Sampling
• Divides the population into groups/clusters. Some of the clusters are randomly selected, and everyone in the selected clusters participates.
• Example: Randomly choose a city, then randomly choose a street in that city, then randomly choose a house on that street, and then everyone in that house is sampled.
Stratified Sampling
• Divides the population into similar strata and then chooses a sample from each stratum using SRS. The samples are combined at the end to make a complete sample.
• Example: Choose 20 people from each grade (freshmen, sophomores...) to make a sample to represent a certain high school
• Not everybody has an equal chance of being chosen
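The sampling methods above can be sketched in a few lines of Python. This is only an illustration, not part of the original guide: the function names, the 1-based `start` argument, and the dictionary layout for strata and clusters are all assumptions made for the example, and the systematic version assumes the sample size divides the population size evenly.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every possible group of n members is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, n, start):
    """Take the start-th member (1-based) and every k-th member after it."""
    k = len(population) // n              # sampling interval, e.g. 100 // 10 = 10
    return population[start - 1::k][:n]

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep everyone inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw an SRS of the same size from each stratum, then combine."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical population: 100 people numbered 1..100.
people = list(range(1, 101))
rng = random.Random()                     # unseeded, so results vary per run

start = rng.randint(1, 10)                # random start within the first interval
print(systematic_sample(people, 10, start))
print(systematic_sample(people, 10, 4))   # the guide's example: 4, 14, ..., 94
```

With a start of 4, the systematic sample reproduces the list in the example above. Note that only ten different systematic samples are possible here (one per starting point), whereas an SRS can produce any group of 10 people.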
Experimental Design
Essential parts to a well-designed experiment:
1. Treatments: imposed on the subjects of the experiment
• An experiment will usually have at least two treatments
2. Randomization: helps to obtain a representative sample of a population
• Randomly assigning treatments to the members of the sample helps to eliminate confounding variables
3. Replication: experiment can be replicated by others so the results can be independently confirmed
Control
• A control group receives no treatment whatsoever in an experiment
• Serves as a comparison basis for the treatment group
Placebos
• Members of a placebo group would think they are receiving a treatment when in fact the given treatment is a placebo and has no real effect
• In a double-blind experiment, neither the subjects nor the person conducting the experiment knows whether the subject is receiving a treatment or a placebo
• Is considered the gold standard for medical research
Blocking
• Is used to minimize confounding variables in an experiment
• The goal is to group subjects with similarities together (example: block by gender, age, etc.)
• Randomized block design: block the subjects, and each block independently receives treatments
• Matched pairs design: a type of randomized block design in which there are two treatments to apply
• Two different observations are done on the same subject
• Also known as a repeated measures design
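As a rough illustration of a randomized block design, the sketch below groups subjects into blocks and then randomly assigns treatments within each block. The subject records, the `block_of` function, and the treatment labels are hypothetical and chosen only for this example; the guide does not prescribe a particular procedure.

```python
import random

def randomized_block_assignment(subjects, block_of, treatments, rng):
    """Block the subjects, then randomly assign treatments inside each block."""
    # Step 1: group subjects that share a blocking characteristic.
    blocks = {}
    for subject in subjects:
        blocks.setdefault(block_of(subject), []).append(subject)

    # Step 2: shuffle each block independently and deal out the treatments
    # in turn, so each treatment appears about equally often per block.
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        for i, subject in enumerate(shuffled):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: (name, age group) pairs, blocked by age group.
subjects = [("A", "teen"), ("B", "teen"), ("C", "teen"), ("D", "teen"),
            ("E", "adult"), ("F", "adult"), ("G", "adult"), ("H", "adult")]
plan = randomized_block_assignment(
    subjects, lambda s: s[1], ["treatment", "placebo"], random.Random())
for subject, label in plan.items():
    print(subject, label)
```

Because the shuffle happens separately inside each block, every age group ends up with the same mix of treatment and placebo, so age cannot be confused with the effect of the treatment.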
How to Use a Random Number Table
One important part of any simple random sample is to have the subjects randomly selected. For that to happen, each subject must have an equal likelihood of being chosen. One way to ensure this is to use a random number table. This is a computer-generated series of numbers that looks similar to this:
31246 53677 59827 88262 29515 32314 50473 69440 59422 16033 46200 82258 20969 57950 47009
To use this table, the subjects must first be assigned a number. The only requirement is that all the assigned numbers have the same number of digits. For example, if there were 10 subjects, the numbers could be assigned from 0-9 or from 01-10.
Once the numbers have been assigned, we look to the random number table. The table is read in groups of as many digits as the assigned numbers have, ignoring the gaps between the printed groups. Repeated numbers and numbers that were not assigned to any subject are skipped. For example, if our assigned numbers had two digits, then according to the random number table above, the numbers read would be 31, 24, 65, 36, etc., and the subjects assigned those numbers would be chosen.
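The reading procedure above can be sketched in Python using the same table. The function name, its parameters, and the choice of 50 subjects labeled 01-50 are assumptions made for illustration only.

```python
def read_random_table(table, n_digits, valid_labels, sample_size):
    """Read the table n_digits at a time, skipping repeats and unassigned numbers."""
    digits = "".join(table.split())       # ignore the gaps between printed groups
    chosen = []
    for i in range(0, len(digits) - n_digits + 1, n_digits):
        label = digits[i:i + n_digits]
        if label in valid_labels and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

TABLE = ("31246 53677 59827 88262 29515 32314 50473 69440 "
         "59422 16033 46200 82258 20969 57950 47009")

# If every two-digit number 00-99 is assigned, the first four reads are kept.
all_labels = {f"{i:02d}" for i in range(100)}
print(read_random_table(TABLE, 2, all_labels, 4))      # ['31', '24', '65', '36']

# Hypothetical case: 50 subjects labeled 01-50, sample of 5.
# 65, 77, 59, and so on are skipped because no subject has those numbers.
labels_01_to_50 = {f"{i:02d}" for i in range(1, 51)}
print(read_random_table(TABLE, 2, labels_01_to_50, 5)) # ['31', '24', '36', '29', '23']
```

If the table runs out before enough valid numbers are found, this sketch simply returns the shorter list; in practice you would continue with more rows of the table.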