Test Question Possibilities

Chapter 1 –

ü     What are the various ways we can graph univariate data? I count 4 major ways. Give a brief explanation of the process involved in each way.

ü     What are the two competing quantitative methods used to describe univariate data? Give a brief definition or explanation of each parameter within the two competing methods.

ü     What are the main things we look for in graphs of distributions? What additional things should we also consider in these graphs?

ü     How do you know if a data set is skewed left? Skewed right? Be specific. How would a boxplot look that was skewed in each direction?

ü     What is meant by resistant? Non-resistant? Which of the univariate parameters that we have studied would fall into each category?

ü     What is the formula for mean of a sample? For standard deviation of a sample? What symbols are used for the mean and standard deviation of a sample? What symbols (don’t worry about formulas) are used for the mean and standard deviation of a population? How are the variance and standard deviation of a sample (or population) related? Be sure to include a unit of measurement relationship for the last question.

ü     What happens to the mean, median, standard deviation and IQR of a data set if you add a fixed number to each member of a data set? What happens to these same measures if you take a data set and multiply each value by a fixed number?
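
The shift and scale question above can be checked numerically. This is a minimal sketch with made-up data; the helper name `summarize` is hypothetical, and the quartiles use Python's default convention, which may differ slightly from the textbook's or a calculator's.

```python
import statistics

def summarize(data):
    """Return (mean, median, sample standard deviation, IQR) for a list of numbers."""
    # quantiles(..., n=4) returns the three quartile cut points (default "exclusive" method)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return (statistics.mean(data), statistics.median(data),
            statistics.stdev(data), q3 - q1)

data = [2, 4, 4, 5, 7, 9, 11]        # made-up values, for illustration only
shifted = [x + 10 for x in data]     # add a fixed number to every value
scaled = [3 * x for x in data]       # multiply every value by a fixed number

for label, values in [("original", data), ("add 10", shifted), ("times 3", scaled)]:
    mean, median, sd, iqr = summarize(values)
    print(f"{label:9s} mean={mean:.3f} median={median:.3f} sd={sd:.3f} IQR={iqr:.3f}")
```

Comparing the three printed rows shows which measures move with the added constant and which respond only to the multiplier.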

Chapter 2 –

ü     What conclusions can you make about a distribution if the mean = median? Mean > median? Mean < median? Briefly explain why you can make these conclusions.

ü     What does a density curve represent? What are the two basic requirements for a density curve?

ü     What does the empirical rule say about all normal distributions? How do you standardize observations from a normal distribution with mean mu and standard deviation sigma into one that is normally distributed with mean 0 and standard deviation 1?
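
As a rough check on the empirical rule and on standardizing, the sketch below uses Python's `statistics.NormalDist` in place of Table A. The function names and the N(100, 15) example are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of any normal distribution within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

def standardize(x, mu, sigma):
    """Convert an observation x from N(mu, sigma) to a z-score on N(0, 1)."""
    return (x - mu) / sigma

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {proportion_within(k):.4f}")

# Hypothetical example: an observation of 130 from a N(100, 15) population
print(standardize(130, mu=100, sigma=15))
```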

ü     Describe how you would use Table A in the book in two ways. First, to determine the proportions of observations within two given values in a normally distributed population. Second, to determine the observation values that would account for a particular proportion of the data in a normally distributed population.

ü     What is a percentile score? What can you say about the percentile score of a median? Quartile 3? Quartile 1? Mean?

ü     Briefly describe how you would use a histogram of a data set to assess normality. Would using a dotplot instead of a histogram be a more accurate assessment? Why or why not? Yes, I agree using a dotplot may be more tedious, but that is not the question :)

ü     Describe how a normal probability plot can be used to assess normality. Be sure to address what you are plotting on the y-axis and the x-axis of the plot. Also be sure to briefly explain not only WHAT you are looking for in this plot but also WHY this might be useful in assessing normality.

Chapter 3 - 

ü     What are the qualitative things you should look for in a scatterplot of bivariate data? How can you add a categorical variable to a scatterplot? WHY would you want to add a categorical variable to a scatterplot? Give an example.

ü     How do you calculate r for a data set? What does this have to do with Z-scores? What is the interpretation of this r value? What are the units of an r value? What are the units of a z-score?
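
One way to see the link between r and z-scores is to compute r as the sum of the products of paired z-scores divided by n - 1. This is a minimal sketch; the function name `correlation` and the height/weight numbers are made up for illustration.

```python
import statistics

def correlation(xs, ys):
    """Correlation r computed from the z-scores of each (x, y) pair."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)   # sample standard deviations
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

heights_in = [60, 62, 65, 68, 71]        # made-up data
weights_lb = [110, 120, 140, 155, 170]   # made-up data
print(correlation(heights_in, weights_lb))

# Changing units (inches to centimeters) rescales x but leaves the z-scores,
# and therefore r, unchanged
heights_cm = [2.54 * h for h in heights_in]
print(correlation(heights_cm, weights_lb))
```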

ü     What are the properties of r?

ü     What is the difference between a correlation coefficient and a coefficient of determination?

ü      In addition to the LSRL equation, r and r^2, what things should you report with your analysis of bivariate data? HINT: One of these is a graph and the other things are from Chapter 1.

ü      What is our GOAL in obtaining a LSRL? That is, what have we optimized or minimized? Would the LSRL be different if we switched the explanatory and the response variable? What about r and r^2?

ü      What is the relationship between the slope of the line and the correlation coefficient? How can you find the y-intercept of the LSRL given only the correlation coefficient and the mean based statistics for both x and y?

ü     What (x,y) can you GUARANTEE will always be on the LSRL? Why can you guarantee this?
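
The two questions above about the slope, the intercept, and the guaranteed point can be explored with a short sketch. It assumes the usual least-squares relationships (slope = r * s_y / s_x, intercept = y-bar minus slope times x-bar). The function name and the summary statistics are hypothetical.

```python
def lsrl_from_summary(r, x_bar, s_x, y_bar, s_y):
    """Return (intercept, slope) of the least-squares line from summary statistics."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up summary statistics, for illustration only
r, x_bar, s_x, y_bar, s_y = 0.8, 10.0, 2.0, 50.0, 5.0
intercept, slope = lsrl_from_summary(r, x_bar, s_x, y_bar, s_y)

# Predicting at x = x_bar returns y_bar, because the intercept was built that way
predicted_at_x_bar = intercept + slope * x_bar
print(intercept, slope, predicted_at_x_bar)
```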

ü     What the heck is an outlier? An influential point? Can a point be one and not the other?

ü     How do you calculate residual values? What is the sum of the residuals? What is the sum of the squares of the residuals and what does it have to do with the LSRL?

ü     When analyzing a residual plot, what should you look for?

ü     What is the interpretation of r^2?

Chapter 4 –

ü     What are the three main properties of logarithms?

ü     If you have a suspected (exponential) relationship of y = AB^x, how do you use the properties of logarithms to obtain a linear relationship? What graph would you suspect to be a straight line? That is, what would you plot on the dependent and independent axes? How do you then complete the inverse transformation to put the model back into the context of the problem?
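
A minimal sketch of the exponential case: take the logarithm of y, fit a line to (x, log y), then undo the logarithm to recover the constants. The data are made up to follow y = 3 * 2^x exactly, and `fit_line` is a hypothetical helper, not a textbook routine.

```python
import math

def fit_line(xs, ys):
    """Least-squares (intercept, slope) for ys regressed on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

xs = [0, 1, 2, 3, 4]
ys = [3 * 2 ** x for x in xs]            # made-up data following y = 3 * 2^x

# log y = log A + (log B) * x, so (x, log y) should fall on a straight line
log_ys = [math.log10(y) for y in ys]
intercept, slope = fit_line(xs, log_ys)

# Inverse transformation: raise 10 to the fitted intercept and slope
A = 10 ** intercept
B = 10 ** slope
print(A, B)
```

For a power function the same idea applies, except that both x and y are logged before fitting.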

ü     Same as above question but for power function.

ü     What is an initial test for exponential growth? How can you tell by looking at an equation if you have exponential growth or exponential decay?

ü      What is a marginal distribution? What is a conditional distribution? If you have a two-way table that measures “n” of one characteristic and “m” of another characteristic, how many marginal distributions will there be? How many conditional distributions will there be?

ü      Part 1 - What is extrapolation? Why should you care? Part 2 – What warnings do you have for someone performing linear regression on a data set of averages? Why do you make this warning?

ü     Describe Simpson’s paradox. What is the main lesson learned from Simpson’s paradox in relation to analysis of statistical data?

ü      What is a lurking variable? A common response variable? A confounding variable? Which of the above is a subset of another?

Chapter 5 –

ü     A well-designed survey must have the following three properties. Give an example of a survey you would conduct and explain how you would make sure each property is present in your survey:

o       Representative

o       Random

o       Large

ü     What is an SRS? Does each individual have an equal chance of selection in an SRS? What about each combination of individuals?

ü     How do you use a table of digits to get a random sample?

ü     What is the difference between cluster and stratified sampling? What is multi-stage sampling?

ü     What is the difference between stratified sampling and blocking?

ü     A well-designed experiment has four main principles (one of which is optional). Give a brief explanation of each term and an example of an experiment that you might conduct that would satisfy each principle:

o        CONTROL

o        RANDOMIZATION

o        REPLICATION

o        BLOCKING

ü     Why is it that the proportion of the population that you sample is unimportant in determining how good your sample is?

ü     What is meant by statistically significant?

ü     Describe each of the following types of bias:

·        Voluntary response

·        Convenience sampling

·        Undercoverage

·        Nonresponse

·        Positive Response

·        Wording of questions

 

ü      What distinguishes an experiment from an observational study? What do you hope to determine with an experiment that you cannot determine with an observational study?

ü      Give an example of a basic blocking design in an experiment.

ü      You have three factors in a liquid experiment - temperature, salt content and color. For temperature you have 3 choices, salt content you have 4 choices and color you have 2 choices. How many treatment groups will be necessary for your experiment?

ü     You wish to simulate a poll result cited by FOX News discussing the 2004 Democratic primaries. They estimate that 22.3% of Iowa voters support Gephardt, 24.6% support Dean and the remaining voters support another candidate. How will you simulate this using a table of random digits? How many digits will you need? What numbers will you assign to each candidate? Be specific in your simulation design.
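
One possible design for the poll simulation, sketched in code: because the percentages are given to a tenth of a percent, read three digits per voter (000 to 999). The particular assignment below (000-222 Gephardt, 223-468 Dean, 469-999 other) is one choice among many, and Python's random number generator stands in for the printed table of random digits.

```python
import random

def candidate_for(three_digits):
    """Map a three-digit number (0 to 999) to a candidate."""
    if three_digits <= 222:      # 223 of 1000 values -> 22.3%
        return "Gephardt"
    if three_digits <= 468:      # 246 of 1000 values -> 24.6%
        return "Dean"
    return "Other"               # remaining 531 values -> 53.1%

def simulate_poll(n_voters, seed=None):
    """Simulate n_voters responses and return a count per candidate."""
    rng = random.Random(seed)
    counts = {"Gephardt": 0, "Dean": 0, "Other": 0}
    for _ in range(n_voters):
        counts[candidate_for(rng.randrange(1000))] += 1
    return counts

print(simulate_poll(500, seed=1))
```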

Chapter 6 –

ü     What are the two major components of a probability model?

ü     What is a sample space?

ü     What is the multiplication principle and how would you use it?

ü     IF DISJOINT – P(A or B) = ?

ü     IF NOT DISJOINT – P(A or B) = ?

ü     IF INDEPENDENT – P(A and B) = ?

ü     IF NOT INDEPENDENT P(A and B) = ?

ü     IF NOT DISJOINT and also INDEPENDENT, P(A or B) =

ü     Draw a Venn Diagram to represent the following situation: P(A) = .3, P(B) = .6, P(A and B) = .2

o       Are events A and B independent? How do you know?

o       Are events A and B disjoint? How do you know?

o       Find the following:

§        P(A and B)

§        P(A and B^c)

§        P(A^c and B)

§        P(Neither A nor B)

§        P(Not A and B)
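
The regions of the Venn diagram above can be worked out from the three given probabilities. A minimal sketch, with hypothetical variable names:

```python
import math

p_a, p_b, p_a_and_b = 0.3, 0.6, 0.2      # values given in the question

p_a_only = p_a - p_a_and_b               # P(A and B^c): in A but outside B
p_b_only = p_b - p_a_and_b               # P(A^c and B): in B but outside A
p_a_or_b = p_a + p_b - p_a_and_b         # general addition rule
p_neither = 1 - p_a_or_b                 # outside both circles

# Independent only if P(A and B) equals P(A) * P(B); disjoint only if P(A and B) is 0
independent = math.isclose(p_a * p_b, p_a_and_b)
disjoint = p_a_and_b == 0

print(p_a_only, p_b_only, p_a_or_b, p_neither, independent, disjoint)
```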

ü     The probability of guessing a correct answer on a multiple choice test is .2 (assuming each question has 5 choices). You are going to take a 6 question multiple-choice test and guess the answers on each. Find the following (expressions are fine):

o       P(getting all correct)

o       P(getting all incorrect)

o       P(not getting all correct)…note this is not the same as P(getting all incorrect)

o       P(not getting all incorrect)…note this is not the same as P(getting all correct)

o       P(getting at least one correct)…aha…this is the same as one of the above :)

o       P(getting at least one incorrect)…aha…this is the same as one of the above :)
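
The guessing probabilities above can be written as short expressions. This sketch assumes the six guesses are independent, which is what allows the probabilities to be multiplied. The variable names are hypothetical.

```python
p_correct = 0.2        # chance of guessing one question correctly
n_questions = 6

p_all_correct = p_correct ** n_questions
p_all_incorrect = (1 - p_correct) ** n_questions

# Complements: "not all correct" means at least one incorrect,
# and "not all incorrect" means at least one correct
p_not_all_correct = 1 - p_all_correct
p_not_all_incorrect = 1 - p_all_incorrect

print(p_all_correct, p_all_incorrect, p_not_all_correct, p_not_all_incorrect)
```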

ü     If the P(A and B) = .4 and the P(A) = .7, what is the P(B given A)?

ü     If the P(A and B) = .3 and the P(A Given B) = .6, which can you find, the P(A) or the P(B)? Find it.

ü     If P(A given B) = 1/3 and the P(B) = .3, what is the P(A and B)?

ü     If P(A or B) = .8, P(B Given A) = .3 and P(A) = .5, what is the P(B)? Hint: You cannot find it directly – first find P(A and B) from the given values, then find P(B) from the given P(A or B). I absolutely LOVE this question.
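
The four conditional probability questions above all rearrange two rules: P(A and B) = P(A) * P(B given A), and P(A or B) = P(A) + P(B) - P(A and B). A minimal sketch of the arithmetic, with hypothetical variable names:

```python
# Question 1: P(B given A) = P(A and B) / P(A)
q1_p_b_given_a = 0.4 / 0.7

# Question 2: P(A given B) = P(A and B) / P(B), so only P(B) can be recovered
q2_p_b = 0.3 / 0.6

# Question 3: P(A and B) = P(A given B) * P(B)
q3_p_a_and_b = (1 / 3) * 0.3

# Question 4: first P(A and B) from P(B given A) and P(A),
# then solve the addition rule for P(B)
q4_p_a_and_b = 0.3 * 0.5
q4_p_b = 0.8 - 0.5 + q4_p_a_and_b

print(q1_p_b_given_a, q2_p_b, q3_p_a_and_b, q4_p_a_and_b, q4_p_b)
```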


Chapter 7 –

ü     What is the definition of a Random Variable? What two things will you need in order to DEFINE your random variable?

ü     What is a probability histogram? A density curve? Are the two related in any way?

ü     What is the difference between a discrete and a continuous RV? What are the two requirements for the probabilities of a discrete RV? Of a continuous RV?

ü     What is a uniform probability distribution? How would the function be defined if the continuous RV can take on the values from 0 to n inclusive?

ü     Is a normal distribution a pdf? Why or why not?

ü     How do you determine (by formula) the mean, variance and standard deviation of a discrete RV? What is the difference between a mean of a RV and an expected value of a RV? [Note – recall that FINDING the mean and the variance of a continuous RV is beyond the scope of this class and requires methods from calculus.]

ü     What does the law of large numbers state? Are there any assumptions about the observations from the population that are crucial to this law?

ü     Complete the following rules for means and variances of RV’s

o       mu_(X+Y) =

o       mu_(X-Y) =

o       mu_(a+bX) =

o       variance(a + bX) =

o       variance(X + Y) =

o       variance (X – Y) =   

o       standard deviation(X – Y) =

Chapter 8 –

ü      What are the four conditions that need to be satisfied for a situation to be a binomial setting?

ü      How is the RV X defined in a binomial distribution? What values can X take? What are the two parameters that you need in order to determine a probability such as P(X = 3)?

ü      What is the difference between a pdf (probability distribution function) and a cdf (cumulative distribution function)? If X is a RV that is B(10, .3) give an example of how the two functions are related. 

ü     Do the concepts of a pdf and a cdf make sense outside of the context of a binomial random variable setting? If not, why not. If so, give a quick example.

ü     Explain what the calculator commands binompdf(50, .43, 23) and binomcdf(50, .43, 23) would represent in words. You may put it into the context of a problem if you wish.

ü     What is the formula (not calculator command) for determining the probability of (X = k) in a B(n,p)?
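
For comparison with the calculator's binomial pdf and cdf commands, here is a minimal sketch of the same two quantities using the binomial formula directly. The function names `binom_pmf` and `binom_cdf` are hypothetical and are not the calculator's own names.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X distributed B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X distributed B(n, p): the pmf summed from 0 through k."""
    return sum(binom_pmf(n, p, j) for j in range(k + 1))

# Exactly 23 successes versus 23 or fewer successes in 50 trials with p = .43
print(binom_pmf(50, 0.43, 23))
print(binom_cdf(50, 0.43, 23))

# For B(10, .3): P(X = 3) is the difference of two consecutive cdf values
print(binom_cdf(10, 0.3, 3) - binom_cdf(10, 0.3, 2), binom_pmf(10, 0.3, 3))
```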

ü     How does a geometric random variable differ from a binomial random variable? How are they similar? What values can a geometric RV take?

ü     What are the mean, variance and standard deviation of a random variable that is B(n,p)? What is the mean of a geometric RV that has a probability of success = p? [Note: the variance of a geometric distribution is given by (1-p)/p^2 if you care, but it is not given in the text.]

ü     If Mr. G has a 24% chance of making a free throw and shoots each one independently, confirm the following probabilities:

o       P(It takes Mr. G Exactly 8 attempts to make his first) =.035

o       P(It takes Mr. G more than 8 attempts to make his first shot)=.111

o       P(It takes Mr. G less than 8 attempts to make his first shot) = .854
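
The three free-throw probabilities above can be confirmed with the geometric setting: the first make on attempt k requires k - 1 misses followed by one make. A minimal sketch with hypothetical variable names. The last value comes out to about .8535, so the third decimal depends on whether it is rounded or truncated.

```python
p_make = 0.24
p_miss = 1 - p_make

# First make on exactly the 8th attempt: 7 misses, then a make
p_exactly_8 = p_miss ** 7 * p_make

# More than 8 attempts needed: the first 8 attempts are all misses
p_more_than_8 = p_miss ** 8

# Fewer than 8 attempts needed: not all of the first 7 attempts are misses
p_less_than_8 = 1 - p_miss ** 7

print(round(p_exactly_8, 4), round(p_more_than_8, 4), round(p_less_than_8, 4))
```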

Chapter 9

ü     Describe what a sampling distribution represents.

ü     How will the spread of the sampling distribution of x-bar change as you increase the sample size?

ü     How will the spread of the sampling distribution of p-hat change as you increase the sample size?

ü     How will the spread of the sampling distribution of x-bar change as you increase the number of samples?

ü       What is the difference between a parameter and a statistic?

ü       What does it mean for a statistic to be unbiased?

ü       Describe in words what p-hat represents.

ü       What does the Central Limit Theorem say?

ü       What does the Law of Large Numbers say?

ü       Under what conditions can you assume that the sampling distribution of x-bar is N(mu, sigma/sqrt(n))?

ü       Under what conditions can you assume that the sampling distribution of p-hat is N(p,sqrt[ p(1-p)/n])?

ü       Mr. G’s Geometry class has grades that are normally distributed with a mean of 84.4% and a standard deviation of 3.3%.

o       What is the probability that a randomly chosen student will have a grade that is below 80%? What assumptions/conditions are necessary in order to ensure this probability is an accurate calculation?

o       What is the probability that a randomly chosen group of 5 students will have a mean grade that is greater than 90%? What assumptions/conditions are necessary in order to ensure this probability is an accurate calculation?

o       What is the probability that a randomly chosen group of 30 students has a mean grade that is less than 82%?

o        Which of the answers above would change if the population of students’ grades were distinctly NONNORMAL?
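
A minimal sketch of the grade calculations above, using `statistics.NormalDist` in place of Table A. It assumes the grades are independent draws from the stated normal population, so a sample mean of n grades has standard deviation sigma/sqrt(n). The variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 84.4, 3.3                      # population mean and standard deviation (%)

# One randomly chosen student: use the population distribution itself
one_student = NormalDist(mu, sigma)
p_below_80 = one_student.cdf(80)

# Mean of 5 students: same center, spread shrinks to sigma / sqrt(5)
mean_of_5 = NormalDist(mu, sigma / sqrt(5))
p_mean5_above_90 = 1 - mean_of_5.cdf(90)

# Mean of 30 students: spread shrinks to sigma / sqrt(30)
mean_of_30 = NormalDist(mu, sigma / sqrt(30))
p_mean30_below_82 = mean_of_30.cdf(82)

print(p_below_80, p_mean5_above_90, p_mean30_below_82)
```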

 

 

Chapter 10

 

CONFIDENCE INTERVALS

1.     What is meant by a critical value (z*)?

2.     What are we assuming about sigma (the standard deviation of the population) throughout this chapter?

3.     What is the formula for sigma_(x-bar), the standard deviation of the sampling distribution of x-bar?

4.     What is the formula for the margin of error for the statistic x-bar?

5.     How do you find a confidence interval from x-bar and your margin of error?

6.     What is the interpretation of a confidence interval? Yes, I am asking for those two magic sentences!

7.     You weigh 30 bags of peanut M&M’s and get a mean weight for those 30 bags of 12.4 oz. You are told that the standard deviation for bag weights, sigma, is known to be .3 oz. We are interested in a 94% confidence interval. Find the following:

a.      z*

b.      sigma_(x-bar)

c.     margin of error

d.     94% confidence interval

e.      What is the interpretation of this interval?

8.     Suppose you would like to cut the margin of error in 7c in half. How many total bags should you weigh in order to do this?

9.     How many bags should you weigh in 7 in order to ensure you have a margin of error no more than .03 oz at a confidence level of 94%?

10. How would you get the margin of error for a statistic from the confidence interval only?

11. Name three cautions you have for someone who picks up a calculator and calculates a confidence interval. (see pages 524-525!)
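
The arithmetic behind problems 7 through 9 can be sketched as follows. This assumes the stated .3 is the population standard deviation in ounces and that a z interval applies. `NormalDist().inv_cdf` stands in for reading z* from a table, and the names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

x_bar, sigma, n, level = 12.4, 0.3, 30, 0.94

# z* leaves (1 - level) / 2 in each tail, so the area to its left is 0.97
z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)

sd_x_bar = sigma / sqrt(n)          # standard deviation of the sampling distribution
margin = z_star * sd_x_bar          # margin of error
interval = (x_bar - margin, x_bar + margin)

# Margin of error is proportional to 1 / sqrt(n), so halving it takes 4 times as many bags
n_half_margin = 4 * n

def sample_size_for(target_margin):
    """Smallest n whose margin of error is at most target_margin (always round up)."""
    return ceil((z_star * sigma / target_margin) ** 2)

n_for_003 = sample_size_for(0.03)

print(z_star, sd_x_bar, margin, interval, n_half_margin, n_for_003)
```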

 

SIGNIFICANCE TESTS/ERRORS/POWER

12. What is the interpretation of a P-value? 

13. When will you reject H0? Accept H0?

14. When is a result called statistically significant?

15. Back to our M&M bags from problem 7. If H0:  and Ha:  and your x-bar = 12.4 oz, n = 30, alpha = 5% …

a.      What is the P-value?

b.     Will you accept or reject H0?

c.     Is the result x-bar = 12.4 statistically significant?

d.     Redo a, b and c if you are given H0:  and Ha:

16. What is the definition of a Type I error?

17. What is the definition of a Type II error?

18. How do you find the P(Type I error)?

19. What is the procedure for finding P(Type II error)?

20. What is the definition of the Power of a significance test?

21. Back to the M&M bag example. Use the following parameters:

x-bar = 12.4 oz, n = 30 , H0:  and Ha: , alpha = .05

     and find the following:

a.       P(Type I error)

b.     P(Type II error) assuming that the true mean is 12.35 oz

c.     Power of the test.

 

Chapter 11

 

1.     When should you use a T-distribution INSTEAD of a Z distribution when finding a Confidence Interval or doing a Significance Test?

2.     KEY CONCEPT – CONSIDER THIS AN ESSAY QUESTION! - What is the difference between these three different kinds of T-tests (conceptual and formula)? A) 1-sample B) matched pairs C) 2-sample

3.     What are the degrees of freedom for a 1-sample or a matched pairs T-Statistic? What should you use for the degrees of freedom for a 2-sample T-Statistic if you don’t have a calculator? Briefly explain how the degrees of freedom on a calculator would be different and why.

4.     Is a 2-sample T-statistic unbiased? Why or why not? What assumptions/conditions must be met in order to use the T-statistic (see page 606)?

5.     What can you say about the sampling distribution of the difference of two sample means (x-bar1 - x-bar2)? Specifically, what are its center and standard deviation? What assumptions/conditions must be satisfied in order for you to make these statements?

6.     What is the standard error for an x-bar? How do you use this standard error to get a C% confidence interval?

7.     What is the GENERAL FORM for a confidence interval of any statistic?

8.     Quite often the null hypothesis on a 2-sample T-statistic is what? How does this affect the T-statistic?

9.     Suggestion – Review the concepts of Type I error, Type II error and Power in context of a T-dist whether it be 1-sample, matched pairs or 2-sample.

10. Note – we skipped pages 633-639 since they are optional. Still, you should understand that a calculator uses a different number of degrees of freedom, and why the result is either a conservative or a non-conservative estimate!