This assignment will allow you to work through the first part of the data analysis of Project 4 and get feedback that you can use to improve this section for the final project paper. You can just type short answers OR write them out in full sentences. For your final paper, you will need to write this section out in full sentence/paragraph format.
Watch the following video before getting started!
Project 4 data for this assignment - Project 4: Data
Part 1: Response variable
From the data we collected, what ONE variable do you think is a good measure of school success? This is your "response variable" for this assignment and should be some type of outcome variable. Example: Graduation Rate, Post-secondary enrollment, SAT Math, etc.
- Pick one variable that you think is a good measure of school success and this is your response variable. For Part 1, state what your response variable is for this project.
Part 2: Predictor variables
From the data we collected, what SEVEN variables do you think might be able to help *predict* the response variable you picked in Part 1? These are your "Predictor" variables. Example: I think student attendance, low-income students, SAT Math, teacher retention, teacher education, teacher attendance, and per-student instructional spending all help to predict Graduation Rate (response variable).
- List the seven predictor variables you picked.
- For each of your seven predictor variables, explain why you think it can help predict your response variable.
Part 3: Correlations - Answer the questions below for each of your seven predictor variables.
You will be doing seven correlations and answering all of the questions below about each predictor and the response variable. Ex: Response & Predictor 1, Response & Predictor 2, etc...
Make sure to include a heading with the variable names so I know which variables you're using for each analysis.
Write your correlation claim. Hint: You picked these variables because you thought they were related, so your claim should be the alternative claim, "There is a significant linear relationship between [response variable] and [predictor variable]."
- Is it null or alternative?
- Use Minitab to calculate the appropriate correlation statistics and include the output in your document.
What are the important test statistics?
- Correlation coefficient (r) =
- Should the calculated regression line be used for estimation? [Yes, No] [Hint: if you answered YES for a variable, include it in Part 4 and Part 5 below. If you answered NO, do NOT include it in Part 4 and Part 5.]
- What type of correlation is this? [Positive, Negative, None]
- How strong of a correlation is it? [Strong, Medium, Weak]
- There [IS, IS NOT] enough evidence to [REJECT, SUPPORT] the claim that [insert claim].
- Can this data be used for a linear regression analysis? [Yes, No]
- Provide a real-world conclusion.
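To see what the correlation coefficient (r) in the Minitab output represents, here is a minimal Python sketch that computes r by hand, along with the t statistic commonly used to test whether a linear relationship is significant. It is only an illustration and does not replace the Minitab output this assignment requires. The data values and the names `attendance`, `grad_rate`, `pearson_r`, and `t_statistic` are made up for this example and are not from the Project 4 data.

```python
import math

# Hypothetical toy data (NOT from the Project 4 data set).
attendance = [90, 92, 94, 96, 98]   # predictor, in percent
grad_rate = [70, 75, 80, 82, 90]    # response, in percent


def pearson_r(x, y):
    """Return the Pearson correlation coefficient r for paired lists x and y."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Return t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect linear relationship
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r = pearson_r(attendance, grad_rate)
t = t_statistic(r, len(attendance))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(attendance) - 2}")
```

The sign of r gives the direction of the relationship (positive or negative). The p-value that Minitab reports alongside r is what you compare to your significance level when deciding whether the relationship is significant.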
Part 4: What did you learn?
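For this section, review Part 3 and write down all of the predictor variables that CAN be used for a linear regression. Hint: these are the ones where you answered "Yes" to "Should the calculated regression line be used for estimation?" (and to "Can this data be used for a linear regression analysis?").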
Part 5: Linear Regression - Answer the questions below ONLY for the predictor variables you said COULD be used for a linear regression in Part 4.
For this section, you will do a linear regression for each of the variables you listed in Part 4, that is, each variable whose correlation indicated that YES, it could be used for a linear regression. This means you might not be doing a linear regression for all seven of your predictor variables. Make sure to include a heading with the variable names so I know which variables you're using for each analysis. Ex: Response variable vs. Predictor 2, Response variable vs. Predictor 5, etc.
Write your claim. Is it null or alternative?
Calculate the appropriate statistics using Minitab and include the Minitab output (including what's needed for F, p, R-sq, and the graphs for the assumptions).
Specifically, type out the following information:
- Linearity - [Positive, Negative, None]
- Equal Error Variance - [Yes, No]
- Independent Observations - [Yes, No]
- Normality of Errors - [Yes, No]
How well does the model fit our data?
- R-sq =
- [Very good, good, fair, poor]
There [IS/IS NOT] enough evidence to [Support/Reject] the claim that [insert claim].
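To show where the numbers in a simple linear regression output come from, here is a minimal Python sketch that fits a least-squares line and computes R-sq, the F statistic, and the residuals (the values behind the assumption graphs). It is an illustration only and is not a substitute for the Minitab output required above. The data and the names `attendance`, `grad_rate`, and `simple_linear_regression` are hypothetical and are not from the Project 4 data.

```python
# Hypothetical toy data (NOT from the Project 4 data set).
attendance = [90, 92, 94, 96, 98]   # predictor, in percent
grad_rate = [70, 75, 80, 82, 90]    # response, in percent


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by least squares and summarize the fit."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("the predictor has no variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residuals are what the assumption plots (residuals vs. fits,
    # normal probability plot) are drawn from.
    fitted = [intercept + slope * a for a in x]
    residuals = [b - f for b, f in zip(y, fitted)]

    sse = sum(e ** 2 for e in residuals)          # unexplained variation
    sst = sum((b - mean_y) ** 2 for b in y)       # total variation
    r_sq = 1 - sse / sst                          # share of variation explained
    # F = (regression SS / 1) / (error SS / (n - 2)) for one predictor.
    f_stat = (sst - sse) / (sse / (n - 2)) if sse > 0 else float("inf")
    return {
        "slope": slope,
        "intercept": intercept,
        "r_sq": r_sq,
        "f_stat": f_stat,
        "residuals": residuals,
    }


fit = simple_linear_regression(attendance, grad_rate)
print(f"grad_rate = {fit['intercept']:.2f} + {fit['slope']:.2f} * attendance")
print(f"R-sq = {fit['r_sq']:.1%}, F = {fit['f_stat']:.2f}")
```

With a single predictor, R-sq is the square of the correlation coefficient r from Part 3. The p-value for F comes from the Minitab output and is not computed in this sketch.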
Part 6: What is the ONE best predictor variable?
After doing the linear regressions in Part 5, look at the data and determine which single predictor variable appears to be the ONE best predictor of your response variable. [Hint, there is a very specific and definitive way to tell which is the best predictor variable. If you aren't sure, go back to the notes/videos for linear regressions.]
- Which single predictor variable is the best for your response variable?
- Why is this the best predictor?
- Did you expect this or were you surprised by this?
- What are the real-world results for this predictor and the response variable?