Miami University Exploratory Data Analysis & Underlying Assumptions Project
Question Description
Use the provided file homework3.Rmd
to complete this homework. When saved into the same folder as the data
files below, the knitted Markdown document will provide the diagnostic
plots for problem 1 and some data handling for problems 2 and 3. Make
sure to update the document with your name.
In Problem 1, you only need to analyze the provided diagnostic plots.
For Problems 2 and 3, perform a complete analysis of the described
problem. This includes:
- Exploratory data analysis
- Checking the underlying assumptions
- Proper statistical inference
- Any follow-up procedures
- Conclusions in the context of the problem (You need to report the F statistics along with the degrees of freedom and the p-value)
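The steps listed above can be sketched in R roughly as follows. This is only a minimal illustration on simulated data, not the expected solution: the data frame `sim`, the factor names `A` and `B`, and the response `y` are all made up for the example, and the appropriate model and follow-up procedure depend on the design of each problem.

```r
# Simulated stand-in data (hypothetical): two factors with 3 levels each,
# 10 replicates per cell, 90 observations in total.
set.seed(363)
sim <- expand.grid(rep = 1:10, A = factor(1:3), B = factor(1:3))
sim$y <- rnorm(nrow(sim),
               mean = 5 + as.integer(sim$A) + 0.5 * as.integer(sim$B),
               sd = 1)

# 1. Exploratory data analysis: cell means and an interaction plot.
cell_means <- aggregate(y ~ A + B, data = sim, FUN = mean)
interaction.plot(sim$A, sim$B, sim$y)

# 2. Fit the model, then check the assumptions with the standard
#    residual diagnostic plots.
fit <- aov(y ~ A * B, data = sim)
par(mfrow = c(2, 2))
plot(fit)

# 3. Statistical inference: the ANOVA table holds the F statistics,
#    degrees of freedom, and p-values.
anova_tab <- anova(fit)

# 4. Follow-up procedure: Tukey pairwise comparisons for factor A.
tukey <- TukeyHSD(fit, which = "A")

# 5. Pull out the numbers needed for a conclusion in context.
report_A <- sprintf("A: F(%d, %d) = %.2f, p = %.4g",
                    anova_tab["A", "Df"], anova_tab["Residuals", "Df"],
                    anova_tab["A", "F value"], anova_tab["A", "Pr(>F)"])
print(report_A)
```

The printed line only collects the numbers. The written conclusion still has to interpret them in the context of the problem.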
Problem 1 (10pts) – Model Diagnostics
In this problem, an ANOVA model is fit to three data sets and the
diagnostic figures are provided. Comment on these
figures and discuss whether the underlying assumptions of ANOVA are
satisfied. Make sure to specify which, if any, assumptions are
violated.
Data 1 – Seaweed grazers
Description: To study the influence of ocean grazers
on the regeneration of seaweed in the intertidal zone, a researcher
scraped rock plots free of seaweed and observed the degree of
regeneration when certain types of seaweed-grazing animals were denied
access. The grazers were limpets (L), small fishes (f) and large fishes
(F). A plot was taken to be a square rock surface, 100 cm on each side.
Each plot received one of six treatments, named here by which grazers
were allowed access: LfF, fF, Lf, F, L and Control. Because the
intertidal zone is a highly variable environment, the researcher applied
the treatments in eight blocks of 12 plots each. Within each block, she
randomly assigned treatments to plots so that each treatment was applied
to two plots. The data set is in file seaweed.csv
Data Source: Ramsey, F.L. and Schafer, D.W. (2013),
“The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)”,
Cengage Learning.
Data 2 – Soybeans
Description: In a completely randomized design with a
2x3x5 factorial treatment structure, researchers randomly assigned one
of 30 treatment combinations to open-topped growing chambers, in which
two soybean cultivars were planted. The responses for each chamber were
the yields of the two types of soybean. The diagnostic figure provided
is for a model with only one factor. The data set is in file soybean1.csv
Data Source: Ramsey, F.L. and Schafer, D.W. (2013),
“The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)”,
Cengage Learning.
Data 3 – Mice lifetimes
Description: Female mice were randomly assigned to
six treatment groups to investigate whether restricting dietary intake
increases life expectancy. There are six diet treatments. The data set
is in file lifetime.csv
Data Source: Ramsey, F.L. and Schafer, D.W. (2013),
“The Statistical Sleuth: A Course in Methods of Data Analysis (3rd ed)”,
Cengage Learning.
Problem 2 (15pts)
Description: A completely randomized factorial
laboratory experiment was used to study the effects of laundering
cycles and garment-wash methods on the abrasion of denim jeans.
The garment-wash treatments were pre-washed, stone-washed, and
cellulase enzyme-washed jeans. The laundering cycles were zero
(control group), five, and 25. Edge abrasion is the measured response. A
total of 90 samples were utilized; 30 of each of the three garment
washed denim treatments (pre-washed, stone-washed, and cellulase enzyme
washed). From each group of 30 samples, ten samples were randomly
assigned to each of the three laundering cycles (0/5/25). Samples were
independently rated for edge abrasion after a fixed laundering interval.
The data set is in file denim_abrasion.csv
Below is the information for each column of the data set:
Laundry Cycles (1=Control (0 launderings), 2=5 launderings, 3=25 launderings)
Denim Treatment (1=Pre-washed, 2=Stone-washed, 3=Enzyme-washed)
Edge abrasion Score
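Because both factors are stored as numeric codes, they need to be converted to factors before fitting an ANOVA model. The sketch below illustrates this on simulated data standing in for denim_abrasion.csv. The column names `cycles_code`, `treatment_code`, and `abrasion` are assumptions made for the example (the actual column names in the file may differ), and the simulated scores carry no real information.

```r
# Simulated stand-in for the real file (hypothetical column names):
# 3 laundering levels x 3 denim treatments x 10 samples = 90 rows.
set.seed(1)
denim <- expand.grid(rep = 1:10, cycles_code = 1:3, treatment_code = 1:3)
denim$abrasion <- round(rnorm(nrow(denim), mean = 3, sd = 0.5), 2)

# Convert the numeric codes to labeled factors using the codebook above.
denim$cycles <- factor(denim$cycles_code, levels = 1:3,
                       labels = c("0", "5", "25"))
denim$treatment <- factor(denim$treatment_code, levels = 1:3,
                          labels = c("Pre-washed", "Stone-washed",
                                     "Enzyme-washed"))

# Check the design: a balanced layout has 10 samples in every cell.
cell_counts <- table(denim$cycles, denim$treatment)
print(cell_counts)

# Two-factor model with interaction, using the factor versions.
fit_denim <- aov(abrasion ~ cycles * treatment, data = denim)
df_factor <- anova(fit_denim)["cycles", "Df"]

# For contrast: leaving the codes numeric fits a straight-line slope
# (1 degree of freedom) instead of comparing the three group means.
fit_numeric <- lm(abrasion ~ cycles_code * treatment_code, data = denim)
df_numeric <- anova(fit_numeric)["cycles_code", "Df"]

print(c(as_factor = df_factor, as_numeric = df_numeric))
```

Comparing the two degrees-of-freedom values is a quick way to confirm that R is treating the coded columns as categorical.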
Data Source: A. Card, M.A. Moore, M. Ankeny (2006).
“Garment Washed Jeans: Impact of Launderings on Physical Properties,”
International Journal of Clothing Science and Technology, Vol. 18, 1/2,
pp. 43-52.
Problem 3 (15pts)
Description: The effect of germination time (48, 96,
and 144h) on malt quality of four sorghum varieties was investigated to
determine the potential of grain sorghum cultivars in the local brewery
industry. The four evaluated sorghum varieties were Gambella 1107,
Macia, Meko, and Red-Swazi. It is known that germination time will be
influenced by other environmental effects (temperature, humidity,
etc.). Due to limitations in the availability of equipment to perform
the experiment, 12 samples were randomly assigned, one to each treatment
combination, and the experiment was repeated at three distinct time points
(different days), resulting in 36 total observations. The data set is in file mat_var_germ1.csv
Below is the information for each column of the data set:
Variety (1-4 for 4 varieties)
Germination (1=48h, 2=96h, 3=144h)
Malting weight loss (MWL)
Time (1-3, three time points)
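One plausible way to set up this design in R is to treat the time point (day) as a block and variety and germination time as the two treatment factors. The sketch below assumes that reading and uses simulated data in place of mat_var_germ1.csv. The column names `variety`, `germination`, `time`, and `mwl` are hypothetical, and whether blocking on time is the right model is part of what the analysis should justify.

```r
# Simulated stand-in for the real file (hypothetical column names):
# 4 varieties x 3 germination times, run once on each of 3 days.
set.seed(2)
malt <- expand.grid(variety = 1:4, germination = 1:3, time = 1:3)
malt$mwl <- rnorm(nrow(malt),
                  mean = 10 + malt$germination + 0.5 * malt$time,
                  sd = 1)

# Convert all coded columns to factors using the codebook above.
malt$variety <- factor(malt$variety)
malt$germination <- factor(malt$germination, levels = 1:3,
                           labels = c("48h", "96h", "144h"))
malt$time <- factor(malt$time)

# Each treatment combination should appear once per time point.
runs_per_day <- table(malt$variety, malt$germination, malt$time)

# Block term first, then the factorial treatment structure.
fit_malt <- aov(mwl ~ time + variety * germination, data = malt)
malt_tab <- anova(fit_malt)
print(malt_tab)
```

With 36 observations, this model leaves 22 residual degrees of freedom (35 total, minus 2 for time, 3 for variety, 2 for germination, and 6 for the interaction).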
Data source:
A. Bekele, G. Bultosa, and K. Belete (2012). “The Effect of Germination
Time on Malt Quality of Six Sorghum (Sorghum Bicolor) Varieties Grown
at Melkassa, Ethiopia,” Journal of Brewing, Vol. 118, Issue 1, pp.
76-81.
Notes:
- Use headers to separate each question part, and label them meaningfully (e.g. “Problem 3, Part 2”). See in-class Markdown examples of this and use them in your assignment.
- All questions must include written answers in full problem context. Submitting only a Markdown document with compiled R code but no supporting answers will receive only limited credit.
- You will upload your final knitted HTML to Canvas for a grade. Make sure you place your name and homework number in the Markdown header, e.g.
- title: “Homework #3”
- author: “Your Name Here”
- date: “September .., 2020”
- output: html_document
Reminder: Assignments in STA 363 are
designed in such a way that we will be able to detect academic
dishonesty. If you turn in another student’s generated Markdown
document, we will know and proceed with an academic dishonesty claim.