# help with R project

The project should be a concise, well-written report of at least 10 pages.
Reports should be written in RStudio as an .Rnw file. You should submit a
PDF of your report and the R file used to generate your entire report.

Remark: The PDF will be the main source for the grade; however,
the submitted underlying .Rnw file must be compilable and correct.

In the end, the reports should contain:

• Graphs done in R
• Results of your computation with R
• Inferential statistics done with R
• Explanation/interpretation of your findings/results.

Think of the report as a “real-life project” that you do for a
company. This means that the reports should be presented nicely and
be readable by people with little statistical knowledge (so make sure you
clearly explain why you did what you did). Present your results so that
someone would be interested in reading them.

In addition to the charts you’ve already included, you should now:

• Calculate (and test) the correlation between two appropriate
variables. Compute the linear regression for the related pair, plot the
scatter plot together with the fitted regression line, and explain the
findings.
• Run an additional multiple (multivariable) linear regression by adding at
least one additional independent variable. The additional variable may be
numeric, or you can create a “dummy” variable by coding a
binary categorical variable with 0s and 1s. Discuss which independent
variables are significant. Discuss each coefficient, and briefly discuss
what it all means.
• Compute a 95% confidence interval for the parameter, p, of a
categorical variable with two outcomes. Explain what the confidence
interval is in general, and discuss what your result means explicitly.
• Compare the confidence intervals of the mean incomes of two subgroups
(e.g., male vs. female, college vs. no college, etc.). Choose subgroups
that best suit the other points of your project you discussed so far.
Interpret the result.
• Test the difference between the means of two populations. Make sure to
also run a test of the two variances, so that you know whether to treat the
variances of the two populations as equal or unequal when testing the means.
• Use the R function prop.test() to compare two proportions. Interpret your results.
• Use the R function chisq.test() to test two nominal variables for independence. Interpret your results. 
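
The correlation and regression items above could be approached roughly as in the sketch below. It is only an illustration under stated assumptions: the built-in mtcars data set stands in for the project data (which is not shown here), mpg plays the role of the dependent variable, wt and hp are numeric predictors, and am is an existing 0/1 dummy variable.

```r
# Stand-in data (assumption): built-in mtcars; swap in your own data frame.
dat <- mtcars

# Correlation between two numeric variables, with a significance test
cor_res <- cor.test(dat$mpg, dat$wt)
print(cor_res)

# Simple linear regression for the same pair
fit_simple <- lm(mpg ~ wt, data = dat)
print(summary(fit_simple))

# Scatter plot with the fitted regression line drawn on top
plot(mpg ~ wt, data = dat,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit_simple, col = "red", lwd = 2)

# Multiple regression: one more numeric predictor plus a 0/1 dummy (am)
fit_multi <- lm(mpg ~ wt + hp + am, data = dat)
coef_table <- summary(fit_multi)$coefficients  # estimate, std. error, t, p
print(coef_table)
```

In coef_table, the "Pr(>|t|)" column holds the p-values used to judge which independent variables are significant. Each estimate is the expected change in the dependent variable for a one-unit change in that predictor, holding the other predictors fixed.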
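
For the confidence-interval and hypothesis-test items, a minimal sketch in R might look like the following. The data are again an assumption: the built-in mtcars data set is used as a stand-in, with mpg in place of a numeric variable such as income and the 0/1 variables am and vs in place of the two-outcome categorical variables. The 0.05 cutoff used to choose between the equal-variance and Welch t-tests is a conventional choice, not something the assignment specifies.

```r
# Stand-in data (assumption): built-in mtcars; am and vs are 0/1 variables,
# mpg stands in for a numeric variable such as income.
dat <- mtcars

# 95% confidence interval for a single proportion p (share of am == 1)
ci_p <- prop.test(sum(dat$am == 1), nrow(dat), conf.level = 0.95)$conf.int

# 95% confidence intervals for the mean in each subgroup
ci_group0 <- t.test(dat$mpg[dat$am == 0])$conf.int
ci_group1 <- t.test(dat$mpg[dat$am == 1])$conf.int

# F test of equal variances, then the matching two-sample t-test
var_res <- var.test(mpg ~ am, data = dat)
equal_var <- var_res$p.value > 0.05  # conventional cutoff (assumption)
t_res <- t.test(mpg ~ am, data = dat, var.equal = equal_var)

# Two proportions: share of am == 1 within vs == 0 and within vs == 1
tab <- table(vs = dat$vs, am = dat$am)
prop_res <- prop.test(x = tab[, "1"], n = rowSums(tab))

# Chi-squared test of independence for the two nominal variables
chi_res <- chisq.test(tab)

print(list(ci_p, ci_group0, ci_group1, var_res, t_res, prop_res, chi_res))
```

If the two subgroup intervals for the mean overlap heavily, that is an informal hint that the means may not differ much; the formal answer comes from the p-value and confidence interval reported in t_res.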