STATS 2DA3 Fall 2024
ASSIGNMENT 1
1. (10 MARKS) Using the iris dataset which is available in R, answer the following questions:
(a) Use one or two lines of R code to display how many rows and columns are in the dataset. (i.e. do not just output all observations in the dataset. Write some code that will output the required information).
(b) Which variables are categorical and which are continuous?
(c) Graph 1: Using the ggplot function, make a scatterplot of “Sepal.Length” against “Petal.Length” (putting “Sepal.Length” on the x-axis).
• Make the data points blue.
• Label the x-axis Sepal Length.
• Label the y-axis Petal Length.
• Label the graph Iris Data.
(d) Graph 2: Use ggplot to make a bar chart (geom_bar) displaying “Species”. Set “fill” to “Species” (i.e. each species of iris should be a different colour on the graph).
(e) Display graphs 1 and 2 in one image using R code (i.e. do not just screen grab the 2 images and combine them).
2. (3 MARKS) Consider the plot below; it displays information on Vehicle Type and on the associated drive train. There are 3 different types of drive train: 4 = four wheel drive, f = front wheel drive, r = rear wheel drive.
(a) Which Vehicle Type has the fewest observations associated with it in the dataset?
(b) For “suv” vehicles, what is the majority drive train type?
(c) For “compact” vehicles, which of the 3 drive train types occurs least often?
3. (7 MARKS)
For the Arthritis dataset in the vcd package [there are 3 different levels of improvement (None, Some or Marked) that a patient can experience after receiving 1 of 2 medical treatments (Placebo or Treated)], perform the following tasks:
(a) Create a Double Decker plot, displaying “Improved” as a function of “Treatment” and “Sex”. (“Treatment” should be on the lowest x-axis.) Colour the “Improved” variable so that each level is a different colour.
(b) For female patients in the Treated group, what was the most reported level of improvement?
(c) For male patients in the Treated group, what was the least reported level of improvement?
(d) Using ggplot, make a bar chart (geom_bar) displaying “Treatment”. Colour (“fill”) the “Treatment” variable with respect to the “Improved” variable.
Assignment Standards
• Answer each question. Do not just provide code. Any graphs must be rendered and reproduced in the report.
• LaTeX is strongly recommended but not strictly required. The use of R Markdown in RStudio is also recommended.
• Submit your assignment as one .pdf document. All R code should be included and organized either at the end of the assignment or inline (if using R Markdown).
• Approximately eleven-point font (Times or similar) must be used with around 1.5 line spacing and margins of at least 1 inch all around.
• Do not include a title page. The title and your name should be printed at the top of the first page.
• Various tools, including publicly available internet tools, may be used by the instructor to check the originality of submitted work.
• Students are not permitted to use generative AI in this course. In alignment with McMaster academic integrity policy, it “shall be an offence knowingly to . . . submit academic work for assessment that was purchased or acquired from another source”. This includes work created by generative AI tools. Also stated in the policy is the following, “Contract Cheating is the act of “outsourcing of student work to third parties” (Lancaster & Clarke, 2016, p. 639) with or without payment.” Using Generative AI tools is a form of contract cheating. Charges of academic dishonesty will be brought forward to the Office of Academic Integrity.