
Problem Set 1 Installing MikTex

Posted: 2024-10-10 13:52:23

Problem Set 1 XXX Due: 10/10/2024

Introduction

This document was produced by R using RMarkdown. To complete this week's assignment, we will ask you to complete a series of analytical and coding exercises. The Analytical Exercises require no coding, whereas the Coding Exercises require you to use R. The nice thing about RMarkdown is that you can do both your analytical and coding exercises in the same document. For each part in the Coding Exercises, we provide an empty code chunk (an area highlighted in grey with a header of the form "## ~~ Problem XX ~~ ##").

To ease your introduction into R, Problem 2 is a short tutorial on the R programming environment. Hopefully, you have already downloaded RStudio on your computer. If not, please go do that now. You can download the latest version at https://www.rstudio.com/products/rstudio/download/.

Once you have downloaded RStudio, you will be able to open the R Markdown script (the assignment_1.rmd file) that created this assignment_1.pdf file. We ask you to fill in your code in the code chunk sections (the areas highlighted in gray and bounded by backtick marks) in the .rmd file in each of the subparts of the questions. You will see that you can include LaTeX code in these documents. You are not required to use LaTeX to do the analytical exercises (i.e. those without coding), but it is good practice to improve your LaTeX skills.

In order to compile a markdown document (i.e. turn your code into a pdf file), you must have a version of LaTeX installed on your computer. I suggest you download MiKTeX (https://miktex.org/howto/install-miktex).

If at any time you are confused about R, or not sure what a command does or which additional arguments are available for it, there are two tried and true methods to resolve the issue. In the R console, you can use the help command, where you supply the name of the command you are confused about. Alternatively, Google is your friend.

Installing MikTex

R Markdown was recently updated, and this update has issues with missing packages in MiKTeX. To deal with these issues, you need to allow MiKTeX to install any missing packages without asking you first. To do so, when you are installing MiKTeX, the 'Settings' screen asks whether to "Install missing packages on-the-fly". Please select Yes in this screen. If you have already installed MiKTeX, you can go to the MiKTeX console -> Settings, where the same option appears.

Packages to Install

Each week, we will list the packages that you need to install into R in order to complete the assignments. This also gives you a handy record of the packages you have learned throughout this course. The packages used this week are

  • stats
  • ggplot2 (optional)
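
A minimal sketch of how you might check whether the packages listed above are already available before installing anything. The helper name `missing_packages` is hypothetical and not part of the assignment. Note that `stats` ships with base R, so in practice only ggplot2 should ever need installing.

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("stats", "ggplot2")
to_install <- missing_packages(needed)

# Uncomment to install whatever is missing (requires an internet connection).
# if (length(to_install) > 0) install.packages(to_install)
```

The install call is left commented out so the chunk has no side effects when the document is knitted.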

Code Setup

## This is a code chunk: it is outlined in grey and has R code inside of it
## Note: you can change what is shown in the final .pdf document using arguments
## inside the curly braces at the top {r, comment='\t\t'}. For example, you
## can hide the code itself in the .pdf by adding 'echo=FALSE',
## i.e. changing the header to {r, comment='\t\t',echo=FALSE}

## ~~~~~~~~~~~~~~ CODE SETUP ~~~~~~~~~~~~~~ ##
# ~ This bit of code will be hidden after Problem Set 1 ~
#
# This section sets up the correct directory structure so that
# the working directory for your code is always in the data folder

# Retrieve the code working directory
#script_dir = dirname(sys.frame(1)$ofile)
initial_options <- commandArgs(trailingOnly = FALSE)
render_command <- initial_options[grep('render',initial_options)]
# Extract the first single-quoted string (the script path) from the render call
script_name <- gsub("'", "",
                    regmatches(render_command,
                               gregexpr("'([^']*)'",
                                        render_command))[[1]][1])

# Determine OS (backslash versus forward slash directory system)
sep_vals = c(length(grep('\\\\',script_name))>0,length(grep('/',script_name))>0)
file_sep = c("\\\\","/")[sep_vals]

# Get data directory
split_str = strsplit(script_name,file_sep)[[1]]
len_split = length(split_str) - 2
data_dir = paste(c(split_str[1:len_split],'data',''),collapse=file_sep)

# Check that the data directory contains the files for this week's assignment
data_files = list.files(data_dir)
if(any(sort(data_files)!=sort(c('us_air_rev.csv','us_load_factor.csv')))){
  cat("ERROR: DATA DIRECTORY NOT CORRECT\n")
  cat(paste("data_dir variable set to: ",data_dir,collapse=""))
}

Problem 1 (Analytical Exercise)

Consider a simple AR(1) model:

$$Y_t = \alpha Y_{t-1} + \varepsilon_t$$

with $\varepsilon_t \sim N(0, \sigma^2)$ for $t = 1, \ldots, T$ and $Y_0 = 0$.

  1. What is the distribution of $Y_1$? What is the distribution of $Y_2$?
  2. What is the distribution of $Y_t$ for $|\alpha| < 1$ as $t \to \infty$?
  3. What is the definition of stationarity? Explain why in this model we can check for stationarity by looking at the mean and the variance of $Y_t$.
  4. Suppose that $\alpha = 1$. Why does this imply that the model is nonstationary? Can you think of a simple transformation that makes the model stationary?
  5. Now suppose that $|\alpha| < 1$. Find a formula for the jth autocorrelation $\rho_j = \mathrm{corr}(Y_t, Y_{t-j})$.
  6. Explain how we could use estimates of $\rho_j$ for $j = 1, 2, \ldots$ to check whether some actual time series data was generated by an AR(1) model like the one described above.
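
To build intuition for the model before working through the questions, here is a minimal simulation sketch. The function name `simulate_ar1`, the seed, and the parameter values are illustrative assumptions, not part of the assignment. It draws the shocks, applies the recursion with $Y_0 = 0$, and computes sample autocorrelations with the built-in `acf` function.

```r
# Hypothetical helper: simulate n observations of Y_t = alpha * Y_{t-1} + eps_t
# with eps_t ~ N(0, sigma^2) and Y_0 = 0.
simulate_ar1 <- function(n, alpha, sigma = 1, seed = 1) {
  set.seed(seed)
  eps <- rnorm(n, mean = 0, sd = sigma)
  y <- numeric(n)
  y_prev <- 0  # initial condition Y_0 = 0
  for (t in seq_len(n)) {
    y[t] <- alpha * y_prev + eps[t]
    y_prev <- y[t]
  }
  y
}

y <- simulate_ar1(500, alpha = 0.5)

# Sample autocorrelations at lags 0 through 5 (no plot is drawn).
sample_acf <- acf(y, lag.max = 5, plot = FALSE)
```

Changing `alpha` and re-running is a quick way to see how persistence in the series changes; it does not replace the analytical answers asked for above.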

Problem 2 (Coding Exercise)

This problem will take you through a few tasks to familiarize yourself with R, as well as some basic time series concepts:

(a) Loading data into R

(b) Doing simple data analysis

(c) Doing time series analysis

For this problem, we have pulled two separate datasets from the FRED database, maintained by the Federal Reserve Bank of Saint Louis (https://fred.stlouisfed.org/). The datasets cover the aggregate revenue and load factor in domestic US flights from 2000 to 2018. In the last two decades, airlines have begun using sophisticated algorithms to increase capacity utilization of flights (i.e. flights tend to be more full). Furthermore, during the same time period, airline revenues have increased. The point of this exercise will be to understand the role of these productivity increases in "explaining" increased revenues in the airline industry.

The two separate datasets you will be working with are:

  1. US Domestic Air Travel Revenue Passenger Miles (filename = us_air_rev.csv): this dataset contains monthly data detailing the number of miles traveled by paying passengers in domestic US air travel.
  2. US Domestic Air Travel Load Factor (filename = us_load_factor.csv): this dataset contains monthly data detailing the percentage of seats filled (capacity utilization) in domestic US air travel.

A Small Detour: Brief introduction to print statements

We ask you to print a number of your results in this exercise. In R, there are two different ways to print results:

  1. print
  2. cat

There are some deep programmatic differences underlying what each of these does; for our purposes we only care about how easy your outputs are to read.

Printing Strings

Let's say you have a list of numbers, [4,5,6], and I want you to print out the following statements:

The first element of the list is: 4
The second element of the list is: 5
The third element of the list is: 6

Below I show you three ways to do so. The first way simply uses print without any additional arguments. The second way uses print with an additional argument, quote=FALSE, which gets rid of the quotes around the strings. The third approach, using cat, also drops the quotes and additionally omits the leading [1] index, which makes printed output easier to read.

## Define a list called x with 3 elements
x = c(4,5,6)

## Retrieve 1st, 2nd, 3rd element of list
first_elem  = x[1]  #1st element
second_elem = x[2]  #2nd element
third_elem  = x[3]  #3rd element

## Create strings stating 'The XXXX element of the list is:'
first_str  = 'The first element of the list is:'
second_str = 'The second element of the list is:'
third_str  = 'The third element of the list is:'

## Concatenate the list elements and the string to create the whole phrase
first_phrase  = paste(first_str,first_elem,sep=' ')
second_phrase = paste(second_str,second_elem,sep=' ')
third_phrase  = paste(third_str,third_elem,sep=' ')

## ~~ (1) Print without extra arguments ~~ ##
print('~~ (1) Print without extra arguments ~~')
print(first_phrase)
print(second_phrase)
print(third_phrase)

## ~~ (2) Print with extra argument turning off quotes ~~ ##
print('~~ (2) Print with extra argument turning off quotes ~~',quote=FALSE)
print(first_phrase, quote=FALSE)
print(second_phrase, quote=FALSE)
print(third_phrase, quote=FALSE)

## ~~ (3) Print without quotes and without trailing # ~~ ##
cat("\n")
cat("~~ (3) Print without quotes and without trailing # ~~\n")
cat(paste(first_phrase,"\n",sep=''))
cat(paste(second_phrase,"\n",sep=''))
cat(paste(third_phrase,"\n",sep=''))

[1] "~~ (1) Print without extra arguments ~~"
[1] "The first element of the list is: 4"
[1] "The second element of the list is: 5"
[1] "The third element of the list is: 6"
[1] ~~ (2) Print with extra argument turning off quotes ~~
[1] The first element of the list is: 4
[1] The second element of the list is: 5
[1] The third element of the list is: 6

~~ (3) Print without quotes and without trailing # ~~
The first element of the list is: 4
The second element of the list is: 5
The third element of the list is: 6

Printing Dataframes

The main object you will be working with in R is called a dataframe (think of an Excel spreadsheet). We will discuss these objects more fully in the following section. However, you will often be asked to print out dataframes. In this case, using print is your best option. This is due to differences between cat and print that are beyond the scope of this introduction.
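
A small sketch of why print is the safer choice for dataframes. The dataframe `toy_df` and its column names are made up for illustration. `print` lays the object out as a table, whereas `cat` only handles simple vectors and typically stops with an error when handed a dataframe, so the call is wrapped in `tryCatch` here.

```r
# Illustrative dataframe (not one of the assignment datasets).
toy_df <- data.frame(month = c("Jan", "Feb", "Mar"),
                     value = c(4, 5, 6))

# print shows a header row followed by one row per observation.
print(toy_df)

# Capture the printed lines so they can be inspected programmatically.
printed_lines <- capture.output(print(toy_df))

# cat is not designed for dataframes; record whether the call fails.
cat_result <- tryCatch({
  cat(toy_df)
  "cat succeeded"
}, error = function(e) "cat failed")
```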

(a) Loading Data

The first thing we want you to do is to load both datasets, us_air_rev.csv and us_load_factor.csv, into R.

Please load the data in the section below.

## ~~ Problem 2: Part (a) Load Data into R ~~ ##

## Change working directory to data directory
setwd(data_dir) # <- This is set in CODE SETUP section
# If you are having issues, you can manually
# set this variable to the folder where the data is

## Load Air Revenue Data ##

## Load Air Load Factor Data ##
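
As a hedged illustration of reading a CSV file with the built-in `read.csv` function, the sketch below writes a tiny stand-in file to a temporary location and reads it back. The column names `DATE` and `VALUE` and the numbers are assumptions made up for this example; check the real files for their actual headers.

```r
# Write a tiny stand-in CSV to a temporary file (the real files live in data_dir).
tmp_file <- tempfile(fileext = ".csv")
writeLines(c("DATE,VALUE",
             "2000-01-01,100.5",
             "2000-02-01,98.2"), tmp_file)

# read.csv returns a dataframe; stringsAsFactors = FALSE keeps text as character.
toy_df <- read.csv(tmp_file, header = TRUE, stringsAsFactors = FALSE)

# Convert the date column from character to Date for later time series work.
toy_df$DATE <- as.Date(toy_df$DATE)
```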

There are two ways to view data that you have loaded into memory in R.

  1. View only the first (or last) few rows using the head (or tail) commands
  2. View the entire dataset in a separate window using the View command

Note, for very large datasets it is not a good idea to use the View command as it is very memory (RAM) intensive.

Other checks you always want to do when loading data include:

  1. Check the column names using colnames
  2. Check the data types for each column using a loop and xxx
  3. Check the dimension (number of rows and columns) using the dim command

We now want you to run the following checks on both of your loaded datasets:

(1) Print the column names.

(2) Print off the first 20 rows.

(3) Print off the number of rows and columns.

(4) Print the data types of all the columns.

Note, for part (4) I have already built the for loop statement to get all the data types for each of the columns. For those familiar with for loops in other environments, R has a built-in set of apply functions that are tailored to specific kinds of output (lapply returns a list, vapply returns a vector of a declared type, etc.). If you are unfamiliar with for loops, give it a google.

## ~~ Problem 2: Part (a) Run Data Checks ~~ ##
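
The checks described above can be sketched on a made-up dataframe as follows. The name `toy_df`, its columns, and the use of `class` to get the type of each column are assumptions for illustration only; the assignment's own loop may use a different function.

```r
# Illustrative dataframe standing in for a loaded dataset.
toy_df <- data.frame(
  DATE  = as.Date(c("2000-01-01", "2000-02-01", "2000-03-01")),
  VALUE = c(100.5, 98.2, 101.7)
)

col_names  <- colnames(toy_df)   # column names
first_rows <- head(toy_df, 2)    # first 2 rows (use 20 for the assignment)
dims       <- dim(toy_df)        # c(number of rows, number of columns)

# Column types with an explicit for loop ...
col_types <- character(ncol(toy_df))
for (i in seq_along(toy_df)) {
  col_types[i] <- class(toy_df[[i]])[1]
}

# ... and the same result with vapply.
col_types_apply <- vapply(toy_df, function(col) class(col)[1], character(1))
```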

(b) Doing simple data analysis

In the next part, we will have you do some actual time series analysis. Generally, though, we are interested in decomposing time series into trend, seasonal and stochastic components. One clear form of seasonality is month-to-month variation in the data. An "approximation" for trend components is to look at year-to-year changes. We will have you investigate these below.

We now want you to do the following:

(1) Calculate the average revenue and load factor, by year. Do this two ways: (1) Using aggregate and mean, (2) Using aggregate and sum.

(2) Calculate the average revenue and load factor, by month. Do this two ways: (1) Using aggregate and mean, (2) Using aggregate and sum.

(3) Plot graphs for parts (1) and (2) on the same plot, using your favorite plotting function. Note, you can either use the built-in plot function or the popular external library ggplot2.

For parts (1) and (2), I want you to build a better understanding of using R. I am asking you to compute averages using two different methods. In Method (1), you can use the built-in mean function to have R do the work for you. In Method (2), you will do the average calculation yourself by summing over observations and dividing by the number of observations.
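
A minimal sketch of the two averaging methods on made-up data. The dataframe `toy` and its columns `year`, `month` and `value` are hypothetical; the real datasets will need a year column derived from their date column first.

```r
# Illustrative data: two observations per year.
toy <- data.frame(year  = c(2000, 2000, 2001, 2001),
                  month = c(1, 2, 1, 2),
                  value = c(10, 20, 30, 50))

# Method (1): let aggregate apply mean within each year.
avg_mean <- aggregate(value ~ year, data = toy, FUN = mean)

# Method (2): sum within each year, then divide by the number of observations.
totals <- aggregate(value ~ year, data = toy, FUN = sum)
counts <- aggregate(value ~ year, data = toy, FUN = length)
avg_manual <- totals$value / counts$value
```

Both methods should agree; grouping by `month` instead of `year` gives the monthly version.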

## ~~ Problem 2: Part (b) Simple Data Analysis ~~ ##

(c) Doing time series analysis

In R, there are already built-in functions that allow us to do these seasonality and trend decompositions with far fewer lines of code. To use them, we must convert our data into time series objects. What separates a normal vector of data from a vector of time series data is that the latter has some time frequency of observations. In our case, the time frequency is monthly.

We now want to return to the main question of this section: how much does capacity utilization explain increases in airline revenue?

Fixing notation, we have:

  • $t \in \{01/2000, \ldots, 12/2018\} = T$ is the set of month-year combinations
  • $Rev_t$ is revenue for each month-year combination
  • $Load_t$ is load factor for each month-year combination

We now want you to do the following:

(1) Create a time series object using the ts command for each series. Be sure to specify the correct frequency for the data.

(2) Plot an autocorrelation function between our two time series: $\{Rev_t\}_{t \in T}$ and $\{Load_t\}_{t \in T}$.

(3) Run the following linear regression, reporting the coefficients and $R^2$: $Rev_t = \alpha + \beta Load_t + \epsilon_t$

(4) Decompose both series into cyclical and trend components using the decompose command. Plot these cyclical and trend components separately for each of the series.

(5) Using the dataframe created in part (3), redo parts (2) and (3). What differs from part (2)? Why? What can we conclude about the impact of capacity utilization changes on revenues?
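
The sketch below shows the mechanics of `ts`, `decompose` and `lm` on simulated monthly data. Every series here (`toy_rev`, `toy_load`) is made up for illustration and has nothing to do with the FRED data, so the numbers it produces say nothing about the actual question.

```r
set.seed(1)
n_months <- 48  # four full years of monthly observations

# Made-up series: linear trend plus a 12-month seasonal pattern plus noise.
trend    <- seq(100, 147, length.out = n_months)
seasonal <- rep(5 * sin(2 * pi * (1:12) / 12), times = 4)
toy_rev  <- ts(trend + seasonal + rnorm(n_months),
               start = c(2000, 1), frequency = 12)
toy_load <- ts(0.5 * trend + rnorm(n_months),
               start = c(2000, 1), frequency = 12)

# Split the series into trend, seasonal and random components.
parts <- decompose(toy_rev)

# Regress one series on the other and pull out coefficients and R^2.
fit       <- lm(as.numeric(toy_rev) ~ as.numeric(toy_load))
coefs     <- coef(fit)
r_squared <- summary(fit)$r.squared
```

`frequency = 12` tells R that twelve observations make up one cycle; `decompose` needs at least two full cycles to estimate the seasonal component.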

## ~~ Problem 2: Part (c) Time Series Analysis ~~ ##

Problem 3 (Coding Exercise)

In class, you have learned about the Wold decomposition, a fundamental result in time series analysis. This exercise will attempt to walk through Wold's theorem in practice. We have provided simulated time series data, in an .rda file called "ts_simulation.rda", where $Y_t$ is the t-th observation from our data. To open this file, use the load command. The name of the dataframe is "sim".
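
For readers who have not used .rda files before, here is a minimal round-trip sketch with a temporary file. The object name `sim_demo` and its contents are made up; the assignment file contains an object called `sim` instead. Unlike `read.csv`, `load` restores objects under their saved names and returns those names as a character vector.

```r
# Illustrative object saved to a temporary .rda file.
sim_demo <- data.frame(Y = c(0.3, -1.2, 0.8))
tmp_rda  <- tempfile(fileext = ".rda")
save(sim_demo, file = tmp_rda)

# Remove the object, then restore it from disk.
rm(sim_demo)
loaded_names <- load(tmp_rda)  # returns the names of the restored objects
```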

We now want you to do the following:

(1) Verify the stationarity of the process. Do this in two ways:

(a) “Heuristic” : show that the first-moment and second-moment do not depend on t.

(b) “Testing” : use a Dickey-Fuller test to test for stationarity. Interpret your results.

(2) Estimate three separate autoregressive models: AR(1), AR(3) and AR(6). For each of the separate models, retrieve the residuals, $\hat{\epsilon}_{t,p}$, where p is the order of the AR process. Using each set of residuals of the AR process, estimate an MA(2) model, where $\hat{\eta}_{t,p,q}$ are the residuals of this second step. Verify whether the assumptions of Wold are violated:

$$\mathrm{Corr}[\epsilon_{t,p}, Y_s] = 0 \text{ for } s < t$$
$$E[\eta_{t,p,q}] = 0$$
$$\mathrm{Var}(\eta_{t,p,q}) = \sigma^2$$

(3) To find the right ARMA(p,q) process, we add new lags (increase p), estimate our model, use an information criterion to determine the increase in fit and stop once new models do not improve fit. To simplify the problem, assume q = 2. Build a series of ARMA(p,q) models, using the Akaike Information Criterion (AIC) to find the right p. (Note: A for loop over p would be a good idea.)

## ~~ Problem 3: Wold-Decomposition ~~ ##
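
A hedged sketch of a for loop over p using the built-in `arima` function. The series `toy_y` is simulated here with `arima.sim` and is not the assignment data, and the upper bound `max_p` is an arbitrary assumption. For simplicity this version compares the AIC over a fixed range of p rather than stopping at the first p that fails to improve fit.

```r
set.seed(42)

# Made-up ARMA(1,2) series standing in for the assignment data.
toy_y <- arima.sim(model = list(ar = 0.5, ma = c(0.4, 0.2)), n = 300)

max_p    <- 4                      # arbitrary upper bound on the AR order
aic_by_p <- rep(NA_real_, max_p)   # NA marks models that failed to fit

for (p in seq_len(max_p)) {
  # Fit ARMA(p, 2); keep going if a particular order fails to estimate.
  fit <- tryCatch(arima(toy_y, order = c(p, 0, 2), method = "ML"),
                  error = function(e) NULL)
  if (!is.null(fit)) {
    aic_by_p[p] <- fit$aic
  }
}

# Order with the smallest AIC among the models that were fitted.
best_p <- which.min(aic_by_p)
```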

From: https://www.cnblogs.com/comp9313/p/18456078
