BUSS6002 Assignment
Semester 2, 2024
Instructions
- Due: at 23:59 on Friday, October 25, 2024 (end of week 12).
- You must submit a written report (in PDF) with the following filename format, replacing STUDENTID with your own student ID: BUSS6002 STUDENTID.pdf.
- You must also submit a Jupyter Notebook (.ipynb) file with the following filename format, replacing STUDENTID with your own student ID: BUSS6002 STUDENTID.ipynb.
- There is a limit of 6 A4-pages for your report (including equations, tables, and captions).
- Your report should have an appropriate title (of your own choice).
- Do not include a cover page.
- All plots, computational tasks, and results must be completed using Python.
- Each section of your report must be clearly labelled with a heading.
- Do not include any Python code as part of your report.
- All figures must be appropriately sized and have readable axis labels and legends (where applicable).
- The submitted .ipynb file must contain all the code used in the development of your report.
- The submitted .ipynb file must be free of any errors, and the results must be reproducible.
- You may submit multiple times but only your last submission will be marked.
- A late penalty applies if you submit your assignment late without a successful special consideration. See the Unit Outline for more details.
- Generative AI tools (such as ChatGPT) may be used for this assignment but you must add a statement at the end of your report specifying how generative AI was used. E.g., Generative AI was only used for editing the final report text.
- Hint! It is highly recommended that you finish the week 10 tutorial before starting this
assignment.
Description
One of the UN Sustainable Development Goals is ‘climate action’ (goal 13). In this assignment, you are conducting a study that compares the predictive performance of three families of basis functions (polynomial, piece-wise constant, and piece-wise linear) for a linear basis function (LBF) model between time and temperature.

You are provided with the ERA5 surface air temperature dataset, which is widely used in climate research, weather forecasting, and environmental monitoring. The dataset contains 1,017 observations of monthly surface air temperature in degrees Celsius (temp) from January 1940 to September 2024. It also contains the year (year) and month (month) for which the temperature is observed. A scatter plot of the dataset is shown in Figure 1.

Figure 1: ERA5 surface air temperature from January 1940 to September 2024.

The specific LBF model being considered in your study is given by

$$y = u^\top \alpha + \phi(x)^\top \beta + \varepsilon,$$

where $y$ is the surface air temperature, $x$ is the year, and $\varepsilon$ is random noise; $u := [u_2, \dots, u_{12}]^\top$ is a binary vector of dummy variables, with $u_i = 1$ if $y$ is observed in month $i$ and $u_i = 0$ otherwise; $\phi(x)$ denotes the vector of basis function values; and the parameter vectors are $\alpha$ and $\beta$.

Three families of basis functions are considered for computing $\phi(x)$. The first family is the set of polynomial basis functions

$$\phi(x) := [1, \phi_1(x), \dots, \phi_p(x)]^\top, \quad \text{with } \phi_i(x) := x^i.$$

The second family is the set of piece-wise constant basis functions

$$\phi(x) := [1, \gamma_1(x), \dots, \gamma_k(x)]^\top.$$

The break points $\{t_i\}_{i=1}^{k}$ are calculated according to Equation (1), where $x_{\min}$ and $x_{\max}$ denote the smallest and largest observed values of $x$, respectively. The third family is the set of piece-wise linear basis functions

$$\phi(x) := [1, x, \lambda_1(x), \dots, \lambda_k(x)]^\top, \quad \text{with } \lambda_i(x) := (x - t_i)\, I(x > t_i),$$

where $t_i$ is given by Equation (1).

Before comparing the three basis function families, you must set the degree $p$ for the polynomial model, and the number of break points $k$ for the piece-wise constant and piece-wise linear models. The hyperparameter value for each basis function family should be selected using a validation set, by minimising the validation mean squared error (MSE).

For the polynomial model, the optimal value of $p$ should be selected by exhaustively searching through an equally-spaced grid from 1 to 10, with a spacing of 1: $P := \{1, 2, \dots, 10\}$. For the two piece-wise models, you should select the optimal values of $k$ by exhaustively searching through another equally-spaced grid from 1 to 30, with a spacing of 1: $K := \{1, 2, \dots, 30\}$.

Once the optimal values of the hyperparameters are chosen for all basis function families, you will be able to compare the predictive performance between the three using a test set (i.e., by comparing the test MSE between the three optimally selected models).
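As a rough illustration of the model components defined above, the following sketch builds the month dummy matrix and two of the basis families with NumPy. The function names (`month_dummies`, `poly_basis`, `pw_linear_basis`) are hypothetical, and the break points are passed in as an argument `t` rather than computed, since they are meant to come from Equation (1). The piece-wise constant family is left out because its definition of γ_i(x) is not restated here.

```python
import numpy as np


def month_dummies(month):
    """Binary matrix of shape (n, 11): one column per month 2..12.

    January is the baseline, so a January row is all zeros.
    """
    month = np.asarray(month)
    return (month[:, None] == np.arange(2, 13)[None, :]).astype(float)


def poly_basis(x, p):
    """Polynomial basis [1, x, x^2, ..., x^p], shape (n, p + 1)."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, N=p + 1, increasing=True)


def pw_linear_basis(x, t):
    """Piece-wise linear basis [1, x, (x - t_1) I(x > t_1), ...].

    `t` holds the break points; it is an input here (assumption: the
    caller computes it from Equation (1)).
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    # (x - t_i) * I(x > t_i) equals max(x - t_i, 0)
    hinges = np.maximum(x[:, None] - t[None, :], 0.0)
    return np.column_stack([np.ones_like(x), x, hinges])
```

Each function returns one row per observation, so the full design matrix for the LBF model can be formed by placing the dummy columns next to the basis columns.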
Report Structure
Your report must contain the following four sections:
Report Title
1 Introduction (0.5 pages)
– Provide a brief project background so that the reader of your report can understand the general problem that you are solving.
– Motivate your research question.
– State the aim of your project.
– Provide a short summary of each of the rest of the sections in your report (e.g., “The
report proceeds as follows: Section 2 presents . . . ”).
2 Methodology (2 pages)
– Define and describe the LBF model.
– Define and describe the three choices of basis function families being investigated.
– Describe how the parameter vectors α and β are estimated given the hyperparameter value. Discuss any potential numerical issues associated with the estimation procedure.
– Describe how the hyperparameter value can be determined automatically from data (as
opposed to manually setting the hyperparameter to an arbitrary value).
– Describe how the performance of the three families of basis functions is compared given the optimal hyperparameter value.
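The estimation and selection steps listed for the Methodology section could be sketched as below. This is one possible approach under stated assumptions, not a prescribed method: it assumes α and β are estimated jointly by ordinary least squares via `np.linalg.lstsq`, and that data are passed as `(U, x, y)` tuples. The helper names (`rescale`, `fit_lbf`, `mse`, `select_hyperparameter`) are hypothetical. The `rescale` helper reflects a common precaution rather than a stated requirement: raw powers of the year grow very large, which can make the least squares problem badly conditioned.

```python
import numpy as np


def rescale(x, x_min, x_max):
    """Map x to [0, 1] (assumed precaution against huge powers of the year)."""
    return (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)


def fit_lbf(U, Phi, y):
    """Least squares estimates of (alpha, beta) for y = U alpha + Phi beta."""
    X = np.column_stack([U, Phi])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    n_dummies = U.shape[1]
    return coef[:n_dummies], coef[n_dummies:]


def mse(U, Phi, y, alpha, beta):
    """Mean squared error of the fitted model on (U, Phi, y)."""
    resid = y - (U @ alpha + Phi @ beta)
    return float(np.mean(resid ** 2))


def select_hyperparameter(grid, basis, train, valid):
    """Exhaustive grid search minimising the validation MSE.

    basis(x, h) must return the basis matrix for hyperparameter value h;
    train and valid are (U, x, y) tuples.
    """
    U_tr, x_tr, y_tr = train
    U_va, x_va, y_va = valid
    scores = {}
    for h in grid:
        alpha, beta = fit_lbf(U_tr, basis(x_tr, h), y_tr)
        scores[h] = mse(U_va, basis(x_va, h), y_va, alpha, beta)
    best = min(scores, key=scores.get)
    return best, scores
```

The same `mse` helper can be reused on the test set once each family's hyperparameter has been fixed, so the three selected models are compared on data that played no part in fitting or selection.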
3 Empirical Study (2.5 pages)
– Describe the datasets used in your study.
– Present (in a table) the selected hyperparameter value for each basis function family.
– Describe and discuss the table of selected hyperparameters.
– Visually present (using plots) the predicted response values for each basis function family in the test set.
– Describe and discuss the plots of predicted values.
– Present (in a table) the test MSE values for each basis function family.
– Describe and discuss the table of test MSE values.
– Report the temperature forecasts for October, November, and December of 2024 given
by the model with the smallest test MSE. Include a brief description of how these
forecasts are obtained.
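The forecasting item above amounts to evaluating the fitted model at new inputs. A minimal sketch, assuming `alpha` and `beta` have already been estimated and that `basis` is a callable returning the basis matrix for the selected family (the function name `forecast` and its signature are hypothetical):

```python
import numpy as np


def forecast(months, year, alpha, beta, basis):
    """Predicted temperature for each month in `months` of a given year.

    basis(x) must return the basis matrix used when fitting beta, with
    the hyperparameter already fixed at its selected value.
    """
    months = np.asarray(months)
    # Dummy vector u = [u_2, ..., u_12] for each requested month
    U = (months[:, None] == np.arange(2, 13)[None, :]).astype(float)
    # x is the year, so it is the same for every requested month
    x = np.full(len(months), float(year))
    return U @ alpha + basis(x) @ beta
```

Calling `forecast([10, 11, 12], 2024, alpha, beta, basis)` would give the three values for October to December 2024. Because x is the year only, the three forecasts differ solely through the month dummies.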
4 Conclusion (0.5 pages)
– Discuss your overall findings / insights.
– Discuss any limitations of your study.
– Suggest potential directions of extending your study.
Rubric
This assignment is worth 30% of the unit’s marks. The assessment is designed to test your computational skills in implementing algorithms and conducting empirical experiments, as well as your communication skills in writing a concise and coherent report presenting your approach and results. The mark allocation across assessment items is given in Table 1.

Assessment Item        Goal                                     Marks
Section 1              Introduction                             4
Section 2              Methodology                              10
Section 3              Empirical Study                          16
Section 4              Conclusion                               3
Overall Presentation   Clear, concise, coherent, and correct    5
Jupyter Notebook       Reproducible results                     2
Total                                                           40

Table 1: Assessment Items and Mark Allocation