
MAT188 principal components

Posted: 2024-11-28 19:43:52
Tags: plot, temperature, MAT188, matrix, values, components, table, data, principal

MAT188: Homework 5

Background

Below is an illustration of the southwestern portion of the great province of British Columbia. Cities are labelled in blue, and red circles indicate the locations of public weather stations. Included with this assignment is temperature data from each of these weather stations. Our goal is to use linear algebra, with the help of MATLAB, to filter noise from the data and to detect which weather stations were influenced by ocean temperatures and which were influenced by forest fires.

Part 1

Step 1: Import weather station data

Begin by loading the temperature data into a MATLAB script:

  1. Temperature data from all weather stations, recorded daily in Celsius, are available at https://github.com/dtxe/mat188_datasets/raw/refs/heads/main/bcweather/temperature.csv
  2. The locations of each weather station (latitude and longitude coordinates) are available at https://github.com/dtxe/mat188_datasets/raw/refs/heads/main/bcweather/stations.csv
  3. Use the readtable command to import both CSV files.

Hint: readtable is able to import data directly from the internet, without needing to download it first.

    my_table_name = readtable('mydataset.csv'); % import CSV file from current folder
    my_table_name = readtable('https://data.sets/mydataset.csv'); % import CSV file from an internet source

    % Import the data below
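The import step can be tried without network access. The following is only a sketch: the file contents, column names, and the variable `T_demo` are invented for illustration and are not the assignment's data.

```matlab
% Write a tiny CSV to a temporary file so this sketch needs no network access.
% The column names and values are invented for illustration.
csv_path = [tempname '.csv'];
fid = fopen(csv_path, 'w');
fprintf(fid, 'Date,Station_1,Station_2\n');
fprintf(fid, '2020-01-01,3.5,-1.0\n');
fprintf(fid, '2020-01-02,4.0,-0.5\n');
fclose(fid);

% readtable takes a local path or a URL in the same argument position.
T_demo = readtable(csv_path);

disp(size(T_demo))                     % rows x columns of the imported table
disp(T_demo.Properties.VariableNames)  % column names taken from the header row
delete(csv_path);                      % clean up the temporary file
```

Replacing `csv_path` with one of the URLs listed above should import the real dataset in the same way.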

Step 2: Preview and understand the dataset

To get an idea of what the data in the table looks like, use the summary function to quickly summarize the dataset.

    summary(T) % print summary of table T

You can also open the table in the MATLAB GUI by double-clicking on the variable name in your "Workspace".

    % Generate summaries of both tables here

To get a sense of the imported data, we can also preview the first or last few rows of the table using the head or tail commands.

    head(T) % return the first 8 rows of table T
    head(T,20) % return the first 20 rows of table T
    tail(T) % return the last 8 rows of table T

Show the first 5 rows of both the temperature and station data tables.

Let's consider if the imported data makes sense:

  1. Are the values consistent with the dataset description? That is, is the temperature data consistent with your expectation of BC temperatures (e.g. daily values in Celsius)?
  2. What is the date range of the data?
  3. How many weather stations do we have data for?
  4. In the temperature table, what do rows represent? What do columns represent?
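One way to answer questions like these programmatically is sketched below on a small synthetic table. It assumes, purely for illustration, that the first column is named `Date` and every other column is a station; the real table's layout should be checked with summary first.

```matlab
% Small synthetic table standing in for the temperature data (values invented).
Date = (datetime(2020,1,1):days(1):datetime(2020,1,10))';
Station_1 = linspace(2, 6, 10)';
Station_2 = linspace(-3, 1, 10)';
T_demo = table(Date, Station_1, Station_2);

head(T_demo, 5)                                      % display the first 5 rows
date_range = [min(T_demo.Date), max(T_demo.Date)];   % first and last recorded day
n_stations = width(T_demo) - 1;                      % assumes every column except Date is a station
n_days     = height(T_demo);                         % assumes one row per day
```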

Step 3: Extract temperature data into matrices

In MATLAB a table and a matrix are different objects. Our data are in table format. In this section we convert the data into matrices.

We can index into MATLAB tables using the bracket notation.

    % Get data by column name
    vals = T.Station_15; % get data from the column named Station_15

    % Get data by column index
    vals = T(:,10); % get the 10th column from table T
    vals = T(:,10:12); % get data from the 10th to 12th columns, inclusive

    % Get data by row index
    vals = T(1:10,:); % get the rows 1-10 from table T

    % Table to matrix
    mat = table2array(T); % convert table T into a matrix

Let's first extract the date values:

Next, let's extract all the temperature values from all stations into one matrix:

  • From the code examples above, how can we subset a table from all rows of multiple columns by index?
  • With a subset table, how can we convert this into a matrix?

Quick sense check: Verify your variable types in the Workspace. Are they matrices (NxM double) or tables (NxM table)?
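The difference between a table and a matrix can be seen on a toy example. This is a sketch with invented column names and values, not the assignment's table.

```matlab
% Toy table: one date column and three numeric "station" columns (all invented).
Date = (datetime(2020,1,1):days(1):datetime(2020,1,4))';
Station_1 = [1; 2; 3; 4];
Station_2 = [5; 6; 7; 8];
Station_3 = [9; 10; 11; 12];
T_demo = table(Date, Station_1, Station_2, Station_3);

dates_demo = T_demo.Date;             % dot indexing returns the column's values (a datetime vector)
sub_table  = T_demo(:, 2:4);          % parentheses return a smaller table
X_demo     = table2array(sub_table);  % numeric matrix, one column per station

class(sub_table)   % returns 'table'
class(X_demo)      % returns 'double'
```

Note that table2array only produces a numeric matrix when the selected columns share a numeric type, which is why the date column is handled separately here.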

Step 4: Initial visualization

For a better sense of what the temperature dataset looks like, let's plot all the temperature values from every weather station as a function of time.

  • Plot date on the x-axis. Label the axis "Date".
  • Plot temperature on the y-axis. Label the axis "Temperature (C)".
  • Each weather station should be an individual line on your line plot.

    % recall, to make a plot in MATLAB
    figure % to initialize a new blank figure

    % then, to generate the plots:
    plot(x, y) % make a line plot, with variable x on the x-axis, and variable(s) y on the y-axis
    scatter(x, y) % make a scatter plot

    % plots with customized markers
    scatter(x, y, 'r') % scatter plot with red markers
    scatter(x, y, '+r') % scatter plot with red plus markers
    scatter(x, y, 'xr') % scatter plot with red x markers

    % label axes and add title
    xlabel('Variable (unit)')
    ylabel('Variable (unit)')
    title('My plot')

    % we've also provided a showmap command to render a map of BC with coordinates roughly to scale
    figure
    showmap
    hold on
    scatter(x, y, 'xr')

Let's also plot the location (longitude and latitude) of all the weather stations as a scatter plot.

  • See above for hints on how to customize your plot markers.
  • Longitude should be on the x-axis, and latitude should be on the y-axis.
  • Recall how to access values from a table by variable name! (See Step 3 above.)
  • Ensure your markers' shape, size, and colour are easy to see! You can use marker 'xr'.
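The two plots described in this step can be sketched on synthetic data. All values, coordinates, and variable names below are invented, and the scatter call uses explicit size, colour, and marker arguments rather than the provided showmap helper.

```matlab
% Three synthetic "stations" over 30 days (values invented).
dates_demo = (datetime(2020,1,1):days(1):datetime(2020,1,30))';
t = (1:30)';
X_demo = [10 + 5*sin(t/5), 8 + 4*sin(t/5 + 0.5), 12 + 3*cos(t/5)];

figure
h = plot(dates_demo, X_demo);   % a matrix y gives one line per column
xlabel('Date')
ylabel('Temperature (C)')
title('Synthetic station temperatures')

% Made-up station coordinates.
lon_demo = [-123.1; -122.3; -121.4];
lat_demo = [49.3; 49.1; 49.4];

figure
scatter(lon_demo, lat_demo, 36, 'r', 'x')   % size 36, red, x markers
xlabel('Longitude')
ylabel('Latitude')
title('Synthetic station locations')
```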

Step 5: Find principal components using eigenvector decomposition

First, we need to mean-center and scale our data:

  • For each weather station, subtract the mean temperature from the temperature timeseries
  • Divide the timeseries by the standard deviation of the timeseries

    mean(X) % compute the mean for each column of matrix X (1st dimension)
    mean(X, 1) % compute the mean of matrix X along the 1st dimension
    mean(X, 2) % compute the mean of matrix X along the 2nd dimension
    std(X) % compute the standard deviation for each column of X (1st dimension)
    std(X, [], 1) % compute the st dev of X along 1st dimension
    std(X, [], 2) % compute the st dev of X along 2nd dimension
    C = cov(X) % compute the covariance matrix for all pairs of columns in X
    [V,D] = eig(X) % compute the eigendecomposition of X

Repeat this for all weather stations.

Hint: What dimension of your matrix corresponds to time? Which dimension do we need to compute the mean and std along?

Find the covariance matrix describing the relationship between the station and the date for the temperature.

Find the eigenvalues and eigenvectors of the covariance matrix.

Hint: the 'diag' command may be useful for isolating the eigenvalues.
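The standardize, covariance, eigendecomposition sequence can be sketched as follows. It assumes rows are days and columns are stations, and uses random synthetic data; `X_demo`, `Z`, and `lambda` are illustrative names only.

```matlab
rng(0);                                   % fixed seed so the sketch is repeatable
n_days = 200;
base = randn(n_days, 1);                  % signal shared by two of the stations
X_demo = [base + 0.1*randn(n_days,1), ...
          2*base + 0.1*randn(n_days,1), ...
          randn(n_days,1)];               % 3 synthetic "stations"

% Rows are time, so statistics are taken down each column (dimension 1).
Z = (X_demo - mean(X_demo, 1)) ./ std(X_demo, [], 1);

C = cov(Z);              % 3 x 3 covariance between stations
[V, D] = eig(C);         % columns of V are eigenvectors, D is diagonal
lambda = diag(D);        % eigenvalues as a column vector
```

Because every column of `Z` has unit variance, the eigenvalues here sum to the number of stations, which is a quick check that the scaling was done along the intended dimension.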

Step 6: Let's do some sense-checks

Recall the eigendecomposition: $A = V D V^{-1}$

Which MATLAB variable above corresponds to...

  • The matrix V
  • The matrix D
  • The matrix A

Recall that if the matrix A is symmetric, the eigenbasis can be chosen to be orthonormal, and hence V has orthonormal columns. Let's verify this with MATLAB.

    % the x and y axes equally spaced, so the matrix visualization is also square
    colorbar % show the colorbar for interpretation

Let's also confirm this numerically. Compute $V^{T}V - I$, then check that all values in this matrix are zero using the provided iszero function.

    zeros(10) % generate a 10 x 10 matrix of zeros
    eye(10) % generate a 10 x 10 identity matrix
    iszero(zeros(10)) % check if a matrix of zeros consists of all zeros
    ans = logical 1
    iszero(eye(10)) % check if the identity matrix consists of all zeros
    ans = logical 0

Step 7: Identifying the directions of maximum variance

Normalize the eigenvalues by dividing each eigenvalue by the total sum.

  • These are the eigenvalues from a decomposition of the covariance matrix.

  • The normalized eigenvalues can be interpreted as the fraction of variance in the data captured by each eigenvector (i.e. component).

We want to identify the components that encompass the most variance in the data.

Sort the eigenvalues from highest to lowest. Then arrange the eigenvectors correspondingly.

    [sorted_values, sort_index] = sort(values, 'descend'); % sort values from highest to lowest
    sorted_columns = X(:, sort_index); % rearrange columns of another matrix in the same order

Plot all the normalized and sorted eigenvalues as a line plot.

  • The index of the eigenvalues should be on the x-axis
  • The normalized eigenvalue should be on the y-axis

Let's zoom into the elbow and take a closer look... (eg
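The orthonormality sense-check and the normalize-then-sort procedure can be sketched together on a small made-up covariance matrix. The sketch compares against a tolerance with norm instead of the assignment's iszero helper, whose implementation is not shown in this text.

```matlab
% Made-up symmetric covariance matrix, for illustration only.
C = [2.0 0.8 0.3;
     0.8 1.5 0.2;
     0.3 0.2 1.0];
[V, D] = eig(C);

% Sense check: for symmetric C, V'*V should equal the identity up to round-off.
orth_err = norm(V'*V - eye(3));

% Fraction of total variance captured by each eigenvector.
lambda = diag(D);
frac = lambda / sum(lambda);

% Sort from largest to smallest and reorder the eigenvectors to match.
[frac_sorted, sort_index] = sort(frac, 'descend');
V_sorted = V(:, sort_index);

figure
plot(1:numel(frac_sorted), frac_sorted, '-o')
xlabel('Eigenvalue index')
ylabel('Normalized eigenvalue')
```

Reordering the columns of `V` with the same index vector keeps each eigenvector paired with its eigenvalue, which later steps rely on.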

Project the temperature measurements into the Eigenspace of the covariance matrix

  • The eigenspace of the covariance matrix projects the data along directions in which the temperature measurements are maximally linearly correlated
  • Currently, for one day's temperature measurements, each weather station can be considered a dimension/axis
  • We will now project every day's temperature measurements into a new space, where the dimensions/axes correspond to a group of weather stations that tend to vary together

For the temperature measurements for each day, project the temperature measurement timeseries into the eigenbasis by computing the product of the standardized data matrix with the matrix of sorted eigenvectors.

Now, let's plot the temperature data along the first 5 dimensions over time.

  • Be sure to add a title, legend, and label for the x-axis.
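A minimal sketch of the projection, assuming the standardized data has one row per day and the eigenvectors are stored as sorted columns. The data are synthetic and only three stations are used, so two components are plotted rather than five.

```matlab
rng(1);
n_days = 100;
base = randn(n_days, 1);
X_demo = [base, base, -base] + 0.2*randn(n_days, 3);   % 3 correlated synthetic stations
Z = (X_demo - mean(X_demo, 1)) ./ std(X_demo, [], 1);

[V, D] = eig(cov(Z));
[~, sort_index] = sort(diag(D), 'descend');
V_sorted = V(:, sort_index);

% Each row of Z is one day; multiplying by V_sorted expresses that day in the eigenbasis.
scores = Z * V_sorted;             % n_days x 3, column k is the k-th component over time

figure
plot(1:n_days, scores(:, 1:2))     % first two components
xlabel('Day index')
ylabel('Component score')
legend('PC1', 'PC2')
title('Synthetic data in the eigenbasis')
```

A useful check is that the columns of `scores` are uncorrelated with one another, since the eigenbasis diagonalizes the covariance matrix.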

Which weather stations contribute to each principal component?

Visualize the spatial influence of each principal component by plotting the PCA loadings (weights) for each weather station on a geographic map.

Explanation:

After PCA, the loadings for each principal component represent the influence (or "weight") that each weather station has on that component. Higher loadings mean that the temperature variations at that station significantly contribute to the patterns captured by that component.

Plotting these loadings on a map helps identify if certain geographical areas, like coastal or inland regions, are more strongly associated with specific components.

Instructions:

  • Use the latitude and longitude values in the stations.csv data.
  • For each component (e.g., PC1, PC2, etc.), create a scatter plot where each weather station's coordinates are marked, and the color of each point represents the magnitude of the loading for that component. For example, use a color gradient (e.g., blue to red) to show low to high loadings.
  • This visualization will help you interpret the spatial patterns captured by each component.
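The colour-coded loading map described in these instructions might look like the sketch below. Coordinates and loadings are invented, and the plot is drawn on blank axes rather than with the provided showmap helper.

```matlab
% Made-up station coordinates.
lon_demo = [-123.5; -123.0; -122.0; -121.0];
lat_demo = [49.2; 49.4; 49.3; 49.6];

% Invented loadings, one row per station and one column per component.
V_sorted = [0.6  0.5;
            0.5  0.5;
            0.5 -0.5;
            0.4 -0.5];

k = 1;                                          % component to display
figure
h_sc = scatter(lon_demo, lat_demo, 80, V_sorted(:, k), 'filled');  % colour encodes the loading
colormap(jet)                                   % blue (low) to red (high)
colorbar
xlabel('Longitude')
ylabel('Latitude')
title(sprintf('Loadings for PC%d', k))
```

Changing `k` redraws the map for another component; wrapping the plotting lines in a loop over `k` would give one figure per component.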

Interpretation

Weather patterns are complex and influenced by several factors, including incident sunlight, prevailing winds, sea surface temperature, and atmospheric particulates (like smoke from wildfires).

  • Which principal components are capturing general seasonal variations?
  • Which principal components capture the effect of sea surface temperature? (Hint: consider the geographical distribution, and that ocean temperature lags seasonal air temperatures.)
  • Which principal components capture the effect of forest fires? Can you identify where the forest fires are burning?
  • Which principal components capture only noise?

Problem 2: Using OLS to estimate the trend of the salmon population and wildfire likelihood based on yearly sea surface temperatures

In a dataset separate from Problem 1, scientists have compiled yearly data on three interconnected environmental variables:

  1. Estimated Salmon Population
  2. Yearly Trends in Forest Area Burned by Wildfires
  3. Mean Ocean Temperature for the Same Year

Your task is to analyze this dataset and explore the relationships between these variables. Using the ordinary least squares (OLS) method, you will model these relationships and interpret the ecological insights your analysis provides.

Part 1: Import and understand the dataset

Use the readtable command to download the weather trends dataset from https://github.com/dtxe/mat188_datasets/raw/refs/heads/main/bcweather/trends.csv.

Find the array size of the data (number of rows) using the size function.

To get an idea of what the data in the table looks like, you can open the table in the "Workspace" by double-clicking the name you gave it. Or you can print out a summary of the table in the "Command Window" by using the function summary() with your chosen table name in the parentheses.

Part 2: Initial visualization

Plot how the salmon population and temperature change throughout the years. Use yyaxis left and yyaxis right to specify the left (salmon population) and right (temperature) axes on the same plot.

Hints for plotting:

  1. Use xlabel and ylabel to set axis titles for your plot
  2. You can use xlim and ylim to change the limits of the x axis and y axis values if required
  3. Use legend to label the graphs corresponding to either Temperature or Salmon Population
  4. Plotting options such as line color and thickness can be added following the first two entries. For example, plot(x,y,"LineWidth",1.5) sets the thickness of the lines to 1.5.

    figure
    yyaxis left
    plot(x, y1) % plot y1 against x
    yyaxis right
    plot(x, y2) % with a different y-axis on the right, plot y2 against x

    % Retrieve variables from a table
    plot(T.Var1, T.Var2) % plot column Var1 from table T on the x-axis, against Var2 on the y-axis

Now, plot how the wildfire changes with temperature throughout the years.

Part 3: Use projections to perform Ordinary Least Squares to estimate the 1st order (linear) and 2nd order (quadratic) models

Let's start with estimating how the salmon population trend changes with yearly sea temperature.

Recall the least squares approximation to a solution learned in class, which looks like the following: $\hat{x} = (A^{T}A)^{-1}A^{T}b$

Consider matrix $A$: how can you construct this matrix given the variable temp for both a linear and a quadratic approximation?

Hint: concatenate a ones column vector with the appropriate variable(s).

Hint: you can use function inv to take the inverse of a matrix, and a matrix transpose is denoted by the prime A'. Remember, you can use the "Workspace" or "Command Window" to see how your matrices look and whether their sizes are compatible for matrix multiplication.

    % Perform OLS to find 1st order (Linear model)
    % Perform OLS to find 2nd order (Quadratic model)

Use the linspace function to create a column vector of 100 linearly spaced temperature values (these will be the x values on your plot).

    xpts = linspace(min(temp),max(temp),100)'; % (100 points is default value)

Complete the linear and quadratic model matrices using the temp values from the above step. Recall, we are looking for a linear function $y = mx + b$ and a quadratic function $y = ax^2 + bx + c$ that can model the salmon population trend with sea temperature.

    % Linear model y values (y = mx+b)
    % Quadratic model y values (y = ax^2 + bx + c)

Part 4: Plot the raw data, and your fitted curves

Graph the salmon population (as the dependent variable) vs the temperature (as the independent variable) as a scatter plot using the scatter command. On the same plot, graph the linear and quadratic models.

Hints for plotting: To ensure all three graphs are on one plot, make sure to use the hold on/hold off commands.

    % Plot salmon population vs temperature data, linear and quadratic model

By visual inspection, which model provides the best fit for the data? In other words, what's the minimum order of the model (1st or 2nd order) that adequately captures most of the dependent variation in the data based on the plot? Why do you think so?
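The least-squares fits used in Parts 3 and 4 can be sketched on invented numbers. The variables `temp` and `salmon` below are made-up stand-ins for the real columns, and the sketch solves the normal equations with backslash, which gives the same coefficients as the inv-based formula up to round-off.

```matlab
temp   = [8.0; 8.5; 9.0; 9.5; 10.0; 10.5];   % invented sea temperatures
salmon = [5.1; 4.7; 4.0; 3.2; 2.1; 0.9];     % invented population index

A1 = [ones(size(temp)), temp];               % linear model: y = b + m*x
A2 = [ones(size(temp)), temp, temp.^2];      % quadratic model: y = c + b*x + a*x^2

% Normal equations (A'*A)*coef = A'*y, solved without forming the inverse.
coef1 = (A1' * A1) \ (A1' * salmon);
coef2 = (A2' * A2) \ (A2' * salmon);

% Evaluate both fitted models on a fine grid for plotting.
xpts = linspace(min(temp), max(temp), 100)';
y1 = [ones(size(xpts)), xpts] * coef1;
y2 = [ones(size(xpts)), xpts, xpts.^2] * coef2;

figure
scatter(temp, salmon)
hold on
plot(xpts, y1)
plot(xpts, y2)
hold off
xlabel('Temperature (C)')
ylabel('Salmon population')
legend('Data', 'Linear', 'Quadratic')
```

The coefficient order follows the column order of the model matrix, so `coef1(1)` is the intercept and `coef1(2)` the slope. Swapping `salmon` for a wildfire column would repeat the analysis for Part 5.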

Part 5: Analyze trends for wildfires

Repeat the above steps, except now replace the salmon population data with the wildfire data.

    % Perform OLS to find 1st order (Linear model)
    % Perform OLS to find 2nd order (Quadratic model)
    % Create a column of 100 linearly spaced x values for plotting (ensure it
    % encompasses all of data)
    % Linear model y values (y = mx+b)
    % Quadratic model y values (y = ax^2 + bx + c)
    % Plot wildfire vs temperature data, linear and quadratic model

By visual inspection, which model provides the best fit for the data? In other words, what's the minimum order of the model (1st or 2nd order) that adequately captures most of the dependent variation in the data based on the plot? Why do you think so?

Well done! You've successfully completed Homework 5! We hope this provided a demonstration of the power of linear algebra and computational software in manipulating large datasets, isolating trends in them, and extracting insights from them.

Submission instructions

  1. Run this Live Script from top to bottom to verify the correctness of your code. (Home tab > Clear Workspace; Live Editor tab > Run)
  2. Export this Live Script as a PDF file (Live Editor > Export)
  3. Upload both the MLX and PDF files to Gradescope

Helper functions

Do not delete the code below.

These functions exist to reduce the complexity of the assignment above, by abstracting away concepts that aren't core to the course curriculum. However, feel free to take a look if you're curious!

    function [] = showmap()

 

From: https://www.cnblogs.com/CSE231/p/18574276
