STA 141C Big Data & High Performance Statistical Computing

Posted: 2023-06-01 12:24:28


STA 141C - Big Data & High Performance Statistical Computing Spring 2022
Homework 4
Lecturer: Bo Y.-C. Ning
Due June 02, 2023 by 11:59pm.
A few notes:
1. Submit your homework using the file name "LastName FirstName hw4".
2. Answer all questions with complete sentences.
3. Your code should be readable; writing a piece of code should be compared to writing a page of a book.
Adopt the one-statement-per-line rule. Consider splitting a lengthy statement into multiple lines
to improve readability. (You will lose one point for each line that does not follow the one-statement-per-line rule.)
4. To help others understand and maintain your code, you should always add comments that explain it. (Homework with no comments will receive 0 points.) Break a very long comment into multiple lines.
5. Submit your final work as one .pdf (or .html) file to Canvas. I encourage you to use LaTeX for
writing equations. Handwriting is acceptable, but you have to scan it and then combine it with the coding
part into a single .pdf (or .html) file. Handwriting should be clean and readable.
1 Handwriting recognition
In this homework, we work on a model-based method for handwritten digit recognition. The following figure
shows example bitmaps of handwritten digits from U.S. postal envelopes.
Each digit is represented by a 32×32 bitmap in which each element indicates one pixel with a value of white
or black. Each 32 × 32 bitmap is divided into blocks of 4 × 4, and the number of white pixels is counted
in each block. Therefore each handwritten digit is summarized by a vector x = (x1, . . . , x64) of length 64
where each element is a count between 0 and 16.
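The block-counting step can be sketched in R as follows. This is only an illustration under stated assumptions: `block_counts` is a hypothetical helper that is not part of the course files, white pixels are assumed to be coded as 1 and black pixels as 0, and the 64 counts are flattened row by row, which may not match the ordering used in the provided data.

```r
# Hypothetical helper (not part of the course files): summarize a 32 x 32
# bitmap by counting the white pixels in each 4 x 4 block.
# Assumption: white pixels are coded as 1 and black pixels as 0.
block_counts <- function(bitmap, block = 4) {
  stopifnot(nrow(bitmap) %% block == 0, ncol(bitmap) %% block == 0)
  # block index of every row and every column (1, ..., 8 for a 32 x 32 bitmap)
  row_block <- (seq_len(nrow(bitmap)) - 1) %/% block + 1
  col_block <- (seq_len(ncol(bitmap)) - 1) %/% block + 1
  # sum the pixels within row blocks, then within column blocks
  row_sums <- rowsum(bitmap, row_block)          # 8 x 32
  counts <- t(rowsum(t(row_sums), col_block))    # 8 x 8
  # flatten row by row (an assumed ordering) into a vector of length 64
  as.vector(t(counts))
}

# Simulated bitmap, for illustration only
set.seed(1)
bitmap <- matrix(rbinom(32 * 32, size = 1, prob = 0.3), nrow = 32)
x <- block_counts(bitmap)
length(x)   # 64
range(x)    # every entry lies between 0 and 16
```

Because each block holds 16 pixels, every count is bounded by 16, and the counts add up to the total number of white pixels in the bitmap.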
By a model-based method, we mean to impose a distribution on the count vector and carry out classification
using probabilities. The goal is to predict the handwritten digit. We separate the dataset into training data and
test data. The training set contains 3823 handwritten digits and the test set contains 1797 digits.
A common distribution for count vectors is the multinomial distribution. However, it is not a good model for
handwritten digits. Let’s work on a more flexible model for count vectors, the Dirichlet-multinomial model.
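For reference, the standard form of the Dirichlet-multinomial probability mass function is written out below. The symbols $\alpha = (\alpha_1, \ldots, \alpha_{64})$, $|\alpha|$, and $m$ are notation introduced here for illustration; the assignment itself does not fix a notation.

```latex
f(x \mid \alpha)
  = \binom{m}{x_1, \ldots, x_{64}}
    \frac{\Gamma(|\alpha|)}{\Gamma(|\alpha| + m)}
    \prod_{j=1}^{64} \frac{\Gamma(\alpha_j + x_j)}{\Gamma(\alpha_j)},
\qquad
m = \sum_{j=1}^{64} x_j, \quad
|\alpha| = \sum_{j=1}^{64} \alpha_j, \quad
\alpha_j > 0 .
```

Compared with the multinomial distribution, this model allows the counts to be overdispersed, which is the extra flexibility referred to above.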
In the folder uploaded on Piazza, you will find
• Data containing the training data and the testing data
• ‘ddirmult.R’, which evaluates the likelihood function (if log = FALSE) or the log-likelihood function
(if log = TRUE) of the Dirichlet-multinomial density
• ‘ddirmult.fit.R’, which computes the maximum likelihood estimate (MLE) by Newton’s method
• ‘trainingMLE.R’, which computes the MLE based on the training data
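To make the role of the density function concrete, here is a minimal sketch of a Dirichlet-multinomial (log-)density in R. The function `ddirmult_sketch` is a hypothetical stand-in written for illustration; the provided ‘ddirmult.R’ may differ in its interface and numerical details, and only the `log` argument is taken from the description above.

```r
# Hypothetical stand-in for illustration; the provided 'ddirmult.R' may differ.
# x:     matrix of counts (one observation per row) or a single count vector
# alpha: vector of positive parameters, one per category
ddirmult_sketch <- function(x, alpha, log = FALSE) {
  # treat a plain vector as a single observation
  if (is.null(dim(x))) x <- matrix(x, nrow = 1)
  x <- as.matrix(x)
  stopifnot(ncol(x) == length(alpha), all(alpha > 0))
  m <- rowSums(x)   # total count of each observation
  # log multinomial coefficient: log m! - sum_j log x_j!
  log_coef <- lgamma(m + 1) - rowSums(lgamma(x + 1))
  # sum_j [log Gamma(alpha_j + x_j) - log Gamma(alpha_j)]
  alpha_mat <- matrix(alpha, nrow(x), ncol(x), byrow = TRUE)
  log_ratio <- rowSums(lgamma(x + alpha_mat) - lgamma(alpha_mat))
  logl <- log_coef + lgamma(sum(alpha)) - lgamma(sum(alpha) + m) + log_ratio
  if (log) logl else exp(logl)
}

# With two categories and alpha = (1, 1) the model is a uniform beta-binomial,
# so each of the m + 1 = 4 possible outcomes has probability 1/4.
ddirmult_sketch(c(2, 1), alpha = c(1, 1))   # 0.25
```

Working with `lgamma` on the log scale avoids overflow in the gamma functions, which is one reason a `log = TRUE` option is useful for classification.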
Question 1. Open ‘trainingMLE.R’ and obtain the MLEs for each of the 10 handwritten digits
(0, 1, 2, . . . , 9). (You may need to change the path when loading the data.)
Question 2. Read in the testing data. Use the estimated MLE for each digit from the training data to predict
the handwritten digits in the testing data.
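One way to read this prediction step is as a Bayes classifier: each test vector is assigned to the digit with the largest posterior probability. The rule is written out below; the symbols $\hat{\alpha}_k$ (the fitted parameter vector for digit $k$), $\hat{\pi}_k$ (the estimated prior), and $n_k$ (the number of training digits in class $k$) are notation assumed here for illustration.

```latex
\hat{y}(x)
  = \arg\max_{k \in \{0, 1, \ldots, 9\}}
    \left[ \log f(x \mid \hat{\alpha}_k) + \log \hat{\pi}_k \right],
\qquad
\hat{\pi}_k = \frac{n_k}{\sum_{l=0}^{9} n_l} .
```

The bracketed quantity is the log posterior up to an additive constant that does not depend on $k$, so the constant can be ignored when taking the maximum.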
• Hint 1: To predict the handwritten digit, you should use the function in ‘ddirmult.R’. The following code
can be helpful:
    # testDigitProb stores the unnormalized log posterior of each digit 0, 1, ..., 9
    testDigitProb <- matrix(0, dim(testdata)[1], 10)
    for (dig in 0:9) {
      # log-likelihood of every test vector under the fitted model for digit `dig`
      testDigitProb[, dig + 1] <-
        ddirmult(testdata[, -65], alphahat[dig + 1, ], log = TRUE)
    }
    # add the log prior of each digit, estimated from the class counts in digitCount
    testDigitProb <- testDigitProb +
      rep(log(digitCount / sum(digitCount)), each = nrow(testdata))
    # predicted digit: column with the largest log posterior, shifted to 0, ..., 9
    digitPredicted <- max.col(testDigitProb) - 1
• Hint 2: To summarize the result, you can construct a confusion table using the code
    table(testdata[, 65], digitPredicted)
The output should look like this:
Question 3. Comment on using gradient descent, instead of Newton’s method, to obtain the MLE. (You
do not need to implement this.)
Question 4. What are the advantages and disadvantages of using gradient descent instead of Newton’s method?
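The mechanics of the two update rules can be seen on a toy problem. The sketch below is not the Dirichlet-multinomial fit and is not taken from ‘ddirmult.fit.R’; it maximizes an exponential log-likelihood with a single parameter, and the functions `newton` and `gradient_ascent`, the starting value, and the learning rate are all choices made here for illustration. Maximizing the log-likelihood by gradient ascent is the same as minimizing the negative log-likelihood by gradient descent.

```r
# Toy problem (not the Dirichlet-multinomial): exponential log-likelihood
#   l(lambda) = n * log(lambda) - lambda * sum(x), maximized at n / sum(x).
set.seed(141)
x <- rexp(200, rate = 2)
n <- length(x)
s <- sum(x)
grad <- function(lambda) n / lambda - s    # first derivative of l
hess <- function(lambda) -n / lambda^2     # second derivative of l

# Newton's method: the step is the gradient divided by the curvature,
# so no learning rate has to be chosen.
newton <- function(lambda, tol = 1e-10, max_iter = 100) {
  for (iter in seq_len(max_iter)) {
    step <- grad(lambda) / hess(lambda)
    lambda <- lambda - step
    if (abs(step) < tol) break
  }
  list(estimate = lambda, iterations = iter)
}

# Gradient ascent: the step is a hand-picked learning rate times the gradient,
# so no second derivative is needed.
gradient_ascent <- function(lambda, rate = 0.005, tol = 1e-10, max_iter = 10000) {
  for (iter in seq_len(max_iter)) {
    step <- rate * grad(lambda)
    lambda <- lambda + step
    if (abs(step) < tol) break
  }
  list(estimate = lambda, iterations = iter)
}

newton(1)
gradient_ascent(1)
n / s   # closed-form MLE for comparison
```

In this one-parameter example the second derivative is a single number. For a 64-dimensional parameter the Hessian is a 64 × 64 matrix, which is where the cost per iteration of the two methods starts to differ.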
Question 5. Do you think the current method is satisfactory for predicting handwritten digits? Do you
know any other method(s) that can achieve a higher accuracy?
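To put a number on accuracy, a confusion table can be reduced to an overall accuracy and a per-digit accuracy. The sketch below uses made-up toy labels for illustration only; in the homework the two vectors would be `testdata[, 65]` and `digitPredicted`, and the variable names `truth`, `predicted`, `confusion`, `accuracy`, and `per_digit` are assumptions introduced here.

```r
# Toy labels for illustration only; replace them with testdata[, 65]
# and digitPredicted to evaluate the actual classifier.
truth     <- c(0, 0, 1, 1, 2, 2, 2, 3)
predicted <- c(0, 1, 1, 1, 2, 2, 3, 3)

# Fix the factor levels so that the table stays square even if some digit
# is never predicted.
lev <- sort(unique(c(truth, predicted)))
confusion <- table(factor(truth, levels = lev),
                   factor(predicted, levels = lev))

# Overall accuracy: correctly classified digits (the diagonal) over all digits
accuracy <- sum(diag(confusion)) / sum(confusion)
# Per-digit accuracy: diagonal entry divided by the number of true cases
per_digit <- diag(confusion) / rowSums(confusion)

accuracy    # 0.75
per_digit
```

The per-digit values show which digits are confused most often, which an overall accuracy alone would hide.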
