
COMP5328 - Advanced Machine Learning

Posted: 2024-09-22 10:46:36


Assignment 1

Due: 19/09/2024, 11:59PM

This assignment is to be completed in groups of 3 to 4 students. It is worth 25% of your total mark.

1 Objective

The objective of this assignment is to implement Non-negative Matrix Factorization (NMF) algorithms and analyze their robustness when the dataset is contaminated by large-magnitude noise or corruption. More specifically, you should implement at least two NMF algorithms and compare their robustness.

2 Instructions

2.1 Dataset description

In this assignment, you need to apply NMF algorithms to two real-world face image datasets: (1) the ORL dataset [1]; (2) the Extended YaleB dataset [2].

  • ORL dataset: it contains 400 images of 40 distinct subjects (i.e., 10 images per subject). For some subjects, the images were taken at different times, varying the lighting, facial expressions and facial details (glasses / no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position. All images are cropped and resized to 92×112 pixels.
  • Extended YaleB dataset: it contains 2414 images of 38 subjects under 9 poses and 64 illumination conditions. All images are manually aligned, cropped, and then resized to 168×192 pixels.

[1] https://cam-orl.co.uk/facedatabase.html
[2] http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html

Figure 1: An example face image and its occluded versions by b × b blocks with b = 10, 12, and 14 pixels.

Note: we provide a tutorial for this assignment, which contains example code for loading a dataset into a numpy array. Please find more details in assignment1.ipynb.

2.2 Assignment tasks

  1. You need to implement at least two Non-negative Matrix Factorization (NMF) algorithms:
  • You should implement at least two NMF algorithms, with at least one not taught in this course (e.g., L1-Norm Based NMF, Hypersurface Cost Based NMF, L1-Norm Regularized Robust NMF, and L2,1-Norm Based NMF).
  • For each algorithm, you need to describe the definition of the objective function as well as the optimization methods used in your implementation.
  2. You need to analyze the robustness of each algorithm on two datasets:
  • You are allowed to design your own data preprocessing method (if necessary).
  • You need to use a block-occlusion noise similar to those shown in Figure 1. The noise is generated by setting the pixel values to be 255 in the block. You can design your own value for b (not necessarily 10, 12 or 14). You are also encouraged to design your own noise other than the block-occlusion noise.
  • You need to demonstrate each type of noise used in your experiment (show the original image as well as the image contaminated by noise).
  • You should carefully choose the NMF algorithms and design experiment settings to clearly show the different robustness of the algorithms you have implemented.
  3. You are only allowed to use the Python standard library, numpy and scipy (if necessary) to implement NMF algorithms.
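The block-occlusion noise described in task 2 can be sketched with numpy alone. This is a minimal illustration, not the tutorial's code: the function name `add_block_occlusion` and the choice to place the block uniformly at random are assumptions, since the handout only states that pixels inside the b × b block are set to 255.

```python
import numpy as np

def add_block_occlusion(image, b, rng=None):
    """Return a copy of `image` with one b-by-b block set to 255.

    `image` is a 2-D array (height, width) with pixel values in [0, 255].
    The block position is drawn uniformly at random, which is an
    assumption; the handout does not specify where the block goes.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape
    if b > height or b > width:
        raise ValueError("block size b must fit inside the image")
    # Top-left corner chosen so the whole block stays inside the image.
    top = rng.integers(0, height - b + 1)
    left = rng.integers(0, width - b + 1)
    noisy = image.copy()
    noisy[top:top + b, left:left + b] = 255
    return noisy
```

Passing a seeded generator (for example `np.random.default_rng(0)`) makes the contaminated dataset reproducible across algorithms, so that every NMF variant is compared on the same corrupted images.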

2.3 Programming and External Libraries

This assignment must be completed in Python 3. When you implement NMF algorithms, you are not allowed to use external libraries that contain NMF implementations, such as scikit-learn and Nimfa (i.e., you have to implement the NMF algorithms yourself). You are allowed to use scikit-learn for evaluation only (please find more details in assignment1.ipynb). If you are unsure whether you can use a particular library or function, please post on Canvas under the "Assignment 1" thread.
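As an illustration of what a numpy-only implementation can look like, here is a minimal sketch of the standard Frobenius-norm NMF with the Lee and Seung multiplicative updates. It is only a baseline sketch and not one of the robust variants the handout asks for; the function name, the column-per-sample layout of V, the iteration count, and the small constant `eps` are all assumptions.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative matrix V (features x samples) as W @ H.

    Minimizes ||V - W H||_F^2 with multiplicative updates, which keep
    W and H non-negative as long as they start non-negative.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    # Random non-negative initialization (an assumption; other schemes exist).
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

A robust variant would typically change the objective (for example to an L1 or L2,1 norm of the residual) and therefore the update rules, while keeping the same overall loop structure.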

2.4 Evaluation metrics

To compare the performance and robustness of different NMF algorithms, we provide three evaluation metrics: (1) Relative Reconstruction Errors; (2) Average Accuracy (optional); (3) Normalized Mutual Information (optional). For all experiments, you need to use at least one metric, i.e., Relative Reconstruction Errors. You are encouraged to use the other two metrics, i.e., Average Accuracy and Normalized Mutual Information.

  • Relative Reconstruction Errors (RRE): let V denote the contaminated dataset (by adding noise), and V̂ denote the clean dataset. Let W and H denote the factorization results on V. The relative reconstruction error can then be defined as follows:
    RRE = ||V̂ − WH||_F / ||V̂||_F
  • Average Accuracy: let W and H denote the factorization results on V. You need to perform a clustering algorithm (e.g., K-means) with num_clusters equal to num_classes. Each example is assigned a cluster label (please find more details in assignment1.ipynb). Lastly, you can evaluate the accuracy of the predictions Ypred against the true labels Y over the n examples as follows:
    Acc(Y, Ypred) = (1/n) Σ_i 1{Ypred(i) = Y(i)}
  • Normalized Mutual Information (NMI):
    NMI(Y, Ypred) = 2 · I(Y, Ypred) / (H(Y) + H(Ypred))
    where I(·, ·) is the mutual information and H(·) is the entropy.
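The three metrics can be sketched in numpy as follows. This is a hedged sketch under stated assumptions: the RRE uses the Frobenius norm of the clean data minus WH, relative to the norm of the clean data; cluster ids are assumed to come from K-means run on the columns of H (the handout permits scikit-learn for this evaluation step); each cluster is mapped to its majority true label before computing accuracy; and NMI is normalized by the arithmetic mean of the two entropies. All function names are hypothetical.

```python
import numpy as np

def relative_reconstruction_error(V_clean, W, H):
    """RRE = ||V_clean - W H||_F / ||V_clean||_F (assumed definition)."""
    return np.linalg.norm(V_clean - W @ H) / np.linalg.norm(V_clean)

def map_clusters_to_labels(cluster_ids, y_true):
    """Assign every cluster the most frequent true label among its members."""
    cluster_ids = np.asarray(cluster_ids).ravel()
    y_true = np.asarray(y_true).ravel()
    y_pred = np.zeros_like(y_true)
    for c in np.unique(cluster_ids):
        mask = cluster_ids == c
        values, counts = np.unique(y_true[mask], return_counts=True)
        y_pred[mask] = values[np.argmax(counts)]
    return y_pred

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def normalized_mutual_information(y_true, y_pred):
    """NMI = 2 I(Y, Ypred) / (H(Y) + H(Ypred)) (assumed normalization)."""
    _, t_idx = np.unique(np.asarray(y_true).ravel(), return_inverse=True)
    _, p_idx = np.unique(np.asarray(y_pred).ravel(), return_inverse=True)
    # Joint distribution of (true label, predicted label).
    joint = np.zeros((t_idx.max() + 1, p_idx.max() + 1))
    np.add.at(joint, (t_idx, p_idx), 1)
    joint /= joint.sum()
    p_t = joint.sum(axis=1)
    p_p = joint.sum(axis=0)
    nz = joint > 0
    mi = np.sum(joint[nz] * np.log(joint[nz] / np.outer(p_t, p_p)[nz]))
    h_t = -np.sum(p_t * np.log(p_t))
    h_p = -np.sum(p_p * np.log(p_p))
    denom = h_t + h_p
    # Both labelings constant: treat them as identical partitions.
    return float(2.0 * mi / denom) if denom > 0 else 1.0
```

Note that the RRE is measured against the clean data even though the factorization is fitted on the contaminated data, which is what makes it a robustness measure.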

Note: we expect you to have a rigorous performance evaluation. To provide an estimate of the performance of the algorithms in the report, you can repeat each experiment multiple times (e.g., 5 times) by randomly sampling 90% of the data from the whole dataset, and average the metrics over the different subsets. You are also required to report the standard deviations.
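The repeated-subsampling protocol in the note above can be sketched as follows. This is one possible reading, under assumptions: samples are stored as columns, the subset is drawn without replacement, the noisy and clean matrices are aligned column by column, and `factorize` is a hypothetical callable that takes a data matrix and returns (W, H).

```python
import numpy as np

def evaluate_over_subsets(V_clean, V_noisy, factorize,
                          n_repeats=5, fraction=0.9, seed=0):
    """Return (mean, std) of the RRE over random subsets of the samples.

    `factorize` is any function mapping a (features x samples) matrix
    to a pair (W, H); it stands in for an NMF implementation.
    """
    rng = np.random.default_rng(seed)
    n_samples = V_clean.shape[1]
    subset_size = int(round(fraction * n_samples))
    scores = []
    for _ in range(n_repeats):
        # Sample 90% (by default) of the columns without replacement.
        idx = rng.choice(n_samples, size=subset_size, replace=False)
        W, H = factorize(V_noisy[:, idx])
        clean = V_clean[:, idx]
        scores.append(np.linalg.norm(clean - W @ H) / np.linalg.norm(clean))
    scores = np.asarray(scores)
    return scores.mean(), scores.std()
```

Using the same seed for every algorithm means all of them are evaluated on identical subsets, so differences in the reported mean and standard deviation reflect the algorithms rather than the sampling.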

3 Report

The report should be organized like a research paper and should contain the following sections:

  • In abstract, you should briefly introduce the topic of this assignment and describe the organization of your report.
  • In introduction, you should first introduce the main idea of NMF as well as its applications. You should then give an overview of the methods you want to use.
  • In related work, you are expected to review the main idea of related NMF algorithms (including their advantages and disadvantages).
  • In methods, you should describe the details of your method (including the definition of cost functions as well as optimization steps). You should also describe your choices of noise, and you are encouraged to explain the robustness of each algorithm from a theoretical view.
  • In experiment, you should first introduce the experimental setup (e.g., datasets, algorithms, and noise used in your experiment for comparison). Second, you should show the experimental results and give some comments.
  • In conclusion, you should summarize your results and discuss your insights for future work.
  • In reference, you should list all references cited in your report and format all references in a consistent way.

The layout of the report:

  • Font: Times New Roman; Title: font size 14; Body: font size 12
  • Length: Ideally 10 to 15 pages - maximum 20 pages

Note: Submissions must be typeset in LaTeX using the provided template.

4 Submissions

Detailed instructions are as follows:

  1. The submission contains two parts: report and source code.
     (a) report (a pdf file): the report should include each member's details (student ID and name).
     (b) code (a compressed folder)
         i. algorithm (a sub-folder): your code could be multiple files.
         ii. data (an empty sub-folder): although two datasets should be inside the data folder, please do not include them in the zip file. We will copy two datasets to the data folder when we test the code.
  2. The report (file type: pdf) and the codes (file type: zip) must be named as student ID numbers of all group members separated by underscores. For example, “xxxxxxxx_xxxxxxxx_xxxxxxxx.pdf”.
  3. Only one student needs to submit your report (file type: pdf) to Assignment 1 (report) and upload your codes (file type: zip) to Assignment 1 (codes).
  4. Your submission should include the report and the code. A plagiarism checker will be used.
  5. You need to clearly provide instructions on how to run your code in the appendix of the report.
  6. You need to indicate the contribution of each group member.
  7. A penalty of minus 5 (5%) marks applies for each day after the due date (email late submissions to the TA and confirm late submission dates with the TA). The maximum delay is 10 days; after that, assignments will not be accepted.

 

Marking criteria (columns: Category, Criterion, Marks, Comments)

Code [20]

  • Code runs within a feasible time
  • Well organized, commented and documented

Penalties

  • Badly written code: [20]
  • Not including instructions on how to run your code: [20]

Note: Marks for each category are indicated in square brackets. The minimum mark for the assignment will be 0 (zero).

From: https://www.cnblogs.com/WX-codinghelp/p/18424302
