
CS 839: FOUNDATION MODELS

Time: 2024-10-03 20:44:08


HOMEWORK 1

Instructions: Read the two problems below. Type up your results and include your plots in LaTeX. Submit your answers in two weeks (i.e., Oct. 3, 2024, end of day). You will need a machine for this assignment, but a laptop (even without a GPU) should still work. You may also need an OpenAI account to use ChatGPT, but a free account should work.

  1. NanoGPT Experiments.

We will experiment with a few aspects of GPT training. While this normally requires significant resources, we will use a mini-implementation that can be made to run (at the character level) on any laptop. If you have a GPU on your machine (or access to one), even better, but no resources are strictly required.

  • 1. Clone Karpathy’s nanoGPT repo (https://github.com/karpathy/nanoGPT). We will use this repo for all the experiments in this problem. Read and get acquainted with the README.
  • 2. Setup and Reproduction. Run the Shakespeare character-level GPT model. Start by running the prep code, then do a basic run with the default settings. Note that the command line differs depending on whether or not you have a GPU. After completing training, produce samples. In your answer, include the first two lines you generated.
  • 3. Hyperparameter Experimentation. Modify the number of layers and heads, but do not take more than 10 minutes per run. What is the lowest loss you can obtain? What settings produce it on your machine?
  • 4. Evaluation Metrics. Implement a specific and a general evaluation metric. You can pick any that you would like, but with the following goals: your specific metric is meant to capture how close your generated data distribution is to the training distribution. Your general metric need not do this and should be applicable without comparing against the training dataset. Explain your choices and report your metrics on the settings above.
  • 5. Dataset. Obtain your favorite text dataset. This might be the collected works of a writer (but not Shakespeare!), text in a different language, or whatever you would prefer. Scrape and format this data. Train nanoGPT on your new data. Vary the number of characters in your dataset. Plot the number of training characters against your metrics from the previous part. How much data do you need to produce a reasonable score according to your metrics?
  • 6. Fine-tuning. Fine-tune the trained Shakespeare model on the dataset you built above. How much data and training do you need to go from Shakespearean output to something that resembles your dataset?
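As an illustration of item 4, the sketch below shows one possible pair of metrics. It is not the intended answer; the handout leaves the choice open. The specific metric here is assumed to be the Jensen-Shannon divergence between character bigram distributions of the generated and training text. The general metric is assumed to be a zlib compression ratio, which needs no training data and mainly flags degenerate, highly repetitive output. All function names are hypothetical.

```python
import math
import zlib
from collections import Counter


def ngram_distribution(text, n=2):
    """Relative frequency of each character n-gram in `text`."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    def kl_to_mixture(a, b):
        # KL(a || m) with m = (a + b) / 2; m is positive wherever a is positive.
        return sum(pa * math.log2(pa / (0.5 * (pa + b.get(g, 0.0))))
                   for g, pa in a.items())
    return 0.5 * kl_to_mixture(p, q) + 0.5 * kl_to_mixture(q, p)


def specific_metric(generated, training, n=2):
    """Assumed specific metric: lower means closer to the training distribution."""
    return js_divergence(ngram_distribution(generated, n),
                         ngram_distribution(training, n))


def general_metric(generated):
    """Assumed general metric: compressed size divided by raw size."""
    raw = generated.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw) if raw else 0.0
```

To use it, read the training file and a sample produced by the model into strings and call `specific_metric(sample, training_text)` and `general_metric(sample)`. Compare the general metric of a sample against that of real text of similar length, since the ratio depends on length.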
  2. Prompting. We will attempt to see how ChatGPT copes with challenging questions.
  • 1. Zero-shot vs. Few-shot. Find an example of a prompt that ChatGPT cannot answer in a zero-shot manner, but can with a few-shot approach.
  • 2. Ensembling and Majority Vote. Use a zero-shot question and vary the temperature parameter to obtain multiple samples. How many samples are required before majority vote recovers the correct answer?
  • 3. Rot13. In this problem our goal is to use Rot13 encoding and ‘teach’ ChatGPT how to apply it. You can use rot13.com to quickly encode and decode. Also read about it at https://en.wikipedia.org/wiki/ROT13. Our goal is to ask questions like

What is the capital of France?, but encoded with Rot13, i.e.,

Jung vf gur pncvgny bs Senapr?

What do you obtain if you ask a question like this zero-shot? Note: you may need to decode back.

What do you obtain with a few-shot variant?

Provide the model with additional instructions. What can you obtain?

Find a strategy to ultimately produce the correct answer to an encoded geographic (or other) question like this one.
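The mechanical parts of this problem (encoding and decoding Rot13, and tallying a majority vote over sampled answers) can be sketched in a few lines of Python. This is a minimal sketch under some assumptions: the answers are collected from ChatGPT by hand and passed in as plain strings, and "samples required" is read as the smallest number of samples whose majority vote matches the correct answer. The function names are hypothetical; `rot_13` is a text codec in Python's standard `codecs` module.

```python
import codecs
from collections import Counter


def rot13(text):
    """Apply ROT13. The cipher is its own inverse, so this both encodes and decodes."""
    return codecs.encode(text, "rot_13")


def majority_vote(answers):
    """Most common answer after trimming and lowercasing; ties go to the answer seen first."""
    normalized = [a.strip().lower() for a in answers]
    return Counter(normalized).most_common(1)[0][0]


def samples_until_correct(answers, correct):
    """Smallest k such that the vote over the first k answers equals `correct`, else None."""
    target = correct.strip().lower()
    for k in range(1, len(answers) + 1):
        if majority_vote(answers[:k]) == target:
            return k
    return None


print(rot13("What is the capital of France?"))  # Jung vf gur pncvgny bs Senapr?
```

Free-form model output usually needs more normalization than trimming and lowercasing before a vote is meaningful, for example extracting only the final answer from each response.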

From: https://www.cnblogs.com/comp9021/p/18445454
