GenAI Project Lifecycle: After picking pre-trained models, we can fine-tune!
In-context learning (ICL): zero / one / few shot inference. Including a few examples in the prompt helps the model learn the task and generate a better completion (aka output). Its drawbacks are (see the prompt sketch after this list):
- for smaller models, it doesn't work well even when a lot of examples are included
- examples take up space in the context window
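A minimal sketch of how such prompts are built; the sentiment task, review sentences, and labels are made up for illustration, and the resulting string is what gets sent to whatever LLM you are using:

```python
# Toy sentiment task to illustrate zero-shot vs. few-shot in-context learning.
# The reviews and labels are invented; substitute your own task.

zero_shot_prompt = (
    "Classify this review: I loved this movie!\n"
    "Sentiment:"
)

# Few-shot: a few solved examples are included in the prompt so the model can
# infer the task before completing the final, unsolved example.
few_shot_prompt = (
    "Classify this review: The plot was dull and predictable.\n"
    "Sentiment: Negative\n\n"
    "Classify this review: Great acting and a beautiful soundtrack.\n"
    "Sentiment: Positive\n\n"
    "Classify this review: I loved this movie!\n"
    "Sentiment:"
)

print(few_shot_prompt)  # every added example also consumes context-window space
```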
Pre-training: you train the LLM using vast amounts of unstructured textual data via self-supervised learning
Fine-tuning: supervised learning process where you use a data set of labeled examples to update the weights of the LLM.
Two types of fine-tuning
- Instruction fine-tuning (full fine-tuning: very costly!)
It trains the model using examples that demonstrate how it should respond to a specific instruction.
Prepare instruction dataset --> split the dataset into training, validation, and test --> calculate the loss between the model's completion and the provided label --> use the loss to update the model weights via standard backpropagation (a code sketch of this loop follows the list below)
- PEFT (Parameter-Efficient Fine-Tuning: cheaper!)
PEFT is a set of techniques that preserves the weights of the original LLM and trains only a small number of task-specific adapter layers and parameters.
ex. LoRA
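A minimal sketch of the instruction fine-tuning loop described above, using Hugging Face Transformers and PyTorch; the model name and the two-example dataset are illustrative assumptions, and a real run would use batched data loaders and a validation split:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "google/flan-t5-small"  # illustrative; any seq2seq LLM works similarly
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy instruction dataset: (prompt, label) pairs standing in for the prepared,
# already-split training set.
train_pairs = [
    ("Summarize: The cat sat on the mat all day.", "A cat rested on a mat."),
    ("Translate to French: Good morning.", "Bonjour."),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for prompt, label in train_pairs:
    inputs = tokenizer(prompt, return_tensors="pt")
    labels = tokenizer(label, return_tensors="pt").input_ids
    outputs = model(**inputs, labels=labels)  # cross-entropy loss: completion vs. label
    outputs.loss.backward()                   # standard backpropagation
    optimizer.step()                          # updates ALL model weights (full fine-tuning)
    optimizer.zero_grad()
```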
Catastrophic forgetting: full fine-tuning process modifies the weights of the original LLM, which can degrade performance on other tasks
--> To solve catastrophic forgetting, we can use PEFT!
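A minimal LoRA sketch with the Hugging Face `peft` library; the base model and hyperparameters are illustrative assumptions. The base weights stay frozen, so the original capabilities are preserved:

```python
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")  # illustrative

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # seq2seq fine-tuning
    r=8,                              # rank of the low-rank adapter matrices
    lora_alpha=32,                    # scaling factor for the adapter output
    lora_dropout=0.05,
)

peft_model = get_peft_model(base_model, lora_config)  # base weights are frozen
peft_model.print_trainable_parameters()  # only a tiny fraction of parameters train

# The same training loop as above can be run on `peft_model`; only the adapter
# weights change, so the original LLM's behavior on other tasks is preserved.
```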
Multi-task instruction fine-tuning: fine-tunes the model on examples from many tasks at once, but it requires a lot of data and examples
FLAN (Fine-tuned LAnguage Net): a specific set of instruction datasets used to fine-tune different models (e.g., FLAN-T5). Like the yummy dessert served after the main course of pre-training
Terms
- Unigram: a single word
- Bigram: two words
- n-gram: n words
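A tiny helper illustrating these terms (the example sentence is arbitrary):

```python
def ngrams(text, n):
    """Return the list of n-grams (as word tuples) in a whitespace-tokenized text."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

sentence = "It is cold outside"
print(ngrams(sentence, 1))  # unigrams: [('It',), ('is',), ('cold',), ('outside',)]
print(ngrams(sentence, 2))  # bigrams:  [('It', 'is'), ('is', 'cold'), ('cold', 'outside')]
```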
Model Evaluation Metrics
- **Accuracy** = Correct Predictions / Total Predictions
- ROUGE (Recall-Oriented Understudy for Gisting Evaluation): assesses the quality of automatically generated **summaries** by comparing them to human-generated reference summaries.
- BLEU (Bilingual Evaluation Understudy): an algorithm designed to evaluate the quality of machine-**translated** text by comparing it to human-generated translations.
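A minimal sketch of computing these metrics with the Hugging Face `evaluate` library; the predictions, references, and labels below are toy values:

```python
import evaluate

predictions = ["It is cold outside today"]
references  = ["It is very cold outside"]

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))
# ROUGE-1 / ROUGE-2 / ROUGE-L scores from unigram, bigram, and longest-common-subsequence overlap

bleu = evaluate.load("bleu")
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
# BLEU score from modified n-gram precision against the reference translation(s)

# Accuracy for a classification-style evaluation: correct predictions / total predictions
labels = [1, 0, 1, 1]
preds  = [1, 0, 0, 1]
print(sum(p == l for p, l in zip(preds, labels)) / len(labels))  # 0.75
```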
Benchmarks:
Tests that evaluate the capabilities of models, e.g. GLUE, SuperGLUE, MMLU (Massive Multitask Language Understanding), BIG-bench Hard, HELM (Holistic Evaluation of Language Models)