
【Coursera GenAI with LLM】 Week 2 Fine-tuning LLMs with instruction Class Notes

Posted: 2024-03-13 17:13:32 · Views: 28

Tags: Week, tuning, LLMs, instruction, models, LLM, fine

GenAI Project Lifecycle: After picking pre-trained models, we can fine-tune!

In-context learning (ICL): zero / one / few shot inference. You include a few examples in the prompt for the model to learn from and generate a better completion (i.e., output). Its drawbacks are:

  • for smaller models, it may not work even when many examples are included
  • the examples take up space in the context window
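
The ICL idea above can be sketched as simple prompt construction. The sentiment-classification task, labels, and formatting below are illustrative assumptions, not something prescribed by the course:

```python
def build_few_shot_prompt(examples, query):
    """Concatenate labeled examples before the query so the model can
    infer the task purely from context, with no weight updates."""
    parts = []
    for text, label in examples:
        parts.append(f"Review: {text}\nSentiment: {label}\n")
    # The final entry leaves the label blank for the model to complete.
    parts.append(f"Review: {query}\nSentiment:")
    return "\n".join(parts)

examples = [
    ("I loved this movie!", "positive"),
    ("Terrible plot and acting.", "negative"),
]
prompt = build_few_shot_prompt(examples, "What a great film.")
print(prompt)
```

Note how every added example lengthens the prompt, which is exactly the context-window drawback mentioned above.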

Pre-training: you train the LLM using vast amounts of unstructured textual data via self-supervised learning

Fine-tuning: supervised learning process where you use a data set of labeled examples to update the weights of the LLM.

Two types of fine-tuning

  1. Instruction fine-tuning (full fine-tuning: very costly!)
    It trains the model using examples that demonstrate how it should respond to a specific instruction.
    Prepare instruction dataset --> split the dataset into training, validation, and test sets --> calculate the loss between the training completion and the provided label --> use the loss to update the model weights via standard backpropagation
  2. PEFT (Parameter Efficient Fine-tuning: cheaper!)
    PEFT is a set of techniques that preserves the weights of the original LLM and trains only a small number of task-specific adapter layers and parameters.
    ex. LoRA
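
The loss step in the instruction fine-tuning pipeline above can be illustrated with a toy cross-entropy calculation. The tiny vocabulary and probabilities below are made up for illustration; real models compute this over thousands of vocabulary tokens per position:

```python
import math

def cross_entropy(predicted_probs, label_index):
    """Negative log-probability the model assigned to the correct token.
    This is the loss that backpropagation minimizes during fine-tuning."""
    return -math.log(predicted_probs[label_index])

# Suppose the vocabulary is ["positive", "negative", "neutral"] and the
# labeled completion for this training example is "positive" (index 0).
predicted = [0.7, 0.2, 0.1]  # model's post-softmax output distribution
loss = cross_entropy(predicted, 0)
print(f"loss = {loss:.4f}")
```

The loss shrinks as the model puts more probability mass on the labeled token, which is what the weight updates drive toward.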

Catastrophic forgetting: full fine-tuning process modifies the weights of the original LLM, which can degrade performance on other tasks
--> To solve catastrophic forgetting, we can use PEFT!
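
The LoRA idea mentioned above can be sketched in a few lines. This is an illustration of the low-rank-update concept, not the `peft` library API; the dimensions and initialization scheme are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 512, 512, 8

W = rng.normal(size=(d_out, d_in))     # frozen pre-trained weight, never updated
A = rng.normal(size=(r, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, r))               # trainable, zero-init so training starts from W

def forward(x):
    # Effective weight is W + B @ A, but only A and B would receive gradients.
    return W @ x + B @ (A @ x)

full_params = d_out * d_in
lora_params = r * (d_in + d_out)
print(f"trainable fraction: {lora_params / full_params:.3%}")
```

Because the original weights `W` stay frozen, the model's behavior on other tasks is largely preserved, which is why PEFT mitigates catastrophic forgetting.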

Multi-task instruction fine-tuning: fine-tunes the model on many tasks simultaneously, which avoids catastrophic forgetting but requires a lot of data and examples

FLAN (Fine-tuned LAnguage Net): a specific set of instruction datasets used to fine-tune different models. The course likens it to the dessert after the main course of pre-training (FLAN is also the name of a custard dessert)

Terms

  1. Unigram: a single word
  2. Bigram: two words
  3. n-gram: n words
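
The three terms above can be captured by one small helper (the example sentence is arbitrary):

```python
def ngrams(tokens, n):
    """Return all n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the cat sat on the mat".split()
unigrams = ngrams(tokens, 1)  # single words
bigrams = ngrams(tokens, 2)   # adjacent word pairs
print(bigrams)
```

ROUGE and BLEU below are both built on counting overlaps of exactly these n-grams.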

Model Evaluation Metrics

  1. **Accuracy** = Correct Predictions / Total Predictions
  2. ROUGE (Recall-Oriented Understudy for Gisting Evaluation): assesses the quality of automatically generated **summaries** by comparing them to human-generated reference summaries.
  3. BLEU (BiLingual Evaluation Understudy): an algorithm designed to evaluate the quality of machine-**translated** text by comparing it to human-generated translations.
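
A minimal sketch of ROUGE-1, the unigram variant of the metric above, assuming simple whitespace tokenization (real implementations add stemming and other normalization):

```python
from collections import Counter

def rouge1(candidate, reference):
    """ROUGE-1: unigram overlap between a generated summary and a reference.
    Recall divides by the reference length, precision by the candidate length."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, precision, f1

r, p, f = rouge1("the cat sat on the mat", "the cat is on the mat")
print(f"recall={r:.3f} precision={p:.3f} f1={f:.3f}")
```

ROUGE-2 and ROUGE-L follow the same recipe with bigrams and longest common subsequences, respectively.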



Benchmarks:
tests that evaluate the capabilities of models, e.g. GLUE, SuperGLUE, MMLU (Massive Multitask Language Understanding), BIG-bench Hard, HELM (Holistic Evaluation of Language Models)

From: https://www.cnblogs.com/miramira/p/18070904
