llama-factory fine-tuning

Date: 2023-11-29 12:55:57
Tags: optional, instruction, factory, dataset, user, llama, model, fine

data preparation

To fine-tune with llama-factory, a custom dataset must be prepared in one of the formats described below.

dataset classification

alpaca

The stanford_alpaca dataset is a well-known example: it was used to fine-tune LLaMA into the Alpaca model. Its structure is as follows.

[
  {
    "instruction": "user instruction (required)",
    "input": "user input (optional)",
    "output": "model response (required)",
    "history": [
      ["user instruction in the first round (optional)", "model response in the first round (optional)"],
      ["user instruction in the second round (optional)", "model response in the second round (optional)"]
    ]
  }
]
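As a minimal sketch of how such a file could be produced, the Python snippet below builds records with the fields shown above and writes them as a JSON array. The helper names `make_alpaca_record` and `dump_alpaca` and the sample texts are hypothetical and not part of llama-factory. The only rules enforced are the ones stated in the structure: `instruction` and `output` are required, while `input` and `history` are optional.

```python
import json


def make_alpaca_record(instruction, output, input_text="", history=None):
    """Build one alpaca-style record (hypothetical helper, not a llama-factory API)."""
    if not instruction or not output:
        raise ValueError("instruction and output are required")
    record = {"instruction": instruction, "input": input_text, "output": output}
    if history:
        # Each earlier round is a [user instruction, model response] pair.
        for turn in history:
            if len(turn) != 2:
                raise ValueError("each history turn must be an [instruction, response] pair")
        record["history"] = [list(turn) for turn in history]
    return record


def dump_alpaca(records, path):
    """Write the records as one JSON array, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)


# Sample data invented for illustration only.
records = [
    make_alpaca_record("Translate to French.", "Bonjour", input_text="Hello"),
    make_alpaca_record(
        "And what about 'thank you'?",
        "Merci",
        history=[["Translate 'hello' to French.", "Bonjour"]],
    ),
]
```

Calling `dump_alpaca(records, "my_dataset.json")` (the file name is arbitrary) would then write the array to disk.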

The diagram below shows how the Alpaca model was obtained:

[Figure: how the Alpaca model was obtained]


sharegpt

ShareGPT is a dialogue dataset contributed and shared by users. It contains conversations from many domains, topics, styles, and emotions, including chit-chat, Q&A, stories, poetry, and song lyrics. Because the samples are diverse, personal, and emotionally rich, they give chatbots more authentic linguistic and semantic material to learn from.

Its data structure is as follows.

[
  {
    "conversations": [
      {
        "from": "human",
        "value": "user instruction"
      },
      {
        "from": "gpt",
        "value": "model response"
      }
    ]
  }
]
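To show how the two formats relate, here is a rough Python sketch that converts one alpaca-style record into the sharegpt structure above. The function name `alpaca_to_sharegpt` is hypothetical, and joining `instruction` and `input` with a newline is an assumption made for this example, not a rule taken from llama-factory.

```python
def alpaca_to_sharegpt(record):
    """Convert one alpaca-style record to a sharegpt-style record (illustrative only)."""
    conversations = []
    # Earlier rounds come first, as alternating human/gpt messages.
    for user_turn, model_turn in record.get("history", []):
        conversations.append({"from": "human", "value": user_turn})
        conversations.append({"from": "gpt", "value": model_turn})
    # Assumption: instruction and optional input are joined with a newline.
    prompt = record["instruction"]
    if record.get("input"):
        prompt = prompt + "\n" + record["input"]
    conversations.append({"from": "human", "value": prompt})
    conversations.append({"from": "gpt", "value": record["output"]})
    return {"conversations": conversations}


# Sample record invented for illustration only.
example = {
    "instruction": "Translate to French.",
    "input": "Thank you",
    "output": "Merci",
    "history": [["Translate 'hello' to French.", "Bonjour"]],
}
converted = alpaca_to_sharegpt(example)
```

Each `history` pair becomes one human message followed by one gpt message, so a record with one earlier round yields four entries in `conversations`.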

 

   

 

From: https://www.cnblogs.com/ldzbky/p/17864572.html
