Transformers
Hugging Face Transformers provides interfaces for model loading, inference, and fine-tuning, making it easy to deploy and fine-tune natural language models. Its four most commonly used interfaces follow the AutoClass pattern, and all are invoked as AutoClass.from_pretrained("model_name"):
- AutoTokenizer: text tokenization
- AutoFeatureExtractor: feature extraction
- AutoProcessor: data processing
- AutoModel: model loading
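As a quick illustration, here is a minimal sketch of the AutoClass pattern (the checkpoint name bert-base-chinese is just an example; any Hub model name works the same way):

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModel.from_pretrained("bert-base-chinese")

inputs = tokenizer("你好,Transformers!", return_tensors="pt")
outputs = model(**inputs)  # outputs.last_hidden_state holds the per-token embeddings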
For example, with the ChatGLM family of models, the model exposes two calling interfaces, chat and stream_chat. A minimal demo:
from transformers import AutoTokenizer, AutoModel

# load (trust_remote_code is required for ChatGLM's custom model code)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
model.eval()
'''
init ...
'''
# call method 1: blocking chat, returns the complete response at once
def answer(query, history):
    response, history = model.chat(tokenizer, query, history=history)
    return response

# OR call method 2: streaming chat, yields partial responses as they are generated
def answer_stream(query, history):
    for response, history in model.stream_chat(tokenizer, query, history=history):
        yield response
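For context, these calls are typically driven by a simple interactive loop that threads the conversation history through each turn (a hypothetical usage sketch):

history = []
while True:
    query = input("user > ")
    if query.strip() in ("exit", "quit"):
        break
    response, history = model.chat(tokenizer, query, history=history)
    print(response)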
P-Tuning v2 DEMO
First, let's look at the core training code in main.py:
checkpoint = None
if training_args.resume_from_checkpoint is not None:
    checkpoint = training_args.resume_from_checkpoint
# elif last_checkpoint is not None:
#     checkpoint = last_checkpoint

# Trade compute for memory; enable_input_require_grads makes the frozen input
# embeddings require grad so gradient checkpointing can still back-propagate
# into the trainable prefix encoder
model.gradient_checkpointing_enable()
model.enable_input_require_grads()
train_result = trainer.train(resume_from_checkpoint=checkpoint)
# trainer.save_model()  # Saves the tokenizer too for easy upload

metrics = train_result.metrics
max_train_samples = (
    data_args.max_train_samples if data_args.max_train_samples is not None else len(train_dataset)
)
metrics["train_samples"] = min(max_train_samples, len(train_dataset))
trainer.log_metrics("train", metrics)
trainer.save_metrics("train", metrics)
trainer.save_state()
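The training_args and data_args referenced above are parsed near the top of main.py. A sketch of the typical pattern (HfArgumentParser and Seq2SeqTrainingArguments are real transformers classes; ModelArguments and DataTrainingArguments are the script's own dataclasses, and the import path is an assumption):

from transformers import HfArgumentParser, Seq2SeqTrainingArguments
from arguments import ModelArguments, DataTrainingArguments  # the script's own dataclasses

parser = HfArgumentParser((ModelArguments, DataTrainingArguments, Seq2SeqTrainingArguments))
model_args, data_args, training_args = parser.parse_args_into_dataclasses()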
As we can see, the core of the training flow is the pre-wrapped trainer:
from trainer_seq2seq import Seq2SeqTrainer

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset if training_args.do_train else None,
    eval_dataset=eval_dataset if training_args.do_eval else None,
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics if training_args.predict_with_generate else None,
    save_prefixencoder=model_args.pre_seq_len is not None,  # only checkpoint the P-Tuning prefix encoder
)
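The data_collator passed in is typically a DataCollatorForSeq2Seq. A hedged sketch of its construction (DataCollatorForSeq2Seq is a real transformers class; the specific argument values here are assumptions):

from transformers import DataCollatorForSeq2Seq

data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=-100,  # positions labeled -100 are ignored by the loss
    padding=False,
)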
Reading trainer_seq2seq, we can see:
from trainer import Trainer

class Seq2SeqTrainer(Trainer):
    def evaluate(self, eval_dataset=None, ignore_keys=None, metric_key_prefix="eval", **gen_kwargs):
        ...  # set up generation kwargs, then delegate to the base class
        return super().evaluate(
            eval_dataset,
            ignore_keys=ignore_keys,
            metric_key_prefix=metric_key_prefix,
        )

    def predict(self, test_dataset, ignore_keys=None, metric_key_prefix="test", **gen_kwargs):
        ...  # set up generation kwargs, then delegate to the base class
        return super().predict(
            test_dataset,
            ignore_keys=ignore_keys,
            metric_key_prefix=metric_key_prefix,
        )

    def prediction_step(self, model, inputs, prediction_loss_only, ignore_keys=None):
        ...  # run model.generate() during evaluation instead of a plain forward pass
        return (loss, generated_tokens, labels)

    def _pad_tensors_to_max_len(self, tensor, max_length):
        ...  # right-pad generated ids so every batch shares one sequence length
        return padded_tensor
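As an illustration, a plausible sketch of what _pad_tensors_to_max_len does (my reconstruction, not the repo's exact code): right-pad each generated sequence with the pad token up to max_length.

import torch

def pad_tensors_to_max_len(tensor, max_length, pad_token_id):
    # Start from a (batch, max_length) tensor filled with the pad token,
    # then copy the generated ids into its left part
    padded = torch.full(
        (tensor.shape[0], max_length), pad_token_id, dtype=tensor.dtype, device=tensor.device
    )
    padded[:, : tensor.shape[-1]] = tensor
    return padded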
Ultimately it inherits from the transformers library, with a few functions overridden for the extra needs of the task at hand, such as prediction with generation.
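For instance, prediction through this trainer would typically look like the following (a hypothetical usage sketch; predict_dataset is assumed to be a preprocessed dataset):

predict_results = trainer.predict(predict_dataset, metric_key_prefix="predict")
# Decode the generated token ids back into text
predictions = tokenizer.batch_decode(predict_results.predictions, skip_special_tokens=True)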
References
[1] Hugging Face Quick Start (focusing on Transformers models and the Datasets library), CSDN blog
[2] ChatGLM-6B/ptuning at main · THUDM/ChatGLM-6B · GitHub