• 2024-07-01 Cross-Model Knowledge Fusion: Knowledge Fusion for Large Models
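     Large language models (LLMs) are used in a growing number of domains, but ensuring that their behavior matches human values and intent remains difficult. Traditional alignment methods, such as reinforcement learning from human feedback (RLHF), have made some progress but still face several problems: training a reward model that accurately reflects human preferences is hard in itself; designing and optimizing the actor-critic architecture is complex; RLHF usually requires direct access to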
  • 2024-05-31 [Paper Reading] Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction
    Pre title: Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction source: arXiv 2024 paper: https://arxiv.org/abs/2402.02416 code: https://aligner2024.github.io/ ref: https://mp.weixin.qq.com/s/O9PP4Oc_Ee3R_HxKyd31Qg Keywords: LLM, align, fin