首页 > 其他分享 >《Language Model Cascades》论文学习

《Language Model Cascades》论文学习

时间:2023-07-21 15:14:53浏览次数:31  
标签:None Language question trees answer Cascades Model id he

一、Introduction

语言模型 (LM) 已展现出令人印象深刻的小样本学习能力,很多人建议应该将LM视为一个基础通用推理计算器,这个基础通用推理计算器可以被用于例如:

  • scratchpads
  • chain of thought prompting
  • learned verifiers
  • selection-inference
  • bootstrapping
  • been applied in formal mathematics settings to guide theorem provers

等场景中。

要完成上述这些”计算“场景中,有两种和LM的交互范式可以采用:

  • 通过prompt技术引导单个LM进行step-by-step的交互式推理
  • 通过将多个LM(采用了不同的微调方式,或者使用了不同的prompt方式)进行串联进行交互式推理

我们认为,概率编程语言(probabilistic programming languages,PPL)会在未来成为统一上述问题的一种通用框架。

和传统编程框架最大的不同是,概率变成语言(PPL)不使用整型或者浮点型数字作为输入和输出,取而代之的是自然语言字符串。

也就是说,我们使用PPL来定义基于字符串值随机变量的联合概率模型,同时以字符串作为输入参数,LM模型计算得到的后验结果(model inference)也是一个字符串。

我们将这个概率程序称为”语言模型级联(language model cascade)“。我们希望language model cascade能够使面向业务目标的端到端开发(通用计算和推理)成为可能。 

从概率统计的角度,我们来重新理解一下LM prompting技术。

zero-shot prompt的预测过程可以视为如下的条件概率预测,

  • Q:代表我们向LLM输入的问题
  • θ:代表预训练模型的模型参数
  • A:代表模型预测答案的概率分布

few-shot prompt的预测过程可以视为如下的条件概率预测, 

  • D:代表着一组question-answer pairs,

fine-tune/sft model的预测过程可以视为如下的条件概率预测, 

  • θ’:代表fine-tune/sft模型的模型参数

给LM推理过程增加thought辅助,可以视为如下的条件概率预测,

  • T:代表问题分解思考

上述条件概率分解公式,很直观地阐述了,如果希望LLM进行thought inference,需要在prompt中提供一个对应的question-thought-answer示例。

chain-of-thought prompt的预测过程可以视为如下的条件概率预测, 

 

  • prompt中的question-thought-answer示例可以是多个 

参考链接:

https://model-cascades.github.io/
https://arxiv.org/pdf/2207.10342.pdf

 

二、Cascades介绍

在本节中,我们将展示如何创建 cascades LM,以此解决各种基于语言的推理问题。

级联是一个概率程序,包含从LM采样得到的字符串随机变量。下图 2 是一个用于简单问答任务的cascade程序。

上图是一个Chain of thought的cascade程序例子,每一个Yield 表达式返回一个符合语言模型 S 的字符串概率分布。这个程序定义了一个包含”question“、”thought“、”answer“变量的联合概率分布。

其他的cascade程序例子如下:

The basic question answering graph directly generates the answer given the question

Self critique introduces a step in which the model critiques its own reasoning in natural languag 

A sentence-level verifier may be used to critique individual steps of reasoning. Furthermore, when to halt generation may itself be a random variable 

Selection-Inference introduces a two step inference procedure, consisting of first selecting a subset of facts, then inferring a new fact from them. 

我们将cascade作为一种基于跟踪的概率编程语言,嵌入 Python 中,构建出了一种新的概率编程语言。同时cascade支持任意控制流和递归。

虽然我们在本文中的演示是关于因果语言模型的few-shot promping,但理论上,cascade可以适用于微调模型、特殊 LM mask 设定,以及其他复杂数据类型(例如图像)。

 

三、Cascades代码示例

Cascades是一个Python库,支持语言模型的复杂组合,例如

  • scratchpads
  • chain of thought
  • tool use
  • selection-inference

Cascades可以被嵌入Python程序中,作为一个通用的、基于跟踪的概率编程库(universal trace-based probabilistic programming library)来使用。

0x1:Scratchpads and Chain of thought

# Copyright 2023 The cascades Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Basic tests for Scratchpads."""
from absl.testing import absltest
import cascades as cc
from cascades.examples import scratchpad


class ScratchpadTest(absltest.TestCase):

  def test_sample_solution(self):
    examples = scratchpad.load_chain_examples()
    print("examples: ", examples)
    target = scratchpad.ReasonIO(
        question='Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?',
        reason=None,
        answer='72',
        id='test/123')

    mock_lm = cc.mock_lm(
        response=': Half of 48 is 24. 24 + 48 is 72.\nAnswer: 72\n===')

    model = scratchpad.sample_with_prompts(
        lm=mock_lm,
        target=target,
        examples=examples,
        n_prompts=3)
    trace = model.sample(seed=0)
    print("trace: ", trace)

    self.assertEqual('72', trace.return_value)
    self.assertEqual('test/123', trace['problem_id'].value)


if __name__ == '__main__':
  absltest.main()


/*

examples:  (ReasonIO(question='There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?', reason='We start with 15 trees. Later we have 21 trees. The difference must be the number of trees they planted. So, they must have planted 21 - 15 = 6 trees.', answer='6', id=None), ReasonIO(question='If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?', reason='There are 3 cars in the parking lot already. 2 more arrive. Now there are 3 + 2 = 5 cars.', answer='5', id=None), ReasonIO(question='Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?', reason='Leah had 32 chocolates and Leah’s sister had 42. That means there were originally 32 + 42 = 74 chocolates. 35 have been eaten. So in total they still have 74 - 35 = 39 chocolates.', answer='39', id=None), ReasonIO(question='Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?', reason='Jason had 20 lollipops. Since he only has 12 now, he must have given the rest to Denny. The number of lollipops he has given to Denny must have been 20 - 12 = 8 lollipops.', answer='8', id=None), ReasonIO(question='Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?', reason='He has 5 toys. He got 2 from mom, so after that he has 5 + 2 = 7 toys. Then he got 2 more from dad, so in total he has 7 + 2 = 9 toys.', answer='9', id=None), ReasonIO(question='There were nine computers in the server room. Five more computers were installed each day, from monday to thursday. How many computers are now in the server room?', reason='There are 4 days from monday to thursday. 5 computers were added each day. That means in total 4 * 5 = 20 computers were added. There were 9 computers in the beginning, so now there are 9 + 20 = 29 computers.', answer='29', id=None), ReasonIO(question='Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf balls did he have at the end of wednesday?', reason='Michael initially had 58 balls. He lost 23 on Tuesday, so after that he has 58 - 23 = 35 balls. On Wednesday he lost 2 more so now he has 35 - 2 = 33 balls.', answer='33', id=None), ReasonIO(question='Olivia has $23. She bought five bagels for $3 each. How much money does she have left?', reason='She bought 5 bagels for $3 each. This means she spent 5 * $3 = $15 on the bagels. She had $23 in beginning, so now she has $23 - $15 = $8.', answer='8', id=None))


trace:  Record(
  problem_id: Log(name='problem_id', score=0.0, value='test/123', should_stop=False, replayed=False, metadata=None)
  choose_prompts: Sample(name='choose_prompts', score=Array(-6.2383246, dtype=float32, weak_type=True), value=[ReasonIO(question='There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there will be 21 trees. How many trees did the grove workers plant today?', reason='We start with 15 trees. Later we have 21 trees. The difference must be the number of trees they planted. So, they must have planted 21 - 15 = 6 trees.', answer='6', id=None), ReasonIO(question='Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he have now?', reason='He has 5 toys. He got 2 from mom, so after that he has 5 + 2 = 7 toys. Then he got 2 more from dad, so in total he has 7 + 2 = 9 toys.', answer='9', id=None), ReasonIO(question='Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?', reason='Jason had 20 lollipops. Since he only has 12 now, he must have given the rest to Denny. The number of lollipops he has given to Denny must have been 20 - 12 = 8 lollipops.', answer='8', id=None)], should_stop=False, replayed=False, metadata=None)
  thought: Sample(name='thought', score=Array(0., dtype=float32, weak_type=True), value=': Half of 48 is 24. 24 + 48 is 72.\nAnswer: 72\n', should_stop=False, replayed=False, metadata=None)
  return_value: Log(name='return_value', score=0.0, value='72', should_stop=False, replayed=False, metadata=None)
)

*/

0x2:Semi-supervised learning

Figure 3. QTA model with hidden thoughts. 

0x3:Selection-Inference

Selection Inference是一种多步推理prompt技术,它将prompt推理分为两个模块:

  • 选择模块(selection module):从给定问题的事实集合中,选择一个事实子集
  • 推理模块(inference module):根据被选择出的事实子集推断出新的事实

可以用下图4的模型来表示这个流程,

 

Figure 4. Selection inference as a cascade. Here S is the selected subset of facts and I is an inference driven by this subset. 

这里S是从预先指定的“事实”集合中选择的一组事实,I 是由该事实驱动的推论。

S 和 I 节点可以迭代以进行多步推理。

同时,基于few-shot prompt技术,我们可以通过给出示例来“训练”模型,以此来改变模型预测的前验概率,

0x4:Verifiers

尽管向模型添加明确的“thought”变量已经可以提高性能,但模型仍然可能得到得到错误的答案,或者从错误的推理中得到的”正确“的答案。

解决这个问题的一个直观地办法是,引导模型在推理的过程中判断答案和想法是否有效。

 

Figure 5. Verifier model. The small double-ringed nodes are deterministic buffer nodes that concatenate their inputs, accumulating all past strings. All other nodes are stochastic. The verifiers are observed to take on the “correct” value. 

0x5:Tool-use 

0x6:Twenty questions 

Figure 6. Twenty questions. 

参考链接:

https://github.com/google-research/cascades

 

标签:None,Language,question,trees,answer,Cascades,Model,id,he
From: https://www.cnblogs.com/LittleHann/p/17570471.html

相关文章

  • 线性模型(linear model)基本定义及参数求解数学本质、损失函数的选择与评估数学原理、及
    线性模型(linearmodel)基本定义及参数求解数学本质、损失函数的选择与评估数学原理、及其基于线性模型衍生的其他机器学习模型相关原理讨论1.线性模型简介0x1:线性模型的现实意义在一个理想的连续世界中,任何非线性的东西都可以被线性的东西来拟合(参考Taylor......
  • Rethinking with Retrieval Faithful Large Language Model Inference
    目录概Rethinkingwithretrieval(RR)代码HeH.,ZhangH.andRothD.Rethinkingwithretrieval:faithfullargelanguagemodelinference.arXivpreprintarXiv:2301.00303,2023.概LLM(LargeLanguageModel)+检索.Rethinkingwithretrieval(RR)CoT(Chai......
  • IfcLanguageId
    IfcLanguageId类型定义IfcLanguageId标识自然语言文本所用的语言。它使用语言标签来识别语言。 注:标签应符合RFC5646中表达的互联网工程任务组(IETF)语言标签。一致性程序应支持“简单语言”子标签和“语言区域”标签格式。简单语言标签en(英语)、de(德语)、fr(法国)或ja(日语)的示例......
  • 《Prompting Is Programming: A Query Language for Large Language Models》论文学习
    一、前言大型语言模型在诸如对话问答、代码生成等广泛任务上表现出了出色的性能。在较高的层次上,给定一段输入,大语言模型可用于按照概率统计方式自动补全序列。在此基础上,用户用指令(instructions)或示例(examples)去提示(prompt)大语言模型,以实施各种下游任务。本质上,提示(prompt)方法......
  • <%@ page language="java" contentType="text/html; charset=ISO-8859-1" pageEnc
    实现<%@pagelanguage="java"contentType="text/html;charset=ISO-8859-1"pageEnc的步骤为了帮助这位刚入行的小白实现<%@pagelanguage="java"contentType="text/html;charset=ISO-8859-1"pageEnc,我们需要按照以下步骤进行操作:步骤操作1创建一个......
  • model的meta 设定
     fromdjango.dbimportmodelsclassPost(models.Model):no=models.IntegerField()name=models.CharField(max_length=128)classMeta:#昇順ordering=["no"]#降順#ordering=['-no']......
  • REALM Retrieval-Augmented Language Model Pre-Training
    目录概REALMGuuK.,LeeK.,TungZ.,PasupatP.andChangM.REALM:Retrieval-augmentedlanguagemodelpre-training.ICML,2020.概赋予生成模型检索的能力.REALM如上图所示,作者希望实现这样一个事情:给定一个'预测'任务,如"The[MASK]atthetopofthep......
  • 经纬恒润ModelBase为智能驾驶仿真全面护航
        近期,工信部发文表示,支持L3及以上级别的自动驾驶的商业化应用。早在2016年,兰德智库就指出,一套自动驾驶系统至少需要测试110亿英里,才能达到量产应用条件。这漫长的测试里程,如果仅通过实车路试,将耗费巨大资源。因此,可以任意创造场景的仿真测试平台,成为当前自动驾驶仿真测......
  • jfinal 框架学习笔记-第三天 Model相关学习--record+Model增删改查的用法(震惊之今日刷
    1.了解了数据库连接池。其中使用最多也是最广泛的是druid数据库连接池也就是阿里云研发的数据库连接池2.ActiveRecord(jFinal的核心技术)+DruidPlugin(数据库连接词,如何与数据库打交道)ActiveRecord:1.Record(记录,相当于一个通用的Model),2.Model(提供日常CRUD的封装)Model示例......
  • 数据拟合曲线_LMFit(GaussianModel )
    今天,我们来谈一谈曲线拟合,想了很久要不要记录一下数据拟合曲线这个问题,最近又遇到了,还是决定浅浅记录一下,以免遗忘。主要还是说一下类似双峰的曲线或者形状怪异的曲线怎么进行拟合。首先说一下,在数据拟合的时候,往往遇到的曲线并非常规曲线,此时会发现,基本函数无法完美拟合,经......