1. What do LLMs change about the software development paradigm?
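People keep saying that GitHub Copilot will replace programmers. We think that is wrong. We already have LLMs as powerful as GPT-4, so why limit ourselves to writing traditional code? No! All code has bugs!
Code is not the ideal way to encode business logic. Code must be reviewed, and it does what the programmer tells it to do, not what the programmer wants it to do. The most troublesome part is that information is lost at every layer through which the intent is passed:
Real-world business logic -> the programmer's observation of the real world -> the programmer's model of the business logic, built from that observation -> the programmer's implementation of that model in code and frameworks -> the program executed on a computer system
Clearly this pipeline is inefficient and imprecise. The proper format for business logic is human intelligence! Who needs Python, EC2 instances, biz logic, and Postgres?
An ideal pipeline would look like this:
Real-world business logic -> an information carrier that embodies human intelligence (an understanding of how the world works) -> the program executed on a computer system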
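Taking a simple "todo list app" as an example, we built a complete backend plus database powered by an LLM. It infers the business logic from the name of the API call and persists the program's running state. Compared with traditional Software 1.0, the main features of this Software 2.0 program are:
- It is written by prompt programming (for example, the one-line prompt "This is a todo list app."), so no programmer has to write fixed backend business-logic code (for example, a sort_todos_alphabetically() function) or the routing code that goes with it.
- The JSON data format is initialized explicitly through a few-shot prompt (for example: {"todo_items": [{"title": "Eat breakfast", "completed": true}, {"title": "Go to school", "completed": false}]}).
- The frontend talks to the backend in natural language. In theory the frontend user can call an unlimited number of backend endpoints: the LLM translates the natural-language instruction into reasonable business logic, carries it out using its internal state, and then updates the current context state.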
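For contrast, the Software 1.0 version of just two of these endpoints might look like the sketch below. The function names mirror API calls mentioned in this article and the `todos` layout follows db.json; the handlers themselves are hypothetical and only illustrate the code that the prompt-based approach avoids writing.

```python
from copy import deepcopy

# Hypothetical hand-written handlers: in Software 1.0 every endpoint
# needs its own function (plus a route that maps a URL to it).

def sort_todos_alphabetically(state: dict) -> dict:
    """Return the todos sorted by title; the stored state is not modified."""
    return {"todos": sorted(state["todos"], key=lambda t: t["title"].lower())}

def delete_last_two_todos(state: dict) -> dict:
    """Return a new state with the last two todos removed."""
    new_state = deepcopy(state)
    new_state["todos"] = new_state["todos"][:-2]
    return new_state

state = {"todos": [
    {"title": "Learn react", "completed": True},
    {"title": "Buy Milk", "completed": True},
    {"title": "Do laundry", "completed": False},
    {"title": "Clean room", "completed": True},
]}
print([t["title"] for t in sort_todos_alphabetically(state)["todos"]])
print([t["title"] for t in delete_last_two_todos(state)["todos"]])
```

Every new capability (say, "delete all completed todos") would need another function of this kind, whereas the LLM backend only needs a new call name.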
Reference:
https://github.com/RootbeerComputer/backend-GPT
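2. An LLM-based Software 2.0 example: a todo-list app
We use GPT to handle essentially all of the backend logic of a todo-list app. The application state is stored as JSON with a few pre-filled entries, which help define the schema. We then pass in the prompt, the current state, and a user instruction/API call, and extract from the completion a response for the client plus the new state.
Here the LLM can be seen as an underlying general-purpose operating system that handles all the basic CRUD logic and saves us from writing backend routes. For example, the user can issue commands such as add_five_housework_todos(), delete_last_two_todos(), or sort_todos_alphabetically() without anyone having written a specific route for them.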
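From the client's point of view, issuing such a command is just an HTTP GET whose path contains the call name. A minimal sketch, assuming the server listed below is running locally on port 4321; `BASE_URL`, `build_url`, and `call_backend` are names made up for this illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:4321"  # assumption: server.py running locally

def build_url(app_name: str, api_call: str) -> str:
    """Percent-encode characters such as parentheses and spaces in the call name."""
    return f"{BASE_URL}/{quote(app_name, safe='')}/{quote(api_call, safe='')}"

def call_backend(app_name: str, api_call: str) -> dict:
    """Send the call and decode the JSON body returned by the server."""
    with urlopen(build_url(app_name, api_call)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Only the URL is built here; call_backend() needs a running server.
print(build_url("todo_list", "add_five_housework_todos()"))
```

Any string can be placed in the second path segment, which is what makes the set of "endpoints" open-ended.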
server.py
import ast
import json
import os
import re

import openai  # written against the pre-1.0 openai SDK (openai.ChatCompletion)
from flask import Flask
from flask_cors import CORS

DB_PATH = "db.json"

# Azure OpenAI settings. The key is read from the environment rather than
# hard-coded in the source (the variable name OPENAI_API_KEY is an assumption).
openai.api_type = "azure"
openai.api_base = "https://openai-service-instance-yuxuan.openai.azure.com/"
openai.api_version = "2023-06-01-preview"
openai.api_key = os.environ["OPENAI_API_KEY"]


def gpt3(prompt):
    """Send the prompt to the chat deployment and return the text of the reply."""
    messages = [
        {"role": "system", "content": "you are a useful assistant."},
        {"role": "user", "content": prompt},
    ]
    response = openai.ChatCompletion.create(
        engine="deployment-gpt-35-turbo-16k",  # engine = Azure deployment name
        messages=messages,
    )
    return response["choices"][0]["message"]["content"]


def load_db():
    """Read the whole database (prompts and states of all apps) from disk."""
    with open(DB_PATH, "r") as f:
        return json.load(f)


def parse_section(text):
    """Parse one section of the completion: JSON first, then a Python literal."""
    text = text.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # The model often answers with Python-style dicts (single quotes, True/False).
        return ast.literal_eval(text)


app = Flask(__name__)
CORS(app)

print("INITIAL DB STATE")
print(load_db()["todo_list"]["state"])


@app.route("/<app_name>/<api_call>")
def api(app_name, api_call):
    db = load_db()
    prompt = db[app_name]["prompt"]
    state = db[app_name]["state"]
    print("INPUT DB STATE")
    print(state)

    gpt3_input = f"""{prompt}
API Call (indexes are zero-indexed):
{api_call}

Database State:
{state}

Output the API response as json prefixed with '!API response!:'. Then output the new database state as json, prefixed with '!New Database State!:'. If the API call is only requesting data, then don't change the database state, but base your 'API Response' off what's in the database.
"""
    completion = gpt3(gpt3_input)
    print("completion: ", completion)

    # Split the completion into its "API Response" and "New Database State" parts.
    flags = re.I | re.S
    api_response_match = re.search(r"(?<=!API Response!:).*(?=!New Database State!:)", completion, flags)
    new_database_match = re.search(r"(?<=!New Database State!:).*", completion, flags)
    if api_response_match is None or new_database_match is None:
        # The model ignored the output format; leave the stored state untouched.
        return {"error": "could not parse the model output"}, 502

    response = parse_section(api_response_match.group(0))
    print("API RESPONSE")
    print(response)
    new_state = parse_section(new_database_match.group(0))
    print("New Database NEW_STATE")
    print(new_state)

    # Persist the state returned by the model.
    db[app_name]["state"] = new_state
    with open(DB_PATH, "w") as f:
        json.dump(db, f, indent=4)
    return response


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=4321)
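The most fragile step in the server is turning the free-text completion back into two data structures. The sketch below isolates that step so it can be tried without an LLM. `SECTION_RE`, `parse_literal`, and `parse_completion` are illustrative names, and the sample string is a shortened completion in the style of the logs in this article, not real model output.

```python
import ast
import json
import re

# One pattern with two named groups, matched case-insensitively across lines.
SECTION_RE = re.compile(
    r"!API response!:(?P<response>.*?)!New Database State!:(?P<state>.*)",
    re.IGNORECASE | re.DOTALL,
)

def parse_literal(text: str):
    """Parse JSON first; fall back to a Python literal (single quotes, True/False)."""
    text = text.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return ast.literal_eval(text)

def parse_completion(completion: str):
    """Split a completion into (api_response, new_state)."""
    match = SECTION_RE.search(completion)
    if match is None:
        raise ValueError("completion does not contain both section markers")
    return parse_literal(match["response"]), parse_literal(match["state"])

sample = (
    '!API response!: {"deleted_todos": ["ToDo 1"]}\n'
    "!New Database State!: {'todos': [{'title': 'Buy Milk', 'completed': True}]}"
)
response, new_state = parse_completion(sample)
print(response)
print(new_state)
```

Note that the sample mixes JSON (double quotes) and Python-literal syntax (single quotes, `True`), as the logged completions do, which is why a single `json.loads` call is not enough.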
db.json
{
    "todo_list": {
        "prompt": "This is a todo list app.",
        "state": {
            "todos": [
                { "title": "Learn react", "completed": true },
                { "title": "Buy Milk", "completed": true },
                { "title": "Do laundry", "completed": false },
                { "title": "Clean room", "completed": true }
            ]
        }
    },
    "chess": {
        "prompt": "You are a chess assistant",
        "state": {
            "board": [
                [ "r", "n", "b", "q", "k", "b", "n", "r" ],
                [ "p", "p", "p", "p", "p", "p", "p", "p" ],
                [ " ", " ", " ", " ", " ", " ", " ", " " ],
                [ " ", " ", " ", " ", " ", " ", " ", " " ],
                [ " ", " ", " ", " ", " ", " ", " ", " " ],
                [ " ", " ", " ", " ", " ", " ", " ", " " ],
                [ "P", "P", "P", "P", "P", "P", "P", "P" ],
                [ "R", "N", "B", "Q", "K", "B", "N", "R" ]
            ],
            "turn": "white",
            "white_castle_kingside": true,
            "white_castle_queenside": true,
            "black_castle_kingside": true,
            "black_castle_queenside": true,
            "en_passant": null,
            "halfmove_clock": 0,
            "fullmove_number": 1
        }
    }
}
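Because server.py looks up `db[app_name]` for whatever app name appears in the URL, adding another application appears to require nothing more than a new entry in db.json: a one-line prompt plus an example state that acts as the schema. A minimal sketch of that idea; `register_app` and the `shopping_list` entry are hypothetical and not part of the original project.

```python
import json

def register_app(db: dict, name: str, prompt: str, example_state: dict) -> dict:
    """Add a new app entry; the example state doubles as the few-shot schema."""
    if name in db:
        raise KeyError(f"app {name!r} already exists")
    db[name] = {"prompt": prompt, "state": example_state}
    return db

# In-memory stand-in for the contents of db.json.
db = {"todo_list": {"prompt": "This is a todo list app.", "state": {"todos": []}}}
register_app(
    db,
    "shopping_list",  # hypothetical app name
    "This is a shopping list app.",
    {"items": [{"name": "Milk", "quantity": 2, "bought": False}]},
)
print(json.dumps(db["shopping_list"], indent=4))
```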
List the current todo-list database, sorted alphabetically (the state already holds five extra items, ToDo 1 to ToDo 5, that are not in the initial db.json; presumably they were added by an earlier call not shown here):
http://8.222.198.132:4321/todo_list/sort_todos_alphabetically()

/*
INPUT DB STATE
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Clean room', 'completed': True}, {'title': 'ToDo 1', 'completed': False}, {'title': 'ToDo 2', 'completed': False}, {'title': 'ToDo 3', 'completed': False}, {'title': 'ToDo 4', 'completed': False}, {'title': 'ToDo 5', 'completed': False}]}
completion:  !API response!:
{'todos': [{'title': 'Buy Milk', 'completed': True}, {'title': 'Clean room', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Learn react', 'completed': True}, {'title': 'ToDo 1', 'completed': False}, {'title': 'ToDo 2', 'completed': False}, {'title': 'ToDo 3', 'completed': False}, {'title': 'ToDo 4', 'completed': False}, {'title': 'ToDo 5', 'completed': False}]}
!New Database State!:
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Clean room', 'completed': True}, {'title': 'ToDo 1', 'completed': False}, {'title': 'ToDo 2', 'completed': False}, {'title': 'ToDo 3', 'completed': False}, {'title': 'ToDo 4', 'completed': False}, {'title': 'ToDo 5', 'completed': False}]}
API RESPONSE
{'todos': [{'title': 'Buy Milk', 'completed': True}, {'title': 'Clean room', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Learn react', 'completed': True}, {'title': 'ToDo 1', 'completed': False}, {'title': 'ToDo 2', 'completed': False}, {'title': 'ToDo 3', 'completed': False}, {'title': 'ToDo 4', 'completed': False}, {'title': 'ToDo 5', 'completed': False}]}
New Database NEW_STATE
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Clean room', 'completed': True}, {'title': 'ToDo 1', 'completed': False}, {'title': 'ToDo 2', 'completed': False}, {'title': 'ToDo 3', 'completed': False}, {'title': 'ToDo 4', 'completed': False}, {'title': 'ToDo 5', 'completed': False}]}
122.235.82.203 - - [21/Jul/2023 21:23:43] "GET /todo_list/sort_todos_alphabetically() HTTP/1.1" 200 -
*/
Delete five incomplete todos from the database:
http://8.222.198.132:4321/todo_list/delete_five_housework_todos_which_not_complete()

/*
INPUT DB STATE
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Clean room', 'completed': True}, {'title': 'ToDo 1', 'completed': False}, {'title': 'ToDo 2', 'completed': False}, {'title': 'ToDo 3', 'completed': False}, {'title': 'ToDo 4', 'completed': False}, {'title': 'ToDo 5', 'completed': False}]}
completion:  !API response!:
{"deleted_todos": ["ToDo 1", "ToDo 2", "ToDo 3", "ToDo 4", "ToDo 5"]}
!New Database State!:
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Clean room', 'completed': True}]}
API RESPONSE
{'deleted_todos': ['ToDo 1', 'ToDo 2', 'ToDo 3', 'ToDo 4', 'ToDo 5']}
New Database NEW_STATE
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Clean room', 'completed': True}]}
122.235.82.203 - - [21/Jul/2023 21:25:24] "GET /todo_list/delete_five_housework_todos_which_not_complete() HTTP/1.1" 200 -
*/
Try to delete five incomplete todos again:
http://8.222.198.132:4321/todo_list/delete_five_housework_todos_which_not_complete()

/*
INPUT DB STATE
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}, {'title': 'Do laundry', 'completed': False}, {'title': 'Clean room', 'completed': True}]}
completion:  !API response!:
{"message": "Deleted 2 housework todos which were not complete"}
!New Database State!:
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}]}
API RESPONSE
{'message': 'Deleted 2 housework todos which were not complete'}
New Database NEW_STATE
{'todos': [{'title': 'Learn react', 'completed': True}, {'title': 'Buy Milk', 'completed': True}]}
122.235.82.203 - - [21/Jul/2023 21:26:31] "GET /todo_list/delete_five_housework_todos_which_not_complete() HTTP/1.1" 200 -
*/
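This time the LLM's result is wrong! It deleted "Clean room", which was already in the "completed" state, even though the instruction clearly said to delete only "not_complete" todos. Only "Do laundry" was incomplete, so the correct result would have removed that one item alone.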
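A mistake like this can be caught mechanically by comparing the state before and after the call. The sketch below checks one invariant for "delete incomplete todos" calls, using the two states from the log above; `wrongly_deleted` is a hypothetical helper and is not part of server.py.

```python
def wrongly_deleted(old_state: dict, new_state: dict) -> list:
    """Titles that were removed although they were already completed."""
    kept = {t["title"] for t in new_state["todos"]}
    return [
        t["title"]
        for t in old_state["todos"]
        if t["title"] not in kept and t["completed"]
    ]

# States copied from the second delete call in the log.
old_state = {"todos": [
    {"title": "Learn react", "completed": True},
    {"title": "Buy Milk", "completed": True},
    {"title": "Do laundry", "completed": False},
    {"title": "Clean room", "completed": True},
]}
new_state = {"todos": [
    {"title": "Learn react", "completed": True},
    {"title": "Buy Milk", "completed": True},
]}
print(wrongly_deleted(old_state, new_state))  # → ['Clean room']
```

A check of this kind has to be written per operation, which reintroduces some hand-written business logic; it is shown only to make the error in the log concrete.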
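3. Pros and cons of Software 2.0 versus Software 1.0, as seen in this example
0x1: Advantages of Software 2.0
- LLM-based programs are highly portable. In theory, once a prompt program is written, it can run on any server in the world against any backend LLM. Because input and output are both plain natural-language strings, the simplest possible format, there are no opcode, VM, or architecture compatibility issues; it is cross-platform in the purest sense.
- Software 2.0 code is short and easy to understand.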
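0x2: Current limitations of Software 2.0
- Results may not be reproducible. The defining trait of a traditional Software 1.0 program is repeatability: once the business code has been written, tested, and released, its behavior is predictable no matter how long it runs or what inputs it receives, and the same input always yields the same output. An LLM, however, is essentially a probabilistic model with an enormous number of parameters and execution paths (tens of billions of branches). No matter how thoroughly it is trained or how long it has been running, there is always some probability of an unexpected result. This is a major challenge for the stability of Software 2.0 programs, especially for financial software and other critical public-facing systems, where reproducibility is essential.
- Fundamentally, an LLM is a conditional probability model learned from pre-training corpora. When natural language is used to drive it through a simulation of complex business logic, it still makes occasional mistakes. Whether LLMs truly possess logical intelligence, let alone human-level intelligence, remains an open question.
- The input may exceed the LLM's maximum input token limit, so LLMs are a poor fit for data applications that take in or emit large volumes of information.
Note that these limitations only reflect the situation at the time of writing. LLMs are a rapidly developing field of research and technology, and these problems may well be solved in the near future.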
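One common way to reduce (not eliminate) the first limitation is to run the same call several times and accept only the most frequent answer. The sketch below is a minimal illustration of that idea; `majority_result` is a hypothetical helper, and the canned answers stand in for real LLM calls so the example runs offline.

```python
import json
from collections import Counter
from typing import Callable, Tuple

def majority_result(call_model: Callable[[str], dict], prompt: str, runs: int = 3) -> Tuple[dict, bool]:
    """Call the model `runs` times; return the most common result and whether all runs agreed."""
    # Serialise with sorted keys so equal dicts compare equal as strings.
    outputs = [json.dumps(call_model(prompt), sort_keys=True) for _ in range(runs)]
    winner, votes = Counter(outputs).most_common(1)[0]
    return json.loads(winner), votes == runs

# Stub standing in for a non-deterministic LLM: the second run disagrees.
canned = iter([
    {"deleted": ["Do laundry"]},
    {"deleted": ["Do laundry", "Clean room"]},
    {"deleted": ["Do laundry"]},
])
result, unanimous = majority_result(lambda _prompt: next(canned), "delete_incomplete_todos()")
print(result, unanimous)
```

Voting multiplies cost and latency by the number of runs, and for state-changing calls the new state should only be persisted after the vote, so this is a trade-off rather than a fix.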
From: https://www.cnblogs.com/LittleHann/p/17572449.html