6 Ways For Running A Local LLM
https://semaphoreci.com/blog/local-llm
1. Hugging Face and Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# source: https://huggingface.co/microsoft/DialoGPT-medium
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium", padding_side='left')
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Let's chat for 5 lines
for step in range(5):
    # encode the new user input, add the eos_token and return a tensor in Pytorch
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)

    # pretty print last output tokens from bot
    print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
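The snippet assumes the Hugging Face libraries are already installed, for example:

$ pip install transformers torch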
2. LangChain
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
from langchain.prompts import PromptTemplate

# wrap a local Hugging Face pipeline as a LangChain LLM
hf = HuggingFacePipeline.from_model_id(
    model_id="microsoft/DialoGPT-medium",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 200, "pad_token_id": 50256},
)

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

chain = prompt | hf

question = "What is electroencephalography?"
print(chain.invoke({"question": question}))
3. Llama.cpp
Llama.cpp is a C/C++ inference engine for LLMs that is optimized for Apple silicon and runs Meta's Llama 2 models.
Once we clone the repository and build the project, we can run a model with:
$ ./main -m /path/to/model-file.gguf -p "Hi there!"
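For reference, cloning and building the project typically looks like the following sketch; the exact build steps and binary names can differ between releases and platforms:

$ git clone https://github.com/ggerganov/llama.cpp
$ cd llama.cpp
$ make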
4. Llamafile
Llamafile, developed by Mozilla, offers a user-friendly alternative for running LLMs. It is known for its portability and for packing a model and its runtime into a single-file executable.
Once we download llamafile and any GGUF-formatted model, we can start a local browser session with:
$ ./llamafile -m /path/to/model.gguf
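On Linux and macOS the downloaded file must be marked executable before it can run, for example:

$ chmod +x llamafile
$ ./llamafile -m /path/to/model.gguf

By default the built-in server then opens a chat UI in the browser (typically at http://127.0.0.1:8080).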
5. Ollama
Ollama is a more user-friendly alternative to Llama.cpp and Llamafile. You download an executable that installs a service on your machine. Once installed, you open a terminal and run:
$ ollama run llama2
Ollama will download the model and start an interactive session.
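Besides the interactive session, Ollama also exposes a local HTTP API (by default on port 11434) that other tools can call. A minimal sketch, assuming the service is running and the llama2 model has already been pulled:

# Minimal sketch: query a locally running Ollama instance over its HTTP API.
# Assumes `ollama run llama2` (or `ollama pull llama2`) has been executed before.
import json
import urllib.request

payload = {"model": "llama2", "prompt": "Why is the sky blue?", "stream": False}
request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.loads(response.read())["response"])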
6. GPT4ALL
GPT4ALL is an easy-to-use desktop application with an intuitive GUI. It runs models locally and can also connect to OpenAI with an API key. It stands out for its ability to use local documents as context while keeping that data private.
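GPT4ALL also ships Python bindings (the gpt4all package), so the same models can be scripted outside the GUI. A minimal sketch; the model file name below is only an example and is downloaded on first use if it is not already cached:

from gpt4all import GPT4All

# Example model name; gpt4all fetches the file on first use if it is missing locally.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
with model.chat_session():
    print(model.generate("What is electroencephalography?", max_tokens=200))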