
240908 - Combining DBGPT with Ollama for Local RAG Knowledge Retrieval Augmentation



A. Final Result

[Screenshot: final result]

B. Background

  • DB-GPT added support for Ollama starting in version 0.5.6: see the v0.5.6 release notes
  [Screenshot: v0.5.6 release notes]

  • Community users have shared their web-side and UI-side configuration:

[Screenshot: community-shared configuration]

C. Environment Setup

  • Follow the official tutorial to complete the environment setup; a minimal sketch of the steps is shown below.
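
A minimal setup sketch, assuming a source install of the eosphoros-ai/DB-GPT repository under conda (the exact install extras and Python version may differ by release, so defer to the official tutorial):

# Install Ollama and pull the two models referenced in the configuration below
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2:1.5b               # chat model
ollama pull nomic-embed-text:latest  # embedding model

# Source install of DB-GPT
git clone https://github.com/eosphoros-ai/DB-GPT.git
cd DB-GPT
conda create -n dbgpt_env python=3.10 -y
conda activate dbgpt_env
pip install -e ".[default]"
cp .env.template .env    # then edit .env as shown in section D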

D. Configuration File

  • ⚠️ Pay attention to the steps marked with ⭐️ below.
#*******************************************************************#
#**             DB-GPT  - GENERAL SETTINGS                        **#  
#*******************************************************************#

#*******************************************************************#
#**                        Webserver Port                         **#
#*******************************************************************#
# DBGPT_WEBSERVER_PORT=5670
## Whether to enable the new web UI; enabled by default, set to False to use the old UI
# USE_NEW_WEB_UI=True
#*******************************************************************#
#***                       LLM PROVIDER                          ***#
#*******************************************************************#

# TEMPERATURE=0

#*******************************************************************#
#**                         LLM MODELS                            **#
#*******************************************************************#
# ⭐️ Add the Ollama configuration
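# LLM_MODEL=ollama_proxyllm routes chat requests to a local Ollama server:
# PROXY_SERVER_URL is the Ollama endpoint (11434 is Ollama's default port) and
# PROXYLLM_BACKEND is the model tag Ollama actually serves (pull it first with
# `ollama pull qwen2:1.5b`). Ollama needs no API key, so PROXY_API_KEY only
# holds a placeholder.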
LLM_MODEL=ollama_proxyllm
PROXY_SERVER_URL=http://127.0.0.1:11434
PROXYLLM_BACKEND="qwen2:1.5b"
PROXY_API_KEY=not_used
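# The embedding model is proxied through the same local Ollama server; following
# the "{model name}_{config key}" convention noted further below, the two keys
# after EMBEDDING_MODEL set the embedder's server URL and its model tag.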
EMBEDDING_MODEL=proxy_ollama
proxy_ollama_proxy_server_url=http://127.0.0.1:11434
proxy_ollama_proxy_backend="nomic-embed-text:latest"

# LLM_MODEL=ollama_proxyllm
# MODEL_SERVER=http://127.0.0.1:11434
# PROXYLLM_BACKEND=llama3.1:8b
# EMBEDDING_MODEL=proxy_ollama
# proxy_ollama_proxy_server_url=http://127.0.0.1:11434
# proxy_ollama_proxy_backend=llama3.1:8b

# LLM_MODEL=ollama_proxyllm
# PROXY_SERVER_URL=http://127.0.0.1:11434
# PROXYLLM_BACKEND="qwen:0.5b" 
# PROXY_API_KEY=not_used 
# EMBEDDING_MODEL=proxy_ollama 
# proxy_ollama_proxy_server_url=http://127.0.0.1:11434 
# proxy_ollama_proxy_backend="nomic-embed-text:latest"   



# # LLM_MODEL, see dbgpt/configs/model_config.LLM_MODEL_CONFIG
# LLM_MODEL=glm-4-9b-chat
# ## LLM model path, by default, DB-GPT will read the model path from LLM_MODEL_CONFIG based on the LLM_MODEL.
# ## Of course you can specify your model path according to LLM_MODEL_PATH
# ## In DB-GPT, the priority from high to low to read model path:
# ##    1. environment variable with key: {LLM_MODEL}_MODEL_PATH (Avoid multi-model conflicts)
# ##    2. environment variable with key: MODEL_PATH
# ##    3. environment variable with key: LLM_MODEL_PATH
# ##    4. the config in dbgpt/configs/model_config.LLM_MODEL_CONFIG
# # LLM_MODEL_PATH=/app/models/glm-4-9b-chat
# # LLM_PROMPT_TEMPLATE=vicuna_v1.1
# MODEL_SERVER=http://127.0.0.1:8000
# LIMIT_MODEL_CONCURRENCY=5
# MAX_POSITION_EMBEDDINGS=4096
# QUANTIZE_QLORA=True
# QUANTIZE_8bit=True
# # QUANTIZE_4bit=False
# ## SMART_LLM_MODEL - Smart language model (Default: vicuna-13b)
# ## FAST_LLM_MODEL - Fast language model (Default: chatglm-6b)
# # SMART_LLM_MODEL=vicuna-13b
# # FAST_LLM_MODEL=chatglm-6b
# ## Proxy LLM backend; this configuration is only valid when "LLM_MODEL=proxyllm". When the REST API of a deployment framework such as FastChat is used as the proxy LLM,
# ## "PROXYLLM_BACKEND" is the model it actually deploys. We can use "PROXYLLM_BACKEND" to load the prompt of the corresponding scene.
# # PROXYLLM_BACKEND=

# ### You can configure parameters for a specific model with {model name}_{config key}=xxx
# ### See dbgpt/model/parameter.py
# ## prompt template for current model
# # llama_cpp_prompt_template=vicuna_v1.1
# ## llama-2-70b must be 8
# # llama_cpp_n_gqa=8
# ## Model path
# # llama_cpp_model_path=/data/models/TheBloke/vicuna-13B-v1.5-GGUF/vicuna-13b-v1.5.Q4_K_M.gguf

# ### LLM cache
# ## Enable Model cache
# # MODEL_CACHE_ENABLE=True
# ## The storage type of model cache, now supports: memory, disk
# # MODEL_CACHE_STORAGE_TYPE=disk
# ## The max cache data in memory; we always store cache data in memory first for high speed.
# # MODEL_CACHE_MAX_MEMORY_MB=256
# ## The dir to save cache data, this configuration is only valid when MODEL_CACHE_STORAGE_TYPE=disk
# ## The default dir is pilot/data/model_cache
# # MODEL_CACHE_STORAGE_DISK_DIR=

#*******************************************************************#
#**                         EMBEDDING SETTINGS                    **#
#*******************************************************************#
# ⭐️ Disable the non-Ollama embedding settings (kept commented out below)
# EMBEDDING_MODEL=text2vec
# #EMBEDDING_MODEL=m3e-large
# #EMBEDDING_MODEL=bge-large-en
# #EMBEDDING_MODEL=bge-large-zh
# KNOWLEDGE_CHUNK_SIZE=500
# KNOWLEDGE_SEARCH_TOP_SIZE=5
# KNOWLEDGE_GRAPH_SEARCH_TOP_SIZE=200
# ## Maximum number of chunks to load at once. If a single document is large,
# ## you can raise this value for better performance;
# ## if you run out of memory when loading a large document, lower it.
# # KNOWLEDGE_MAX_CHUNKS_ONCE_LOAD=10
# #KNOWLEDGE_CHUNK_OVERLAP=50
# # Control whether to display the source document of knowledge on the front end.
# KNOWLEDGE_CHAT_SHOW_RELATIONS=False
# # Whether to enable Chat Knowledge Search Rewrite Mode
# KNOWLEDGE_SEARCH_REWRITE=False
# ## EMBEDDING_TOKENIZER   - Tokenizer to use for chunking large inputs
# ## EMBEDDING_TOKEN_LIMIT - Chunk size limit for large inputs
# # EMBEDDING_MODEL=all-MiniLM-L6-v2
# # EMBEDDING_TOKENIZER=all-MiniLM-L6-v2
# # EMBEDDING_TOKEN_LIMIT=8191

# ## Openai embedding model, See dbgpt/model/parameter.py
# # EMBEDDING_MODEL=proxy_openai
# # proxy_openai_proxy_server_url=https://api.openai.com/v1
# # proxy_openai_proxy_api_key={your-openai-sk}
# # proxy_openai_proxy_backend=text-embedding-ada-002


# ## qwen embedding model, See dbgpt/model/parameter.py
# # EMBEDDING_MODEL=proxy_tongyi
# # proxy_tongyi_proxy_backend=text-embedding-v1
# # proxy_tongyi_proxy_api_key={your-api-key}

# ## qianfan embedding model, See dbgpt/model/parameter.py
# #EMBEDDING_MODEL=proxy_qianfan
# #proxy_qianfan_proxy_backend=bge-large-zh
# #proxy_qianfan_proxy_api_key={your-api-key}
# #proxy_qianfan_proxy_api_secret={your-secret-key}


# ## Common HTTP embedding model
# # EMBEDDING_MODEL=proxy_http_openapi
# # proxy_http_openapi_proxy_server_url=http://localhost:8100/api/v1/embeddings
# # proxy_http_openapi_proxy_api_key=1dce29a6d66b4e2dbfec67044edbb924
# # proxy_http_openapi_proxy_backend=text2vec

#*******************************************************************#
#**                         RERANK SETTINGS                       **#
#*******************************************************************#
## Rerank model
# RERANK_MODEL=bge-reranker-base
## If you do not set RERANK_MODEL_PATH, DB-GPT will read the model path from EMBEDDING_MODEL_CONFIG based on the RERANK_MODEL.
# RERANK_MODEL_PATH=
## The number of rerank results to return
# RERANK_TOP_K=3

## Common HTTP rerank model
# RERANK_MODEL=rerank_proxy_http_openapi
# rerank_proxy_http_openapi_proxy_server_url=http://127.0.0.1:8100/api/v1/beta/relevance
# rerank_proxy_http_openapi_proxy_api_key={your-api-key}
# rerank_proxy_http_openapi_proxy_backend=bge-reranker-base




#*******************************************************************#
#**                  DB-GPT METADATA DATABASE SETTINGS            **#
#*******************************************************************#
### SQLite database (Current default database)
LOCAL_DB_TYPE=sqlite

### MYSQL database
# LOCAL_DB_TYPE=mysql
# LOCAL_DB_USER=root
# LOCAL_DB_PASSWORD={your_password}
# LOCAL_DB_HOST=127.0.0.1
# LOCAL_DB_PORT=3306
# LOCAL_DB_NAME=dbgpt
### This option determines where conversation records are stored. If not configured, it defaults to the old DuckDB storage. It can be set to db or file (if set to db, the database configured by LOCAL_DB is used).
#CHAT_HISTORY_STORE_TYPE=db

#*******************************************************************#
#**                         COMMANDS                              **#
#*******************************************************************#
EXECUTE_LOCAL_COMMANDS=False

#*******************************************************************#
#**            VECTOR STORE / KNOWLEDGE GRAPH SETTINGS            **#
#*******************************************************************#
VECTOR_STORE_TYPE=Chroma
GRAPH_STORE_TYPE=TuGraph
GRAPH_COMMUNITY_SUMMARY_ENABLED=True
KNOWLEDGE_GRAPH_EXTRACT_SEARCH_TOP_SIZE=5
KNOWLEDGE_GRAPH_EXTRACT_SEARCH_RECALL_SCORE=0.3
KNOWLEDGE_GRAPH_COMMUNITY_SEARCH_TOP_SIZE=20
KNOWLEDGE_GRAPH_COMMUNITY_SEARCH_RECALL_SCORE=0.0

### Chroma vector db config
#CHROMA_PERSIST_PATH=/root/DB-GPT/pilot/data

### Milvus vector db config
#VECTOR_STORE_TYPE=Milvus
#MILVUS_URL=127.0.0.1
#MILVUS_PORT=19530
#MILVUS_USERNAME
#MILVUS_PASSWORD
#MILVUS_SECURE=

### Weaviate vector db config
#VECTOR_STORE_TYPE=Weaviate
#WEAVIATE_URL=https://kt-region-m8hcy0wc.weaviate.network

## ElasticSearch vector db config
#VECTOR_STORE_TYPE=ElasticSearch
ElasticSearch_URL=127.0.0.1
ElasticSearch_PORT=9200
ElasticSearch_USERNAME=elastic
ElasticSearch_PASSWORD={your_password}

### TuGraph config
#TUGRAPH_HOST=127.0.0.1
#TUGRAPH_PORT=7687
#TUGRAPH_USERNAME=admin
#TUGRAPH_PASSWORD=73@TuGraph
#TUGRAPH_VERTEX_TYPE=entity
#TUGRAPH_EDGE_TYPE=relation
#TUGRAPH_PLUGIN_NAMES=leiden

#*******************************************************************#
#**                  WebServer Language Support                   **#
#*******************************************************************#
# en, zh, fr, ja, ko, ru
LANGUAGE=en
#LANGUAGE=zh


#*******************************************************************#
# **    PROXY_SERVER (openai interface | chatGPT proxy service), use chatGPT as your LLM.
# ⭐️ Comment out every PROXY_SERVER_URL other than Ollama's
# ** if your server can visit openai, please set PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions
# ** else if you have a chatgpt proxy server, you can set PROXY_SERVER_URL={your-proxy-serverip:port/xxx}
#*******************************************************************#
# PROXY_API_KEY={your-openai-sk}
# PROXY_SERVER_URL=https://api.openai.com/v1/chat/completions

# # from https://bard.google.com/     f12-> application-> __Secure-1PSID
# BARD_PROXY_API_KEY={your-bard-token}

#*******************************************************************#
# **  PROXY_SERVER +                                              **#
#*******************************************************************#

# Aliyun tongyi
TONGYI_PROXY_API_KEY={your-tongyi-sk}

## Baidu wenxin
#WEN_XIN_MODEL_VERSION={version}
#WEN_XIN_API_KEY={your-wenxin-sk}
#WEN_XIN_API_SECRET={your-wenxin-sct}

## Zhipu
#ZHIPU_MODEL_VERSION={version}
#ZHIPU_PROXY_API_KEY={your-zhipu-sk}

## Baichuan
#BAICHUN_MODEL_NAME={version}
#BAICHUAN_PROXY_API_KEY={your-baichuan-sk}
#BAICHUAN_PROXY_API_SECRET={your-baichuan-sct}

# Xunfei Spark
#XUNFEI_SPARK_API_VERSION={version}
#XUNFEI_SPARK_APPID={your_app_id}
#XUNFEI_SPARK_API_KEY={your_api_key}
#XUNFEI_SPARK_API_SECRET={your_api_secret}

## Yi Proxyllm, https://platform.lingyiwanwu.com/docs
#YI_MODEL_VERSION=yi-34b-chat-0205
#YI_API_BASE=https://api.lingyiwanwu.com/v1
#YI_API_KEY={your-yi-api-key}

## Moonshot Proxyllm, https://platform.moonshot.cn/docs/
# MOONSHOT_MODEL_VERSION=moonshot-v1-8k
# MOONSHOT_API_BASE=https://api.moonshot.cn/v1
# MOONSHOT_API_KEY={your-moonshot-api-key}

## Deepseek Proxyllm, https://platform.deepseek.com/api-docs/
# DEEPSEEK_MODEL_VERSION=deepseek-chat
# DEEPSEEK_API_BASE=https://api.deepseek.com/v1
# DEEPSEEK_API_KEY={your-deepseek-api-key}


#*******************************************************************#
#**    SUMMARY_CONFIG                                             **#
#*******************************************************************#
SUMMARY_CONFIG=FAST

#*******************************************************************#
#**    Multi-GPU                                                  **#
#*******************************************************************#
## See https://developer.nvidia.com/blog/cuda-pro-tip-control-gpu-visibility-cuda_visible_devices/
## If CUDA_VISIBLE_DEVICES is not configured, all available gpus will be used
# CUDA_VISIBLE_DEVICES=0
## You can configure the maximum memory used by each GPU.
# MAX_GPU_MEMORY=16Gib

#*******************************************************************#
#**                         LOG                                   **#
#*******************************************************************#
# FATAL, ERROR, WARNING, INFO, DEBUG, NOTSET
DBGPT_LOG_LEVEL=INFO
# LOG dir, default: ./logs
#DBGPT_LOG_DIR=


#*******************************************************************#
#**                         API_KEYS                              **#
#*******************************************************************#
# API_KEYS - The list of API keys that are allowed to access the API. Each of the below is an option, separated by commas.
# API_KEYS=dbgpt

#*******************************************************************#
#**                         ENCRYPT                               **#
#*******************************************************************#
# ENCRYPT KEY - The key used to encrypt and decrypt the data
# ENCRYPT_KEY=your_secret_key

#*******************************************************************#
#**                         File Server                           **#
#*******************************************************************#
## The local storage path of the file server, the default is pilot/data/file_server
# FILE_SERVER_LOCAL_STORAGE_PATH =

#*******************************************************************#
#**                     Application Config                        **#
#*******************************************************************#
## Non-streaming scene retries
# DBGPT_APP_SCENE_NON_STREAMING_RETRIES_BASE=1
## Non-streaming scene parallelism
# DBGPT_APP_SCENE_NON_STREAMING_PARALLELISM_BASE=1

#*******************************************************************#
#**                   Observability Config                        **#
#*******************************************************************#
## Whether to enable DB-GPT to send traces to OpenTelemetry
# TRACER_TO_OPEN_TELEMETRY=False
## Following configurations are only valid when TRACER_TO_OPEN_TELEMETRY=True
## More details see https://opentelemetry-python.readthedocs.io/en/latest/exporter/otlp/otlp.html
# OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4317
# OTEL_EXPORTER_OTLP_TRACES_INSECURE=False
# OTEL_EXPORTER_OTLP_TRACES_CERTIFICATE=
# OTEL_EXPORTER_OTLP_TRACES_HEADERS=
# OTEL_EXPORTER_OTLP_TRACES_TIMEOUT=
# OTEL_EXPORTER_OTLP_TRACES_COMPRESSION=

#*******************************************************************#
#**                     FINANCIAL CHAT Config                     **#
#*******************************************************************#
# FIN_REPORT_MODEL=/app/models/bge-large-zh

  • Troubleshooting

[Screenshot: troubleshooting]
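
If the model does not respond, a quick way to rule out the Ollama side is to query its HTTP API directly (a minimal sketch; 11434 is Ollama's default port and the model tags match the configuration above):

# List pulled models; qwen2:1.5b and nomic-embed-text:latest should appear
curl http://127.0.0.1:11434/api/tags

# Smoke-test the chat model
curl http://127.0.0.1:11434/api/generate -d '{"model": "qwen2:1.5b", "prompt": "hello", "stream": false}'

# Smoke-test the embedding model
curl http://127.0.0.1:11434/api/embeddings -d '{"model": "nomic-embed-text:latest", "prompt": "hello"}'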

E. Running and Usage

E.1 Start
python dbgpt/app/dbgpt_server.py
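
Make sure the Ollama daemon is running before launching the server, since the proxy model worker connects to it at startup; a typical sequence (sketch):

# Terminal 1: keep Ollama serving (skip if it already runs as a system service)
ollama serve

# Terminal 2: launch DB-GPT, then open http://127.0.0.1:5670 (DBGPT_WEBSERVER_PORT)
python dbgpt/app/dbgpt_server.py
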
E.2 Usage

[Screenshot: usage]

F. Configuring Citation Sources

In the new version, citation sources have moved into applications: create an application, bind it to a knowledge base, and the sources will then be displayed when chatting inside the application.

F.1 Click Create Application

[Screenshot: creating an application]

[Screenshot: application settings]

F.2 Bind the Knowledge Base

[Screenshot: binding the knowledge base]

F.3 Select the Application

[Screenshot: selecting the application]

F.4 View Citation Sources in the Conversation

[Screenshot: citation sources shown in the conversation]


From: https://blog.51cto.com/guokliu/12044608
