Importing Excel Files into a Vector Database (Milvus) and Querying the Knowledge Base (Milvus) Before the Large Model (Ollama)

Tags: import, knowledge, knowledge base, Excel, query, Milvus

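This post walks through importing an Excel file into a vector database (Milvus) and answering questions by querying the knowledge base (Milvus) first and the large model (Ollama) second. The steps are as follows: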

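1. Import Excel data into the vector database (Milvus)

First, vectorize the data in the Excel file and load the vectors into Milvus. pandas reads the Excel file, sentence-transformers turns the text into vectors, and pymilvus stores them in Milvus.

Install the required dependencies (fastapi, uvicorn, python-multipart, and requests are used by the later steps):

pip install pandas openpyxl sentence-transformers pymilvus fastapi uvicorn python-multipart requests

Code example: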

import pandas as pd
from sentence_transformers import SentenceTransformer
from pymilvus import connections, CollectionSchema, FieldSchema, DataType, Collection

# Connect to Milvus
connections.connect(host="localhost", port="19530")

# Define the collection schema in Milvus
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=500),
    # all-MiniLM-L6-v2 outputs 384-dimensional vectors; adjust dim to match your model
    FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=384)
]
schema = CollectionSchema(fields, "Excel Data Collection")

# Create the collection
collection = Collection(name="excel_data", schema=schema)

# Load an Excel file and vectorize it
def load_excel_to_milvus(excel_file):
    # Read the Excel data (a file path or a file-like object both work)
    df = pd.read_excel(excel_file)

    # Load the sentence embedding model
    model = SentenceTransformer('all-MiniLM-L6-v2')  # or choose another model as needed

    # Vectorize the text column (assumes a column named 'text_column'); skip empty cells
    texts = df['text_column'].dropna().astype(str).tolist()
    vectors = model.encode(texts)

    # Insert the text and vectors into Milvus (the id field is generated automatically)
    collection.insert([texts, vectors.tolist()])
    collection.flush()

    # Create the index, then load the collection so it can be searched
    collection.create_index(field_name="vector", index_params={"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 100}})
    collection.load()

# Example: import an Excel file
load_excel_to_milvus('data.xlsx')
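
For larger spreadsheets, encoding and inserting every row in one call can use a lot of memory. The following is a minimal sketch of splitting the rows into fixed-size batches. The helper name `batched` and the batch size are assumptions made for illustration, not part of the original example; the comment shows where the encode and insert calls from the import function would go.

```python
from typing import Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical usage inside load_excel_to_milvus, where `texts` is the list
# of strings read from the Excel column:
#   for chunk in batched(texts, 256):
#       collection.insert([chunk, model.encode(chunk).tolist()])
rows = ["row a", "row b", "row c", "row d", "row e"]
chunks = list(batched(rows, 2))
print(chunks)
```

Smaller batches lower peak memory at the cost of more round trips to Milvus, so the right size depends on the data and the deployment.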

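2. Query the knowledge base through Milvus

Once the data is in Milvus, the knowledge base can be searched by vector similarity: the user's input is converted to a vector and matched against the vectors stored in Milvus.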

def query_knowledge_base(user_query):
    # Vectorize the user query with the same model used at import time
    model = SentenceTransformer('all-MiniLM-L6-v2')
    query_vector = model.encode([user_query]).tolist()

    # Search Milvus for the most similar vectors
    search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
    results = collection.search(
        data=query_vector,
        anns_field="vector",
        param=search_params,
        limit=5,
        output_fields=["text"]
    )

    # Return the text of each hit for the single query vector
    return [hit.entity.get("text") for hit in results[0]]
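
With the L2 metric a smaller distance means a closer match, and the search returns its top results even when none of them is really relevant. Below is a minimal sketch of dropping weak matches before they reach the model, assuming the hits have already been unpacked into (text, distance) pairs (in pymilvus each hit exposes a `distance` attribute). The helper name `filter_hits` and the cutoff value are illustrative assumptions; a sensible cutoff depends on the embedding model and the data.

```python
from typing import List, Tuple


def filter_hits(hits: List[Tuple[str, float]], max_distance: float) -> List[str]:
    """Keep texts whose L2 distance is at most `max_distance`, closest first."""
    kept = [(text, dist) for text, dist in hits if dist <= max_distance]
    kept.sort(key=lambda pair: pair[1])
    return [text for text, _ in kept]


# Made-up distances, for illustration only.
example_hits = [
    ("Milvus stores vectors", 0.42),
    ("Unrelated row", 1.87),
    ("Ollama runs models locally", 0.95),
]
print(filter_hits(example_hits, max_distance=1.0))
```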

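3. Call the Ollama model for augmentation

The results returned by the knowledge base can be passed to the large model (Ollama) as input to improve its answers. This approach is known as knowledge-augmented reasoning (more commonly called retrieval-augmented generation): the knowledge base first narrows the query down to relevant material, and the large model then reasons over it.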

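import requests

def call_ollama_model_with_knowledge(knowledge_context, user_query):
    # Pass the knowledge base results to the Ollama model as context
    prompt = f"Here is some relevant information from the knowledge base: {knowledge_context}\nNow answer the user's question: {user_query}"

    # Ollama's default local endpoint; adjust to match your Ollama service configuration
    url = "http://localhost:11434/api/generate"
    headers = {
        "Content-Type": "application/json"
    }
    data = {
        "model": "llama3",  # assumption: replace with the name of a model you have pulled
        "prompt": prompt,
        "stream": False  # return a single JSON object instead of a stream of chunks
    }

    response = requests.post(url, json=data, headers=headers, timeout=120)
    response.raise_for_status()
    return response.json()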
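
When the knowledge base returns nothing, the prompt above still announces relevant information and then supplies an empty string. The following is a minimal sketch of a prompt builder that numbers the retrieved passages and falls back to the bare question when there are none. `build_prompt` and its wording are assumptions for illustration, not part of the original code.

```python
from typing import List


def build_prompt(passages: List[str], user_query: str) -> str:
    """Build a prompt from retrieved passages, or just the question if none."""
    # Ignore empty or whitespace-only passages.
    cleaned = [p.strip() for p in passages if p and p.strip()]
    if not cleaned:
        return f"Answer the user's question: {user_query}"
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(cleaned, start=1))
    return (
        "Here is some relevant information from the knowledge base:\n"
        f"{numbered}\n"
        f"Now answer the user's question: {user_query}"
    )


print(build_prompt(["Milvus is a vector database.", "  "], "What is Milvus?"))
```

Numbering the passages keeps them visually separate in the prompt, which a plain space-joined string does not.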

4. Implement the full flow with FastAPI

FastAPI can tie the whole flow together: Excel import -> knowledge base query -> Ollama augmentation.

FastAPI code example:

import io

from fastapi import Body, FastAPI, File, UploadFile

app = FastAPI()

# Import an Excel file into the Milvus vector database
@app.post("/upload_excel/")
async def upload_excel(file: UploadFile = File(...)):
    # load_excel_to_milvus reads the Excel data itself, so hand it the uploaded bytes
    contents = await file.read()
    load_excel_to_milvus(io.BytesIO(contents))
    return {"message": "Excel file has been uploaded and processed"}

# Query the knowledge base, then call the Ollama model
@app.post("/query/")
async def query_knowledge_and_model(user_input: str = Body(..., embed=True)):
    # embed=True makes the endpoint expect a JSON body of the form {"user_input": "..."}
    # 1. Query the Milvus knowledge base
    knowledge_results = query_knowledge_base(user_input)

    # 2. Pass the knowledge base results to the Ollama model as context
    knowledge_context = " ".join(knowledge_results)
    ollama_response = call_ollama_model_with_knowledge(knowledge_context, user_input)

    return {"knowledge_response": knowledge_results, "ollama_response": ollama_response}

5. Run the FastAPI application

Start the FastAPI server (assuming the code above is saved as main.py):

uvicorn main:app --reload

6. Use the API

Import an Excel file into Milvus:

curl -X POST "http://127.0.0.1:8000/upload_excel/" -F "file=@data.xlsx"

Query the knowledge base and call the Ollama model:

curl -X POST "http://127.0.0.1:8000/query/" -H "Content-Type: application/json" -d '{"user_input": "Recent advances in artificial intelligence"}'
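
The same query can also be sent from Python. Here is a minimal sketch using only the standard library, assuming the server from the previous step is listening on 127.0.0.1:8000 and that the endpoint reads `user_input` from a JSON body, as the curl command implies. `build_query_request` is a hypothetical helper name. The network call is left commented out so the snippet does not need a running server.

```python
import json
import urllib.request


def build_query_request(user_input: str,
                        base_url: str = "http://127.0.0.1:8000") -> urllib.request.Request:
    """Build a POST request for /query/ carrying the question as JSON."""
    body = json.dumps({"user_input": user_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request("Recent advances in artificial intelligence")
print(request.get_method(), request.full_url)

# To actually send it (requires the FastAPI server to be running):
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```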

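7. Summary

Excel import: read the Excel data with pandas, convert the text to vectors with a sentence-transformers model, and store them in Milvus.
Knowledge base query: search Milvus by vector similarity and return the most similar results.
Augmented reasoning: pass the retrieved knowledge base context to the Ollama model as input to improve its answers.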

From: https://blog.51cto.com/u_13171517/12017500
