Getting Started with Milvus

Time: 2024-06-13 20:46:36

Tags: index, getting started, sentence, collection, params, usage, data, milvus, name

Result after inserting data:

The code is as follows:

import configparser
from pymilvus import connections, Collection, DataType, FieldSchema, CollectionSchema
import numpy as np

def create_collection():
    # Define the schema
    fields = [
        FieldSchema(name="sentence_id", dtype=DataType.INT64, is_primary=True, auto_id=True),
        FieldSchema(name="sentence", dtype=DataType.VARCHAR, max_length=512),
        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=128)
    ]
    schema = CollectionSchema(fields, description="Sentence collection")

    # Create the collection
    collection = Collection(name="sentence_collection", schema=schema)
    return collection

def insert_data(collection):
    sentences = [
        "这是第一句。",
        "这是第二句。",
        "这是第三句。"
    ]
    
    embeddings = np.random.rand(len(sentences), 128).tolist()  # Random 128-dimensional placeholders, not real sentence embeddings
    
    entities = [
        sentences,
        embeddings
    ]

    insert_result = collection.insert(entities)
    print(f"Inserted {len(insert_result.primary_keys)} records into collection.")

def create_index(collection):
    index_params = {
        "index_type": "IVF_FLAT",
        "params": {"nlist": 128},
        "metric_type": "L2"
    }
    collection.create_index(field_name="embedding", index_params=index_params)
    print("Index created.")

def search_data(collection, query_sentence):
    query_embedding = np.random.rand(1, 128).tolist()  # Random placeholder vector; query_sentence is not actually embedded here

    search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
    
    results = collection.search(
        data=query_embedding,
        anns_field="embedding",
        param=search_params,
        limit=3,
        expr=None,
        output_fields=["sentence"]
    )
    
    for hits in results:
        for hit in hits:
            print(f"Match found: {hit.id} with distance: {hit.distance}, sentence: {hit.entity.get('sentence')}")

if __name__ == '__main__':
    # Connect to Milvus
    cfp = configparser.RawConfigParser()
    cfp.read('config.ini')
    milvus_uri = cfp.get('example', 'uri')
    token = cfp.get('example', 'token')
    connections.connect("default",
                        uri=milvus_uri,
                        token=token)
    print(f"Connecting to DB: {milvus_uri}")
    
    # Create collection
    collection = create_collection()

    # Insert data
    insert_data(collection)
    
    # Create index
    create_index(collection)
    
    # Load the collection into memory
    collection.load()
    
    # Search data
    search_data(collection, "这是一个查询句子。")
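
The script reads its connection settings from a config.ini file that the post does not show. Below is a minimal sketch of what that file presumably looks like, inferred only from the cfp.get('example', 'uri') and cfp.get('example', 'token') calls above. The endpoint and token values are placeholders, and load_settings is a hypothetical helper name used only for this illustration.

```python
import configparser

# Hypothetical contents of config.ini. The section and key names come from
# the script above; the values are placeholders to replace with your own.
SAMPLE_CONFIG = """
[example]
uri = https://YOUR-CLUSTER-ENDPOINT
token = YOUR-API-KEY
"""

def load_settings(text):
    """Parse config text and return the (uri, token) pair the script needs."""
    cfp = configparser.RawConfigParser()
    cfp.read_string(text)
    return cfp.get("example", "uri"), cfp.get("example", "token")

uri, token = load_settings(SAMPLE_CONFIG)
print(uri, token)
```

Note that cfp.read('config.ini') silently skips a file it cannot find, so a missing config.ini only surfaces later as a NoSectionError from cfp.get.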

Output:

python hello_zilliz_vectordb.py
Connecting to DB: https://in03-ca69f49bb65709f.api.gcp-us-west1.zillizcloud.com
Inserted 3 records into collection.
Index created.
Match found: 450140263656791260 with distance: 19.557846069335938, sentence: 这是第二句。
Match found: 450140263656791261 with distance: 20.327802658081055, sentence: 这是第三句。
Match found: 450140263656791259 with distance: 20.40052032470703, sentence: 这是第一句。
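
The distances of roughly 19 to 20 reflect the random data rather than any semantic similarity. Assuming, as the Milvus documentation describes, that the L2 metric is reported as the squared Euclidean distance (no square root), two independent vectors drawn uniformly from [0, 1) in 128 dimensions have an expected distance of 128/6, or about 21.3, and the three nearest hits sit slightly below that. The NumPy sketch below checks the expectation without touching Milvus; the helper name squared_l2 is only for this illustration.

```python
import numpy as np

def squared_l2(a, b):
    """Squared Euclidean distance between two vectors (no square root)."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.dot(diff, diff))

rng = np.random.default_rng(0)  # fixed seed only to make the sketch repeatable
dim = 128
pairs = 2000

# Average squared distance over many random pairs, mimicking the random
# embeddings and random query vector used in the script above.
distances = [squared_l2(rng.random(dim), rng.random(dim)) for _ in range(pairs)]
mean_distance = sum(distances) / pairs

print(f"mean squared L2 distance: {mean_distance:.2f}")
print(f"theoretical expectation:  {dim / 6:.2f}")
```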

Notes:

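Vector conversion: The code above uses random vectors to stand in for sentence vectors. In a real application, you need an NLP model (for example, a Chinese BERT) to convert Chinese sentences into vectors.
Character encoding: Make sure the correct character encoding (usually UTF-8) is used when reading and processing Chinese text.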
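
Both notes can be made concrete with a stand-in for the embedding step. The sketch below is not a BERT model: it swaps in character-bigram feature hashing purely so that the example runs without downloading anything, and the function name embed_sentences is hypothetical. What it does illustrate is the contract a real model has to meet for this collection: one vector of exactly 128 floats per sentence (matching dim=128 in the schema), with the text encoded explicitly as UTF-8 before it is processed.

```python
import hashlib
import math

DIM = 128  # must match dim=128 of the "embedding" field in the schema

def embed_sentences(sentences, dim=DIM):
    """Map each sentence to a deterministic dim-dimensional unit vector.

    Placeholder technique: character-bigram feature hashing. A real
    application would call an NLP model (e.g. a Chinese BERT) here instead.
    """
    vectors = []
    for sentence in sentences:
        vec = [0.0] * dim
        # Pad so that even a very short sentence yields at least one bigram.
        padded = f" {sentence} "
        for i in range(len(padded) - 1):
            bigram = padded[i:i + 2].encode("utf-8")  # explicit UTF-8 bytes
            digest = hashlib.sha256(bigram).digest()
            bucket = int.from_bytes(digest[:4], "big") % dim
            vec[bucket] += 1.0
        # Normalize to unit length so distances are comparable across sentences.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

sentences = ["这是第一句。", "这是第二句。", "这是第三句。"]
embeddings = embed_sentences(sentences)
print(len(embeddings), len(embeddings[0]))
```

The returned list could be passed in place of np.random.rand(...).tolist() in insert_data, and the same function should then be used to embed the query in search_data so that stored and query vectors live in the same space.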

From: https://www.cnblogs.com/bonelee/p/18246704
