Using a text embedding model locally with semantic kernel

Date: 2024-07-05 21:28:46

Question:

I've been reading Stephen Toub's blog post about building a simple console-based .NET chat application from the ground up with Semantic Kernel. I'm following the examples, but instead of OpenAI I want to use Microsoft's Phi-3 and the nomic embedding model. I could recreate the first examples in the blog post using the Semantic Kernel HuggingFace connector, but I can't get the text embedding example to run.

I've downloaded Phi-3 and nomic-embed-text and am running them on a local server with LM Studio.

Here's the code I came up with, which uses the HuggingFace connector:

using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;
using System.Numerics.Tensors;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel.ChatCompletion;

#pragma warning disable SKEXP0070, SKEXP0003, SKEXP0001, SKEXP0011, SKEXP0052, SKEXP0055, SKEXP0050  // Type is for evaluation purposes only and is subject to change or removal in future updates. 

internal class Program
{
    private static async Task Main(string[] args)
    {
        // Initialize the Semantic Kernel and register the embedding service hosted by LM Studio
        IKernelBuilder kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.Services.ConfigureHttpClientDefaults(c => c.AddStandardResilienceHandler());
        var kernel = kernelBuilder
            .AddHuggingFaceTextEmbeddingGeneration("nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf",
            new Uri("http://localhost:1234/v1"),
            apiKey: "lm-studio",
            serviceId: null)
            .Build();

        var embeddingGenerator = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
        var memoryBuilder = new MemoryBuilder();
        memoryBuilder.WithTextEmbeddingGeneration(embeddingGenerator);
        memoryBuilder.WithMemoryStore(new VolatileMemoryStore());
        var memory = memoryBuilder.Build();
        // Query text and the example sentences it will be compared against
        string input = "What is an amphibian?";
        string[] examples = [ "What is an amphibian?",
                              "Cos'è un anfibio?",
                              "A frog is an amphibian.",
                              "Frogs, toads, and salamanders are all examples.",
                              "Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.",
                              "They are four-limbed and ectothermic vertebrates.",
                              "A frog is green.",
                              "A tree is green.",
                              "It's not easy bein' green.",
                              "A dog is a mammal.",
                              "A dog is a man's best friend.",
                              "You ain't never had a friend like me.",
                              "Rachel, Monica, Phoebe, Joey, Chandler, Ross"];
        for (int i = 0; i < examples.Length; i++)
            await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");
        var embed = await embeddingGenerator.GenerateEmbeddingsAsync([input]);
        ReadOnlyMemory<float> inputEmbedding = (embed)[0];
        // Generate embeddings for each example.
        IList<ReadOnlyMemory<float>> embeddings = await embeddingGenerator.GenerateEmbeddingsAsync(examples);
        // Print the cosine similarity between the input and each example
        float[] similarity = embeddings.Select(e => TensorPrimitives.CosineSimilarity(e.Span, inputEmbedding.Span)).ToArray();
        similarity.AsSpan().Sort(examples.AsSpan(), (f1, f2) => f2.CompareTo(f1));
        Console.WriteLine("Similarity Example");
        for (int i = 0; i < similarity.Length; i++)
            Console.WriteLine($"{similarity[i]:F6}   {examples[i]}");
    }
}

At these lines:

for (int i = 0; i < examples.Length; i++)
    await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");

I get the following exception:

JsonException: The JSON value could not be converted to Microsoft.SemanticKernel.Connectors.HuggingFace.Core.TextEmbeddingResponse

Does anybody know what I'm doing wrong?

I've installed the following NuGet packages in the project:

Id                                                 Versions           ProjectName
Microsoft.SemanticKernel.Core                      {1.15.0}           LocalLlmApp
Microsoft.SemanticKernel.Plugins.Memory            {1.15.0-alpha}     LocalLlmApp
Microsoft.Extensions.Http.Resilience               {8.6.0}            LocalLlmApp
Microsoft.Extensions.Logging                       {8.0.0}            LocalLlmApp
Microsoft.SemanticKernel.Connectors.HuggingFace    {1.15.0-preview}   LocalLlmApp
Newtonsoft.Json                                    {13.0.3}           LocalLlmApp
Microsoft.Extensions.Logging.Console               {8.0.0}            LocalLlmApp

Answer:

I think you cannot use AddHuggingFaceTextEmbeddingGeneration with an embedding model served by LM Studio out of the box. The reason is that the HuggingFaceClient internally changes the URL and appends the following path to the endpoint:

pipeline/feature-extraction/

private Uri GetEmbeddingGenerationEndpoint(string modelId)
     => new($"{this.Endpoint}{this.Separator}pipeline/feature-extraction/{modelId}");
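
To see what that means for the endpoint in the question, the composition can be reproduced outside the connector. This is only a sketch: `EndpointDemo` and `BuildHuggingFaceEmbeddingUri` are hypothetical names that mirror the snippet above, and the separator rule (add "/" unless the endpoint already ends with one) is an assumption about how `Separator` behaves.

```csharp
using System;

public static class EndpointDemo
{
    // Hypothetical helper that mirrors the connector snippet above.
    public static Uri BuildHuggingFaceEmbeddingUri(Uri endpoint, string modelId)
    {
        string baseUrl = endpoint.AbsoluteUri;
        // Assumption: the separator is "/" unless the endpoint already ends with one.
        string separator = baseUrl.EndsWith("/", StringComparison.Ordinal) ? string.Empty : "/";
        return new Uri($"{baseUrl}{separator}pipeline/feature-extraction/{modelId}");
    }

    public static void Main()
    {
        // Endpoint and model id as used in the question.
        Uri uri = BuildHuggingFaceEmbeddingUri(
            new Uri("http://localhost:1234/v1"),
            "nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf");

        // Print only the path that the server receives.
        Console.WriteLine(uri.AbsolutePath);
    }
}
```

With the question's endpoint, the request path ends up under /v1/pipeline/feature-extraction/ followed by the model id, which is not a route that LM Studio's local server offers.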

That matches the error message I get in the LM Studio console:

[2024-07-03 22:18:19.898] [ERROR] Unexpected endpoint or method. (POST /v1/embedding/pipeline/feature-extraction/nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q5_K_M.gguf). Returning 200 anyway

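Because LM Studio still answers with HTTP 200, the connector then tries to deserialize a body that is not a TextEmbeddingResponse, which is presumably where the JsonException comes from. To get this working, the URL that the connector requests would have to be changed.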
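
One way to sidestep the HuggingFace URL entirely, offered as a sketch rather than a confirmed fix, is to call LM Studio's OpenAI-compatible embeddings route (POST /v1/embeddings) directly with HttpClient. The class and method names below (`LmStudioEmbeddingSketch`, `ParseEmbeddings`, `EmbedAsync`) are hypothetical; the port and model id are taken from the question and must match whatever LM Studio actually has loaded. The response shape assumed here is the OpenAI one, with the vectors under `data[i].embedding`.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;
using System.Threading.Tasks;

public static class LmStudioEmbeddingSketch
{
    // Extracts the vectors from an OpenAI-style embeddings response body.
    public static float[][] ParseEmbeddings(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        return doc.RootElement.GetProperty("data").EnumerateArray()
            .Select(item => item.GetProperty("embedding").EnumerateArray()
                .Select(value => value.GetSingle())
                .ToArray())
            .ToArray();
    }

    // Posts the inputs to the OpenAI-compatible route and returns one vector per input.
    public static async Task<float[][]> EmbedAsync(HttpClient http, string model, string[] inputs)
    {
        using HttpResponseMessage response =
            await http.PostAsJsonAsync("v1/embeddings", new { model, input = inputs });
        response.EnsureSuccessStatusCode();
        return ParseEmbeddings(await response.Content.ReadAsStringAsync());
    }

    public static async Task Main()
    {
        // Assumption: LM Studio's local server listens on port 1234, as in the question.
        using var http = new HttpClient { BaseAddress = new Uri("http://localhost:1234/") };

        float[][] vectors = await EmbedAsync(
            http,
            "nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf",
            new[] { "What is an amphibian?", "A frog is an amphibian." });

        Console.WriteLine($"{vectors.Length} embeddings, {vectors[0].Length} dimensions each");
    }
}
```

The resulting arrays can be passed to TensorPrimitives.CosineSimilarity just like the spans in the question. To keep using MemoryBuilder, the same call could be wrapped in a custom ITextEmbeddingGenerationService implementation, which is not shown here.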

From: https://blog.csdn.net/suiusoar/article/details/140218094
