Serving

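2024-11-02 Paper reading: "SpotServe: Serving Generative Large Language Models"
These are my paper-reading notes from my undergraduate years; some of my understanding may be wrong, so read them with a critical eye! SpotServe: Serving Generative Large Language Models on Preemptible Instances. Abstract: LLMs (large language models) are very expensive to compute, so reducing their cost is very challenging. This paper uses preemptible GPU instances on cloud services to cut costs, but it must deal with frequent instance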
2024-10-14 k8s 1.28: installing and configuring knative-serving v1.15.2 + cert-manager v1.16.1
Install and configure knative-serving. Configure the base components. # Images may fail to pull; the method at https://github.com/DaoCloud/public-image-mirror can be used to substitute them. kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.15.2/serving-crds.yaml kubectl apply -f https:
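2024-10-09 TensorFlow Serving: a high-performance tool for deploying machine learning models
Introduction to TensorFlow Serving. TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments. It mainly handles the inference stage of machine learning, manages the lifecycle of trained models, and gives clients versioned access through a high-performance, reference-counted lookup table. Although TensorFlow Serving natively supports Ten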
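2024-08-10 smbmap reports [*] Detected 0 hosts serving SMB
Running smbmap: smbmap -H {target_ip} prints [*] Detected 0 hosts serving SMB [*] Closed 0 connections. Connecting to the target from a VPS works normally, and the VPS has low latency to the target. A packet capture shows the local machine sending SYN, receiving SYN+ACK, and then sending RST as the third packet. I suspected the timeout setting: checking the help with man smbmap shows that --timeout can be set, and the default is 0.5s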
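The timeout hypothesis above can be checked independently of smbmap by attempting a plain TCP connection with an adjustable timeout. This is only a minimal sketch using Python's standard socket module: smb_port_open is a hypothetical helper name, and port 445 (the usual SMB-over-TCP port) is an assumption about what the target exposes. It does not reproduce smbmap's own detection logic.

```python
import socket


def smb_port_open(host: str, port: int = 445, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port completes within `timeout` seconds.

    Hypothetical helper for illustration; 445 is assumed to be the SMB port.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (the address is a placeholder, not taken from the post):
# print(smb_port_open("192.0.2.10", timeout=0.5))
# print(smb_port_open("192.0.2.10", timeout=5.0))
```

If the call fails with a short timeout but succeeds with a longer one, that would be consistent with the handshake being abandoned too early on a high-latency link.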
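2024-08-06 TensorFlow Serving deployment and client access programming practice
Yesterday we built a flower-recognition program with TensorFlow.js. Its advantage is that it needs no server support: flower recognition runs entirely on the client, which is very convenient. It also has some drawbacks, though. Many deep learning applications have complex models and heavy computation, so they generally still need server support. Below, again using flower recognition as the example, we describe how to deploy Tensorflow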
2024-07-20 fastchat vs vLLM
vLLM https://github.com/vllm-project/vllm https://docs.vllm.ai/en/latest/ Inference and serving, but leaning more toward inference. vLLM is a fast and easy-to-use library for LLM inference and serving. vLLM is fast with: State-of-the-art serving throughput; Efficient management of at
2024-07-16 vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention
vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention https://blog.vllm.ai/2023/06/20/vllm.html LLMs promise to fundamentally change how we use AI across all industries. However, actually serving these models is challenging and can be surprisingly
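2024-07-13 Model deployment with TensorFlow Serving
TensorFlow Serving is an open-source machine learning model serving system developed by Google and contributed to the open-source community. It is mainly used to deploy and manage models trained with TensorFlow, providing high-performance, scalable inference serving. Its main features include: Multi-version model management: supports deploying and managing multiple versions of a TensorFlow model at the same time, and provides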
2024-05-15 Error when starting the bert service
Error: (myenv) F:\作业\软件架构\机器学习\实验\实验26-2-使用bert构建词向量\chinese_bert_wwm_L-12_H-768_A-12>bert-serving-start -model_dir ./publish/ -num_worker=1 usage: E:\soft\anoconda\envs\myenv\Scripts\bert-serving-start -model_dir ./publish/ -num_worker=1 I:
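2024-05-15 Lab 26: fixing errors when building word vectors with bert
The source code was already in place and error-free. Following the steps, I tried to start bert-serving-server in the terminal, but entering the command failed with a message that no such command exists. I had originally followed an online tutorial and used: pip install bert-serving-server # server pip install bert-serving-client # client, independent of `bert-serving-server` These two commands installed the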
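2024-05-01 FastChat
FastChat https://github.com/lm-sys/FastChat FastChat provides a framework for serving chat applications built on large models. It offers three functions: training, serving, and evaluating. It has evaluation and training (fine-tuning) features, but its main strength is serving, with support for load balancing across large models. FastChat is an open platform for train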
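2024-04-29 Vite project: background image fails with The request url "xx/xx/xx.xx" is outside of Vite serving allow list.
Versions: vite 3.2.6, vue 3.2.37. Background: the project runs locally and references an in-house component library (not installed; referenced directly by file path so that the project and the components are easier to debug). The two folders sit at the same level. Background image in the component library: background: 100% / 100% no-repeat url('../assets/svg/xxx.svg'); Problem: after starting the project locally, the background image does not display correctly, and the browser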
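2024-03-08 PaddleOCR service deployment (based on PaddleHub Serving)
I recently used PaddleOCR from Baidu PaddlePaddle and looked into deploying it as a service; this is a brief record of the deployment process and the problems I ran into. Base environment: paddlepaddle 2.5.2, python 3.7, paddlehub 2.1.0, PaddleOCR 2.6, pip 20. # Check the python version python --version # Check the pip version pip --version # Check the paddlepaddle version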
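2024-03-05 The AI era: running large models locally with vllm
https://docs.vllm.ai/en/latest/index.html A high-throughput, memory-efficient inference and serving engine for LLMs (quickly set up a local large model, with OpenAI API compatibility). vLLM is a fast and easy-to-use library for LLM inference and serving. vLLM is fast with: State-of-the-art serving throughput; Efficient man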
2024-02-18 tensorflow serving: REST request
1. save trained model # in module file of tfx component trainer def _apply_preprocessing(raw_features, tft_layer): transformed_features = tft_layer(raw_features) if _LABEL_KEY in raw_features: transformed_label = transformed_features.pop(_
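For the REST request named in the title, TensorFlow Serving exposes a predict endpoint of the form /v1/models/<name>[/versions/<version>]:predict that accepts a JSON body with an "instances" list. The following is a minimal sketch of building such a request with the standard library only. build_predict_request, the model name, and the input values are hypothetical, 8501 is assumed as the REST port, and the excerpted model may well expect a different input format (for example serialized examples) than the plain numeric rows used here.

```python
import json
from typing import Any, List, Optional
from urllib import request


def build_predict_request(
    host: str,
    model_name: str,
    instances: List[Any],
    version: Optional[int] = None,
    port: int = 8501,  # assumed REST port
) -> request.Request:
    """Build (but do not send) a POST request for the TensorFlow Serving predict endpoint."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        # Pin a specific model version instead of the server's default.
        path += f"/versions/{version}"
    url = f"http://{host}:{port}{path}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example usage (hypothetical model name and inputs):
# req = build_predict_request("localhost", "my_model", [[1.0, 2.0]], version=3)
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```

Separating request construction from sending keeps the URL and payload easy to inspect before anything goes over the network.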
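2024-01-15 Deploying serverless on OpenShift
Introduction: OpenShift 4 provides a Knative-based serverless runtime environment through an Operator named "Red Hat OpenShift Serverless". OpenShift's serverless components mainly involve Knative Serving and Knative Eventing. Knative Serving is an open-source software framework for building and managing scalable, fault-tolerant, and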
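2024-01-02 OpenShift Knative Serving: configuring services (1)
Autoscaling: Knative provides Kubernetes-based autoscaling that automatically adjusts the number of Pod replicas according to metrics (such as CPU utilization and memory usage) to achieve elasticity and high availability. Knative Serving is the Knative component that manages the application lifecycle; in Knative Serving you can configure autoscaling rules to specify how an application scales. By configuring auto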
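2023-12-29 Table tennis match
from random import random # Print the program introduction def printIntro(): print("This is a singles match simulation program:") # Get the program's run parameters def getInputs(): a = eval(input("Enter the ability value of player A (0-1):")) b = eval(input("Enter the ability value of player B (0-1):")) n = eval(input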
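2023-12-28 Match simulation
from random import random # Print the program introduction def printIntro(): print("Class of 2022, Information and Computing Science Class 1, No. 23") print("This is a singles match simulation program:") # Get the program's run parameters def getInputs(): a = eval(input("Enter the ability value of player A (0-1):")) b = eval(input("Enter the ability value of player B (0-1):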
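2023-12-28 Badminton match
from random import random def printIntro(): print("Student ID 09, topic: badminton simulation") def getInputs(): a = eval(input("Enter the ability value of player A (0-1):")) b = eval(input("Enter the ability value of player B (0-1):")) return a, b def simNgames(n, probA, probB): winsA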
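The excerpts above stop before the simulation loop itself. The following is a minimal sketch of how such a simulation could look, not the authors' code: it assumes a simple rule where only the server can score, a lost rally hands the serve to the opponent, and a game ends at 15 points. The snake_case names, the target score, and the seed parameter are all assumptions made for illustration.

```python
import random


def sim_one_game(prob_a: float, prob_b: float, rng: random.Random, target: int = 15):
    """Play one game; prob_a/prob_b are each player's chance of winning a rally on serve.

    Assumed rules: only the server scores, and losing a rally passes the serve.
    Note: if both probabilities are 0, nobody can score and the loop never ends.
    """
    score_a = score_b = 0
    serving = "A"
    while score_a < target and score_b < target:
        if serving == "A":
            if rng.random() < prob_a:
                score_a += 1
            else:
                serving = "B"
        else:
            if rng.random() < prob_b:
                score_b += 1
            else:
                serving = "A"
    return score_a, score_b


def sim_n_games(n: int, prob_a: float, prob_b: float, seed=None):
    """Simulate n games and return (wins for A, wins for B)."""
    rng = random.Random(seed)  # a seed makes runs reproducible
    wins_a = wins_b = 0
    for _ in range(n):
        score_a, score_b = sim_one_game(prob_a, prob_b, rng)
        if score_a > score_b:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a, wins_b


# Example usage:
# print(sim_n_games(1000, 0.55, 0.50, seed=42))
```

Passing the random generator in explicitly, instead of calling the module-level random(), makes a run repeatable when a seed is given.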
2023-11-13 How to renew the web certificate in rancher 2.7.5
1. docker exec -it xxxx /bin/bash 2. kubectl --insecure-skip-tls-verify -n kube-system delete secrets k3s-serving kubectl --insecure-skip-tls-verify delete secret serving-cert -n cattle-system rm -f /var/lib/rancher/k3s/server/tls/dynamic-cert.json 3.
2023-10-17 Go - Serving Through HTTPS
Problem: You want to serve your web application through HTTPS. Solution: Use the http.ListenAndServeTLS function to serve your web application through HTTPS. HTTPS is nothing more than layering HTTP on top of Transport Layer Security (TLS). The net
2023-10-17 Go - Serving Static Files
Problem: You want to serve static files such as images, CSS, and JavaScript files. Solution: Use the http.FileServer function to serve static files. func main() { dir := http.Dir("./static") fs := http.FileS
2023-10-08 knative serving traffic management
Create a client # kubectl run client --image=ikubernetes/admin-box -it --rm --restart=Never --command -n knative-demo -- /bin/bash root@client /# Create the application hello-world-v1.yaml apiVersion: serving.knative.dev/v1 kind: Service metadata: name: helloworld-go names
2023-10-07 knative serving domain mapping
Create the application hello-world.yaml apiVersion: serving.knative.dev/v1 kind: Service metadata: name: helloworld-go namespace: knative-demo spec: template: spec: containers: - image: ghcr.dockerproxy.com/knative/helloworld-go:latest env