
NVME-oF------PCIe-NVME

Date: 2023-07-13 22:44:52
Tags: completion, target, NVME, driver, PCIe, command, data, NVMe

Similar to the SQ and CQ queue pairing mechanism for NVMe devices described in Section 6.2, InfiniBand also uses queue pairs of work queues (WQs) and completion queues (CQs). HCAs support hosting WQs on device memory (similar to NVMe CMBs described in Section 7.4.1), and hosting CQs in system memory. This allows a userspace application to post work requests, such as send and receive operations, and poll for completions directly, bypassing the kernel entirely in the data path. An additional benefit is that this design maps very well onto the NVMe-oF architecture; the NVMe-oF target driver can “bind” the receive WQ to the NVMe SQ. This means that NVMe commands are already enqueued (in memory) when the target driver is notified about received commands, and the target driver may simply ring the SQ's doorbell register. Figure 36 illustrates the steps involved in reading 4 kB of data from storage using RDMA:


Fig. 36. Flow chart of an I/O read operation for NVMe-oF using InfiniBand RDMA. While the target-side CPU is required to initiate NVMe operations and start the RDMA write transfer, neither commands, completions, nor data is moved by the CPU. As InfiniBand queues and NVMe queues are bound to each other, commands and completions are written directly to the queues by the HCAs using DMA.

  1.  The initiator prepares an I/O read command for the NVMe device with the desired block offset. Memory used for RDMA is already known to both the NVMe-oF initiator and the target, as it was registered by the target driver as an RDMA MR in advance. This allows the initiator to simply use target-side physical addresses of this MR in the read command. It then posts the command to the send WQ, sending the command across the network, directly into the target driver's memory.
  2.  The target driver receives a receive completion indicating that it has received an NVMe command. As the HCA has already written the command to the appropriate location in the target's memory, the target driver can immediately ring the doorbell register of the bound SQ, initiating the NVMe I/O operation. The initiator driver has already resolved target-side physical addresses in advance, so no further processing of the command is required. After ringing the doorbell, the target driver checks what type of NVMe command this is. Seeing that it is a read command, it starts preparing a WQ request for an RDMA write from the local MR to a known MR on the initiator host.
  3.  The target driver receives the NVMe command completion, indicating that the NVMe device has written the data to memory. The target posts the prepared RDMA write request to the appropriate WQ. By using DMA to read from the MR, the HCA begins sending the data over the InfiniBand fabric. The initiator-side HCA will start writing received data into the initiator's memory, also using DMA.
  4.  Since requests in the same WQ are always ordered, the target driver immediately posts a send request for the NVMe completion, knowing that when the initiator driver receives the completion the data must have arrived before it. This optimization means that the target driver avoids needing to wait for the RDMA write completion, which is particularly useful for larger data transfers.
  5.  The initiator driver receives a receive completion for the NVMe command completion, and knows that the data must have arrived in its local memory before the completion due to WQ ordering. The data read from the remote NVMe device is now available for use.
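
The ordering argument in steps 3 to 5 can be illustrated with a small toy model. The sketch below is not InfiniBand verbs code and does not talk to any hardware; `Host`, `WorkQueue`, `target_serve_read` and the dictionary-based command format are hypothetical names invented for this illustration. It only assumes what the text states: requests in one WQ are executed in posting order, an RDMA write lands directly in the remote MR, and a send surfaces as a completion on the remote side.

```python
from collections import deque
from dataclasses import dataclass, field

BLOCK_SIZE = 4096  # the 4 kB read used in the walkthrough


@dataclass
class Host:
    """Toy initiator: a registered memory region (MR) and a completion queue."""
    mr: bytearray = field(default_factory=lambda: bytearray(BLOCK_SIZE))
    cq: deque = field(default_factory=deque)


class WorkQueue:
    """Toy send WQ whose requests are executed strictly in posting order."""

    def __init__(self, remote: Host):
        self.remote = remote
        self.pending = deque()

    def post_rdma_write(self, data: bytes, remote_offset: int = 0) -> None:
        self.pending.append(("write", data, remote_offset))

    def post_send(self, message: dict) -> None:
        self.pending.append(("send", message, None))

    def process_one(self) -> None:
        """Stand-in for the HCA executing the oldest pending request."""
        kind, payload, offset = self.pending.popleft()
        if kind == "write":
            # RDMA write: data lands in the remote MR without remote CPU work.
            self.remote.mr[offset:offset + len(payload)] = payload
        else:
            # Send: shows up as a receive completion on the remote side.
            self.remote.cq.append(payload)


def target_serve_read(nvme_blocks: dict, command: dict, wq: WorkQueue) -> None:
    """Steps 3 and 4 from the target driver's point of view."""
    data = nvme_blocks[command["lba"]]  # pretend the NVMe device filled the MR
    wq.post_rdma_write(data)            # step 3: move the data to the initiator
    # Step 4: post the NVMe completion right away, without waiting for the
    # RDMA write completion. The status value 0 is illustrative only.
    wq.post_send({"cid": command["cid"], "status": 0})


# Hypothetical usage: one block at LBA 7, one read command with id 1.
initiator = Host()
wq = WorkQueue(remote=initiator)
nvme_blocks = {7: bytes([0xAB]) * BLOCK_SIZE}
target_serve_read(nvme_blocks, {"cid": 1, "lba": 7}, wq)

# Step 5: the initiator sees the completion only after the write has landed,
# because the toy HCA drains the WQ in order.
while not initiator.cq:
    wq.process_one()
completion = initiator.cq.popleft()
data_ready = bytes(initiator.mr) == nvme_blocks[7]
```

In this model the completion cannot overtake the data because both requests travel through the same in-order queue. If the send were posted to a different queue, the guarantee would no longer follow from the model, and the target would have to wait for the RDMA write completion first.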

From: https://www.cnblogs.com/longbowchi/p/17552402.html
