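An earlier post, "Binding a PCI bus to a NUMA node in QEMU" (在qemu中绑定pci bus到numa node - 半山随笔 - 博客园, cnblogs.com), recorded how to set the affinity between a PCI bus and a NUMA node from the qemu command line. This post records how to do the same thing with libvirt.
libvirt is a higher-level tool than qemu: it is more convenient, but it gives up some flexibility in exchange. Setting the NUMA affinity of a PCI bus is one case where libvirt is hard to configure. The official libvirt documentation says: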
PCI controllers also have an optional subelement <target> with the attributes and subelements listed below. These are configurable items that 1) are visible to the guest OS so must be preserved for guest ABI compatibility, and 2) are usually left to default values or derived automatically by libvirt. In almost all cases, you should not manually add a <target> subelement to a controller, nor should you modify the values in the those that are automatically generated by libvirt. Since 1.2.19 (QEMU only).
node
Some PCI controllers (pci-expander-bus for the pc machine type, pcie-expander-bus for the q35 machine type and, since 3.6.0, pci-root for the pseries machine type) can have an optional <node> subelement within the <target> subelement, which is used to set the NUMA node reported to the guest OS for that bus - the guest OS will then know that all devices on that bus are a part of the specified NUMA node (it is up to the user of the libvirt API to attach host devices to the correct pci-expander-bus when assigning them to the domain).
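In other words, the PCI controller does have a NUMA-related setting, but in most cases users are not expected to change it, because libvirt generates this part automatically. That makes things awkward: the XML is not meant to be edited by hand, and the value cannot be set at install time either. My understanding is that, as the documentation envisions it, this parameter is set automatically only for host device passthrough and is not meant to be chosen freely by the user.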
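For reference, the `<target>`/`<node>` layout described in the quoted documentation can be sketched as follows. This is only an illustration of the documented element shape: the helper name `pxb_controller_xml`, the `index` value, and the node and bus numbers passed to it are assumptions, not values taken from a real domain.

```shell
# Sketch: print a pci-expander-bus controller element in the documented
# <target>/<node> layout. The helper name, index='1', and the argument
# values are illustrative assumptions.
pxb_controller_xml() {
  node="$1"     # NUMA node reported to the guest for this bus
  bus_nr="$2"   # bus number of the expander bus
  cat <<EOF
<controller type='pci' index='1' model='pci-expander-bus'>
  <target busNr='${bus_nr}'>
    <node>${node}</node>
  </target>
</controller>
EOF
}
```

Calling `pxb_controller_xml 0 5` prints an element for NUMA node 0 with bus number 5. Whether editing this by hand is appropriate is exactly the caveat the documentation raises.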
But what should you do when you really do need this?
Fortunately, virt-install provides a way to inject qemu command-line arguments directly.
virt-install --help
...
  --qemu-commandline QEMU_COMMANDLINE
        Pass arguments directly to the qemu emulator. Ex:
        --qemu-commandline='-display gtk,gl=on'
        --qemu-commandline env=DISPLAY=:0.1
Since the PCI bus to NUMA affinity can be set in qemu, the same result can be achieved by injecting the qemu arguments directly through libvirt.
sudo virt-install \
  --connect qemu:///system \
  --name node1 \
  --disk none \
  --memory 8192 \
  --vcpus 4,sockets=1,cores=4,threads=1 \
  --network bridge=virbr0 \
  --os-type linux \
  --virt-type kvm \
  --boot hd \
  --graphics none \
  --cpu cell0.cpus=0-1,cell0.memory=4194304,cell1.cpus=2-3,cell1.memory=4194304 \
  --qemu-commandline='-device pxb,id=pcie.1,bus=pci.0,addr=0x6,numa_node=0,bus_nr=5 -device pcie-pci-bridge,id=pcie-pci-br0,bus=pcie.1 -device virtio-blk-pci,scsi=off,bus=pcie-pci-br0,addr=0x1,drive=hd1,id=virtio-disk0,bootindex=1 -drive if=none,file=/home/jianyong/vm/AnolisOS-8.9-x86_64-RHCK.qcow2,id=hd1'
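Arguments passed with --qemu-commandline are stored in the domain XML as qemu:arg elements under qemu:commandline, so one way to confirm they were recorded is to pull them back out of `virsh dumpxml`. The sketch below is a minimal filter for that; the function name `list_qemu_args` is hypothetical, and it assumes one qemu:arg element per line with single-quoted attribute values.

```shell
# Sketch: read domain XML on stdin and print each passthrough argument.
# Assumes one <qemu:arg value='...'/> per line with single-quoted values;
# list_qemu_args is a hypothetical helper name.
list_qemu_args() {
  sed -n "s/.*<qemu:arg value='\([^']*\)'.*/\1/p"
}
```

Typical use on the host would be `virsh dumpxml node1 | list_qemu_args`, which prints one argument per line.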
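The command above attaches the virtio-blk device to a PCI bridge that hangs off the bus created by pxb. Because pxb accepts a NUMA affinity setting (numa_node), the virtio-blk device gets that affinity as well. Log in to the guest to check.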
# lstopo-no-graphics
Machine (7685MB total)
  L3 L#0 (16MB)
    Group0 L#0
      NUMANode L#0 (P#0 3710MB)
      Package L#0 + L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
      Package L#1 + L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
      HostBridge
        PCIBridge
          PCIBridge
            PCI 07:01.0 (SCSI)
              Block "vda"
    Group0 L#1
      NUMANode L#1 (P#1 3975MB)
      Package L#2 + L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2 (P#2)
      Package L#3 + L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + PU L#3 (P#3)
  HostBridge
    PCI 00:01.1 (IDE)
    PCI 00:02.0 (Ethernet)
      Net "eth0"
  Misc(MemoryModule)
As the output shows, the vda device now sits under NUMA node 0, while eth0 is not bound to any NUMA node.
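The same result can be cross-checked without hwloc by reading the numa_node attribute that Linux exposes for each PCI device in sysfs. The following is a minimal sketch: the helper name `pci_numa_node` and its optional sysfs-root argument are assumptions made for illustration, and the device address must include the PCI domain (assumed to be 0000 here).

```shell
# Sketch: print the NUMA node the kernel associates with a PCI device.
# $1: full PCI address, e.g. 0000:07:01.0 (domain 0000 is an assumption)
# $2: sysfs root, defaults to /sys (overridable only to ease testing)
# A value of -1 means the kernel has no NUMA affinity for the device.
pci_numa_node() {
  bdf="$1"
  sysfs_root="${2:-/sys}"
  cat "$sysfs_root/bus/pci/devices/$bdf/numa_node"
}
```

Inside the guest, `pci_numa_node 0000:07:01.0` would query the virtio-blk device seen in the lstopo output above.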
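Of course, this setting is purely virtual: nothing is actually bound, and the guest OS is merely given an illusion. To really improve performance, the configuration still has to follow the topology of the relevant host devices.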
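A first step toward matching the host topology is finding out which host NUMA node each PCI device belongs to. The sketch below surveys every device through sysfs; the function name `list_pci_numa` and the overridable sysfs root are assumptions for illustration, not part of any libvirt or qemu tooling.

```shell
# Sketch: list every PCI device together with its NUMA node, one per line.
# $1: sysfs root, defaults to /sys (overridable only to ease testing)
# Devices without a readable numa_node attribute are skipped.
list_pci_numa() {
  root="${1:-/sys}"
  for dev in "$root"/bus/pci/devices/*; do
    [ -r "$dev/numa_node" ] || continue
    printf '%s %s\n' "${dev##*/}" "$(cat "$dev/numa_node")"
  done
}
```

Run on the host, `list_pci_numa` prints lines of the form "address node". A device reported as -1 has no NUMA affinity known to the kernel.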
Tags: bus, libvirt, pci, numa, qemu
From: https://www.cnblogs.com/banshanjushi/p/18179128