Error:
Module "DevicePowerOn" power on failed.
vmkernel.log:
2024-09-13T15:14:17.520Z In(182) vmkernel: cpu91:2102143)PCIPassthru: 4686: pcipDevInfo(0x4313bac015f0) allocated for 0000:4e:00.0
2024-09-13T15:14:17.521Z In(182) vmkernel: cpu0:2097565)PCIEHP: 1573: 0000:4c:01.0: hotplug slot:0x2: num reads=1 slot status=0x108.
2024-09-13T15:14:17.521Z In(182) vmkernel: cpu0:2097565)PCIEHP: 1497: 0000:4c:01.0: hotplug slot:0x2 (0000:4e:00.0) Adapter removed.
2024-09-13T15:14:17.521Z In(182) vmkernel: cpu0:2097565)PCIEHP: 1049: 0000:4c:01.0: Disabling hotplug slot:0x2
2024-09-13T15:14:17.521Z In(182) vmkernel: cpu15:2097563)PCIEHP: 1573: 0000:4c:01.0: hotplug slot:0x2: num reads=0 slot status=0x0.
2024-09-13T15:14:19.266Z In(182) vmkernel: cpu2:2098149)igbn: igbn_CheckRxHang:1414: vmnic1: false hang detected on RX queue 0
2024-09-13T15:14:19.843Z In(182) vmkernel: cpu0:2097564)PCIEHP: 1573: 0000:4c:01.0: hotplug slot:0x2: num reads=1 slot status=0x148.
2024-09-13T15:14:19.843Z In(182) vmkernel: cpu0:2097564)PCIEHP: 1478: 0000:4c:01.0: hotplug slot:0x2 (0000:4e:00.0) Adapter inserted.
2024-09-13T15:14:19.843Z In(182) vmkernel: cpu15:2097563)PCIEHP: 1573: 0000:4c:01.0: hotplug slot:0x2: num reads=0 slot status=0x0.
2024-09-13T15:14:19.945Z In(182) vmkernel: cpu0:2097564)PCIEHP: 983: 0000:4c:01.0: Enabling hotplug slot:0x2
2024-09-13T15:14:19.945Z In(182) vmkernel: cpu0:2097564)PCIEHP: 638: 0000:4c:01.0: hotplug slot: 0x2: Prior device 0000:4e:00.0 was yanked
2024-09-13T15:14:19.945Z Wa(180) vmkwarning: cpu0:2097564)WARNING: PCIEHP: 641: 0000:4c:01.0: hotplug slot: 0x2: Device insertion detected while prior device 0000:4e:00.0 removal is still pending
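To check whether another host shows the same symptom, the PCIEHP hot-plug events in the excerpt above can be pulled out of vmkernel.log with a small script. This is only a sketch: the function name find_hotplug_events and the regular expression are my own, written against the line format of this excerpt, and other ESXi builds may log these events differently.

```python
import re

# Pattern written against the PCIEHP lines in the excerpt above (an assumption,
# not an official log format): root port address, slot number, then the event text.
PCIEHP_EVENT = re.compile(
    r"PCIEHP: \d+: (?P<port>[0-9a-f:.]+): hotplug slot: ?(?P<slot>0x[0-9a-f]+)"
    r".*?(?P<event>Adapter removed|Adapter inserted|was yanked|removal is still pending)"
)


def find_hotplug_events(log_text):
    """Return (port, slot, event) tuples for each hot-plug event line found."""
    events = []
    for line in log_text.splitlines():
        match = PCIEHP_EVENT.search(line)
        if match:
            events.append((match.group("port"), match.group("slot"), match.group("event")))
    return events
```

If the result contains an "Adapter removed" event followed by "removal is still pending" for the slot that holds the passthrough GPU, the host is likely hitting the same hot-plug problem described here.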
Fixes I tried:
Enable Above 4G Decoding in the BIOS
Set the VM to boot with EFI
Configure GPU passthrough
Set the advanced parameters:
pciPassthru.use64bitMMIO="TRUE"
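The second parameter requires a simple calculation. Count the high-end PCI devices you plan to pass through to the VM, multiply that number by 16, then round up to the next power of two. For example, with two passthrough devices: 2 * 16 = 32, and rounding up to the next power of two gives 64. For a single device, use 32. Use this value for the second setting:
(If there is no power-on error but the VM never reaches the guest OS after starting and shuts itself down, try increasing this value. After fixing the error above, I found in testing that four A100 cards needed a value of 512 before the VM would boot.)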
pciPassthru.64bitMMIOSizeGB="64"
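The sizing rule above can be sketched in a few lines of Python. The function name mmio_size_gb is hypothetical, and "round up to the next power of two" is read here as "strictly greater than the product", since that is what the worked examples imply (16 becomes 32, 32 becomes 64).

```python
def mmio_size_gb(num_devices):
    """Suggest a starting value for pciPassthru.64bitMMIOSizeGB.

    Follows the rule in the text: devices * 16, then the next power of two
    strictly above that product (assumption based on the worked examples).
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    base = num_devices * 16
    size = 1
    while size <= base:
        size *= 2
    return size


for count in (1, 2, 4):
    print(count, mmio_size_gb(count))
```

Treat the result as a starting point only. For four devices the rule gives 128, yet the four A100 cards mentioned above needed 512, so cards with large memory may need a much higher value.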
None of this helped.
I then found a report of the same error, with the following solution:
1. Enable SSH and the shell on ESXi:
2. Enter:
esxcli system settings kernel set -s enablePCIEHotplug -v FALSE
Then reboot the host. After the reboot, you can run the following command to verify that PCIe device hot-plug is disabled:
esxcli system settings kernel list -o enablePCIEHotplug
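If the output shows the value FALSE, hot-plug is disabled.
After that, the VM powers on successfully. Remember to set up passthrough for the PCI device again, and remove the previously unrecognized PCI device from the VM configuration.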
References:
1. Virtual Machine On VMware ESXi Hypervisor Will Stop Responding or Fail to Power On When Configured With the NVIDIA A40/A10 PCIe Graphics Accelerator As a "Passthrough" Device (broadcom.com)
Article link: https://www.kinber.cn/post/4588.html (reposting requires authorization)