PSOD with vSphere 8 and HPE ProLiant DL325 Gen10 Plus v2
We are regularly seeing PSODs on all our HPE ProLiant DL325 Gen10 Plus v2 ESXi hosts with vSphere 8. VMware has investigated the cause of the PSODs and points to an issue with the HPE ilo package installed on the host.
The PSOD is reporting a non-empty heap issue.
The case with HPE has been open since the 5th of May and has now reached the engineering team, as the issue does not affect only us. To help HPE investigate, we implemented a boot parameter that dumps additional information at every PSOD.
As a workaround, you can simply remove the ilo package from the host. This does not affect the functionality of the host at all, and it stabilizes the cluster, which matters especially if you are running vSAN like us.
esxcli software vib remove -n ilo
I will let you know when a permanent solution has been found. Keep in mind that the ilo package will be installed again if you patch your host with vSphere Lifecycle Manager (LCM) and the vendor add-on for HPE servers is included.
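Because the package can come back after patching, it may help to re-check each host afterwards. The following is only a minimal sketch, not an official HPE or VMware procedure. The function names `vib_installed` and `remove_ilo_if_present` are made up for illustration, and it assumes `awk` is available in the ESXi shell and that the VIB name appears in the first column of the `esxcli software vib list` output.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of esxcli.

# Return 0 if a VIB with exactly the given name is listed on this host.
# Assumes the VIB name is the first column of `esxcli software vib list`.
vib_installed() {
    esxcli software vib list |
        awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Remove the ilo VIB only when it is actually present.
remove_ilo_if_present() {
    if vib_installed ilo; then
        esxcli software vib remove -n ilo
    else
        echo "ilo VIB not installed, nothing to do"
    fi
}
```

After defining the functions in an SSH session on the host, you would call `remove_ilo_if_present` once the LCM remediation has finished. Check the output of the remove command to see whether it reports that a reboot is required.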