Earlier this week I was configuring a new VSAN cluster. The old configuration used hardware that did not comply with the VMware vSAN HCL. We ordered new RAID controllers and HDDs and reused the SSDs, as those were compliant with the HCL.
Adding and configuring the new hardware went well and ESXi was installed without a problem. After doing some network configurations on the hosts it was time to turn on VSAN. And that is where I ran into an issue.
The moment I turned on VSAN (manual mode), each host created a disk group containing the SSDs present in the host. Each disk group, however, showed only absent HDDs. Since I was using SSDs from a previous configuration I wasn’t that surprised, and I tried removing the disk group, figuring I could just recreate it manually with the new HDDs.
When I tried to remove the disk group, the task reported back as completed, but the disk group was still present. The task details showed an error stack.
The VMkernel log did not contain any information that would help me troubleshoot this issue further. I then decided to try the same thing from RVC, hoping that I could force the removal.
Within RVC I used the “vsan.host_wipe_vsan_disks” command with the “-f” argument to force the removal. Unfortunately this ended with the same result.
After some time tinkering around and trying to make the command work, I decided to turn off VSAN. Once it was turned off, I opened an SSH session to each host and removed the partition information from the SSDs (you could also choose to boot the host from an ISO that includes disk formatting tools).
Using the following commands you can remove the partition information:
- Get a list of disks to find the unique ID of the SSD: ls -l /dev/disks/
- To get the partition information of the disk use: partedUtil get "/dev/disks/"
- And to remove the actual partitions, run this command for each partition present, followed by the partition number: partedUtil delete "/dev/disks/"
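If a disk has several partitions, typing the delete command for each one gets tedious. The following is a minimal sketch, not something I ran during this fix, that turns the output of partedUtil get into the matching delete commands. It only prints them, so you can review the list before running anything. The helper name build_delete_cmds is made up for this example, and it assumes that the first line of the partedUtil get output is the disk geometry and that every following line starts with a partition number.

```shell
#!/bin/sh
# Sketch: turn `partedUtil get` output into delete commands (dry run only).
# Assumption: the first output line is the disk geometry and each
# following line begins with a partition number.

# build_delete_cmds is a hypothetical helper, not part of ESXi.
# $1 is the full device path; the `partedUtil get` output is read from stdin.
build_delete_cmds() {
    awk -v disk="$1" 'NR > 1 && $1 ~ /^[0-9]+$/ {
        printf "partedUtil delete \"%s\" %s\n", disk, $1
    }'
}

# Example on a host, with DISK set to the SSD's full path under /dev/disks/:
#   partedUtil get "$DISK" | build_delete_cmds "$DISK"
```

Printing the commands instead of executing them is deliberate: deleting partitions is destructive, so it is worth checking that the device path really belongs to the SSD you want to wipe before pasting the commands back into the shell.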
With the partitions removed I rebooted the hosts and enabled VSAN again. This resulted in a situation with no disk groups present and I was then able to create new disk groups. The VSAN is now running like a charm.