vRealize Automation Data Collection is the process by which vRealize Automation connects to its infrastructure source endpoints and their compute resources to collect information about them. vRealize Automation has several data collection types:

  • Infrastructure Source Endpoint Data Collection; Updates information about virtualization hosts, templates, and ISO images for virtualization environments. Updates virtual datacenters and templates for vCloud Director. Updates regions and machines provisioned on them for Amazon.
  • Inventory Data Collection; Updates the record of the virtual machines whose resource use is tied to a specific compute resource, including detailed information about the networks, storage, and virtual machines.
  • State Data Collection; Updates the record of the power state of each machine discovered through inventory data collection. State data collection also records missing machines that vRealize Automation manages but cannot be detected on the virtualization compute resource or cloud endpoint.
  • Performance Data Collection (vSphere compute resources only); Updates the record of the average CPU, storage, memory, and network usage for each virtual machine discovered through inventory data collection.
  • vCNS inventory data collection (vSphere compute resources only); Updates the record of network and security data related to vCloud Networking and Security and NSX, particularly information about security groups and load balancing, for each machine following inventory data collection.
  • WMI data collection (Windows compute resources only); Updates the record of the management data for each Windows machine.
  • Cost data collection (compute resources managed by vRealize Business Standard Edition only); Updates the CPU, memory, and storage costs for each compute resource managed by vRealize Business Standard Edition. The costs of catalog items that can be provisioned on those compute resources are updated as well.

Data collection occurs at regular intervals. Each type of data collection has a default interval that you can override or modify to suit your needs.

  • Inventory – daily;
  • State – every 15 minutes;
  • Performance – daily;
  • Network and Security Inventory – daily;
  • Cost – daily.
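
To make the interval idea concrete, here is a minimal sketch of how default intervals and an override could be modelled. It is only an illustration: the dictionary keys, the `next_run` helper, and the example timestamp are hypothetical names chosen for this post and are not part of vRealize Automation.

```python
from datetime import datetime, timedelta

# Default intervals as listed above. The keys are illustrative names only,
# not identifiers used by vRealize Automation.
DEFAULT_INTERVALS = {
    "inventory": timedelta(days=1),
    "state": timedelta(minutes=15),
    "performance": timedelta(days=1),
    "network_and_security_inventory": timedelta(days=1),
    "cost": timedelta(days=1),
}


def next_run(collection_type, last_run, overrides=None):
    """Return when a collection type is next due, honouring optional overrides."""
    intervals = {**DEFAULT_INTERVALS, **(overrides or {})}
    return last_run + intervals[collection_type]


# Arbitrary example timestamp for the last completed run.
last = datetime(2015, 3, 1, 12, 0)
print(next_run("state", last))
print(next_run("inventory", last, {"inventory": timedelta(hours=6)}))
```

With the defaults, the state collection is due 15 minutes after the last run; the second call shows an override shortening the daily inventory collection to every six hours.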

Data collection can also be started manually from the vRealize Automation console. IaaS administrators can manually initiate data collection for infrastructure source endpoints, and fabric administrators can do so for compute resources. Go to ‘Infrastructure’, select ‘Compute Resources’, hover over the resource you want to collect data from, select ‘Data Collection’, and request a collection for the item that needs a refresh.

But what does it do exactly? Last week I got an interesting question from a customer. A colleague had just given a presentation on everything new in vSphere 6, including the vMotion enhancements. Their question was: ‘What if vMotion moves virtual machines to different hosts, networks, clusters, or vCenters? Is vRealize Automation able to track these changes?’

Good question! But I had no conclusive answer other than ‘I think vRealize Automation Data Collection takes care of that, but I will find out for you’.

As there is no real documentation on this, there’s only one option. Try it! So I did.

Change in resources (CPU, memory, disk, network)

vRealize Automation is normally the front end for infrastructure changes, mainly because of its self-service portal and its integrations with third-party solutions such as a CMDB, Infoblox, or ServiceNow. But sometimes an administrator makes changes directly in the vSphere client instead: the number of CPUs, the amount of memory, the connected network, or the disk sizes.

First I deployed a virtual machine with the specifications below.

Compute: 1 vCPU, 4GB of RAM.

Storage: 50GB on two volumes (40GB, 10GB).

Network: Connected to VM Network.

Then I changed these settings in the vSphere Web Client.

After running the Data Collection, the same virtual machine in vRealize Automation looks like this.

Compute: 4 vCPU, 16GB of RAM.

Storage: 100GB on two volumes (80GB, 20GB).

Network: Connected to DP_MGT.

So, resource changes are detected by the Data Collector which reports the correct values back to vRealize Automation.
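
The before-and-after comparison in this test can be expressed as a small diff over two records. This is a sketch for illustration only: the field names and the `diff_specs` helper are hypothetical, and it says nothing about how vRealize Automation stores or compares this data internally. The values are the ones from the test above.

```python
def diff_specs(recorded, observed):
    """Return {field: (recorded_value, observed_value)} for every field that differs."""
    return {
        key: (recorded.get(key), observed.get(key))
        for key in sorted(recorded.keys() | observed.keys())
        if recorded.get(key) != observed.get(key)
    }


# What vRealize Automation showed before the change (hypothetical field names).
recorded = {"vcpu": 1, "memory_gb": 4, "disks_gb": [40, 10], "network": "VM Network"}
# What was configured afterwards in the vSphere Web Client.
observed = {"vcpu": 4, "memory_gb": 16, "disks_gb": [80, 20], "network": "DP_MGT"}

changes = diff_specs(recorded, observed)
for field, (old, new) in changes.items():
    print(f"{field}: {old} -> {new}")
```

All four fields differ here, which matches what the data collection picked up in the test.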

Change in location (Storage vMotion)

Compute resources are grouped in a cluster, and the cluster is added to vRealize Automation through fabric groups and reservations. Storage, however, is handled through storage reservations. What if I do a Storage vMotion to another datastore in the same reservation? Let’s move our virtual machine from the ‘ComputeDS07’ datastore to another datastore.

We will move the virtual machine to the ‘ComputeDS10’ datastore.

After running the Data Collection, the virtual machine’s storage path in vRealize Automation is updated to the new location.

Remove/add from inventory (same name/location)

Let’s make it a bit more complicated. When I remove a virtual machine from the vSphere inventory and run a data collection, it is reported as ‘Missing’.

What if I register the virtual machine again and use the same name and location?

After powering on, vSphere asks whether the virtual machine was moved or copied, because of the re-registration. Answer ‘I Copied It’.

After running the Data Collection, the virtual machine is detected again and available as an item in vRealize Automation.
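
The ‘Missing’ and ‘detected again’ behaviour can be pictured as a comparison between the machines vRealize Automation manages and the machines found during a collection run. The sketch below is only a mental model: the `reconcile_states` helper, the state labels, and the machine names are hypothetical, and it matches on a plain identifier string, which is an assumption. What vRealize Automation actually keys on is not covered here.

```python
def reconcile_states(managed, discovered):
    """Return {machine: 'Present' | 'Missing'} for every managed machine."""
    return {
        name: ("Present" if name in discovered else "Missing")
        for name in sorted(managed)
    }


# Hypothetical machine identifiers, for illustration only.
managed = {"vm-a", "vm-b"}

# Collection run after vm-b was removed from the vSphere inventory.
after_removal = reconcile_states(managed, discovered={"vm-a"})
print(after_removal)

# Collection run after vm-b was registered again.
after_register = reconcile_states(managed, discovered={"vm-a", "vm-b"})
print(after_register)
```

The first run reports `vm-b` as missing and the second run reports it as present again, mirroring the sequence in this test.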

Remove/add from inventory (different name/location)

Let’s raise the bar a bit further. What happens when I remove a virtual machine from the vSphere inventory, run a data collection, and then register the virtual machine again under a different name and in a different location?

The new name is ‘WIN-073-NEW’ and the new location is the ‘VRM’ folder.

After powering on, vSphere again asks whether the virtual machine was moved or copied, because of the re-registration. Answer ‘I Copied It’.

After running the Data Collection, the virtual machine is detected again and available as an item in vRealize Automation. Strangely enough, though, the name change does not come through.

To get the virtual machine registered under its new name, I also changed the NetBIOS name of the virtual machine.

After running another Data Collection, the virtual machine finally shows its new name in the vRealize Automation console and is ready to use.

vRealize Automation Data Collection is a simple but very effective process that keeps the information in vSphere and vRealize Automation in sync. It guards you against administrators who bypass vRealize Automation to perform day-two operations.

Note: Although Data Collection keeps vSphere and vRealize Automation in sync, it does not propagate updates to third-party systems connected to vRealize Automation, such as CMDB updates configured on stubs within the blueprint deployment.