Using PowerCLI to list “probably” orphaned files (vmdk, vmx, etc.) across multiple datastores in parallel with a PowerShell workflow.
There are already many great resources on how to find orphaned vmdk files in a VMware environment.
The logic of the script below was initially inspired by a post from Jason Coleman, itself inspired by a script from HJA von Bokhoven as modified by Luc Dekens.
This logic worked well with one datastore at a time.
However, the script was not fast enough for a datastore cluster with many large datastores.
A possible solution to this speed issue was inspired by the post “PowerCLI and PowerShell Workflows” from Luc Dekens.
The initial script has been “slightly” modified and is now based on four functions.
Get-FilesIdentifiedAsAssociatedToAllVMs
Function Get-FilesIdentifiedAsAssociatedToAllVMs{
<#
.SYNOPSIS
Get file associated to a VM via the API.
However some files will not be reported like "ctk.vmdk"
.NOTES
Author: Christophe Calvet
Blog: http://www.thecrazyconsultant.com/
#>
    process{
        try{
            Get-View -ViewType VirtualMachine | foreach-object{
                $VMName = $_.Name
                $VMinstanceUuid = $_.config.instanceUuid
                $Template = $_.config.template
                $_.layoutex.file | foreach-object{
                    $Output = New-Object -Type PSObject -Prop ([ordered]@{
                        'VMName' = $VMName
                        'VMinstanceUuid' = $VMinstanceUuid
                        'IsTemplate' = $Template
                        'FileKey' = $_.Key
                        'FileName' = $_.Name
                        'FileSize' = $_.Size
                        'FileType' = $_.Type
                        'FileUniqueSize' = $_.UniqueSize
                    })
                    Return $Output
                }
            }
        }
        Catch{
            Write-error $_
        }
    }
}
This function will extract the “majority” of the files that are associated with all virtual machines and templates in a vCenter server.
The key point here is “majority”: some files associated with a VM will not be extracted.
Get-FileInDatastore
Function Get-FileInDatastore{
<#
.SYNOPSIS
Extract the list of all files in datastore(S).
.NOTES
Author: Christophe Calvet
Blog: http://www.thecrazyconsultant.com/
.PARAMETER Datastore
Pipe one or many PowerCLI datastore object
.PARAMETER matchPattern
This is the search parameter.
By default "*" but it can be replaced by "*.vmdk" or "*.vmx" for example
#>
    param(
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]
        [VMware.VimAutomation.ViCore.Impl.V1.DatastoreManagement.DatastoreImpl]$Datastore,
        [string]$matchPattern = "*"
    )
    process{
        try{
            $HostDatastoreBrowserSearchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec
            $HostDatastoreBrowserSearchSpec.matchPattern = $matchPattern
            $HostDatastoreBrowserSearchSpec.sortFoldersFirst = $true
            $fileQueryFlags = New-Object VMware.Vim.FileQueryFlags
            $fileQueryFlags.fileOwner = $True
            $fileQueryFlags.fileSize = $True
            $fileQueryFlags.fileType = $True
            $fileQueryFlags.modification = $True
            $HostDatastoreBrowserSearchSpec.details = $fileQueryFlags
            $DatastoreName = $Datastore.extensiondata.Name
            $DatastoreUrl = $Datastore.extensiondata.info.url
            $DatastoreBrowser = Get-view -id ($Datastore.extensiondata.Browser)
            $datastorePath = "[" + $DatastoreName + "]"
            $HostDatastoreBrowserSearchResults = $DatastoreBrowser.SearchDatastoreSubFolders($datastorePath,$HostDatastoreBrowserSearchSpec)
            $HostDatastoreBrowserSearchResults | foreach-object{
                $FolderPath = $_.FolderPath
                $_.file | foreach-object{
                    $FileTypeFullName = ($_.gettype()).FullName
                    If($FileTypeFullName -ne "VMware.Vim.FolderFileInfo"){
                        $Output = New-Object -Type PSObject -Prop ([ordered]@{
                            'DatastoreName' = $DatastoreName
                            'DatastoreUrl' = $DatastoreUrl
                            'FolderPath' = $FolderPath
                            'Path' = $_.Path
                            'FullPath' = $FolderPath + $_.Path
                            'FileSize' = $_.FileSize
                            'Modification' = $_.Modification
                            'Owner' = $_.Owner
                            'FileTypeFullName' = $FileTypeFullName
                        })
                        Return $Output
                    }
                }
            }
        }
        Catch{
            Write-error $_
        }
    }
}
This function can be used independently and will provide you the list of all files in one or many datastores.
It is also possible to modify the search criteria according to your needs.
For example to extract the list and location of all ISO files across all datastores:
Get-Datastore | Get-FileInDatastore -matchPattern "*.iso" | ogv
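As a rough sketch of what can be done with this output, the returned objects can be grouped per datastore. The sample objects below are made up for illustration and only mimic the DatastoreName, FullPath and FileSize properties; with a live connection they would come from Get-FileInDatastore instead.

```powershell
# Hypothetical sample data mimicking the output of Get-FileInDatastore.
# With a live connection it would come from:
#   Get-Datastore | Get-FileInDatastore -matchPattern "*.iso"
$IsoFiles = @(
    [pscustomobject]@{ DatastoreName = 'SSD01'; FullPath = '[SSD01] iso/win2012.iso'; FileSize = 4GB }
    [pscustomobject]@{ DatastoreName = 'SSD01'; FullPath = '[SSD01] iso/centos7.iso'; FileSize = 1GB }
    [pscustomobject]@{ DatastoreName = 'SSD02'; FullPath = '[SSD02] iso/esxi.iso';    FileSize = 512MB }
)

# Count the ISO files and total their size for each datastore.
$IsoSummary = $IsoFiles | Group-Object -Property DatastoreName | ForEach-Object {
    $TotalBytes = ($_.Group | Measure-Object -Property FileSize -Sum).Sum
    [pscustomobject]@{
        DatastoreName = $_.Name
        IsoCount      = $_.Count
        TotalSizeGB   = [math]::Round($TotalBytes / 1GB, 2)
    }
}

$IsoSummary | Format-Table -AutoSize
```

The same grouping should work for any other match pattern, since every object carries the datastore name and the file size.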
get-FileInDatastoreWithWorkflow
workflow get-FileInDatastoreWithWorkflow{
<#
.SYNOPSIS
Get all files across multiple datastores using workflow to increase the speed.
.NOTES
Author: Christophe Calvet
Blog: http://www.thecrazyconsultant.com/
.PARAMETER vCenter
The vCenter name
.PARAMETER session
An existing vCenter session ($global:DefaultVIServer.SessionSecret)
.PARAMETER matchPattern
This is the search parameter.
By default "*" but it can be replaced by "*.vmdk" or "*.vmx" for example
.PARAMETER Datastores
A table containing the name of all datastore to analyse.
#>
    param(
        [Parameter(Mandatory=$true)]
        [string]$vcenter,
        [Parameter(Mandatory=$true)]
        [string]$session,
        [string]$matchPattern = "*",
        [Parameter(Mandatory=$true)]
        [string[]]$Datastores
    )
    foreach -parallel ($Datastore in $Datastores){
        $DatastoreFiles = InlineScript{
            Function Get-FileInDatastore{
            <#
            .SYNOPSIS
            Extract the list of all files in datastore(S).
            .NOTES
            Author: Christophe Calvet
            Blog: http://www.thecrazyconsultant.com/
            .PARAMETER Datastore
            Pipe one or many PowerCLI datastore object
            .PARAMETER matchPattern
            This is the search parameter.
            By default "*" but it can be replaced by "*.vmdk" or "*.vmx" for example
            #>
                param(
                    [Parameter(Mandatory=$true,ValueFromPipeline=$true)]
                    [VMware.VimAutomation.ViCore.Impl.V1.DatastoreManagement.DatastoreImpl]$Datastore,
                    [string]$matchPattern = "*"
                )
                process{
                    try{
                        $HostDatastoreBrowserSearchSpec = New-Object VMware.Vim.HostDatastoreBrowserSearchSpec
                        $HostDatastoreBrowserSearchSpec.matchPattern = $matchPattern
                        $HostDatastoreBrowserSearchSpec.sortFoldersFirst = $true
                        $fileQueryFlags = New-Object VMware.Vim.FileQueryFlags
                        $fileQueryFlags.fileOwner = $True
                        $fileQueryFlags.fileSize = $True
                        $fileQueryFlags.fileType = $True
                        $fileQueryFlags.modification = $True
                        $HostDatastoreBrowserSearchSpec.details = $fileQueryFlags
                        $DatastoreName = $Datastore.extensiondata.Name
                        $DatastoreUrl = $Datastore.extensiondata.info.url
                        $DatastoreBrowser = Get-view -id ($Datastore.extensiondata.Browser)
                        $datastorePath = "[" + $DatastoreName + "]"
                        $HostDatastoreBrowserSearchResults = $DatastoreBrowser.SearchDatastoreSubFolders($datastorePath,$HostDatastoreBrowserSearchSpec)
                        $HostDatastoreBrowserSearchResults | foreach-object{
                            $FolderPath = $_.FolderPath
                            $_.file | foreach-object{
                                $FileTypeFullName = ($_.gettype()).FullName
                                If($FileTypeFullName -ne "VMware.Vim.FolderFileInfo"){
                                    $Output = New-Object -Type PSObject -Prop ([ordered]@{
                                        'DatastoreName' = $DatastoreName
                                        'DatastoreUrl' = $DatastoreUrl
                                        'FolderPath' = $FolderPath
                                        'Path' = $_.Path
                                        'FullPath' = $FolderPath + $_.Path
                                        'FileSize' = $_.FileSize
                                        'Modification' = $_.Modification
                                        'Owner' = $_.Owner
                                        'FileTypeFullName' = $FileTypeFullName
                                    })
                                    Return $Output
                                }
                            }
                        }
                    }
                    Catch{
                        Write-error $_
                    }
                }
            }
            Add-PSSnapin VMware.VimAutomation.Core
            Connect-VIServer -Server $Using:vcenter -Session $Using:session | Out-Null
            Get-datastore -name $using:Datastore | Get-FileInDatastore -matchPattern $using:matchPattern
        }
        $DatastoreFiles
    }
}
Please check the post of Luc Dekens to understand the logic of workflows.
In this case we pass an array of datastore names as a parameter.
You will notice that the function Get-FileInDatastore is defined inside the InlineScript, so that it is available in the session where the InlineScript runs.
This function is now executed in parallel across many datastores.
get-ProbablyOrphanedFile
function get-ProbablyOrphanedFile{
<#
.SYNOPSIS
Get all files that are probably orphaned.
.NOTES
Author: Christophe Calvet
Blog: http://www.thecrazyconsultant.com/
.PARAMETER Datastores
Pipe one or many PowerCLI datastore object
.PARAMETER matchPattern
This is the search parameter.
By default "*" but it can be replaced by "*.vmdk" or "*.vmx" for example
.PARAMETER SafeSearch
Enabled by default.
It should contain only the type of file that can be identified as orphaned. (No ctk.vmdk for example)
When disabled it will report all files not identified as associated to any VMs,
it means that they can be associated to some VMs (Like ctk.vmdk)
#>
    param(
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]
        $Datastores,
        $matchPattern = "*",
        [boolean]$SafeSearch = $True
    )
    begin{
        # Piped datastores arrive one at a time, so collect them first;
        # the workflow can only run in parallel if it receives the whole list.
        $AllDatastores = @()
    }
    process{
        $AllDatastores += $Datastores
    }
    end{
        if ($global:DefaultVIServers.Count -gt 1 -OR $global:DefaultVIServers.RefCount -gt 1 ) {
            Write-error "Only one connection to vCenter allowed"
        }
        Else{
            Try{
                $DatastoresName = $AllDatastores.Name
                $DatastoreFiles = get-FileInDatastoreWithWorkflow -Datastores $DatastoresName -matchPattern $matchPattern -vcenter $global:DefaultVIServer.Name -session $global:DefaultVIServer.SessionSecret
                # Queried AFTER browsing the datastores to reduce false positives
                $FilesAssociatedToAllVMs = Get-FilesIdentifiedAsAssociatedToAllVMs
                # Build the list of known file names once instead of once per file
                $KnownFileNames = $FilesAssociatedToAllVMs.FileName
                $FilesNotIdentifiedAsAssociatedToAnyVM = $DatastoreFiles | where{
                    $KnownFileNames -notcontains $_.FullPath
                }
                if ($SafeSearch) {
                    # -like for the wildcard on the type name; dots escaped in the regex patterns
                    $ProbablyOrphanedFiles = $FilesNotIdentifiedAsAssociatedToAnyVM | where{
                        $_.FileTypeFullName -like "VMware.Vim.Vm*" -OR
                        ($_.FileTypeFullName -eq "VMware.Vim.FileInfo" -AND (
                            $_.Fullpath -match "\.vmsd" -OR
                            $_.Fullpath -match "\.vmxf" -OR
                            $_.Fullpath -match "aux\.xml" -OR
                            $_.Fullpath -match "\.vswp" -OR
                            ($_.Fullpath -match "\.vmdk" -AND $_.Fullpath -notmatch "ctk\.vmdk") -OR
                            ($_.Fullpath -match "\.vmx" -AND $_.Fullpath -notmatch "\.vmx~" -AND $_.Fullpath -notmatch "\.vmx\.lck")
                        ))
                    }
                    $ProbablyOrphanedFiles
                }
                else{
                    $FilesNotIdentifiedAsAssociatedToAnyVM
                }
            }
            Catch{
                Write-error $_
            }
        }
    }
}
The final function glues together all the functions presented above.
The function “Get-FilesIdentifiedAsAssociatedToAllVMs” is executed AFTER extracting the list of files in the datastore(s). This reduces the number of “false positives”.
If a Storage vMotion happens while this function runs, the files reported as “orphaned” will be those at the pre-migration location, not those at the post-migration location used in production.
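The comparison at the heart of this function can be sketched on made-up data. The paths below are hypothetical; the only assumption is that the FullPath built while browsing a datastore has the same form as the FileName reported for a VM. A HashSet is used here for fast lookups, which is a swap-in for the -notcontains test, not what the function itself does.

```powershell
# Hypothetical file names reported through the VM API.
$FilesAssociatedToAllVMs = @(
    '[SSD01] vm1/vm1.vmx'
    '[SSD01] vm1/vm1.vmdk'
)

# Hypothetical files found while browsing the datastore.
$DatastoreFiles = @(
    [pscustomobject]@{ FullPath = '[SSD01] vm1/vm1.vmx' }
    [pscustomobject]@{ FullPath = '[SSD01] vm1/vm1.vmdk' }
    [pscustomobject]@{ FullPath = '[SSD01] old/old.vmdk' }
    [pscustomobject]@{ FullPath = '[SSD01] old/old.vmx' }
)

# Case-insensitive set of the file names known to belong to a VM.
$KnownFiles = New-Object 'System.Collections.Generic.HashSet[string]' ([System.StringComparer]::OrdinalIgnoreCase)
foreach ($Name in $FilesAssociatedToAllVMs) { [void]$KnownFiles.Add($Name) }

# Keep only the datastore files that no VM claims.
$NotAssociated = @($DatastoreFiles | Where-Object { -not $KnownFiles.Contains($_.FullPath) })
$NotAssociated | ForEach-Object { $_.FullPath }
```

With these sample values only the two files under the “old” folder remain, since no VM references them.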
More explanation regarding the “SafeSearch” parameter.
“Get-FilesIdentifiedAsAssociatedToAllVMs” reports many of the files associated with VMs, but not all of them.
All files of type VMware.Vim.Vm* will be identified as associated with a VM, for example the “.log” file of type “VMware.Vim.VmLogFileInfo”.
Files that show up as type “VMware.Vim.FileInfo” while browsing the datastore are more challenging.
The following files will not be identified as associated with a VM:
ctk.vmdk
.hlog
.vmx.lck
.vmx~
However the following files will be identified as associated to a VM:
.vmsd / snapshotList
.vmx / config
.vmxf / extendedConfig
.vmdk / diskDescriptor (for RDM)
-rdmp.vmdk / diskExtent (for RDM)
aux.xml / snapshotManifestList
.vswp / swap or uwswap
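To make the effect of SafeSearch more concrete, here is a small sketch of the filter applied to made-up entries. The file names are hypothetical, and the dots are escaped so that each pattern matches a literal dot.

```powershell
# Hypothetical files that were NOT matched to any VM.
$Candidates = @(
    [pscustomobject]@{ FullPath = '[SSD01] old/vmware.log';   FileTypeFullName = 'VMware.Vim.VmLogFileInfo' }
    [pscustomobject]@{ FullPath = '[SSD01] old/old.vmsd';     FileTypeFullName = 'VMware.Vim.FileInfo' }
    [pscustomobject]@{ FullPath = '[SSD01] vm1/vm1-ctk.vmdk'; FileTypeFullName = 'VMware.Vim.FileInfo' }
    [pscustomobject]@{ FullPath = '[SSD01] vm1/vm1.vmx~';     FileTypeFullName = 'VMware.Vim.FileInfo' }
    [pscustomobject]@{ FullPath = '[SSD01] vm1/vm1.vmx.lck';  FileTypeFullName = 'VMware.Vim.FileInfo' }
    [pscustomobject]@{ FullPath = '[SSD01] vm1/vm1.hlog';     FileTypeFullName = 'VMware.Vim.FileInfo' }
)

# Keep VM-typed files, plus the generic FileInfo entries whose extension
# is normally reported by the VM API (so a miss suggests an orphan).
$ProbablyOrphaned = @($Candidates | Where-Object {
    $_.FileTypeFullName -like 'VMware.Vim.Vm*' -or
    ($_.FileTypeFullName -eq 'VMware.Vim.FileInfo' -and (
        $_.FullPath -match '\.vmsd' -or
        $_.FullPath -match '\.vmxf' -or
        $_.FullPath -match 'aux\.xml' -or
        $_.FullPath -match '\.vswp' -or
        ($_.FullPath -match '\.vmdk' -and $_.FullPath -notmatch 'ctk\.vmdk') -or
        ($_.FullPath -match '\.vmx' -and $_.FullPath -notmatch '\.vmx~' -and $_.FullPath -notmatch '\.vmx\.lck')
    ))
})

$ProbablyOrphaned | ForEach-Object { $_.FullPath }
```

Only the log file and the .vmsd file survive the filter here; the ctk.vmdk, .vmx~, .vmx.lck and .hlog entries are dropped because the VM API never reports them, so their absence proves nothing.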
How to use it?
Connect-VIServer -Server "testVC"
#To identify all probably orphaned files
Get-Datastore | where {$_.name -like "SSD*"} | get-ProbablyOrphanedFile -matchpattern "*" | ogv
#To identify all probably orphaned vmdk files
Get-Datastore | where {$_.name -like "SSD*"} | get-ProbablyOrphanedFile -matchpattern "*.vmdk" | ogv
#To identify an orphaned VM (Handy if someone has removed a VM from the inventory by mistake)
Get-Datastore | where {$_.name -like "SSD*"} | get-ProbablyOrphanedFile -matchpattern "*.vmx" | ogv
#To identify all ISOs while searching in parallel across multiple datastores
Get-Datastore | where {$_.name -like "SSD*"} | get-ProbablyOrphanedFile -matchpattern "*.iso" -SafeSearch $false | ogv
Disconnect-VIServer -Server "testVC" -confirm:$False
Should you automatically delete the orphaned files?
A legitimate question; the short answer is NO.
A datastore can be shared across multiple vCenter servers.
I have seen a VM that wrongly reported no disks associated with it.
If some operations happen at the storage level at the same time, like a snapshot or a rotation of logs, you will end up with “false positives”.
So a manual check is strongly recommended.
Known issues:
The workflow will be limited to 5 parallel executions, even if the throttle limit is increased.
This is due to the “InlineScript” and the maximum number of processes.
So far I have not found a way to increase this number above 5 for a local “workflow”.
Time out errors?
I ended up with a timeout error when working with a very large NetApp NFS datastore.
Daniel Jensen has described this issue and a possible solution in great detail in the post “Orphaned vmdk search return exception on large Datastores”.
I will update this post soon with another workaround.
Hi
I’m looking for a script to find orphaned VMs and I saw your post here. I have tried your script and it does not seem to work, or maybe my steps are wrong. What are the correct steps?
I copied the script “get-ProbablyOrphanedFile” and pasted it into the CLI,
and next I copied "Get-Datastore | where {$_.name -like "SSD*"} | get-ProbablyOrphanedFile -matchpattern "*.vmx" | ogv" and pasted it into the CLI.
The result is nothing. What could have gone wrong?
Can anyone tell me whether there is a similar approach in vijava/yavijava?