From 078f3a218394eef3b28a2a061d836efe42b6c9ed Mon Sep 17 00:00:00 2001
From: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Date: Fri, 14 Jan 2022 16:49:28 +0100
Subject: [PATCH 1/5] Datasource for VMware (#953)

RH-Author: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-MergeRequest: 17: Datasource for VMware
RH-Commit: [1/5] 7b47334ec524dcf1b8edd02b65df7d0ff5a366e0 (eesposit/cloud-init-centos-)
RH-Bugzilla: 2040090
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Eduardo Otubo <otubo@redhat.com>

commit 8b4a9bc7b81e61943af873bad92e2133f8275b0b
Author: Andrew Kutz <101085+akutz@users.noreply.github.com>
Date:   Mon Aug 9 21:24:07 2021 -0500

    Datasource for VMware (#953)

    This patch finally introduces the Cloud-Init Datasource for VMware
    GuestInfo as a part of cloud-init proper. This datasource has existed
    since 2018, and rapidly became the de facto datasource for developers
    working with Packer, Terraform, for projects like kube-image-builder,
    and the de jure datasource for Photon OS.

    The major change to the datasource from its previous incarnation is
    the name. Now named DatasourceVMware, this new version of the
    datasource will allow multiple transport types in addition to
    GuestInfo keys.
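
As a rough illustration of what trying several transport types could look like, the sketch below walks a list of transports in order and stops at the first one that yields any data. The names `pick_transport`, `from_env`, and `from_guestinfo` are hypothetical and exist only for this example; they are not functions from this patch. The ordering (environment variables before GuestInfo) follows the `_get_data` docstring in the diff.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A transport reader returns a dict that may hold any of the keys
# "metadata", "userdata", and "vendordata".
Reader = Callable[[], Dict[str, Optional[str]]]

KEYS = ("metadata", "userdata", "vendordata")


def pick_transport(
    transports: List[Tuple[str, Reader]],
) -> Tuple[Optional[str], Dict[str, Optional[str]]]:
    """Return (name, data) for the first transport that yields any value."""
    for name, reader in transports:
        data = reader()
        if any(data.get(key) for key in KEYS):
            return name, data
    return None, {}


# Hypothetical readers, used only to exercise the selection logic.
def from_env() -> Dict[str, Optional[str]]:
    return {"metadata": None, "userdata": None, "vendordata": None}


def from_guestinfo() -> Dict[str, Optional[str]]:
    return {"metadata": '{"local-hostname": "demo"}', "userdata": None}


name, data = pick_transport(
    [("envvar", from_env), ("guestinfo", from_guestinfo)]
)
print(name)
```

Keeping the transports in an ordered list means a new transport can be added without touching the selection loop, which is one plausible way to read "multiple transport types".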

    This datasource includes several unique features developed to address
    real-world situations:

      * Support for reading any key (metadata, userdata, vendordata) both
        from the guestinfo table when running on a VM in vSphere as well as
        from an environment variable when running inside of a container,
        useful for rapid dev/test.

      * Allows booting with DHCP while still providing full participation
        in Cloud-Init instance data and Jinja queries. The netifaces library
        provides the ability to inspect the network after it is online,
        and the runtime network configuration is then merged into the
        existing metadata and persisted to disk.

      * Advertises the local_ipv4 and local_ipv6 addresses via guestinfo
        as well. This is useful as Guest Tools is not always able to
        identify what would be considered the local address.
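
The first bullet can be made concrete with a small sketch. It mirrors the naming rule of `get_guestinfo_envvar_key_name` in this patch (prefix `vmx.guestinfo.`, upper-case, dots replaced by underscores) and the `base64` and `gzip+base64` encodings that the datasource accepts. The helper names `envvar_name` and `read_key` and the sample values are assumptions made for illustration, and the code uses only the Python standard library rather than cloud-init's `util` helpers.

```python
import base64
import gzip


def envvar_name(key):
    # "metadata" -> "VMX_GUESTINFO_METADATA"
    return ("vmx.guestinfo." + key).upper().replace(".", "_")


def read_key(key, environ):
    """Return the decoded text for key from an environ-like mapping."""
    val = environ.get(envvar_name(key), "").rstrip()
    # "---" is the placeholder the patch treats as an empty value.
    if not val or val == "---":
        return None
    enc = environ.get(envvar_name(key + ".encoding"), "").strip()
    if enc in ("gzip+base64", "gz+b64"):
        raw = gzip.decompress(base64.b64decode(val))
    elif enc in ("base64", "b64"):
        raw = base64.b64decode(val)
    else:
        raw = val.encode("utf-8")
    return raw.decode("utf-8")


# Sample environment (illustrative values only). The datasource in this
# patch consults these variables only when VMX_GUESTINFO is set; this
# sketch skips that check for brevity.
env = {
    "VMX_GUESTINFO": "1",
    "VMX_GUESTINFO_METADATA": base64.b64encode(b"instance-id: demo").decode(),
    "VMX_GUESTINFO_METADATA_ENCODING": "base64",
}
print(envvar_name("userdata"))
print(read_key("metadata", env))
```

Passing the environment in as a mapping, rather than reading `os.environ` directly, keeps the sketch easy to try out without exporting any variables.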

    The primary author and current steward of this datasource spoke at
    Cloud-Init Con 2020 where there was interest in contributing this datasource
    to the Cloud-Init codebase.

    The datasource currently lives in its own GitHub repository at
    https://github.com/vmware/cloud-init-vmware-guestinfo. Once the datasource
    is merged into Cloud-Init, the old repository will be deprecated.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 README.md                                     |   2 +-
 cloudinit/settings.py                         |   1 +
 cloudinit/sources/DataSourceVMware.py         | 871 ++++++++++++++++++
 doc/rtd/topics/availability.rst               |   1 +
 doc/rtd/topics/datasources.rst                |   2 +-
 doc/rtd/topics/datasources/vmware.rst         | 359 ++++++++
 requirements.txt                              |  12 +
 .../unittests/test_datasource/test_common.py  |   3 +
 .../unittests/test_datasource/test_vmware.py  | 377 ++++++++
 tests/unittests/test_ds_identify.py           | 279 +++++-
 tools/.github-cla-signers                     |   1 +
 tools/ds-identify                             |  76 +-
 12 files changed, 1980 insertions(+), 4 deletions(-)
 create mode 100644 cloudinit/sources/DataSourceVMware.py
 create mode 100644 doc/rtd/topics/datasources/vmware.rst
 create mode 100644 tests/unittests/test_datasource/test_vmware.py

diff --git a/README.md b/README.md
index 435405da..aa4fad63 100644
--- a/README.md
+++ b/README.md
@@ -39,7 +39,7 @@ get in contact with that distribution and send them our way!
 
 | Supported OSes | Supported Public Clouds | Supported Private Clouds |
 | --- | --- | --- |
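-| Alpine Linux<br />ArchLinux<br />Debian<br />Fedora<br />FreeBSD<br />Gentoo Linux<br />NetBSD<br />OpenBSD<br />RHEL/CentOS<br />SLES/openSUSE<br />Ubuntu<br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /> | Amazon Web Services<br />Microsoft Azure<br />Google Cloud Platform<br />Oracle Cloud Infrastructure<br />Softlayer<br />Rackspace Public Cloud<br />IBM Cloud<br />Digital Ocean<br />Bigstep<br />Hetzner<br />Joyent<br />CloudSigma<br />Alibaba Cloud<br />OVH<br />OpenNebula<br />Exoscale<br />Scaleway<br />CloudStack<br />AltCloud<br />SmartOS<br />HyperOne<br />Rootbox<br /> | Bare metal installs<br />OpenStack<br />LXD<br />KVM<br />Metal-as-a-Service (MAAS)<br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /> |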
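+| Alpine Linux<br />ArchLinux<br />Debian<br />Fedora<br />FreeBSD<br />Gentoo Linux<br />NetBSD<br />OpenBSD<br />RHEL/CentOS<br />SLES/openSUSE<br />Ubuntu<br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /> | Amazon Web Services<br />Microsoft Azure<br />Google Cloud Platform<br />Oracle Cloud Infrastructure<br />Softlayer<br />Rackspace Public Cloud<br />IBM Cloud<br />Digital Ocean<br />Bigstep<br />Hetzner<br />Joyent<br />CloudSigma<br />Alibaba Cloud<br />OVH<br />OpenNebula<br />Exoscale<br />Scaleway<br />CloudStack<br />AltCloud<br />SmartOS<br />HyperOne<br />Rootbox<br /> | Bare metal installs<br />OpenStack<br />LXD<br />KVM<br />Metal-as-a-Service (MAAS)<br />VMware<br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /><br /> |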
 
 ## To start developing cloud-init
 
diff --git a/cloudinit/settings.py b/cloudinit/settings.py
index 2acf2615..d5f32dbb 100644
--- a/cloudinit/settings.py
+++ b/cloudinit/settings.py
@@ -42,6 +42,7 @@ CFG_BUILTIN = {
         'Exoscale',
         'RbxCloud',
         'UpCloud',
+        'VMware',
         # At the end to act as a 'catch' when none of the above work...
         'None',
     ],
diff --git a/cloudinit/sources/DataSourceVMware.py b/cloudinit/sources/DataSourceVMware.py
new file mode 100644
index 00000000..22ca63de
--- /dev/null
+++ b/cloudinit/sources/DataSourceVMware.py
@@ -0,0 +1,871 @@
+# Cloud-Init DataSource for VMware
+#
+# Copyright (c) 2018-2021 VMware, Inc. All Rights Reserved.
+#
+# Authors: Anish Swaminathan <anishs@vmware.com>
+#          Andrew Kutz <akutz@vmware.com>
+#
+# This file is part of cloud-init. See LICENSE file for license information.
+
+"""Cloud-Init DataSource for VMware
+
+This module provides a cloud-init datasource for VMware systems and supports
+multiple transport types, including:
+
+    * EnvVars
+    * GuestInfo
+
+Netifaces (https://github.com/al45tair/netifaces)
+
+    Please note this module relies on the netifaces project to introspect the
+    runtime network configuration of the host on which this datasource is
+    running. This is in contrast to the rest of cloud-init which uses the
+    cloudinit/netinfo module.
+
+    The reasons for using netifaces include:
+
+        * Netifaces is built in C and is more portable across multiple systems
+          and more deterministic than shell exec'ing local network commands and
+          parsing their output.
+
+        * Netifaces provides a stable way to determine the view of the host's
+          network after DHCP has brought the network online. Unlike most other
+          datasources, this datasource still provides support for JINJA queries
+          based on networking information even when the network is based on a
+          DHCP lease. While this does not tie this datasource directly to
+          netifaces, it does mean the ability to consistently obtain the
+          correct information is paramount.
+
+        * It is currently possible to execute this datasource on macOS
+          (which many developers use today) to print the output of the
+          get_host_info function. This function calls netifaces to obtain
+          the same runtime network configuration that the datasource would
+          persist to the local system's instance data.
+
+          However, the netinfo module fails on macOS. The result is either a
+          hung operation that requires a SIGINT to return control to the user,
+          or, if brew is used to install iproute2mac, the ip commands are used
+          but produce output the netinfo module is unable to parse.
+
+          While macOS is not a target of cloud-init, this feature is quite
+          useful when working on this datasource.
+
+          For more information about this behavior, please see the following
+          PR comment, https://bit.ly/3fG7OVh.
+
+    The authors of this datasource are not opposed to moving away from
+    netifaces. The goal may be to eventually do just that. This proviso was
+    added to the top of this module as a way to remind future-us and others
+    why netifaces was used in the first place in order to either smooth the
+    transition away from netifaces or embrace it further up the cloud-init
+    stack.
+"""
+
+import collections
+import copy
+from distutils.spawn import find_executable
+import ipaddress
+import json
+import os
+import socket
+import time
+
+from cloudinit import dmi, log as logging
+from cloudinit import sources
+from cloudinit import util
+from cloudinit.subp import subp, ProcessExecutionError
+
+import netifaces
+
+
+PRODUCT_UUID_FILE_PATH = "/sys/class/dmi/id/product_uuid"
+
+LOG = logging.getLogger(__name__)
+NOVAL = "No value found"
+
+DATA_ACCESS_METHOD_ENVVAR = "envvar"
+DATA_ACCESS_METHOD_GUESTINFO = "guestinfo"
+
+VMWARE_RPCTOOL = find_executable("vmware-rpctool")
+REDACT = "redact"
+CLEANUP_GUESTINFO = "cleanup-guestinfo"
+VMX_GUESTINFO = "VMX_GUESTINFO"
+GUESTINFO_EMPTY_YAML_VAL = "---"
+
+LOCAL_IPV4 = "local-ipv4"
+LOCAL_IPV6 = "local-ipv6"
+WAIT_ON_NETWORK = "wait-on-network"
+WAIT_ON_NETWORK_IPV4 = "ipv4"
+WAIT_ON_NETWORK_IPV6 = "ipv6"
+
+
+class DataSourceVMware(sources.DataSource):
+    """
+    Setting the hostname:
+        The hostname is set by way of the metadata key "local-hostname".
+
+    Setting the instance ID:
+        The instance ID may be set by way of the metadata key "instance-id".
+        However, if this value is absent then the instance ID is read
+        from the file /sys/class/dmi/id/product_uuid.
+
+    Configuring the network:
+        The network is configured by setting the metadata key "network"
+        with a value consistent with Network Config Versions 1 or 2,
+        depending on the Linux distro's version of cloud-init:
+
+            Network Config Version 1 - http://bit.ly/cloudinit-net-conf-v1
+            Network Config Version 2 - http://bit.ly/cloudinit-net-conf-v2
+
+        For example, CentOS 7's official cloud-init package is version
+        0.7.9 and does not support Network Config Version 2. However,
+        this datasource still supports supplying Network Config Version 2
+        data as long as the Linux distro's cloud-init package is new
+        enough to parse the data.
+
+        The metadata key "network.encoding" may be used to indicate the
+        format of the metadata key "network". Valid encodings are base64
+        and gzip+base64.
+    """
+
+    dsname = "VMware"
+
+    def __init__(self, sys_cfg, distro, paths, ud_proc=None):
+        sources.DataSource.__init__(self, sys_cfg, distro, paths, ud_proc)
+
+        self.data_access_method = None
+        self.vmware_rpctool = VMWARE_RPCTOOL
+
+    def _get_data(self):
+        """
+        _get_data loads the metadata, userdata, and vendordata from one of
+        the following locations in the given order:
+
+            * envvars
+            * guestinfo
+
+        Please note when updating this function with support for new data
+        transports, the order should match the order in the dscheck_VMware
+        function from the file ds-identify.
+        """
+
+        # Initialize the locally scoped metadata, userdata, and vendordata
+        # variables. They are assigned below depending on the detected data
+        # access method.
+        md, ud, vd = None, None, None
+
+        # First check to see if there is data via env vars.
+        if os.environ.get(VMX_GUESTINFO, ""):
+            md = guestinfo_envvar("metadata")
+            ud = guestinfo_envvar("userdata")
+            vd = guestinfo_envvar("vendordata")
+
+            if md or ud or vd:
+                self.data_access_method = DATA_ACCESS_METHOD_ENVVAR
+
+        # At this point, all additional data transports are valid only on
+        # a VMware platform.
+        if not self.data_access_method:
+            system_type = dmi.read_dmi_data("system-product-name")
+            if system_type is None:
+                LOG.debug("No system-product-name found")
+                return False
+            if "vmware" not in system_type.lower():
+                LOG.debug("Not a VMware platform")
+                return False
+
+        # If no data was detected, check the guestinfo transport next.
+        if not self.data_access_method:
+            if self.vmware_rpctool:
+                md = guestinfo("metadata", self.vmware_rpctool)
+                ud = guestinfo("userdata", self.vmware_rpctool)
+                vd = guestinfo("vendordata", self.vmware_rpctool)
+
+                if md or ud or vd:
+                    self.data_access_method = DATA_ACCESS_METHOD_GUESTINFO
+
+        if not self.data_access_method:
+            LOG.error("failed to find a valid data access method")
+            return False
+
+        LOG.info("using data access method %s", self._get_subplatform())
+
+        # Get the metadata.
+        self.metadata = process_metadata(load_json_or_yaml(md))
+
+        # Get the user data.
+        self.userdata_raw = ud
+
+        # Get the vendor data.
+        self.vendordata_raw = vd
+
+        # Redact any sensitive information.
+        self.redact_keys()
+
+        # get_data returns true if there is any available metadata,
+        # userdata, or vendordata.
+        if self.metadata or self.userdata_raw or self.vendordata_raw:
+            return True
+        else:
+            return False
+
+    def setup(self, is_new_instance):
+        """setup(is_new_instance)
+
+        This is called before user-data and vendor-data have been processed.
+
+        Unless the datasource has set mode to 'local', then networking
+        per 'fallback' or per 'network_config' will have been written and
+        brought up by the OS at this point.
+        """
+
+        host_info = wait_on_network(self.metadata)
+        LOG.info("got host-info: %s", host_info)
+
+        # Reflect any possible local IPv4 or IPv6 addresses in the guest
+        # info.
+        advertise_local_ip_addrs(host_info)
+
+        # Ensure the metadata gets updated with information about the
+        # host, including the network interfaces, default IP addresses,
+        # etc.
+        self.metadata = util.mergemanydict([self.metadata, host_info])
+
+        # Persist the instance data for versions of cloud-init that support
+        # doing so. This occurs here rather than in the get_data call in
+        # order to ensure that the network interfaces are up and can be
+        # persisted with the metadata.
+        self.persist_instance_data()
+
+    def _get_subplatform(self):
+        get_key_name_fn = None
+        if self.data_access_method == DATA_ACCESS_METHOD_ENVVAR:
+            get_key_name_fn = get_guestinfo_envvar_key_name
+        elif self.data_access_method == DATA_ACCESS_METHOD_GUESTINFO:
+            get_key_name_fn = get_guestinfo_key_name
+        else:
+            return sources.METADATA_UNKNOWN
+
+        return "%s (%s)" % (
+            self.data_access_method,
+            get_key_name_fn("metadata"),
+        )
+
+    @property
+    def network_config(self):
+        if "network" in self.metadata:
+            LOG.debug("using metadata network config")
+        else:
+            LOG.debug("using fallback network config")
+            self.metadata["network"] = {
+                "config": self.distro.generate_fallback_config(),
+            }
+        return self.metadata["network"]["config"]
+
+    def get_instance_id(self):
+        # Pull the instance ID out of the metadata if present. Otherwise
+        # read the file /sys/class/dmi/id/product_uuid for the instance ID.
+        if self.metadata and "instance-id" in self.metadata:
+            return self.metadata["instance-id"]
+        with open(PRODUCT_UUID_FILE_PATH, "r") as id_file:
+            self.metadata["instance-id"] = str(id_file.read()).rstrip().lower()
+            return self.metadata["instance-id"]
+
+    def get_public_ssh_keys(self):
+        for key_name in (
+            "public-keys-data",
+            "public_keys_data",
+            "public-keys",
+            "public_keys",
+        ):
+            if key_name in self.metadata:
+                return sources.normalize_pubkey_data(self.metadata[key_name])
+        return []
+
+    def redact_keys(self):
+        # Determine if there are any keys to redact.
+        keys_to_redact = None
+        if REDACT in self.metadata:
+            keys_to_redact = self.metadata[REDACT]
+        elif CLEANUP_GUESTINFO in self.metadata:
+            # This is for backwards compatibility.
+            keys_to_redact = self.metadata[CLEANUP_GUESTINFO]
+
+        if self.data_access_method == DATA_ACCESS_METHOD_GUESTINFO:
+            guestinfo_redact_keys(keys_to_redact, self.vmware_rpctool)
+
+
+def decode(key, enc_type, data):
+    """
+    decode returns the decoded string value of data
+    key is a string used to identify the data being decoded in log messages
+    """
+    LOG.debug("Getting encoded data for key=%s, enc=%s", key, enc_type)
+
+    raw_data = None
+    if enc_type in ["gzip+base64", "gz+b64"]:
+        LOG.debug("Decoding %s format %s", enc_type, key)
+        raw_data = util.decomp_gzip(util.b64d(data))
+    elif enc_type in ["base64", "b64"]:
+        LOG.debug("Decoding %s format %s", enc_type, key)
+        raw_data = util.b64d(data)
+    else:
+        LOG.debug("Plain-text data %s", key)
+        raw_data = data
+
+    return util.decode_binary(raw_data)
+
+
+def get_none_if_empty_val(val):
+    """
+    get_none_if_empty_val returns None if the provided value, once stripped
+    of its trailing whitespace, is empty or equal to GUESTINFO_EMPTY_YAML_VAL.
+
+    A non-None result is always a string, regardless of whether the input is
+    a bytes class or a string.
+    """
+
+    # If the provided value is a bytes class, convert it to a string to
+    # simplify the rest of this function's logic.
+    val = util.decode_binary(val)
+    val = val.rstrip()
+    if len(val) == 0 or val == GUESTINFO_EMPTY_YAML_VAL:
+        return None
+    return val
+
+
+def advertise_local_ip_addrs(host_info):
+    """
+    advertise_local_ip_addrs gets the local IP address information from
+    the provided host_info map and sets the addresses in the guestinfo
+    namespace
+    """
+    if not host_info:
+        return
+
+    # Reflect any possible local IPv4 or IPv6 addresses in the guest
+    # info.
+    local_ipv4 = host_info.get(LOCAL_IPV4)
+    if local_ipv4:
+        guestinfo_set_value(LOCAL_IPV4, local_ipv4)
+        LOG.info("advertised local ipv4 address %s in guestinfo", local_ipv4)
+
+    local_ipv6 = host_info.get(LOCAL_IPV6)
+    if local_ipv6:
+        guestinfo_set_value(LOCAL_IPV6, local_ipv6)
+        LOG.info("advertised local ipv6 address %s in guestinfo", local_ipv6)
+
+
+def handle_returned_guestinfo_val(key, val):
+    """
+    handle_returned_guestinfo_val returns the provided value if it is
+    not empty or set to GUESTINFO_EMPTY_YAML_VAL, otherwise None is
+    returned
+    """
+    val = get_none_if_empty_val(val)
+    if val:
+        return val
+    LOG.debug("No value found for key %s", key)
+    return None
+
+
+def get_guestinfo_key_name(key):
+    return "guestinfo." + key
+
+
+def get_guestinfo_envvar_key_name(key):
+    return ("vmx." + get_guestinfo_key_name(key)).upper().replace(".", "_", -1)
+
+
+def guestinfo_envvar(key):
+    val = guestinfo_envvar_get_value(key)
+    if not val:
+        return None
+    enc_type = guestinfo_envvar_get_value(key + ".encoding")
+    return decode(get_guestinfo_envvar_key_name(key), enc_type, val)
+
+
+def guestinfo_envvar_get_value(key):
+    env_key = get_guestinfo_envvar_key_name(key)
+    return handle_returned_guestinfo_val(key, os.environ.get(env_key, ""))
+
+
+def guestinfo(key, vmware_rpctool=VMWARE_RPCTOOL):
+    """
+    guestinfo returns the guestinfo value for the provided key, decoding
+    the value when required
+    """
+    val = guestinfo_get_value(key, vmware_rpctool)
+    if not val:
+        return None
+    enc_type = guestinfo_get_value(key + ".encoding", vmware_rpctool)
+    return decode(get_guestinfo_key_name(key), enc_type, val)
+
+
+def guestinfo_get_value(key, vmware_rpctool=VMWARE_RPCTOOL):
+    """
+    Returns a guestinfo value for the specified key.
+    """
+    LOG.debug("Getting guestinfo value for key %s", key)
+
+    try:
+        (stdout, stderr) = subp(
+            [
+                vmware_rpctool,
+                "info-get " + get_guestinfo_key_name(key),
+            ]
+        )
+        if stderr == NOVAL:
+            LOG.debug("No value found for key %s", key)
+        elif not stdout:
+            LOG.error("Failed to get guestinfo value for key %s", key)
+        return handle_returned_guestinfo_val(key, stdout)
+    except ProcessExecutionError as error:
+        if error.stderr == NOVAL:
+            LOG.debug("No value found for key %s", key)
+        else:
+            util.logexc(
+                LOG,
+                "Failed to get guestinfo value for key %s: %s",
+                key,
+                error,
+            )
+    except Exception:
+        util.logexc(
+            LOG,
+            "Unexpected error while trying to get "
+            + "guestinfo value for key %s",
+            key,
+        )
+
+    return None
+
+
+def guestinfo_set_value(key, value, vmware_rpctool=VMWARE_RPCTOOL):
+    """
+    Sets a guestinfo value for the specified key. Set value to an empty string
+    to clear an existing guestinfo key.
+    """
+
+    # If value is an empty string then set it to a single space as it is not
+    # possible to set a guestinfo key to an empty string. Setting a guestinfo
+    # key to a single space is as close as it gets to clearing an existing
+    # guestinfo key.
+    if value == "":
+        value = " "
+
+    LOG.debug("Setting guestinfo key=%s to value=%s", key, value)
+
+    try:
+        subp(
+            [
+                vmware_rpctool,
+                ("info-set %s %s" % (get_guestinfo_key_name(key), value)),
+            ]
+        )
+        return True
+    except ProcessExecutionError as error:
+        util.logexc(
+            LOG,
+            "Failed to set guestinfo key=%s to value=%s: %s",
+            key,
+            value,
+            error,
+        )
+    except Exception:
+        util.logexc(
+            LOG,
+            "Unexpected error while trying to set "
+            + "guestinfo key=%s to value=%s",
+            key,
+            value,
+        )
+
+    return None
+
+
+def guestinfo_redact_keys(keys, vmware_rpctool=VMWARE_RPCTOOL):
+    """
+    guestinfo_redact_keys redacts guestinfo of all of the keys in the given
+    list. Each key will have its value set to "---". Since the value is valid
+    YAML, cloud-init can still read it if it tries.
+    """
+    if not keys:
+        return
+    if not isinstance(keys, (list, tuple)):
+        keys = [keys]
+    for key in keys:
+        key_name = get_guestinfo_key_name(key)
+        LOG.info("clearing %s", key_name)
+        if not guestinfo_set_value(
+            key, GUESTINFO_EMPTY_YAML_VAL, vmware_rpctool
+        ):
+            LOG.error("failed to clear %s", key_name)
+        LOG.info("clearing %s.encoding", key_name)
+        if not guestinfo_set_value(key + ".encoding", "", vmware_rpctool):
+            LOG.error("failed to clear %s.encoding", key_name)
+
+
+def load_json_or_yaml(data):
+    """
+    load_json_or_yaml first attempts to unmarshal the provided data as JSON,
+    and if that fails then attempts to unmarshal the data as YAML. If data
+    is None or empty then a new dictionary is returned.
+    """
+    if not data:
+        return {}
+    try:
+        return util.load_json(data)
+    except (json.JSONDecodeError, TypeError):
+        return util.load_yaml(data)
+
+
+def process_metadata(data):
+    """
+    process_metadata processes metadata and loads the optional network
+    configuration.
+    """
+    network = None
+    if "network" in data:
+        network = data["network"]
+        del data["network"]
+
+    network_enc = None
+    if "network.encoding" in data:
+        network_enc = data["network.encoding"]
+        del data["network.encoding"]
+
+    if network:
+        if isinstance(network, collections.abc.Mapping):
+            LOG.debug("network data copied to 'config' key")
+            network = {"config": copy.deepcopy(network)}
+        else:
+            LOG.debug("network data to be decoded %s", network)
+            dec_net = decode("metadata.network", network_enc, network)
+            network = {
+                "config": load_json_or_yaml(dec_net),
+            }
+
+        LOG.debug("network data %s", network)
+        data["network"] = network
+
+    return data
+
+
+# Used to match classes to dependencies
+datasources = [
+    (DataSourceVMware, (sources.DEP_FILESYSTEM,)),  # Run at init-local
+    (DataSourceVMware, (sources.DEP_FILESYSTEM, sources.DEP_NETWORK)),
+]
+
+
+def get_datasource_list(depends):
+    """
+    Return a list of data sources that match this set of dependencies
+    """
+    return sources.list_from_depends(depends, datasources)
+
+
+def get_default_ip_addrs():
+    """
+    Returns the default IPv4 and IPv6 addresses based on the device(s) used for
+    the default route. Please note that None may be returned for either address
+    family if that family has no default route or if there are multiple
+    addresses associated with the device used by the default route for a given
+    address family.
+    """
+    # TODO(promote and use netifaces in cloudinit.net* modules)
+    gateways = netifaces.gateways()
+    if "default" not in gateways:
+        return None, None
+
+    default_gw = gateways["default"]
+    if (
+        netifaces.AF_INET not in default_gw
+        and netifaces.AF_INET6 not in default_gw
+    ):
+        return None, None
+
+    ipv4 = None
+    ipv6 = None
+
+    gw4 = default_gw.get(netifaces.AF_INET)
+    if gw4:
+        _, dev4 = gw4
+        addr4_fams = netifaces.ifaddresses(dev4)
+        if addr4_fams:
+            af_inet4 = addr4_fams.get(netifaces.AF_INET)
+            if af_inet4:
+                if len(af_inet4) > 1:
+                    LOG.warning(
+                        "device %s has more than one ipv4 address: %s",
+                        dev4,
+                        af_inet4,
+                    )
+                elif "addr" in af_inet4[0]:
+                    ipv4 = af_inet4[0]["addr"]
+
+    # Try to get the default IPv6 address by first seeing if there is a default
+    # IPv6 route.
+    gw6 = default_gw.get(netifaces.AF_INET6)
+    if gw6:
+        _, dev6 = gw6
+        addr6_fams = netifaces.ifaddresses(dev6)
+        if addr6_fams:
+            af_inet6 = addr6_fams.get(netifaces.AF_INET6)
+            if af_inet6:
+                if len(af_inet6) > 1:
+                    LOG.warning(
+                        "device %s has more than one ipv6 address: %s",
+                        dev6,
+                        af_inet6,
+                    )
+                elif "addr" in af_inet6[0]:
+                    ipv6 = af_inet6[0]["addr"]
+
c36ff1
+    # If there is a default IPv4 address but not IPv6, then see if there is a
c36ff1
+    # single IPv6 address associated with the same device associated with the
c36ff1
+    # default IPv4 address.
c36ff1
+    if ipv4 and not ipv6:
c36ff1
+        af_inet6 = addr4_fams.get(netifaces.AF_INET6)
c36ff1
+        if af_inet6:
c36ff1
+            if len(af_inet6) > 1:
c36ff1
+                LOG.warning(
c36ff1
+                    "device %s has more than one ipv6 address: %s",
c36ff1
+                    dev4,
c36ff1
+                    af_inet6,
c36ff1
+                )
c36ff1
+            elif "addr" in af_inet6[0]:
c36ff1
+                ipv6 = af_inet6[0]["addr"]
c36ff1
+
c36ff1
+    # If there is a default IPv6 address but not IPv4, then see if there is a
c36ff1
+    # single IPv4 address associated with the same device associated with the
c36ff1
+    # default IPv6 address.
c36ff1
+    if not ipv4 and ipv6:
c36ff1
+        af_inet4 = addr6_fams.get(netifaces.AF_INET)
c36ff1
+        if af_inet4:
c36ff1
+            if len(af_inet4) > 1:
c36ff1
+                LOG.warning(
c36ff1
+                    "device %s has more than one ipv4 address: %s",
c36ff1
+                    dev6,
c36ff1
+                    af_inet4,
c36ff1
+                )
c36ff1
+            elif "addr" in af_inet4[0]:
c36ff1
+                ipv4 = af_inet4[0]["addr"]
c36ff1
+
c36ff1
+    return ipv4, ipv6
c36ff1
+
c36ff1
+
c36ff1
+# patched socket.getfqdn() - see https://bugs.python.org/issue5004
c36ff1
+
c36ff1
+
c36ff1
+def getfqdn(name=""):
c36ff1
+    """Get fully qualified domain name from name.
c36ff1
+    An empty argument is interpreted as meaning the local host.
c36ff1
+    """
c36ff1
+    # TODO(may want to promote this function to util.getfqdn)
c36ff1
+    # TODO(may want to extend util.get_hostname to accept fqdn=True param)
c36ff1
+    name = name.strip()
c36ff1
+    if not name or name == "0.0.0.0":
c36ff1
+        name = util.get_hostname()
c36ff1
+    try:
c36ff1
+        addrs = socket.getaddrinfo(
c36ff1
+            name, None, 0, socket.SOCK_DGRAM, 0, socket.AI_CANONNAME
c36ff1
+        )
c36ff1
+    except socket.error:
c36ff1
+        pass
c36ff1
+    else:
c36ff1
+        for addr in addrs:
c36ff1
+            if addr[3]:
c36ff1
+                name = addr[3]
c36ff1
+                break
c36ff1
+    return name
c36ff1
+
c36ff1
+
c36ff1
+def is_valid_ip_addr(val):
c36ff1
+    """
c36ff1
+    Returns false if the address is loopback, link local or unspecified;
c36ff1
+    otherwise true is returned.
c36ff1
+    """
c36ff1
+    # TODO(extend cloudinit.net.is_ip_addr exclude link_local/loopback etc)
c36ff1
+    # TODO(migrate to use cloudinit.net.is_ip_addr)#
c36ff1
+
c36ff1
+    addr = None
c36ff1
+    try:
c36ff1
+        addr = ipaddress.ip_address(val)
c36ff1
+    except ipaddress.AddressValueError:
c36ff1
+        addr = ipaddress.ip_address(str(val))
c36ff1
+    except Exception:
c36ff1
+        return None
c36ff1
+
c36ff1
+    if addr.is_link_local or addr.is_loopback or addr.is_unspecified:
c36ff1
+        return False
c36ff1
+    return True
c36ff1
+
c36ff1
+
c36ff1
+def get_host_info():
c36ff1
+    """
c36ff1
+    Returns host information such as the host name and network interfaces.
c36ff1
+    """
c36ff1
+    # TODO(look to promote netifaces use up in cloud-init netinfo funcs)
c36ff1
+    host_info = {
c36ff1
+        "network": {
c36ff1
+            "interfaces": {
c36ff1
+                "by-mac": collections.OrderedDict(),
c36ff1
+                "by-ipv4": collections.OrderedDict(),
c36ff1
+                "by-ipv6": collections.OrderedDict(),
c36ff1
+            },
c36ff1
+        },
c36ff1
+    }
c36ff1
+    hostname = getfqdn(util.get_hostname())
c36ff1
+    if hostname:
c36ff1
+        host_info["hostname"] = hostname
c36ff1
+        host_info["local-hostname"] = hostname
c36ff1
+        host_info["local_hostname"] = hostname
c36ff1
+
c36ff1
+    default_ipv4, default_ipv6 = get_default_ip_addrs()
c36ff1
+    if default_ipv4:
c36ff1
+        host_info[LOCAL_IPV4] = default_ipv4
c36ff1
+    if default_ipv6:
c36ff1
+        host_info[LOCAL_IPV6] = default_ipv6
c36ff1
+
c36ff1
+    by_mac = host_info["network"]["interfaces"]["by-mac"]
c36ff1
+    by_ipv4 = host_info["network"]["interfaces"]["by-ipv4"]
c36ff1
+    by_ipv6 = host_info["network"]["interfaces"]["by-ipv6"]
c36ff1
+
c36ff1
+    ifaces = netifaces.interfaces()
c36ff1
+    for dev_name in ifaces:
c36ff1
+        addr_fams = netifaces.ifaddresses(dev_name)
c36ff1
+        af_link = addr_fams.get(netifaces.AF_LINK)
c36ff1
+        af_inet4 = addr_fams.get(netifaces.AF_INET)
c36ff1
+        af_inet6 = addr_fams.get(netifaces.AF_INET6)
c36ff1
+
c36ff1
+        mac = None
c36ff1
+        if af_link and "addr" in af_link[0]:
c36ff1
+            mac = af_link[0]["addr"]
c36ff1
+
c36ff1
+        # Do not bother recording localhost
c36ff1
+        if mac == "00:00:00:00:00:00":
c36ff1
+            continue
c36ff1
+
c36ff1
+        if mac and (af_inet4 or af_inet6):
c36ff1
+            key = mac
c36ff1
+            val = {}
c36ff1
+            if af_inet4:
c36ff1
+                af_inet4_vals = []
c36ff1
+                for ip_info in af_inet4:
c36ff1
+                    if not is_valid_ip_addr(ip_info["addr"]):
c36ff1
+                        continue
c36ff1
+                    af_inet4_vals.append(ip_info)
c36ff1
+                val["ipv4"] = af_inet4_vals
c36ff1
+            if af_inet6:
c36ff1
+                af_inet6_vals = []
c36ff1
+                for ip_info in af_inet6:
c36ff1
+                    if not is_valid_ip_addr(ip_info["addr"]):
c36ff1
+                        continue
c36ff1
+                    af_inet6_vals.append(ip_info)
c36ff1
+                val["ipv6"] = af_inet6_vals
c36ff1
+            by_mac[key] = val
c36ff1
+
c36ff1
+        if af_inet4:
c36ff1
+            for ip_info in af_inet4:
c36ff1
+                key = ip_info["addr"]
c36ff1
+                if not is_valid_ip_addr(key):
c36ff1
+                    continue
c36ff1
+                val = copy.deepcopy(ip_info)
c36ff1
+                del val["addr"]
c36ff1
+                if mac:
c36ff1
+                    val["mac"] = mac
c36ff1
+                by_ipv4[key] = val
c36ff1
+
c36ff1
+        if af_inet6:
c36ff1
+            for ip_info in af_inet6:
c36ff1
+                key = ip_info["addr"]
c36ff1
+                if not is_valid_ip_addr(key):
c36ff1
+                    continue
c36ff1
+                val = copy.deepcopy(ip_info)
c36ff1
+                del val["addr"]
c36ff1
+                if mac:
c36ff1
+                    val["mac"] = mac
c36ff1
+                by_ipv6[key] = val
c36ff1
+
c36ff1
+    return host_info
c36ff1
+
c36ff1
+
c36ff1
+def wait_on_network(metadata):
c36ff1
+    # Determine whether we need to wait on the network coming online.
c36ff1
+    wait_on_ipv4 = False
c36ff1
+    wait_on_ipv6 = False
c36ff1
+    if WAIT_ON_NETWORK in metadata:
c36ff1
+        wait_on_network = metadata[WAIT_ON_NETWORK]
c36ff1
+        if WAIT_ON_NETWORK_IPV4 in wait_on_network:
c36ff1
+            wait_on_ipv4_val = wait_on_network[WAIT_ON_NETWORK_IPV4]
c36ff1
+            if isinstance(wait_on_ipv4_val, bool):
c36ff1
+                wait_on_ipv4 = wait_on_ipv4_val
c36ff1
+            else:
c36ff1
+                wait_on_ipv4 = util.translate_bool(wait_on_ipv4_val)
c36ff1
+        if WAIT_ON_NETWORK_IPV6 in wait_on_network:
c36ff1
+            wait_on_ipv6_val = wait_on_network[WAIT_ON_NETWORK_IPV6]
c36ff1
+            if isinstance(wait_on_ipv6_val, bool):
c36ff1
+                wait_on_ipv6 = wait_on_ipv6_val
c36ff1
+            else:
c36ff1
+                wait_on_ipv6 = util.translate_bool(wait_on_ipv6_val)
c36ff1
+
c36ff1
+    # Get information about the host.
c36ff1
+    host_info, ipv4_ready, ipv6_ready = None, False, False
c36ff1
+    while host_info is None:
c36ff1
+        # This loop + sleep results in two logs every second while waiting
c36ff1
+        # for either ipv4 or ipv6 up. Do we really need to log each iteration
c36ff1
+        # or can we log once and log on successful exit?
c36ff1
+        host_info = get_host_info()
c36ff1
+
c36ff1
+        network = host_info.get("network") or {}
c36ff1
+        interfaces = network.get("interfaces") or {}
c36ff1
+        by_ipv4 = interfaces.get("by-ipv4") or {}
c36ff1
+        by_ipv6 = interfaces.get("by-ipv6") or {}
c36ff1
+
c36ff1
+        if wait_on_ipv4:
c36ff1
+            ipv4_ready = len(by_ipv4) > 0 if by_ipv4 else False
c36ff1
+            if not ipv4_ready:
c36ff1
+                host_info = None
c36ff1
+
c36ff1
+        if wait_on_ipv6:
c36ff1
+            ipv6_ready = len(by_ipv6) > 0 if by_ipv6 else False
c36ff1
+            if not ipv6_ready:
c36ff1
+                host_info = None
c36ff1
+
c36ff1
+        if host_info is None:
c36ff1
+            LOG.debug(
c36ff1
+                "waiting on network: wait4=%s, ready4=%s, wait6=%s, ready6=%s",
c36ff1
+                wait_on_ipv4,
c36ff1
+                ipv4_ready,
c36ff1
+                wait_on_ipv6,
c36ff1
+                ipv6_ready,
c36ff1
+            )
c36ff1
+            time.sleep(1)
c36ff1
+
c36ff1
+    LOG.debug("waiting on network complete")
c36ff1
+    return host_info
c36ff1
+
c36ff1
+
c36ff1
+def main():
c36ff1
+    """
c36ff1
+    Executed when this file is used as a program.
c36ff1
+    """
c36ff1
+    try:
c36ff1
+        logging.setupBasicLogging()
c36ff1
+    except Exception:
c36ff1
+        pass
c36ff1
+    metadata = {
c36ff1
+        "wait-on-network": {"ipv4": True, "ipv6": "false"},
c36ff1
+        "network": {"config": {"dhcp": True}},
c36ff1
+    }
c36ff1
+    host_info = wait_on_network(metadata)
c36ff1
+    metadata = util.mergemanydict([metadata, host_info])
c36ff1
+    print(util.json_dumps(metadata))
c36ff1
+
c36ff1
+
c36ff1
+if __name__ == "__main__":
c36ff1
+    main()
c36ff1
+
c36ff1
+# vi: ts=4 expandtab
c36ff1
diff --git a/doc/rtd/topics/availability.rst b/doc/rtd/topics/availability.rst
c36ff1
index f58b2b38..6606367c 100644
c36ff1
--- a/doc/rtd/topics/availability.rst
c36ff1
+++ b/doc/rtd/topics/availability.rst
c36ff1
@@ -64,5 +64,6 @@ Additionally, cloud-init is supported on these private clouds:
c36ff1
 - LXD
c36ff1
 - KVM
c36ff1
 - Metal-as-a-Service (MAAS)
c36ff1
+- VMware
c36ff1
 
c36ff1
 .. vi: textwidth=79
c36ff1
diff --git a/doc/rtd/topics/datasources.rst b/doc/rtd/topics/datasources.rst
c36ff1
index 228173d2..8afed470 100644
c36ff1
--- a/doc/rtd/topics/datasources.rst
c36ff1
+++ b/doc/rtd/topics/datasources.rst
c36ff1
@@ -49,7 +49,7 @@ The following is a list of documents for each supported datasource:
c36ff1
    datasources/smartos.rst
c36ff1
    datasources/upcloud.rst
c36ff1
    datasources/zstack.rst
c36ff1
-
c36ff1
+   datasources/vmware.rst
c36ff1
 
c36ff1
 Creation
c36ff1
 ========
c36ff1
diff --git a/doc/rtd/topics/datasources/vmware.rst b/doc/rtd/topics/datasources/vmware.rst
c36ff1
new file mode 100644
c36ff1
index 00000000..996eb61f
c36ff1
--- /dev/null
c36ff1
+++ b/doc/rtd/topics/datasources/vmware.rst
c36ff1
@@ -0,0 +1,359 @@
c36ff1
+.. _datasource_vmware:
c36ff1
+
c36ff1
+VMware
c36ff1
+======
c36ff1
+
c36ff1
+This datasource is for use with systems running on a VMware platform such as
c36ff1
+vSphere and currently supports the following data transports:
c36ff1
+
c36ff1
+
c36ff1
+* `GuestInfo <https://github.com/vmware/govmomi/blob/master/govc/USAGE.md#vmchange>`_ keys
c36ff1
+
c36ff1
+Configuration
c36ff1
+-------------
c36ff1
+
c36ff1
+The configuration method is dependent upon the transport:
c36ff1
+
c36ff1
+GuestInfo Keys
c36ff1
+^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+One method of providing meta, user, and vendor data is by setting the following
c36ff1
+key/value pairs on a VM's ``extraConfig`` `property <https://vdc-repo.vmware.com/vmwb-repository/dcr-public/723e7f8b-4f21-448b-a830-5f22fd931b01/5a8257bd-7f41-4423-9a73-03307535bd42/doc/vim.vm.ConfigInfo.html>`_ :
c36ff1
+
c36ff1
+.. list-table::
c36ff1
+   :header-rows: 1
c36ff1
+
c36ff1
+   * - Property
c36ff1
+     - Description
c36ff1
+   * - ``guestinfo.metadata``
c36ff1
+     - A YAML or JSON document containing the cloud-init metadata.
c36ff1
+   * - ``guestinfo.metadata.encoding``
c36ff1
+     - The encoding type for ``guestinfo.metadata``.
c36ff1
+   * - ``guestinfo.userdata``
c36ff1
+     - A YAML document containing the cloud-init user data.
c36ff1
+   * - ``guestinfo.userdata.encoding``
c36ff1
+     - The encoding type for ``guestinfo.userdata``.
c36ff1
+   * - ``guestinfo.vendordata``
c36ff1
+     - A YAML document containing the cloud-init vendor data.
c36ff1
+   * - ``guestinfo.vendordata.encoding``
c36ff1
+     - The encoding type for ``guestinfo.vendordata``.
c36ff1
+
c36ff1
+
c36ff1
+All ``guestinfo.*.encoding`` values may be set to ``base64`` or
c36ff1
+``gzip+base64``.
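
As an illustration of the two encodings named above (a sketch, not part of the patch), the following Python shows how a document could be encoded for a ``guestinfo.*`` key and decoded again. The helper names `encode_guestinfo` and `decode_guestinfo` are hypothetical and are not the datasource's own functions. Treating a blank encoding as plain text follows the redaction behavior described later in this document.

```python
import base64
import gzip


def encode_guestinfo(text, encoding):
    """Encode a document for a guestinfo key (hypothetical helper)."""
    if encoding not in ("base64", "gzip+base64"):
        raise ValueError("unsupported encoding: %r" % encoding)
    data = text.encode("utf-8")
    if encoding == "gzip+base64":
        # Compress first, then base64-encode the compressed bytes.
        data = gzip.compress(data)
    return base64.b64encode(data).decode("ascii")


def decode_guestinfo(value, encoding):
    """Reverse encode_guestinfo; a blank encoding means plain text."""
    encoding = (encoding or "").strip()
    if not encoding:
        return value
    if encoding not in ("base64", "gzip+base64"):
        raise ValueError("unsupported encoding: %r" % encoding)
    data = base64.b64decode(value)
    if encoding == "gzip+base64":
        data = gzip.decompress(data)
    return data.decode("utf-8")
```

A value produced with `encode_guestinfo(doc, "gzip+base64")` corresponds to setting ``guestinfo.metadata.encoding`` to ``gzip+base64``.
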
c36ff1
+
c36ff1
+Features
c36ff1
+--------
c36ff1
+
c36ff1
+This section reviews several features available in this datasource, regardless
c36ff1
+of how the meta, user, and vendor data was discovered.
c36ff1
+
c36ff1
+Instance data and lazy networks
c36ff1
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+One of the hallmarks of cloud-init is `its use of instance-data and JINJA
c36ff1
+queries <../instancedata.html#using-instance-data>`_
c36ff1
+-- the ability to write queries in user and vendor data that reference runtime
c36ff1
+information present in ``/run/cloud-init/instance-data.json``. This works well
c36ff1
+when the metadata provides all of the information up front, such as the network
c36ff1
+configuration. For systems that rely on DHCP, however, this information may not
c36ff1
+be available when the metadata is persisted to disk.
c36ff1
+
c36ff1
+This datasource ensures that even if the instance is using DHCP to configure
c36ff1
+networking, the same details about the configured network are available in
c36ff1
+``/run/cloud-init/instance-data.json`` as if static networking was used. This
c36ff1
+information collected at runtime is easy to demonstrate by executing the
c36ff1
+datasource on the command line. From the root of this repository, run the
c36ff1
+following command:
c36ff1
+
c36ff1
+.. code-block:: bash
c36ff1
+
c36ff1
+   PYTHONPATH="$(pwd)" python3 cloudinit/sources/DataSourceVMware.py
c36ff1
+
c36ff1
+The above command will result in output similar to the below JSON:
c36ff1
+
c36ff1
+.. code-block:: json
c36ff1
+
c36ff1
+   {
c36ff1
+       "hostname": "akutz.localhost",
c36ff1
+       "local-hostname": "akutz.localhost",
c36ff1
+       "local-ipv4": "192.168.0.188",
c36ff1
+       "local_hostname": "akutz.localhost",
c36ff1
+       "network": {
c36ff1
+           "config": {
c36ff1
+               "dhcp": true
c36ff1
+           },
c36ff1
+           "interfaces": {
c36ff1
+               "by-ipv4": {
c36ff1
+                   "172.0.0.2": {
c36ff1
+                       "netmask": "255.255.255.255",
c36ff1
+                       "peer": "172.0.0.2"
c36ff1
+                   },
c36ff1
+                   "192.168.0.188": {
c36ff1
+                       "broadcast": "192.168.0.255",
c36ff1
+                       "mac": "64:4b:f0:18:9a:21",
c36ff1
+                       "netmask": "255.255.255.0"
c36ff1
+                   }
c36ff1
+               },
c36ff1
+               "by-ipv6": {
c36ff1
+                   "fd8e:d25e:c5b6:1:1f5:b2fd:8973:22f2": {
c36ff1
+                       "flags": 208,
c36ff1
+                       "mac": "64:4b:f0:18:9a:21",
c36ff1
+                       "netmask": "ffff:ffff:ffff:ffff::/64"
c36ff1
+                   }
c36ff1
+               },
c36ff1
+               "by-mac": {
c36ff1
+                   "64:4b:f0:18:9a:21": {
c36ff1
+                       "ipv4": [
c36ff1
+                           {
c36ff1
+                               "addr": "192.168.0.188",
c36ff1
+                               "broadcast": "192.168.0.255",
c36ff1
+                               "netmask": "255.255.255.0"
c36ff1
+                           }
c36ff1
+                       ],
c36ff1
+                       "ipv6": [
c36ff1
+                           {
c36ff1
+                               "addr": "fd8e:d25e:c5b6:1:1f5:b2fd:8973:22f2",
c36ff1
+                               "flags": 208,
c36ff1
+                               "netmask": "ffff:ffff:ffff:ffff::/64"
c36ff1
+                           }
c36ff1
+                       ]
c36ff1
+                   },
c36ff1
+                   "ac:de:48:00:11:22": {
c36ff1
+                       "ipv6": []
c36ff1
+                   }
c36ff1
+               }
c36ff1
+           }
c36ff1
+       },
c36ff1
+       "wait-on-network": {
c36ff1
+           "ipv4": true,
c36ff1
+           "ipv6": "false"
c36ff1
+       }
c36ff1
+   }
c36ff1
+
c36ff1
+
c36ff1
+Redacting sensitive information
c36ff1
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+Sometimes the cloud-init userdata might contain sensitive information, and it
c36ff1
+may be desirable to have the ``guestinfo.userdata`` key (or other guestinfo
c36ff1
+keys) redacted as soon as its data is read by the datasource. This is possible
c36ff1
+by adding the following to the metadata:
c36ff1
+
c36ff1
+.. code-block:: yaml
c36ff1
+
c36ff1
+   redact: # formerly named cleanup-guestinfo, which will also work
c36ff1
+   - userdata
c36ff1
+   - vendordata
c36ff1
+
c36ff1
+When the above snippet is added to the metadata, the datasource will iterate
c36ff1
+over the elements in the ``redact`` array and clear each of the keys. For
c36ff1
+example, when the guestinfo transport is used, the above snippet will cause
c36ff1
+the following commands to be executed:
c36ff1
+
c36ff1
+.. code-block:: shell
c36ff1
+
c36ff1
+   vmware-rpctool "info-set guestinfo.userdata ---"
c36ff1
+   vmware-rpctool "info-set guestinfo.userdata.encoding  "
c36ff1
+   vmware-rpctool "info-set guestinfo.vendordata ---"
c36ff1
+   vmware-rpctool "info-set guestinfo.vendordata.encoding  "
c36ff1
+
c36ff1
+Please note that keys are set to the valid YAML string ``---`` as it is not
c36ff1
+possible to remove an existing key from the guestinfo key-space. A key's analogous
c36ff1
+encoding property will be set to a single white-space character, causing the
c36ff1
+datasource to treat the actual key value as plain-text, thereby loading it as
c36ff1
+an empty YAML doc (hence the aforementioned ``---``\ ).
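
The mapping from a ``redact`` list to the commands shown above can be sketched in a few lines of Python. This is an illustration only, not part of the patch; `redact_payloads` is a hypothetical helper that builds the argument strings and does not invoke ``vmware-rpctool``.

```python
def redact_payloads(keys):
    """Build the info-set payloads that clear each guestinfo key.

    Each key is overwritten with the empty YAML document marker, and its
    encoding property is set to a single space so the value is read as
    plain text.
    """
    payloads = []
    for key in keys:
        name = "guestinfo." + key
        payloads.append("info-set %s %s" % (name, "---"))
        payloads.append("info-set %s.encoding %s" % (name, " "))
    return payloads
```

Calling `redact_payloads(["userdata", "vendordata"])` yields the four argument strings listed in the shell example, in the same order.
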
c36ff1
+
c36ff1
+Reading the local IP addresses
c36ff1
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+This datasource automatically discovers the local IPv4 and IPv6 addresses for
c36ff1
+a guest operating system based on the default routes. However, when inspecting
c36ff1
+a VM externally, it's not possible to know what the *default* IP address is for
c36ff1
+the guest OS. That's why this datasource sets the discovered, local IPv4 and
c36ff1
+IPv6 addresses back in the guestinfo namespace as the following keys:
c36ff1
+
c36ff1
+
c36ff1
+* ``guestinfo.local-ipv4``
c36ff1
+* ``guestinfo.local-ipv6``
c36ff1
+
c36ff1
+It is possible that a host may not have any default, local IP addresses. It's
c36ff1
+also possible the reported, local addresses are link-local addresses. But these
c36ff1
+two keys may be used to discover what this datasource determined were the local
c36ff1
+IPv4 and IPv6 addresses for a host.
c36ff1
+
c36ff1
+Waiting on the network
c36ff1
+^^^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+Sometimes cloud-init may bring up the network, but it will not finish coming
c36ff1
+online before the datasource's ``setup`` function is called, resulting in an
c36ff1
+``/var/run/cloud-init/instance-data.json`` file that does not have the correct
c36ff1
+network information. It is possible to instruct the datasource to wait until an
c36ff1
+IPv4 or IPv6 address is available before writing the instance data with the
c36ff1
+following metadata properties:
c36ff1
+
c36ff1
+.. code-block:: yaml
c36ff1
+
c36ff1
+   wait-on-network:
c36ff1
+     ipv4: true
c36ff1
+     ipv6: true
c36ff1
+
c36ff1
+If either of the above values is true, the datasource checks for an address
+from each requested family, sleeps for a second when one is still missing, and
+repeats until every requested family has at least one address.
c36ff1
+
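
The polling behavior described in this section can be sketched as follows. This is a simplified illustration, not the patch's `wait_on_network` function. `probe` is a hypothetical callable that returns the currently known IPv4 and IPv6 addresses, and `sleep` is injectable so the loop can be exercised without real delays.

```python
import time


def wait_for_addresses(probe, want_ipv4, want_ipv6, sleep=time.sleep):
    """Poll probe() until every requested address family has an address.

    probe is a hypothetical callable returning (ipv4_addrs, ipv6_addrs).
    """
    while True:
        ipv4_addrs, ipv6_addrs = probe()
        # A family that was not requested never blocks the loop.
        ipv4_ready = bool(ipv4_addrs) or not want_ipv4
        ipv6_ready = bool(ipv6_addrs) or not want_ipv6
        if ipv4_ready and ipv6_ready:
            return ipv4_addrs, ipv6_addrs
        sleep(1)
```

Both readiness flags are computed on every pass, so they are always defined when the loop decides whether to sleep again.
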
c36ff1
+Walkthrough
c36ff1
+-----------
c36ff1
+
c36ff1
+The following series of steps demonstrates how to configure a VM with
c36ff1
+this datasource:
c36ff1
+
c36ff1
+
c36ff1
+#. Create the metadata file for the VM. Save the following YAML to a file named
c36ff1
+   ``metadata.yaml``\ :
c36ff1
+
c36ff1
+   .. code-block:: yaml
c36ff1
+
c36ff1
+       instance-id: cloud-vm
c36ff1
+       local-hostname: cloud-vm
c36ff1
+       network:
c36ff1
+         version: 2
c36ff1
+         ethernets:
c36ff1
+           nics:
c36ff1
+             match:
c36ff1
+               name: ens*
c36ff1
+             dhcp4: yes
c36ff1
+
c36ff1
+#. Create the userdata file ``userdata.yaml``\ :
c36ff1
+
c36ff1
+   .. code-block:: yaml
c36ff1
+
c36ff1
+       #cloud-config
c36ff1
+
c36ff1
+       users:
c36ff1
+       - default
c36ff1
+       - name: akutz
c36ff1
+         primary_group: akutz
+         sudo: ALL=(ALL) NOPASSWD:ALL
+         groups: sudo, wheel
+         ssh_import_id: None
+         lock_passwd: true
+         ssh_authorized_keys:
c36ff1
+           - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDE0c5FczvcGSh/tG4iw+Fhfi/O5/EvUM/96js65tly4++YTXK1d9jcznPS5ruDlbIZ30oveCBd3kT8LLVFwzh6hepYTf0YmCTpF4eDunyqmpCXDvVscQYRXyasEm5olGmVe05RrCJSeSShAeptv4ueIn40kZKOghinGWLDSZG4+FFfgrmcMCpx5YSCtX2gvnEYZJr0czt4rxOZuuP7PkJKgC/mt2PcPjooeX00vAj81jjU2f3XKrjjz2u2+KIt9eba+vOQ6HiC8c2IzRkUAJ5i1atLy8RIbejo23+0P4N2jjk17QySFOVHwPBDTYb0/0M/4ideeU74EN/CgVsvO6JrLsPBR4dojkV5qNbMNxIVv5cUwIy2ThlLgqpNCeFIDLCWNZEFKlEuNeSQ2mPtIO7ETxEL2Cz5y/7AIuildzYMc6wi2bofRC8HmQ7rMXRWdwLKWsR0L7SKjHblIwarxOGqLnUI+k2E71YoP7SZSlxaKi17pqkr0OMCF+kKqvcvHAQuwGqyumTEWOlH6TCx1dSPrW+pVCZSHSJtSTfDW2uzL6y8k10MT06+pVunSrWo5LHAXcS91htHV1M1UrH/tZKSpjYtjMb5+RonfhaFRNzvj7cCE1f3Kp8UVqAdcGBTtReoE8eRUT63qIxjw03a7VwAyB2w+9cu1R9/vAo8SBeRqw== sakutz@gmail.com
c36ff1
+
c36ff1
+#. Please note this step requires that the VM be powered off. All of the
c36ff1
+   commands below use the VMware CLI tool, `govc <https://github.com/vmware/govmomi/blob/master/govc>`_.
c36ff1
+
c36ff1
+   Go ahead and assign the path to the VM to the environment variable ``VM``\ :
c36ff1
+
c36ff1
+   .. code-block:: shell
c36ff1
+
c36ff1
+      export VM="/inventory/path/to/the/vm"
c36ff1
+
c36ff1
+#. Power off the VM:
c36ff1
+
c36ff1
+   .. raw:: html
c36ff1
+
c36ff1
+      
c36ff1
+
c36ff1
+      ⚠️ First Boot Mode
c36ff1
+
c36ff1
+   To ensure the next power-on operation results in a first-boot scenario for
c36ff1
+   cloud-init, it may be necessary to run the following command just before
c36ff1
+   powering off the VM:
c36ff1
+
c36ff1
+   .. code-block:: bash
c36ff1
+
c36ff1
+      cloud-init clean
c36ff1
+
c36ff1
+   Otherwise cloud-init may not run in first-boot mode. For more information
c36ff1
+   on how the boot mode is determined, please see the
c36ff1
+   `First Boot Documentation <../boot.html#first-boot-determination>`_.
c36ff1
+
c36ff1
+   .. raw:: html
c36ff1
+
c36ff1
+      
c36ff1
+
c36ff1
+   .. code-block:: shell
c36ff1
+
c36ff1
+      govc vm.power -off "${VM}"
c36ff1
+
c36ff1
+#.
c36ff1
+   Export the environment variables that contain the cloud-init metadata and
c36ff1
+   userdata:
c36ff1
+
c36ff1
+   .. code-block:: shell
c36ff1
+
c36ff1
+      export METADATA=$(gzip -c9 <metadata.yaml | { base64 -w0 2>/dev/null || base64; }) \
c36ff1
+           USERDATA=$(gzip -c9 <userdata.yaml | { base64 -w0 2>/dev/null || base64; })
c36ff1
+
c36ff1
+#.
c36ff1
+   Assign the metadata and userdata to the VM:
c36ff1
+
c36ff1
+   .. code-block:: shell
c36ff1
+
c36ff1
+       govc vm.change -vm "${VM}" \
c36ff1
+       -e guestinfo.metadata="${METADATA}" \
c36ff1
+       -e guestinfo.metadata.encoding="gzip+base64" \
c36ff1
+       -e guestinfo.userdata="${USERDATA}" \
c36ff1
+       -e guestinfo.userdata.encoding="gzip+base64"
c36ff1
+
c36ff1
+   Please note the above commands include specifying the encoding for the
c36ff1
+   properties. This is important as it informs the datasource how to decode
c36ff1
+   the data for cloud-init. Valid values for ``metadata.encoding`` and
c36ff1
+   ``userdata.encoding`` include:
c36ff1
+
c36ff1
+
c36ff1
+   * ``base64``
c36ff1
+   * ``gzip+base64``
c36ff1
+
c36ff1
+#.
c36ff1
+   Power on the VM:
c36ff1
+
c36ff1
+   .. code-block:: shell
c36ff1
+
c36ff1
+       govc vm.power -vm "${VM}" -on
c36ff1
+
c36ff1
+If all went according to plan, the CentOS box is:
c36ff1
+
c36ff1
+* Locked down, allowing SSH access only for the user in the userdata
c36ff1
+* Configured for a dynamic IP address via DHCP
c36ff1
+* Has a hostname of ``cloud-vm``
c36ff1
+
c36ff1
+Examples
c36ff1
+--------
c36ff1
+
c36ff1
+This section reviews common configurations:
c36ff1
+
c36ff1
+Setting the hostname
c36ff1
+^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+The hostname is set by way of the metadata key ``local-hostname``.
c36ff1
+
c36ff1
+Setting the instance ID
c36ff1
+^^^^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+The instance ID may be set by way of the metadata key ``instance-id``. However,
c36ff1
+if this value is absent then the instance ID is read from the file
c36ff1
+``/sys/class/dmi/id/product_uuid``.
c36ff1
+
c36ff1
+Providing public SSH keys
c36ff1
+^^^^^^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+The public SSH keys may be set by way of the metadata key ``public-keys-data``.
c36ff1
+Each newline-terminated string will be interpreted as a separate SSH public
c36ff1
+key, which will be placed in the distro's default user's
c36ff1
+``~/.ssh/authorized_keys``. If the value is empty or absent, then nothing will
c36ff1
+be written to ``~/.ssh/authorized_keys``.
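
To make the "one key per newline-terminated string" rule concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the datasource's implementation: `split_public_keys` is a hypothetical helper, and skipping blank lines is an assumption about how empty entries would be handled.

```python
def split_public_keys(public_keys_data):
    """Split a public-keys-data value into individual key strings."""
    if not public_keys_data:
        # An empty or absent value yields no keys.
        return []
    return [
        line.strip()
        for line in public_keys_data.splitlines()
        if line.strip()
    ]
```
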
c36ff1
+
c36ff1
+Configuring the network
c36ff1
+^^^^^^^^^^^^^^^^^^^^^^^
c36ff1
+
c36ff1
+The network is configured by setting the metadata key ``network`` with a value
c36ff1
+consistent with Network Config Versions
c36ff1
+`1 <../network-config-format-v1.html>`_ or
c36ff1
+`2 <../network-config-format-v2.html>`_\ , depending on the Linux
c36ff1
+distro's version of cloud-init.
c36ff1
+
c36ff1
+The metadata key ``network.encoding`` may be used to indicate the format of
c36ff1
+the metadata key ``network``. Valid encodings are ``base64`` and ``gzip+base64``.
c36ff1
diff --git a/requirements.txt b/requirements.txt
c36ff1
index 5b8becd7..41d01d62 100644
c36ff1
--- a/requirements.txt
c36ff1
+++ b/requirements.txt
c36ff1
@@ -29,3 +29,15 @@ requests
c36ff1
 
c36ff1
 # For patching pieces of cloud-config together
c36ff1
 jsonpatch
c36ff1
+
c36ff1
+# For validating cloud-config sections per schema definitions
c36ff1
+jsonschema
c36ff1
+
c36ff1
+# Used by DataSourceVMware to inspect the host's network configuration during
c36ff1
+# the "setup()" function.
c36ff1
+#
c36ff1
+# This allows a host that uses DHCP to bring up the network during BootLocal
+# and still participate in instance-data by gathering the network details at
+# runtime, merging that information into the metadata, and repersisting it to
+# disk.
c36ff1
+netifaces>=0.10.9
c36ff1
diff --git a/tests/unittests/test_datasource/test_common.py b/tests/unittests/test_datasource/test_common.py
c36ff1
index 5912f7ee..475a2cf8 100644
c36ff1
--- a/tests/unittests/test_datasource/test_common.py
c36ff1
+++ b/tests/unittests/test_datasource/test_common.py
c36ff1
@@ -28,6 +28,7 @@ from cloudinit.sources import (
c36ff1
     DataSourceScaleway as Scaleway,
c36ff1
     DataSourceSmartOS as SmartOS,
c36ff1
     DataSourceUpCloud as UpCloud,
c36ff1
+    DataSourceVMware as VMware,
c36ff1
 )
c36ff1
 from cloudinit.sources import DataSourceNone as DSNone
c36ff1
 
c36ff1
@@ -50,6 +51,7 @@ DEFAULT_LOCAL = [
c36ff1
     RbxCloud.DataSourceRbxCloud,
c36ff1
     Scaleway.DataSourceScaleway,
c36ff1
     UpCloud.DataSourceUpCloudLocal,
c36ff1
+    VMware.DataSourceVMware,
c36ff1
 ]
c36ff1
 
c36ff1
 DEFAULT_NETWORK = [
c36ff1
@@ -66,6 +68,7 @@ DEFAULT_NETWORK = [
c36ff1
     OpenStack.DataSourceOpenStack,
c36ff1
     OVF.DataSourceOVFNet,
c36ff1
     UpCloud.DataSourceUpCloud,
c36ff1
+    VMware.DataSourceVMware,
c36ff1
 ]
c36ff1
 
c36ff1
 
c36ff1
diff --git a/tests/unittests/test_datasource/test_vmware.py b/tests/unittests/test_datasource/test_vmware.py
c36ff1
new file mode 100644
c36ff1
index 00000000..597db7c8
c36ff1
--- /dev/null
c36ff1
+++ b/tests/unittests/test_datasource/test_vmware.py
c36ff1
@@ -0,0 +1,377 @@
c36ff1
+# Copyright (c) 2021 VMware, Inc. All Rights Reserved.
c36ff1
+#
c36ff1
+# Authors: Andrew Kutz <akutz@vmware.com>
c36ff1
+#
c36ff1
+# This file is part of cloud-init. See LICENSE file for license information.
c36ff1
+
c36ff1
+import base64
c36ff1
+import gzip
c36ff1
+from cloudinit import dmi, helpers, safeyaml
c36ff1
+from cloudinit import settings
c36ff1
+from cloudinit.sources import DataSourceVMware
c36ff1
+from cloudinit.tests.helpers import (
c36ff1
+    mock,
c36ff1
+    CiTestCase,
c36ff1
+    FilesystemMockingTestCase,
c36ff1
+    populate_dir,
c36ff1
+)
c36ff1
+
c36ff1
+import os
c36ff1
+
c36ff1
+PRODUCT_NAME_FILE_PATH = "/sys/class/dmi/id/product_name"
c36ff1
+PRODUCT_NAME = "VMware7,1"
c36ff1
+PRODUCT_UUID = "82343CED-E4C7-423B-8F6B-0D34D19067AB"
c36ff1
+REROOT_FILES = {
c36ff1
+    DataSourceVMware.PRODUCT_UUID_FILE_PATH: PRODUCT_UUID,
c36ff1
+    PRODUCT_NAME_FILE_PATH: PRODUCT_NAME,
c36ff1
+}
c36ff1
+
c36ff1
+VMW_MULTIPLE_KEYS = [
c36ff1
+    "ssh-rsa AAAAB3NzaC1yc2EAAAA... test1@vmw.com",
c36ff1
+    "ssh-rsa AAAAB3NzaC1yc2EAAAA... test2@vmw.com",
c36ff1
+]
c36ff1
+VMW_SINGLE_KEY = "ssh-rsa AAAAB3NzaC1yc2EAAAA... test@vmw.com"
c36ff1
+
c36ff1
+VMW_METADATA_YAML = """instance-id: cloud-vm
c36ff1
+local-hostname: cloud-vm
c36ff1
+network:
c36ff1
+  version: 2
c36ff1
+  ethernets:
c36ff1
+    nics:
c36ff1
+      match:
c36ff1
+        name: ens*
c36ff1
+      dhcp4: yes
c36ff1
+"""
c36ff1
+
c36ff1
+VMW_USERDATA_YAML = """## template: jinja
c36ff1
+#cloud-config
c36ff1
+users:
c36ff1
+- default
c36ff1
+"""
c36ff1
+
c36ff1
+VMW_VENDORDATA_YAML = """## template: jinja
c36ff1
+#cloud-config
c36ff1
+runcmd:
c36ff1
+- echo "Hello, world."
c36ff1
+"""
c36ff1
+
c36ff1
+
c36ff1
+class TestDataSourceVMware(CiTestCase):
c36ff1
+    """
c36ff1
+    Test common functionality that is not transport specific.
c36ff1
+    """
c36ff1
+
c36ff1
+    def setUp(self):
c36ff1
+        super(TestDataSourceVMware, self).setUp()
c36ff1
+        self.tmp = self.tmp_dir()
c36ff1
+
c36ff1
+    def test_no_data_access_method(self):
c36ff1
+        ds = get_ds(self.tmp)
c36ff1
+        ds.vmware_rpctool = None
c36ff1
+        ret = ds.get_data()
c36ff1
+        self.assertFalse(ret)
c36ff1
+
c36ff1
+    def test_get_host_info(self):
c36ff1
+        host_info = DataSourceVMware.get_host_info()
c36ff1
+        self.assertTrue(host_info)
c36ff1
+        self.assertTrue(host_info["hostname"])
c36ff1
+        self.assertTrue(host_info["local-hostname"])
c36ff1
+        self.assertTrue(host_info["local_hostname"])
c36ff1
+        self.assertTrue(host_info[DataSourceVMware.LOCAL_IPV4])
c36ff1
+
c36ff1
+
c36ff1
+class TestDataSourceVMwareEnvVars(FilesystemMockingTestCase):
+    """
+    Test the envvar transport.
+    """
+
+    def setUp(self):
+        super(TestDataSourceVMwareEnvVars, self).setUp()
+        self.tmp = self.tmp_dir()
+        os.environ[DataSourceVMware.VMX_GUESTINFO] = "1"
+        self.create_system_files()
+
+    def tearDown(self):
+        del os.environ[DataSourceVMware.VMX_GUESTINFO]
+        return super(TestDataSourceVMwareEnvVars, self).tearDown()
+
+    def create_system_files(self):
+        rootd = self.tmp_dir()
+        populate_dir(
+            rootd,
+            {
+                DataSourceVMware.PRODUCT_UUID_FILE_PATH: PRODUCT_UUID,
+            },
+        )
+        self.assertTrue(self.reRoot(rootd))
+
+    def assert_get_data_ok(self, m_fn, m_fn_call_count=6):
+        ds = get_ds(self.tmp)
+        ds.vmware_rpctool = None
+        ret = ds.get_data()
+        self.assertTrue(ret)
+        self.assertEqual(m_fn_call_count, m_fn.call_count)
+        self.assertEqual(
+            ds.data_access_method, DataSourceVMware.DATA_ACCESS_METHOD_ENVVAR
+        )
+        return ds
+
+    def assert_metadata(self, metadata, m_fn, m_fn_call_count=6):
+        ds = self.assert_get_data_ok(m_fn, m_fn_call_count)
+        assert_metadata(self, ds, metadata)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_subplatform(self, m_fn):
+        m_fn.side_effect = [VMW_METADATA_YAML, "", "", "", "", ""]
+        ds = self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+        self.assertEqual(
+            ds.subplatform,
+            "%s (%s)"
+            % (
+                DataSourceVMware.DATA_ACCESS_METHOD_ENVVAR,
+                DataSourceVMware.get_guestinfo_envvar_key_name("metadata"),
+            ),
+        )
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_data_metadata_only(self, m_fn):
+        m_fn.side_effect = [VMW_METADATA_YAML, "", "", "", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_data_userdata_only(self, m_fn):
+        m_fn.side_effect = ["", VMW_USERDATA_YAML, "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_data_vendordata_only(self, m_fn):
+        m_fn.side_effect = ["", "", VMW_VENDORDATA_YAML, ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_data_metadata_base64(self, m_fn):
+        data = base64.b64encode(VMW_METADATA_YAML.encode("utf-8"))
+        m_fn.side_effect = [data, "base64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_data_metadata_b64(self, m_fn):
+        data = base64.b64encode(VMW_METADATA_YAML.encode("utf-8"))
+        m_fn.side_effect = [data, "b64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_data_metadata_gzip_base64(self, m_fn):
+        data = VMW_METADATA_YAML.encode("utf-8")
+        data = gzip.compress(data)
+        data = base64.b64encode(data)
+        m_fn.side_effect = [data, "gzip+base64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_get_data_metadata_gz_b64(self, m_fn):
+        data = VMW_METADATA_YAML.encode("utf-8")
+        data = gzip.compress(data)
+        data = base64.b64encode(data)
+        m_fn.side_effect = [data, "gz+b64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_metadata_single_ssh_key(self, m_fn):
+        metadata = DataSourceVMware.load_json_or_yaml(VMW_METADATA_YAML)
+        metadata["public_keys"] = VMW_SINGLE_KEY
+        metadata_yaml = safeyaml.dumps(metadata)
+        m_fn.side_effect = [metadata_yaml, "", "", ""]
+        self.assert_metadata(metadata, m_fn, m_fn_call_count=4)
+
+    @mock.patch(
+        "cloudinit.sources.DataSourceVMware.guestinfo_envvar_get_value"
+    )
+    def test_metadata_multiple_ssh_keys(self, m_fn):
+        metadata = DataSourceVMware.load_json_or_yaml(VMW_METADATA_YAML)
+        metadata["public_keys"] = VMW_MULTIPLE_KEYS
+        metadata_yaml = safeyaml.dumps(metadata)
+        m_fn.side_effect = [metadata_yaml, "", "", ""]
+        self.assert_metadata(metadata, m_fn, m_fn_call_count=4)
+
+
+class TestDataSourceVMwareGuestInfo(FilesystemMockingTestCase):
+    """
+    Test the guestinfo transport on a VMware platform.
+    """
+
+    def setUp(self):
+        super(TestDataSourceVMwareGuestInfo, self).setUp()
+        self.tmp = self.tmp_dir()
+        self.create_system_files()
+
+    def create_system_files(self):
+        rootd = self.tmp_dir()
+        populate_dir(
+            rootd,
+            {
+                DataSourceVMware.PRODUCT_UUID_FILE_PATH: PRODUCT_UUID,
+                PRODUCT_NAME_FILE_PATH: PRODUCT_NAME,
+            },
+        )
+        self.assertTrue(self.reRoot(rootd))
+
+    def assert_get_data_ok(self, m_fn, m_fn_call_count=6):
+        ds = get_ds(self.tmp)
+        ds.vmware_rpctool = "vmware-rpctool"
+        ret = ds.get_data()
+        self.assertTrue(ret)
+        self.assertEqual(m_fn_call_count, m_fn.call_count)
+        self.assertEqual(
+            ds.data_access_method,
+            DataSourceVMware.DATA_ACCESS_METHOD_GUESTINFO,
+        )
+        return ds
+
+    def assert_metadata(self, metadata, m_fn, m_fn_call_count=6):
+        ds = self.assert_get_data_ok(m_fn, m_fn_call_count)
+        assert_metadata(self, ds, metadata)
+
+    def test_ds_valid_on_vmware_platform(self):
+        system_type = dmi.read_dmi_data("system-product-name")
+        self.assertEqual(system_type, PRODUCT_NAME)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_get_subplatform(self, m_fn):
+        m_fn.side_effect = [VMW_METADATA_YAML, "", "", "", "", ""]
+        ds = self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+        self.assertEqual(
+            ds.subplatform,
+            "%s (%s)"
+            % (
+                DataSourceVMware.DATA_ACCESS_METHOD_GUESTINFO,
+                DataSourceVMware.get_guestinfo_key_name("metadata"),
+            ),
+        )
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_get_data_userdata_only(self, m_fn):
+        m_fn.side_effect = ["", VMW_USERDATA_YAML, "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_get_data_vendordata_only(self, m_fn):
+        m_fn.side_effect = ["", "", VMW_VENDORDATA_YAML, ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_metadata_single_ssh_key(self, m_fn):
+        metadata = DataSourceVMware.load_json_or_yaml(VMW_METADATA_YAML)
+        metadata["public_keys"] = VMW_SINGLE_KEY
+        metadata_yaml = safeyaml.dumps(metadata)
+        m_fn.side_effect = [metadata_yaml, "", "", ""]
+        self.assert_metadata(metadata, m_fn, m_fn_call_count=4)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_metadata_multiple_ssh_keys(self, m_fn):
+        metadata = DataSourceVMware.load_json_or_yaml(VMW_METADATA_YAML)
+        metadata["public_keys"] = VMW_MULTIPLE_KEYS
+        metadata_yaml = safeyaml.dumps(metadata)
+        m_fn.side_effect = [metadata_yaml, "", "", ""]
+        self.assert_metadata(metadata, m_fn, m_fn_call_count=4)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_get_data_metadata_base64(self, m_fn):
+        data = base64.b64encode(VMW_METADATA_YAML.encode("utf-8"))
+        m_fn.side_effect = [data, "base64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_get_data_metadata_b64(self, m_fn):
+        data = base64.b64encode(VMW_METADATA_YAML.encode("utf-8"))
+        m_fn.side_effect = [data, "b64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_get_data_metadata_gzip_base64(self, m_fn):
+        data = VMW_METADATA_YAML.encode("utf-8")
+        data = gzip.compress(data)
+        data = base64.b64encode(data)
+        m_fn.side_effect = [data, "gzip+base64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_get_data_metadata_gz_b64(self, m_fn):
+        data = VMW_METADATA_YAML.encode("utf-8")
+        data = gzip.compress(data)
+        data = base64.b64encode(data)
+        m_fn.side_effect = [data, "gz+b64", "", ""]
+        self.assert_get_data_ok(m_fn, m_fn_call_count=4)
+
+
+class TestDataSourceVMwareGuestInfo_InvalidPlatform(FilesystemMockingTestCase):
+    """
+    Test the guestinfo transport on a non-VMware platform.
+    """
+
+    def setUp(self):
+        super(TestDataSourceVMwareGuestInfo_InvalidPlatform, self).setUp()
+        self.tmp = self.tmp_dir()
+        self.create_system_files()
+
+    def create_system_files(self):
+        rootd = self.tmp_dir()
+        populate_dir(
+            rootd,
+            {
+                DataSourceVMware.PRODUCT_UUID_FILE_PATH: PRODUCT_UUID,
+            },
+        )
+        self.assertTrue(self.reRoot(rootd))
+
+    @mock.patch("cloudinit.sources.DataSourceVMware.guestinfo_get_value")
+    def test_ds_invalid_on_non_vmware_platform(self, m_fn):
+        system_type = dmi.read_dmi_data("system-product-name")
+        self.assertEqual(system_type, None)
+
+        m_fn.side_effect = [VMW_METADATA_YAML, "", "", "", "", ""]
+        ds = get_ds(self.tmp)
+        ds.vmware_rpctool = "vmware-rpctool"
+        ret = ds.get_data()
+        self.assertFalse(ret)
+
+
+def assert_metadata(test_obj, ds, metadata):
+    test_obj.assertEqual(metadata.get("instance-id"), ds.get_instance_id())
+    test_obj.assertEqual(metadata.get("local-hostname"), ds.get_hostname())
+
+    expected_public_keys = metadata.get("public_keys")
+    if not isinstance(expected_public_keys, list):
+        expected_public_keys = [expected_public_keys]
+
+    test_obj.assertEqual(expected_public_keys, ds.get_public_ssh_keys())
+    test_obj.assertIsInstance(ds.get_public_ssh_keys(), list)
+
+
+def get_ds(temp_dir):
+    ds = DataSourceVMware.DataSourceVMware(
+        settings.CFG_BUILTIN, None, helpers.Paths({"run_dir": temp_dir})
+    )
+    ds.vmware_rpctool = "vmware-rpctool"
+    return ds
+
+
+# vi: ts=4 expandtab
diff --git a/tests/unittests/test_ds_identify.py b/tests/unittests/test_ds_identify.py
index 1d8aaf18..8617d7bd 100644
--- a/tests/unittests/test_ds_identify.py
+++ b/tests/unittests/test_ds_identify.py
@@ -649,6 +649,50 @@ class TestDsIdentify(DsIdentifyBase):
         """EC2: bobrightbox.com in product_serial is not brightbox'"""
         self._test_ds_not_found('Ec2-E24Cloud-negative')
 
+    def test_vmware_no_valid_transports(self):
+        """VMware: no valid transports"""
+        self._test_ds_not_found('VMware-NoValidTransports')
+
+    def test_vmware_envvar_no_data(self):
+        """VMware: envvar transport no data"""
+        self._test_ds_not_found('VMware-EnvVar-NoData')
+
+    def test_vmware_envvar_no_virt_id(self):
+        """VMware: envvar transport success if no virt id"""
+        self._test_ds_found('VMware-EnvVar-NoVirtID')
+
+    def test_vmware_envvar_activated_by_metadata(self):
+        """VMware: envvar transport activated by metadata"""
+        self._test_ds_found('VMware-EnvVar-Metadata')
+
+    def test_vmware_envvar_activated_by_userdata(self):
+        """VMware: envvar transport activated by userdata"""
+        self._test_ds_found('VMware-EnvVar-Userdata')
+
+    def test_vmware_envvar_activated_by_vendordata(self):
+        """VMware: envvar transport activated by vendordata"""
+        self._test_ds_found('VMware-EnvVar-Vendordata')
+
+    def test_vmware_guestinfo_no_data(self):
+        """VMware: guestinfo transport no data"""
+        self._test_ds_not_found('VMware-GuestInfo-NoData')
+
+    def test_vmware_guestinfo_no_virt_id(self):
+        """VMware: guestinfo transport fails if no virt id"""
+        self._test_ds_not_found('VMware-GuestInfo-NoVirtID')
+
+    def test_vmware_guestinfo_activated_by_metadata(self):
+        """VMware: guestinfo transport activated by metadata"""
+        self._test_ds_found('VMware-GuestInfo-Metadata')
+
+    def test_vmware_guestinfo_activated_by_userdata(self):
+        """VMware: guestinfo transport activated by userdata"""
+        self._test_ds_found('VMware-GuestInfo-Userdata')
+
+    def test_vmware_guestinfo_activated_by_vendordata(self):
+        """VMware: guestinfo transport activated by vendordata"""
+        self._test_ds_found('VMware-GuestInfo-Vendordata')
+
 
 class TestBSDNoSys(DsIdentifyBase):
     """Test *BSD code paths
@@ -1136,7 +1180,240 @@ VALID_CFG = {
     'Ec2-E24Cloud-negative': {
         'ds': 'Ec2',
         'files': {P_SYS_VENDOR: 'e24cloudyday\n'},
-    }
+    },
+    'VMware-NoValidTransports': {
+        'ds': 'VMware',
+        'mocks': [
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-EnvVar-NoData': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_metadata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_vendordata',
+                'ret': 1,
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-EnvVar-NoVirtID': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_metadata',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_vendordata',
+                'ret': 1,
+            },
+        ],
+    },
+    'VMware-EnvVar-Metadata': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_metadata',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_vendordata',
+                'ret': 1,
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-EnvVar-Userdata': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_metadata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_userdata',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_vendordata',
+                'ret': 1,
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-EnvVar-Vendordata': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo',
+                'ret': 0,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_metadata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_has_envvar_vmx_guestinfo_vendordata',
+                'ret': 0,
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-GuestInfo-NoData': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_rpctool',
+                'ret': 0,
+                'out': '/usr/bin/vmware-rpctool',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_metadata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_vendordata',
+                'ret': 1,
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-GuestInfo-NoVirtID': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_rpctool',
+                'ret': 0,
+                'out': '/usr/bin/vmware-rpctool',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_metadata',
+                'ret': 0,
+                'out': '---',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_vendordata',
+                'ret': 1,
+            },
+        ],
+    },
+    'VMware-GuestInfo-Metadata': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_rpctool',
+                'ret': 0,
+                'out': '/usr/bin/vmware-rpctool',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_metadata',
+                'ret': 0,
+                'out': '---',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_vendordata',
+                'ret': 1,
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-GuestInfo-Userdata': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_rpctool',
+                'ret': 0,
+                'out': '/usr/bin/vmware-rpctool',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_metadata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_userdata',
+                'ret': 0,
+                'out': '---',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_vendordata',
+                'ret': 1,
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
+    'VMware-GuestInfo-Vendordata': {
+        'ds': 'VMware',
+        'mocks': [
+            {
+                'name': 'vmware_has_rpctool',
+                'ret': 0,
+                'out': '/usr/bin/vmware-rpctool',
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_metadata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_userdata',
+                'ret': 1,
+            },
+            {
+                'name': 'vmware_rpctool_guestinfo_vendordata',
+                'ret': 0,
+                'out': '---',
+            },
+            MOCK_VIRT_IS_VMWARE,
+        ],
+    },
 }
 
 # vi: ts=4 expandtab
diff --git a/tools/.github-cla-signers b/tools/.github-cla-signers
index 689d7902..cbfa883c 100644
--- a/tools/.github-cla-signers
+++ b/tools/.github-cla-signers
@@ -1,5 +1,6 @@
 ader1990
 ajmyyra
+akutz
 AlexBaranowski
 Aman306
 andrewbogott
diff --git a/tools/ds-identify b/tools/ds-identify
index 2f2486f7..c01eae3d 100755
--- a/tools/ds-identify
+++ b/tools/ds-identify
@@ -125,7 +125,7 @@ DI_DSNAME=""
 # be searched if there is no setting found in config.
 DI_DSLIST_DEFAULT="MAAS ConfigDrive NoCloud AltCloud Azure Bigstep \
 CloudSigma CloudStack DigitalOcean AliYun Ec2 GCE OpenNebula OpenStack \
-OVF SmartOS Scaleway Hetzner IBMCloud Oracle Exoscale RbxCloud UpCloud"
+OVF SmartOS Scaleway Hetzner IBMCloud Oracle Exoscale RbxCloud UpCloud VMware"
 DI_DSLIST=""
 DI_MODE=""
 DI_ON_FOUND=""
@@ -1350,6 +1350,80 @@ dscheck_IBMCloud() {
     return ${DS_NOT_FOUND}
 }
 
+vmware_has_envvar_vmx_guestinfo() {
+    [ -n "${VMX_GUESTINFO:-}" ]
+}
+
+vmware_has_envvar_vmx_guestinfo_metadata() {
+    [ -n "${VMX_GUESTINFO_METADATA:-}" ]
+}
+
+vmware_has_envvar_vmx_guestinfo_userdata() {
+    [ -n "${VMX_GUESTINFO_USERDATA:-}" ]
+}
+
+vmware_has_envvar_vmx_guestinfo_vendordata() {
+    [ -n "${VMX_GUESTINFO_VENDORDATA:-}" ]
+}
+
+vmware_has_rpctool() {
+    command -v vmware-rpctool >/dev/null 2>&1
+}
+
+vmware_rpctool_guestinfo_metadata() {
+    vmware-rpctool "info-get guestinfo.metadata"
+}
+
+vmware_rpctool_guestinfo_userdata() {
+    vmware-rpctool "info-get guestinfo.userdata"
+}
+
+vmware_rpctool_guestinfo_vendordata() {
+    vmware-rpctool "info-get guestinfo.vendordata"
+}
+
+dscheck_VMware() {
+    # Checks to see if there is valid data for the VMware datasource.
+    # The data transports are checked in the following order:
+    #
+    #   * envvars
+    #   * guestinfo
+    #
+    # Please note when updating this function with support for new data
+    # transports, the order should match the order in the _get_data
+    # function from the file DataSourceVMware.py.
+
+    # Check to see if running in a container and the VMware
+    # datasource is configured via environment variables.
+    if vmware_has_envvar_vmx_guestinfo; then
+        if vmware_has_envvar_vmx_guestinfo_metadata || \
+            vmware_has_envvar_vmx_guestinfo_userdata || \
+            vmware_has_envvar_vmx_guestinfo_vendordata; then
+            return "${DS_FOUND}"
+        fi
+    fi
+
+    # Do not proceed unless the detected platform is VMware.
+    if [ ! "${DI_VIRT}" = "vmware" ]; then
+        return "${DS_NOT_FOUND}"
+    fi
+
+    # Do not proceed if the vmware-rpctool command is not present.
+    if ! vmware_has_rpctool; then
+        return "${DS_NOT_FOUND}"
+    fi
+
+    # Activate the VMware datasource only if any of the fields used
+    # by the datasource are present in the guestinfo table.
+    if { vmware_rpctool_guestinfo_metadata || \
+         vmware_rpctool_guestinfo_userdata || \
+         vmware_rpctool_guestinfo_vendordata; } >/dev/null 2>&1; then
+        return "${DS_FOUND}"
+    fi
+
+    return "${DS_NOT_FOUND}"
+}
+
 collect_info() {
     read_uname_info
     read_virt
-- 
2.27.0
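
The encoding tests in this patch (`base64`, `b64`, `gzip+base64`, `gz+b64`) build their fixtures with `base64.b64encode` and `gzip.compress`. The following is a minimal sketch, not part of the patch, of how a payload in one of those encodings could be prepared and round-tripped locally. The helper names `encode_guestinfo_value` and `decode_guestinfo_value` are hypothetical and do not exist in cloud-init; only the four encoding labels are taken from the tests.

```python
import base64
import gzip

# Encoding labels exercised by the tests in this patch.
GZIP_LABELS = ("gzip+base64", "gz+b64")
BASE64_LABELS = ("base64", "b64")


def encode_guestinfo_value(text, encoding="gzip+base64"):
    """Hypothetical helper: return (value, encoding) as ASCII strings."""
    raw = text.encode("utf-8")
    if encoding in GZIP_LABELS:
        # Compress first, then base64-encode, as the gzip tests do.
        raw = gzip.compress(raw)
    elif encoding not in BASE64_LABELS:
        raise ValueError("unsupported encoding: %s" % encoding)
    return base64.b64encode(raw).decode("ascii"), encoding


def decode_guestinfo_value(value, encoding):
    """Hypothetical helper: reverse encode_guestinfo_value."""
    if encoding not in GZIP_LABELS + BASE64_LABELS:
        raise ValueError("unsupported encoding: %s" % encoding)
    raw = base64.b64decode(value)
    if encoding in GZIP_LABELS:
        raw = gzip.decompress(raw)
    return raw.decode("utf-8")
```

A round trip such as `decode_guestinfo_value(*encode_guestinfo_value(text))` should return the original text, which is a quick way to sanity-check a payload before it is stored in a guestinfo key or environment variable.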