From cfe170216accf60938ff4ea9440a4ac78b0bd83f Mon Sep 17 00:00:00 2001
Message-Id: <cfe170216accf60938ff4ea9440a4ac78b0bd83f@dist-git>
From: Dmytro Linkin <dlinkin@nvidia.com>
Date: Thu, 28 Jan 2021 23:17:29 -0500
Subject: [PATCH] util: Add phys_port_name support on virPCIGetNetName

virPCIGetNetName is used to get the name of the netdev associated with
a particular PCI device. This is used when we have a VF name, but need
the PF name in order to send a netlink command (e.g. in order to
get/set the MAC address of the VF).

In simple cases there is a single netdev associated with any PCI
device, so it is easy to figure out the PF netdev for a VF - just look
for the PCI device that has the VF listed in its "virtfns" directory;
the only name in the "net" subdirectory of that PCI device's sysfs
directory is the PF netdev that is upstream of the VF in question.

In some cases there can be more than one netdev in a PCI device's net
directory though. In the past, the only case of this was for SR-IOV
NICs that could have multiple PFs per PCI device. In this case, all
PF netdevs associated with a PCI address would be listed in the "net"
subdirectory of the PCI device's directory in sysfs. At the same time,
all VF netdevs and all PF netdevs have a phys_port_id in their sysfs,
so the way to learn the correct PF netdev for a particular VF netdev
is to search through the list of devices in the net subdirectory of
the PF's PCI device, looking for the one netdev with a "phys_port_id"
matching that of the VF netdev.

But starting in kernel 5.8, the NVIDIA Mellanox driver began linking
the VFs' representor netdevs to the PF PCI address [1], and so the VF
representor netdevs would also show up in the net
subdirectory. However, all of the devices that do so also only have a
single PF netdev for any given PCI address.

This means that the net directory of the PCI device can still hold
multiple net devices, but only one of them will be the PF netdev (the
others are VF representors):

$ ls '/sys/bus/pci/devices/0000:82:00.0/net'
ens1f0  eth0  eth1

In this case the way to find the PF device is to look at the
"phys_port_name" attribute of each netdev in sysfs. All PF devices
have a phys_port_name matching a particular regex

  (p[0-9]+$)|(p[0-9]+s[0-9]+$)

Since there can only be one PF in the entire list of devices, once we
match that regex, we've found the PF netdev.
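
As a rough illustration of the matching step described above, here is a
minimal standalone sketch. It is not the libvirt implementation: it uses
POSIX regcomp()/regexec() rather than libvirt's virStringMatch(), and
the struct netdev_entry type and find_pf_netdev() helper are
hypothetical names that stand in for reading each netdev's
phys_port_name from sysfs.

```c
#include <regex.h>
#include <stddef.h>

/* Regex quoted in the commit message for a PF's phys_port_name. */
#define PF_PHYS_PORT_NAME_REGEX "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"

/* Hypothetical in-memory stand-in for one entry of the "net" directory. */
struct netdev_entry {
    const char *name;            /* netdev name, e.g. "ens1f0" */
    const char *phys_port_name;  /* sysfs attribute value, or NULL if absent */
};

/* Return the name of the first entry whose phys_port_name matches the
 * PF regex, or NULL if none matches or the regex fails to compile. */
static const char *
find_pf_netdev(const struct netdev_entry *entries, size_t n)
{
    regex_t re;
    const char *found = NULL;
    size_t i;

    if (regcomp(&re, PF_PHYS_PORT_NAME_REGEX, REG_EXTENDED | REG_NOSUB) != 0)
        return NULL;

    for (i = 0; i < n; i++) {
        /* Entries without the attribute can never be identified as the PF. */
        if (!entries[i].phys_port_name)
            continue;

        /* At most one PF is expected, so stop at the first match. */
        if (regexec(&re, entries[i].phys_port_name, 0, NULL, 0) == 0) {
            found = entries[i].name;
            break;
        }
    }

    regfree(&re);
    return found;
}
```

Assuming, purely for illustration, that the three netdevs in the listing
above report phys_port_name values of "p0" (ens1f0), "pf0vf0" (eth0) and
"pf0vf1" (eth1), only "p0" matches the regex, so the sketch would pick
ens1f0 as the PF netdev.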

[1] - https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/
      commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1

Resolves: https://bugzilla.redhat.com/1918708
Co-Authored-by: Moshe Levi <moshele@nvidia.com>
Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
Reviewed-by: Adrian Chiris <adrianc@nvidia.com>
Reviewed-by: Laine Stump <laine@redhat.com>
(cherry picked from commit 5b1c525b1f3608156884aed0dc5e925306c1e260)

Conflicts: src/util/virpci.c - upstream all DIR* were converted to use
    g_autoptr, which permitted virPCIGetNetName() to be
    simplified. Unfortunately, backporting this refactor would require
    backporting an ever-ballooning set of patches, making the
    possibility of causing a regression a very real danger. Instead,
    one small refactor of virPCIGetNetName() that didn't affect any other
    functions was backported, and this patch (adding phys_port_name
    support) resolved the remaining conflicts by mimicking the current
    upstream version of the function, but with all "return 0" replaced
    by "ret = 0; goto cleanup;" and all "return -1" replaced by "goto
    cleanup;" (the code at cleanup: just closes the DIR* and returns
    the current value of ret). This will assure identical behavior to
    upstream.
Signed-off-by: Laine Stump <laine@redhat.com>
Message-Id: <20210129041729.1076345-4-laine@redhat.com>
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/util/virpci.c | 93 ++++++++++++++++++++++++++++-------------------
 src/util/virpci.h |  5 +++
 2 files changed, 61 insertions(+), 37 deletions(-)

diff --git a/src/util/virpci.c b/src/util/virpci.c
index 00377eed31..d5c038b7fe 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -2424,9 +2424,9 @@ virPCIDeviceAddressGetSysfsFile(virPCIDeviceAddressPtr addr,
  * virPCIGetNetName:
  * @device_link_sysfs_path: sysfs path to the PCI device
  * @idx: used to choose which netdev when there are several
- *       (ignored if physPortID is set)
+ *       (ignored if physPortID is set or physPortName is available)
  * @physPortID: match this string in the netdev's phys_port_id
- *       (or NULL to ignore and use idx instead)
+ *       (or NULL to ignore and use phys_port_name or idx instead)
  * @netname: used to return the name of the netdev
  *       (set to NULL (but returns success) if there is no netdev)
  *
@@ -2460,6 +2460,14 @@ virPCIGetNetName(const char *device_link_sysfs_path,
     }
 
     while (virDirRead(dir, &entry, pcidev_sysfs_net_path) > 0) {
+        /* save the first entry we find to use as a failsafe
+         * in case we don't match the phys_port_id. This is
+         * needed because some NIC drivers (e.g. i40e)
+         * implement phys_port_id for PFs, but not for VFs
+         */
+        if (!firstEntryName)
+            firstEntryName = g_strdup(entry->d_name);
+
         /* if the caller sent a physPortID, compare it to the
          * physportID of this netdev. If not, look for entry[idx].
          */
@@ -2470,50 +2478,61 @@ virPCIGetNetName(const char *device_link_sysfs_path,
                 goto cleanup;
 
             /* if this one doesn't match, keep looking */
-            if (STRNEQ_NULLABLE(physPortID, thisPhysPortID)) {
-                /* save the first entry we find to use as a failsafe
-                 * in case we don't match the phys_port_id. This is
-                 * needed because some NIC drivers (e.g. i40e)
-                 * implement phys_port_id for PFs, but not for VFs
-                 */
-                if (!firstEntryName)
-                    firstEntryName = g_strdup(entry->d_name);
-
+            if (STRNEQ_NULLABLE(physPortID, thisPhysPortID))
                 continue;
-            }
+
         } else {
-            if (i++ < idx)
-                continue;
-        }
+            /* Most switch devices use phys_port_name instead of
+             * phys_port_id.
+             * NOTE: VF representor net devices can be linked to the PF's
+             * PCI device, which means there can be multiple net device
+             * instances, and getting the right one requires matching
+             * phys_port_name against a specific regex.
+             * To get the PF netdev, for example, this regex is used:
+             * "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"
+             * and to get a specific VF's netdev, a regex such as:
+             * "pf0vf1$"
+             */
+            g_autofree char *thisPhysPortName = NULL;
 
-        *netname = g_strdup(entry->d_name);
+            if (virNetDevGetPhysPortName(entry->d_name, &thisPhysPortName) < 0)
+                goto cleanup;
 
-        ret = 0;
-        break;
-    }
+            if (thisPhysPortName) {
+
+                /* if this one doesn't match, keep looking */
+                if (!virStringMatch(thisPhysPortName, VIR_PF_PHYS_PORT_NAME_REGEX))
+                    continue;
 
-    if (ret < 0) {
-        if (physPortID) {
-            if (firstEntryName) {
-                /* we didn't match the provided phys_port_id, but this
-                 * is probably because phys_port_id isn't implemented
-                 * for this NIC driver, so just return the first
-                 * (probably only) netname we found.
-                 */
-                *netname = firstEntryName;
-                firstEntryName = NULL;
-                ret = 0;
             } else {
-                virReportError(VIR_ERR_INTERNAL_ERROR,
-                               _("Could not find network device with "
-                                 "phys_port_id '%s' under PCI device at %s"),
-                               physPortID, device_link_sysfs_path);
+
+                if (i++ < idx)
+                    continue;
             }
-        } else {
-            ret = 0; /* no netdev at the given index is *not* an error */
         }
+
+        *netname = g_strdup(entry->d_name);
+        ret = 0;
+        goto cleanup;
     }
- cleanup:
+
+    if (firstEntryName) {
+        /* we didn't match the provided phys_port_id / find a
+         * phys_port_name matching VIR_PF_PHYS_PORT_NAME_REGEX / find
+         * as many net devices as the value of idx, but this is
+         * probably because phys_port_id / phys_port_name isn't
+         * implemented for this NIC driver, so just return the first
+         * (probably only) netname we found.
+         */
+        *netname = g_steal_pointer(&firstEntryName);
+        ret = 0;
+        goto cleanup;
+    }
+
+    virReportError(VIR_ERR_INTERNAL_ERROR,
+                   _("Could not find any network device under PCI device at %s"),
+                   device_link_sysfs_path);
+cleanup:
     VIR_DIR_CLOSE(dir);
     return ret;
 }
diff --git a/src/util/virpci.h b/src/util/virpci.h
index f6796fc422..e47c766918 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -49,6 +49,11 @@ struct _virZPCIDeviceAddress {
 
 #define VIR_PCI_DEVICE_ADDRESS_FMT "%04x:%02x:%02x.%d"
 
+/* Represents the format of a PF's phys_port_name in switchdev mode:
+ * 'p%u' or 'p%us%u'. Newline is checked since the value is read from sysfs.
+ */
+#define VIR_PF_PHYS_PORT_NAME_REGEX  "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"
+
 struct _virPCIDeviceAddress {
     unsigned int domain;
     unsigned int bus;
-- 
2.30.0