From 5935958fc4eb9934b1493486a69f0f571e7da112 Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Fri, 30 Aug 2019 12:56:24 +0100
Subject: [PATCH 06/10] file-posix: Handle undetectable alignment

RH-Author: Thomas Huth <thuth@redhat.com>
Message-id: <20190830125628.23668-2-thuth@redhat.com>
Patchwork-id: 90209
O-Subject: [RHEL-8.1.0 qemu-kvm PATCH v2 1/5] file-posix: Handle undetectable alignment
Bugzilla: 1738839
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Max Reitz <mreitz@redhat.com>
RH-Acked-by: David Hildenbrand <david@redhat.com>

In some cases buf_align or request_alignment cannot be detected:

1. With Gluster, buf_align cannot be detected since the actual I/O is
   done on the Gluster server, and qemu buffer alignment does not
   matter. Since we don't have an alignment requirement, buf_align=1 is
   the best value.

2. With a local XFS filesystem, buf_align cannot be detected if reading
   from an unallocated area. In this case we must align the buffer, but
   we don't know the correct size. Using the wrong alignment results in
   an I/O error.

3. With Gluster backed by XFS, request_alignment cannot be detected if
   reading from an unallocated area. In this case we need to use the
   correct alignment, and failing to do so results in I/O errors.

4. With NFS, the server does not use direct I/O, so neither buf_align
   nor request_alignment can be detected. In this case we don't need
   any alignment, so we can use buf_align=1 and request_alignment=1.

These cases seem to work when the storage sector size is 512 bytes,
because the current code starts checking at align=512. If the check
succeeds because alignment cannot be detected, we use 512. But this does
not work for storage with a 4k sector size.

To determine if we can detect the alignment, we probe first with
align=1. If probing succeeds, either there is no alignment requirement
(cases 1, 4) or we are probing an unallocated area (cases 2, 3). Since
we don't have any way to tell, we treat this as undetectable alignment.
If probing with align=1 fails with EINVAL, but probing with one of the
expected alignments succeeds, we know that we found a working alignment.

Practically, the alignment requirements are the same for buffer
alignment, buffer length, and offset in file. So in case we cannot
detect buf_align, we can use request alignment. If we cannot detect
request alignment, we can fall back to a safe value. To use this logic,
we probe request alignment first instead of buf_align.

Here is a table showing the behaviour with current code (the value in
parentheses is the optimal value).

Case    Sector    buf_align (opt)   request_alignment (opt)     result

Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
---
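Note: the probing order described in the commit message can be sketched
outside of QEMU as below. This is only an illustration under stated
assumptions, not the code in block/file-posix.c: probe_fn,
first_working(), pick_request_alignment(), pick_buf_align() and
probe_multiple_of() are hypothetical names, and the toy probe stands in
for a real direct I/O read such as the one done by raw_is_io_aligned().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a direct I/O probe: returns true when a read
 * using the given alignment would succeed. */
typedef bool (*probe_fn)(size_t align, void *opaque);

/* Same candidate list as the patch: 1 first, then the expected sizes. */
static const size_t candidates[] = {1, 512, 1024, 2048, 4096};
#define N_CANDIDATES (sizeof(candidates) / sizeof(candidates[0]))

/* Return the first candidate that probes successfully, or 0 if none does
 * (in this sketch the caller is expected to treat 0 as an error). */
static size_t first_working(probe_fn probe, void *opaque)
{
    for (size_t i = 0; i < N_CANDIDATES; i++) {
        if (probe(candidates[i], opaque)) {
            return candidates[i];
        }
    }
    return 0;
}

/* Success at align=1 means "undetectable": fall back to the safe value. */
static size_t pick_request_alignment(probe_fn probe, void *opaque,
                                     size_t max_align)
{
    size_t align = first_working(probe, opaque);

    assert(max_align > 0);
    return (align == 1) ? max_align : align;
}

/* Success at align=1 means "undetectable": reuse request_alignment. */
static size_t pick_buf_align(probe_fn probe, void *opaque,
                             size_t request_alignment)
{
    size_t align = first_working(probe, opaque);

    return (align == 1) ? request_alignment : align;
}

/* Toy probe: models storage that only accepts multiples of *opaque. */
static bool probe_multiple_of(size_t align, void *opaque)
{
    size_t required = *(const size_t *)opaque;

    return align % required == 0;
}
```

With the toy probe, a required value of 1 models cases 1 and 4 (the
probe succeeds at align=1, so the fallback is used), while a required
value of 4096 models a detectable 4k requirement.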
 block/file-posix.c | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 4b404e4..84c5a31 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -324,6 +324,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
     BDRVRawState *s = bs->opaque;
     char *buf;
     size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
+    size_t alignments[] = {1, 512, 1024, 2048, 4096};
 
     /* For SCSI generic devices the alignment is not really used.
        With buffered I/O, we don't have any restrictions. */
@@ -350,25 +351,38 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
     }
 #endif
 
-    /* If we could not get the sizes so far, we can only guess them */
-    if (!s->buf_align) {
+    /*
+     * If we could not get the sizes so far, we can only guess them. First try
+     * to detect request alignment, since it is more likely to succeed. Then
+     * try to detect buf_align, which cannot be detected in some cases (e.g.
+     * Gluster). If buf_align cannot be detected, we fallback to the value of
+     * request_alignment.
+     */
+
+    if (!bs->bl.request_alignment) {
+        int i;
         size_t align;
-        buf = qemu_memalign(max_align, 2 * max_align);
-        for (align = 512; align <= max_align; align <<= 1) {
-            if (raw_is_io_aligned(fd, buf + align, max_align)) {
-                s->buf_align = align;
+        buf = qemu_memalign(max_align, max_align);
+        for (i = 0; i < ARRAY_SIZE(alignments); i++) {
+            align = alignments[i];
+            if (raw_is_io_aligned(fd, buf, align)) {
+                /* Fallback to safe value. */
+                bs->bl.request_alignment = (align != 1) ? align : max_align;
                 break;
             }
         }
         qemu_vfree(buf);
     }
 
-    if (!bs->bl.request_alignment) {
+    if (!s->buf_align) {
+        int i;
         size_t align;
-        buf = qemu_memalign(s->buf_align, max_align);
-        for (align = 512; align <= max_align; align <<= 1) {
-            if (raw_is_io_aligned(fd, buf, align)) {
-                bs->bl.request_alignment = align;
+        buf = qemu_memalign(max_align, 2 * max_align);
+        for (i = 0; i < ARRAY_SIZE(alignments); i++) {
+            align = alignments[i];
+            if (raw_is_io_aligned(fd, buf + align, max_align)) {
+                /* Fallback to request_alignment. */
+                s->buf_align = (align != 1) ? align : bs->bl.request_alignment;
                 break;
             }
         }
-- 
1.8.3.1