|
Pablo Greco |
e6a3ae |
From 1832a90928232cb91a8542613b754079fd1f0f0e Mon Sep 17 00:00:00 2001
|
|
Pablo Greco |
e6a3ae |
From: Max Reitz <mreitz@redhat.com>
|
|
Pablo Greco |
e6a3ae |
Date: Tue, 23 Jul 2019 14:45:46 +0100
|
|
Pablo Greco |
e6a3ae |
Subject: [PATCH 08/14] nbd/server: Advertise actual minimum block size
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
RH-Author: Max Reitz <mreitz@redhat.com>
|
|
Pablo Greco |
e6a3ae |
Message-id: <20190723144546.23701-8-mreitz@redhat.com>
|
|
Pablo Greco |
e6a3ae |
Patchwork-id: 89652
|
|
Pablo Greco |
e6a3ae |
O-Subject: [RHEL-8.1.0 qemu-kvm PATCH 7/7] nbd/server: Advertise actual minimum block size
|
|
Pablo Greco |
e6a3ae |
Bugzilla: 1678979
|
|
Pablo Greco |
e6a3ae |
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
|
|
Pablo Greco |
e6a3ae |
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
|
|
Pablo Greco |
e6a3ae |
RH-Acked-by: John Snow <jsnow@redhat.com>
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
From: Eric Blake <eblake@redhat.com>
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their
|
|
Pablo Greco |
e6a3ae |
reply according to bdrv_block_status() boundaries. If the block device
|
|
Pablo Greco |
e6a3ae |
has a request_alignment smaller than 512, but we advertise a block
|
|
Pablo Greco |
e6a3ae |
alignment of 512 to the client, then this can result in the server
|
|
Pablo Greco |
e6a3ae |
reply violating client expectations by reporting a smaller region of
|
|
Pablo Greco |
e6a3ae |
the export than what the client is permitted to address (although this
|
|
Pablo Greco |
e6a3ae |
is less of an issue for qemu 4.0 clients, given recent client patches
|
|
Pablo Greco |
e6a3ae |
to overlook our non-compliance at EOF). Since it's always better to
|
|
Pablo Greco |
e6a3ae |
be strict in what we send, it is worth advertising the actual minimum
|
|
Pablo Greco |
e6a3ae |
block limit rather than blindly rounding it up to 512.
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Note that this patch is not foolproof - it is still possible to
|
|
Pablo Greco |
e6a3ae |
provoke non-compliant server behavior using:
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
$ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
That is arguably a bug in the blkdebug driver (it should never pass
|
|
Pablo Greco |
e6a3ae |
back block status smaller than its alignment, even if it has to make
|
|
Pablo Greco |
e6a3ae |
multiple bdrv_get_status calls and determine the
|
|
Pablo Greco |
e6a3ae |
least-common-denominator status among the group to return). It may
|
|
Pablo Greco |
e6a3ae |
also be possible to observe issues with a backing layer with smaller
|
|
Pablo Greco |
e6a3ae |
alignment than the active layer, although so far I have been unable to
|
|
Pablo Greco |
e6a3ae |
write a reliable iotest for that scenario (but again, an issue like
|
|
Pablo Greco |
e6a3ae |
that could be argued to be a bug in the block layer, or something
|
|
Pablo Greco |
e6a3ae |
where we need a flag to bdrv_block_status() to state whether the
|
|
Pablo Greco |
e6a3ae |
result must be aligned to the current layer's limits or can be
|
|
Pablo Greco |
e6a3ae |
subdivided for accuracy when chasing backing files).
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Anyways, as blkdebug is not normally used, and as this patch makes our
|
|
Pablo Greco |
e6a3ae |
server more interoperable with qemu 3.1 clients, it is worth applying
|
|
Pablo Greco |
e6a3ae |
now, even while we still work on a larger patch series for the 4.1
|
|
Pablo Greco |
e6a3ae |
timeframe to have byte-accurate file lengths.
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Note that the iotests output changes - for 223 and 233, we can see the
|
|
Pablo Greco |
e6a3ae |
server's better granularity advertisement; and for 241, the three test
|
|
Pablo Greco |
e6a3ae |
cases have the following effects:
|
|
Pablo Greco |
e6a3ae |
- natural alignment: the server's smaller alignment is now advertised,
|
|
Pablo Greco |
e6a3ae |
and the hole reported at EOF is now the right result; we've gotten rid
|
|
Pablo Greco |
e6a3ae |
of the server's non-compliance
|
|
Pablo Greco |
e6a3ae |
- forced server alignment: the server still advertises 512 bytes, but
|
|
Pablo Greco |
e6a3ae |
still sends a mid-sector hole. This is still a server compliance bug,
|
|
Pablo Greco |
e6a3ae |
which needs to be fixed in the block layer in a later patch; output
|
|
Pablo Greco |
e6a3ae |
does not change because the client is already being tolerant of the
|
|
Pablo Greco |
e6a3ae |
non-compliance
|
|
Pablo Greco |
e6a3ae |
- forced client alignment: the server's smaller alignment means that
|
|
Pablo Greco |
e6a3ae |
the client now sees the server's status change mid-sector without any
|
|
Pablo Greco |
e6a3ae |
protocol violations, but the fact that the map shows an unaligned
|
|
Pablo Greco |
e6a3ae |
mid-sector hole is evidence of the block layer problems with aligned
|
|
Pablo Greco |
e6a3ae |
block status, to be fixed in a later patch
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Signed-off-by: Eric Blake <eblake@redhat.com>
|
|
Pablo Greco |
e6a3ae |
Message-Id: <20190329042750.14704-7-eblake@redhat.com>
|
|
Pablo Greco |
e6a3ae |
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
|
|
Pablo Greco |
e6a3ae |
[eblake: rebase to enhanced iotest 241 coverage]
|
|
Pablo Greco |
e6a3ae |
(cherry picked from commit b0245d6478ea5906e3d7a542244d5c015fd47bc7)
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Conflicts:
|
|
Pablo Greco |
e6a3ae |
- tests/qemu-iotests/223.out: We are missing
|
|
Pablo Greco |
e6a3ae |
ddd09448fd833d646952c769ae9ce3d39bee989f downstream, which adds
|
|
Pablo Greco |
e6a3ae |
qemu-nbd --list tests to 223. (qemu-nbd --list does not exist
|
|
Pablo Greco |
e6a3ae |
downstream.)
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
- tests/qemu-iotests/233.out: Does not exist downstream.
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
- tests/qemu-iotests/241.out: Does not exist downstream, because it
|
|
Pablo Greco |
e6a3ae |
would require qemu-nbd --list.
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
Signed-off-by: Max Reitz <mreitz@redhat.com>
|
|
Pablo Greco |
e6a3ae |
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
|
|
Pablo Greco |
e6a3ae |
---
|
|
Pablo Greco |
e6a3ae |
nbd/server.c | 13 ++++++++-----
|
|
Pablo Greco |
e6a3ae |
1 file changed, 8 insertions(+), 5 deletions(-)
|
|
Pablo Greco |
e6a3ae |
|
|
Pablo Greco |
e6a3ae |
diff --git a/nbd/server.c b/nbd/server.c
|
|
Pablo Greco |
e6a3ae |
index e094300..96b6631 100644
|
|
Pablo Greco |
e6a3ae |
--- a/nbd/server.c
|
|
Pablo Greco |
e6a3ae |
+++ b/nbd/server.c
|
|
Pablo Greco |
e6a3ae |
@@ -608,13 +608,16 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint16_t myflags,
|
|
Pablo Greco |
e6a3ae |
/* Send NBD_INFO_BLOCK_SIZE always, but tweak the minimum size
|
|
Pablo Greco |
e6a3ae |
* according to whether the client requested it, and according to
|
|
Pablo Greco |
e6a3ae |
* whether this is OPT_INFO or OPT_GO. */
|
|
Pablo Greco |
e6a3ae |
- /* minimum - 1 for back-compat, or 512 if client is new enough.
|
|
Pablo Greco |
e6a3ae |
- * TODO: consult blk_bs(blk)->bl.request_alignment? */
|
|
Pablo Greco |
e6a3ae |
- sizes[0] =
|
|
Pablo Greco |
e6a3ae |
- (client->opt == NBD_OPT_INFO || blocksize) ? BDRV_SECTOR_SIZE : 1;
|
|
Pablo Greco |
e6a3ae |
+ /* minimum - 1 for back-compat, or actual if client will obey it. */
|
|
Pablo Greco |
e6a3ae |
+ if (client->opt == NBD_OPT_INFO || blocksize) {
|
|
Pablo Greco |
e6a3ae |
+ sizes[0] = blk_get_request_alignment(exp->blk);
|
|
Pablo Greco |
e6a3ae |
+ } else {
|
|
Pablo Greco |
e6a3ae |
+ sizes[0] = 1;
|
|
Pablo Greco |
e6a3ae |
+ }
|
|
Pablo Greco |
e6a3ae |
+ assert(sizes[0] <= NBD_MAX_BUFFER_SIZE);
|
|
Pablo Greco |
e6a3ae |
/* preferred - Hard-code to 4096 for now.
|
|
Pablo Greco |
e6a3ae |
* TODO: is blk_bs(blk)->bl.opt_transfer appropriate? */
|
|
Pablo Greco |
e6a3ae |
- sizes[1] = 4096;
|
|
Pablo Greco |
e6a3ae |
+ sizes[1] = MAX(4096, sizes[0]);
|
|
Pablo Greco |
e6a3ae |
/* maximum - At most 32M, but smaller as appropriate. */
|
|
Pablo Greco |
e6a3ae |
sizes[2] = MIN(blk_get_max_transfer(exp->blk), NBD_MAX_BUFFER_SIZE);
|
|
Pablo Greco |
e6a3ae |
trace_nbd_negotiate_handle_info_block_size(sizes[0], sizes[1], sizes[2]);
|
|
Pablo Greco |
e6a3ae |
--
|
|
Pablo Greco |
e6a3ae |
1.8.3.1
|
|
Pablo Greco |
e6a3ae |
|