dcavalca / rpms / systemd

Forked from rpms/systemd 3 months ago
Clone
Blob Blame History Raw
From 35a233228ff0105892b3edc86a0cdda06282a9ac Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Zbigniew=20J=C4=99drzejewski-Szmek?= <zbyszek@in.waw.pl>
Date: Tue, 18 Oct 2022 18:23:53 +0200
Subject: [PATCH] coredump: avoid deadlock when passing processed backtrace
 data

We would deadlock when passing the data back from the forked-off process that
was doing backtrace generation back to the coredump parent. This is because we
fork the child and wait for it to exit. The child tries to write too much data
to the output pipe, and and after the first 64k blocks on the parent because
the pipe is full. The bug surfaced in Fedora because of a combination of four
factors:
- 87707784c70dc9894ec613df0a6e75e732a362a3 was backported to v251.5, which
  allowed coredump processing to be successful.
- 1a0281a3ebf4f8c16d40aa9e63103f16cd23bb2a was NOT backported, so the output
  was very verbose.
- Fedora has the ELF package metadata available, so a lot of output can be
  generated. Most other distros just don't have the information.
- gnome-calendar crashes and has a bazillion modules and 69596 bytes of output
  are generated for it.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2135778.

The code is changed to try to write data opportunistically. If we get partial
information, that is still logged. In is generally better to log partial
backtrace information than nothing at all.

(cherry picked from commit 076b807be472630692c5348c60d0c2b7b28ad437)

Resolves: #2149074
---
 src/shared/elf-util.c | 37 +++++++++++++++++++++++++++++++------
 1 file changed, 31 insertions(+), 6 deletions(-)

diff --git a/src/shared/elf-util.c b/src/shared/elf-util.c
index 392ed9f31b..644fbae9ce 100644
--- a/src/shared/elf-util.c
+++ b/src/shared/elf-util.c
@@ -30,6 +30,9 @@
 #define THREADS_MAX 64
 #define ELF_PACKAGE_METADATA_ID 0xcafe1a7e
 
+/* The amount of data we're willing to write to each of the output pipes. */
+#define COREDUMP_PIPE_MAX (1024*1024U)
+
 static void *dw_dl = NULL;
 static void *elf_dl = NULL;
 
@@ -700,13 +703,13 @@ int parse_elf_object(int fd, const char *executable, bool fork_disable_dump, cha
                 return r;
 
         if (ret) {
-                r = RET_NERRNO(pipe2(return_pipe, O_CLOEXEC));
+                r = RET_NERRNO(pipe2(return_pipe, O_CLOEXEC|O_NONBLOCK));
                 if (r < 0)
                         return r;
         }
 
         if (ret_package_metadata) {
-                r = RET_NERRNO(pipe2(json_pipe, O_CLOEXEC));
+                r = RET_NERRNO(pipe2(json_pipe, O_CLOEXEC|O_NONBLOCK));
                 if (r < 0)
                         return r;
         }
@@ -750,8 +753,24 @@ int parse_elf_object(int fd, const char *executable, bool fork_disable_dump, cha
                         goto child_fail;
 
                 if (buf) {
-                        r = loop_write(return_pipe[1], buf, strlen(buf), false);
-                        if (r < 0)
+                        size_t len = strlen(buf);
+
+                        if (len > COREDUMP_PIPE_MAX) {
+                                /* This is iffy. A backtrace can be a few hundred kilobytes, but too much is
+                                 * too much. Let's log a warning and ignore the rest. */
+                                log_warning("Generated backtrace is %zu bytes (more than the limit of %u bytes), backtrace will be truncated.",
+                                            len, COREDUMP_PIPE_MAX);
+                                len = COREDUMP_PIPE_MAX;
+                        }
+
+                        /* Bump the space for the returned string.
+                         * Failure is ignored, because partial output is still useful. */
+                        (void) fcntl(return_pipe[1], F_SETPIPE_SZ, len);
+
+                        r = loop_write(return_pipe[1], buf, len, false);
+                        if (r == -EAGAIN)
+                                log_warning("Write failed, backtrace will be truncated.");
+                        else if (r < 0)
                                 goto child_fail;
 
                         return_pipe[1] = safe_close(return_pipe[1]);
@@ -760,13 +779,19 @@ int parse_elf_object(int fd, const char *executable, bool fork_disable_dump, cha
                 if (package_metadata) {
                         _cleanup_fclose_ FILE *json_out = NULL;
 
+                        /* Bump the space for the returned string. We don't know how much space we'll need in
+                         * advance, so we'll just try to write as much as possible and maybe fail later. */
+                        (void) fcntl(json_pipe[1], F_SETPIPE_SZ, COREDUMP_PIPE_MAX);
+
                         json_out = take_fdopen(&json_pipe[1], "w");
                         if (!json_out) {
                                 r = -errno;
                                 goto child_fail;
                         }
 
-                        json_variant_dump(package_metadata, JSON_FORMAT_FLUSH, json_out, NULL);
+                        r = json_variant_dump(package_metadata, JSON_FORMAT_FLUSH, json_out, NULL);
+                        if (r < 0)
+                                log_warning_errno(r, "Failed to write JSON package metadata, ignoring: %m");
                 }
 
                 _exit(EXIT_SUCCESS);
@@ -801,7 +826,7 @@ int parse_elf_object(int fd, const char *executable, bool fork_disable_dump, cha
 
                 r = json_parse_file(json_in, NULL, 0, &package_metadata, NULL, NULL);
                 if (r < 0 && r != -ENODATA) /* ENODATA: json was empty, so we got nothing, but that's ok */
-                        return r;
+                        log_warning_errno(r, "Failed to read or parse json metadata, ignoring: %m");
         }
 
         if (ret)