Received: by 2002:a05:7412:d8a:b0:e2:908c:2ebd with SMTP id b10csp166148rdg; Thu, 12 Oct 2023 01:56:44 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGVq2BkauZAmftRwMcU5th4R/N0W7tyFvuDznKWgKKLMhn77MCWXQpXtBxxZP8jnKDxKuaW X-Received: by 2002:a05:6a00:2d01:b0:68a:6d34:474b with SMTP id fa1-20020a056a002d0100b0068a6d34474bmr29863053pfb.15.1697101003481; Thu, 12 Oct 2023 01:56:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697101003; cv=none; d=google.com; s=arc-20160816; b=Z/5eFR3IvW6g6WFm1atv8iOQO91XNEJm46al82R0V4jn5g9PPPQ1zgybNv2EqKRzhy GRnzSp/txJt6clVYB8nOYlZX3MfcPmeWKPOP+wb0vorOPKkIoigqSmvrM/UD5vEWZbUX 26SKVLdFTnfLSTYr1+FevMbDdIiERL4X1XIEM7msOmZdWH2FaSRFe/KKd0wn61ey/Nyp AhHuX8nd7D+lvRh1sDA3vStIsO9p5lQspKTs6SwHkh3n++n6sFB7NRwp94Df9zxCugen xkNpkUXFDW9+Ofp3k8uUFPsohpApdtVC41l2kH/Ri7inoNq0JAVHACGiiPQRSrtRoE40 50Pw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:in-reply-to:message-id :subject:cc:to:from:date:dkim-signature; bh=j+4wftdPCRze2GpHmlyjzCqZh3APF/15k4sxfk/ja1g=; fh=E/OqnXzcKsj0df7ieLIuKlCrMJSnCS+qxyd9GWI60eE=; b=iup3xZkgTHqSef9K6uJUTS0o6/t70uX0tvG33RAq3lFnDFtyqqi600UocmOe8CifM2 hcFKv+5gNXwDNCqMXVM5VhgLHRaYLhnkNNEhX/EEhWu2W/81WNyaH8MIvI0Na8E1FM22 kiG/1QDdTVngEJ0Eausmk9NC7PwaF+1y4DxC9EocsmD4GqZ7aSl5vWntOZuw+/aUleVT 0wJrm1zqyvyiiDJBsLPKRZyy5J6D4zltY7f3UT+Wt7ggWIyMdGYRMh7m03p6RMlCbTbb ysmk4dcshhRKde29K8ulDFY6dfpuhImtFdCBFCb3rCEERv7d96FwwlTBDcdRxDQU4HpB 1gfw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=MfWX2pSw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id fb40-20020a056a002da800b0068ff3927553si14161568pfb.35.2023.10.12.01.56.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Oct 2023 01:56:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20230601 header.b=MfWX2pSw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 41FC18040D42; Thu, 12 Oct 2023 01:56:03 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233712AbjJLI4B (ORCPT + 99 others); Thu, 12 Oct 2023 04:56:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232199AbjJLIz7 (ORCPT ); Thu, 12 Oct 2023 04:55:59 -0400 Received: from mail-lj1-x230.google.com (mail-lj1-x230.google.com [IPv6:2a00:1450:4864:20::230]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1831090 for ; Thu, 12 Oct 2023 01:55:56 -0700 (PDT) Received: by mail-lj1-x230.google.com with SMTP id 38308e7fff4ca-2c16bc71e4cso8706411fa.0 for ; Thu, 12 Oct 2023 01:55:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697100954; x=1697705754; darn=vger.kernel.org; h=mime-version:references:in-reply-to:message-id:subject:cc:to:from :date:from:to:cc:subject:date:message-id:reply-to; bh=j+4wftdPCRze2GpHmlyjzCqZh3APF/15k4sxfk/ja1g=; b=MfWX2pSwd9g1RYY1cHJF48DK3WG3CAMuXzOvDfhFuTblNjB6A2HTK7Z+cYNwdwm2Ps GRRRnsgZX86pzvxlsXAbw/mCRRCQit8Z0E0Au8rcVQw1EP3yweIUQ2bc+npQEFnsfBwz ohZORMjBHL5mYFvs0xRj+X6QG2FzM/aC0WlskXrqOqZyxiREIdpiM4rRf4shesgiRi9r QtAsreeGSFII8bZ2rM7ZmlUbpAtjVwfbQ6v5QaMbF+TX047c97vsbhUI6LnINZO1ZtY7 MjDcqEje7IEXVLeW6kwBdZsnFYt7eo4XTpyrCzcu2Z5Y2rnHsLATkwXJV8bVQZ/DBV6B Uh3Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697100954; x=1697705754; h=mime-version:references:in-reply-to:message-id:subject:cc:to:from :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=j+4wftdPCRze2GpHmlyjzCqZh3APF/15k4sxfk/ja1g=; b=Wc5ymJXwpMflpT8L5UV0Y2kLWIBEIwsLlP9ePb+Dg0aOfmIRahnJsCZ8l6/aMWTm3s d7aIy2BswHxoyhLP225tgnUrB/2bCmmM10RzpYRb2xl/Nuo8qCemtSdJa0qxLSSr84/9 nazqYTA+eA6ddT8MGI8x0Cbb/4glYamK/lJP7H195v4KaYXA5ir6T3mxhfkfvtM9pfhx Cq+Gl0cqthMQEdMjTZ4JsA9I2FN0EoDvwg1cI7pjBRRY886G6p2ShRxjzajGfqfG5qVZ sYXWzWPdM7ScimjBMGgbLighKqgV69gR4lQ2dxkHIMwyzVV7XvhQCzXDxvu0Fk5LK0mW QZcg== X-Gm-Message-State: AOJu0YygGFcTFuThLbpTYh3POVVCv/rdU/59uf250EKu9EVSPlRcSgaa LgbkNVuP9Ln/93ziQkK5vp4= X-Received: by 2002:a2e:b784:0:b0:2bb:a123:2db7 with SMTP id n4-20020a2eb784000000b002bba1232db7mr20078552ljo.51.1697100953257; Thu, 12 Oct 2023 01:55:53 -0700 (PDT) Received: from eldfell ([194.136.85.206]) by smtp.gmail.com with ESMTPSA id q6-20020a2e2a06000000b002c3c60e86a6sm2866579ljq.124.2023.10.12.01.55.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 12 Oct 2023 01:55:52 -0700 (PDT) Date: Thu, 12 Oct 2023 11:55:48 +0300 From: Pekka Paalanen To: Daniel Vetter Cc: jim.cromie@gmail.com, =?UTF-8?B?xYF1a2Fzeg==?= Bartosik , linux-kernel@vger.kernel.org, Steven Rostedt , "wayland-devel@lists.freedesktop.org" , Sean Paul , dri-devel Subject: Re: [PATCH v1] dynamic_debug: add support for logs destination Message-ID: <20231012115548.292fa0bb@eldfell> In-Reply-To: References: <20231003155810.6df9de16@gandalf.local.home> <20231011114816.19d79f43@eldfell> X-Mailer: Claws Mail 4.1.1 (GTK 3.24.37; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="Sig_/Hc.s9O5m0Pcos=5MuauxwdY"; protocol="application/pgp-signature"; micalg=pgp-sha256 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 12 Oct 2023 01:56:03 -0700 (PDT) --Sig_/Hc.s9O5m0Pcos=5MuauxwdY Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On Wed, 11 Oct 2023 11:42:24 +0200 Daniel Vetter wrote: > On Wed, Oct 11, 2023 at 11:48:16AM +0300, Pekka Paalanen wrote: > > On Tue, 10 Oct 2023 10:06:02 -0600 > > jim.cromie@gmail.com wrote: > > =20 > > > since I name-dropped you all, =20 > >=20 > > Hi everyone, > >=20 > > I'm really happy to see this topic being developed! I've practically > > forgot about it myself, but the need for it has not diminished at all. > >=20 > > I didn't understand much of the conversation, so I'll just reiterate > > what I would use it for, as a Wayland compositor developer. > >=20 > > I added a few more cc's to get better coverage of DRM and Wayland > > compositor developers. > > =20 > > > On Tue, Oct 10, 2023 at 10:01=E2=80=AFAM wrote= : =20 > > > > > > > > On Mon, Oct 9, 2023 at 4:47=E2=80=AFPM =C5=81ukasz Bartosik wrote: =20 > >=20 > > ... > > =20 > > > > > I don't have a real life use case to configure different trace > > > > > instance for each callsite. > > > > > I just tried to be as much flexible as possible. > > > > > =20 > > > > > > > > Ive come around to agree - I looked back at some old threads > > > > (that I was a part of, and barely remembered :-} > > > > > > > > At least Sean Paul, Lyude, Simon Ser, Pekka Paalanen > > > > have expressed a desire for a "flight-recorder" > > > > it'd be hard to say now that 2-3 such buffers would always be enoug= h, > > > > esp as theres a performance reason for having your own. =20 > >=20 > > A Wayland compositor has roughly three important things where the kernel > > debugs might come in handy: > > - input > > - DRM KMS > > - DRM GPU rendering > >=20 > > DRM KMS is the one I've been thinking of in the flight recorder context > > the most, because KMS hardware varies a lot, and there is plenty of > > room for both KMS drivers and KMS userspace to go wrong. The usual > > result is your display doesn't work, so the system is practically > > unusable to the end user. In the wild, the simplest or maybe the only > > way out of that may be a reboot, maybe an automated one (e.g. digital > > signage). In order to debug such problems, we would need both > > compositor logs and the relevant kernel debug messages. > >=20 > > For example, Weston already has a flight recorder framework of its own, > > so we have the compositor debug logs. It would be useful to get the > > selected kernel debug logs in the same place. It could be used for > > automated or semi-manual bug reporting, for example, making the > > administrator or end user life much easier reporting issues. > >=20 > > Since this is usually a production environment, and the Wayland > > compositor runs without root privileges, we need something that works > > with that. We would likely want the kernel debug messages in the > > compositor to combine and order them properly with the compositor debug > > messages. > >=20 > > It's quite likely that developers would like to pick and choose which > > kernel debug messages might be interesting enough to record, to avoid > > excessive log flooding. The flight recorder in Weston is fixed size to > > avoid running out of memory or disk space. I can also see that Weston > > could have debugging options that affect which kernel debug messages it > > subscribes to. We can have a reasonable default setup that allows us to > > pinpoint the problem area and figure out most problems, and if needed, > > we could ask the administrator pass another debug option to Weston. It > > helps if there is just one place to configure everything about the > > compositor. > >=20 > > This implies that it would be really nice to have userspace subscriber > > specific debug message streams from the kernel, or a good way to filter > > the messages we want. A Wayland compositor would not be interested in > > file system or wireless debugs for example, but another system > > component might be. There is also a security aspect of which component = is > > allowed to see which messages in case they could contain anything > > sensitive (input debug could contain typed passwords). > >=20 > > Configuring the kernel debug message selection for our debug message > > stream can and probably should require elevated privileges, and we can > > likely solve that in userspace with a daemon or such, to allow the > > Wayland compositor to run as a regular user. > >=20 > > Thinking of desktop systems, and especially physically multi-seat syste= ms: > > - there can be multiple different Wayland compositors running simultane= ously > > - each of them may want debug messages only from a specific DRM KMS > > device instance, and not from others > > - each of them may have a different idea of which debug messages are im= portant > > - because DRM KMS leasing is a thing, different compositor instances > > could be using the same DRM KMS device instance simultaneously; since > > this is specific to DRM KMS, and it should be harmless to get a > > little too much DRM KMS debug (that is, from the whole device instead > > of just the leased parts), it may not be worth to consider splitting > > debug message streams this far. > >=20 > > If userspace is offered some standardised fields in kernel debug > > structures, then userspace could do some filtering on its own too, but I > > guess it would be better to filter at the source and not need that. > >=20 > > There is also an anti-goal. The kernel debug message contents are > > specifically not machine-parsable. I very much do not want to impose > > debug strings as ABI, they are for human (and AI?) readers only. > >=20 > >=20 > > As a summary, here are the most important requirements first: > > - usable in production as a normal thing to enable always by default > > - final delivery to unprivileged userspace process =20 >=20 > I think this is the one that's trickiest, and I don't fully understand why > you need it. It's not an arbitrary unprivileged userspace process. It is an unprivileged userspace process configured by the system administrator to receive the information. I would simply like it to be so that you don't need to have a userspace process running as root shovelling the bulk debug message data manually, but that the target process can read it directly e.g. by getting passed an appropriate fd to it. > The issues I'm seeing: >=20 > - logs tend to leak a lot of kernel internal state that's useful for > attacks. There's measures for the worst (like obfuscating kernel > pointers by hashing them), so there's always going to be a difference > here between what full sysadmin can get and what unpriviledged userspace > can get. And there's always a risk we miss something that we should > obfuscate but didn't. >=20 > - handing this to userspace increases the risks it becomes uapi. Who's > going to stop compositors from sussing out the reason an atomic commit > failed from the logs if they can get them easily, and these logs contain > very interesting information about why something failed? That problem exists regardless of where in userspace the data goes to. Compositors could as well get it from journald if they need it. Or from the same place where 'dmesg' gets its stuff from, with a tiny helper with the needed access. > Much better if journald or a crash handler assemebles all the different > flight recorder logs and packages it into a bug report so that the > compositor cannot ever get at these directly. Yeah this needs some OS > support with a dbus request or similar so that the compositor can ask > for a crash dump with everything relevant to its session. Ok. > - the idea of an in-kernel flight recorder is that it's really fast. The > entire tracing infra is built such that recording an event is really > quick, but printing it is not - the entire string formatting is delayed > to when userspace reads the buffers. If you constantly push the log > messages to userspace we toss the advantage of the low-overhead > in-kernel flight recorder. If you push logs to dmesg there's a > substantial measureable overhead which you don't really want in > production, and your requirement would impose the same. Right. > - I'm not sure how this is supposed to mesh with userspace log aggregators > like journald when every compositor has it's own flight recorder on top. > Feels a bit like a solution that ignores the entire os stack and assumes > that the only pieces we can touch are the kernel and the compositor to > get to such a flight recorder. >=20 > You might object that events will get out-of-order if you merge multiple > logs after the fact, but that's the case anyway if we use the tracing > framework since that's always a ringbuffer within the kernel and not > synchronous. The only thing we could do is allow userspace to push > markers into that ringbuffer, which is easily done by adding more debug > output lines (heck we could even add a logging cookie to certain ioctl > when userspace really cares about knowing exact ordering of it's > requests with the stuff the kernel does). That's a good point. So we need to agree on the clock to timestamp all messages in all components, both kernel and userspace, and then use a log aggregator to interleave logs by timestamp. Is that clock already defined as UABI? The compositor flight recorder is a ring buffer too, because it would be far too much volume to send it all to another process all the time. >=20 > - If you really want direct deliver to userspace I guess we could do > something where sessiond opens the flight recorder fd for you, sets it > all up and passes it to the compositor. But I'm really not a big fan of > sending the full kms dbg spam to compositors to freely digest in real > time. Well, something has to get the logs in userspace, be that a compositor, systemd-journald, or another aggregator. And that needs to be available in production, so not via debugfs. If it does not need too run as root, even better. It doesn't need to be real-time all the time, it is enough if it can be triggered when needed. > > - per debug-print selection of messages (finer or coarser, categories > > within a kernel sub-system could be enough) > > - per originating device (driver instance) selection of messages =20 >=20 > The dyndbg stuff can do all that already, which is why I'm so much in > favour of relying on that framework. Cool. I have no idea of any frameworks, and no opinion of which one should be used, anyway. > > - all selections tailored separately for each userspace subscriber > > (- per open device file description selection of messages) =20 >=20 > Again this feels like a userspace problem. Sessions could register what > kind of info they need for their session, and something like journald can > figure out how to record it all. Only if the kernel actually attaches all the required information to the debug messages *in machine readable form* so that userspace actually can do the filtering. And that makes *that* information UABI. Maybe that's fine? I wouldn't know. > If you want the kernel to keep separate flight recorders I guess we could > add that, but I don't think it currently exists for the dyndbg stuff at > least. Maybe a flight recorder v2 feature, once the basics are in. Maybe, maybe not necessary, I'm not sure. The thing is, ring buffers are of fixed size for a reason, they always overwrite oldest messages, so the volume of messages with ring buffer size determines how long into the past you can get the messages. Different message sources can have widely different bandwidth (volume), and different use cases need logs for different past time periods. It's really cool if there is no need to copy the same data into multiple ring buffers, but it is bad to not be able to get logs far enough back when some unrelated thing is spamming the same ring buffer. You said that turning the kernel ring buffer contents into strings is a very heavy operation, so it is not possible to push this scope separation to userspace, right? > > That's my idea of it. It is interesting to see how far the requirements > > can be reasonably realised. =20 >=20 > I think aside from the "make it available directly to unpriviledged > userspace" everything sounds reasonable and doable. I never said "free for all". I meant specifically chosen unprivileged (not running as root or with caps) processes. > More on the process side of things, I think Jim is very much looking for > acks and tested-by by people who are interested in better drm logging > infra. That should help that things are moving in a direction that's > actually useful, even when it's not yet entirely complete. I'd love to help with that, but I can barely afford to send this email. Do let me know if my comments are not helping, so I can do other things instead. Thanks, pq --Sig_/Hc.s9O5m0Pcos=5MuauxwdY Content-Type: application/pgp-signature Content-Description: OpenPGP digital signature -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEJQjwWQChkWOYOIONI1/ltBGqqqcFAmUntJQACgkQI1/ltBGq qqcpZBAAiKQUS6YwaMfbNnaNZRN5OJTxEjiNK3Q9aQ/4Kxfin0fRc7HNMKSAmwwG HlO5tPbiJUep0PJrmMxqQ06Fgttr4sHpvvY437uKGY4zXFjZYl+4QLRIOEV2HCV0 EPj2tgkRy9dIqOo0uYgZW1e5AlLlhlb51KlgZg9rfSA2k548DmtcOd548hyoRVbZ nfZpUY9xJZrjFtn7pazKbQUcMG6ZfGRSGoxUVkc9SVEoKGeMBwm4rutjS9y0PenL CXnjykuwc6ZdivPcSN7eOVJ7Irkj8irqrloO6n24mW53Y9/wsrAW1il6LG/ZJ1pt mTdF/yLHNHGVTAsoEsLry8wkKicwL9TT9bsKz6CCnCwqEcGVjlNWA1FThWI/XLT4 Q/C0ul4YhZDXJVkyQeLFjvCJA3LLxIJZqJmQp9hIDZuNIfjyWjZis7ob/ShO7ouN Ic3LxUtUS6htp09Sb8YQitm59BeVPYx5d5PU0yL1h9KW/cXPNvV0bB7QnY56646i IXs4+SEd34oL2bVgotx+axE9Rqgg274lRjRaPB5P+ocZAz9hxUXLqJP1YUyrDFjN LfK1RagFTNL/ZpAXHRWj1pv7qqVflEw9XvIoeffsIr82ei9T8A1EjXKTouQHoelM D8kXc0J0WUc/X89QdX+lHGBBzf7pEvlZGAajceawlsrwg5RIZu0= =uoe0 -----END PGP SIGNATURE----- --Sig_/Hc.s9O5m0Pcos=5MuauxwdY--