From: Łukasz Bartosik
Date: Wed, 11 Oct 2023 15:59:32 +0200
Subject: Re: [PATCH v1] dynamic_debug: add support for logs destination
To: Pekka Paalanen, jim.cromie@gmail.com, Łukasz Bartosik, linux-kernel@vger.kernel.org, Steven Rostedt, "wayland-devel@lists.freedesktop.org", Sean Paul, dri-devel
References: <20231003155810.6df9de16@gandalf.local.home> <20231011114816.19d79f43@eldfell>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 11 Oct 2023 at 11:42, Daniel Vetter wrote:
>
> On Wed, Oct 11, 2023 at 11:48:16AM +0300, Pekka Paalanen wrote:
> > On Tue, 10 Oct 2023 10:06:02 -0600
> > jim.cromie@gmail.com wrote:
> >
> > > since I name-dropped you all,
> >
> > Hi everyone,
> >
> > I'm really happy to see this topic being developed! I've practically
> > forgotten about it myself, but the need for it has not diminished at all.
> >

It's good to hear you guys are also interested in this feature.

> > I didn't understand much of the conversation, so I'll just reiterate
> > what I would use it for, as a Wayland compositor developer.
> >
> > I added a few more cc's to get better coverage of DRM and Wayland
> > compositor developers.
> >
> > > On Tue, Oct 10, 2023 at 10:01 AM wrote:
> > > >
> > > > On Mon, Oct 9, 2023 at 4:47 PM Łukasz Bartosik wrote:
> > > > ...
> > > >
> > > > > I don't have a real life use case to configure different trace
> > > > > instance for each callsite.
> > > > > I just tried to be as flexible as possible.
> > > > >
> > > >
> > > > I've come around to agree - I looked back at some old threads
> > > > (that I was a part of, and barely remembered :-}
> > > >
> > > > At least Sean Paul, Lyude, Simon Ser, Pekka Paalanen
> > > > have expressed a desire for a "flight-recorder"
> > > > it'd be hard to say now that 2-3 such buffers would always be enough,
> > > > esp as there's a performance reason for having your own.
> >
> > A Wayland compositor has roughly three important things where the kernel
> > debugs might come in handy:
> > - input
> > - DRM KMS
> > - DRM GPU rendering
> >
> > DRM KMS is the one I've been thinking of in the flight recorder context
> > the most, because KMS hardware varies a lot, and there is plenty of
> > room for both KMS drivers and KMS userspace to go wrong. The usual
> > result is your display doesn't work, so the system is practically
> > unusable to the end user. In the wild, the simplest or maybe the only
> > way out of that may be a reboot, maybe an automated one (e.g. digital
> > signage). In order to debug such problems, we would need both
> > compositor logs and the relevant kernel debug messages.
> >
> > For example, Weston already has a flight recorder framework of its own,
> > so we have the compositor debug logs. It would be useful to get the
> > selected kernel debug logs in the same place. It could be used for
> > automated or semi-manual bug reporting, for example, making the
> > administrator's or end user's life much easier when reporting issues.
> >
> > Since this is usually a production environment, and the Wayland
> > compositor runs without root privileges, we need something that works
> > with that. We would likely want the kernel debug messages in the
> > compositor to combine and order them properly with the compositor debug
> > messages.
> >
> > It's quite likely that developers would like to pick and choose which
> > kernel debug messages might be interesting enough to record, to avoid
> > excessive log flooding. The flight recorder in Weston is fixed size to
> > avoid running out of memory or disk space. I can also see that Weston
> > could have debugging options that affect which kernel debug messages it
> > subscribes to. We can have a reasonable default setup that allows us to
> > pinpoint the problem area and figure out most problems, and if needed,
> > we could ask the administrator to pass another debug option to Weston.
> > It helps if there is just one place to configure everything about the
> > compositor.
> >
> > This implies that it would be really nice to have userspace subscriber
> > specific debug message streams from the kernel, or a good way to filter
> > the messages we want. A Wayland compositor would not be interested in
> > file system or wireless debugs for example, but another system
> > component might be. There is also a security aspect of which component is
> > allowed to see which messages in case they could contain anything
> > sensitive (input debug could contain typed passwords).
> >
> > Configuring the kernel debug message selection for our debug message
> > stream can and probably should require elevated privileges, and we can
> > likely solve that in userspace with a daemon or such, to allow the
> > Wayland compositor to run as a regular user.
> >
> > Thinking of desktop systems, and especially physically multi-seat systems:
> > - there can be multiple different Wayland compositors running simultaneously
> > - each of them may want debug messages only from a specific DRM KMS
> >   device instance, and not from others
> > - each of them may have a different idea of which debug messages are important
> > - because DRM KMS leasing is a thing, different compositor instances
> >   could be using the same DRM KMS device instance simultaneously; since
> >   this is specific to DRM KMS, and it should be harmless to get a
> >   little too much DRM KMS debug (that is, from the whole device instead
> >   of just the leased parts), it may not be worth considering splitting
> >   debug message streams this far.
> >
> > If userspace is offered some standardised fields in kernel debug
> > structures, then userspace could do some filtering on its own too, but I
> > guess it would be better to filter at the source and not need that.
> >
> > There is also an anti-goal. The kernel debug message contents are
> > specifically not machine-parsable.
> > I very much do not want to impose debug strings as ABI, they are for
> > human (and AI?) readers only.
> >
> >
> > As a summary, here are the most important requirements first:
> > - usable in production as a normal thing to enable always by default
> > - final delivery to unprivileged userspace process
>
> I think this is the one that's trickiest, and I don't fully understand why
> you need it. The issues I'm seeing:
>
> - logs tend to leak a lot of kernel internal state that's useful for
>   attacks. There are measures for the worst (like obfuscating kernel
>   pointers by hashing them), so there's always going to be a difference
>   here between what full sysadmin can get and what unprivileged userspace
>   can get. And there's always a risk we miss something that we should
>   obfuscate but didn't.
>
> - handing this to userspace increases the risk that it becomes uapi. Who's
>   going to stop compositors from sussing out the reason an atomic commit
>   failed from the logs if they can get them easily, and these logs contain
>   very interesting information about why something failed?
>
>   Much better if journald or a crash handler assembles all the different
>   flight recorder logs and packages them into a bug report so that the
>   compositor cannot ever get at these directly. Yeah this needs some OS
>   support with a dbus request or similar so that the compositor can ask
>   for a crash dump with everything relevant to its session.
>

This is similar to what we plan to do in ChromeOS. We want to enable
writing debug logs of each subsystem/driver of interest (e.g. thunderbolt)
into a separate trace instance. When an issue happens and a user submits a
feedback report, we will attach the captured logs from the trace instances
to the report.

> - the idea of an in-kernel flight recorder is that it's really fast.
>   The entire tracing infra is built such that recording an event is really
>   quick, but printing it is not - the entire string formatting is delayed
>   to when userspace reads the buffers. If you constantly push the log
>   messages to userspace we toss the advantage of the low-overhead
>   in-kernel flight recorder. If you push logs to dmesg there's a
>   substantial measurable overhead which you don't really want in
>   production, and your requirement would impose the same.
>
> - I'm not sure how this is supposed to mesh with userspace log aggregators
>   like journald when every compositor has its own flight recorder on top.
>   Feels a bit like a solution that ignores the entire OS stack and assumes
>   that the only pieces we can touch are the kernel and the compositor to
>   get to such a flight recorder.
>
>   You might object that events will get out-of-order if you merge multiple
>   logs after the fact, but that's the case anyway if we use the tracing
>   framework since that's always a ringbuffer within the kernel and not
>   synchronous. The only thing we could do is allow userspace to push
>   markers into that ringbuffer, which is easily done by adding more debug
>   output lines (heck we could even add a logging cookie to certain ioctls
>   when userspace really cares about knowing the exact ordering of its
>   requests with the stuff the kernel does).
>
> - If you really want direct delivery to userspace I guess we could do
>   something where sessiond opens the flight recorder fd for you, sets it
>   all up and passes it to the compositor. But I'm really not a big fan of
>   sending the full kms dbg spam to compositors to freely digest in real
>   time.
>
> > - per debug-print selection of messages (finer or coarser, categories
> >   within a kernel sub-system could be enough)
> > - per originating device (driver instance) selection of messages
>
> The dyndbg stuff can do all that already, which is why I'm so much in
> favour of relying on that framework.
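
For illustration, a minimal sketch of how dyndbg selections of this kind
are expressed. It assumes the documented query keywords (module, file,
func) and the debugfs control file; the helper names and the example
module/function values are hypothetical, and nothing is written to the
kernel unless apply_query() is called with sufficient privileges.

```python
# Sketch only: build dynamic debug (dyndbg) query strings.
# Helper names are hypothetical; the control file path assumes debugfs
# is mounted at /sys/kernel/debug.
DYNDBG_CONTROL = "/sys/kernel/debug/dynamic_debug/control"


def dyndbg_query(flags, module=None, file=None, func=None):
    """Return one dyndbg command, e.g. 'module drm +p'."""
    parts = []
    for keyword, value in (("module", module), ("file", file), ("func", func)):
        if value is not None:
            parts += [keyword, value]
    parts.append(flags)  # e.g. "+p" enables printing, "-p" disables it
    return " ".join(parts)


def apply_query(query, control=DYNDBG_CONTROL):
    """Write one query to the control file (requires elevated privileges)."""
    with open(control, "w") as f:
        f.write(query + "\n")


if __name__ == "__main__":
    # Example values are illustrative, not taken from the thread.
    print(dyndbg_query("+p", module="drm", func="drm_atomic_check_only"))
```

Keeping the query construction separate from the privileged write fits
the point made earlier in the thread that configuring the selection can
live in a privileged helper rather than in the compositor itself.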
>
> > - all selections tailored separately for each userspace subscriber
> > (- per open device file description selection of messages)
>
> Again this feels like a userspace problem. Sessions could register what
> kind of info they need for their session, and something like journald can
> figure out how to record it all.
>
> If you want the kernel to keep separate flight recorders I guess we could
> add that, but I don't think it currently exists for the dyndbg stuff at
> least. Maybe a flight recorder v2 feature, once the basics are in.
>
> > That's my idea of it. It is interesting to see how far the requirements
> > can be reasonably realised.
>
> I think aside from the "make it available directly to unprivileged
> userspace" everything sounds reasonable and doable.
>
> More on the process side of things, I think Jim is very much looking for
> acks and tested-bys from people who are interested in better drm logging
> infra. That should help ensure that things are moving in a direction that's
> actually useful, even when it's not yet entirely complete.
>
> Cheers, Sima
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
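
As a rough illustration of the report-collection step mentioned above
(attaching the contents of per-subsystem trace instances to a feedback
report), here is a minimal sketch. It assumes tracefs is mounted at
/sys/kernel/tracing and that each instance exposes its buffer as
instances/<name>/trace; the function name and the returned dictionary
layout are hypothetical and not part of the patch under discussion.

```python
from pathlib import Path

# Usual tracefs mount point; reading it typically requires privileges.
TRACEFS = Path("/sys/kernel/tracing")


def collect_instances(names, tracefs=TRACEFS):
    """Return {instance_name: trace text} for each named instance that exists."""
    report = {}
    for name in names:
        trace_file = Path(tracefs) / "instances" / name / "trace"
        if trace_file.is_file():
            # Formatting happens when the buffer is read, so this is the slow
            # path; it runs only when a report is being assembled.
            report[name] = trace_file.read_text(errors="replace")
    return report


if __name__ == "__main__":
    # "thunderbolt" is the example subsystem named in the thread.
    for name, text in collect_instances(["thunderbolt"]).items():
        print(f"== {name}: {len(text.splitlines())} lines ==")
```

Reading the buffers only at report time keeps the recording path cheap,
in line with the overhead concern raised in the quoted reply.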