Date: Tue, 21 Sep 2021 12:32:49 -0700
From: Vito Caputo
To: jpoimboe@redhat.com
Cc: linux-kernel
Subject: CONFIG_ORC_UNWINDER=y breaks get_wchan()?
Message-ID: <20210921193249.el476vlhg5k6lfcq@shells.gnugeneration.com>

Hi Josh (and CC:lkml),

I've recently transitioned to an Arch system which has CONFIG_ORC_UNWINDER=y in the default kernel. My window manager integrates process monitoring that shows the wchans of processes, making it very apparent when wchan breaks.

Glancing at the kernel code to see what's involved in get_wchan() for x86, it appears to assume there are frame pointers on the stack. I don't see any mention of ORC_UNWINDER in the get_wchan() code, which seems like an oversight given that ORC_UNWINDER=y gets rid of them.

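To illustrate why a walk like this depends on frame pointers, here is a toy Python model. It is only a sketch and not the kernel's code: the `memory` dict, every address, the stack bounds, and the `is_sched_ip` range are made up for illustration. The one assumption it encodes is the conventional frame layout, where each frame stores the caller's saved frame pointer with the return address one word above it.

```python
WORD = 8  # bytes per stack slot on x86-64

def toy_get_wchan(memory, fp, stack_lo, stack_hi, is_sched_ip, max_frames=16):
    """Walk a saved-frame-pointer chain (toy model).

    Returns the first return address that is not scheduler code, or 0 if the
    chain leaves the stack or runs past max_frames.
    """
    for _ in range(max_frames):
        # Give up if the frame pointer does not point into the task's stack.
        if fp < stack_lo or fp + WORD >= stack_hi:
            return 0
        ip = memory.get(fp + WORD, 0)  # return address, one word above saved fp
        if not is_sched_ip(ip):
            return ip
        fp = memory.get(fp, 0)         # follow the chain to the caller's frame
    return 0

# Hypothetical layout: scheduler text occupies [0x1000, 0x2000).
def is_sched_ip(ip):
    return 0x1000 <= ip < 0x2000

STACK_LO, STACK_HI = 0x9000, 0xA000

# With frame pointers: the frame at 0x9100 links to the frame at 0x9200.
with_fp = {
    0x9100: 0x9200, 0x9108: 0x1234,  # returns into scheduler code, skipped
    0x9200: 0x9300, 0x9208: 0x5678,  # returns into the sleeping caller
}
good = toy_get_wchan(with_fp, 0x9100, STACK_LO, STACK_HI, is_sched_ip)

# Without frame pointers the register is just a general-purpose value.
# Case 1: it happens to point into the stack, so unrelated data is
# misread as a return address.
without_fp = {0x9100: 0xDEADBEEF, 0x9108: 0x9ABC}
junk = toy_get_wchan(without_fp, 0x9100, STACK_LO, STACK_HI, is_sched_ip)

# Case 2: it holds a value outside the stack, so the walk gives up.
nothing = toy_get_wchan(without_fp, 0x42, STACK_LO, STACK_HI, is_sched_ip)
```

In this toy, the walk with frame pointers finds the caller's return address, while the walk without them either gives up with 0 or returns whatever value sat next to the bogus frame pointer. Whether that is what actually happens in the kernel under ORC is exactly the open question here.
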
I had originally assumed this was just a Kconfig problem and asked lkml about it (hearing crickets back) [0], but have since learned of ORC_UNWINDER's existence via the Arch kernel maintainer.

Is this an oversight of the ORC_UNWINDER implementation? It's arguably a regression to completely break wchans for tools like `ps -o wchan` and `top`, or my window manager and its separate monitoring utility. Presumably there are other tools out there sampling wchans for monitoring as well. There's also an internal use of get_wchan() in kernel/sched/fair.c for sleep profiling.

When monitoring at a high sample rate (60 Hz) on something churny like a parallel kernel or systemd build, I've occasionally seen a spurious non-zero sample come out of /proc/[pid]/wchan containing a hexadecimal address like 0xffffa9ebc181bcf8. This all smells broken. Is get_wchan() occasionally spitting out random junk here that kallsyms can't resolve because get_wchan() is completely ignorant of ORC_UNWINDER's effects?

My time to spend on this is currently very limited, but I'd like to at least make the relevant parties aware if they're not already... Maybe I should just file something in bugzilla.

Thanks,
Vito Caputo

[0] https://lore.kernel.org/lkml/20210914012612.vwlowt5wsojmyfzr@shells.gnugeneration.com/
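
For anyone who wants to check for the same symptom, the sampling described above can be sketched in Python roughly as follows. This is a minimal sketch, not the monitoring tool mentioned in the message: the function names, the three category labels, and the rule that a `0x`-prefixed hex string counts as an unresolved address are all assumptions made for illustration.

```python
import re
import time
from collections import Counter
from pathlib import Path

# Assumption: an unresolved wchan shows up as a 0x-prefixed hex string,
# like the 0xffffa9ebc181bcf8 sample quoted in the message.
_HEX_ADDR = re.compile(r"^0x[0-9a-fA-F]+$")

def classify_wchan(text):
    """Label one /proc/[pid]/wchan read as 'none', 'address', or 'symbol'."""
    s = text.strip()
    if s in ("", "0"):
        return "none"
    if _HEX_ADDR.match(s):
        return "address"
    return "symbol"

def read_wchan(pid):
    """Read /proc/<pid>/wchan, returning '' if the task is gone or unreadable."""
    try:
        return Path(f"/proc/{pid}/wchan").read_text()
    except OSError:
        return ""

def sample_wchan(pid, hz=60, seconds=1.0):
    """Sample one pid's wchan at roughly `hz` and count each category."""
    counts = Counter()
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        counts[classify_wchan(read_wchan(pid))] += 1
        time.sleep(1.0 / hz)
    return counts
```

Running `sample_wchan(pid)` against a busy process during a parallel build and seeing mostly `none` with an occasional `address` would match the behaviour described in the message; a healthy get_wchan() would be expected to yield mostly `symbol` for sleeping tasks.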