Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53878C433F5 for ; Thu, 16 Dec 2021 13:05:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237279AbhLPNFq (ORCPT ); Thu, 16 Dec 2021 08:05:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51504 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237249AbhLPNFp (ORCPT ); Thu, 16 Dec 2021 08:05:45 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 987D9C061574 for ; Thu, 16 Dec 2021 05:05:44 -0800 (PST) From: Thomas Gleixner DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1639659942; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=+gyXtx8lEPXesBkD1UtIPzCEvJTY7LppYRBflO8e+Gs=; b=Eb1r5CsvHxD65dsjmOQ7chQ1HKW1NKG8CaTmVmpoj2o63splbAAgD6Sl4P5uWNO04pUfQP Nu3Qtv4Ya2WQfEq4EbutSSqG1XbiDCzArfuVKT3rq3VtOpf3MVUCWYJmZ+yi7GHdH0UMn/ eSEova/dQlRgZwgJO2RjssE4UvCtFh36B9Jgs1dJYElipzWdX2QNkVjxTqfCl7cekuxhqE Am8o4I6ibrLxZsEsAMp5qDIdLrYqCTnN8l4XWQPAxidCxErXrRdj6baXO+qbqmioPQbVkU hiHuyqPFWZT0BsKEiYb8TPij0bQPIJkjt3rz0nksffu38J1NROx22C7xWTqYwQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1639659942; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=+gyXtx8lEPXesBkD1UtIPzCEvJTY7LppYRBflO8e+Gs=; b=U2nS3Tiwbg3EWnEExNzwMx5VntF0IaEkxixX3StiV+4OHxezgh+F/RQNf6fawlBf+rDgIt 5H7nqdEaLXUjFiBA== To: Peter Collingbourne Cc: Catalin Marinas , Will Deacon , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Daniel Bristot de Oliveira , Andy Lutomirski , Kees Cook , Andrew Morton , Masahiro Yamada , Sami Tolvanen , YiFei Zhu , Mark Rutland , Frederic Weisbecker , Viresh Kumar , Andrey Konovalov , Gabriel Krisman Bertazi , Chris Hyser , Daniel Vetter , Chris Wilson , Arnd Bergmann , Dmitry Vyukov , Christian Brauner , "Eric W. Biederman" , Alexey Gladkov , Ran Xiaokai , David Hildenbrand , Xiaofeng Cao , Cyrill Gorcunov , Thomas Cedeno , Marco Elver , Alexander Potapenko , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, Evgenii Stepanov Subject: Re: [PATCH v4 4/7] uaccess-buffer: add CONFIG_GENERIC_ENTRY support In-Reply-To: References: <20211209221545.2333249-1-pcc@google.com> <20211209221545.2333249-5-pcc@google.com> <87a6h7webt.ffs@tglx> Date: Thu, 16 Dec 2021 14:05:41 +0100 Message-ID: <87a6h0lmxm.ffs@tglx> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Peter, On Wed, Dec 15 2021 at 17:25, Peter Collingbourne wrote: > On Sat, Dec 11, 2021 at 3:50 AM Thomas Gleixner wrote: >> This restores the signal blocked mask _after_ exit_to_user_mode_loop() >> has completed, recalculates pending signals and goes out to user space >> with eventually pending signals. >> >> How is this supposed to be even remotely correct? > > Please see this paragraph from the documentation: > > When entering the kernel with a non-zero uaccess descriptor > address for a reason other than a syscall (for example, when > IPI'd due to an incoming asynchronous signal), any signals other > than ``SIGKILL`` and ``SIGSTOP`` are masked as if by calling > ``sigprocmask(SIG_SETMASK, set, NULL)`` where ``set`` has been > initialized with ``sigfillset(set)``. This is to prevent incoming > signals from interfering with uaccess logging. > > I believe that we will also go out to userspace with pending signals > when one of the signals that came in was a masked (via sigprocmask) > asynchronous signal, so this is an expected state. Believe is not part of a technical analysis, believe belongs into the realm of religion. It's a fundamental difference whether the program masks signals itself or the kernel decides to do that just because. Pending signals, which are not masked by the process, have to be delivered _before_ returning to user space. That's the expected behaviour. Period. Instrumentation which changes the semantics of the observed code is broken by definition. >> So what is this pre/post exit loop code about? Handle something which >> cannot happen in the first place? > > The pre/post_exit_loop() functions are checking UACCESS_BUFFER_ENTRY, > which is set when prctl(PR_SET_UACCESS_DESCRIPTOR_ADDR_ADDR) has been > used to set the uaccess descriptor address address to a non-zero > value. It is a different flag from UACCESS_BUFFER_EXIT. It is > certainly possible for the ENTRY flag to be set in your 2) above, > since that flag is not normally modified while inside the kernel. Let me try again. The logger is only active when: 1) PR_SET_UACCESS_DESCRIPTOR_ADDR_ADDR has set an address, which sets UACCESS_BUFFER_ENTRY 2) The task enters the kernel via syscall and the syscall entry observes UACCESS_BUFFER_ENTRY and sets UACCESS_BUFFER_EXIT because the log functions only log when UACCESS_BUFFER_EXIT is set. UACCESS_BUFFER_EXIT is cleared in the syscall exit path _before_ the exit to usermode loop is reached, which means signal delivery is _NOT_ logged at all. A non-syscall entry from user space - interrupt, exception, NMI - will _NOT_ set UACCESS_BUFFER_EXIT because it takes a different entry path. So when that non-syscall entry returns and delivers a signal then there is no logging. When the task has entered the kernel via a syscall and the kernel gets interrupted and that interruption raises a signal, then there is no signal delivery. The interrupt returns to kernel mode, which obviously does not go through exit_to_user_mode(). The raised signal is delivered when the task returns from the syscall to user mode, but that won't be logged because UACCESS_BUFFER_EXIT is already cleared before the exit to user mode loop is reached. See? >> then we have a very differrent understanding of what documentation >> should provide. > > This was intended as interface documentation, so it doesn't go into > too many details. It could certainly be improved though by referencing > the user documentation, as I mentioned above. Explanations which are required to make the code understandable have to be in the code/kernel-doc comments and not in some disjunct place. This disjunct documentation is guaranteed to be out of date within no time. Thanks, tglx