Date: Thu, 10 Mar 2016 11:57:54 +0100
From: Ingo Molnar
To: Andy Lutomirski
Cc: Linus Torvalds, Andy Lutomirski, the arch/x86 maintainers,
    Linux Kernel Mailing List, Borislav Petkov, "musl@lists.openwall.com",
    Andrew Morton, Thomas Gleixner, Peter Zijlstra
Subject: Re: [musl] Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers
Message-ID: <20160310105754.GA31225@gmail.com>
References: <06079088639eddd756e2092b735ce4a682081308.1457486598.git.luto@kernel.org>
    <20160309085631.GA3247@gmail.com> <20160309113449.GZ29662@port70.net>

* Andy Lutomirski wrote:

> Let me try to summarize my understanding of the semantics.
>
> Thread A sends thread B a signal. Thread B wants to ignore the signal and defer
> handling unless it's either in a particular syscall and returns -EINTR or unless
> the thread is about to do the syscall.

s/the syscall/an interruptible syscall/

The fundamental intention is to essentially allow the asynchronous killing
(cancellation) of pthread threads without corrupting user-space data structures
such as malloc() state.

There's a long list of system calls listed at pthreads(7) that must be
cancellation points, plus an even longer list of system calls and libc APIs
that may be cancellation points.

On glibc signal 32 (the first RT signal) is used as the cancellation signal.

But I guess you knew all this already!
So my original thinking was this:

 | What surprises me is why Musl even bothers with trying to detect system calls
 | that are about to be executed. Cancellation is a fundamentally polling-type
 | API, a very small, 2-3 instructions window to 'miss' the current system call
 | has no practical latency effect - so why does it even attempt to detect that
 | RIP range? Why doesn't Musl just check the cancellation flag (activated by
 | signal 32) and is content? Am I misunderstanding something about it?

... and when I wrote that up I realized the detail that I missed: it's a
problematic race if the thread starts a long-lived blocking system call (such as
accept()), shortly after the cancellation signal has been sent.

So the signal-32 handler _has_ to check the RIP and make sure that the system
call is not about to be executed - cancellation might be delayed indefinitely
otherwise. It's essentially needed for correctness.

Linus's suggestion to allow system calls to be more interruptible via a new SA_
flag also makes sense, but that is a latency improvement change - while the
aspect I was wondering about was a fundamental correctness detail.

So I withdraw my objection regarding AT_SYSINFO cancellation helpers. User-space
needs to have a signal-atomic way to prevent system calls from being started
after a cancellation signal has been received.

Thanks,

	Ingo
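
[Illustration, not part of the original message.] The RIP check described
above can be sketched as a pure decision function. This is only a minimal
sketch under stated assumptions: the names cancel_window, cancel_action and
cancel_decide are hypothetical and are not taken from musl, glibc or the
kernel, and the window is assumed to be the half-open address range from the
cancellation-flag test up to the address just past the syscall instruction. A
real handler would read the interrupted PC from the signal context and
redirect it to a cancellation path rather than return an enum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the syscall stub's critical window.
 * begin: first instruction of the "test cancel flag, then enter syscall" path.
 * end:   address just past the syscall instruction (exclusive bound). */
struct cancel_window {
    uintptr_t begin;
    uintptr_t end;
};

enum cancel_action {
    CANCEL_DEFER,   /* only record the request; a later cancellation point acts on it */
    CANCEL_ACT_NOW  /* the syscall is about to be entered: divert to the cancel path */
};

/* Decide what the cancellation signal handler should do, given the PC at
 * which the thread was interrupted. If the PC lies inside the window, the
 * thread has already passed (or is passing) the flag test and has not yet
 * completed the syscall instruction, so merely setting the flag would be
 * missed and the thread could block indefinitely in e.g. accept(). */
enum cancel_action cancel_decide(const struct cancel_window *w,
                                 uintptr_t interrupted_pc,
                                 bool cancel_enabled)
{
    if (!cancel_enabled)
        return CANCEL_DEFER;

    if (interrupted_pc >= w->begin && interrupted_pc < w->end)
        return CANCEL_ACT_NOW;

    return CANCEL_DEFER;
}
```

With a window of [0x1000, 0x1010), an interrupted PC of 0x1008 would yield
CANCEL_ACT_NOW, while a PC of 0x1010 (the syscall instruction has already been
passed) would yield CANCEL_DEFER and leave the decision to the normal
return-value check.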