Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933184AbcCKBPB (ORCPT ); Thu, 10 Mar 2016 20:15:01 -0500 Received: from mail-oi0-f50.google.com ([209.85.218.50]:36519 "EHLO mail-oi0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932892AbcCKBO6 (ORCPT ); Thu, 10 Mar 2016 20:14:58 -0500 MIME-Version: 1.0 In-Reply-To: <20160311004858.GS9349@brightrain.aerifal.cx> References: <06079088639eddd756e2092b735ce4a682081308.1457486598.git.luto@kernel.org> <20160309085631.GA3247@gmail.com> <20160309113449.GZ29662@port70.net> <20160310033446.GL9349@brightrain.aerifal.cx> <20160310111646.GA13102@gmail.com> <20160310164104.GM9349@brightrain.aerifal.cx> <20160310180331.GB15940@gmail.com> <20160310232819.GR9349@brightrain.aerifal.cx> <20160311001853.GA10198@port70.net> <20160311004858.GS9349@brightrain.aerifal.cx> From: Andy Lutomirski Date: Thu, 10 Mar 2016 17:14:38 -0800 Message-ID: Subject: Re: [musl] Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers To: Rich Felker Cc: Ingo Molnar , Linus Torvalds , Andy Lutomirski , "the arch/x86 maintainers" , Linux Kernel Mailing List , Borislav Petkov , "musl@lists.openwall.com" , Andrew Morton , Thomas Gleixner , Peter Zijlstra Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6666 Lines: 136 On Thu, Mar 10, 2016 at 4:48 PM, Rich Felker wrote: > On Fri, Mar 11, 2016 at 01:18:54AM +0100, Szabolcs Nagy wrote: >> * Rich Felker [2016-03-10 18:28:20 -0500]: >> > On Thu, Mar 10, 2016 at 07:03:31PM +0100, Ingo Molnar wrote: >> > > >> > > * Rich Felker wrote: >> > > >> > > > > So instead of a sticky cancellation flag, we could introduce a sticky >> > > > > cancellation signal. >> > > > > >> > > > > A 'sticky signal' is not cleared from signal_pending() when the signal handler >> > > > > executes, but it's automatically blocked so no signal handler recursion >> > > > > occurs. (A sticky signal could still be cleared via a separate mechanism, by >> > > > > the cancellation cleanup code.) >> > > > > >> > > > > Such a 'sticky cancellation signal' would, in the racy situation, cause new >> > > > > blocking system calls to immediately return with -EINTR. Non-blocking syscalls >> > > > > could still be used. (So the cancellation signal handler itself would still >> > > > > have access to various fundamental system calls.) >> > > > > >> > > > > I think this would avoid messy coupling between the kernel's increasingly more >> > > > > varied system call entry code and C libraries. >> > > > > >> > > > > Sticky signals could be requested via a new SA_ flag. >> > > > > >> > > > > What do you think? >> > > > >> > > > This still doesn't address the issue that the code making the syscall needs to >> > > > be able to control whether it's cancellable or not. Not only do some syscalls >> > > > whose public functions are cancellation points need to be used internally in >> > > > non-cancellable ways; there's also the pthread_setcancelstate interface that >> > > > allows deferring cancellation so that it's possible to call functions which are >> > > > cancellation points without invoking cancellation. >> > > >> > > I don't think there's a problem - but I might be wrong: >> > > >> > > One way I think it would work is the following: a sticky signal is not the >> > > cancellation flag - it's a helper construct to implement the flag in user-space in >> > > a race-free way. >> > > >> > > Say you have RT signal-32 as the cancellation signal, and it's a sticky signal. >> > > >> > > When pthread_cancel() wants to cancel another thread, it first (atomically) sets >> > > the desired cancel state of the target thread. If that state signals that the >> > > thread is cancellable right now, and that we initiated its cancellation, then we >> > > send signal-32. I.e. the signal only ever gets sent if the thread is in a >> > > cancellable state. >> > > >> > > libc internal functions and the pthread_setcancelstate() API can temporarily >> > > change the cancel state of a thread to non-cancellable - but pthread_cancel() >> > > observes those state transitions. >> > > >> > > The 'sticky' nature of signal-32 will make a difference in the following race >> > > condition, if the cancellation flag is checked before a system call by the C >> > > library, and signal-32 arrives before the system call is executed. In that case >> > > the 'sticky' nature of the signal makes sure that all subsequent system calls >> > > return immediately. >> > > >> > > The sticky signal is only ever sent when the thread is in cancellable state - and >> > > if the target thread notices the cancellation request before the signal arrives, >> > > it first waits for its arrival before executing any new system calls (as part of >> > > the teardown, etc.). >> > > >> > > So the C library never has to do complex work with a sticky signal pending. >> > > >> > > Does that make more sense to you? >> > >> > No, it doesn't work. Cancellability of the target thread at the time >> > of the cancellation request (when you would decide whether or not to >> > send the signal) has no relation to cancellability at the time of >> > calling the cancellation point. Consider 2 threads A and B and the >> > following sequence of events: >> > >> > 1. A has cancellation enabled >> > 2. B calls pthread_cancel(A) and sets sticky pending signal >> > 3. A disables cancellation >> > 4. A calls cancellation point and syscall wrongly gets interrupted >> > >> > This can be solved with more synchronization in pthread_cancel and >> > pthread_setcancelstate, but it seems costly. pthread_setcancelstate >> > would have to clear pending sticky cancellation signals, and any >> > internal non-cancellable syscalls would have to be made using the same >> > mechanism (effectively calling pthread_setcancelstate). A naive >> > implementation of such clearing would involve a syscall itself, >> >> i think a syscall in setcancelstate in case of pending sticky signal >> is not that bad given that cancellation is very rarely used. > > I agree, but it's not clear to me whether you could eliminate syscalls > in the case where it's not pending, since AS-safe lock machinery is > hard to get right. I don't see a way it can be done with just atomics > because the syscall that sends the signal cannot be atomic with the > memory operating setting a flag, which suggests a lock is needed, and > then there are all sorts of issues to deal with. > >> however maintaining two completely different cancellation designs >> is expensive and only the current one works on old kernels. > > Indeed. I think it would be hard to justify supporting a new one in > musl unless there's some easy way to isolate the complexity of having > both, being that vdso syscall is of marginal value to begin with > anyway... I would argue that vdso syscalls are of considerably more than marginal utility. They are vastly faster. The difference isn't subtle. However... while it seems straightforward that a pthread cancellation implementation should be correct, is there any reason that it needs to fast? After all musl could always do: if (this thread is cancellable right now) use int $0x80 and eat the performance hit else call AT_SYSINFO Aside from a branch, this adds minimal overhead to sane programs, and programs crazy enough to try use pthread cancellation get penalized on x86_32. If I read it right, that's what musl already does. But I could be reading it wrong: if ((st=(self=__pthread_self())->canceldisable) && (st==PTHREAD_CANCEL_DISABLE || nr==SYS_close)) return __syscall(nr, u, v, w, x, y, z); ... slow path ... Figuring out what "(st=(self=__pthread_self())->canceldisable) && ..." does and why requires more cross-referencing that I care to do right now. Shouldn't that "&&" at least be a "," or perhaps just be a separate statement? On glibc, at least, PTHREAD_CANCEL_DISABLE == 1, so this looks nearly tautological. --Andy