Date: Wed, 9 Mar 2016 09:56:31 +0100
From: Ingo Molnar
To: Andy Lutomirski
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Borislav Petkov,
    "musl@lists.openwall.com", Linus Torvalds, Andrew Morton,
    Thomas Gleixner, Peter Zijlstra
Subject: Re: [RFC PATCH] x86/vdso/32: Add AT_SYSINFO cancellation helpers
Message-ID: <20160309085631.GA3247@gmail.com>
In-Reply-To: <06079088639eddd756e2092b735ce4a682081308.1457486598.git.luto@kernel.org>

* Andy Lutomirski wrote:

> musl implements system call cancellation in an unusual but clever way.

So I'm sceptical about the concept.

Could someone remind me why cancellation points matter to user-space?

I know the pthread APIs and semantics that are behind it, I just don't see how it
can be truly utilized for any meaningful programmatic property: for example the
moment you add any sort of ad-hoc printf() based tracing or any other spontaneous
logging IO to your application, you add in a lot of potential cancellation points
into various places in your user-space logic ...

It's _very_ easy to add inadvertent cancellation points to the code in practice, so
using the default pthread cancellation model and relying on what is a cancellation
point is crazy and very libc dependent in general. POSIX seems to be pretty vague
about it as well.
So unless you make heavy use of pthread_setcancelstate() to explicitly mark your
work atoms, it's a really bad interface to rely on.

And if you are using pthread_setcancelstate(), instead of relying on cancellation,
then you are not really using the built-in cancellation points but have to spike
your code with pthread_testcancel(). In that case, why not just use your own
explicit 'cancellation' points in a few strategic places - which is mostly just a
simple flag really. That's what most worker thread models that I've seen use.

I suspect more complex runtimes like Java runtimes couldn't care less, so it's
really something that only libc-using C/C++ code cares about.

> When a thread issues a cancellable syscall, musl issues the syscall
> through a special thunk that looks roughly like this:
>
> cancellable_syscall:
>     test whether a cancel is queued
>     jnz cancel_me
>     int $0x80
> end_cancellable_syscall:
>
> If a pthread cancellation signal hits with
> cancellable_syscall <= EIP < end_cancellable_syscall, then the
> signal interrupted a cancellation point before the syscall in
> question started.  If so, it rewrites the calling context to skip
> the syscall and simulate a -EINTR return.  The caller will detect
> this simulated -EINTR or an actual -EINTR and handle a possible
> cancellation event.

Why is so much complexity added to avoid a ~3 instruction window where
cancellation is tested? Cancellation at work atom boundaries is a fundamentally
'polling' model anyway, and signal delivery is asynchronous, with a fundamental IPI
delay if it's cross-CPU.

> This technique doesn't work if int $0x80 is replaced by a call to
> AT_SYSINFO: the signal handler can no longer tell whether it's
> interrupting a call to AT_SYSINFO or, if it is, where AT_SYSINFO was
> called from.
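
For reference, the range check that the quoted description relies on can be
sketched in C roughly as follows. This is an illustration only, not musl's actual
code: the struct, the function names and the plain integer arguments are all
hypothetical, and a real signal handler would read and rewrite the saved
instruction pointer and return register in the ucontext it is handed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the thunk's code range. In a real libc the
 * two bounds would be the addresses of the assembly labels. */
struct cancel_window {
    uintptr_t start; /* address of cancellable_syscall */
    uintptr_t end;   /* address of end_cancellable_syscall */
};

/* True if the interrupted instruction pointer satisfies
 * start <= ip < end, i.e. the syscall has not started yet. */
static bool ip_in_cancel_window(const struct cancel_window *w, uintptr_t ip)
{
    return ip >= w->start && ip < w->end;
}

/* If the signal landed inside the window, skip the syscall by moving the
 * saved instruction pointer to the end label and simulate a -EINTR return.
 * Returns true if the context was rewritten, false if it was left alone. */
static bool abort_pending_syscall(const struct cancel_window *w,
                                  uintptr_t *saved_ip, long *saved_ret)
{
    if (!ip_in_cancel_window(w, *saved_ip))
        return false;
    *saved_ip = w->end;
    *saved_ret = -EINTR;
    return true;
}
```

The check only works because both labels live in the libc's own thunk. Once the
kernel entry happens inside the vdso instead, the saved instruction pointer no
longer falls in a range the handler knows about.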
>
> Add minimal helpers so that musl's signal handler can learn the
> status of a possible pending AT_SYSINFO invocation and, if it hasn't
> entered the kernel yet, abort it without needing to parse the vdso
> DWARF unwind data.
>
> Signed-off-by: Andy Lutomirski
> ---
>
> musl people-
>
> Does this solve your AT_SYSINFO cancellation problem?  I'd like to
> make sure it survives an actual implementation before I commit to the ABI.
>
> x86 people-
>
> Are you okay with this idea?
>
>
>  arch/x86/entry/vdso/Makefile                      |   3 +-
>  arch/x86/entry/vdso/vdso32/cancellation_helpers.c | 116 ++++++++++++++++++++++
>  arch/x86/entry/vdso/vdso32/vdso32.lds.S           |   2 +
>  tools/testing/selftests/x86/unwind_vdso.c         |  57 +++++++++--
>  4 files changed, 171 insertions(+), 7 deletions(-)
>  create mode 100644 arch/x86/entry/vdso/vdso32/cancellation_helpers.c

I'd really like to see a cost/benefit analysis here! Some before/after explanation -
exactly what is not possible today (in practical terms), what are the practical
effects of not being able to do that, and what would the bright future look like?

Thanks,

	Ingo
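
As an illustration of the explicit flag-based 'cancellation' points mentioned
earlier in this mail, here is a minimal sketch using C11 atomics. All names
(struct worker, worker_run(), the demo atom) are hypothetical and not taken from
any existing library. In a real program worker_request_cancel() would be called
from another thread. The demo atom calls it on itself only so that the example
stays single-threaded.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical worker state: a single flag that any thread may set. */
struct worker {
    atomic_bool cancel_requested;
};

static void worker_init(struct worker *w)
{
    atomic_init(&w->cancel_requested, false);
}

/* Ask the worker to stop at its next work atom boundary. */
static void worker_request_cancel(struct worker *w)
{
    atomic_store_explicit(&w->cancel_requested, true, memory_order_release);
}

static bool worker_should_stop(struct worker *w)
{
    return atomic_load_explicit(&w->cancel_requested, memory_order_acquire);
}

/* Run up to n work atoms, polling the flag only between atoms.
 * Returns the number of atoms that ran to completion. */
static size_t worker_run(struct worker *w, void (*atom)(void *), void *arg,
                         size_t n)
{
    size_t done = 0;

    while (done < n && !worker_should_stop(w)) {
        atom(arg);
        done++;
    }
    return done;
}

/* Demo work atom: counts its invocations and requests cancellation
 * once it has been called three times. */
struct demo {
    struct worker w;
    int calls;
};

static void demo_atom(void *arg)
{
    struct demo *d = arg;

    if (++d->calls == 3)
        worker_request_cancel(&d->w);
}
```

The flag is checked only at atom boundaries, so an atom that has started always
runs to completion and never has to clean up after being interrupted halfway.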