Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1750940AbcCHTLh (ORCPT ); Tue, 8 Mar 2016 14:11:37 -0500
Received: from mail-ob0-f171.google.com ([209.85.214.171]:35589 "EHLO
        mail-ob0-f171.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1750747AbcCHTLa (ORCPT );
        Tue, 8 Mar 2016 14:11:30 -0500
MIME-Version: 1.0
In-Reply-To: <56DF20FE.4040607@zytor.com>
References: <3cc149b4ce9a108a092d816c5158808c62c557f0.1457285880.git.luto@kernel.org>
        <20160307082228.GA11026@gmail.com>
        <85B7C74C-3B32-44D1-90FE-352097F0A627@zytor.com>
        <20160308103004.GB5407@gmail.com>
        <56DF1C98.6030506@zytor.com>
        <56DF1E4F.6050809@zytor.com>
        <56DF20FE.4040607@zytor.com>
From: Andy Lutomirski
Date: Tue, 8 Mar 2016 11:11:09 -0800
Message-ID:
Subject: Re: [PATCH] x86/entry: Improve system call entry comments
To: "H. Peter Anvin"
Cc: Ingo Molnar, Andy Lutomirski, X86 ML,
        "linux-kernel@vger.kernel.org", Borislav Petkov, Oleg Nesterov,
        Andrew Cooper, Brian Gerst, Linus Torvalds, Andrew Morton,
        Peter Zijlstra, Thomas Gleixner
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1825
Lines: 46

On Tue, Mar 8, 2016 at 10:59 AM, H. Peter Anvin wrote:
> On 03/08/16 10:50, Andy Lutomirski wrote:
>> On Tue, Mar 8, 2016 at 10:47 AM, H. Peter Anvin wrote:
>>> On 03/08/16 10:45, Andy Lutomirski wrote:
>>>>
>>>> s/modern/most, perhaps?
>>>>
>>>> I'm hoping that some day Bionic goes away and gets replaced by musl.
>>>>
>>>> Of course, musl doesn't always use fast syscalls because it needs a
>>>> vdso facility that doesn't currently exist.  I'll deal with that
>>>> eventually.
>>>>
>>>
>>> You don't actually need actual DSO support to support fast system calls
>>> on i386.  Even klibc uses them now, and the additional code to support
>>> it is trivial.
>>
>> That's not the issue.
>> The issue is that musl does something crazy^Wclever to support POSIX
>> pthread cancellation, and it involves being able to tell whether a
>> signal's ucontext points to a syscall and, if so, what the return
>> address is.  This is straightforward with an inlined int $0x80, but
>> doing it reliably with the current vdso design would require parsing
>> the DWARF data, and I can't really blame musl for not wanting to do
>> that.
>>
>> There was a thread a while back about adding a new vdso helper to do
>> this.  I think I even had some code for it.  If I find time, I'll try
>> to send patches for 4.7.
>>
>
> As far as I know, when we get a signal the EIP always points to int
> $0x80 as we don't support system call restart (being a rare case) for
> the fast system calls.
>

We actually fully support system call restart on fast syscalls as of
(IIRC) 4.5, even on AMD.  Phew!

However, the nasty case for musl is when the cancellation signal
happens immediately before the actual kernel entry.  The signal
handler needs some way to detect whether the thread is at a
cancellation point.

--Andy
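
A minimal sketch of the kind of check discussed above, assuming the
libc knows the address range of its own cancellable syscall stub.  The
names cancel_window and pc_in_cancel_window are hypothetical and are
not taken from musl or the kernel.  Pulling the saved program counter
out of the signal's ucontext is architecture-specific and is left out
here.  With an inlined int $0x80 the two bounds are ordinary labels in
libc's own code; with the vdso the entry instruction lives in
kernel-provided code, so libc has no such labels to compare against.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical description of a cancellable syscall stub:
 *   begin - first instruction of the stub (before the cancel flag is tested)
 *   end   - address just past the kernel-entry instruction
 */
struct cancel_window {
    uintptr_t begin;
    uintptr_t end;
};

/*
 * Return true if the interrupted program counter lies in [begin, end),
 * i.e. the thread was about to enter the kernel (or was restarted at the
 * entry instruction) and has not yet consumed a syscall result.
 * A signal handler could act on cancellation only in that case.
 */
static bool pc_in_cancel_window(const struct cancel_window *w, uintptr_t pc)
{
    assert(w->begin <= w->end);   /* sanity check on the stub layout */
    return pc >= w->begin && pc < w->end;
}
```

A handler built on this would read the saved PC from its ucontext
argument, call pc_in_cancel_window(), and redirect the thread to its
cancellation path only when the result is true.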