From: Andy Lutomirski
Date: Thu, 13 Aug 2015 08:38:34 -0700
Subject: Re: [regression] x86/signal/64: Fix SS handling for signals delivered to 64-bit programs breaks dosemu
To: Stas Sergeev
Cc: Ingo Molnar, X86 ML, Linux kernel, Linus Torvalds, "H. Peter Anvin", Thomas Gleixner, Brian Gerst, Borislav Petkov, Stas Sergeev

On Thu, Aug 13, 2015 at 8:22 AM, Stas Sergeev wrote:
> 13.08.2015 17:58, Andy Lutomirski wrote:
>
>> On Thu, Aug 13, 2015 at 5:44 AM, Stas Sergeev wrote:
>>>
>>> 13.08.2015 11:39, Ingo Molnar wrote:
>>>>
>>>> * Andy Lutomirski wrote:
>>>>
>>>>>> OK.
>>>>>> I'll try to test the patch tomorrow, but I think the sigreturn()'s
>>>>>> capability detection is still needed to easily replace the iret
>>>>>> trampoline in userspace (without generating a signal and testing
>>>>>> by hand).
>>>>>> Can of course be done with a run-time kernel version check...
>>>>>
>>>>> That feature is so specialized that I think you should just probe it.
>>>>>
>>>>> void foo(...)
>>>>> {
>>>>>     sigcontext->ss = 7;
>>>>> }
>>>>>
>>>>> modify_ldt(initialize descriptor 0);
>>>>> sigaction(SIGUSR1, foo, SA_SIGINFO);
>>>>> if (ss == 7)
>>>>>     yay;
>>>>>
>>>>> Fortunately, all kernels that restore ss also have espfix64, so you
>>>>> don't need to worry about esp[31:16] corruption on those kernels
>>>>> either.
>>>>>
>>>>> I suppose we could add a new uc_flag to indicate that ss is saved and
>>>>> restored, though.  Ingo, hpa: any thoughts on that?  There will always
>>>>> be some kernel versions that save and restore ss but don't set the
>>>>> flag, though.
>>>>
>>>> So this new flag would essentially be a 'the ss save/restore bug is
>>>> fixed for sure' flag, not covering old kernels that happen to have the
>>>> correct behavior, right?
>>>>
>>>> Could you please map out the range of kernel versions involved - which
>>>> ones:
>>>>
>>>>  - 'never do the right thing'
>>>>  - 'do the right thing sometimes'
>>>>  - 'do the right thing always, but by accident'
>>>>  - 'do the right thing always and intentionally'
>>>>
>>>> ?
>>>>
>>>> I'd hate to complicate a legacy ABI any more. My gut feeling is to let
>>>> apps either assume that the kernel works right, or probe the actual
>>>> behavior. Adding the flag just makes it easy to screw certain kernel
>>>> versions that would still work fine if the app used actual probing.
>>>> So I don't see the flag as an improvement.
>>>>
>>>> If your patch fixes the regression that would be a good first step.
>>>
>>> I've tested the patch.
>>> It doesn't fix the problem.
>>> It allows dosemu to save the ss the old way, but,
>>> because dosemu doesn't save it to the sigreturn()'s-expected
>>> place (sigcontext.__pad0), it crashes on sigreturn().
>>>
>>> So the problem can't be fixed this way --> NACK to the patch.
>>>
>>> I may be unavailable for further testing till next week.
>>
>> I'm still fighting with getting DOSEMU to run at all in my VM.
>>
>> I must be missing something.
>> What ends up in ss/__pad0?  Wouldn't it
>> contain whatever signal delivery put there (i.e. some valid ss value)?
>
> The crash happens when the DOS program terminates.
> At that point dosemu subverts the execution flow by
> replacing segregs and cs/ip ss/sp in sigcontext with its own.
> But __pad0 still has the DOS SS, which crashes because (presumably)
> the DOS LDT has just been removed.

That's unfortunate.

I don't really know what to do about this.  My stupid heuristic for
signal delivery seems unlikely to cause problems, but I'm not coming
up with a great heuristic for detecting when a program that *modifies*
sigcontext hasn't set all the fields.  Even adding a flag won't really
help here, since DOSEMU won't know to manipulate the flag.

Ingo, here's the situation, assuming I remember the versions right:

v4.0 and before: If we try to deliver a signal while SS is bad, we
fail and the process dies.  If SS is good but nonstandard, we end up
in the signal handler with whatever SS value was loaded when the
signal was sent.  We do *not* put SS anywhere in the sigcontext, so
the only way for a program to figure out what SS was is to look at the
HW state before making any syscalls.  We also don't even try to
restore SS, so SS is unconditionally set to __USER_DS on sigreturn,
necessitating nasty workarounds (and breaking all kinds of test cases).

v4.1 and current -linus: We always set SS to __USER_DS when delivering
a signal.  We save the old SS in the sigcontext and restore it, just
like 32-bit signals.

My patch: We leave SS alone when delivering a signal, unless it's
invalid, in which case we replace it with __USER_DS.  We still save
the old SS in the sigcontext and restore it on return.

Apparently the remaining regression is that DOSEMU doesn't realize
that SS is saved so, when it tries to return to full 64-bit mode after
a signal that hit in 16-bit mode, it fails because it's invalidated
the old SS descriptor in the meantime.

So... what do we do about it?  We could revert the whole mess.
We could tell everyone to fix their DOSEMU, which violates policy and is
especially annoying given how much effort we've put into keeping
16-bit mode fully functional lately.  We could add yet more heuristics
and teach sigreturn to ignore the saved SS value in sigcontext if the
saved CS is 64-bit and the saved SS is unusable.

--Andy

--
Andy Lutomirski
AMA Capital Management, LLC
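
For reference, the three signal-delivery behaviours listed above and the
proposed sigreturn heuristic can be written down as a small toy model in
C.  This is only a sketch, not kernel code: the enum, struct and function
names are hypothetical, the boolean "valid"/"usable" flags stand in for
the real descriptor checks, 0x2b is assumed to be the x86-64 __USER_DS
selector, and falling back to __USER_DS when the saved SS is ignored is
an assumption (the mail only says the saved value would be ignored).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the x86-64 __USER_DS selector; illustrative only. */
#define USER_DS 0x2b

/* Hypothetical names for the three behaviours described in the mail. */
enum ss_policy {
    POLICY_V4_0,   /* v4.0 and before */
    POLICY_V4_1,   /* v4.1 and current -linus */
    POLICY_PATCH   /* the patch under discussion */
};

struct delivery {
    bool     delivered;   /* false: delivery fails and the process dies */
    uint16_t handler_ss;  /* SS the handler runs with (only if delivered) */
    bool     ss_saved;    /* whether the old SS is stored in sigcontext */
};

/* Model of what happens to SS when a signal is delivered. */
struct delivery deliver_signal(enum ss_policy policy, uint16_t ss, bool ss_valid)
{
    struct delivery d = { .delivered = true, .handler_ss = ss, .ss_saved = true };

    switch (policy) {
    case POLICY_V4_0:
        d.delivered = ss_valid;   /* bad SS: the process dies */
        d.ss_saved = false;       /* SS is not put in sigcontext */
        break;
    case POLICY_V4_1:
        d.handler_ss = USER_DS;   /* always reset for the handler */
        break;
    case POLICY_PATCH:
        if (!ss_valid)
            d.handler_ss = USER_DS;   /* only replace an invalid SS */
        break;
    default:
        assert(!"unknown policy");
    }
    return d;
}

/* Model of the SS value in effect after sigreturn. */
uint16_t sigreturn_ss(enum ss_policy policy, uint16_t saved_ss)
{
    /* v4.0 never restores SS; the later behaviours restore the saved value. */
    return policy == POLICY_V4_0 ? USER_DS : saved_ss;
}

/* Proposed extra heuristic: ignore an unusable saved SS for 64-bit CS. */
uint16_t sigreturn_ss_heuristic(uint16_t saved_ss, bool saved_ss_usable,
                                bool saved_cs_is_64bit)
{
    if (saved_cs_is_64bit && !saved_ss_usable)
        return USER_DS;   /* assumed fallback */
    return saved_ss;
}
```

In this model the DOSEMU failure corresponds to sigreturn_ss() handing
back a saved selector whose descriptor has already been removed, whereas
sigreturn_ss_heuristic() with saved_ss_usable == false and
saved_cs_is_64bit == true would return USER_DS instead.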