From: Andy Lutomirski
Date: Thu, 30 Mar 2017 11:23:15 -0700
Subject: Re: syscall_get_error() && TS_ checks
To: Linus Torvalds
Cc: Oleg Nesterov, Andrew Morton, Andy Lutomirski, Denys Vlasenko,
    "H. Peter Anvin", Ingo Molnar, Jan Kratochvil, Pedro Alves,
    Thomas Gleixner, "the arch/x86 maintainers",
    Linux Kernel Mailing List
References: <20170328145413.GA3164@redhat.com>
    <20170329163335.GA23977@redhat.com>
    <20170329165554.GA24250@redhat.com>
    <20170329170442.GA24342@redhat.com>
    <20170329185041.GA24806@redhat.com>
    <20170330135100.GA25882@redhat.com>
    <20170330154902.GA27416@redhat.com>

On Thu, Mar 30, 2017 at 10:46 AM, Linus Torvalds wrote:
> For example, let's assume that %eax contains a 32-bit pointer with the
> high bit set, and we're using a 32-bit debugger on a 32-bit program
> (ie you're just running a 32-bit distro on a 64-bit kernel, which
> people have definitely done).
>
> We *really* shouldn't sign-extend that value if the debugger ends up
> updating the pointer (or maybe the debugger just reloads previous
> values, not really "updating" anything - I think that's what gdb does
> when you do a call within the context of the debugged program from
> within gdb, for example)

Can you think of a case where this would actually matter?

>
> So I really *really* don't think you can just sign-extend %eax.
> Which is exactly why we have that nasty odd sign-extension in the signal
> path instead, but then have to make it conditional on running a 32-bit
> program.
>
> But maybe there is still something I'm not understanding in your
> argument.

This thread has been a series of misunderstandings. As the daft
kernel hacker who introduced TS_I386_REGS_POKED in the first place,
I'll try to explain what I think is going on.

TS_I386_REGS_POKED is an enormous kludge, and it serves two purposes.
It avoids a potential security bug that the old code had, and it at
least documents the code paths that are thoroughly broken. (Before,
they checked TS_COMPAT instead, but most of the TS_COMPAT users are
fine.) It's used in two places:

--- issue 1 ---

get_nr_restart_syscall() does:

        if (current->thread.status & (TS_COMPAT|TS_I386_REGS_POKED))
                return __NR_ia32_restart_syscall;

This is very, very buggy. Fixing this appears to require some
surgery. Proposals include adding new restart_syscall numbers that
match across 32-bit and 64-bit (interacts quite awkwardly with
seccomp) or trying to store syscall bitness along with restart_block
(ick, not actually 100% reliable depending on just how abusive the
debugger is).

--- issue 2 ---

syscall_get_error(). This is available on all arches, but it appears
to be used *only* on x86. It's used to figure out whether we're
restarting a syscall. It could plausibly matter if we have a buggy
compat syscall that returns int instead of long, but the main purpose
is for compatibility with 32-bit debuggers.

Neither Oleg nor I have thought of anything other than this code path
that cares at all about the high bits of RAX on a process that's being
poked using 32-bit ptrace. It seems to me that sign-extending RAX
would get rid of this code path entirely.

--Andy
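
To make the point in issue 2 about the high bits of RAX concrete, here
is a minimal user-space sketch, not kernel code. The helper names
(zero_extend_eax, sign_extend_eax, looks_like_error) are hypothetical
and exist only for this illustration; ERESTARTSYS and MAX_ERRNO are
redefined locally, on the assumption that they mirror the kernel's
values of 512 and 4095.

```c
#include <stdint.h>

/* Assumed to mirror the kernel-internal values; redefined here so the
 * sketch builds in user space. */
#define ERESTARTSYS 512
#define MAX_ERRNO   4095

/* Hypothetical: how a 32-bit eax value poked by a 32-bit debugger
 * would land in the 64-bit register if the high half is zeroed. */
int64_t zero_extend_eax(uint32_t eax)
{
        return (int64_t)(uint64_t)eax;
}

/* Hypothetical: the same value with bit 31 copied into bits 32..63.
 * Written with arithmetic so it does not rely on implementation-defined
 * unsigned-to-signed conversion. */
int64_t sign_extend_eax(uint32_t eax)
{
        if (eax >= 0x80000000u)
                return (int64_t)eax - INT64_C(4294967296);
        return (int64_t)eax;
}

/* Hypothetical stand-in for an "is this a negative errno?" test done
 * on the full 64-bit register. */
int looks_like_error(int64_t rax)
{
        return rax < 0 && rax >= -(int64_t)MAX_ERRNO;
}
```

With these definitions, a debugger writing -ERESTARTSYS through a
32-bit interface produces 0xfffffe00. Zero-extended, that is a large
positive number and looks_like_error() rejects it; sign-extended, it is
-512 and is recognized. A 32-bit pointer with the high bit set, such as
0x80001000, keeps the same low 32 bits after sign-extension and is not
mistaken for an errno, which is why a 32-bit process should not be able
to observe the difference.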