From: Andy Lutomirski
Date: Tue, 28 Mar 2017 10:10:30 -0700
Subject: Re: [PATCH 1/1] get_nr_restart_syscall() should return __NR_ia32_restart_syscall if __USER32_CS
To: Oleg Nesterov
Cc: Andy Lutomirski, Andrew Morton, Linus Torvalds, Denys Vlasenko, "H. Peter Anvin", Ingo Molnar, Jan Kratochvil, Pedro Alves, Thomas Gleixner, X86 ML, "linux-kernel@vger.kernel.org"
In-Reply-To: <20170328162736.GA3983@redhat.com>
References: <20170328145413.GA3164@redhat.com> <20170328145432.GA3163@redhat.com> <20170328162736.GA3983@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 28, 2017 at 9:27 AM, Oleg Nesterov wrote:
> On 03/28, Andy Lutomirski wrote:
>>
>> On Tue, Mar 28, 2017 at 7:54 AM, Oleg Nesterov wrote:
>> > get_nr_restart_syscall() checks TS_I386_REGS_POKED but this bit is only
>> > set if debugger is 32-bit. If a 64-bit debugger restores the registers
>> > of a 32-bit debugee outside of syscall exit path get_nr_restart_syscall()
>> > wrongly returns __NR_restart_syscall.
>>
>> I had sent a patch that introduced a new syscall nr, but it's not
>> quite safe because it could break seccomp-using programs.
>
> Ah, indeed...

This is, in theory, solvable. It would be ugly and would pollute seccomp a bit.

>
>> But your
>> patch here is also screwy.
>
> Yes, yes, it doesn't try to solve all possible problems, I even mentioned
> this in the changelog.
>
>> How about we store the syscall arch to be restored in task_struct
>> along with restart_block?
>
> Yes, perhaps we will have to finally do this. Not really nice too.
>
>> the way there without heuristics as nasty as yours.
>
> I agree it will be better, but I refuse to treat them as mine checks ;) :)
>
>> P.S. __USER32_CS is the wrong check even if we used your approach.
>> user_64bit_regs() is much better.
>
> Yes, thanks. If only I understood what cs == pv_info.extra_user_64bit_cs
> actually means...
>

It means that, if Linux is a Xen PV guest, the GDT contains a bunch of
entries supplied by Xen and outside of Linux's control, and one of
those entries is a 64-bit DPL=3 code segment.

On the one hand, it's annoying. On the other hand, it serves a real
purpose performance-wise.

--Andy
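
To illustrate the check discussed above (deciding whether a saved CS selector
names a 64-bit user code segment, including an extra one supplied by the
hypervisor), here is a minimal user-space sketch in C. It is not kernel code:
struct saved_regs, is_64bit_user_cs() and the extra_user_64bit_cs parameter
are hypothetical stand-ins, the selector values 0x33 and 0x23 are assumed to
match the usual x86-64 Linux __USER_CS and __USER32_CS, and 0 is used here to
mean "no extra selector".

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed native x86-64 Linux user code selectors (GDT index * 8 + RPL 3). */
#define USER_CS   0x33  /* 64-bit user code segment */
#define USER32_CS 0x23  /* 32-bit (compat) user code segment */

/* Hypothetical stand-in for the saved register frame; only CS matters here. */
struct saved_regs {
    uint16_t cs;
};

/*
 * Return true if the saved CS names a 64-bit user code segment.
 *
 * extra_user_64bit_cs models the additional hypervisor-provided selector
 * described in the mail; pass 0 when no such selector exists (assumption
 * of this sketch).
 */
static bool is_64bit_user_cs(const struct saved_regs *regs,
                             uint16_t extra_user_64bit_cs)
{
    if (regs->cs == USER_CS)
        return true;

    /* Only consult the extra selector when one has been provided. */
    return extra_user_64bit_cs != 0 && regs->cs == extra_user_64bit_cs;
}
```

For instance, with an extra selector of 0xe033 (assumed here to be the value a
Xen PV guest would see), a saved CS of 0xe033 is classified as 64-bit while
0x23 is not. A comparison against the native 64-bit selector alone would
misclassify that task.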