Date: Mon, 26 Mar 2018 04:47:50 +0100
From: Al Viro
To: Ingo Molnar
Cc: Linus Torvalds, Dominik Brodowski, Linux Kernel Mailing List,
	Arnd Bergmann, linux-arch, Ralf Baechle, James Hogan, linux-mips,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, ppc-dev,
	Martin Schwidefsky, Heiko Carstens, linux-s390, David S. Miller,
	sparclinux@vger.kernel.org, Ingo Molnar, Jiri Slaby,
	the arch/x86 maintainers
Subject: Re: [RFC] new SYSCALL_DEFINE/COMPAT_SYSCALL_DEFINE wrappers
Message-ID: <20180326034750.GN30522@ZenIV.linux.org.uk>
References: <20180318161056.5377-1-linux@dominikbrodowski.net>
	<20180318161056.5377-5-linux@dominikbrodowski.net>
	<20180318174014.GR30522@ZenIV.linux.org.uk>
	<20180318181848.GU30522@ZenIV.linux.org.uk>
	<20180319042300.GW30522@ZenIV.linux.org.uk>
	<20180319092920.tbh2xwkruegshzqe@gmail.com>
	<20180319232342.GX30522@ZenIV.linux.org.uk>
	<20180322001532.GA18399@ZenIV.linux.org.uk>
	<20180326004017.GA2211@ZenIV.linux.org.uk>
In-Reply-To: <20180326004017.GA2211@ZenIV.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 26, 2018 at 01:40:17AM +0100, Al Viro wrote:

> Kinda-sorta part:
> 	* asmlinkage_protect is taken out for now, so m68k has problems.
> 	* syscalls that run out of 6 slots barf violently.  For mips it's
> wrong (there we have 8 slots); for stuff like arm and ppc it's right, but
> it means that things like e.g. compat sync_file_range() should not even
> be compiled on those.  __ARCH_WANT_SYS_SYNC_FILE_RANGE, presumably...
> In any case, we *can't* do pt_regs-based wrappers for those syscalls on
> such architectures, so ifdefs around those puppies are probably the right
> thing to do.
> 	* s390 macrology in compat_wrapper.c not even touched; it needs
> a trivial update to keep working (__MAP callbacks take an extra argument,
> unused for those users).
> 	* sys_... and compat_sys_... aliases are unchanged; if we kill
> direct callers, we can trivially rename SyS##name and compat_SyS##name
> to sys##name and compat_sys##name and get rid of aliases.

	* mips n32 and x86 x32 can become an extra source of headache.
That actually applies to any plans of passing struct pt_regs *.  As it
is, e.g. syscall 515 on amd64 is compat_sys_readv().  Dispatched via
this:

	/*
	 * NB: Native and x32 syscalls are dispatched from the same
	 * table.  The only functional difference is the x32 bit in
	 * regs->orig_ax, which changes the behavior of some syscalls.
	 */
	if (likely((nr & __SYSCALL_MASK) < NR_syscalls)) {
		nr = array_index_nospec(nr & __SYSCALL_MASK, NR_syscalls);
		regs->ax = sys_call_table[nr](
			regs->di, regs->si, regs->dx,
			regs->r10, regs->r8, regs->r9);
	}

Now, syscall 145 via 32bit call is *also* compat_sys_readv(), dispatched
via

	nr = array_index_nospec(nr, IA32_NR_syscalls);
	/*
	 * It's possible that a 32-bit syscall implementation
	 * takes a 64-bit parameter but nonetheless assumes that
	 * the high bits are zero.  Make sure we zero-extend all
	 * of the args.
	 */
	regs->ax = ia32_sys_call_table[nr](
		(unsigned int)regs->bx, (unsigned int)regs->cx,
		(unsigned int)regs->dx, (unsigned int)regs->si,
		(unsigned int)regs->di, (unsigned int)regs->bp);

Right now it works - we call the same function, passing it arguments
picked from different sets of registers (di/si/dx in x32 case, bx/cx/dx
in i386 one).  But if we switch to passing struct pt_regs * and have the
wrapper fetch regs->{bx,cx,dx}, we have a problem.  It won't work for
both entry points.

IMO it's a good reason to have dispatcher(s) handle extraction from
pt_regs and let the wrapper deal with the resulting 6 u64 or 6 u32,
normalizing them and arranging them into arguments expected by syscall
body.

Linus, Dominik - how do you plan to deal with that fun?  Regardless of
the way we generate the glue, the issue remains.  We can't get the same
struct pt_regs *-taking function for both; we either need to produce
a separate chunk of glue for each compat_sys_... involved (either making
COMPAT_SYSCALL_DEFINE generate both, or having duplicate X32_SYSCALL_DEFINE
for each of those COMPAT_SYSCALL_DEFINE - with identical body, at that)
or we need to have the registers-to-slots mapping done in dispatcher...
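
To make the two options concrete, here is a rough userspace sketch, not
kernel code.  Everything in it is an assumption made for illustration:
struct pt_regs is a cut-down mock, and body_readv(), wrap_regs_ia32(),
wrap_slots(), dispatch_x32(), dispatch_ia32() and check_entry_points()
are invented names that exist nowhere in the tree.  It only shows that a
wrapper hard-wired to regs->{bx,cx,dx} picks up junk when reached from the
x32 entry, while a slots-taking wrapper fed by per-entry dispatchers gives
the same answer from both.

```c
#include <assert.h>
#include <stdint.h>

/* Mock register frame (hypothetical): only the fields used below. */
struct pt_regs {
	uint64_t bx, cx, dx, si, di, bp, r10, r8, r9;
};

/* Stand-in for a syscall body; the result encodes the argument order. */
long body_readv(unsigned long fd, unsigned long vec, unsigned long vlen)
{
	return (long)(fd * 10000 + vec * 100 + vlen);
}

/* Option 1: pt_regs-taking wrapper, hard-wired to the i386 convention. */
long wrap_regs_ia32(const struct pt_regs *regs)
{
	return body_readv((uint32_t)regs->bx, (uint32_t)regs->cx,
			  (uint32_t)regs->dx);
}

/* Option 2: wrapper takes 6 slots and only normalizes them. */
typedef long (*slot_fn)(uint64_t, uint64_t, uint64_t,
			uint64_t, uint64_t, uint64_t);

long wrap_slots(uint64_t a, uint64_t b, uint64_t c,
		uint64_t d, uint64_t e, uint64_t f)
{
	(void)d; (void)e; (void)f;	/* unused by this 3-argument body */
	return body_readv((uint32_t)a, (uint32_t)b, (uint32_t)c);
}

/* Registers-to-slots mapping done in the dispatcher, one per entry. */
long dispatch_x32(const struct pt_regs *regs, slot_fn fn)
{
	return fn(regs->di, regs->si, regs->dx,
		  regs->r10, regs->r8, regs->r9);
}

long dispatch_ia32(const struct pt_regs *regs, slot_fn fn)
{
	return fn((uint32_t)regs->bx, (uint32_t)regs->cx,
		  (uint32_t)regs->dx, (uint32_t)regs->si,
		  (uint32_t)regs->di, (uint32_t)regs->bp);
}

/* Same logical call (fd=3, vec=7, vlen=2) arriving via both entries. */
int check_entry_points(void)
{
	/* x32 entry: arguments in di/si/dx, bx/cx hold unrelated values */
	struct pt_regs x32 = { .bx = 99, .cx = 98, .dx = 2, .si = 7, .di = 3 };
	/* i386 entry: arguments in bx/cx/dx */
	struct pt_regs ia32 = { .bx = 3, .cx = 7, .dx = 2 };
	long want = body_readv(3, 7, 2);

	assert(dispatch_x32(&x32, wrap_slots) == want);
	assert(dispatch_ia32(&ia32, wrap_slots) == want);
	assert(wrap_regs_ia32(&ia32) == want);
	/* the pt_regs-taking wrapper reads the wrong registers for x32 */
	return wrap_regs_ia32(&x32) != want;
}
```

check_entry_points() returns nonzero when the hard-wired wrapper misreads
the x32 frame, which is the failure described above.  The other way out
(a second, x32-specific pt_regs-taking wrapper per compat syscall) would
amount to adding a wrap_regs_x32() that reads di/si/dx and calls the same
body.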