Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp2007575imu; Fri, 23 Nov 2018 03:27:04 -0800 (PST) X-Google-Smtp-Source: AFSGD/VDB+6rVM0HPulYaPJGUt1YiZXcdNVhb/qnD9xRpjfMcmdEjtYbKYNGv+dDdpk1h/Fw9tjJ X-Received: by 2002:a17:902:2d81:: with SMTP id p1-v6mr14871683plb.97.1542972424935; Fri, 23 Nov 2018 03:27:04 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1542972424; cv=none; d=google.com; s=arc-20160816; b=VxtJ/kRXNzRmz/9AURxSIOPPYP3QmRvW4gRhlPYWpW3YffNBDnmE399rVGTXk6XAmn PQz7aCd57aN7b9zI0U69dU7zRwdy7mwx7VQpiunRA0coXC8hkrsiJaDBM77DXQcSx8M0 tYSOlLQ65Np79MLAfShBqATv49V63WR+WarfO/C0kU7RIulHrUJEv/SdEfA4WP1YQj58 WD3TbvAKjf1rSWm/Actip8OpnQVM/sJsj32j+GLwig0QclPlmqEju6Vaam7jogFeo9dk 99tcaCj7PSuMN+qeEptOQ1giGAFe18pBCSrZaFh4rCzZo+MvDPBl9mhdGsDIJRuNQT1n se4Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=l57SLsSzQbtZugTE6jJy9K6Pf6qm6kn/+UlrybrpMvc=; b=0UX76qEOFkMfHbcb59MvwGpc0ukve4iDPX1h2AeQ3ASJvP9p2iEaBI0ON6gV/E1/nZ 7YS0oS9fIFpAeJOi7gQwnEO+PWISpiFNNF3v8s9f25XDFE+ZT14TXTo7QAZbuk+E9PoG kEPzFr+N4gzAFA8zNS8qxhzPd7d3TkIrIk8O3TiTVwN7Yjfk95EsR9NltfIcEhBjahIo XpbLkREvMlqjF027jCTZN4NMAFyr2M55FxJ7p1DetoYRjCpn/HML6kQEc6YjOJWlqQ62 fLUn8di4BuHHev4kehZ4r6oUyE4zqRT1SGhCuBloPO+ZeztrFNBU1KB/VhaDD/YbKnmm 6KRQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=g5IcbZcI; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y6si3876445pfi.228.2018.11.23.03.26.42; Fri, 23 Nov 2018 03:27:04 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=g5IcbZcI; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2405220AbeKVVL0 (ORCPT + 99 others); Thu, 22 Nov 2018 16:11:26 -0500 Received: from mail-wr1-f65.google.com ([209.85.221.65]:35810 "EHLO mail-wr1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729651AbeKVVLZ (ORCPT ); Thu, 22 Nov 2018 16:11:25 -0500 Received: by mail-wr1-f65.google.com with SMTP id 96so8735576wrb.2 for ; Thu, 22 Nov 2018 02:32:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=l57SLsSzQbtZugTE6jJy9K6Pf6qm6kn/+UlrybrpMvc=; b=g5IcbZcIxAjJ7yohPo9NftdJ8iKrGi3KYhrE3Pc0akSG5WPfxLGVHXxCHJDhexd9uq +sDt5I50KeRGPvWhTNakvF9FlLkTv+J8gkVzuq0lY+UGeYdFgAncPSEKp/m31rHRHT9D b1cydph5TaBqOTOzSC9tTrX7KYOxcX3XI33monPUOKgR+OGZvhVyXsfgWAvEVOV4Mt2c GVx/I5LQzYOvlyhvTiHRELZvTXeR3Ml1FvzDQYZkitrPTTLyIEBPS7s5kcRQeLXHnGqm n7aMQ1s15AFbEvows0YrInFjfaHDq+Xq1VoNNBmtLuG63Z/daNMNInMDlzCCiF3Fp8tF UXkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=l57SLsSzQbtZugTE6jJy9K6Pf6qm6kn/+UlrybrpMvc=; b=rSkv6Xnsq7fIyRhJwET6ljtFcEz41E6JFXY4bcDWPzRk7tEsSUIyJ9GTDmDw287OY1 iyYC4hOQ1cIK0vE44vKIue1jQNk4FF2awPWBQ3+r1IQSdIisGgUpNrolpFo5LTFyS6By 3ad5tLBNO6PDD+u3xeWIw/4fxJcqn+NlfIXUiqdWNCO9J++l4j8o19AQrmdzNsXNa6+b IBYLsf28VndzMEJ78cRfLrhyYgpSFecpneP/3IvdWqvZuEn/WXcQ2YUhYYe1lExpjqqE 03TKZ+Wtzt9f0L66X3JW3t8dV70bBmR2/HCaEE7YiBTotSYkBK5vs1ppPoQ3IbnheZvs 8KGw== X-Gm-Message-State: AA+aEWbEAvoit98mQvJAkdm1tlL63O2IcjBRKn8vT1ofNF5ikO5B/Qtg l98O3Tzrx8hyn4pk/uTmXPo= X-Received: by 2002:adf:a50c:: with SMTP id i12mr8733250wrb.220.1542882754531; Thu, 22 Nov 2018 02:32:34 -0800 (PST) Received: from gmail.com (2E8B0CD5.catv.pool.telekom.hu. [46.139.12.213]) by smtp.gmail.com with ESMTPSA id b22sm2849927wmj.9.2018.11.22.02.32.32 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 22 Nov 2018 02:32:33 -0800 (PST) Date: Thu, 22 Nov 2018 11:32:31 +0100 From: Ingo Molnar To: Linus Torvalds Cc: pabeni@redhat.com, Jens Axboe , Thomas Gleixner , Ingo Molnar , bp@alien8.de, Peter Anvin , the arch/x86 maintainers , Andrew Morton , Andrew Lutomirski , Peter Zijlstra , dvlasenk@redhat.com, brgerst@gmail.com, Linux List Kernel Mailing Subject: Re: [PATCH] x86: only use ERMS for user copies for larger sizes Message-ID: <20181122103231.GA102790@gmail.com> References: <02bfc577-32a5-66be-64bf-d476b7d447d2@kernel.dk> <20181121063609.GA109082@gmail.com> <48e27a3a-2bb2-ff41-3512-8aeb3fd59e57@kernel.dk> <1c22125bb5d22c2dcd686d0d3b390f115894f746.camel@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.4 (2018-02-28) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org * Linus Torvalds wrote: > On Wed, Nov 21, 2018 at 10:16 AM Linus Torvalds > wrote: > > > > It might be interesting to just change raw_copy_to/from_user() to > > handle a lot more cases (in particular, handle cases where 'size' is > > 8-byte aligned). The special cases we *do* have may not be the right > > ones (the 10-byte case in particular looks odd). > > > > For example, instead of having a "if constant size is 8 bytes, do one > > get/put_user()" case, we might have a "if constant size is < 64 just > > unroll it into get/put_user()" calls. > > Actually, x86 doesn't even set INLINE_COPY_TO_USER, so I don't think > the constant size cases ever trigger at all the way they are set up > now. Side note, there's one artifact the patch exposes: some of the __builtin_constant_p() checks are imprecise and don't trigger at the early stage where GCC checks them, but the lenght is actually known to the compiler at later optimization stages. This means that with Jen's patch some of the length checks go away. I checked x86-64 defconfig and a distro config, and the numbers were ~7% and 10%, so not a big effect. The kernel text size reduction with Jen's patch is small but real: text data bss dec hex filename 19572694 11516934 19873888 50963516 309a43c vmlinux.before 19572468 11516934 19873888 50963290 309a35a vmlinux.after But I checked the disassembly, and it's not a real win, the new code is actually more complex than the old one, as expected, but GCC (7.3.0) does some particularly stupid things which bloats the generated code. > I do have a random patch that makes "unsafe_put_user()" actually use > "asm goto" for the error case, and that, together with the attached > patch seems to generate fairly nice code, but even then it would > depend on gcc actually unrolling things (which we do *not* want in > general). > > But for a 32-byte user copy (cp_old_stat), and that > INLINE_COPY_TO_USER, it generates this: > > stac > movl $32, %edx #, size > movq %rsp, %rax #, src > .L201: > movq (%rax), %rcx # MEM[base: src_155, offset: 0B], > MEM[base: src_155, offset: 0B] > 1: movq %rcx,0(%rbp) # MEM[base: src_155, offset: 0B], > MEM[(struct __large_struct *)dst_156] > ASM_EXTABLE_HANDLE from=1b to=.L200 handler="ex_handler_uaccess" # > > addq $8, %rax #, src > addq $8, %rbp #, statbuf > subq $8, %rdx #, size > jne .L201 #, > clac > > which is actually fairly close to "optimal". Yeah, that looks pretty sweet! > Random patch (with my "asm goto" hack included) attached, in case > people want to play with it. Doesn't even look all that hacky to me. Any hack in it that I didn't notice? :-) The only question is the inlining overhead - will try to measure that. > Impressively, it actually removes more lines of code than it adds. But > I didn't actually check whether the end result *works*, so hey.. Most of the linecount reduction appears to come from the simplification of the unroll loop and moving it into C, from a 6-way hard-coded copy routine: > - switch (size) { > - case 1: > - case 2: > - case 4: > - case 8: > - case 10: > - case 16: to a more flexible 4-way loop unrolling: > + while (size >= sizeof(unsigned long)) { > + while (size >= sizeof(unsigned int)) { > + while (size >= sizeof(unsigned short)) { > + while (size >= sizeof(unsigned char)) { Which is a nice improvement in itself. > + user_access_begin(); > + if (dirent) > + unsafe_put_user(offset, &dirent->d_off, efault_end); > dirent = buf->current_dir; > + unsafe_put_user(d_ino, &dirent->d_ino, efault_end); > + unsafe_put_user(reclen, &dirent->d_reclen, efault_end); > + unsafe_put_user(0, dirent->d_name + namlen, efault_end); > + unsafe_put_user(d_type, (char __user *) dirent + reclen - 1, efault_end); > + user_access_end(); > + > if (copy_to_user(dirent->d_name, name, namlen)) > goto efault; > buf->previous = dirent; > dirent = (void __user *)dirent + reclen; > buf->current_dir = dirent; > buf->count -= reclen; > return 0; > +efault_end: > + user_access_end(); > efault: > buf->error = -EFAULT; > return -EFAULT; In terms of high level APIs, could we perhaps use the opportunity to introduce unsafe_write_user() instead, which would allow us to write it as: unsafe_write_user(&dirent->d_ino, d_ino, efault_end); unsafe_write_user(&dirent->d_reclen, reclen, efault_end); unsafe_write_user(dirent->d_name + namlen, 0, efault_end); unsafe_write_user((char __user *)dirent + reclen - 1, d_type, efault_end); if (copy_to_user(dirent->d_name, name, namlen)) goto efault; This gives it the regular 'VAR = VAL;' notation of C assigments, instead of the weird historical reverse notation that put_user()/get_user() uses. Note how this newfangled ordering now matches the 'copy_to_user()' natural C-assignment parameter order that comes straight afterwards and makes it obvious that the d->name+namelen was writing the delimiter at the end. I think we even had bugs from put_user() ordering mixups? Or is it too late to try to fix this particular mistake? Thanks, Ingo