Date: Fri, 19 Jun 2020 00:01:51 +0300
From: Alexey Dobriyan
To: David Laight
Cc: 'Matt Fleming', Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
 "linux-kernel@vger.kernel.org", "Grimm, Jon", "Kumar, Venkataramanan",
 Jan Kara, "stable@vger.kernel.org"
Subject: Re: [PATCH] x86/asm/64: Align start of __clear_user() loop to 16-bytes
Message-ID: <20200618210151.GA2212102@localhost.localdomain>
References: <20200618102002.30034-1-matt@codeblueprint.co.uk>
 <39f8304b75094f87a54ace7732708d30@AcuMS.aculab.com>
 <20200618131655.GA24607@localhost.localdomain>
 <20b0166e11f44bf491062838090b93be@AcuMS.aculab.com>
In-Reply-To: <20b0166e11f44bf491062838090b93be@AcuMS.aculab.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 18, 2020 at 04:39:35PM +0000, David Laight wrote:
> From: Alexey Dobriyan
> > Sent: 18 June 2020 14:17
> ...
> > > > diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
> > > > index fff28c6f73a2..b0dfac3d3df7 100644
> > > > --- a/arch/x86/lib/usercopy_64.c
> > > > +++ b/arch/x86/lib/usercopy_64.c
> > > > @@ -24,6 +24,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size)
> > > >  	asm volatile(
> > > >  		" testq %[size8],%[size8]\n"
> > > >  		" jz 4f\n"
> > > > +		" .align 16\n"
> > > >  		"0: movq $0,(%[dst])\n"
> > > >  		" addq $8,%[dst]\n"
> > > >  		" decl %%ecx ; jnz 0b\n"
> > >
> > > You can do better that that loop.
> > > Change 'dst' to point to the end of the buffer, negate the count
> > > and divide by 8 and you get:
> > >		"0: movq $0,($[dst],%%ecx,8)\n"
> > >		" add $1,%%ecx"
> > >		" jnz 0b\n"
> > > which might run at one iteration per clock especially on cpu that pair
> > > the add and jnz into a single uop.
> > > (You need to use add not inc.)
> >
> > /dev/zero should probably use REP STOSB etc just like everything else.
>
> Almost certainly it shouldn't, and neither should anything else.
> Potentially it could use whatever memset() is patched to.
> That MIGHT be 'rep stos' on some cpu variants, but in general
> it is slow.

Yes, that's what I meant: alternatives choosing REP variant.
memset loops are so 21-st century.
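
For reference, a rough userspace sketch of the two shapes discussed above: the negative-index loop (pointer at the end of the buffer, index counting up to zero) and a REP STOSB variant. This is not the kernel code. The names clear_words_neg_index() and clear_bytes_rep() are made up for illustration, there is no fault handling or alternatives patching, and the REP STOSB path assumes GCC/Clang on x86-64 with a memset() fallback elsewhere.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: clear n8 eight-byte words starting at dst.
 * The pointer is moved to the end of the buffer and the index runs
 * from -n8 up to 0, so the loop needs only one increment and one
 * compare against zero per iteration.
 */
static void clear_words_neg_index(void *dst, size_t n8)
{
	unsigned char *end = (unsigned char *)dst + n8 * 8;
	ptrdiff_t i = -(ptrdiff_t)n8;
	const uint64_t zero = 0;

	while (i != 0) {
		/* memcpy() avoids alignment and aliasing assumptions */
		memcpy(end + i * 8, &zero, sizeof(zero));
		i++;
	}
}

/*
 * Hypothetical helper: clear n bytes with REP STOSB where available.
 * Assumes the direction flag is clear, as the x86-64 ABI requires
 * on function entry.
 */
static void clear_bytes_rep(void *dst, size_t n)
{
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ volatile("rep stosb"
			 : "+D"(dst), "+c"(n)
			 : "a"(0)
			 : "memory");
#else
	/* Fallback for other targets; an assumption of this sketch */
	memset(dst, 0, n);
#endif
}
```

Which one is faster depends on the CPU, which is the reason for picking the variant at boot rather than hardcoding either.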