From: David Laight
To: 'Hugh Dickins', Borislav Petkov
CC: Peter Zijlstra, "linux-kernel@vger.kernel.org", "x86@kernel.org"
Subject: RE: x86: should clear_user() have alternatives?
Date: Tue, 8 Feb 2022 11:45:38 +0000
Message-ID: <9fc41af45fcb40e3ae607eb4f52d7ef9@AcuMS.aculab.com>
References: <2f5ca5e4-e250-a41c-11fb-a7f4ebc7e1c9@google.com>
In-Reply-To: <2f5ca5e4-e250-a41c-11fb-a7f4ebc7e1c9@google.com>

From: Hugh Dickins
> Sent: 08 February 2022 05:46
>
> I've noticed that clear_user() is slower than it need be:
>
>     dd if=/dev/zero of=/dev/null bs=1M count=1M
>     1099511627776 bytes (1.1 TB) copied, 45.9641 s, 23.9 GB/s
> whereas with the hacked patch below
>     1099511627776 bytes (1.1 TB) copied, 33.4 s, 32.9 GB/s
>
> That was on some Intel machine: IIRC an AMD went faster.
>
> It's because clear_user() lacks alternatives, and uses a
> nowadays suboptimal implementation; whereas clear_page()
> and copy_user() do support alternatives.
> ...
> +SYM_FUNC_START(__clear_user)
> +	ASM_STAC
> +	movl	%esi,%ecx
> +	xorq	%rax,%rax
> +1:	rep stosb
> +2:	movl	%ecx,%eax
> +	ASM_CLAC
> +	ret

You only want to even consider that version for long copies
(and possibly only for aligned ones).

The existing code (I've not quoted) does look sub-optimal though.
It should be easy to obtain a write every clock.
But I suspect the loop is too long.
The code gcc generates might even be better!

Note that for copies longer than 8 bytes 'odd' lengths can be handled by
a single misaligned write to the end of the buffer.
No need for a byte copy loop.

I've not experimented with misaligned writes - they might take two clocks.
So it might be worth aligning them - but they may not happen often enough
for it to be an overall gain.
Misaligned reads usually don't make any difference.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
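
The tail-handling idea in the message above (clear whole 8-byte words, then cover an 'odd' length with one overlapping store to the last 8 bytes instead of a byte loop) might look roughly like the following user-space C sketch. This is only an illustration under stated assumptions: the name zero_with_tail_store is hypothetical, it is not the kernel's clear_user(), and it leaves out everything user-access related (STAC/CLAC, fault handling, the count of uncleared bytes returned on a fault).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not kernel code: zero `len` bytes at `dst`
 * using 8-byte stores wherever possible.
 */
static void zero_with_tail_store(unsigned char *dst, size_t len)
{
    const uint64_t zero = 0;
    size_t i;

    if (len < 8) {
        /* Too short for the overlapping-store trick: clear bytewise. */
        for (i = 0; i < len; i++)
            dst[i] = 0;
        return;
    }

    /* Whole 8-byte words, starting at the front of the buffer. */
    for (i = 0; i + 8 <= len; i += 8)
        memcpy(dst + i, &zero, 8);  /* fixed-size memcpy: typically one store */

    /*
     * 'Odd' length: one store covering the final 8 bytes. It is
     * usually misaligned and overlaps bytes that are already zero,
     * which is harmless because the same value is written again.
     */
    if (i != len)
        memcpy(dst + len - 8, &zero, 8);
}
```

The fixed-size memcpy() calls are used so the unaligned accesses stay well-defined C; whether the overlapping store actually beats aligning the writes first is the open question raised in the message and would need measuring.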