From: David Laight
To: 'Mikulas Patocka'
CC: 'Ard Biesheuvel', Ramana Radhakrishnan, Florian Weimer, "Thomas Petazzoni", GNU C Library, Andrew Pinski, "Catalin Marinas", Will Deacon, "Russell King", LKML, linux-arm-kernel
Subject: RE: framebuffer corruption due to overlapping stp instructions on arm64
Date: Mon, 6 Aug 2018 10:18:33 +0000
Message-ID: <51a6c4e102ad4193b3f42498f0ff11a4@AcuMS.aculab.com>
References: <9acdacdb-3bd5-b71a-3003-e48132ee1371@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Mikulas Patocka
> Sent: 05 August 2018 15:36
> To: David Laight
...
> There's an instruction movntdqa (and vmovntdqa) that can actually do
> prefetch on write-combining memory type. It's the only instruction that
> can do it.
>
> If this instruction is used on non-write-combining memory type, it behaves
> like movdqa.
>
...
> I benchmarked it on a processor with ERMS - for writes to the framebuffer,
> there's no difference between memcpy, 8-byte writes, rep stosb, rep stosq,
> mmx, sse, avx - all these methods achieve 16-17 GB/s

The combination of write-combining, posted writes and a fast PCIe slave
is probably why there is little difference.

> For reading from the framebuffer:
> 323 MB/s - memcpy (using avx2)
> 91 MB/s - explicit 8-byte reads
> 249 MB/s - rep movsq
> 307 MB/s - rep movsb

You must be getting the ERMS hardware optimised 'rep movsb'.

> 90 MB/s - mmx
> 176 MB/s - sse
> 4750 MB/s - sse movntdqa
> 330 MB/s - avx

avx512 is probably faster still.

> 5369 MB/s - avx vmovntdqa
>
> So - it may make sense to introduce a function memcpy_from_framebuffer()
> that uses movntdqa or vmovntdqa on CPUs that support it.

For kernel space it ought to be just memcpy_fromio().

Can you easily repeat the tests using a non-write-combining map of the
same PCIe slave?

I can probably run the same measurements against our rather leisurely
FPGA based PCIe slave.
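For reference, a minimal userspace sketch of the kind of copy the quoted proposal describes, assuming x86 and GCC or Clang. memcpy_from_framebuffer() is only the name suggested in the quoted mail, not an existing function, and copy_movntdqa() is a hypothetical helper. _mm_stream_load_si128() is the SSE4.1 intrinsic that maps to movntdqa. On ordinary cached memory it should behave like a normal aligned load, so this shows the access pattern only, not the speed-up. Kernel code would additionally need kernel_fpu_begin()/kernel_fpu_end() around any SSE use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>	/* SSE4.1: _mm_stream_load_si128() == movntdqa */

/*
 * Hypothetical helper: copy len bytes with 16-byte streaming loads.
 * src must be 16-byte aligned and len a multiple of 16; dst may be
 * unaligned.
 */
__attribute__((target("sse4.1")))
static void copy_movntdqa(void *dst, const void *src, size_t len)
{
	const uint8_t *s = src;
	uint8_t *d = dst;
	size_t off;

	assert(((uintptr_t)src & 15) == 0);
	assert((len & 15) == 0);

	for (off = 0; off < len; off += 16) {
		__m128i v = _mm_stream_load_si128((__m128i *)(s + off));

		_mm_storeu_si128((__m128i *)(d + off), v);
	}
}
#endif

/*
 * Name taken from the quoted proposal; this body is an assumption,
 * not code from the thread.
 */
void memcpy_from_framebuffer(void *dst, const void *src, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
	if (__builtin_cpu_supports("sse4.1")) {
		const uint8_t *s = src;
		uint8_t *d = dst;
		/* Bytes needed to bring src up to a 16-byte boundary. */
		size_t head = (size_t)(-(uintptr_t)s & 15);
		size_t body;

		if (head > len)
			head = len;
		body = (len - head) & ~(size_t)15;

		memcpy(d, s, head);			/* unaligned head */
		copy_movntdqa(d + head, s + head, body);	/* aligned middle */
		memcpy(d + head + body, s + head + body,
		       len - head - body);		/* tail */
		return;
	}
#endif
	memcpy(dst, src, len);	/* no SSE4.1, or not x86 */
}
```

The head and tail go through plain memcpy() because movntdqa faults on a source address that is not 16-byte aligned.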
IIRC PCIe reads happen every 128 clocks of the card's 62.5MHz clock, so
increasing the size of the registers makes a significant difference.
I've not tried mapping write-combining and using (v)movntdqa.
I'm not sure what effect write-combining would have if the whole BAR
were mapped that way - so I'll either have to map the physical addresses
twice or add in another BAR.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
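A back-of-envelope sketch of why register size matters on such a slave, assuming the figures recalled above (one read completing every 128 cycles of a 62.5MHz clock) and assuming only one read is outstanding at a time. read_throughput() and print_read_throughput() are hypothetical names used just for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Bytes per second if one read of bytes_per_read bytes completes every
 * clocks_per_read cycles of a clock running at clock_hz.
 */
static double read_throughput(double clock_hz, double clocks_per_read,
			      double bytes_per_read)
{
	return clock_hz / clocks_per_read * bytes_per_read;
}

/* Print the estimate for 8, 16 and 32 byte reads. */
void print_read_throughput(void)
{
	static const int sizes[] = { 8, 16, 32 };
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("%2d-byte reads: %.2f MB/s\n", sizes[i],
		       read_throughput(62.5e6, 128, sizes[i]) / 1e6);
}
```

With those assumptions each read takes 128 / 62.5MHz = 2.048us, so 8-byte reads give roughly 3.9 MB/s, 16-byte reads roughly 7.8 MB/s and 32-byte reads roughly 15.6 MB/s: throughput scales directly with the width of each read.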