From: Linus Torvalds
Date: Sun, 5 Mar 2017 11:19:42 -0800
Subject: Re: Question Regarding ERMS memcpy
To: Borislav Petkov
Cc: Peter Anvin, Logan Gunthorpe, Thomas Gleixner, Ingo Molnar, Tony Luck, Al Viro, "the arch/x86 maintainers", Linux Kernel Mailing List
In-Reply-To: <20170305095059.l4od2yjqm5yxx6ln@pd.tnic>

On Sun, Mar 5, 2017 at 1:50 AM, Borislav Petkov wrote:
>
> gcc can't possibly know on what targets is that kernel going to be
> booted on. So it probably does some universally optimal things, like in
> the dmi_scan_machine() case:
>
>         memcpy_fromio(buf, p, 32);
>
> turns into:
>
>         .loc 3 219 0
>         movl    $8, %ecx        #, tmp79
>         movq    %rax, %rsi      # p, p
>         movq    %rsp, %rdi      #, tmp77
>         rep movsl
>
> Apparently it thinks it is fine to do 8*4-byte MOVS. But why not
> 4*8-byte MOVS?

Actually, the "fromio/toio" code should never use regular memcpy().
There used to be devices that literally broke on 64-bit accesses due
to broken PCI crud.

We seem to have broken this *really* long ago, though. On x86-64 we
used to have a special __inline_memcpy() that copies our historical
32-bit thing, and was used for memcpy_fromio() and memcpy_toio().
That was then undone by commit 6175ddf06b61 ("x86: Clean up mem*io
functions").

That commit says "Iomem has no special significance on x86" but that's
not strictly true. iomem is in the same address space and uses the
same access instructions as regular memory, but iomem _is_ special.

And I think it's a bug that we use "memcpy()" on it. Not because of
any gcc issues, but simply because our own memcpy() optimizations are
not appropriate for iomem.

For example, "rep movsb" really is the right thing to use on normal
memory on modern CPUs. But it is *not* the right thing to use on IO
memory, because the CPU only does the magic cacheline access
optimizations on cacheable memory!

So I think we should re-introduce that old "__inline_memcpy()" as that
special "safe memcpy" thing. Not just for KMEMCHECK, and not just for
64-bit.

Hmm?

             Linus
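
To make the "safe memcpy" idea above concrete, here is a rough portable C
sketch of a copy that never issues an access wider than 32 bits, with a
16-bit and 8-bit tail. It mirrors the 8*4-byte pattern in the quoted
assembly. The name copy32_sketch is hypothetical, both pointers are assumed
to be 4-byte aligned, and this is an illustration only, not the kernel's
__inline_memcpy() code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustrative name, not a kernel API).
 *
 * Copies n bytes from src to dst using 32-bit accesses, then at most
 * one 16-bit and one 8-bit access for the tail. The volatile-qualified
 * pointers ask the compiler to perform each access as written, so it
 * should not widen them or turn the loop into a memcpy() call.
 *
 * Assumption: dst and src are both 4-byte aligned.
 */
static void *copy32_sketch(void *dst, const volatile void *src, size_t n)
{
	volatile uint32_t *d32 = dst;
	const volatile uint32_t *s32 = src;
	size_t words = n / 4;
	size_t i;

	/* Bulk of the copy: one 32-bit load and store per iteration. */
	for (i = 0; i < words; i++)
		d32[i] = s32[i];

	/* Tail: the remaining 0..3 bytes start at a 4-byte boundary. */
	volatile uint8_t *d8 = (volatile uint8_t *)(d32 + words);
	const volatile uint8_t *s8 = (const volatile uint8_t *)(s32 + words);

	if (n & 2) {
		*(volatile uint16_t *)d8 = *(const volatile uint16_t *)s8;
		d8 += 2;
		s8 += 2;
	}
	if (n & 1)
		*d8 = *s8;

	return dst;
}
```

A real iomem variant would also have to deal with unaligned starts and with
whatever ordering the device needs, which this sketch deliberately leaves out.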