Message-ID: <58C16C6A.2060400@iogearbox.net>
Date: Thu, 09 Mar 2017 15:53:30 +0100
From: Daniel Borkmann
To: Thomas Gleixner
CC: Kees Cook, Laura Abbott, Linus Torvalds, Ingo Molnar, Peter Anvin,
    Fengguang Wu, Network Development, LKML, LKP, ast@fb.com,
    the arch/x86 maintainers, "David S. Miller"
Subject: Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf
In-Reply-To: <58C157E6.1010909@iogearbox.net>

On 03/09/2017 02:25 PM, Daniel Borkmann wrote:
> On 03/09/2017 02:10 PM, Thomas Gleixner wrote:
>> On Thu, 9 Mar 2017, Daniel Borkmann wrote:
>>> With regard to CPA_FLUSHTLB that Linus mentioned, when I investigated
>>> code paths in change_page_attr_set_clr(), I did see that CPA_FLUSHTLB
>>> was set each time we switched attrs and a cpa_flush_range() was
>>> performed (with the correct number of pages and cache set to 0). That
>>> would be a __flush_tlb_all() eventually.
>>>
>>> Hmm, it indeed might seem likely that this could be an emulation bug.
>>
>> Which variant of __flush_tlb_all() is used when the test fails?
>>
>> Check for the following flags in /proc/cpuinfo: pge invpcid
>
> I added the following and booted with both variants:
>
> printk("X86_FEATURE_PGE:%u\n", static_cpu_has(X86_FEATURE_PGE));
> printk("X86_FEATURE_INVPCID:%u\n", static_cpu_has(X86_FEATURE_INVPCID));
>
> "-cpu host" gives:
>
> [    8.326117] X86_FEATURE_PGE:1
> [    8.326381] X86_FEATURE_INVPCID:1
>
> "-cpu kvm64" gives:
>
> [    8.517069] X86_FEATURE_PGE:1
> [    8.517393] X86_FEATURE_INVPCID:0

Fwiw, I tried switching from using cr4 (__native_flush_tlb_global_irq_disabled())
to slower cr3 (__native_flush_tlb()) in "-cpu kvm64" mode, and it looks
like it also lets all test cases pass (rodata_test, test_setmem, test_bpf),
no corruption happening, etc.

Test diff used:

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6fa8594..34f4582 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -188,9 +188,9 @@ static inline void __native_flush_tlb_single(unsigned long addr)

 static inline void __flush_tlb_all(void)
 {
-	if (static_cpu_has(X86_FEATURE_PGE))
-		__flush_tlb_global();
-	else
+//	if (static_cpu_has(X86_FEATURE_PGE))
+//		__flush_tlb_global();
+//	else
 		__flush_tlb();
 }
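
For anyone who wants to repeat the /proc/cpuinfo check for pge and invpcid
from userspace inside the guest rather than patching in printk(), a minimal
sketch follows. The helper names has_cpu_flag() and cpuinfo_has_flag() are
made up for illustration; they are not kernel or libc interfaces. It also
assumes the usual x86 "flags : ..." line layout, and only looks at the first
such line (i.e. the first CPU).

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `flag` appears in `line` as a whole
 * whitespace-delimited token, 0 otherwise. Plain substring matching is
 * not enough, since one flag name can be a substring of another.
 */
int has_cpu_flag(const char *line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = line;

	if (len == 0)
		return 0;

	while ((p = strstr(p, flag)) != NULL) {
		int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		int ends = p[len] == '\0' || p[len] == ' ' ||
			   p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return 1;
		p += len;
	}
	return 0;
}

/*
 * Hypothetical helper: scan a cpuinfo-style file for the first line that
 * begins with "flags" and test it for `flag`.
 * Returns 1 or 0, or -1 if the file cannot be read or has no flags line.
 */
int cpuinfo_has_flag(const char *path, const char *flag)
{
	char line[8192];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, "flags", 5) == 0) {
			const char *colon = strchr(line, ':');

			if (colon)
				ret = has_cpu_flag(colon + 1, flag);
			break;
		}
	}
	fclose(f);
	return ret;
}
```

Calling cpuinfo_has_flag("/proc/cpuinfo", "pge") and
cpuinfo_has_flag("/proc/cpuinfo", "invpcid") from a small main() in each
guest should mirror the printk() output quoted above, assuming the
cpuinfo flags agree with what static_cpu_has() reports.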