From: Linus Torvalds
Date: Wed, 8 Mar 2017 14:43:44 -0800
Subject: Re: [net/bpf] 3051bf36c2 BUG: unable to handle kernel paging request at 0000a7cf
To: Daniel Borkmann
Cc: Thomas Gleixner, Ingo Molnar, Peter Anvin, Fengguang Wu,
    Network Development, LKML, LKP, ast@fb.com,
    "the arch/x86 maintainers", Kees Cook, Laura Abbott, David Miller
In-Reply-To: <58C08535.3070000@iogearbox.net>
References: <20170301125426.l4nf65rx4wahohyl@wfg-t540p.sh.intel.com>
    <20170302202338.ci6wwb3yzjmdy4n2@wfg-t540p.sh.intel.com>
    <58B88353.2010508@iogearbox.net> <58C08535.3070000@iogearbox.net>

On Wed, Mar 8, 2017 at 2:27 PM, Daniel Borkmann wrote:
>
> The issue seems to be accessing buff first (can be read or write access)
> and then doing set_memory_ro() doesn't make it read-only immediately,
> meaning the subsequent call into probe_kernel_write() will succeed without
> error.
>
> Then, if I don't touch buff first and only do the set_memory_ro() seems
> to work and probe_kernel_write() will then fail as expected due to pages
> being read-only now.

Ok, that definitely sounds like a TLB invalidate didn't happen.

> Now, if I access buff, do the set_memory_ro() and then a msleep(0), for
> example, it "kind of" works most of the time (see last log extract below),
> and probe_kernel_write() will fail.

Yeah, very much consistent with a missing TLB invalidate.
Scheduling will end up invalidating it, although if it's a global page
even that might not do it (but eventually the entry will just get
flushed due to other activity).

> None of this seems an issue with x86_64 and the test_setmem runs fine all
> the time, same for the actual BPF stuff.

The code does look somewhat confused about when to actually flush
things - see my earlier note about NX - but it would seem to always do
__flush_tlb_all() unless I missed something. At least as long as
CPA_FLUSHTLB is set.

Maybe some case forgets to set that..

              Linus
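
For readers following the thread, the symptom discussed above can be illustrated with a toy userspace model. This is only a sketch and not kernel code: every toy_* name is hypothetical, toy_set_ro() and toy_try_write() are loose stand-ins for set_memory_ro() and probe_kernel_write(), and real TLB behaviour (global pages, per-CPU state, flushes on scheduling) is far more involved. It assumes a write is checked against a cached copy of the page permissions whenever one exists.

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES 4

/* Toy model: an authoritative "page table" plus a cache of it (the "TLB"). */
struct toy_mmu {
    bool pte_writable[NPAGES]; /* permissions in the page table */
    bool tlb_valid[NPAGES];    /* is a translation cached for this page? */
    bool tlb_writable[NPAGES]; /* cached copy of the permission */
};

/* All pages start writable, with nothing cached. */
static void toy_init(struct toy_mmu *m)
{
    for (int i = 0; i < NPAGES; i++) {
        m->pte_writable[i] = true;
        m->tlb_valid[i] = false;
        m->tlb_writable[i] = false;
    }
}

/* Any access (read or write) fills the cache from the page table on a miss. */
static void toy_touch(struct toy_mmu *m, int page)
{
    assert(page >= 0 && page < NPAGES);
    if (!m->tlb_valid[page]) {
        m->tlb_writable[page] = m->pte_writable[page];
        m->tlb_valid[page] = true;
    }
}

/* Hypothetical stand-in for set_memory_ro(); 'flush' models whether the
 * permission change also invalidates the cached entry. */
static void toy_set_ro(struct toy_mmu *m, int page, bool flush)
{
    assert(page >= 0 && page < NPAGES);
    m->pte_writable[page] = false;
    if (flush)
        m->tlb_valid[page] = false;
}

/* Models a later full flush, e.g. caused by unrelated activity. */
static void toy_flush_all(struct toy_mmu *m)
{
    for (int i = 0; i < NPAGES; i++)
        m->tlb_valid[i] = false;
}

/* Hypothetical stand-in for probe_kernel_write(): returns true if the write
 * would be allowed. The check uses the cached permission, so a stale entry
 * can still permit a write to a page that is already read-only. */
static bool toy_try_write(struct toy_mmu *m, int page)
{
    toy_touch(m, page);
    return m->tlb_writable[page];
}
```

In this model, touching a page and then calling toy_set_ro() without a flush leaves toy_try_write() succeeding, while an untouched page fails the write straight away, and a later toy_flush_all() makes the touched page fail as well. That matches the three observations quoted above, under the stated assumptions.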