Date: Fri, 18 Nov 2022 00:23:03 +0100
From: Marco Elver
To: Dave Hansen
Cc: Naresh Kamboju, Peter Zijlstra, kasan-dev, X86 ML, open list,
    linux-mm, regressions@lists.linux.dev, lkft-triage@lists.linaro.org,
    Andrew Morton, Alexander Potapenko
Subject: Re: WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/kfence.h:46 kfence_protect
In-Reply-To: <4208866d-338f-4781-7ff9-023f016c5b07@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 17, 2022 at 06:34AM -0800, Dave Hansen wrote:
> On 11/17/22 05:58, Marco Elver wrote:
> > [    0.663761] WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/kfence.h:46 kfence_protect+0x7b/0x120
> > [    0.664033] WARNING: CPU: 0 PID: 0 at mm/kfence/core.c:234 kfence_protect+0x7d/0x120
> > [    0.664465] kfence: kfence_init failed
>
> Any chance you could add some debugging and figure out what actually
> made kfence call over?  Was it the pte or the level?
>
> 	if (WARN_ON(!pte || level != PG_LEVEL_4K))
> 		return false;
>
> I can see how the thing you bisected to might lead to a page table not
> being split, which could mess with the 'level' check.

Yes - it's the 'level != PG_LEVEL_4K'.

We do actually try to split the pages in arch_kfence_init_pool() (above
this function) - so with "x86/mm: Inhibit _PAGE_NX changes from
cpa_process_alias()" this somehow fails...

> Also, is there a reason this code is mucking with the page tables
> directly?  It seems, uh, rather wonky.  This, for instance:
>
> > 	if (protect)
> > 		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
> > 	else
> > 		set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
> >
> > 	/*
> > 	 * Flush this CPU's TLB, assuming whoever did the allocation/free is
> > 	 * likely to continue running on this CPU.
> > 	 */
> > 	preempt_disable();
> > 	flush_tlb_one_kernel(addr);
> > 	preempt_enable();
>
> Seems rather broken.  I assume the preempt_disable() is there to get rid
> of some warnings.  But, there is nothing I can see to *keep* the CPU
> that did the free from being different from the one where the TLB flush
> is performed until the preempt_disable().  That makes the
> flush_tlb_one_kernel() mostly useless.
>
> Is there a reason this code isn't using the existing page table
> manipulation functions and tries to code its own?  What prevents it from
> using something like the attached patch?

Yes, see the comment below - it's to avoid the IPIs and TLB shoot-downs,
because KFENCE _can_ tolerate the inaccuracy even if we hit the wrong
TLB or other CPUs' TLBs aren't immediately flushed - we trade a few
false negatives for minimizing performance impact.

> diff --git a/arch/x86/include/asm/kfence.h b/arch/x86/include/asm/kfence.h
> index ff5c7134a37a..5cdb3a1f3995 100644
> --- a/arch/x86/include/asm/kfence.h
> +++ b/arch/x86/include/asm/kfence.h
> @@ -37,34 +37,13 @@ static inline bool arch_kfence_init_pool(void)
>  	return true;
>  }
>
> -/* Protect the given page and flush TLB. */
>  static inline bool kfence_protect_page(unsigned long addr, bool protect)
>  {
> -	unsigned int level;
> -	pte_t *pte = lookup_address(addr, &level);
> -
> -	if (WARN_ON(!pte || level != PG_LEVEL_4K))
> -		return false;
> -
> -	/*
> -	 * We need to avoid IPIs, as we may get KFENCE allocations or faults
> -	 * with interrupts disabled. Therefore, the below is best-effort, and
> -	 * does not flush TLBs on all CPUs. We can tolerate some inaccuracy;
> -	 * lazy fault handling takes care of faults after the page is PRESENT.
> -	 */
> -

^^ See this comment. Additionally there's a real performance concern,
and the inaccuracy is something that we deliberately accept.

>  	if (protect)
> -		set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
> +		set_memory_np(addr, addr + PAGE_SIZE);
>  	else
> -		set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
> +		set_memory_p(addr, addr + PAGE_SIZE);

Isn't this going to do tons of IPIs and shoot down other CPUs' TLBs?
KFENCE shouldn't incur this overhead on large machines with >100 CPUs if
we can avoid it.

What does "x86/mm: Inhibit _PAGE_NX changes from cpa_process_alias()" do
that suddenly makes all this fail?

What solution do you prefer that both fixes the issue and avoids the
IPIs?

Thanks,
-- Marco