From: Yu Zhao
Date: Tue, 19 Apr 2022 16:32:26 -0600
Subject: Re: [PATCH v10 08/14] mm: multi-gen LRU: support page table walks
To: Justin Forbes
Cc: Andrew Morton, Stephen Rothwell, Linux-MM, Andi Kleen, Aneesh Kumar,
    Barry Song <21cnbao@gmail.com>, Catalin Marinas, Dave Hansen,
    Hillf Danton, Jens Axboe, Jesse Barnes, Johannes Weiner,
    Jonathan Corbet, Linus Torvalds, Matthew Wilcox, Mel Gorman,
    Michael Larabel, Michal Hocko, Mike Rapoport, Rik van Riel,
    Vlastimil Babka, Will Deacon, Ying Huang, Linux ARM,
    "open list:DOCUMENTATION", linux-kernel, Kernel Page Reclaim v2,
    "the arch/x86 maintainers", Brian Geffon, Jan Alexander Steffens,
    Oleksandr Natalenko, Steven Barrett, Suleiman Souhlal, Daniel Byrne,
    Donald Carr, Holger Hoffstätte, Konstantin Kharlamov, Shuang Zhai,
    Sofia Trinh, Vaibhav Jain
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Apr 16, 2022 at 10:32 AM Justin Forbes wrote:
>
> On Fri, Apr 15, 2022 at 4:33 PM Andrew Morton wrote:
> >
> > On Fri, 15 Apr 2022 14:11:32 -0600 Yu Zhao wrote:
> >
> > > >
> > > > I grabbed
> > > > https://kojipkgs.fedoraproject.org//packages/kernel/5.18.0/0.rc2.23.fc37/src/kernel-5.18.0-0.rc2.23.fc37.src.rpm
> > > > and
> > >
> > > Yes, Fedora/RHEL is one concrete example of the model I mentioned
> > > above (experimental/stable). I added Justin, the Fedora kernel
> > > maintainer, and he can further clarify.
>
> We almost split into 3 scenarios. In rawhide we run a standard Fedora
> config for rcX releases and .0, but git snapshots are built with debug
> configs only. The trade off is that we can't turn on certain options
> which kill performance, but we do get more users running these kernels
> which expose real bugs. The rawhide kernel follows Linus' tree and is
> rebuilt most weekdays. Stable Fedora is not a full debug config, but
> in cases where we can keep a debug feature on without it much getting
> in the way of performance, as is the case with CONFIG_DEBUG_VM, I
> think there is value in keeping those on, until there is not. And of
> course RHEL is a much more conservative config, and a much more
> conservative rebase/backport codebase.
>
> > > If we don't want more VM_BUG_ONs, I'll remove them. But (let me
> > > reiterate) it seems to me that just defeats the purpose of having
> > > CONFIG_DEBUG_VM.
> > >
> >
> > Well, I feel your pain. It was never expected that VM_BUG_ON() would
> > get subverted in this fashion.
>
> Fedora is not trying to subvert anything. If keeping the option on
> becomes problematic, we can simply turn it off. Fedora certainly has
> a more diverse installed base than typical enterprise distributions,
> and much more diverse than most QA pools. Both in the array of
> hardware, and in the use patterns, so things do get uncovered that
> would not be seen otherwise.
>
> > We could create a new MM-developer-only assertion. Might even call it
> > MM_BUG_ON(). With compile-time enablement but perhaps not a runtime
> > switch.
> >
> > With nice simple semantics, please. Like "it returns void" and "if you
> > pass an expression with side-effects then you lose". And "if you send
> > a patch which produces warnings when CONFIG_MM_BUG_ON=n then you get to
> > switch to windows95 for a month".
> >
> > Let's leave the mglru assertions in place for now and let's think about
> > creating something more suitable, with a view to switching mglru over
> > to that at a later time.
> >
> >
> >
> > But really, none of this addresses the core problem: *_BUG_ON() often
> > kills the kernel. So guess what we just did? We killed the user's
> > kernel at the exact time when we least wished to do so: when they have
> > a bug to report to us. So the thing is self-defeating.
> >
> > It's much much better to WARN and to attempt to continue. This makes
> > it much more likely that we'll get to hear about the kernel flaw.
>
> I agree very much with this. We hear about warnings from users, they
> don't go unnoticed, and several of these users are willing to spend
> time to help get to the bottom of an issue. They may not know the
> code, but plenty are willing to test various patches or scenarios.

Thanks, Justin. Glad to hear warnings are collected from the field.

Based on all the feedback, my action item is to replace all VM_BUG_ONs
with VM_WARN_ON_ONCEs.
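
To illustrate the distinction discussed above (a fatal check versus one
that warns once and keeps going), here is a minimal user-space sketch. It
is not the kernel's implementation: SKETCH_BUG_ON, SKETCH_WARN_ON_ONCE,
sketch_scan and sketch_warnings_printed are hypothetical names made up
for this example, and the "refcount" array is only a stand-in for
whatever invariant a real assertion would check.

```c
#include <assert.h>   /* only needed by callers that want to check results */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counter so callers can observe how many warnings were printed. */
static int sketch_warnings_printed;

/* Fatal flavour: report the failed condition and stop the program. */
#define SKETCH_BUG_ON(cond)                                      \
	do {                                                     \
		if (cond) {                                      \
			fprintf(stderr, "BUG: %s\n", #cond);     \
			abort();                                 \
		}                                                \
	} while (0)

/*
 * Non-fatal flavour: report the first hit at this call site, then keep
 * going. The static flag is per call site because each macro expansion
 * gets its own block. The macro yields no value.
 */
#define SKETCH_WARN_ON_ONCE(cond)                                \
	do {                                                     \
		static bool warned;                              \
		if ((cond) && !warned) {                         \
			warned = true;                           \
			sketch_warnings_printed++;               \
			fprintf(stderr, "WARNING: %s\n", #cond); \
		}                                                \
	} while (0)

/*
 * Walks n entries and returns how many were processed. A negative entry
 * triggers a warning (at most once) but does not stop the walk.
 */
static int sketch_scan(const int *refcounts, int n)
{
	int processed = 0;

	for (int i = 0; i < n; i++) {
		SKETCH_WARN_ON_ONCE(refcounts[i] < 0);
		processed++;
	}
	return processed;
}
```

With SKETCH_BUG_ON in place of SKETCH_WARN_ON_ONCE, the first negative
entry would abort the program and the remaining entries would never be
visited, which is the "killed at the exact time we least wished to"
problem described in the quoted text.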