Date: Thu, 2 Sep 2021 15:34:33 +0800
From: Feng Tang
To: Michal Hocko
Cc: David Rientjes, "linux-mm@kvack.org", Andrew Morton, Christian Brauner,
 "linux-kernel@vger.kernel.org"
Subject: Re: [RFC PATCH] mm/oom: detect and kill task which has allocation forbidden by cpuset limit
Message-ID: <20210902073433.GA48711@shbuild999.sh.intel.com>
References: <1630399085-70431-1-git-send-email-feng.tang@intel.com>
 <52d80e9-cf27-9a59-94fd-d27a1e2dac6f@google.com>
 <20210901024402.GB46357@shbuild999.sh.intel.com>
 <20210901134200.GA50993@shbuild999.sh.intel.com>

On Wed, Sep 01, 2021 at 04:05:40PM +0200, Michal Hocko wrote:
[SNIP]
>
> This looks better than the previous attempt. It would be still better to
> solve this at the page allocator layer. The slowpath is already doing
> this for the nodemask. E.g.
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index eeb3a9cb36bb..a3193134540d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4929,6 +4929,17 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	if (!ac->preferred_zoneref->zone)
>  		goto nopage;
>
> +	/*
> +	 * Check for insane configurations where the cpuset doesn't contain any suitable
> +	 * zone to satisfy the request - e.g. kernel allocations from MOVABLE nodes only
> +	 */
> +	if (cpusets_enabled() && (gfp_mask & __GFP_HARDWALL)) {
> +		struct zoneref *z = first_zones_zonelist(ac->zonelist, ac->highest_zoneidx,
> +				&cpuset_current_mems_allowed);
> +		if (!z->zone)
> +			goto nopage;
> +	}
> +
>  	if (alloc_flags & ALLOC_KSWAPD)
>  		wake_all_kswapds(order, gfp_mask, ac);

Thanks for the suggestion! It does bail out early, skipping kswapd,
direct reclaim and compaction.

I also looked at prepare_alloc_pages(), which does some cpuset checks
and zone initialization, but I'd better leave it alone as it is in a
real hot path, while this check is in the slowpath anyway.

Will run some page fault benchmark cases with this patch.

Thanks,
Feng

> if this is seen as an additional overhead for an insane configuration
> then we can add insane_cpusets_enabled() which would be a static branch
> enabled when somebody actually tries to configure movable only cpusets
> or potentially other dubious usage.
> --
> Michal Hocko
> SUSE Labs
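
For readers following the thread, the early bailout discussed above can be
modeled in userspace roughly as below. This is only a sketch under stated
assumptions: the model_* types, the bitmask nodemask and the example
topology are hypothetical stand-ins invented for illustration, not the
kernel's struct zoneref, nodemask_t or first_zones_zonelist(). It shows
the idea only: if no zone at or below the highest usable zone index sits
on a node the cpuset allows, the request can never succeed, so reclaim and
compaction are pointless.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: simplified stand-ins, not the kernel's types. */
enum model_zone_idx { MODEL_ZONE_DMA, MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE };

struct model_zone {
	int node;                 /* NUMA node that owns this zone */
	enum model_zone_idx idx;  /* zone type within the node */
};

/* Hypothetical nodemask: bit n set means node n is allowed by the cpuset. */
typedef unsigned long model_nodemask;

/*
 * Return the first zone whose index is at most highest_idx and whose
 * node is in the allowed mask, or NULL when no zone qualifies.
 */
static const struct model_zone *
model_first_zone(const struct model_zone *zones, size_t nr,
		 enum model_zone_idx highest_idx, model_nodemask allowed)
{
	for (size_t i = 0; i < nr; i++) {
		bool node_ok = (allowed >> zones[i].node) & 1UL;

		if (zones[i].idx <= highest_idx && node_ok)
			return &zones[i];
	}
	return NULL;
}

/*
 * True when the request cannot be satisfied inside the cpuset at all,
 * i.e. the modeled slowpath should give up before waking kswapd or
 * trying direct reclaim and compaction.
 */
static bool
model_should_bail_out(const struct model_zone *zones, size_t nr,
		      enum model_zone_idx highest_idx, model_nodemask allowed)
{
	return model_first_zone(zones, nr, highest_idx, allowed) == NULL;
}

/* Example topology (assumed): node 0 has DMA and NORMAL, node 1 only MOVABLE. */
static const struct model_zone model_zonelist[] = {
	{ 1, MODEL_ZONE_MOVABLE },
	{ 0, MODEL_ZONE_NORMAL },
	{ 0, MODEL_ZONE_DMA },
};
```

With a cpuset limited to node 1, a kernel-style request capped at
MODEL_ZONE_NORMAL finds no zone and bails out, while a request that may
use MODEL_ZONE_MOVABLE still finds the node 1 zone.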