Date: Mon, 24 May 2021 13:55:47 +0800
From: Feng Tang
To: David Rientjes
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton,
 Michal Hocko, Andrea Arcangeli, Mel Gorman, Mike Kravetz, Randy Dunlap,
 Vlastimil Babka, Dave Hansen, Ben Widawsky, Andi Kleen, Dan Williams,
 ying.huang@intel.com
Subject: Re: [RFC Patch v2 1/4] mm/mempolicy: skip nodemask intersect check for 'interleave'
 when oom
Message-ID: <20210524055547.GA48704@shbuild999.sh.intel.com>
References: <1621499404-67756-1-git-send-email-feng.tang@intel.com>
 <1621499404-67756-2-git-send-email-feng.tang@intel.com>
 <682c92e5-ccb3-4b76-1f56-617f8e6e8f2@google.com>
In-Reply-To: <682c92e5-ccb3-4b76-1f56-617f8e6e8f2@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi David,

Thanks for the review!

On Sun, May 23, 2021 at 10:15:00PM -0700, David Rientjes wrote:
> On Thu, 20 May 2021, Feng Tang wrote:
>
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index d79fa29..1964cca 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -2098,7 +2098,7 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
> >   *
> >   * If tsk's mempolicy is "default" [NULL], return 'true' to indicate default
> >   * policy. Otherwise, check for intersection between mask and the policy
> > - * nodemask for 'bind' or 'interleave' policy. For 'preferred' or 'local'
> > + * nodemask for 'bind' policy. For 'interleave', 'preferred' or 'local'
> >   * policy, always return true since it may allocate elsewhere on fallback.
> >   *
> >   * Takes task_lock(tsk) to prevent freeing of its mempolicy.
> > @@ -2111,29 +2111,13 @@ bool mempolicy_nodemask_intersects(struct task_struct *tsk,
> >
> >  	if (!mask)
> >  		return ret;
> > +
> >  	task_lock(tsk);
> >  	mempolicy = tsk->mempolicy;
> > -	if (!mempolicy)
> > -		goto out;
> > -
> > -	switch (mempolicy->mode) {
> > -	case MPOL_PREFERRED:
> > -		/*
> > -		 * MPOL_PREFERRED and MPOL_F_LOCAL are only preferred nodes to
> > -		 * allocate from, they may fallback to other nodes when oom.
> > -		 * Thus, it's possible for tsk to have allocated memory from
> > -		 * nodes in mask.
> > -		 */
> > -		break;
> > -	case MPOL_BIND:
> > -	case MPOL_INTERLEAVE:
> > +	if (mempolicy && mempolicy->mode == MPOL_BIND)
> >  		ret = nodes_intersects(mempolicy->v.nodes, *mask);
>
> If MPOL_INTERLEAVE is deemed only a suggestion, the same could be
> considered true of MPOL_BIND intersection as well, no?

IIUC, 'bind' and 'interleave' differ in how memory is allocated. In
alloc_pages_vma(), there is:

	nmask = policy_nodemask(gfp, pol);
	preferred_nid = policy_node(gfp, pol, node);
	page = __alloc_pages(gfp, order, preferred_nid, nmask);
	mpol_cond_put(pol);

and in policy_nodemask(), only the 'bind' policy may return its desired
nodemask, while all other policies, including 'interleave', return NULL.
That NULL is what allows an 'interleave' allocation to get memory from
nodes outside its nodemask.

So when allocating memory, 'interleave' can get memory from all nodes.
I did some experiments which also confirm this.

Thanks,
Feng

>
> > -		break;
> > -	default:
> > -		BUG();
> > -	}
> > -out:
> >  	task_unlock(tsk);
> > +
> >  	return ret;
> >  }
> >
> > --
> > 2.7.4
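
For illustration only, the distinction argued in the reply above can be modelled in a few lines of user-space C. This is a rough sketch, not kernel code: the `*_model` names, the enum, and the 64-bit bitmask standing in for nodemask_t are all assumptions made up for the example, and locking is left out. It assumes, as the reply states, that only 'bind' passes a restricting nodemask to the page allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per NUMA node. */
typedef uint64_t nodemask_model_t;

/* Hypothetical stand-ins for the MPOL_* modes discussed in the thread. */
enum mpol_mode_model {
	MODEL_PREFERRED,
	MODEL_BIND,
	MODEL_INTERLEAVE,
};

struct mempolicy_model {
	enum mpol_mode_model mode;
	nodemask_model_t nodes;
};

/*
 * Models the behaviour described in the mail: only 'bind' returns its
 * nodemask to restrict the allocation; every other mode returns NULL,
 * which means "no restriction, fallback to any node is allowed".
 */
static const nodemask_model_t *
policy_nodemask_model(const struct mempolicy_model *pol)
{
	assert(pol != NULL);
	return pol->mode == MODEL_BIND ? &pol->nodes : NULL;
}

/*
 * Models the check as it looks after the patch (task locking omitted).
 * A NULL pol stands for the default policy; a NULL mask means there is
 * nothing to intersect with. Both cases report true.
 */
static bool
nodemask_intersects_model(const struct mempolicy_model *pol,
			  const nodemask_model_t *mask)
{
	bool ret = true;

	if (!mask)
		return ret;

	if (pol && pol->mode == MODEL_BIND)
		ret = (pol->nodes & *mask) != 0;

	return ret;
}
```

With a policy covering nodes 0-1 and a mask containing only node 2, the model reports no intersection for 'bind' but reports true for 'interleave', since under the assumption above an 'interleave' task may still have pages on node 2.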