Date: Tue, 11 Apr 2017 12:32:29 -0500 (CDT)
From: Christoph Lameter
X-X-Sender: cl@east.gentwo.org
To: Vlastimil Babka
cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Li Zefan, Michal Hocko, Mel Gorman, David Rientjes, Hugh Dickins,
    Andrea Arcangeli, Anshuman Khandual, "Kirill A. Shutemov"
Subject: Re: [RFC 2/6] mm, mempolicy: stop adjusting current->il_next in mpol_rebind_nodemask()
In-Reply-To: <20170411140609.3787-3-vbabka@suse.cz>
References: <20170411140609.3787-1-vbabka@suse.cz> <20170411140609.3787-3-vbabka@suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 11 Apr 2017, Vlastimil Babka wrote:

> The task->il_next variable remembers the last allocation node for task's
> MPOL_INTERLEAVE policy. mpol_rebind_nodemask() updates interleave and
> bind mempolicies due to changing cpuset mems.
> Currently it also tries to
> make sure that current->il_next is valid within the updated nodemask. This is
> bogus, because 1) we are updating potentially any task's mempolicy, not just
> current, and 2) we might be updating per-vma mempolicy, not task one.
>
> The interleave_nodes() function that uses il_next can cope fine with the value
> not being within the currently allowed nodes, so this hasn't manifested as an
> actual issue. Thus it also won't be an issue if we just remove this adjustment
> completely.

Well, interleave_nodes() will then potentially return a node outside of
the allowed memory policy when it is called for the first time after
mpol_rebind_.. . But then it will find the next node within the
nodemask and work correctly for subsequent invocations.

But yeah, the race can probably be ignored. The idea was that the
application has a stable memory footprint during rebinding.
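
To make the behavior described above concrete, here is a minimal user-space
sketch of it. This is a toy model, not the kernel code: struct toy_task,
toy_next_node_in(), toy_interleave_nodes(), MAX_NODES and the 8-bit mask are
all made up for illustration. The only thing it assumes about
interleave_nodes() is what the thread says: it hands out the remembered
il_next and then advances il_next to the next node inside the current
nodemask.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 8 /* hypothetical toy limit, one bit per node */

/* Hypothetical stand-in for the per-task il_next field. */
struct toy_task {
	unsigned il_next;
};

/*
 * Return the first node after nid (wrapping around) whose bit is set
 * in mask. Toy analogue of a "next node in nodemask" helper.
 */
static unsigned toy_next_node_in(unsigned nid, uint8_t mask)
{
	assert(mask != 0);
	assert(nid < MAX_NODES);

	for (unsigned i = 1; i <= MAX_NODES; i++) {
		unsigned cand = (nid + i) % MAX_NODES;

		if (mask & (1u << cand))
			return cand;
	}
	return MAX_NODES; /* unreachable while mask != 0 */
}

/*
 * Hand out the remembered node without validating it, then advance
 * il_next to a node that is inside the allowed mask.
 */
static unsigned toy_interleave_nodes(struct toy_task *t, uint8_t allowed)
{
	unsigned nid = t->il_next;

	t->il_next = toy_next_node_in(nid, allowed);
	return nid;
}
```

With il_next left at 3 and the mask rebound from nodes {2,3} to nodes {0,1},
the first call in this model still returns 3, which is outside the new mask,
and every later call alternates between 0 and 1.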