Subject: Re: [RFC 2/6] mm, mempolicy: stop adjusting current->il_next in mpol_rebind_nodemask()
To: Christoph Lameter
References: <20170411140609.3787-1-vbabka@suse.cz> <20170411140609.3787-3-vbabka@suse.cz>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Li Zefan, Michal Hocko, Mel Gorman, David Rientjes, Hugh Dickins,
    Andrea Arcangeli, Anshuman Khandual, "Kirill A. Shutemov"
From: Vlastimil Babka
Message-ID: <9665a022-197a-4b02-8813-66aca252f0f9@suse.cz>
Date: Tue, 11 Apr 2017 21:03:53 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 11.4.2017 19:32, Christoph Lameter wrote:
> On Tue, 11 Apr 2017, Vlastimil Babka wrote:
>
>> The task->il_next variable remembers the last allocation node for task's
>> MPOL_INTERLEAVE policy. mpol_rebind_nodemask() updates interleave and
>> bind mempolicies due to changing cpuset mems. Currently it also tries to
>> make sure that current->il_next is valid within the updated nodemask. This is
>> bogus, because 1) we are updating potentially any task's mempolicy, not just
>> current, and 2) we might be updating per-vma mempolicy, not task one.
>>
>> The interleave_nodes() function that uses il_next can cope fine with the value
>> not being within the currently allowed nodes, so this hasn't manifested as an
>> actual issue.
>> Thus it also won't be an issue if we just remove this adjustment
>> completely.
>
> Well, interleave_nodes() will then potentially return a node outside of
> the allowed memory policy when it's called for the first time after
> mpol_rebind_.. . But then it will find the next node within the
> nodemask and work correctly for the next invocations.

Hmm, you're right. But that could be easily fixed if il_next became il_prev, so
we would return the result of next_node_in(il_prev) and also store it as the new
il_prev, right? I somehow assumed it already worked that way.

> But yeah, the race can probably be ignored. The idea was that the
> application has a stable memory footprint during rebinding.
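
To make the il_prev suggestion above concrete, here is a rough userspace
sketch, not kernel code and not a proposed patch. Everything prefixed toy_ or
TOY_ is made up for illustration: the nodemask is a plain bitmask, and
toy_next_node_in() only imitates the wrap-around behaviour of next_node_in().
The point it tries to show is that if the task stores the previously used node
and picks the next allowed node at allocation time, the very first call after
a rebind already lands inside the new nodemask, even when the stored value is
stale.

```c
#include <assert.h>

#define TOY_MAX_NODES 8  /* hypothetical stand-in for MAX_NUMNODES */

/* Toy nodemask: bit n set means node n is allowed. */
typedef unsigned int toy_nodemask_t;

/* Hypothetical task state: the node used by the previous allocation. */
struct toy_task {
	int il_prev;
};

/*
 * Imitates next_node_in(): return the first allowed node strictly after
 * 'node', wrapping around to the lowest allowed node. Returns
 * TOY_MAX_NODES if the mask is empty.
 */
static int toy_next_node_in(int node, toy_nodemask_t mask)
{
	assert(node >= 0 && node < TOY_MAX_NODES);

	for (int i = 1; i <= TOY_MAX_NODES; i++) {
		int candidate = (node + i) % TOY_MAX_NODES;

		if (mask & (1u << candidate))
			return candidate;
	}
	return TOY_MAX_NODES;
}

/*
 * The il_prev variant: compute the next node from the previous one,
 * remember it as the new il_prev, and return it. The result is always
 * within 'allowed' (unless 'allowed' is empty), regardless of whether
 * il_prev itself is still an allowed node.
 */
static int toy_interleave_nodes(struct toy_task *t, toy_nodemask_t allowed)
{
	int next = toy_next_node_in(t->il_prev, allowed);

	if (next < TOY_MAX_NODES)
		t->il_prev = next;
	return next;
}
```

For example, a task whose il_prev is 1 and whose nodemask is rebound to nodes
{4,5} would get node 4 on its first call and then alternate between 5 and 4,
without mpol_rebind_nodemask() having to touch the task at all.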