Date: Sun, 30 Apr 2017 16:33:10 -0500 (CDT)
From: Christoph Lameter
To: Vlastimil Babka
cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Li Zefan, Michal Hocko, Mel Gorman, David Rientjes, Hugh Dickins, Andrea Arcangeli, Anshuman Khandual, "Kirill A. Shutemov", linux-api@vger.kernel.org
Subject: Re: [RFC 1/6] mm, page_alloc: fix more premature OOM due to race with cpuset update
References: <20170411140609.3787-1-vbabka@suse.cz> <20170411140609.3787-2-vbabka@suse.cz>

On Wed, 26 Apr 2017, Vlastimil Babka wrote:

> > Such an application typically already has such logic and executes a
> > binding after discovering its numa node configuration on startup.
> > It would have to be modified to redo that action when it gets some sort of a signal
> > from the script telling it that the node config would be changed.
> >
> > Having this logic in the application instead of the kernel avoids all the
> > kernel messes that we keep on trying to deal with and IMHO is much
> > cleaner.
>
> That would be much simpler for us indeed. But we still IMHO can't
> abruptly start denying page fault allocations for existing applications
> that don't have the necessary awareness.

We certainly can do that. The failure of the page faults is due to the
admin trying to move an application that is not aware of this and is using
mempols. That could be an error. Trying to move an application that
contains both absolute and relative node numbers is definitely something
that is potentially so screwed up that the kernel should not muck around
with such an app.

Also user space can determine if the application is using memory policies
and can then take appropriate measures (message to the sysadmin to eval
the situation f.e.) or mess around with the process's memory policies on
its own.

So this is certainly a way out of this mess.
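
To illustrate the user space check mentioned above, here is a minimal sketch in Python, not something taken from the patch series. It assumes that the policy column of /proc/<pid>/numa_maps is the source of information, and that "absolute" and "relative" node numbers correspond to the "static" and "relative" mode flags printed there (e.g. "bind=static:0-1"). The function names parse_numa_maps, summarize and check_pid are hypothetical.

```python
import re

# Second column of a numa_maps line, e.g. "default", "bind:0-1",
# "interleave=relative:0-3" or "prefer=static:2".
_POLICY_RE = re.compile(
    r"^(?P<mode>[a-z]+)(?:=(?P<flag>[a-z]+))?(?::(?P<nodes>[\d,\-]+))?$"
)


def parse_numa_maps(text):
    """Return (address, mode, flag, nodes) for every parsable mapping line."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        match = _POLICY_RE.match(fields[1])
        if match is None:
            continue  # skip lines whose policy column we do not understand
        entries.append(
            (fields[0], match.group("mode"), match.group("flag"), match.group("nodes"))
        )
    return entries


def summarize(text):
    """Report whether any mapping uses a non-default policy, and whether
    static and relative node numbering are both present."""
    explicit = [e for e in parse_numa_maps(text) if e[1] != "default"]
    flags = {e[2] for e in explicit if e[2]}
    return {
        "uses_mempolicy": bool(explicit),
        "mixes_static_and_relative": {"static", "relative"} <= flags,
    }


def check_pid(pid):
    """Assumption: the caller has permission to read the target's numa_maps."""
    with open(f"/proc/{pid}/numa_maps") as f:
        return summarize(f.read())
```

A management script could call check_pid() before changing a cpuset and warn the sysadmin instead of moving the task when uses_mempolicy is set. Note that this only sees what numa_maps reports per mapping, so it is a heuristic rather than a complete view of the task's policies.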