Date: Thu, 30 Oct 2014 13:25:42 -0500
From: Alex Thorlton
To: Andi Kleen
Cc: Alex Thorlton, linux-mm@kvack.org, Andrew Morton, Bob Liu,
    David Rientjes, "Eric W. Biederman", Hugh Dickins, Ingo Molnar,
    Kees Cook, "Kirill A. Shutemov", Mel Gorman, Oleg Nesterov,
    Peter Zijlstra, Rik van Riel, Thomas Gleixner, Vladimir Davydov,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] Convert khugepaged to a task_work function
Message-ID: <20141030182542.GB2984@sgi.com>

On Thu, Oct 30, 2014 at 09:35:44AM +0100, Andi Kleen wrote:
> We already have too many VM tunables. Better would be to switch
> automatically somehow.
>
> I guess you could use some kind of work stealing scheduler, but these
> are fairly complicated. Maybe some simpler heuristics can be found.

That would be a better option in general, but (admittedly not having
thought about it much) I can't think of a good way to determine when to
make that switch. The main problem is that we're not really seeing a
negative performance impact from khugepaged, but rather some undesired
behavior, which always exists.

Perhaps we could make a decision based on the number of remote
allocations made by khugepaged?
If we see a lot of allocations to distant nodes, then maybe we tell
khugepaged to stop running scans for a particular process/mm and let
the job handle things itself, either using the task_work style scan
that I've proposed, or just banning khugepaged, period. Again, I don't
think this is a very good way to make the decision, but it's something
to think about.

> BTW my thinking has been usually to actually use more khugepageds to
> scan large address spaces faster.

I hadn't thought of it, but I suppose that is an option as well. Unless
I've completely missed something in the code, I don't think there's a
way to do this now, right? Either way, I suppose it wouldn't be too
hard to do, but this still leaves the window wide open for allocations
to be made far away from where the process really needs them. Maybe if
we had a way to spin up a new khugepaged on the fly, so that users
could pin it where they want it, that would work?

Just brainstorming here...

- Alex
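
To make the remote-allocation idea above a bit more concrete, here is a
rough user-space sketch of what such a heuristic might look like. It is
only an illustration: struct collapse_stats, record_collapse_alloc(),
should_stop_khugepaged_scan(), and both thresholds are hypothetical
names and values made up for this example, and nothing like them exists
in the kernel or in the posted patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-mm counters; not an existing kernel structure. */
struct collapse_stats {
	uint64_t local_allocs;	/* huge pages allocated on the task's node */
	uint64_t remote_allocs;	/* huge pages allocated on any other node */
};

/* Assumed thresholds, chosen arbitrarily for the sketch. */
#define MIN_SAMPLES		64	/* don't decide on too little data */
#define REMOTE_PCT_LIMIT	25	/* stop once >25% of allocations are remote */

/* Record one collapse allocation, given the node ids involved. */
static void record_collapse_alloc(struct collapse_stats *s,
				  int page_nid, int task_nid)
{
	if (page_nid == task_nid)
		s->local_allocs++;
	else
		s->remote_allocs++;
}

/*
 * Return true when khugepaged should stop scanning this mm and leave
 * the work to the task itself (or to nobody at all).
 */
static bool should_stop_khugepaged_scan(const struct collapse_stats *s)
{
	uint64_t total = s->local_allocs + s->remote_allocs;

	if (total < MIN_SAMPLES)
		return false;

	/* Integer form of: remote / total > REMOTE_PCT_LIMIT / 100 */
	return s->remote_allocs * 100 > total * REMOTE_PCT_LIMIT;
}
```

The minimum sample count is there so that a handful of early remote
allocations doesn't flip the decision. Whether a fixed percentage is
the right signal at all is exactly the open question raised above.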