Date: Tue, 22 Aug 2017 14:24:08 -0700
From: Andi Kleen
To: Christopher Lameter
Cc: Linus Torvalds, Peter Zijlstra, "Liang, Kan", Mel Gorman, Mel Gorman, "Kirill A. Shutemov", Tim Chen, Ingo Molnar, Andrew Morton, Johannes Weiner, Jan Kara, linux-mm, Linux Kernel Mailing List
Subject: Re: [PATCH 1/2] sched/wait: Break up long wake list walk
Message-ID: <20170822212408.GC28715@tassilo.jf.intel.com>

On Tue, Aug 22, 2017 at 04:08:52PM -0500, Christopher Lameter wrote:
> On Tue, 22 Aug 2017, Andi Kleen wrote:
>
> > We only see it on 4S+ today. But systems are always getting larger,
> > so what's a large system today, will be a normal medium scale system
> > tomorrow.
> >
> > BTW we also collected PT traces for the long hang cases, but it was
> > hard to find a consistent pattern in them.
>
> Hmmm... Maybe it would be wise to limit the pages autonuma can migrate?
>
> If a page has more than 50 refcounts or so then don't migrate it. I think
> high number of refcounts and a high frequency of calls are reached in
> particular for pages of the C library. Attempting to migrate those does
> not make much sense anyways because the load may shift and another
> function may become popular. We may end up shifting very difficult to
> migrate pages back and forth.

I believe in this case it's used by threads, so a reference count limit
wouldn't help.

If migrating code was a problem I would probably rather just disable
migration of read-only pages.

-Andi
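
To make the difference between the two proposals concrete, here is a rough
userspace sketch in C. It is a toy model only, not kernel code: struct
toy_page, its fields, and both function names are hypothetical and do not
correspond to any real kernel interface. It assumes that threads of one
process share a single mapping of a page, which is the reason a reference
count limit would not catch the threaded case.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page. NOT the kernel's struct page; fields are hypothetical. */
struct toy_page {
    int  mappers;    /* address spaces mapping the page; threads share one */
    bool read_only;  /* true for e.g. shared library text */
};

/* Policy A (the quoted suggestion): do not migrate heavily referenced pages. */
static bool migrate_ok_ref_limit(const struct toy_page *p, int limit)
{
    assert(limit > 0);
    return p->mappers <= limit;
}

/* Policy B (the alternative in the reply): do not migrate read-only pages. */
static bool migrate_ok_skip_readonly(const struct toy_page *p)
{
    return !p->read_only;
}
```

Under these assumptions, a C library text page used by many threads of one
process would have mappers == 1 and read_only == true: policy A with a limit
of 50 would still allow migration, while policy B would skip the page.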