From: Linus Torvalds
Date: Wed, 23 Aug 2017 11:17:30 -0700
Subject: Re: [PATCH 1/2] sched/wait: Break up long wake list walk
To: Tim Chen
Cc: "Liang, Kan", Mel Gorman, "Kirill A. Shutemov", Peter Zijlstra, Ingo Molnar, Andi Kleen, Andrew Morton, Johannes Weiner, Jan Kara, linux-mm, Linux Kernel Mailing List
List-ID: linux-kernel@vger.kernel.org

On Wed, Aug 23, 2017 at 8:58 AM, Tim Chen wrote:
>
> Will you still consider the original patch as a fail safe mechanism?

I don't think we have much choice, although I would *really* want to
get this root-caused rather than just papering over the symptoms.
It's probably still worth testing that "sched/numa: Scale scan period
with tasks in group and shared/private" patch that Mel mentioned. In
fact, looking at the patch description, it does seem to match this
particular load rather well. Quoting from the commit message:

 "Running 80 tasks in the same group, or as threads of the same
  process, results in the memory getting scanned 80x as fast as it
  would be if a single task was using the memory. This really hurts
  some workloads"

So if 80 threads cause 80x as much scanning, a few thousand threads
might indeed be really, really bad.

So once more unto the breach, dear friends, once more. Please.

The patch got applied to -tip as commit b5dd77c8bdad, and can be
downloaded here:

  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=b5dd77c8bdada7b6262d0cba02a6ed525bf4e6e1

(Hmm. It says it's cc'd to me, but I never noticed that patch simply
because it was in a big group of other -tip commits. Oh well.)

Linus