Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1756930AbdCUKhn (ORCPT ); Tue, 21 Mar 2017 06:37:43 -0400
Received: from www262.sakura.ne.jp ([202.181.97.72]:56440
 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1756728AbdCUKhl (ORCPT ); Tue, 21 Mar 2017 06:37:41 -0400
To: mhocko@kernel.org, hannes@cmpxchg.org
Cc: riel@redhat.com, akpm@linux-foundation.org, mgorman@suse.de,
 vbabka@suse.cz, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever
From: Tetsuo Handa
References: <20170307133057.26182-1-mhocko@kernel.org>
 <1488916356.6405.4.camel@redhat.com>
 <20170309180540.GA8678@cmpxchg.org>
 <20170310102010.GD3753@dhcp22.suse.cz>
 <201703102044.DBJ04626.FLVMFOQOJtOFHS@I-love.SAKURA.ne.jp>
In-Reply-To: <201703102044.DBJ04626.FLVMFOQOJtOFHS@I-love.SAKURA.ne.jp>
Message-Id: <201703211937.FDE04610.OSQOFtOFFHMJVL@I-love.SAKURA.ne.jp>
X-Mailer: Winbiff [Version 2.51 PL2]
X-Accept-Language: ja,en,zh
Date: Tue, 21 Mar 2017 19:37:39 +0900
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2319
Lines: 42

On 2017/03/10 20:44, Tetsuo Handa wrote:
> Michal Hocko wrote:
>> On Thu 09-03-17 13:05:40, Johannes Weiner wrote:
>>>> It may be OK, I just do not understand all the implications.
>>>>
>>>> I like the general direction your patch takes the code in,
>>>> but I would like to understand it better...
>>>
>>> I feel the same way. The throttling logic doesn't seem to be very well
>>> thought out at the moment, making it hard to reason about what happens
>>> in certain scenarios.
>>>
>>> In that sense, this patch isn't really an overall improvement to the
>>> way things work.
>>> It patches a hole that seems to be exploitable only
>>> from an artificial OOM torture test, at the risk of regressing high
>>> concurrency workloads that may or may not be artificial.
>>>
>>> Unless I'm mistaken, there doesn't seem to be a whole lot of urgency
>>> behind this patch. Can we think about a general model to deal with
>>> allocation concurrency?
>>
>> I am definitely not against. There is no reason to rush the patch in.
>
> I don't hurry if we can check using watchdog whether this problem is occurring
> in the real world. I have to test corner cases because watchdog is missing.

Today I tested linux-next-20170321 under not-so-insane stress, and I hit this
problem again. Thus, I think this problem might occur in the real world.

http://I-love.SAKURA.ne.jp/tmp/serial-20170321.txt.xz (Logs from before swapoff are omitted.)
----------
[ 2250.175109] MemAlloc-Info: stalling=16 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2257.535653] MemAlloc-Info: stalling=16 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2319.806880] MemAlloc-Info: stalling=19 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2320.722282] MemAlloc-Info: stalling=19 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2381.243393] MemAlloc-Info: stalling=20 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2389.777052] MemAlloc-Info: stalling=20 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2450.878287] MemAlloc-Info: stalling=20 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2459.386321] MemAlloc-Info: stalling=20 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2520.500633] MemAlloc-Info: stalling=20 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2529.042088] MemAlloc-Info: stalling=20 dying=0 exiting=4 victim=0 oom_count=1155386
----------
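
In the excerpt above, stalling grows from 16 to 20 while oom_count stays at 1155386 for roughly 280 seconds. A minimal Python sketch for pulling those numbers out of such a log could look like the following. It assumes only the line layout visible in the excerpt; the function names are hypothetical and not part of any kernel tooling.

```python
import re

# Pattern for the MemAlloc-Info lines quoted above. The field names come from
# the log excerpt itself; nothing else is assumed about the format.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+MemAlloc-Info:\s+"
    r"stalling=(?P<stalling>\d+)\s+dying=(?P<dying>\d+)\s+"
    r"exiting=(?P<exiting>\d+)\s+victim=(?P<victim>\d+)\s+"
    r"oom_count=(?P<oom_count>\d+)"
)


def parse_memalloc_info(text):
    """Return one dict per MemAlloc-Info line found in text."""
    records = []
    for m in LINE_RE.finditer(text):
        # Counters become ints; the timestamp is kept as float seconds.
        rec = {k: int(v) for k, v in m.groupdict().items() if k != "ts"}
        rec["ts"] = float(m.group("ts"))
        records.append(rec)
    return records


def summarize(records):
    """Hypothetical helper: compare the first and last parsed records."""
    first, last = records[0], records[-1]
    return {
        "elapsed_s": last["ts"] - first["ts"],
        "oom_count_delta": last["oom_count"] - first["oom_count"],
        "stalling_first": first["stalling"],
        "stalling_last": last["stalling"],
    }


# First and last lines of the excerpt, copied verbatim.
SAMPLE = """\
[ 2250.175109] MemAlloc-Info: stalling=16 dying=0 exiting=4 victim=0 oom_count=1155386
[ 2529.042088] MemAlloc-Info: stalling=20 dying=0 exiting=4 victim=0 oom_count=1155386
"""

if __name__ == "__main__":
    print(summarize(parse_memalloc_info(SAMPLE)))
```

Feeding the full decompressed serial log to parse_memalloc_info() instead of SAMPLE would give the same summary over the whole run.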