Date: Fri, 30 Jun 2017 15:32:36 +0200
From: Michal Hocko
To: Tetsuo Handa
Cc: hannes@cmpxchg.org, riel@redhat.com, akpm@linux-foundation.org, mgorman@suse.de, vbabka@suse.cz, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm, vmscan: do not loop on too_many_isolated for ever
Message-ID: <20170630133236.GM22917@dhcp22.suse.cz>
In-Reply-To: <201706300914.CEH95859.FMQOLVFHJFtOOS@I-love.SAKURA.ne.jp>

On Fri 30-06-17 09:14:22, Tetsuo Handa wrote:
[...]
> Ping? Ping? When are we going to apply this patch or watchdog patch?
> This problem occurs with not so insane stress like shown below.
> I can't test almost OOM situation because test likely falls into either
> printk() v.s. oom_lock lockup problem or this too_many_isolated() problem.

So you are saying that the patch fixes this issue. Do I understand you
correctly? And you do not see any other negative side effects with it
applied?

I am sorry I didn't have much time to think about feedback from Johannes
yet. A more robust throttling method is surely due but also not trivial.
So I am not sure how to proceed.
It is true that your last test case with only 10 processes fighting
resembles the reality much better than hundreds (AFAIR) that you were
using previously.

Rik, Johannes, what do you think? Should we go with the simpler approach
for now and think of a better plan long term?
-- 
Michal Hocko
SUSE Labs