Date: Fri, 29 Jun 2018 11:04:19 +0200
From: Michal Hocko
To: "Paul E. McKenney"
Cc: Tetsuo Handa, David Rientjes, linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.
Message-ID: <20180629090419.GD13860@dhcp22.suse.cz>
In-Reply-To: <20180628213105.GP3593@linux.vnet.ibm.com>
References: <1529493638-6389-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>
 <20180621073142.GA10465@dhcp22.suse.cz>
 <2d8c3056-1bc2-9a32-d745-ab328fd587a1@i-love.sakura.ne.jp>
 <20180626170345.GA3593@linux.vnet.ibm.com>
 <20180627072207.GB32348@dhcp22.suse.cz>
 <20180627143125.GW3593@linux.vnet.ibm.com>
 <20180628113942.GD32348@dhcp22.suse.cz>
 <20180628213105.GP3593@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 28-06-18 14:31:05, Paul E. McKenney wrote:
> On Thu, Jun 28, 2018 at 01:39:42PM +0200, Michal Hocko wrote:
> > On Wed 27-06-18 07:31:25, Paul E. McKenney wrote:
> > > On Wed, Jun 27, 2018 at 09:22:07AM +0200, Michal Hocko wrote:
> > > > On Tue 26-06-18 10:03:45, Paul E. McKenney wrote:
> > > > [...]
> > > > > 3. Something else?
> > > >
> > > > How hard it would be to use a different API than oom notifiers? E.g. a
> > > > shrinker which just kicks all the pending callbacks if the reclaim
> > > > priority reaches low values (e.g. 0)?
> > >
> > > Beats me. What is a shrinker? ;-)
> >
> > This is a generic mechanism to reclaim memory that is not on standard
> > LRU lists. Lwn.net surely has some nice coverage (e.g.
> > https://lwn.net/Articles/548092/).
>
> "In addition, there is little agreement over what a call to a shrinker
> really means or how the called subsystem should respond." ;-)
>
> Is this set up using register_shrinker() in mm/vmscan.c? I am guessing

Yes, exactly. You are supposed to implement the two methods in struct
shrinker (->count_objects() and ->scan_objects()).

> that the many mentions of shrinker in DRM are irrelevant.
>
> If my guess is correct, the API seems a poor fit for RCU. I can
> produce an approximate number of RCU callbacks for ->count_objects(),
> but a given callback might free a lot of memory or none at all. Plus,
> to actually have ->scan_objects() free them before returning, I would
> need to use something like rcu_barrier(), which might involve longer
> delays than desired.

Well, I am not yet sure how good a fit this is because I still do not
understand the underlying problem your notifier is trying to solve. So I
will get back to this once that is settled.

> Or am I missing something here?
>
> > > More seriously, could you please point me at an exemplary shrinker
> > > use case so I can see what is involved?
> >
> > Well, I am not really sure what is the objective of the oom notifier to
> > point you to the right direction. IIUC you just want to kick callbacks
> > to be handled sooner under a heavy memory pressure, right? How is that
> > achieved? Kick a worker?
>
> That is achieved by enqueuing a non-lazy callback on each CPU's callback
> list, but only for those CPUs having non-empty lists. This causes
> CPUs with lists containing only lazy callbacks to be more aggressive,
> in particular, it prevents such CPUs from hanging out idle for seconds
> at a time while they have callbacks on their lists.
>
> The enqueuing happens via an IPI to the CPU in question.

I am afraid this is too low level for me to understand what is going on
here. What are lazy callbacks and why do they need any specific action
when we are getting close to OOM? I mean, I do understand that we might
have many callers of call_rcu() that free memory lazily. But there is
quite a long way from the start of reclaim until we reach the OOM killer
path. So why don't those callbacks get called during that time period?
How are they triggered when we are not hitting the OOM path? They surely
cannot sit there forever, right? Can we trigger them sooner?
Maybe the shrinker is not the best fit, but we have a retry feedback loop
in the page allocator; maybe we can kick this processing from there.
-- 
Michal Hocko
SUSE Labs