Date: Fri, 29 Jun 2018 15:26:38 +0200
From: Michal Hocko
To: "Paul E. McKenney"
Cc: Tetsuo Handa, David Rientjes, linux-mm@kvack.org, Andrew Morton, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm,oom: Bring OOM notifier callbacks to outside of OOM killer.
Message-ID: <20180629132638.GD5963@dhcp22.suse.cz>
In-Reply-To: <20180629125218.GX3593@linux.vnet.ibm.com>

On Fri 29-06-18 05:52:18, Paul E. McKenney wrote:
> On Fri, Jun 29, 2018 at 11:04:19AM +0200, Michal Hocko wrote:
> > On Thu 28-06-18 14:31:05, Paul E. McKenney wrote:
> > > On Thu, Jun 28, 2018 at 01:39:42PM +0200, Michal Hocko wrote:
[...]
> > > > Well, I am not really sure what is the objective of the oom notifier to
> > > > point you to the right direction. IIUC you just want to kick callbacks
> > > > to be handled sooner under a heavy memory pressure, right? How is that
> > > > achieved? Kick a worker?
> > >
> > > That is achieved by enqueuing a non-lazy callback on each CPU's callback
> > > list, but only for those CPUs having non-empty lists. This causes
> > > CPUs with lists containing only lazy callbacks to be more aggressive,
> > > in particular, it prevents such CPUs from hanging out idle for seconds
> > > at a time while they have callbacks on their lists.
> > >
> > > The enqueuing happens via an IPI to the CPU in question.
> >
> > I am afraid this is too low level for me to understand what is going on
> > here. What are lazy callbacks and why do they need any specific action
> > when we are getting close to OOM? I mean, I do understand that we might
> > have many callers of call_rcu and free memory lazily. But there is quite
> > a long way before we start the reclaim until we reach the OOM killer path.
> > So why don't those callbacks get called during that time period? How are
> > they triggered when we are not hitting the OOM path? They surely cannot
> > sit there for ever, right? Can we trigger them sooner? Maybe the
> > shrinker is not the best fit but we have a retry feedback loop in the page
> > allocator, maybe we can kick this processing from there.
>
> The effect of RCU's current OOM code is to speed up callback invocation
> by at most a few seconds (assuming no stalled CPUs, in which case
> it is not possible to speed up callback invocation).
>
> Given that, I should just remove RCU's OOM code entirely?

Yeah, it seems so. I do not see how this would really help much. If we
really need some way to kick callbacks then we should do so much earlier
in the reclaim process - e.g. when we start struggling to reclaim any
memory.

I am curious. Has the notifier been motivated by a real world use case
or was it a "nice thing to do"?
-- 
Michal Hocko
SUSE Labs