From: Zhaoyang Huang
Date: Wed, 11 Apr 2018 15:48:41 +0800
Subject: Re: [PATCH v1] ringbuffer: Don't choose the process with adj equal OOM_SCORE_ADJ_MIN
To: Joel Fernandes
Cc: Steven Rostedt, Michal Hocko, Ingo Molnar, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 11, 2018 at 2:39 AM, Joel Fernandes wrote:
> Hi Steve,
>
> On Tue, Apr 10, 2018 at 11:00 AM, Steven Rostedt wrote:
>> On Tue, 10 Apr 2018 09:45:54 -0700
>> Joel Fernandes wrote:
>>
>>> > diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
>>> > index a0233edc0718..807e2bcb21b3 100644
>>> > --- a/include/linux/ring_buffer.h
>>> > +++ b/include/linux/ring_buffer.h
>>> > @@ -106,7 +106,8 @@ __poll_t ring_buffer_poll_wait(struct ring_buffer *buffer, int cpu,
>>> >
>>> >  void ring_buffer_free(struct ring_buffer *buffer);
>>> >
>>> > -int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size, int cpu);
>>> > +int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size,
>>> > +                       int cpu, int rbflags);
>>> >
>>> >  void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
>>> >
>>> > @@ -201,6 +202,7 @@ int ring_buffer_print_page_header(struct trace_seq *s);
>>> >
>>> >  enum ring_buffer_flags {
>>> >         RB_FL_OVERWRITE         = 1 << 0,
>>> > +       RB_FL_NO_RECLAIM        = 1 << 1,
>>>
>>> But the thing is, set_oom_origin doesn't seem to be doing the
>>> desirable thing every time anyway as per my tests last week [1] and
>>> the si_mem_available check alone seems to be working fine for me (and
>>> also Zhaoyang as he mentioned).
>>
>> But did you try it with just plain GFP_KERNEL, and not RETRY_MAYFAIL.
>
> Yes I tried it with just GFP_KERNEL as well. What I did based on your
> suggestion for testing the OOM hint is:
> 1. Comment the si_mem_available check
> 2. Do only GFP_KERNEL
>
> The system gets destabilized with this combination even with the OOM
> hint. These threads are here:
> https://lkml.org/lkml/2018/4/5/720
>
>> My tests would always trigger the allocating task without the
>> RETRY_MAYFAIL, but with RETRY_MAYFAIL it would sometimes take out other
>> tasks.
>>
>>>
>>> Since the problem Zhaoyang is now referring to is caused because of
>>> calling set_oom_origin in the first place, can we not just drop that
>>> patch and avoid adding more complexity?
>>
>> Actually, I'm thinking of dropping the MAYFAIL part. It really should
>> be the one targeted if you are extending the ring buffer.
>
> This then sounds like it should be fixed in -mm code? If we're giving
> the hint and its not getting killed there then that's an -mm issue.
>
>> I could add two loops. One that does NORETRY without the oom origin,
>> and if it succeeds, its fine. But if it requires reclaim, it will then
>> set oom_origin and go harder (where it should be the one targeted).
>>
>> But that may be pointless, because if NORETRY succeeds, there's not
>> really any likelihood of oom triggering in the first place.
>
> Yes.
>
>>
>>>
>>> IMHO I feel like for things like RB memory allocation, we shouldn't
>>> add a knob if we don't need to.
>>
>> It was just a suggestion.
>
> Cool, I understand.
>
>>>
>>> Also I think Zhaoyang is developing for Android too since he mentioned
>>> he ran CTS tests so we both have the same "usecase" but he can feel
>>> free to correct me if that's not the case ;)
>>
>> I think if you are really worried with the task being killed by oom,
>> then I agree with Michal and just fork a process to do the allocation
>> for you.
>
> Yes I agree. So lets just do that and no other patches additional
> patches are needed then. Let me know if there's anything else I
> missed?
>
> Also I got a bit confused, I reread all the threads. Zhaoyang's
> current issue is that the OOM hint *IS* working which is what
> triggered your patch to toggle the behavior through an option. Where
> was in this message we are discussing that the OOM hint doesn't always
> work which is not Zhaoyang's current issue. Let me know if I missed
> something? Sorry if I did.
>
> thanks,
>
> - Joel

Hi Joel, you are right. My issue is to make Steven's patch safer by
keeping -1000 processes out of OOM. I think either approach is OK: keep
only the si_mem_available check, or apply set/clear_current_oom_origin
while exempting -1000 processes. The CTS case failed because
system_server was killed even though it was innocent. If Steven thinks
it is a rare corner case, I am OK with that.
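
The two options in the reply above can be sketched as a small userspace
model in C. This is not kernel code and not the patch under discussion:
struct task_model, rb_may_mark_oom_origin() and rb_resize_model() are
hypothetical names made up for illustration, avail_pages merely stands
in for whatever si_mem_available() would report, and the oom_origin
field only models the effect of set/clear_current_oom_origin(). The one
value taken from the kernel is OOM_SCORE_ADJ_MIN, which is -1000.

```c
#include <errno.h>
#include <stdbool.h>

/* Same value the kernel uses for "never OOM-kill this task". */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical stand-in for the few task fields this sketch needs. */
struct task_model {
	int  oom_score_adj;	/* -1000 .. 1000 */
	bool oom_origin;	/* models the oom_origin hint being set */
	bool was_marked;	/* records whether the hint was ever set */
};

/* Second option: never mark a -1000 task as the OOM origin. */
static bool rb_may_mark_oom_origin(const struct task_model *t)
{
	return t->oom_score_adj != OOM_SCORE_ADJ_MIN;
}

/*
 * Models a resize request for nr_pages when avail_pages are believed
 * to be available. Returns 0 on success or -ENOMEM when refused.
 */
static int rb_resize_model(struct task_model *t,
			   unsigned long nr_pages,
			   unsigned long avail_pages)
{
	/* First option: refuse up front when the request cannot fit. */
	if (nr_pages > avail_pages)
		return -ENOMEM;

	if (rb_may_mark_oom_origin(t)) {
		t->oom_origin = true;
		t->was_marked = true;
	}

	/* ... the page allocation loop would run here ... */

	/* The hint is always dropped once the allocation is done. */
	t->oom_origin = false;
	return 0;
}
```

In this model a task with oom_score_adj of -1000, such as system_server
in the CTS failure, can still resize the buffer but is never flagged as
the preferred OOM victim, while an ordinary task is flagged only for the
duration of the allocation. An oversized request is rejected before any
hint is set, which is why the availability check alone already covers
the common case.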