From: Joel Fernandes
Date: Tue, 10 Apr 2018 11:39:24 -0700
Subject: Re: [PATCH v1] ringbuffer: Don't choose the process with adj equal OOM_SCORE_ADJ_MIN
To: Steven Rostedt
Cc: Michal Hocko, Zhaoyang Huang, Ingo Molnar, LKML
In-Reply-To: <20180410140036.650a8732@gandalf.local.home>
References: <20180410061447.GQ21835@dhcp22.suse.cz>
 <20180410074921.GU21835@dhcp22.suse.cz>
 <20180410081231.GV21835@dhcp22.suse.cz>
 <20180410090128.GY21835@dhcp22.suse.cz>
 <20180410104902.GC21835@dhcp22.suse.cz>
 <20180410082316.263d34ec@gandalf.local.home>
 <20180410122706.GH21835@dhcp22.suse.cz>
 <20180410083625.2c904ab2@gandalf.local.home>
 <20180410091311.20bd8ccc@gandalf.local.home>
 <20180410140036.650a8732@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Steve,

On Tue, Apr 10, 2018 at 11:00 AM, Steven Rostedt wrote:
> On Tue, 10 Apr 2018 09:45:54 -0700
> Joel Fernandes wrote:
>
>> > diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
>> > index a0233edc0718..807e2bcb21b3 100644
>> > --- a/include/linux/ring_buffer.h
>> > +++ b/include/linux/ring_buffer.h
>> > @@ -106,7 +106,8 @@ __poll_t ring_buffer_poll_wait(struct ring_buffer *buffer, int cpu,
>> >
>> >  void ring_buffer_free(struct ring_buffer *buffer);
>> >
>> > -int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size, int cpu);
>> > +int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size,
>> > +                       int cpu, int rbflags);
>> >
>> >  void ring_buffer_change_overwrite(struct ring_buffer *buffer, int val);
>> >
>> > @@ -201,6 +202,7 @@ int ring_buffer_print_page_header(struct trace_seq *s);
>> >
>> >  enum ring_buffer_flags {
>> >         RB_FL_OVERWRITE         = 1 << 0,
>> > +       RB_FL_NO_RECLAIM        = 1 << 1,
>>
>> But the thing is, set_oom_origin doesn't seem to be doing the
>> desirable thing every time anyway as per my tests last week [1] and
>> the si_mem_available check alone seems to be working fine for me (and
>> also Zhaoyang as he mentioned).
>
> But did you try it with just plain GFP_KERNEL, and not RETRY_MAYFAIL.

Yes, I tried it with just GFP_KERNEL as well. What I did, based on your
suggestion for testing the OOM hint, is:
1. Comment out the si_mem_available check
2. Do only GFP_KERNEL

The system gets destabilized with this combination even with the OOM
hint. These threads are here:
https://lkml.org/lkml/2018/4/5/720

> My tests would always trigger the allocating task without the
> RETRY_MAYFAIL, but with RETRY_MAYFAIL it would sometimes take out other
> tasks.
>
>>
>> Since the problem Zhaoyang is now referring to is caused because of
>> calling set_oom_origin in the first place, can we not just drop that
>> patch and avoid adding more complexity?
>
> Actually, I'm thinking of dropping the MAYFAIL part. It really should
> be the one targeted if you are extending the ring buffer.

This then sounds like it should be fixed in -mm code? If we're giving
the hint and the task is not getting killed there, then that's an -mm
issue.

> I could add two loops. One that does NORETRY without the oom origin,
> and if it succeeds, its fine. But if it requires reclaim, it will then
> set oom_origin and go harder (where it should be the one targeted).
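
To make the two-loop idea quoted above concrete, here is a rough
userspace sketch. It is not kernel code and not a proposed patch:
struct sim_mem, sim_alloc() and resize_two_pass() are hypothetical names
invented for illustration, and the "reclaim" model is deliberately
simplistic. It only shows the control flow: a cheap pass that never
reclaims, then a harder pass with the OOM hint set around it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for system memory state (illustration only). */
struct sim_mem {
	size_t free_pages;        /* pages available without reclaim */
	size_t reclaimable_pages; /* pages obtainable only via reclaim */
	bool   oom_origin;        /* models the "target me first" OOM hint */
};

enum alloc_mode { ALLOC_NORETRY, ALLOC_RECLAIM };

/* Try to take nr pages; ALLOC_NORETRY never dips into reclaimable pages. */
static bool sim_alloc(struct sim_mem *m, size_t nr, enum alloc_mode mode)
{
	if (nr <= m->free_pages) {
		m->free_pages -= nr;
		return true;
	}
	if (mode == ALLOC_NORETRY)
		return false;
	if (nr <= m->free_pages + m->reclaimable_pages) {
		m->reclaimable_pages -= nr - m->free_pages;
		m->free_pages = 0;
		return true;
	}
	return false;
}

/*
 * Two-pass resize: returns 0 on success, -1 on failure (stand-in for
 * -ENOMEM). *used_hint reports whether the second pass was needed.
 */
static int resize_two_pass(struct sim_mem *m, size_t nr, bool *used_hint)
{
	bool ok;

	assert(m && used_hint);
	*used_hint = false;

	/* Pass 1: no reclaim, no OOM hint. Success means no memory pressure. */
	if (sim_alloc(m, nr, ALLOC_NORETRY))
		return 0;

	/* Pass 2: mark ourselves as the preferred OOM victim and go harder. */
	*used_hint = true;
	m->oom_origin = true;
	ok = sim_alloc(m, nr, ALLOC_RECLAIM);
	m->oom_origin = false;

	return ok ? 0 : -1;
}
```

In this model the hint is only ever set when the first pass has already
failed, which is the point made in the next quoted paragraph: if the
first pass succeeds, there was little chance of OOM to begin with.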
>
> But that may be pointless, because if NORETRY succeeds, there's not
> really any likelihood of oom triggering in the first place.

Yes.

>
>>
>> IMHO I feel like for things like RB memory allocation, we shouldn't
>> add a knob if we don't need to.
>
> It was just a suggestion.

Cool, I understand.

>>
>> Also I think Zhaoyang is developing for Android too since he mentioned
>> he ran CTS tests so we both have the same "usecase" but he can feel
>> free to correct me if that's not the case ;)
>
> I think if you are really worried with the task being killed by oom,
> then I agree with Michal and just fork a process to do the allocation
> for you.

Yes, I agree. So let's just do that; no additional patches are needed
then. Let me know if there's anything else I missed?

Also, I got a bit confused, so I reread all the threads. Zhaoyang's
current issue is that the OOM hint *IS* working, which is what triggered
your patch to toggle the behavior through an option. Whereas in this
message we are discussing that the OOM hint doesn't always work, which
is not Zhaoyang's current issue. Let me know if I missed something?
Sorry if I did.

thanks,

- Joel