Date: Tue, 10 Apr 2018 12:49:02 +0200
From: Michal Hocko
To: Zhaoyang Huang
Cc: Steven Rostedt, Ingo Molnar, LKML
Subject: Re: [PATCH v1] ringbuffer: Don't choose the process with adj equal OOM_SCORE_ADJ_MIN
Message-ID: <20180410104902.GC21835@dhcp22.suse.cz>

On Tue 10-04-18 17:32:44, Zhaoyang Huang wrote:
> On Tue, Apr 10, 2018 at 5:01 PM, Michal Hocko wrote:
> > On Tue 10-04-18 16:38:32, Zhaoyang Huang wrote:
> >> On Tue, Apr 10, 2018 at 4:12 PM, Michal Hocko wrote:
> >> > On Tue 10-04-18 16:04:40, Zhaoyang Huang wrote:
> >> >> On Tue, Apr 10, 2018 at 3:49 PM, Michal Hocko wrote:
> >> >> > On Tue 10-04-18 14:39:35, Zhaoyang Huang wrote:
> >> >> >> On Tue, Apr 10, 2018 at 2:14 PM, Michal Hocko wrote:
> >> > [...]
> >> >> >> > OOM_SCORE_ADJ_MIN means "hide the process from the OOM killer completely".
> >> >> >> > So what exactly do you want to achieve here? Because from the above it
> >> >> >> > sounds like opposite things. /me confused...
> >> >> >> >
> >> >> >> Steve's patch intend to have the process be OOM's victim when it
> >> >> >> over-allocating pages for ring buffer. I amend a patch over to protect
> >> >> >> process with OOM_SCORE_ADJ_MIN from doing so. Because it will make
> >> >> >> such process to be selected by current OOM's way of
> >> >> >> selecting.(consider OOM_FLAG_ORIGIN first before the adj)
> >> >> >
> >> >> > I just wouldn't really care unless there is an existing and reasonable
> >> >> > usecase for an application which updates the ring buffer size _and_ it
> >> >> > is OOM disabled at the same time.
> >> >> There is indeed such kind of test case on my android system, which is
> >> >> known as CTS and Monkey etc.
> >> >
> >> > Does the test simulate a real workload? I mean we have two things here
> >> >
> >> > oom disabled task and an updater of the ftrace ring buffer to a
> >> > potentially large size. The second can be completely isolated to a
> >> > different context, no? So why do they run in the single user process
> >> > context?
> >> ok. I think there are some misunderstandings here. Let me try to
> >> explain more by my poor English. There is just one thing here. The
> >> updater is originally a oom disabled task with adj=OOM_SCORE_ADJ_MIN.
> >> With Steven's patch, it will periodically become a oom killable task
> >> by calling set_current_oom_origin() for user process which is
> >> enlarging the ring buffer. What I am doing here is limit the user
> >> process to the ones that adj > -1000.
> >
> > I've understood that part. And I am arguing whether this is really such
> > an important case to play further tricks. Wouldn't it be much simpler to
> > put the updater out to a separate process? OOM disabled processes
> > shouldn't really do unexpectedly large allocations. Full stop. Otherwise
> > you risk a large system disruptions.
> > --
> It is a real problem(my android system just hung there while running
> the test case for the innocent key process killed by OOM), however,
> the problem is we can not define the userspace's behavior as you
> suggested. What Steven's patch doing here is to keep the system to be
> stable by having the updater to take the responsibility itself. My
> patch is to let the OOM disabled processes remain the unkillable
> status.

But you do realize that what you are proposing is by no means any safer,
don't you? The memory allocated for the ring buffer is _not_ accounted
to any process and as such it is not considered by the oom killer when
picking up an oom victim so you are quite likely to pick up an innocent
process to be killed. So basically you are risking an allocation runaway
completely hidden from the OOM killer. Now, the downside of Steven's
patch is that the OOM_SCORE_ADJ_MIN task might get killed which is
something that shouldn't happen because it is a contract. I would call
this an unsolvable problem and an inherently broken design of the oom
disabled task. So far I haven't heard a single _argument_ why supporting
such a weird corner case is desirable when your application can
trivially do
	fork();
	set_oom_score_adj();
	exec("echo $VAR > $RINGBUFFER_FILE")
-- 
Michal Hocko
SUSE Labs
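
For illustration only, the fork/set_oom_score_adj/exec sequence suggested above could look roughly like the sketch below. It is written in Python for brevity and is not code from this thread. The helper name resize_in_child and its parameters are made up, and set_oom_score_adj() is assumed to mean writing a value to /proc/self/oom_score_adj. The ring buffer file stays a parameter because the mail only names it as $RINGBUFFER_FILE.

```python
import os


def resize_in_child(size_kb, buffer_file,
                    adj_file="/proc/self/oom_score_adj", adj=0):
    """Hypothetical helper: do the ring buffer resize in a forked child.

    The parent keeps its own oom_score_adj (e.g. OOM_SCORE_ADJ_MIN).
    Only the short-lived child makes itself killable before writing.
    """
    pid = os.fork()
    if pid == 0:
        # Child: must never fall back into the parent's code path.
        try:
            # Assumed meaning of set_oom_score_adj(): raise our own score.
            # Raising the value needs no privilege; lowering it may.
            with open(adj_file, "w") as f:
                f.write(f"{adj}\n")
            # exec("echo $VAR > $RINGBUFFER_FILE") via a shell.
            os.execv("/bin/sh", ["sh", "-c", 'echo "$1" > "$2"',
                                 "sh", str(size_kb), buffer_file])
        finally:
            # Reached only if the write or the exec failed.
            os._exit(1)
    _, status = os.waitpid(pid, 0)
    # Exit code of the helper, or -1 if it was killed by a signal
    # (for example by the OOM killer).
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

On a real system the call might be resize_in_child(8192, "/sys/kernel/tracing/buffer_size_kb"). That path is an assumption about where tracefs is mounted, and the write needs sufficient privileges. The point of the split is that the OOM disabled parent never performs the large allocation itself, so its "never kill" contract is not put at risk.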