Date: Fri, 6 Jul 2018 17:05:39 -0700 (PDT)
From: David Rientjes
To: Andrew Morton
cc: kbuild test robot, Michal Hocko, Tetsuo Handa,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch v3] mm, oom: fix unnecessary killing of additional processes
In-Reply-To: <20180705164621.0a4fe6ab3af27a1d387eecc9@linux-foundation.org>

On Thu, 5 Jul 2018, Andrew Morton wrote:

> > +#ifdef CONFIG_DEBUG_FS
> > +static int oom_free_timeout_ms_read(void *data, u64 *val)
> > +{
> > +	*val = oom_free_timeout_ms;
> > +	return 0;
> > +}
> > +
> > +static int oom_free_timeout_ms_write(void *data, u64 val)
> > +{
> > +	if (val > 60 * 1000)
> > +		return -EINVAL;
> > +
> > +	oom_free_timeout_ms = val;
> > +	return 0;
> > +}
> > +DEFINE_SIMPLE_ATTRIBUTE(oom_free_timeout_ms_fops, oom_free_timeout_ms_read,
> > +			oom_free_timeout_ms_write, "%llu\n");
> > +#endif /* CONFIG_DEBUG_FS */
>
> One of the several things I dislike about debugfs is that nobody
> bothers documenting it anywhere.  But this should really be documented.
> I'm not sure where, but the documentation will find itself alongside a
> bunch of procfs things which prompts the question "why is *this* one in
> debugfs"?
>

The only reason I have placed it in debugfs, or made it tunable at all,
is to appease others.  I know the non-default value we need to use to
stop millions of processes being oom killed unnecessarily.  Michal
suggested a tunable to disable the oom reaper entirely, which is not what
we want, so I found this to be the best alternative.

I'd like to say that it is purposefully undocumented since it's not a
sysctl and nobody can suggest that it is becoming a permanent API that we
must maintain for backwards compatibility.  Having it be configurable is
kind of ridiculous, but such is the nature of trying to get patches merged
these days to prevent millions of processes being oom killed
unnecessarily.

Blockable mmu notifiers and mlocked memory are not the extent of the
problem: if a process has a lot of virtual memory, we must wait until
free_pgtables() completes in exit_mmap() to prevent unnecessary oom
killing.  For implementations such as tcmalloc, which does not release
virtual memory, this is important because, well, that memory is released
only at exit_mmap().  Of course we cannot do that with only the
protection of mm->mmap_sem for read.

This is a patch that we'll always need if we continue with the current
implementation of the oom reaper.  I wouldn't suggest it as a configurable
value, but, oh well.

I'll document the tunable and purposefully repeat myself that this
addresses millions of processes being oom killed unnecessarily, so the
rather important motivation of the change is clear to anyone who reads
this thread now or in the future.
Nobody can guess an appropriate value until they have been hit by the
issue themselves and need to deal with the loss of work from important
processes being oom killed when some best effort logging cron job uses
too much memory.  Or, of course, with pissed off users who have had their
jobs killed off, leaving you in the rather unfortunate situation of
explaining why the Linux kernel in 2018 needs to immediately SIGKILL
processes because of an arbitrary nack related to a timestamp.

Thanks.