Date: Thu, 29 Aug 2019 09:11:05 +0200
From: Michal Hocko
To: Edward Chron
Cc: Andrew Morton, Roman Gushchin, Johannes Weiner, David Rientjes,
    Tetsuo Handa, Shakeel Butt, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Ivan Delalande
Subject: Re: [PATCH 00/10] OOM Debug print selection and additional information
Message-ID: <20190829071105.GQ28313@dhcp22.suse.cz>
References: <20190826193638.6638-1-echron@arista.com>
    <20190827071523.GR7538@dhcp22.suse.cz>
    <20190828065955.GB7386@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 28-08-19 12:46:20, Edward Chron wrote:
[...]
> Our belief is if you really think eBPF is the preferred mechanism
> then move OOM reporting to an eBPF.

I've said that all this additional information has to be dynamically
extensible rather than a part of the core kernel. Whether eBPF is the
suitable tool, I do not know. I haven't explored that. There are other
ways to inject code into the kernel: systemtap/kprobes, kernel modules
and probably others.

> I mentioned this before but I will reiterate this here.
>
> So how do we get there? Let's look at the existing report which we know
> has issues.
>
> Other than a few essential OOM messages the OOM code should produce,
> such as the Killed process message sequence being included,
> you could have the entire OOM report moved to an eBPF script and
> therefore make it customizable, configurable or if you prefer programmable.

I believe we should keep the current reporting in place and allow
additional information via a dynamic mechanism, be it a registration
mechanism that modules can hook into or some other, more dynamic way.
The current reporting has proven to be useful in many typical oom
situations in my past years of experience. It gives the rough state of
the failing allocation, the MM subsystem, the tasks that are eligible
and the task that is killed, so that you can understand why the event
happened. I would argue that the eligible tasks should be printed on an
opt-in basis because this is more of a relic from the past when the
victim selection was less deterministic. But that is another story. All
the rest of dump_header should stay IMHO as a reasonable default and
bare minimum.

> Why? Because as we all agree, you'll never have a perfect OOM Report.
> So if you believe this, then if you will, put your money where your mouth
> is (so to speak) and make the entire OOM Report an eBPF script.
> We'd be willing to help with this.
>
> I'll give specific reasons why you want to do this.
>
> - Don't want to maintain a lot of code in the kernel (eBPF code doesn't
>   count).
> - Can't produce an ideal OOM report.
> - Don't like configuring things but favor programmatic solutions.
> - Agree the existing OOM report doesn't work for all environments.
> - Want to allow flexibility but can't support everything people might
>   want.
> - Then installing an eBPF for OOM Reporting isn't an option, it's
>   required.

This is going to an extreme. We cannot serve all cases, but that is
true for any other heuristics/reporting in the kernel. We do care about
most of them.

> The last reason is huge for people who live in a world with large data
> centers. Data center managers are very conservative. They don't want to
> deviate from standard operating procedure unless absolutely necessary.
> If loading an OOM Report eBPF is standard to get OOM Reporting output,
> then they'll accept that.

I have already responded to this kind of argumentation elsewhere. This
is not a relevant argument for any kernel implementation. This is a data
process management process.

-- 
Michal Hocko
SUSE Labs