Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp2079498ybl; Thu, 29 Aug 2019 03:17:25 -0700 (PDT) X-Google-Smtp-Source: APXvYqxl4brCeNg5xiniNB3O/ghjYJq5Jj9IEqtKUrkuhFuAexxBxT7+xGxRraA8lZuIUjvmJ/i2 X-Received: by 2002:a17:902:6b88:: with SMTP id p8mr8750538plk.95.1567073845870; Thu, 29 Aug 2019 03:17:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1567073845; cv=none; d=google.com; s=arc-20160816; b=CG+1glPgxpB2sCUnRukI6P96fCxCM7hM50ADfh7VIi+BJu1BKgAIx5uSRpwwq72h7X 3fVfpKEPiA7Gl+TqZXHoqzyO1TNHCvXzpZG3+gj4quy0bP9sxxHrTCcDtyolA1hQVJai 9MYLmH20RFQeDGUv/XvrrG5gx5FRquMqzDF78oaEJO9T8rJS0NKiiA1uOlaPftQHS1mv 3Z/Ro1oRrlVwGroSKwG1GNgcnyrSg2B+CeXLesekt6aOKigbFRLBrYi6XqbNaufxIFyG 5YMG5lguYA2Mj9g8cRCjf/FT39ek4G202t9Nlb82CPC3VlFKPFz90QSiHiu/YNzBze2N Y/BQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=SLRlJ/qmBUoGwni5glMUfUY40objW/e4ng/jYKjy9Uk=; b=aYBO5o0/YJ5yZ1Hw/Lw+im0YMb5gjkI6hT5zMguiH+NU62fB4KWG16WYBGpWaux51H b8nVGnBCCfzesbA2bHjfZQ5Qu/UfdZt98BYRVgiKnf+voO3uJwF6iy1hzhN9H6zu3gAH p2XHsxyGP7ut2cmSg0WI60UcneRJVawRWIMwluulKDb3rg3YaECIArbmmsVodISoBCB5 90lHJhGXfXVaeI6E96q1HvXcM/qoZHDD77fdit36Yt2ObRhxWa3ivQwJlV/BxNUrDC9l zRlrzd5z/ylDjb74mgRwG3C/u45RaG672zdB9Afj06DwlTyfhmuIvXns8mu2/675lBT2 e4rg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id cl9si1529426plb.271.2019.08.29.03.17.10; Thu, 29 Aug 2019 03:17:25 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726852AbfH2KO5 (ORCPT + 99 others); Thu, 29 Aug 2019 06:14:57 -0400 Received: from www262.sakura.ne.jp ([202.181.97.72]:53035 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726330AbfH2KO4 (ORCPT ); Thu, 29 Aug 2019 06:14:56 -0400 Received: from fsav402.sakura.ne.jp (fsav402.sakura.ne.jp [133.242.250.101]) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id x7TAEseo053121; Thu, 29 Aug 2019 19:14:54 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Received: from www262.sakura.ne.jp (202.181.97.72) by fsav402.sakura.ne.jp (F-Secure/fsigk_smtp/530/fsav402.sakura.ne.jp); Thu, 29 Aug 2019 19:14:54 +0900 (JST) X-Virus-Status: clean(F-Secure/fsigk_smtp/530/fsav402.sakura.ne.jp) Received: from [192.168.1.8] (softbank126227201116.bbtec.net [126.227.201.116]) (authenticated bits=0) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTPSA id x7TAEmjO052966 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NO); Thu, 29 Aug 2019 19:14:54 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Subject: Re: [PATCH 00/10] OOM Debug print selection and additional information To: Michal Hocko , Edward Chron Cc: Andrew Morton , Roman Gushchin , Johannes Weiner , David Rientjes , Shakeel Butt , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Ivan Delalande References: <20190826193638.6638-1-echron@arista.com> <20190827071523.GR7538@dhcp22.suse.cz> <20190828065955.GB7386@dhcp22.suse.cz> <20190829071105.GQ28313@dhcp22.suse.cz> From: Tetsuo Handa Message-ID: <297cf049-d92e-f13a-1386-403553d86401@i-love.sakura.ne.jp> Date: Thu, 29 Aug 2019 19:14:46 +0900 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <20190829071105.GQ28313@dhcp22.suse.cz> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2019/08/29 16:11, Michal Hocko wrote: > On Wed 28-08-19 12:46:20, Edward Chron wrote: >> Our belief is if you really think eBPF is the preferred mechanism >> then move OOM reporting to an eBPF. > > I've said that all this additional information has to be dynamically > extensible rather than a part of the core kernel. Whether eBPF is the > suitable tool, I do not know. I haven't explored that. There are other > ways to inject code to the kernel. systemtap/kprobes, kernel modules and > probably others. As for SystemTap, guru mode (an expert mode which disables protection provided by SystemTap; allowing kernel to crash when something went wrong) could be used for holding spinlock. However, as far as I know, holding mutex (or doing any operation that might sleep) from such dynamic hooks is not allowed. Also we will need to export various symbols in order to allow access from such dynamic hooks. I'm not familiar with eBPF, but I guess that eBPF is similar. But please be aware that, I REPEAT AGAIN, I don't think neither eBPF nor SystemTap will be suitable for dumping OOM information. OOM situation means that even single page fault event cannot complete, and temporary memory allocation for reading from kernel or writing to files cannot complete. Therefore, we will need to hold all information in kernel memory (without allocating any memory when OOM event happened). Dynamic hooks could hold a few lines of output, but not all lines we want. The only possible buffer which is preallocated and large enough would be printk()'s buffer. Thus, I believe that we will have to use printk() in order to dump OOM information. At that point, static bool (*oom_handler)(struct oom_control *oc) = default_oom_killer; bool out_of_memory(struct oom_control *oc) { return oom_handler(oc); } and let in-tree kernel modules override current OOM killer would be the only practical choice (if we refuse adding many knobs).