Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S966273AbdIYUcl (ORCPT );
        Mon, 25 Sep 2017 16:32:41 -0400
Received: from mx2.suse.de ([195.135.220.15]:55586 "EHLO mx1.suse.de"
        rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
        id S966250AbdIYUcj (ORCPT );
        Mon, 25 Sep 2017 16:32:39 -0400
Date: Mon, 25 Sep 2017 22:32:35 +0200
From: Michal Hocko
To: Yang Shi
Cc: cl@linux.com, penberg@kernel.org, rientjes@google.com,
        iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
        linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2 v4] oom: capture unreclaimable slab info in oom message when kernel panic
Message-ID: <20170925203235.vhhiqxp72v67n76l@dhcp22.suse.cz>
References: <1505947132-4363-1-git-send-email-yang.s@alibaba-inc.com>
        <20170925142352.havlx6ikheanqyhj@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: NeoMutt/20170609 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2047
Lines: 41

On Mon 25-09-17 23:55:19, Yang Shi wrote:
>
>
> On 9/25/17 7:23 AM, Michal Hocko wrote:
> > On Thu 21-09-17 06:38:50, Yang Shi wrote:
> > > Recently we ran into an oom issue, kernel panic due to no killable process.
> > > The dmesg shows huge unreclaimable slabs used almost 100% memory, but kdump doesn't capture vmcore due to some reason.
> > >
> > > So, it may sound better to capture unreclaimable slab info in oom message when kernel panic to aid troubleshooting and cover the corner case.
> > > Since kernel already panic, so capturing more information sounds worthy and doesn't bother normal oom killer.
> > >
> > > With the patchset, tools/vm/slabinfo has a new option, "-U", to show unreclaimable slab only.
> > >
> > > And, oom will print all non zero (num_objs * size != 0) unreclaimable slabs in oom killer message.
> > >
> > Well, I do understand that this _might_ be useful but it also might
> > generate a _lot_ of output. The oom report can be quite verbose already
> > so is this something we want to have enabled by default?
>
> The unreclaimable slab message will be just printed out when kernel panic (no
> killable process or panic_on_oom is set). So, it will not bother normal oom.
> Since kernel is already panic, so it might be preferred to have more
> information reported.

Well, this certainly depends. If you have a limited console output (e.g.
no serial console) then the additional information can easily scroll away
the potentially much more useful information from the early oom report. We
already do have a control to enable/disable tasks dumping which can be
very long as well.

> We definitely can add a proc knob to control it if we want to disable the
> message even when kernel panic.

Well, I do not have a strong opinion on this. I can see cases where this
kind of information would be useful but most OOM reports I have seen
were simply user space pinned memory. Slab memory leaks are seen very
seldom. Do you think a pr_dbg and slab stats for all ooms would be still
useful?
-- 
Michal Hocko
SUSE Labs