From: Kent Overstreet
To: Michal Hocko
Cc: Roman Gushchin, Qi Zheng, Muchun Song, Linux-MM,
    linux-kernel@vger.kernel.org, Andrew Morton, Dave Chinner
Subject: Re: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers
Date: Wed, 29 Nov 2023 18:11:47 -0500
Message-ID: <20231129231147.7msiocerq7phxnyu@moria.home.lan>

On Wed, Nov 29, 2023 at 10:14:54AM +0100, Michal Hocko wrote:
> On Tue 28-11-23 16:34:35, Roman Gushchin wrote:
> > On Tue, Nov 28, 2023 at 02:23:36PM +0800, Qi Zheng wrote:
> [...]
> > > Now I think adding this method might not be a good idea. If we allow
> > > shrinkers to report their own private information, OOM logs may become
> > > cluttered. Most people only care about some general information when
> > > troubleshooting OOM problem, but not the private information of a
> > > shrinker.
> >
> > I agree with that.
> >
> > It seems that the feature is mostly useful for kernel developers and it's easily
> > achievable by attaching a bpf program to the oom handler. If it requires a bit
> > of work on the bpf side, we can do that instead, but probably not. And this
> > solution can potentially provide way more information in a more flexible way.
> >
> > So I'm not convinced it's a good idea to make the generic oom handling code
> > more complicated and fragile for everybody, as well as making oom reports differ
> > more between kernel versions and configurations.
>
> Completely agreed! From my many years of experience of oom reports
> analysing from production systems I would conclude the following categories
>     - clear runaways (and/or memory leaks)
>         - userspace consumers - either shmem or anonymous memory
>           predominantly consumes the memory, swap is either depleted
>           or not configured.
>           OOM report is usually useful to pinpoint those as we
>           have required counters available
>         - kernel memory consumers - if we are lucky they are
>           using slab allocator and unreclaimable slab is a huge
>           part of the memory consumption. If this is a page
>           allocator user the oom report only helps to deduce
>           the fact by looking at how much user + slab + page
>           table etc. form. But identifying the root cause is
>           close to impossible without something like page_owner
>           or a crash dump.
>     - misbehaving memory reclaim
>         - minority of issues and the oom report is usually
>           insufficient to drill down to the root cause. If the
>           problem is reproducible then collecting vmstat data
>           can give a much better clue.
>         - high number of slab reclaimable objects or free swap
>           are good indicators. Shrinkers data could be
>           potentially helpful in the slab case but I really have
>           hard time to remember any such situation.
> On non-production systems the situation is quite different.
> I can see
> how it could be very beneficial to add a very specific debugging data
> for subsystem/shrinker which is developed and could cause the OOM. For
> that purpose the proposed scheme is rather inflexible AFAICS.

Considering that you're an MM guy, and that shrinkers are pretty much
universally used by _filesystem_ people - I'm not sure your experience
is the most relevant here?

The general attitude I've been seeing in this thread has been one of
dismissiveness towards filesystem people. Roman too; back when he was
working on his shrinker debug feature I reached out to him, explained
that I was working on my own, and asked about collaborating - got
crickets in response...

Hmm..

Besides that, I haven't seen anything whatsoever out of you guys to
make our lives easier, regarding OOM debugging, nor do you guys even
seem interested in the needs and perspectives of the filesystem people.
Roman, your feature didn't help one bit for OOM debugging - didn't even
come with documentation or hints as to what it's for. BPF? Please.

Anyways.

Regarding log spam: that's something this patchset already starts to
address. I don't think we needed to be dumping every single slab in the
system, for ~2 pages worth of logs; hence this patchset changes that to
just print the top 10.

The same approach is taken with shrinkers: more targeted, less spammy
output.

So now that that concern has been addressed, perhaps some actual meat:

For one, the patchset adds tracking for when a shrinker was last asked
to free something, vs. when it last actually freed something. So right
there, we can finally see at a glance when a shrinker has gotten stuck
and which one.

Next up, why has a shrinker gotten stuck? That's why the .to_text()
callback is needed: _shrinkers have internal state_, and the reasons
objects may not be getting freed are specific to a given shrinker
implementation. In bcachefs we added counters for each individual reason
an object may be skipped by the shrinker (io in flight? that debugged a
runaway prefetch issue. too many dirty? that points to journal reclaim).

I'm working on a .to_text() function for the struct super_block
shrinker, will post that one soon...
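
To make the idea concrete, here is a minimal userspace sketch, not the
code from the patchset. It models a shrinker that records when it was
last asked to free and when it last freed, counts each reason an object
was skipped, and renders that state as text into a caller-supplied
buffer. All names (demo_shrinker, demo_note_skip, demo_to_text, the
SKIP_* reasons) are hypothetical, and the signature of the real
.to_text() callback may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical skip reasons; illustrative only, not bcachefs's actual list. */
enum skip_reason {
	SKIP_IO_IN_FLIGHT,
	SKIP_DIRTY,
	SKIP_NR,
};

static const char * const skip_reason_names[SKIP_NR] = {
	[SKIP_IO_IN_FLIGHT]	= "io_in_flight",
	[SKIP_DIRTY]		= "dirty",
};

/* Userspace stand-in for one shrinker's private state (assumed layout). */
struct demo_shrinker {
	const char	*name;
	unsigned long	last_requested;	/* when it was last asked to free */
	unsigned long	last_freed;	/* when it last actually freed something */
	unsigned long	skipped[SKIP_NR];
};

/* Record that the scan path skipped one object for the given reason. */
static void demo_note_skip(struct demo_shrinker *s, enum skip_reason r)
{
	assert((unsigned int)r < SKIP_NR);
	s->skipped[r]++;
}

/*
 * Render the shrinker's state into buf, which holds len bytes.
 * Returns the number of characters stored, not counting the NUL.
 * If the output does not fit, it is truncated and len - 1 is returned.
 */
static size_t demo_to_text(const struct demo_shrinker *s, char *buf, size_t len)
{
	size_t pos = 0;
	int i, n;

	if (!len)
		return 0;

	n = snprintf(buf, len, "%s: last requested %lu, last freed %lu\n",
		     s->name, s->last_requested, s->last_freed);
	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)n >= len)
		return len - 1;
	pos += (size_t)n;

	for (i = 0; i < SKIP_NR; i++) {
		n = snprintf(buf + pos, len - pos, "  skipped %s: %lu\n",
			     skip_reason_names[i], s->skipped[i]);
		if (n < 0) {
			buf[pos] = '\0';
			return pos;
		}
		if ((size_t)n >= len - pos)
			return len - 1;
		pos += (size_t)n;
	}
	return pos;
}
```

In this sketch, a large gap between last_requested and last_freed is
what flags a stuck shrinker, and the per-reason counts hint at why.
Writing into a bounded buffer keeps the report size under the caller's
control, which is one way to keep the output from getting spammy.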