Date: Thu, 30 Nov 2023 20:47:45 -0500
From: Kent Overstreet
To: Michal Hocko
Cc: Roman Gushchin, Qi Zheng, Muchun Song, Linux-MM,
 linux-kernel@vger.kernel.org, Andrew Morton, Dave Chinner
Subject: Re: [PATCH 2/7] mm: shrinker: Add a .to_text() method for shrinkers
Message-ID: <20231201014745.b2ud4w3ymztdtctu@moria.home.lan>
References: <20231123212411.s6r5ekvkklvhwfra@moria.home.lan>
 <4caadff7-1df0-45cc-9d43-e616f9e4ddb3@bytedance.com>
 <20231125003009.tbaxuquny43uwei3@moria.home.lan>
 <76A1EE85-B62C-49B3-889C-80F9A2A88040@linux.dev>
 <20231128035345.5c7yc7jnautjpfoc@moria.home.lan>
 <20231129231147.7msiocerq7phxnyu@moria.home.lan>

On Thu, Nov 30, 2023 at 09:14:35AM +0100, Michal Hocko wrote:
> On Wed 29-11-23 18:11:47, Kent Overstreet wrote:
> > Considering that you're an MM guy, and that shrinkers are pretty much
> > universally used by _filesystem_ people - I'm not sure your experience
> > is the most relevant here?
>
> I really do not understand where you have concluded that. In those years
> of analysis I was not debugging my _own_ code. I was dealing with
> customer reports and I would not really blame them to specifically
> trigger any class of OOM reports.

I've also spent a considerable amount of time debugging OOM issues, and a
lot of that took a lot longer than it should have due to insufficient
visibility into what the system was doing.

I'm talking about things like tuning journal reclaim/writeback behaviour
(this is a tricky one! shrinkers can't shrink if all items are dirty,
but random update workloads really suffer if we're biasing too much in
favour of memory reclaim, i.e. limiting dirty ratio too much), or
debugging tests in fstests that really like to exhaust memory on just
the inode cache.

If you can take the time to understand what other people are trying to
do and share your own perspective on what you find useful - instead of
just saying "I've spent a lot of time on OOM reports and I haven't
needed any of this/this is just for debugging" - we'll be able to have a
much more productive discussion.

Regarding another point you guys have been making - that this is "just
for developers debugging their own code" - that's a terribly dismissive
attitude to take as well.

Debugging doesn't stop when we're done testing the code on our local
machine and push it out to be merged; we're constantly debugging our own
code as it is running in the wild, based on sparse bug reports with at
most a dmesg log.

That dmesg log needs to, whenever possible, have all the information we
need to debug the issue.

In bcachefs, I have made this principle a _high_ priority; when I have a
bug in front of me, if there are visibility improvements that would make
the issue easier to debug I prioritize those _first_, and then fix the
actual bug.

That's been one of the guiding principles that have enabled me to work
efficiently.

Code should tell you _what_ went wrong when something goes wrong,
whenever possible. That's not just for ourselves, the individual
developers; it makes our code more maintainable by the people that come
after us.

> > For one, the patchset adds tracking for when a shrinker was last asked
> > to free something, vs. when it was actually freed. So right there, we
> > can finally see at a glance when a shrinker has gotten stuck and which
> > one.
>
> The primary problem I have with this is how to decide whether to dump
> shrinker data and/or which shrinkers to mention. How do you know that it
> is the specific shrinker which has contributed to the OOM state?
> Printing that data unconditionally will very likely be just additional
> balast in most production situations. Sure if you are doing a filesystem
> development and you are tuning your specific shrinker then this might be
> a really important information to have. But then it is a debugging devel
> tool rather than something we want or need to have in a generic oom
> report.

Like I've mentioned before, this patchset only reports on the top 10
shrinkers, by number of objects. If we can plumb through reporting on
memory usage in _bytes_, that would help even more with deciding what to
report on.

> All that being said, I am with you on the fact that the oom report in
> its current form could see improvements.

I'm glad we're finally in agreement on something!

If you want to share your own ideas on what could be improved and what
you find useful, maybe we could find some more common ground.