From: Kent Overstreet
To: Tejun Heo
Cc: Kemeng Shi, akpm@linux-foundation.org, willy@infradead.org, jack@suse.cz, bfoster@redhat.com, dsterba@suse.com, mjguzik@gmail.com, dhowells@redhat.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 0/6] Improve visibility of writeback
Date: Thu, 28 Mar 2024 16:22:13 -0400
References: <20240327155751.3536-1-shikemeng@huaweicloud.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 28, 2024 at 10:13:27AM -1000, Tejun Heo wrote:
> Hello,
>
> On Thu, Mar 28, 2024 at 03:55:32PM -0400, Kent Overstreet wrote:
> > > On Thu, Mar 28, 2024 at 03:40:02PM -0400, Kent Overstreet wrote:
> > > > Collecting latency numbers at various key places is _enormously_ useful.
> > > > The hard part is deciding where it's useful to collect; that requires
> > > > intimate knowledge of the code. Once you're defining those collection
> > > > points statically, doing it with BPF is just another useless layer of
> > > > indirection.
> > >
> > > Given how much flexibility helps with debugging, claiming it useless is a
> > > stretch.
> >
> > Well, what would it add?
>
> It depends on the case but here's an example. If I'm seeing occasional tail
> latency spikes, I'd want to know whether there's any correlation with
> specific types or sizes of IOs and if so who's issuing them and why. With
> BPF, you can detect those conditions to tag and capture where exactly those
> IOs are coming from and aggregate the result however you like across
> thousands of machines in production without anyone noticing. That's useful,
> no?

That's cool, but really esoteric. We need to be able to answer basic
questions and build an overall picture of what the system is doing
without having to reach for the big stuff.

Most users are never going to touch tracing, let alone BPF; that's too
much setup. But I can and do regularly tell users "check this, this and
this" and debug things on that basis without ever touching their
machine.

And basic latency numbers are really easy for users to understand, which
makes them doubly worthwhile to collect and make visible.

> Also, actual percentile distribution is almost always a lot more insightful
> than more coarsely aggregated numbers. We can't add all that to fixed infra.
> In most cases not because runtime overhead would be too high but because
> the added interface and code complexity and maintenance overhead isn't
> justifiable given how niche, ad hoc and varied these use cases get.

You can't calculate percentiles accurately and robustly in one pass -
that only works if your input data obeys a nice statistical
distribution, and the cases we care about are the ones where it doesn't.

>
> > > > The time stats stuff I wrote is _really_ cheap, and you really want this
> > > > stuff always on so that you've actually got the data you need when
> > > > you're bughunting.
> > >
> > > For some stats and some use cases, always being available is useful and
> > > building fixed infra for them makes sense. For other stats and other use
> > > cases, flexibility is pretty useful too (e.g. what if you want percentile
> > > distribution which is filtered by some criteria?). They aren't mutually
> > > exclusive and I'm not sure bdi wb instrumentation is on top of enough
> > > people's minds.
> > >
> > > As for overhead, BPF instrumentation can be _really_ cheap too. We often run
> > > these programs per packet.
> >
> > The main things I want are just
> >  - elapsed time since last writeback IO completed, so we can see at a
> >    glance if it's stalled
> >  - time stats on writeback IO initiation to completion
> >
> > The main value of this one will be tracking down tail latency issues and
> > finding out where in the stack they originate.
>
> Yeah, I mean, if always keeping those numbers around is useful for a wide
> enough number of users and cases, sure, go ahead and add fixed infra. I'm
> not quite sure bdi wb stats fall in that bucket given how little attention
> it usually gets.

I think it should be getting a lot more attention given that memory
reclaim and writeback are generally implicated whenever a user complains
about their system going out to lunch.
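
For illustration only: a minimal userspace sketch of the two numbers
asked for in the quoted list (elapsed time since the last completion,
and initiation-to-completion time stats). This is not the kernel time
stats code; the TimeStats class, its method names and the injectable
clock are all hypothetical, chosen just to show that the bookkeeping
can be constant work and constant memory per IO.

```python
import time


class TimeStats:
    """Hypothetical always-on accumulator: O(1) work and memory per event."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock            # injectable so the sketch can be tested
        self.count = 0                # number of completed events
        self.total = 0.0              # sum of all durations, for the mean
        self.min = float("inf")
        self.max = 0.0
        self.last_completion = None   # timestamp of the most recent completion

    def start(self):
        """Call at IO initiation; returns a timestamp to pass to complete()."""
        return self.clock()

    def complete(self, started):
        """Call at IO completion; folds the duration into the running stats."""
        now = self.clock()
        duration = now - started
        self.count += 1
        self.total += duration
        self.min = min(self.min, duration)
        self.max = max(self.max, duration)
        self.last_completion = now

    def mean(self):
        """Average duration so far, or 0.0 if nothing has completed."""
        return self.total / self.count if self.count else 0.0

    def since_last_completion(self):
        """Elapsed time since the last completion, or None if none yet."""
        if self.last_completion is None:
            return None
        return self.clock() - self.last_completion
```

In this sketch a stall would show up as since_last_completion() growing
while count stays flat, which is the "at a glance" check described in
the list.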