Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp1632446pxb; Thu, 4 Mar 2021 17:01:41 -0800 (PST) X-Google-Smtp-Source: ABdhPJytrtJT2MFjM+8BsIBKiLdql9eHv21y1sZKshvG0/yO3VsaHeeWu0Yb7sv7Xdcect196SPX X-Received: by 2002:a05:6e02:586:: with SMTP id c6mr6310496ils.106.1614906101627; Thu, 04 Mar 2021 17:01:41 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1614906101; cv=none; d=google.com; s=arc-20160816; b=viq2XODTwchHUydjqbkTn1Lu4RyzuIr45iAjUQZCOkUO/Bd7sQpPasQvmp2fl1WTxV sNha9wzbo1InUcArhB7X6Xsq7Nda6vyWvml5xnuZPJG1D8eYsuQm2Ek0rSKYc7UMRrmF vJ1A9N93ff1+Mxd838kAeBbpYZCmKGyh6TKvQ0L8i1KAShkvtQCm2nVGn/JOxgKmS2+A k90Ll5TGSOsjY/yPPS4mlcr8VIbyPwNMUKLMuglgvtKKrtm/UeQ2p7HFyDGoDvuzlOTd SiqF8HZETxKgrjFa0eN2wv3l24zic7XJLgD5y4tur0B4+20AxTPzECEO4FY0gJ1kcXyJ hfhQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:to:from:date:dkim-signature; bh=IcaYni6bjhzWAog8jP+9JNL6E12mkvDoGDk5JGEal6k=; b=ULgaCHu6FsuQ0vhLwvDuiOj5OfgM7lmpWtSZS/31PhR4k3/V39xvZ5HR7+Cj1H26FV gQa8mcopf7M1a2oT7QNEGHjIl+N+TuB7l5rtvZwlTRae3PvwGHwDUx1nRY+kK132k6in 8s862WuJrdmV0wPanuI719WQZ1lsO2VuHbJph/ISG3R69GK4K3nktxRysQPR4pnglBCa Xkrc4yy6GzNbVgi6nDRQVpydubkmzu/59sOrsu1ZPPLQhYKHB/sJQREREsZyGIZwW/xV e6lDVIaFHUxw7jf5fO2tF/INr34hiEm4Tq8b11MMgi7FJwzlri+NZnQi5D8LF5qGdy1g O/Ug== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Zzi07XWJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id m17si1069732jaj.76.2021.03.04.17.01.28; Thu, 04 Mar 2021 17:01:41 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Zzi07XWJ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231167AbhCDWYV (ORCPT + 99 others); Thu, 4 Mar 2021 17:24:21 -0500 Received: from mail.kernel.org ([198.145.29.99]:55150 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229467AbhCDWYV (ORCPT ); Thu, 4 Mar 2021 17:24:21 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id A261C64FF1; Thu, 4 Mar 2021 22:24:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1614896660; bh=TdFUuCFCI4O7OYx3gwfS81B/nzcodb1k9TZ5OKOcK+I=; h=Date:From:To:Subject:References:In-Reply-To:From; b=Zzi07XWJRmBorPWAEOPfEUf9URvPZCct8sLKnvVPDajjQlmZhKOO8TmJcHA22BAo1 8EWBOtLTKRgLmQWUNc47JHMFX6LZ5YTydMV+79PhYMMmGj45i93v+zonftDPAnVKBD +Vp/xFB6v+/0Ca1IMY4FB+/BGNHK3wLr2/6Ne8D8mZcXayKZmeMlbkoavWrDUazj0D VJfWU9/O5F6ZdfrAfMveDe5VwpdyV5xwPGRjH+dITVkQrOa8nR4fUmdSOtf3CXIcB2 GFCgs701RHuYxkdEshNDrFTmmaXmeS/eirDmC99Ozp922V9elq4yB18zo/jDRP2aFz 3XUvdomw1MSeg== Date: Thu, 4 Mar 2021 17:24:19 -0500 From: Sasha Levin To: dsterba@suse.cz, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Josef Bacik , Nikolay Borisov , David Sterba , linux-btrfs@vger.kernel.org Subject: Re: [PATCH AUTOSEL 5.11 55/67] btrfs: only let one thread pre-flush delayed refs in commit Message-ID: References: <20210224125026.481804-1-sashal@kernel.org> <20210224125026.481804-55-sashal@kernel.org> <20210224180820.GY1993@twin.jikos.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: <20210224180820.GY1993@twin.jikos.cz> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 24, 2021 at 07:08:20PM +0100, David Sterba wrote: >On Wed, Feb 24, 2021 at 07:50:13AM -0500, Sasha Levin wrote: >> From: Josef Bacik >> >> [ Upstream commit e19eb11f4f3d3b0463cd897016064a79cb6d8c6d ] >> >> I've been running a stress test that runs 20 workers in their own >> subvolume, which are running an fsstress instance with 4 threads per >> worker, which is 80 total fsstress threads. In addition to this I'm >> running balance in the background as well as creating and deleting >> snapshots. This test takes around 12 hours to run normally, going >> slower and slower as the test goes on. >> >> The reason for this is because fsstress is running fsync sometimes, and >> because we're messing with block groups we often fall through to >> btrfs_commit_transaction, so will often have 20-30 threads all calling >> btrfs_commit_transaction at the same time. >> >> These all get stuck contending on the extent tree while they try to run >> delayed refs during the initial part of the commit. >> >> This is suboptimal, really because the extent tree is a single point of >> failure we only want one thread acting on that tree at once to reduce >> lock contention. >> >> Fix this by making the flushing mechanism a bit operation, to make it >> easy to use test_and_set_bit() in order to make sure only one task does >> this initial flush. >> >> Once we're into the transaction commit we only have one thread doing >> delayed ref running, it's just this initial pre-flush that is >> problematic. With this patch my stress test takes around 90 minutes to >> run, instead of 12 hours. >> >> The memory barrier is not necessary for the flushing bit as it's >> ordered, unlike plain int. The transaction state accessed in >> btrfs_should_end_transaction could be affected by that too as it's not >> always used under transaction lock. Upon Nikolay's analysis in [1] >> it's not necessary: >> >> In should_end_transaction it's read without holding any locks. (U) >> >> It's modified in btrfs_cleanup_transaction without holding the >> fs_info->trans_lock (U), but the STATE_ERROR flag is going to be set. >> >> set in cleanup_transaction under fs_info->trans_lock (L) >> set in btrfs_commit_trans to COMMIT_START under fs_info->trans_lock.(L) >> set in btrfs_commit_trans to COMMIT_DOING under fs_info->trans_lock.(L) >> set in btrfs_commit_trans to COMMIT_UNBLOCK under >> fs_info->trans_lock.(L) >> >> set in btrfs_commit_trans to COMMIT_COMPLETED without locks but at this >> point the transaction is finished and fs_info->running_trans is NULL (U >> but irrelevant). >> >> So by the looks of it we can have a concurrent READ race with a WRITE, >> due to reads not taking a lock. In this case what we want to ensure is >> we either see new or old state. I consulted with Will Deacon and he said >> that in such a case we'd want to annotate the accesses to ->state with >> (READ|WRITE)_ONCE so as to avoid a theoretical tear, in this case I >> don't think this could happen but I imagine at some point KCSAN would >> flag such an access as racy (which it is). >> >> [1] https://lore.kernel.org/linux-btrfs/e1fd5cc1-0f28-f670-69f4-e9958b4964e6@suse.com >> >> Reviewed-by: Nikolay Borisov >> Signed-off-by: Josef Bacik >> [ add comments regarding memory barrier ] >> Signed-off-by: David Sterba >> Signed-off-by: Sasha Levin > >Please drop this patch from autosel queue, it's part of a larger series >that reworks flushing and is not a standalone fix. Will do, thanks! -- Thanks, Sasha