Date: Thu, 10 Aug 2023 20:40:07 -0600
Subject: Re: [GIT PULL] bcachefs
To: Linus Torvalds, "Darrick J. Wong"
Cc: Kent Overstreet, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-bcachefs@vger.kernel.org,
 dchinner@redhat.com, sandeen@redhat.com, willy@infradead.org,
 josef@toxicpanda.com, tytso@mit.edu, bfoster@redhat.com, jack@suse.cz,
 andreas.gruenbacher@gmail.com, brauner@kernel.org, peterz@infradead.org,
 akpm@linux-foundation.org, dhowells@redhat.com, snitzer@kernel.org
References: <20230626214656.hcp4puionmtoloat@moria.home.lan>
 <20230706155602.mnhsylo3pnief2of@moria.home.lan>
 <20230712025459.dbzcjtkb4zem4pdn@moria.home.lan>
 <20230810155453.6xz2k7f632jypqyz@moria.home.lan>
 <20230810223942.GG11336@frogsfrogsfrogs>
From: Jens Axboe

On 8/10/23 5:47 PM, Linus Torvalds wrote:
> On Thu, 10 Aug 2023 at 15:39, Darrick J. Wong wrote:
>>
>> FWIW I recently fixed all my stupid debian package dependencies so that
>> I could actually install liburing again, and rebuilt fstests. The very
>> next morning I noticed a number of new test failures in /exactly/ the
>> way that Kent said to expect:
>>
>> fsstress -d /mnt & ; \
>> umount /mnt; mount /dev/sda /mnt
>>
>> Here, umount exits before the filesystem is really torn down, and then
>> mount fails because it can't get an exclusive lock on the device.
>
> I agree that that obviously sounds like mount is just returning either
> too early. Or too eagerly.
>
> But I suspect any delayed fput() issues (whether from aio or io_uring)
> are then just a way to trigger the problem, not the fundamental cause.
>
> Because even if the fput() is delayed, the mntput() part of that
> delayed __fput action is the one that *should* have kept the
> filesystem mounted until it is no longer busy.
>
> And more importantly, having some of the common paths synchronize
> *their* fput() calls only affects those paths.
>
> It doesn't affect the fundamental issue that the last fput() can
> happen in odd contexts when the file descriptor was used for something
> a bit stranger.
>
> So I do feel like the fput patch I saw looked more like a "hide the
> problem" than a real fix.

The fput patch was not pretty, nor is it needed. What happens on the
io_uring side is that pending requests (which can hold files referenced)
are canceled on exit. But we don't wait for the references to go away,
which then introduces this race.

I've used this to trigger it:

#!/bin/bash

DEV=/dev/nvme0n1
MNT=/data
ITER=0

while true; do
	echo loop $ITER
	sudo mount $DEV $MNT
	fio --name=test --ioengine=io_uring --iodepth=2 --filename=$MNT/foo --size=1g --buffered=1 --overwrite=0 --numjobs=12 --minimal --rw=randread --thread=1 --output=/dev/null &
	Y=$(($RANDOM % 3))
	X=$(($RANDOM % 10))
	VAL="$Y.$X"
	sleep $VAL
	ps -e | grep fio > /dev/null 2>&1
	while [ $? -eq 0 ]; do
		killall -9 fio > /dev/null 2>&1
		wait > /dev/null 2>&1
		ps -e | grep "fio " > /dev/null 2>&1
	done
	sudo umount /data
	if [ $? -ne 0 ]; then
		break
	fi
	((ITER++))
done

and can make it happen pretty easily, within a few iterations.

Contrary to how it was otherwise presented in this thread, I did take a
look at this a month ago and wrote up some patches for it. Just rebased
them on the current tree:

https://git.kernel.dk/cgit/linux/log/?h=io_uring-exit-cancel

Since we have task_work involved for both the completions and the
__fput(), ordering is a concern which is why it needs a bit more effort
than just the bare bones stuff. The way the task_work list works, we
llist_del_all() and run all items. But we do encapsulate that in
io_uring anyway, so it's possible to run our pending local items and
avoid that particular snag.

WIP obviously, the first 3-4 prep patches were posted earlier today, but
I'm not happy with the last 3 yet in the above branch. Or at least not
fully confident, so will need a bit more thinking and testing. Does pass
the above test case, and the regular liburing test/regression cases,
though.

-- 
Jens Axboe
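
The task_work ordering concern above can be pictured with a toy model,
written in shell to match the reproducer. This is a sketch under stated
assumptions, not kernel code: queue_work, run_batch and the item names
completion, exit_check and final_put are all hypothetical. The only
behaviour it assumes is the one the mail describes, namely that the
pending list is detached in one go (as llist_del_all() does) and every
detached item is then run. Under that assumption, work queued while a
batch is running lands on the new, empty list and is not reached until
the next pass.

```shell
#!/bin/bash
# Toy model of draining a work list in batches.
# Not kernel code: every name below is hypothetical.

pending=""   # space-separated items queued but not yet run
ran=""       # items in the order they actually ran

# Append one item to the pending list.
queue_work() {
	pending="${pending:+$pending }$1"
}

# Detach the whole pending list, then run every detached item.
# Anything queued by an item during the run goes onto the fresh
# list and is only seen by the next run_batch call.
run_batch() {
	batch=$pending
	pending=""
	for item in $batch; do
		ran="${ran:+$ran }$item"
		# Model assumption: a completion drops the last file
		# reference, which queues the deferred final put.
		if [ "$item" = "completion" ]; then
			queue_work "final_put"
		fi
	done
}

queue_work "completion"
queue_work "exit_check"

run_batch
echo "first pass ran:  $ran"
echo "still queued:    $pending"

run_batch
echo "second pass ran: $ran"
```

In this model the first pass runs completion and exit_check while
final_put is still queued, and only the second pass runs it. A check
that shares a pass with the completion therefore cannot assume the
final put has already happened, which is one way to read why the mail
treats ordering as needing more than the bare-bones change.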