Received: by 2002:a05:6358:a55:b0:ec:fcf4:3ecf with SMTP id 21csp6720715rwb; Wed, 18 Jan 2023 08:34:06 -0800 (PST) X-Google-Smtp-Source: AMrXdXtaXiKzQNeUlzi1cqWdP56nrKwgEFx2q1Gj2rV+MSTEy/9n+tTdpVtHnQEivp/ufNbRpVcF X-Received: by 2002:a17:902:ccc1:b0:194:7d25:cb78 with SMTP id z1-20020a170902ccc100b001947d25cb78mr9075195ple.46.1674059646244; Wed, 18 Jan 2023 08:34:06 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1674059646; cv=none; d=google.com; s=arc-20160816; b=PNIYM1t1nGElkGNqc/grQafeE6SdW4OnoOgOwTizaJTSgtw7DxHLJ0hQPYPJaJvPLw n8+Rqv6ze5hWK6ETSlnqrsPUUlrJyNwf83F9p+7Au5Irq4nwe7tczIXdqoIJthPf0eOX N/pq5hPFufHlfd4YHDE4WG5XCUEtnkqA38x15/wcfk6yQaFho7UH+gXnPZdZ8RfHibNc 8bwPBAPQPqpMkKgrS28WrxBlq74Cb7aEOqc931EsHcCxkCGGWRkSA3p96W3xW69Uzhqe PF+xiK8Twj7AtH4EGuguue1XtnP8wM9UyOibVlm5lh2qT2y4Fk+Iq2xI+iWkH5s6HPO+ 5O1g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=79tBezyevsf2Wxt0n7eAII57Vo+xNtqmBftZsLQEsak=; b=B9TPVMWNSluI3yfx4pN5hQUQk8v5xJYYWYgFTNYxUHQvXO3FC8aQD8HAFuM6XzlFCq qHMC5r5fQT1zKZqTUN1wR3gR5OAKrRHqRT671LT0V3nbOLT3FXQvzaiw5P3EhZUTzC8g Yw88HsnAbGSG2lTocBcUMmUtto97lsGqygG7+zFUrXrXeFxfoiV8trfHPmreTMZLG4qR TYe9LzYnx+veOHjJ04kzDqztROOZKUNDjifVjJGBAS0u0QWmzuxZNgVxSjkkWNaFcHEk rZLzybMPzO2EnFGEFnDv6pJ/9kEGi73FEoHo4czmCIbkWBxpHMaKviMXwMt3N1SBYKig MfeA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.org.uk header.s=zeniv-20220401 header.b="AE1dpnZ/"; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeniv.linux.org.uk Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q9-20020a170902bd8900b00189efa12957si31132920pls.126.2023.01.18.08.33.52; Wed, 18 Jan 2023 08:34:06 -0800 (PST) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.org.uk header.s=zeniv-20220401 header.b="AE1dpnZ/"; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeniv.linux.org.uk Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230232AbjARQcj (ORCPT + 99 others); Wed, 18 Jan 2023 11:32:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35544 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230262AbjARQbz (ORCPT ); Wed, 18 Jan 2023 11:31:55 -0500 Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [IPv6:2a03:a000:7:0:5054:ff:fe1c:15ff]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E63346171; Wed, 18 Jan 2023 08:30:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=79tBezyevsf2Wxt0n7eAII57Vo+xNtqmBftZsLQEsak=; b=AE1dpnZ/RyCsAhZjNATK0qKchW MalbWxkYDaH7CPFLM6QjQBYvj+f8wHu/p3KYTaMnKBZcf6xFD3P0y4loZGMWsUhMgKygsP9WiMBIA PMiAr1iF3KI/4jOG/7P6H9pnw78sWVIN1wqSDKT02en3TvOUvzspx4fgNv6fxrXv7YaToj+dTbxtm 1kXKhSFh8Co62Z4dCOPCNeGPVkwCKH+xfHFNRAfHbMZRH//pp1YSWJq0RGoDZK0AZuAJ0ZxR69F8N waSdo5JVLSs0u5piImKsTtBxvqo+WxwHDC+tTfT2cvDCwHk1Bb+Py1GY8hQKR8OrxsfgrEm2BWsOo 6CPWQBIg==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.96 #2 (Red Hat Linux)) id 1pIBKA-002a04-0T; Wed, 18 Jan 2023 16:30:06 +0000 Date: Wed, 18 Jan 2023 16:30:06 +0000 From: Al Viro To: Jan Kara Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, Ted Tso , linux-xfs@vger.kernel.org Subject: Re: Locking issue with directory renames Message-ID: References: <20230117123735.un7wbamlbdihninm@quack3> <20230118091036.qqscls22q6htxscf@quack3> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230118091036.qqscls22q6htxscf@quack3> Sender: Al Viro X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Wed, Jan 18, 2023 at 10:10:36AM +0100, Jan Kara wrote: > > Yes, we can lock the source inode in ->rename() if we need it. The snag is > that if 'target' exists, it is already locked so when locking 'source' we > are possibly not following the VFS lock ordering of i_rwsem by inode > address (I don't think it can cause any real dealock but still it looks > suspicious). Also we'll have to lock with I_MUTEX_NONDIR2 lockdep class to > make lockdep happy but that's just a minor annoyance. Finally, we'll have > to check for RENAME_EXCHANGE because in that case, both source and target > will be already locked. Thus if we do the additional locking in the > filesystem, we will leak quite some details about rename locking into the > filesystem which seems undesirable to me. Rules for inode locks are simple: * directories before non-directories * ancestors before descendents * for non-directories the ordering is by in-core inode address So the instances that need that extra lock would do that when source is a directory and non RENAME_EXCHANGE is given. Having the target already locked is irrelevant - if it exists, it's already checked to be a directory as well, and had it been a descendent of source, we would have already found that and failed with -ELOOP. If A and B are both directories, there's no ordering between them unless one is an ancestor of another - such can be locked in any order. However, one of the following must be true: * C is locked and both A and B had been observed to be children of C after the lock on C had been acquired, or * ->s_vfs_rename_mutex is held for the filesystem containing both A and B. Note that ->s_vfs_rename_mutex is there to stabilize the tree topology and make "is A an ancestor of B?" possible to check for more than "A is locked, B is a child of A, so A will remain its ancestor until unlocked"...