Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp17385887rwd; Tue, 27 Jun 2023 02:10:45 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4Z7MQdiXHat6W7bA1gEzz72rYk4o/nr5iROKGA2OEkkiMPNeUAizCRY4gf2QMDx8HUa0eG X-Received: by 2002:a17:906:9b88:b0:988:9641:fe38 with SMTP id dd8-20020a1709069b8800b009889641fe38mr25863946ejc.32.1687857045374; Tue, 27 Jun 2023 02:10:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687857045; cv=none; d=google.com; s=arc-20160816; b=S9viRqlQlC0cv/UixkaPI65gYICYV8Tvr1MjZ6dsLyGLaSNljnC8m99M4H68YcBOwp 8hoc/yEFt87wLavLyshFZAfenFEXBNwUwpBjzB88gJmFCX7/mm2TqHyepm2kM/XYVFyn xulMc36mjLw4JJXCmjk4r8+rt6KmhmmDwD7pha2YyqVXTS8OgjMRzNUUfaAbNY5U0Lek OoMng+LiYrpeNpj+71dZAnS2g9ue+rAq2Oo9Logl2K+/vgkV9tuc6VwrKkhkI9w+jYEc ods8D/DDphb4hoLI/E6vLhByBC4ban397AfJx+AlXW+8/OVd/I78zVdFBwK5VQfkBw5f i0Xg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :content-language:references:cc:to:subject:user-agent:mime-version :date:message-id; bh=dj/rayFSzrSy0JM0kUyGzsx1MWOIQVAK+O8ILc5/ZQg=; fh=9x1LrL2QrIZog3VQ/5V9jBOQgfRcp9bYZChBiKY7RGM=; b=YWIwkXo6SN63vbe6xiJx+hS5fx/X9hphXGOKBl9nMsdupSEAxaAxR2ANHqn8H82Ejd ymsCK38bs3uAixWx7/0VYG4TTl4Wf8fn36/ozwekyKRwGyiV3TAOicJfs0Z2INsbiM+V AH7R6Lp9onmDZH07CkSPci6yFVmxcsEmlcvfYbHxqEzsQunMKZ8JlM8GyNMmyhqSo45l 4sa4z4PZ9XVKGA3UjqdtUGNFDTfkED3axNPLFu3YoZjhyhm7HBhiHiXv4CVzQKHFz/e6 cwYes5/JX93ukBOwgtASLFHvK0VQsiinEDtvVg7wFrp0/vdaehV4+czuawgjCCrQZoDD 8rYw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k20-20020a17090646d400b0098e1d1b9a0esi3252815ejs.419.2023.06.27.02.10.15; Tue, 27 Jun 2023 02:10:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230372AbjF0JIz (ORCPT + 99 others); Tue, 27 Jun 2023 05:08:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231232AbjF0JIh (ORCPT ); Tue, 27 Jun 2023 05:08:37 -0400 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85D782715; Tue, 27 Jun 2023 02:08:30 -0700 (PDT) Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.55]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4QqzNJ14vvzLmwW; Tue, 27 Jun 2023 17:06:24 +0800 (CST) Received: from [10.174.177.174] (10.174.177.174) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Tue, 27 Jun 2023 17:08:27 +0800 Message-ID: Date: Tue, 27 Jun 2023 17:08:27 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.1.2 Subject: Re: [PATCH] quota: fix race condition between dqput() and dquot_mark_dquot_dirty() To: Jan Kara CC: , , , , , , , Baokun Li References: <20230616085608.42435-1-libaokun1@huawei.com> <20230616152824.ndpgvkegvvip2ahh@quack3> <20230622145620.hk3bdjxtlr64gtzl@quack3> <20230626130957.kvfli23djxc2opkq@quack3> <2486ec73-55e0-00cb-fc76-97b9b285a9ce@huawei.com> <20230627083406.hhjf55e2tqnwqaf6@quack3> Content-Language: en-US From: Baokun Li In-Reply-To: <20230627083406.hhjf55e2tqnwqaf6@quack3> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.177.174] X-ClientProxiedBy: dggems704-chm.china.huawei.com (10.3.19.181) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,NICE_REPLY_A, RCVD_IN_DNSWL_MED,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org Hello! On 2023/6/27 16:34, Jan Kara wrote: > Hello! > > On Mon 26-06-23 21:55:49, Baokun Li wrote: >> On 2023/6/26 21:09, Jan Kara wrote: >>> On Sun 25-06-23 15:56:10, Baokun Li wrote: >>>>>> I think we can simply focus on the race between the DQ_ACTIVE_B flag and >>>>>> the DQ_MOD_B flag, which is the core problem, because the same quota >>>>>> should not have both flags. These two flags are protected by dq_list_lock >>>>>> and dquot->dq_lock respectively, so it makes sense to add a >>>>>> wait_on_dquot() to ensure the accuracy of DQ_ACTIVE_B. >>>>> But the fundamental problem is not only the race with DQ_MOD_B setting. The >>>>> dquot structure can be completely freed by the time >>>>> dquot_claim_space_nodirty() calls dquot_mark_dquot_dirty() on it. That's >>>>> why I think making __dquot_transfer() obey dquot_srcu rules is the right >>>>> solution. >>>> Yes, now I also think that making __dquot_transfer() obey dquot_srcu >>>> rules is a better solution. But with inode->i_lock protection, why would >>>> the dquot structure be completely freed? >>> Well, when dquot_claim_space_nodirty() calls mark_all_dquot_dirty() it does >>> not hold any locks (only dquot_srcu). So nothing prevents dquot_transfer() >>> to go, swap dquot structure pointers and drop dquot references and after >>> that mark_all_dquot_dirty() can use a stale pointer to call >>> mark_dquot_dirty() on already freed memory. >>> >> No, this doesn't look like it's going to happen. The >> mark_all_dquot_dirty() uses a pointer array pointer, the dquot in the >> array is dynamically changing, so after swap dquot structure pointers, >> mark_all_dquot_dirty() uses the new pointer, and the stale pointer is >> always destroyed after swap, so there is no case of using the stale >> pointer here. > There is a case - CPU0 can prefetch the values from dquots[] array into its > local cache, then CPU1 can update the dquots[] array (these writes can > happily stay in CPU1 store cache invisible to other CPUs) and free the > dquots via dqput(). Then CPU0 can pass the prefetched dquot pointers to > mark_dquot_dirty(). There are no locks or memory barries preventing CPUs > from ordering instructions and memory operations like this in the code... > You can read Documentation/memory-barriers.txt about all the perils current > CPU architecture brings wrt coordination of memory accesses among CPUs ;) > > Honza Got it! Sorry for misunderstanding you (I thought "completely freed" meant dquot_destroy(), but you should have meant dquot_release()). Then this is exactly the scenario in the first function graph in my patch description. In this scenario the data from the old dquot has been moved to the new dquot. The old dquot may set DQ_MOD_B bit after it has been cleared of DQ_ACTIVE_B bit. We want to do exactly the thing to avoid this situation. -- With Best Regards, Baokun Li .