Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp383069rwb; Fri, 18 Nov 2022 02:56:47 -0800 (PST) X-Google-Smtp-Source: AA0mqf7C8QLXi5TVm/G9/21nA66QTXGfVlZ7DcBmQ8cnmd3WFUZU+3AACzvH5PzUoeOKFRda/9xF X-Received: by 2002:a17:90a:fa4f:b0:212:dac0:ce83 with SMTP id dt15-20020a17090afa4f00b00212dac0ce83mr7285897pjb.223.1668769007230; Fri, 18 Nov 2022 02:56:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668769007; cv=none; d=google.com; s=arc-20160816; b=IJMOk2WTO51eWChkcjqJRXL4d8yHB/vy6pkETn3q5ofacsCfOsa4gOi1SLedbUjn5I VyGEkpK5XkqcKv54eTkeWeakHIM9oorFwQgTpRL11mN4RJVXLLR4Qhfts81q3v08N8CT p+hm9cHC1fPiBohff9bwBj7et5i09RyQXZiV2/sLVs24bb30/04+riBtB2zGMhq8BECH pjqCvfj1ZSBfypYyGcMtflvO4wJqPlvClW5VfH+Xl3p4iFW0SZOXb8JfjZhaSOPgYT68 Gmfh8eEcWc+bn3HU4KQMyWE3P4dXEg1Zoe4148hG421NOuG/oZikUYNmrANQTV7/7W5l EMQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:message-id:date:cc:to:from:subject:organization :dkim-signature; bh=f5bu2IF1InfzvQ00y6+OuZ0Xhu88JGi4I8t2yA0L/fE=; b=U8jALUIrLszguYqSHLLlaZoA5ffMB8zrbu4d8DdvMQhONAyq7lAaz/fF8vTV0j+y4L qIhgBmVF/yRdBgrwvTXTGlz3hVw9IEnMmGSYxpSf18mJtWOoQUevWl4uLHGxUt+D8BZq sE3Tb7i6BM5SWp2SMwJdd+mU42m6hzYe6Pig5OAehNxDZj7/ZsPh+DxMsAl6CMbZC+GI 70fEv74ldnmWyuXN4SGqaoKBsNdFp7ZbunKnK96/AL9Bz7TNaAmhEQN1VnlVmTiVzD4A vExSI/tJ2otZmLtfej+lwfhH7nOuP7sgzdRild9ISS0SOW9WJwAXetK6QgnVFzL31xES qluA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KMVoo+0J; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id mi12-20020a17090b4b4c00b0021409ca96c2si3471239pjb.23.2022.11.18.02.56.34; Fri, 18 Nov 2022 02:56:47 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=KMVoo+0J; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235246AbiKRKiw (ORCPT + 91 others); Fri, 18 Nov 2022 05:38:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58446 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241444AbiKRKio (ORCPT ); Fri, 18 Nov 2022 05:38:44 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1EADC92B6F for ; Fri, 18 Nov 2022 02:37:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668767863; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=f5bu2IF1InfzvQ00y6+OuZ0Xhu88JGi4I8t2yA0L/fE=; b=KMVoo+0JUqn9uyXcDAt8uU2b4jlyxyXJ8RlzkcMGYTKwHseL3HVS8dsSTq1uAjrV306Xmb XUCcdUlTsmpxgPLXGBZ8VV1mrWclHpzVumaOucs87WEIp4qNjk49dR2hOui5/qt3G+tQVY b23ZGbh7/OcA7jvJx5Sr0hDxlt37HHw= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-322-Bsc1Dem6OEiKHGf63FHB2A-1; Fri, 18 Nov 2022 05:37:40 -0500 X-MC-Unique: Bsc1Dem6OEiKHGf63FHB2A-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 86C4F101A54E; Fri, 18 Nov 2022 10:37:39 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.24]) by smtp.corp.redhat.com (Postfix) with ESMTP id 79C7235429; Fri, 18 Nov 2022 10:37:38 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [RFC PATCH] afs: Stop implementing ->writepage() From: David Howells To: linux-afs@lists.infradead.org Cc: Marc Dionne , Christoph Hellwig , Matthew Wilcox , Theodore Ts'o , dhowells@redhat.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Date: Fri, 18 Nov 2022 10:37:35 +0000 Message-ID: <166876785552.222254.4403222906022558715.stgit@warthog.procyon.org.uk> User-Agent: StGit/1.5 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.5 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org We're trying to get rid of the ->writepage() hook[1]. Stop afs from using it by unlocking the page and calling filemap_fdatawrite_wbc() rather than folio_write_one(). We drop the folio lock so that writeback can include pages on both sides of the target page in the write without risking deadlock. A hint flag is added to the writeback_control struct so that a filesystem can say that the write is triggered by write_begin seeing a conflicting write. This causes do_writepages() to do a single pass of the loop only. This requires ->migrate_folio() to be implemented, so point that at filemap_migrate_folio() for files and also for symlinks and directories. A couple of questions: (1) afs_write_back_from_locked_folio() could be called directly rather than calling filemap_fdatawrite_wbc(), but that would avoid the control group stuff that wbc_attach_and_unlock_inode() and co. seem to do. Do I actually need to do this? (2) afs_writepages_region() has a loop in it to generate multiple writes. do_writepages() also acquired a loop[2] which will also generate multiple writes. Should I remove the loop from afs_writepages_region() and leave it to the caller of ->writepages()? Signed-off-by: David Howells cc: Marc Dionne cc: Christoph Hellwig cc: Matthew Wilcox cc: Theodore Ts'o cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20221113162902.883850-1-hch@lst.de/ [1] Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=80a2ea9f85850f1cdae814be03b4a16c3d3abc00 [2] --- fs/afs/dir.c | 1 + fs/afs/file.c | 3 +- fs/afs/write.c | 63 +++++++++++++++++++++++---------------------- include/linux/writeback.h | 1 + mm/page-writeback.c | 3 +- 5 files changed, 38 insertions(+), 33 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 230c2d19116d..baed7b095087 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -77,6 +77,7 @@ const struct address_space_operations afs_dir_aops = { .dirty_folio = afs_dir_dirty_folio, .release_folio = afs_dir_release_folio, .invalidate_folio = afs_dir_invalidate_folio, + .migrate_folio = filemap_migrate_folio, }; const struct dentry_operations afs_fs_dentry_operations = { diff --git a/fs/afs/file.c b/fs/afs/file.c index d1cfb235c4b9..a2325e0b9d38 100644 --- a/fs/afs/file.c +++ b/fs/afs/file.c @@ -58,14 +58,15 @@ const struct address_space_operations afs_file_aops = { .invalidate_folio = afs_invalidate_folio, .write_begin = afs_write_begin, .write_end = afs_write_end, - .writepage = afs_writepage, .writepages = afs_writepages, + .migrate_folio = filemap_migrate_folio, }; const struct address_space_operations afs_symlink_aops = { .read_folio = afs_symlink_read_folio, .release_folio = afs_release_folio, .invalidate_folio = afs_invalidate_folio, + .migrate_folio = filemap_migrate_folio, }; static const struct vm_operations_struct afs_vm_ops = { diff --git a/fs/afs/write.c b/fs/afs/write.c index 9ebdd36eaf2f..38d02ead3f38 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -38,6 +38,24 @@ static void afs_folio_start_fscache(bool caching, struct folio *folio) } #endif +/* + * Flush out a conflicting write. This may extend the write to the surrounding + * pages if also dirty and contiguous to the conflicting region.. + */ +static int afs_flush_conflicting_write(struct address_space *mapping, + struct folio *folio) +{ + struct writeback_control wbc = { + .sync_mode = WB_SYNC_ALL, + .nr_to_write = LONG_MAX, + .range_start = folio_pos(folio), + .range_end = LLONG_MAX, + .for_write_begin = true, + }; + + return filemap_fdatawrite_wbc(mapping, &wbc); +} + /* * prepare to perform part of a write to a page */ @@ -80,7 +98,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping, if (folio_test_writeback(folio)) { trace_afs_folio_dirty(vnode, tracepoint_string("alrdy"), folio); - goto flush_conflicting_write; + folio_unlock(folio); + goto wait_for_writeback; } /* If the file is being filled locally, allow inter-write * spaces to be merged into writes. If it's not, only write @@ -99,8 +118,15 @@ int afs_write_begin(struct file *file, struct address_space *mapping, * flush the page out. */ flush_conflicting_write: - _debug("flush conflict"); - ret = folio_write_one(folio); + trace_afs_folio_dirty(vnode, tracepoint_string("confl"), folio); + folio_unlock(folio); + + ret = afs_flush_conflicting_write(mapping, folio); + if (ret < 0) + goto error; + +wait_for_writeback: + ret = folio_wait_writeback_killable(folio); if (ret < 0) goto error; @@ -663,34 +689,6 @@ static ssize_t afs_write_back_from_locked_folio(struct address_space *mapping, return ret; } -/* - * write a page back to the server - * - the caller locked the page for us - */ -int afs_writepage(struct page *subpage, struct writeback_control *wbc) -{ - struct folio *folio = page_folio(subpage); - ssize_t ret; - loff_t start; - - _enter("{%lx},", folio_index(folio)); - -#ifdef CONFIG_AFS_FSCACHE - folio_wait_fscache(folio); -#endif - - start = folio_index(folio) * PAGE_SIZE; - ret = afs_write_back_from_locked_folio(folio_mapping(folio), wbc, - folio, start, LLONG_MAX - start); - if (ret < 0) { - _leave(" = %zd", ret); - return ret; - } - - _leave(" = 0"); - return 0; -} - /* * write a region of pages back to the server */ @@ -775,6 +773,9 @@ static int afs_writepages_region(struct address_space *mapping, start += ret; + if (wbc->for_write_begin) + break; + cond_resched(); } while (wbc->nr_to_write > 0); diff --git a/include/linux/writeback.h b/include/linux/writeback.h index 06f9291b6fd5..3832ac3425c8 100644 --- a/include/linux/writeback.h +++ b/include/linux/writeback.h @@ -61,6 +61,7 @@ struct writeback_control { unsigned range_cyclic:1; /* range_start is cyclic */ unsigned for_sync:1; /* sync(2) WB_SYNC_ALL writeback */ unsigned unpinned_fscache_wb:1; /* Cleared I_PINNING_FSCACHE_WB */ + unsigned for_write_begin:1; /* Flush conflicting write */ /* * When writeback IOs are bounced through async layers, only the diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 7e9d8d857ecc..04c65b8b4ded 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2469,7 +2469,8 @@ int do_writepages(struct address_space *mapping, struct writeback_control *wbc) ret = mapping->a_ops->writepages(mapping, wbc); else ret = generic_writepages(mapping, wbc); - if ((ret != -ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL)) + if ((ret != -ENOMEM) || (wbc->sync_mode != WB_SYNC_ALL) || + wbc->for_write_begin) break; /*