Subject: [PATCH 06/10] MM: perform async writes to SWP_FS_OPS swap-space using ->swap_rw
From: NeilBrown
To: Andrew Morton
Cc: Christoph Hellwig, David Howells, linux-nfs@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Date: Wed, 30 Mar 2022 10:49:41 +1100
Message-ID: <164859778126.29473.12399585304843922231.stgit@noble.brown>
In-Reply-To: <164859751830.29473.5309689752169286816.stgit@noble.brown>
References: <164859751830.29473.5309689752169286816.stgit@noble.brown>
User-Agent: StGit/0.23

This patch switches swap-out to SWP_FS_OPS swap-spaces to use ->swap_rw
and makes the writes asynchronous, like they are for other swap spaces.

To make it async we need to allocate the kiocb struct from a mempool.
This may block, but not for as long as waiting for the write to
complete would.  At most it will wait for some previous swap IO to
complete.

Reviewed-by: Christoph Hellwig
Signed-off-by: NeilBrown
---
 mm/page_io.c |   98 ++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 58 insertions(+), 40 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 52d423c9962b..a01cc273bb00 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -303,6 +303,57 @@ int sio_pool_init(void)
 	return 0;
 }
 
+static void sio_write_complete(struct kiocb *iocb, long ret)
+{
+	struct swap_iocb *sio = container_of(iocb, struct swap_iocb, iocb);
+	struct page *page = sio->bvec.bv_page;
+
+	if (ret != PAGE_SIZE) {
+		/*
+		 * In the case of swap-over-nfs, this can be a
+		 * temporary failure if the system has limited
+		 * memory for allocating transmit buffers.
+		 * Mark the page dirty and avoid
+		 * folio_rotate_reclaimable but rate-limit the
+		 * messages but do not flag PageError like
+		 * the normal direct-to-bio case as it could
+		 * be temporary.
+		 */
+		set_page_dirty(page);
+		ClearPageReclaim(page);
+		pr_err_ratelimited("Write error %ld on dio swapfile (%llu)\n",
+				   ret, page_file_offset(page));
+	} else
+		count_vm_event(PSWPOUT);
+	end_page_writeback(page);
+	mempool_free(sio, sio_pool);
+}
+
+static int swap_writepage_fs(struct page *page, struct writeback_control *wbc)
+{
+	struct swap_iocb *sio;
+	struct swap_info_struct *sis = page_swap_info(page);
+	struct file *swap_file = sis->swap_file;
+	struct address_space *mapping = swap_file->f_mapping;
+	struct iov_iter from;
+	int ret;
+
+	set_page_writeback(page);
+	unlock_page(page);
+	sio = mempool_alloc(sio_pool, GFP_NOIO);
+	init_sync_kiocb(&sio->iocb, swap_file);
+	sio->iocb.ki_complete = sio_write_complete;
+	sio->iocb.ki_pos = page_file_offset(page);
+	sio->bvec.bv_page = page;
+	sio->bvec.bv_len = PAGE_SIZE;
+	sio->bvec.bv_offset = 0;
+	iov_iter_bvec(&from, WRITE, &sio->bvec, 1, PAGE_SIZE);
+	ret = mapping->a_ops->swap_rw(&sio->iocb, &from);
+	if (ret != -EIOCBQUEUED)
+		sio_write_complete(&sio->iocb, ret);
+	return ret;
+}
+
 int __swap_writepage(struct page *page, struct writeback_control *wbc,
 		bio_end_io_t end_write_func)
 {
@@ -311,46 +362,13 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 	struct swap_info_struct *sis = page_swap_info(page);
 
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-	if (data_race(sis->flags & SWP_FS_OPS)) {
-		struct kiocb kiocb;
-		struct file *swap_file = sis->swap_file;
-		struct address_space *mapping = swap_file->f_mapping;
-		struct bio_vec bv = {
-			.bv_page = page,
-			.bv_len  = PAGE_SIZE,
-			.bv_offset = 0
-		};
-		struct iov_iter from;
-
-		iov_iter_bvec(&from, WRITE, &bv, 1, PAGE_SIZE);
-		init_sync_kiocb(&kiocb, swap_file);
-		kiocb.ki_pos = page_file_offset(page);
-
-		set_page_writeback(page);
-		unlock_page(page);
-		ret = mapping->a_ops->direct_IO(&kiocb, &from);
-		if (ret == PAGE_SIZE) {
-			count_vm_event(PSWPOUT);
-			ret = 0;
-		} else {
-			/*
-			 * In the case of swap-over-nfs, this can be a
-			 * temporary failure if the system has limited
-			 * memory for allocating transmit buffers.
-			 * Mark the page dirty and avoid
-			 * folio_rotate_reclaimable but rate-limit the
-			 * messages but do not flag PageError like
-			 * the normal direct-to-bio case as it could
-			 * be temporary.
-			 */
-			set_page_dirty(page);
-			ClearPageReclaim(page);
-			pr_err_ratelimited("Write error on dio swapfile (%llu)\n",
-					   page_file_offset(page));
-		}
-		end_page_writeback(page);
-		return ret;
-	}
+	/*
+	 * ->flags can be updated non-atomically (scan_swap_map_slots),
+	 * but that will never affect SWP_FS_OPS, so the data_race
+	 * is safe.
+	 */
+	if (data_race(sis->flags & SWP_FS_OPS))
+		return swap_writepage_fs(page, wbc);
 
 	ret = bdev_write_page(sis->bdev, swap_page_sector(page), page, wbc);
 	if (!ret) {