From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Dave Wysochanski, Jeff Layton, Jeff Layton, David Howells, Trond Myklebust, Sasha Levin
Subject: [PATCH 6.8 591/715] NFS: Fix nfs_netfs_issue_read() xarray locking for writeback interrupt
Date: Sun, 24 Mar 2024 18:32:50 -0400
Message-ID:
<20240324223455.1342824-592-sashal@kernel.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240324223455.1342824-1-sashal@kernel.org>
References: <20240324223455.1342824-1-sashal@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
Content-Transfer-Encoding: 8bit

From: Dave Wysochanski

[ Upstream commit fd5860ab6341506004219b080aea40213b299d2e ]

The loop inside nfs_netfs_issue_read() currently does not disable
interrupts while iterating through pages in the xarray to submit
for NFS read. This is not safe though since after taking xa_lock,
another page in the mapping could be processed for writeback inside
an interrupt, and deadlock can occur. The fix is simple and clean
if we use xa_for_each_range(), which handles the iteration with RCU
while reducing code complexity.

The problem is easily reproduced with the following test:

 mount -o vers=3,fsc 127.0.0.1:/export /mnt/nfs
 dd if=/dev/zero of=/mnt/nfs/file1.bin bs=4096 count=1
 echo 3 > /proc/sys/vm/drop_caches
 dd if=/mnt/nfs/file1.bin of=/dev/null
 umount /mnt/nfs

On the console with a lockdep-enabled kernel a message similar to
the following will be seen:

 ================================
 WARNING: inconsistent lock state
 6.7.0-lockdbg+ #10 Not tainted
 --------------------------------
 inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
 test5/1708 [HC0[0]:SC0[0]:HE1:SE1] takes:
 ffff888127baa598 (&xa->xa_lock#4){+.?.}-{3:3}, at: nfs_netfs_issue_read+0x1b2/0x4b0 [nfs]
 {IN-SOFTIRQ-W} state was registered at:
   lock_acquire+0x144/0x380
   _raw_spin_lock_irqsave+0x4e/0xa0
   __folio_end_writeback+0x17e/0x5c0
   folio_end_writeback+0x93/0x1b0
   iomap_finish_ioend+0xeb/0x6a0
   blk_update_request+0x204/0x7f0
   blk_mq_end_request+0x30/0x1c0
   blk_complete_reqs+0x7e/0xa0
   __do_softirq+0x113/0x544
   __irq_exit_rcu+0xfe/0x120
   irq_exit_rcu+0xe/0x20
   sysvec_call_function_single+0x6f/0x90
   asm_sysvec_call_function_single+0x1a/0x20
   pv_native_safe_halt+0xf/0x20
   default_idle+0x9/0x20
   default_idle_call+0x67/0xa0
   do_idle+0x2b5/0x300
   cpu_startup_entry+0x34/0x40
   start_secondary+0x19d/0x1c0
   secondary_startup_64_no_verify+0x18f/0x19b
 irq event stamp: 176891
 hardirqs last  enabled at (176891): [] _raw_spin_unlock_irqrestore+0x44/0x60
 hardirqs last disabled at (176890): [] _raw_spin_lock_irqsave+0x79/0xa0
 softirqs last  enabled at (176646): [] __irq_exit_rcu+0xfe/0x120
 softirqs last disabled at (176633): [] __irq_exit_rcu+0xfe/0x120

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&xa->xa_lock#4);
   <Interrupt>
     lock(&xa->xa_lock#4);

  *** DEADLOCK ***

 2 locks held by test5/1708:
  #0: ffff888127baa498 (&sb->s_type->i_mutex_key#22){++++}-{4:4}, at: nfs_start_io_read+0x28/0x90 [nfs]
  #1: ffff888127baa650 (mapping.invalidate_lock#3){.+.+}-{4:4}, at: page_cache_ra_unbounded+0xa4/0x280

 stack backtrace:
 CPU: 6 PID: 1708 Comm: test5 Kdump: loaded Not tainted 6.7.0-lockdbg+
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014
 Call Trace:
  dump_stack_lvl+0x5b/0x90
  mark_lock+0xb3f/0xd20
  __lock_acquire+0x77b/0x3360
  _raw_spin_lock+0x34/0x80
  nfs_netfs_issue_read+0x1b2/0x4b0 [nfs]
  netfs_begin_read+0x77f/0x980 [netfs]
  nfs_netfs_readahead+0x45/0x60 [nfs]
  nfs_readahead+0x323/0x5a0 [nfs]
  read_pages+0xf3/0x5c0
  page_cache_ra_unbounded+0x1c8/0x280
  filemap_get_pages+0x38c/0xae0
  filemap_read+0x206/0x5e0
  nfs_file_read+0xb7/0x140 [nfs]
  vfs_read+0x2a9/0x460
  ksys_read+0xb7/0x140

Fixes: 000dbe0bec05 ("NFS: Convert buffered read paths to use netfs when fscache is enabled")
Suggested-by: Jeff Layton
Signed-off-by: Dave Wysochanski
Reviewed-by: Jeff Layton
Reviewed-by: David Howells
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 fs/nfs/fscache.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 2d1bfee225c36..ddc1ee0319554 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -301,11 +301,11 @@ static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
 	struct inode *inode = sreq->rreq->inode;
 	struct nfs_open_context *ctx = sreq->rreq->netfs_priv;
 	struct page *page;
+	unsigned long idx;
 	int err;
 	pgoff_t start = (sreq->start + sreq->transferred) >> PAGE_SHIFT;
 	pgoff_t last = ((sreq->start + sreq->len -
 			 sreq->transferred - 1) >> PAGE_SHIFT);
-	XA_STATE(xas, &sreq->rreq->mapping->i_pages, start);
 
 	nfs_pageio_init_read(&pgio, inode, false,
 			     &nfs_async_read_completion_ops);
@@ -316,19 +316,14 @@ static void nfs_netfs_issue_read(struct netfs_io_subrequest *sreq)
 
 	pgio.pg_netfs = netfs; /* used in completion */
 
-	xas_lock(&xas);
-	xas_for_each(&xas, page, last) {
+	xa_for_each_range(&sreq->rreq->mapping->i_pages, idx, page, start, last) {
 		/* nfs_read_add_folio() may schedule() due to pNFS layout and other RPCs */
-		xas_pause(&xas);
-		xas_unlock(&xas);
 		err = nfs_read_add_folio(&pgio, ctx, page_folio(page));
 		if (err < 0) {
 			netfs->error = err;
 			goto out;
 		}
-		xas_lock(&xas);
 	}
-	xas_unlock(&xas);
 out:
 	nfs_pageio_complete_read(&pgio);
 	nfs_netfs_put(netfs);
-- 
2.43.0