2022-07-05 03:24:34

by Xiubo Li

Subject: [PATCH v2 0/2] netfs: fix the crash when unlocking the folio

From: Xiubo Li <[email protected]>

V2:
- Add an error_unlocked label and rename the error label to error_locked.


kernel: page:00000000c9746ff1 refcount:2 mapcount:0 mapping:00000000dc2785bb index:0x1 pfn:0x141afc
kernel: memcg:ffff88810f766000
kernel: aops:ceph_aops [ceph] ino:100000005e7 dentry name:"postgresql-Fri.log"
kernel: flags: 0x5ffc000000201c(uptodate|dirty|lru|private|node=0|zone=2|lastcpupid=0x7ff)
kernel: raw: 005ffc000000201c ffffea000a9eeb48 ffffea00060ade48 ffff888193ed8228
kernel: raw: 0000000000000001 ffff88810cc96500 00000002ffffffff ffff88810f766000
kernel: page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
kernel: ------------[ cut here ]------------
kernel: kernel BUG at mm/filemap.c:1559!
kernel: invalid opcode: 0000 [#1] PREEMPT SMP PTI
kernel: CPU: 4 PID: 131697 Comm: postmaster Tainted: G S 5.19.0-rc2-ceph-g822a4c74e05d #1
kernel: Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 2.0 12/17/2015
kernel: RIP: 0010:folio_unlock+0x26/0x30
kernel: Code: 00 0f 1f 00 0f 1f 44 00 00 48 8b 07 a8 01 74 0e f0 80 27 fe 78 01 c3 31 f6 e9 d6 fe ff ff 48 c7 c6 c0 81 37 82 e8 aa 64 04 00 <0f> 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 b8 01 00 00
kernel: RSP: 0018:ffffc90004377bc8 EFLAGS: 00010246
kernel: RAX: 000000000000003f RBX: ffff888193ed8228 RCX: 0000000000000001
kernel: RDX: 0000000000000000 RSI: ffffffff823a3569 RDI: 00000000ffffffff
kernel: RBP: ffffffff828a0058 R08: 0000000000000001 R09: 0000000000000001
kernel: R10: 000000007c6b0fd2 R11: 0000000000000034 R12: 0000000000000001
kernel: R13: 00000000fffffe00 R14: ffffea000506bf00 R15: ffff888193ed8000
kernel: FS: 00007f4993626340(0000) GS:ffff88885fd00000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000555789ee8000 CR3: 000000017a52a006 CR4: 00000000001706e0
kernel: Call Trace:
kernel: <TASK>
kernel: netfs_write_begin+0x130/0x950 [netfs]
kernel: ceph_write_begin+0x46/0xd0 [ceph]
kernel: generic_perform_write+0xef/0x200
kernel: ? file_update_time+0xd4/0x110
kernel: ceph_write_iter+0xb01/0xcd0 [ceph]
kernel: ? lock_is_held_type+0xe3/0x140
kernel: ? new_sync_write+0x106/0x180
kernel: new_sync_write+0x106/0x180
kernel: vfs_write+0x29a/0x3a0
kernel: ksys_write+0x5c/0xd0
kernel: do_syscall_64+0x34/0x80
kernel: entry_SYSCALL_64_after_hwframe+0x46/0xb0
kernel: RIP: 0033:0x7f49903205c8
kernel: Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 d5 3f 2a 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
kernel: RSP: 002b:00007fff104bd178 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
kernel: RAX: ffffffffffffffda RBX: 0000000000000048 RCX: 00007f49903205c8
kernel: RDX: 0000000000000048 RSI: 000055944d3c1ea0 RDI: 000000000000000b
kernel: RBP: 000055944d3c1ea0 R08: 000055944d3963d0 R09: 00007fff1055b080
kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000055944d3962f0
kernel: R13: 0000000000000048 R14: 00007f49905bb880 R15: 0000000000000048
kernel: </TASK>



Xiubo Li (2):
  netfs: do not unlock and put the folio twice
  afs: unlock the folio when vnode is marked deleted

 fs/afs/file.c            |  8 +++++++-
 fs/netfs/buffered_read.c | 13 +++++++------
 2 files changed, 14 insertions(+), 7 deletions(-)

--
2.36.0.rc1


2022-07-05 03:33:25

by Xiubo Li

Subject: [PATCH v2 1/2] netfs: do not unlock and put the folio twice

From: Xiubo Li <[email protected]>

check_write_begin() unlocks and puts the folio when it returns
non-zero, so the netfs layer must not unlock and put the folio a
second time on that error path.

URL: https://tracker.ceph.com/issues/56423
Signed-off-by: Xiubo Li <[email protected]>
---
 fs/netfs/buffered_read.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c
index 42f892c5712e..b6fd6e5fe019 100644
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -320,7 +320,7 @@ static bool netfs_skip_folio_read(struct folio *folio, loff_t pos, size_t len,
  * pointer to the fsdata cookie that gets returned to the VM to be passed to
  * write_end. It is permitted to sleep. It should return 0 if the request
  * should go ahead; unlock the folio and return -EAGAIN to cause the folio to
- * be regot; or return an error.
+ * be regot; or unlock the folio and return an error.
  *
  * The calling netfs must initialise a netfs context contiguous to the vfs
  * inode before calling this.
@@ -353,7 +353,7 @@ int netfs_write_begin(struct netfs_inode *ctx,
 			trace_netfs_failure(NULL, NULL, ret, netfs_fail_check_write_begin);
 			if (ret == -EAGAIN)
 				goto retry;
-			goto error;
+			goto error_unlocked;
 		}
 	}
 
@@ -375,7 +375,7 @@ int netfs_write_begin(struct netfs_inode *ctx,
 				   NETFS_READ_FOR_WRITE);
 	if (IS_ERR(rreq)) {
 		ret = PTR_ERR(rreq);
-		goto error;
+		goto error_locked;
 	}
 	rreq->no_unlock_folio = folio_index(folio);
 	__set_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, &rreq->flags);
@@ -402,12 +402,12 @@ int netfs_write_begin(struct netfs_inode *ctx,
 
 	ret = netfs_begin_read(rreq, true);
 	if (ret < 0)
-		goto error;
+		goto error_locked;
 
 have_folio:
 	ret = folio_wait_fscache_killable(folio);
 	if (ret < 0)
-		goto error;
+		goto error_locked;
 have_folio_no_wait:
 	*_folio = folio;
 	_leave(" = 0");
@@ -415,9 +415,10 @@ int netfs_write_begin(struct netfs_inode *ctx,
 
 error_put:
 	netfs_put_request(rreq, false, netfs_rreq_trace_put_failed);
-error:
+error_locked:
 	folio_unlock(folio);
 	folio_put(folio);
+error_unlocked:
 	_leave(" = %d", ret);
 	return ret;
 }
--
2.36.0.rc1
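
For readers who want to see the label split outside the kernel, here is a
minimal userspace sketch of the control flow in this patch. It is not the
kernel code: struct model_folio and every model_* function are hypothetical
stand-ins, the -EAGAIN retry path is left out, and the only thing it keeps
from the patch is the rule that a failing check_write_begin() has already
unlocked and put the folio, so that path must skip the cleanup under
error_locked.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: lock bit and refcount only. */
struct model_folio {
	bool locked;
	int refcount;
	int bad_unlocks;	/* unlock attempts made while not locked */
};

static void model_folio_unlock(struct model_folio *folio)
{
	if (!folio->locked) {
		/* The real folio_unlock() would trip VM_BUG_ON_FOLIO here. */
		folio->bad_unlocks++;
		return;
	}
	folio->locked = false;
}

static void model_folio_put(struct model_folio *folio)
{
	folio->refcount--;
}

/*
 * Hypothetical hook modelling the contract described in the patch:
 * on a non-zero return it has already unlocked and put the folio.
 */
static int model_check_write_begin(struct model_folio *folio, int result)
{
	if (result != 0) {
		model_folio_unlock(folio);
		model_folio_put(folio);
	}
	return result;
}

/*
 * check_ret is what the hook returns; read_ret is what a later step
 * (request allocation, read, fscache wait) returns.
 */
static int model_write_begin(struct model_folio *folio, int check_ret,
			     int read_ret)
{
	int ret;

	/* Start with the folio locked and one reference held. */
	folio->locked = true;
	folio->refcount = 1;
	folio->bad_unlocks = 0;

	ret = model_check_write_begin(folio, check_ret);
	if (ret < 0)
		goto error_unlocked;	/* hook already cleaned up */

	ret = read_ret;
	if (ret < 0)
		goto error_locked;	/* lock and reference are still ours */

	return 0;	/* success: folio stays locked and referenced */

error_locked:
	model_folio_unlock(folio);
	model_folio_put(folio);
error_unlocked:
	return ret;
}
```

Calling model_write_begin() with a failing check_ret and then with a failing
read_ret should leave bad_unlocks at 0 in both cases. Sending the first
failure to error_locked instead would unlock an already unlocked folio,
which is the pattern behind the VM_BUG_ON_FOLIO in the cover letter.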

2022-07-05 13:11:29

by Jeff Layton

Subject: Re: [PATCH v2 0/2] netfs: fix the crash when unlocking the folio

On Tue, 2022-07-05 at 10:52 +0800, [email protected] wrote:
> From: Xiubo Li <[email protected]>
>
> V2:
> - Add an error_unlocked label and rename the error label to error_locked.
>
>
> kernel: page:00000000c9746ff1 refcount:2 mapcount:0 mapping:00000000dc2785bb index:0x1 pfn:0x141afc
> kernel: memcg:ffff88810f766000
> kernel: aops:ceph_aops [ceph] ino:100000005e7 dentry name:"postgresql-Fri.log"
> kernel: flags: 0x5ffc000000201c(uptodate|dirty|lru|private|node=0|zone=2|lastcpupid=0x7ff)
> kernel: raw: 005ffc000000201c ffffea000a9eeb48 ffffea00060ade48 ffff888193ed8228
> kernel: raw: 0000000000000001 ffff88810cc96500 00000002ffffffff ffff88810f766000
> kernel: page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
> kernel: ------------[ cut here ]------------
> kernel: kernel BUG at mm/filemap.c:1559!
> kernel: invalid opcode: 0000 [#1] PREEMPT SMP PTI
> kernel: CPU: 4 PID: 131697 Comm: postmaster Tainted: G S 5.19.0-rc2-ceph-g822a4c74e05d #1
> kernel: Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 2.0 12/17/2015
> kernel: RIP: 0010:folio_unlock+0x26/0x30
> kernel: Code: 00 0f 1f 00 0f 1f 44 00 00 48 8b 07 a8 01 74 0e f0 80 27 fe 78 01 c3 31 f6 e9 d6 fe ff ff 48 c7 c6 c0 81 37 82 e8 aa 64 04 00 <0f> 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 b8 01 00 00
> kernel: RSP: 0018:ffffc90004377bc8 EFLAGS: 00010246
> kernel: RAX: 000000000000003f RBX: ffff888193ed8228 RCX: 0000000000000001
> kernel: RDX: 0000000000000000 RSI: ffffffff823a3569 RDI: 00000000ffffffff
> kernel: RBP: ffffffff828a0058 R08: 0000000000000001 R09: 0000000000000001
> kernel: R10: 000000007c6b0fd2 R11: 0000000000000034 R12: 0000000000000001
> kernel: R13: 00000000fffffe00 R14: ffffea000506bf00 R15: ffff888193ed8000
> kernel: FS: 00007f4993626340(0000) GS:ffff88885fd00000(0000) knlGS:0000000000000000
> kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: CR2: 0000555789ee8000 CR3: 000000017a52a006 CR4: 00000000001706e0
> kernel: Call Trace:
> kernel: <TASK>
> kernel: netfs_write_begin+0x130/0x950 [netfs]
> kernel: ceph_write_begin+0x46/0xd0 [ceph]
> kernel: generic_perform_write+0xef/0x200
> kernel: ? file_update_time+0xd4/0x110
> kernel: ceph_write_iter+0xb01/0xcd0 [ceph]
> kernel: ? lock_is_held_type+0xe3/0x140
> kernel: ? new_sync_write+0x106/0x180
> kernel: new_sync_write+0x106/0x180
> kernel: vfs_write+0x29a/0x3a0
> kernel: ksys_write+0x5c/0xd0
> kernel: do_syscall_64+0x34/0x80
> kernel: entry_SYSCALL_64_after_hwframe+0x46/0xb0
> kernel: RIP: 0033:0x7f49903205c8
> kernel: Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 d5 3f 2a 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
> kernel: RSP: 002b:00007fff104bd178 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> kernel: RAX: ffffffffffffffda RBX: 0000000000000048 RCX: 00007f49903205c8
> kernel: RDX: 0000000000000048 RSI: 000055944d3c1ea0 RDI: 000000000000000b
> kernel: RBP: 000055944d3c1ea0 R08: 000055944d3963d0 R09: 00007fff1055b080
> kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000055944d3962f0
> kernel: R13: 0000000000000048 R14: 00007f49905bb880 R15: 0000000000000048
> kernel: </TASK>
>
>
>
> Xiubo Li (2):
>   netfs: do not unlock and put the folio twice
>   afs: unlock the folio when vnode is marked deleted
>
>  fs/afs/file.c            |  8 +++++++-
>  fs/netfs/buffered_read.c | 13 +++++++------
>  2 files changed, 14 insertions(+), 7 deletions(-)
>

Reviewed-by: Jeff Layton <[email protected]>