If `get_unused_fd_flags` files fails (either in setting up `ctx` as
`tctx->last` or `get_unused_fd_flags`) `ctx` will never be freed.
Signed-off-by: Noah Goldstein <[email protected]>
---
I very well may be missing something (or there may be a double
free if the failure is after `get_unused_fd_flags`) but looks
to me to be a memory leak.
fs/io_uring.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index a3b76e63f9da..9685a7be48e3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -11863,7 +11863,7 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p,
if (ret < 0) {
/* fput will clean it up */
fput(file);
- return ret;
+ goto err;
}
trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
--
2.25.1
On Fri, Apr 29, 2022 at 7:47 AM Jens Axboe <[email protected]> wrote:
>
> On 4/28/22 6:42 PM, Noah Goldstein wrote:
> > If `get_unused_fd_flags` files fails (either in setting up `ctx` as
> > `tctx->last` or `get_unused_fd_flags`) `ctx` will never be freed.
>
> There's a comment there telling you why, the fput will end up
> releasing it just like it would when an application closes it.
I see. Thought the 'it' refered to in the comment was the file, not
ctx.
>
> > I very well may be missing something (or there may be a double
> > free if the failure is after `get_unused_fd_flags`) but looks
> > to me to be a memory leak.
>
> Have you tried synthetically reproducing the two failures you're
> thinking of and tracing cleanup?
>
> --
> Jens Axboe
>
On 4/28/22 6:42 PM, Noah Goldstein wrote:
> If `get_unused_fd_flags` files fails (either in setting up `ctx` as
> `tctx->last` or `get_unused_fd_flags`) `ctx` will never be freed.
There's a comment there telling you why, the fput will end up
releasing it just like it would when an application closes it.
> I very well may be missing something (or there may be a double
> free if the failure is after `get_unused_fd_flags`) but looks
> to me to be a memory leak.
Have you tried synthetically reproducing the two failures you're
thinking of and tracing cleanup?
--
Jens Axboe
Greeting,
FYI, we noticed the following commit (built with gcc-11):
commit: 193e0976d8366877b80c46936582102716299815 ("[PATCH v1] io_uring: Fix memory leak if file setup fails.")
url: https://github.com/intel-lab-lkp/linux/commits/Noah-Goldstein/io_uring-Fix-memory-leak-if-file-setup-fails/20220429-084504
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 249aca0d3d631660aa3583c6a3559b75b6e971b4
patch link: https://lore.kernel.org/io-uring/[email protected]
in testcase: trinity
version: trinity-static-x86_64-x86_64-f93256fb_2019-08-28
with following parameters:
runtime: 300s
group: group-01
test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>
[ 28.203642][ T5569] ------------[ cut here ]------------
[ 28.204685][ T5569] percpu_ref_kill_and_confirm called more than once on io_ring_ctx_ref_free!
[ 28.204717][ T5569] WARNING: CPU: 1 PID: 5569 at lib/percpu-refcount.c:388 percpu_ref_kill_and_confirm+0x4b/0x79
[ 28.207973][ T5569] Modules linked in: can_bcm can_raw can cn scsi_transport_iscsi sr_mod cdrom
[ 28.209497][ T5569] CPU: 1 PID: 5569 Comm: trinity-c3 Not tainted 5.18.0-rc4-00178-g193e0976d836 #1
[ 28.211060][ T5569] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 28.212645][ T5569] RIP: 0010:percpu_ref_kill_and_confirm+0x4b/0x79
[ 28.213721][ T5569] Code: 3d f3 03 70 01 00 75 24 48 8b 45 08 48 c7 c6 20 d4 09 82 48 c7 c7 78 0a 33 82 c6 05 d8 03 70 01 01 48 8b 50 08 e8 ac 29 61 00 <
0f> 0b 48 83 4d 00 02 4c 89 ee 48 89 ef e8 f0 fc ff ff 48 89 ef e8
[ 28.216874][ T5569] RSP: 0018:ffffc90000187e70 EFLAGS: 00010082
[ 28.217885][ T5569] RAX: 0000000000000000 RBX: ffff888119dc0800 RCX: 0000000000000027
[ 28.219277][ T5569] RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffff88842fd1b6e0
[ 28.220725][ T5569] RBP: ffff888119dc0800 R08: 0000000000000000 R09: 0000000000000000
[ 28.222118][ T5569] R10: 0000000000000014 R11: ffffffff822bd53e R12: 0000000000000246
[ 28.223604][ T5569] R13: 0000000000000000 R14: ffff8881758d7d80 R15: 0000000000000000
[ 28.225195][ T5569] FS: 00000000010a2880(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000
[ 28.226723][ T5569] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 28.227850][ T5569] CR2: 000000000000006e CR3: 0000000177eec000 CR4: 00000000000406e0
[ 28.229259][ T5569] DR0: 00007f654a6d2000 DR1: 00007f654a6d7000 DR2: 0000000000000000
[ 28.230657][ T5569] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 000000000037060a
[ 28.232118][ T5569] Call Trace:
[ 28.232770][ T5569] <TASK>
[ 28.233365][ T5569] io_ring_ctx_wait_and_kill+0x36/0x127
[ 28.234324][ T5569] io_uring_release+0x1c/0x1f
[ 28.235164][ T5569] __fput+0x114/0x1d7
[ 28.235932][ T5569] task_work_run+0x6b/0x8d
[ 28.236739][ T5569] exit_to_user_mode_prepare+0x6b/0xfd
[ 28.248671][ T5569] syscall_exit_to_user_mode+0x14/0x27
[ 28.249677][ T5569] do_syscall_64+0x80/0x86
[ 28.250493][ T5569] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 28.251559][ T5569] RIP: 0033:0x453b29
[ 28.252293][ T5569] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 84 00 00 c3 66 2e 0f 1f 84 00 00 00 00
[ 28.255404][ T5569] RSP: 002b:00007ffedd871608 EFLAGS: 00000246 ORIG_RAX: 00000000000001a9
[ 28.256906][ T5569] RAX: ffffffffffffffe8 RBX: 00000000000001a9 RCX: 0000000000453b29
[ 28.258294][ T5569] RDX: 0000000011111111 RSI: 00007f654acd7000 RDI: 00000000000000ee
[ 28.259722][ T5569] RBP: 00007ffedd8716b0 R08: fffffffffffffffd R09: 0000000010000000
[ 28.261101][ T5569] R10: 000000000000fffd R11: 0000000000000246 R12: 0000000000000002
[ 28.262497][ T5569] R13: 00007f654aff7058 R14: 00000000010a2830 R15: 00007f654aff7000
[ 28.263904][ T5569] </TASK>
[ 28.264515][ T5569] ---[ end trace 0000000000000000 ]---
To reproduce:
# build kernel
cd linux
cp config-5.18.0-rc4-00178-g193e0976d836 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
--
0-DAY CI Kernel Test Service
https://01.org/lkp