2022-12-20 14:50:03

by Mirsad Todorovac

Subject: Possible regression in drm/i915 driver: memleak

Hi all,

I have been unable to find the email addresses of any particular Intel i915
maintainers, so my best bet is to post here, as you will most assuredly
already know them.

The problem is a kernel memory leak that is triggered repeatedly while
running the Chrome browser under this morning's latest 6.1.0+ kernel on
AlmaLinux 8.6, on a Lenovo desktop box with an
Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz.

The kernel was built with KMEMLEAK, KASAN and MGLRU enabled,
from a vanilla mainline tree (Mr. Torvalds' tree).
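
For anyone trying to reproduce this, the debug options named above could be checked in a kernel config file with a small shell sketch like the one below. The symbols CONFIG_DEBUG_KMEMLEAK, CONFIG_KASAN and CONFIG_LRU_GEN are assumed here to be the Kconfig names behind "KMEMLEAK, KASAN and MGLRU", and check_leak_debug_config is a hypothetical helper name, not something shipped with the kernel.

```shell
# Hypothetical helper: report whether the debug options used for this
# report are enabled (=y) in a given kernel config file.
# Assumption: the symbol names below correspond to KMEMLEAK, KASAN, MGLRU.
check_leak_debug_config() {
    config_file=$1
    missing=0
    for sym in CONFIG_DEBUG_KMEMLEAK CONFIG_KASAN CONFIG_LRU_GEN; do
        # Anchor on "SYMBOL=y" so that e.g. CONFIG_KASAN_GENERIC does not match.
        if grep -q "^${sym}=y" "$config_file"; then
            echo "$sym: enabled"
        else
            echo "$sym: not enabled"
            missing=1
        fi
    done
    # Exit status 0 only if every option was found enabled.
    return "$missing"
}
```

It would be run on a decompressed config, for example after `xz -dc config-6.1.0+.xz > config-6.1.0+`.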

The leaks look like this one:

unreferenced object 0xffff888131754880 (size 64):
  comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff  ...........>....
  backtrace:
    [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
    [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
    [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
    [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
    [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820 [i915]
    [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110 [i915]
    [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
    [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
    [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
    [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
    [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc

The complete list of leaks is in the attachment, but they all seem similar or identical.

Please find attached lshw and kernel build config file.

I will probably run the same checks on my laptop at home, which is also
a Lenovo, but with a different hardware configuration and Ubuntu 22.10.

Thanks,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia


Attachments:
6.1.0+-kmemleak-chrome-i915-drm.log (29.49 kB)
lshw.txt.xz (4.52 kB)
config-6.1.0+.xz (55.83 kB)

2022-12-20 15:37:51

by srinivas pandruvada

Subject: Re: Possible regression in drm/i915 driver: memleak

+Added DRM mailing list and maintainers

On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
> Hi all,
>
> I have been unsuccessful to find any particular Intel i915 maintainer
> emails, so my best bet is to post here, as you will must assuredly
> already know them.
>
> The problem is a kernel memory leak that is repeatedly occurring
> triggered during the execution of Chrome browser under the latest
> 6.1.0+
> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>
> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
> build,
> on a vanilla mainline kernel from Mr. Torvalds' tree.
>
> The leaks look like this one:
>
> unreferenced object 0xffff888131754880 (size 64):
>    comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>    hex dump (first 32 bytes):
>      01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> ................
>      00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff 
> ...........>....
>    backtrace:
>      [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>      [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>      [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>      [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>      [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
> [i915]
>      [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
> [i915]
>      [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>      [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>      [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>      [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>      [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>
> The complete list of leaks in attachment, but they seem similar or
> the same.
>
> Please find attached lshw and kernel build config file.
>
> I will probably check the same parms on my laptop at home, which is
> also
> Lenovo, but a different hw config and Ubuntu 22.10.
>
> Thanks,
> Mirsad
>
> --
> Mirsad Goran Todorovac
> Sistem inženjer
> Grafički fakultet | Akademija likovnih umjetnosti
> Sveučilište u Zagrebu

2022-12-20 16:06:04

by Tvrtko Ursulin

Subject: Re: Possible regression in drm/i915 driver: memleak


Hi,

On 20/12/2022 15:22, srinivas pandruvada wrote:
> +Added DRM mailing list and maintainers
>
> On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
>> Hi all,
>>
>> I have been unsuccessful to find any particular Intel i915 maintainer
>> emails, so my best bet is to post here, as you will must assuredly
>> already know them.

For future reference you can use ${kernel_dir}/scripts/get_maintainer.pl -f ...
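
As a rough sketch of how that might look in practice, the wrapper below runs the script from the top of a kernel checkout. The function name list_maintainers and its argument layout are made up for illustration; only the script path and the -f option come from the hint above.

```shell
# Hypothetical wrapper around scripts/get_maintainer.pl.
# $1: path to a kernel source checkout, $2: source file relative to it.
list_maintainers() {
    kernel_dir=$1
    src_path=$2
    if [ ! -x "$kernel_dir/scripts/get_maintainer.pl" ]; then
        echo "get_maintainer.pl not found under $kernel_dir" >&2
        return 1
    fi
    # Run from the top of the tree so relative paths resolve as expected.
    (cd "$kernel_dir" && ./scripts/get_maintainer.pl -f "$src_path")
}

# Example call (not executed here); the file is the one touched by the
# patch in this thread:
#   list_maintainers ~/linux drivers/gpu/drm/i915/gem/i915_gem_mman.c
```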

>> The problem is a kernel memory leak that is repeatedly occurring
>> triggered during the execution of Chrome browser under the latest
>> 6.1.0+
>> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
>> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>>
>> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
>> build,
>> on a vanilla mainline kernel from Mr. Torvalds' tree.
>>
>> The leaks look like this one:
>>
>> unreferenced object 0xffff888131754880 (size 64):
>>    comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>>    hex dump (first 32 bytes):
>>      01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> ................
>>      00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>> ...........>....
>>    backtrace:
>>      [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>>      [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>      [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>>      [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>>      [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
>> [i915]
>>      [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
>> [i915]
>>      [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>>      [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>>      [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>>      [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>>      [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>
>> The complete list of leaks in attachment, but they seem similar or
>> the same.
>>
>> Please find attached lshw and kernel build config file.
>>
>> I will probably check the same parms on my laptop at home, which is
>> also
>> Lenovo, but a different hw config and Ubuntu 22.10.

Could you try the below patch?

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index c3ea243d414d..0b07534c203a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
 insert:
        mmo = insert_mmo(obj, mmo);
        GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
-out:
+
        if (file)
                drm_vma_node_allow(&mmo->vma_node, file);
+out:
        return mmo;

 err:

It may not be the best fix, but I am curious to know whether it makes the leak go away.

Regards,

Tvrtko

2022-12-20 17:38:01

by Mirsad Todorovac

Subject: Re: Possible regression in drm/i915 driver: memleak

On 20. 12. 2022. 16:52, Tvrtko Ursulin wrote:

> On 20/12/2022 15:22, srinivas pandruvada wrote:
>> +Added DRM mailing list and maintainers
>>
>> On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
>>> Hi all,
>>>
>>> I have been unsuccessful to find any particular Intel i915 maintainer
>>> emails, so my best bet is to post here, as you will must assuredly
>>> already know them.
>
> For future reference you can use ${kernel_dir}/scripts/get_maintainer.pl -f ...

Thank you, this will help a great deal provided that I find any
more bugs ...

>>> The problem is a kernel memory leak that is repeatedly occurring
>>> triggered during the execution of Chrome browser under the latest
>>> 6.1.0+
>>> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
>>> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>>>
>>> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
>>> build,
>>> on a vanilla mainline kernel from Mr. Torvalds' tree.
>>>
>>> The leaks look like this one:
>>>
>>> unreferenced object 0xffff888131754880 (size 64):
>>>     comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>>>     hex dump (first 32 bytes):
>>>       01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> ................
>>>       00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>>> ...........>....
>>>     backtrace:
>>>       [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>>>       [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>       [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>>>       [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>>>       [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
>>> [i915]
>>>       [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
>>> [i915]
>>>       [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>>>       [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>>>       [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>>>       [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>>>       [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>
>>> The complete list of leaks in attachment, but they seem similar or
>>> the same.
>>>
>>> Please find attached lshw and kernel build config file.
>>>
>>> I will probably check the same parms on my laptop at home, which is
>>> also
>>> Lenovo, but a different hw config and Ubuntu 22.10.
>
> Could you try the below patch?
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index c3ea243d414d..0b07534c203a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
>  insert:
>         mmo = insert_mmo(obj, mmo);
>         GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
> -out:
> +
>         if (file)
>                 drm_vma_node_allow(&mmo->vma_node, file);
> +out:
>         return mmo;
>
>  err:
>
> Maybe it is not the best fix but curious to know if it will make the leak go away.

The patch applied successfully to the latest tree from Mr. Torvalds (commit b6bb9676f216).

It is currently building, which can take up to 90 minutes on our system.

Whether I can test it now depends on whether I will be able to set up the machine at work
remotely (there have been some firewall restrictions on port 22 recently).

I will keep you updated.

Thanks,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2022-12-20 20:11:27

by Mirsad Todorovac

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak

On 12/20/22 16:52, Tvrtko Ursulin wrote:

> On 20/12/2022 15:22, srinivas pandruvada wrote:
>> +Added DRM mailing list and maintainers
>>
>> On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
>>> Hi all,
>>>
>>> I have been unsuccessful to find any particular Intel i915 maintainer
>>> emails, so my best bet is to post here, as you will must assuredly
>>> already know them.
>
> For future reference you can use ${kernel_dir}/scripts/get_maintainer.pl
> -f ...
>
>>> The problem is a kernel memory leak that is repeatedly occurring
>>> triggered during the execution of Chrome browser under the latest
>>> 6.1.0+
>>> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
>>> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>>>
>>> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
>>> build,
>>> on a vanilla mainline kernel from Mr. Torvalds' tree.
>>>
>>> The leaks look like this one:
>>>
>>> unreferenced object 0xffff888131754880 (size 64):
>>>     comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>>>     hex dump (first 32 bytes):
>>>       01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> ................
>>>       00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>>> ...........>....
>>>     backtrace:
>>>       [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>>>       [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>       [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>>>       [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>>>       [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
>>> [i915]
>>>       [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
>>> [i915]
>>>       [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>>>       [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>>>       [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>>>       [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>>>       [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>
>>> The complete list of leaks in attachment, but they seem similar or
>>> the same.
>>>
>>> Please find attached lshw and kernel build config file.
>>>
>>> I will probably check the same parms on my laptop at home, which is
>>> also
>>> Lenovo, but a different hw config and Ubuntu 22.10.
>
> Could you try the below patch?
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> index c3ea243d414d..0b07534c203a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
> @@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
>  insert:
>         mmo = insert_mmo(obj, mmo);
>         GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
> -out:
> +
>         if (file)
>                 drm_vma_node_allow(&mmo->vma_node, file);
> +out:
>         return mmo;
>
>  err:
>
> Maybe it is not the best fix but curious to know if it will make the
> leak go away.

Hi,

After 27 minutes of uptime with the patched kernel, it looks promising.
That is much longer than it took the buggy kernel to leak slabs.

Here is the output:

[root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# cat !$
cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff888105028d80 (size 16):
  comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
  hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
  backtrace:
    [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
    [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
    [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
    [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
    [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
    [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
    [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
    [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
    [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
    [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
    [<ffffffffb676e556>] worker_thread+0x76/0x770
    [<ffffffffb677b468>] kthread+0x168/0x1a0
    [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
[root@pc-mtodorov marvin]# w
 20:27:35 up 27 min,  2 users,  load average: 0.83, 1.15, 1.19
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
marvin   tty2     tty2             20:01   27:10  10:12   2.09s /opt/google/chrome/chrome --type=utility --utility-sub-type=audio.m
marvin   pts/1    -                20:01    0.00s  2:00   0.38s sudo bash
[root@pc-mtodorov marvin]# uname -rms
Linux 6.1.0-b6bb9676f216-mglru-kmemlk-kasan+ x86_64
[root@pc-mtodorov marvin]#

2. On the Ubuntu 22.10 laptop with a Debian build I have not reproduced the error
so far.

This looks fixed to me, but if nothing has leaked by Thursday morning,
when I will next see this desktop box, then we'll know with more
certainty.
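
To make that later check less of an eyeball exercise, the kmemleak report could be summarised by counting the objects whose backtrace mentions a given function. This is only a sketch: count_leaks_with_symbol is a hypothetical helper, and it assumes each report starts with a line beginning "unreferenced object", as in the output above.

```shell
# Hypothetical helper: read a kmemleak report on stdin and print how many
# "unreferenced object" records mention the symbol given as $1.
count_leaks_with_symbol() {
    awk -v sym="$1" '
        # A new record starts: close the previous one first.
        /^unreferenced object/ { if (hit) n++; hit = 0 }
        # Remember if any line of the current record names the symbol.
        index($0, sym)         { hit = 1 }
        # Close the last record and print the total (0 if none matched).
        END                    { if (hit) n++; print n + 0 }
    '
}

# Example use as root on a kernel with kmemleak enabled (not executed here):
#   echo scan > /sys/kernel/debug/kmemleak
#   count_leaks_with_symbol drm_vma_node_allow < /sys/kernel/debug/kmemleak
```

A result of 0 for drm_vma_node_allow, with the memstick_check report counted separately, would match what the scan above shows.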

Hope this helps. (My $0.02 .)

Kudos for the quick fix :)

Kind regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

2022-12-21 07:43:48

by Mirsad Todorovac

Subject: Re: Possible regression in drm/i915 driver: memleak

On 20.12.2022. 20:34, Mirsad Todorovac wrote:
> On 12/20/22 16:52, Tvrtko Ursulin wrote:
>
>> On 20/12/2022 15:22, srinivas pandruvada wrote:
>>> +Added DRM mailing list and maintainers
>>>
>>> On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
>>>> Hi all,
>>>>
>>>> I have been unsuccessful to find any particular Intel i915 maintainer
>>>> emails, so my best bet is to post here, as you will must assuredly
>>>> already know them.
>>
>> For future reference you can use
>> ${kernel_dir}/scripts/get_maintainer.pl -f ...
>>
>>>> The problem is a kernel memory leak that is repeatedly occurring
>>>> triggered during the execution of Chrome browser under the latest
>>>> 6.1.0+
>>>> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
>>>> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>>>>
>>>> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
>>>> build,
>>>> on a vanilla mainline kernel from Mr. Torvalds' tree.
>>>>
>>>> The leaks look like this one:
>>>>
>>>> unreferenced object 0xffff888131754880 (size 64):
>>>>     comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>>>>     hex dump (first 32 bytes):
>>>>       01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> ................
>>>>       00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>>>> ...........>....
>>>>     backtrace:
>>>>       [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>>>>       [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>>       [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>>>>       [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>>>>       [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
>>>> [i915]
>>>>       [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
>>>> [i915]
>>>>       [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>>>>       [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>>>>       [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>>>>       [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>>>>       [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>>
>>>> The complete list of leaks in attachment, but they seem similar or
>>>> the same.
>>>>
>>>> Please find attached lshw and kernel build config file.
>>>>
>>>> I will probably check the same parms on my laptop at home, which is
>>>> also
>>>> Lenovo, but a different hw config and Ubuntu 22.10.
>>
>> Could you try the below patch?
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> index c3ea243d414d..0b07534c203a 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
>>   insert:
>>          mmo = insert_mmo(obj, mmo);
>>          GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
>> -out:
>> +
>>          if (file)
>>                  drm_vma_node_allow(&mmo->vma_node, file);
>> +out:
>>          return mmo;
>>
>>   err:
>>
>> Maybe it is not the best fix but curious to know if it will make the
>> leak go away.
>
> Hi,
>
> After 27 minutes uptime with the patched kernel it looks promising.
> It is much longer than it took for the buggy kernel to leak slabs.
>
> Here is the output:
>
> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# cat !$
> cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff888105028d80 (size 16):
>   comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>   hex dump (first 16 bytes):
>     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0.......
>   backtrace:
>     [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>     [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>     [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>     [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>     [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>     [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>     [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>     [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>     [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>     [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>     [<ffffffffb676e556>] worker_thread+0x76/0x770
>     [<ffffffffb677b468>] kthread+0x168/0x1a0
>     [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
> [root@pc-mtodorov marvin]# w
>  20:27:35 up 27 min,  2 users,  load average: 0.83, 1.15, 1.19
> USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
> marvin   tty2     tty2             20:01   27:10  10:12   2.09s
> /opt/google/chrome/chrome --type=utility --utility-sub-type=audio.m
> marvin   pts/1    -                20:01    0.00s  2:00   0.38s sudo bash
> [root@pc-mtodorov marvin]# uname -rms
> Linux 6.1.0-b6bb9676f216-mglru-kmemlk-kasan+ x86_64
> [root@pc-mtodorov marvin]#
>
> 2. On the Ubuntu 22.10 with Debian build I did not reproduce the error
> thus far.
>
> This looks to me like fixed, but if it doesn't leak anything until
> Thursday morning when I will see this desktop box next time, then
> we'll know with more certainty.

After an inspection in the morning local time and 12:10h uptime, it
appears that the problem is fixed. No chrome-triggered
i915_gem_mmap_offset_ioctl leaks.

After the same amount of uptime, the unpatched kernel had shown about 30
leak instances.

Congratulations!

Kind regards,
Mirsad

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

2022-12-22 00:21:09

by Mirsad Todorovac

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak

On 20. 12. 2022. 20:34, Mirsad Todorovac wrote:
> On 12/20/22 16:52, Tvrtko Ursulin wrote:
>
>> On 20/12/2022 15:22, srinivas pandruvada wrote:
>>> +Added DRM mailing list and maintainers
>>>
>>> On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
>>>> Hi all,
>>>>
>>>> I have been unsuccessful to find any particular Intel i915 maintainer
>>>> emails, so my best bet is to post here, as you will must assuredly
>>>> already know them.
>>
>> For future reference you can use ${kernel_dir}/scripts/get_maintainer.pl -f ...
>>
>>>> The problem is a kernel memory leak that is repeatedly occurring
>>>> triggered during the execution of Chrome browser under the latest
>>>> 6.1.0+
>>>> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
>>>> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>>>>
>>>> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
>>>> build,
>>>> on a vanilla mainline kernel from Mr. Torvalds' tree.
>>>>
>>>> The leaks look like this one:
>>>>
>>>> unreferenced object 0xffff888131754880 (size 64):
>>>>     comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>>>>     hex dump (first 32 bytes):
>>>>       01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> ................
>>>>       00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>>>> ...........>....
>>>>     backtrace:
>>>>       [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>>>>       [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>>       [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>>>>       [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>>>>       [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
>>>> [i915]
>>>>       [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
>>>> [i915]
>>>>       [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>>>>       [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>>>>       [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>>>>       [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>>>>       [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>>
>>>> The complete list of leaks in attachment, but they seem similar or
>>>> the same.
>>>>
>>>> Please find attached lshw and kernel build config file.
>>>>
>>>> I will probably check the same parms on my laptop at home, which is
>>>> also
>>>> Lenovo, but a different hw config and Ubuntu 22.10.
>>
>> Could you try the below patch?
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> index c3ea243d414d..0b07534c203a 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
>>   insert:
>>          mmo = insert_mmo(obj, mmo);
>>          GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
>> -out:
>> +
>>          if (file)
>>                  drm_vma_node_allow(&mmo->vma_node, file);
>> +out:
>>          return mmo;
>>
>>   err:
>>
>> Maybe it is not the best fix but curious to know if it will make the leak go away.
>
> Hi,
>
> After 27 minutes uptime with the patched kernel it looks promising.
> It is much longer than it took for the buggy kernel to leak slabs.
>
> Here is the output:
>
> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root@pc-mtodorov marvin]# cat !$
> cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff888105028d80 (size 16):
>   comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>   hex dump (first 16 bytes):
>     6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>   backtrace:
>     [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>     [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>     [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>     [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>     [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>     [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>     [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>     [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>     [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>     [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>     [<ffffffffb676e556>] worker_thread+0x76/0x770
>     [<ffffffffb677b468>] kthread+0x168/0x1a0
>     [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
> [root@pc-mtodorov marvin]# w
>  20:27:35 up 27 min,  2 users,  load average: 0.83, 1.15, 1.19
> USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
> marvin   tty2     tty2             20:01   27:10  10:12   2.09s /opt/google/chrome/chrome --type=utility --utility-sub-type=audio.m
> marvin   pts/1    -                20:01    0.00s  2:00   0.38s sudo bash
> [root@pc-mtodorov marvin]# uname -rms
> Linux 6.1.0-b6bb9676f216-mglru-kmemlk-kasan+ x86_64
> [root@pc-mtodorov marvin]#

I have heard no reply from Tvrtko, and there is already 1d5h of uptime with no i915 leaks.
(The only remaining report is the kworker memstick_check one, which I couldn't bisect on the
only box that reproduced it, because something in the hw was not supported in pre-4.16
kernels on the Lenovo V530S-07ICB. Or I am doing something wrong.)

However, now I can find the memstick maintainers thanks to Tvrtko's hint.

If you no longer need anything from me, I would consider this closed on my side.

I hope I did not cause too much trouble. The knowledgeable knew that this was not a security
risk, but only a bug. (30 leaks of 64 bytes each were hardly going to exhaust memory in any
realistic time.)

However, having some experience with software development, I have always preferred bugs to be
reported and fixed rather than concealed and lying in wait (or worse, found first by a
motivated adversary). Forgive me this rant. I do not make a living writing kernel drivers;
this is just a pet project for the time being ...

Thanks,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2022-12-22 08:26:45

by Tvrtko Ursulin

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak


On 22/12/2022 00:12, Mirsad Goran Todorovac wrote:
> On 20. 12. 2022. 20:34, Mirsad Todorovac wrote:
>> On 12/20/22 16:52, Tvrtko Ursulin wrote:
>>
>>> On 20/12/2022 15:22, srinivas pandruvada wrote:
>>>> +Added DRM mailing list and maintainers
>>>>
>>>> On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
>>>>> Hi all,
>>>>>
>>>>> I have been unsuccessful to find any particular Intel i915 maintainer
>>>>> emails, so my best bet is to post here, as you will must assuredly
>>>>> already know them.
>>>
>>> For future reference you can use
>>> ${kernel_dir}/scripts/get_maintainer.pl -f ...
>>>
>>>>> The problem is a kernel memory leak that is repeatedly occurring
>>>>> triggered during the execution of Chrome browser under the latest
>>>>> 6.1.0+
>>>>> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
>>>>> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>>>>>
>>>>> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
>>>>> build,
>>>>> on a vanilla mainline kernel from Mr. Torvalds' tree.
>>>>>
>>>>> The leaks look like this one:
>>>>>
>>>>> unreferenced object 0xffff888131754880 (size 64):
>>>>>     comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>>>>>     hex dump (first 32 bytes):
>>>>>       01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>>> ................
>>>>>       00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>>>>> ...........>....
>>>>>     backtrace:
>>>>>       [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>>>>>       [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>>>       [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>>>>>       [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>>>>>       [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
>>>>> [i915]
>>>>>       [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
>>>>> [i915]
>>>>>       [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>>>>>       [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>>>>>       [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>>>>>       [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>>>>>       [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>>>
>>>>> The complete list of leaks in attachment, but they seem similar or
>>>>> the same.
>>>>>
>>>>> Please find attached lshw and kernel build config file.
>>>>>
>>>>> I will probably check the same parms on my laptop at home, which is
>>>>> also
>>>>> Lenovo, but a different hw config and Ubuntu 22.10.
>>>
>>> Could you try the below patch?
>>>
>>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>> b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>> index c3ea243d414d..0b07534c203a 100644
>>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>>> @@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
>>>   insert:
>>>          mmo = insert_mmo(obj, mmo);
>>>          GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
>>> -out:
>>> +
>>>          if (file)
>>>                  drm_vma_node_allow(&mmo->vma_node, file);
>>> +out:
>>>          return mmo;
>>>
>>>   err:
>>>
>>> Maybe it is not the best fix but curious to know if it will make the
>>> leak go away.
>>
>> Hi,
>>
>> After 27 minutes uptime with the patched kernel it looks promising.
>> It is much longer than it took for the buggy kernel to leak slabs.
>>
>> Here is the output:
>>
>> [root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
>> [root@pc-mtodorov marvin]# cat !$
>> cat /sys/kernel/debug/kmemleak
>> unreferenced object 0xffff888105028d80 (size 16):
>>    comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>>    hex dump (first 16 bytes):
>>      6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>>    backtrace:
>>      [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>>      [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>      [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>>      [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>>      [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>>      [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>>      [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>>      [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>>      [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>>      [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>>      [<ffffffffb676e556>] worker_thread+0x76/0x770
>>      [<ffffffffb677b468>] kthread+0x168/0x1a0
>>      [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
>> [root@pc-mtodorov marvin]# w
>>   20:27:35 up 27 min,  2 users,  load average: 0.83, 1.15, 1.19
>> USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
>> marvin   tty2     tty2             20:01   27:10  10:12   2.09s /opt/google/chrome/chrome --type=utility --utility-sub-type=audio.m
>> marvin   pts/1    -                20:01    0.00s  2:00   0.38s sudo bash
>> [root@pc-mtodorov marvin]# uname -rms
>> Linux 6.1.0-b6bb9676f216-mglru-kmemlk-kasan+ x86_64
>> [root@pc-mtodorov marvin]#
>
> As I hear no reply from Tvrtko, and there is already 1d5h uptime with no
> leaks (but
> the kworker with memstick_check nag I couldn't bisect on the only box
> that reproduced it,
> because something in hw was not supported in pre 4.16 kernels on the
> Lenovo V530S-07ICB.
> Or I am doing something wrong.)
>
> However, now I can find the memstick maintainers thanks to Tvrtko's hint.
>
> If you no longer require my service, I would close this on my behalf.
>
> I hope I did not cause too much trouble. The knowledgeable knew that
> this was not a security
> risk, but only a bug. (30 leaks of 64 bytes each were hardly to exhaust
> memory in any realistic
> time.)
>
> However, having some experience with software development, I always
> preferred bugs reported
> and fixed rather than concealed and lying in wait (or worse, found first
> by a motivated
> adversary.) Forgive me this rant, I do not live from writing kernel
> drivers, this is just a
> pet project as of time being ...

It is not forgotten - I was trying to reach out to the original author
of the fixlet which worked for you. If that fails I will take it upon
myself, but I need to set aside some time to get into the exact problem
space before I can vouch for the fix and send it on my own.

In the meantime definitely thanks a lot for testing this quickly and
reporting back!

What will happen next is that when either the original author or I am
ready to send out the fix as a proper patch, you will be copied on it
via the "Reported-by" and possibly "Tested-by" tags. The latter applies
only if the patch remains identical. If it changes, we might kindly ask
you to re-test if possible.

Regards,

Tvrtko

2022-12-22 15:26:29

by Mirsad Todorovac

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak

On 12/22/2022 09:04, Tvrtko Ursulin wrote:

>
> On 22/12/2022 00:12, Mirsad Goran Todorovac wrote:
>> On 20. 12. 2022. 20:34, Mirsad Todorovac wrote:
>>
>> As I hear no reply from Tvrtko, and there is already 1d5h uptime with
>> no leaks (but
>> the kworker with memstick_check nag I couldn't bisect on the only box
>> that reproduced it,
>> because something in hw was not supported in pre 4.16 kernels on the
>> Lenovo V530S-07ICB.
>> Or I am doing something wrong.)
>>
>> However, now I can find the memstick maintainers thanks to Tvrtko's
>> hint.
>>
>> If you no longer require my service, I would close this on my behalf.
>>
>> I hope I did not cause too much trouble. The knowledgeable knew that
>> this was not a security
>> risk, but only a bug. (30 leaks of 64 bytes each were hardly to
>> exhaust memory in any realistic
>> time.)
>>
>> However, having some experience with software development, I always
>> preferred bugs reported
>> and fixed rather than concealed and lying in wait (or worse, found
>> first by a motivated
>> adversary.) Forgive me this rant, I do not live from writing kernel
>> drivers, this is just a
>> pet project as of time being ...

Hi,

> It is not forgotten - I was trying to reach out to the original author
> of the fixlet which worked for you. If that fails I will take it up on
> myself, but need to set aside some time to get into the exact problem
> space before I can vouch for the fix and send it on my own.

That's good news. Possibly with some assistance I could bisect on pre
4.16 kernels with the additional drivers.

> In the meantime definitely thanks a lot for testing this quickly and
> reporting back!

Not at all, I considered it a privilege to assist your team.

> What will happen next is, that when either the original author or
> myself are ready to send out the fix as a proper patch, you will be
> copied on it via the "Reported-by" and possibly "Tested-by" tags.
> Latter is if the patch remains identical. If it changes we might
> kindly ask you to re-test if possible.

I've seen the published patch, and it seems to be the same two-line
change (-1/+1).
If it changes, I will attempt to test it with the same config, setup
and running programs.

I may need to correct myself regarding the security aspect of this
patch, as addressed in commit 786555987207.

QUOTE:

    Currently we create a new mmap_offset for every call to
    mmap_offset_ioctl. This exposes ourselves to an abusive client that may
    simply create new mmap_offsets ad infinitum, which will exhaust physical
    memory and the virtual address space. In addition to the exhaustion, a
    very long linear list of mmap_offsets causes other clients using the
    object to incur long list walks -- these long lists can also be
    generated by simply having many clients generate their own mmap_offset.

It is not obvious whether the bug that caused Chrome to trigger 30
memleaks could be exploited by an abusive script to exhaust larger
parts of kernel memory and possibly crash the kernel.
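
For a rough sense of scale, here is a back-of-the-envelope sketch in
Python. The 64-byte object size and the count of 30 come from the
kmemleak report. The assumption that an abusive client could leak one
object per ioctl call, the 16 GiB of RAM and the rate of 100,000 calls
per second are all hypothetical and not verified against the driver.
Slab and KASAN overhead are ignored.

```python
# Rough estimate only. OBJ_SIZE and OBSERVED_LEAKS come from the kmemleak
# report in this thread; everything else is a hypothetical assumption.
OBJ_SIZE = 64          # bytes per leaked object, as reported by kmemleak
OBSERVED_LEAKS = 30    # leaks seen while running Chrome


def leaked_bytes(count, obj_size=OBJ_SIZE):
    """Total bytes held by `count` leaked objects."""
    return count * obj_size


def calls_to_exhaust(mem_bytes, obj_size=OBJ_SIZE):
    """Leaked objects needed to fill `mem_bytes`, assuming one leak per call."""
    return mem_bytes // obj_size


def seconds_to_exhaust(mem_bytes, calls_per_second, obj_size=OBJ_SIZE):
    """Time to fill `mem_bytes` at a given (hypothetical) leak rate."""
    return calls_to_exhaust(mem_bytes, obj_size) / calls_per_second


if __name__ == "__main__":
    ram = 16 * 2**30   # hypothetical 16 GiB machine
    print("observed leak, bytes:", leaked_bytes(OBSERVED_LEAKS))
    print("objects to fill RAM: ", calls_to_exhaust(ram))
    print("seconds at 100k/s:   ", seconds_to_exhaust(ram, 100_000))
```

Under those assumptions the 30 observed leaks amount to 1920 bytes,
while filling 16 GiB would take 2^28 leaked objects, so the real
question is whether the leak can be driven in a loop.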

Thanks,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
tel. +385 (0)1 3711 451
mob. +385 91 57 88 355

2022-12-23 12:33:22

by Tvrtko Ursulin

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak


On 22/12/2022 15:21, Mirsad Goran Todorovac wrote:
> On 12/22/2022 09:04, Tvrtko Ursulin wrote:
>> On 22/12/2022 00:12, Mirsad Goran Todorovac wrote:
>>> On 20. 12. 2022. 20:34, Mirsad Todorovac wrote:
>>>
>>> As I hear no reply from Tvrtko, and there is already 1d5h uptime with
>>> no leaks (but
>>> the kworker with memstick_check nag I couldn't bisect on the only box
>>> that reproduced it,
>>> because something in hw was not supported in pre 4.16 kernels on the
>>> Lenovo V530S-07ICB.
>>> Or I am doing something wrong.)
>>>
>>> However, now I can find the memstick maintainers thanks to Tvrtko's
>>> hint.
>>>
>>> If you no longer require my service, I would close this on my behalf.
>>>
>>> I hope I did not cause too much trouble. The knowledgeable knew that
>>> this was not a security
>>> risk, but only a bug. (30 leaks of 64 bytes each were hardly to
>>> exhaust memory in any realistic
>>> time.)
>>>
>>> However, having some experience with software development, I always
>>> preferred bugs reported
>>> and fixed rather than concealed and lying in wait (or worse, found
>>> first by a motivated
>>> adversary.) Forgive me this rant, I do not live from writing kernel
>>> drivers, this is just a
>>> pet project as of time being ...
> Hi,
>> It is not forgotten - I was trying to reach out to the original author
>> of the fixlet which worked for you. If that fails I will take it up on
>> myself, but need to set aside some time to get into the exact problem
>> space before I can vouch for the fix and send it on my own.
> That's good news. Possibly with some assistance I could bisect on pre
> 4.16 kernels with the additional drivers.

Sorry, maybe I am confused, but from where does 4.16 come?

>> In the meantime definitely thanks a lot for testing this quickly and
>> reporting back!
> Not at all, I considered it a privilege to assist your team.
>> What will happen next is, that when either the original author or
>> myself are ready to send out the fix as a proper patch, you will be
>> copied on it via the "Reported-by" and possibly "Tested-by" tags.
>> Latter is if the patch remains identical. If it changes we might
>> kindly ask you to re-test if possible.
>
> I've seen the published patch and it seems like the same two lines
> change (-1/+1).
> In case of a change, I will attempt to test with the same config, setup
> and running programs.

Yes it is the same diff so no need to re-test really.

> I may need to correct myself in regard as to security aspect of this
> patch as addressed in 786555987207.
>
> QUOTE:
>
>     Currently we create a new mmap_offset for every call to
>     mmap_offset_ioctl. This exposes ourselves to an abusive client that
> may
>     simply create new mmap_offsets ad infinitum, which will exhaust
> physical
>     memory and the virtual address space. In addition to the exhaustion, a
>     very long linear list of mmap_offsets causes other clients using the
>     object to incur long list walks -- these long lists can also be
>     generated by simply having many clients generate their own
> mmap_offset.
>
> It is unobvious whether the bug that caused chrome to trigger 30
> memleaks could be exploited by an
> abusive script to exhaust larger parts of kernel memory and possibly
> crash the kernel?

Indeed. An attacker's imagination can be pretty impressive, so I'd rather
assume it is exploitable than that it isn't. Luckily it is "just" a
memory leak rather than an information leak or worse. Hopefully we can
merge the fix soon, as soon as a willing reviewer is found.

Regards,

Tvrtko

2022-12-25 21:27:20

by Mirsad Todorovac

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak

On 23. 12. 2022. 13:18, Tvrtko Ursulin wrote:
>

>>> It is not forgotten - I was trying to reach out to the original author of the fixlet which worked for you. If that fails I will
>>> take it up on myself, but need to set aside some time to get into the exact problem space before I can vouch for the fix and send
>>> it on my own.

>> That's good news. Possibly with some assistance I could bisect on pre 4.16 kernels with the additional drivers.
>
> Sorry, maybe I am confused, but from where does 4.16 come?

Sorry, I forgot to mention that I was referring to the memstick_check() memleak in
drivers/memstick/core/memstick.c, which was also discovered with CONFIG_KMEMLEAK=y enabled.

4.16 is the oldest kernel I managed to boot on my Lenovo desktop box, which is the only
machine that reproduced the memstick_check() leak.

Needless to say, this is not an i915-related bug.

Sorry for the imprecision in my earlier paragraph.
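
To keep the two leaks apart when reading a kmemleak report, a small
userspace sketch in Python may help. It assumes the record layout seen
in the reports quoted in this thread ("unreferenced object ... (size N):"
followed by backtrace frames). The helper names parse_records and tally
are made up for illustration, and SAMPLE is an abbreviated copy of two
records from this thread.

```python
import re

# Patterns match the kmemleak record layout quoted in this thread.
HEADER = re.compile(r"unreferenced object 0x[0-9a-f]+ \(size (\d+)\):")
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x")

# Abbreviated records from this thread, for demonstration only.
SAMPLE = """\
unreferenced object 0xffff888131754880 (size 64):
  comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
  backtrace:
    [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
    [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
unreferenced object 0xffff888105028d80 (size 16):
  comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
  backtrace:
    [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
    [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
"""


def parse_records(report):
    """Split a kmemleak report into records with a size and a frame list."""
    records = []
    for line in report.splitlines():
        header = HEADER.search(line)
        if header:
            records.append({"size": int(header.group(1)), "frames": []})
            continue
        frame = FRAME.search(line)
        if frame and records:
            records[-1]["frames"].append(frame.group(1))
    return records


def tally(records, suspect):
    """Return (record count, total bytes) for backtraces naming `suspect`."""
    hits = [r for r in records if suspect in r["frames"]]
    return len(hits), sum(r["size"] for r in hits)


if __name__ == "__main__":
    recs = parse_records(SAMPLE)
    for name in ("drm_vma_node_allow", "memstick_check"):
        print(name, tally(recs, name))
```

On a live system the same functions could be fed the text read (as
root) from /sys/kernel/debug/kmemleak instead of SAMPLE.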

Regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2022-12-25 23:16:19

by Mirsad Todorovac

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak

On 22. 12. 2022. 09:04, Tvrtko Ursulin wrote:
>
> In the meantime definitely thanks a lot for testing this quickly and reporting back!

Don't think much of it - anyone with CONFIG_KMEMLEAK enabled could have caught this bug.

I was surprised that you found the fix in less than an hour without me having to bisect :)

Kind regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2023-01-09 15:23:07

by Tvrtko Ursulin

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak


On 25/12/2022 22:48, Mirsad Goran Todorovac wrote:
> On 22. 12. 2022. 09:04, Tvrtko Ursulin wrote:
>>
>> In the meantime definitely thanks a lot for testing this quickly and
>> reporting back!
>
> Don't think much of it - anyone with CONFIG_KMEMLEAK enabled could have
> caught this bug.
>
> I was surprised that you found the fix in less than an hour without me
> having to bisect :)

Sadly, the fix has a problem handling shared buffers, so a different
version will hopefully appear soon.
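
A toy userspace model in Python can illustrate one possible reading of
the diff quoted earlier in the thread. It is not kernel code and the
interpretation is not confirmed here. It assumes that the allow-list is
reference counted per file, that a file close revokes only once, and
that the tested patch skips the allow step whenever an existing offset
is found. All class and function names are hypothetical.

```python
class VmaNode:
    """Toy allow-list: file tag -> reference count (assumed semantics)."""

    def __init__(self):
        self.allowed = {}

    def allow(self, tag):
        self.allowed[tag] = self.allowed.get(tag, 0) + 1

    def revoke(self, tag):
        count = self.allowed.get(tag, 0)
        if count <= 1:
            self.allowed.pop(tag, None)   # last reference: entry is freed
        else:
            self.allowed[tag] = count - 1

    def is_allowed(self, tag):
        return tag in self.allowed


class Obj:
    """Toy buffer object holding at most one offset node."""

    def __init__(self):
        self.node = None


def mmap_offset(obj, tag, allow_on_lookup):
    """allow_on_lookup=True models the old flow, False the tested patch."""
    if obj.node is None:
        obj.node = VmaNode()
        obj.node.allow(tag)               # new node: always allow the caller
    elif allow_on_lookup:
        obj.node.allow(tag)               # existing node: allow again
    return obj.node


def close(obj, tag):
    """Model of a file close: a single revoke per file (assumption)."""
    if obj.node is not None:
        obj.node.revoke(tag)


if __name__ == "__main__":
    old = Obj()
    for _ in range(3):
        mmap_offset(old, "A", allow_on_lookup=True)
    close(old, "A")
    print("old flow, entries left after close:", old.node.allowed)

    shared = Obj()
    mmap_offset(shared, "A", allow_on_lookup=False)
    mmap_offset(shared, "B", allow_on_lookup=False)
    print("patched flow, second client allowed:", shared.node.is_allowed("B"))
```

In this model the old flow leaves a counted entry behind after close,
which would match the leak, while the patched flow never adds a second
client that reuses an existing offset, which would be one way a shared
buffer could break.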

Regards,

Tvrtko

2023-01-16 06:32:58

by Mirsad Todorovac

Subject: Re: LOOKS GOOD: Possible regression in drm/i915 driver: memleak



On 1/9/23 16:00, Tvrtko Ursulin wrote:
>
> On 25/12/2022 22:48, Mirsad Goran Todorovac wrote:
>> On 22. 12. 2022. 09:04, Tvrtko Ursulin wrote:
>>>
>>> In the meantime definitely thanks a lot for testing this quickly and reporting back!
>>
>> Don't think much of it - anyone with CONFIG_KMEMLEAK enabled could have caught this bug.
>>
>> I was surprised that you found the fix in less than an hour without me having to bisect :)
>
> Fix sadly has a problem handling shared buffers so different version will hopefully appear soon.

Bummer.

I am ready to test the new version in the same environment.

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia