Dear all,
Around Sep 27th 2022 I noticed that a mainline kernel built with
CONFIG_DEBUG_KMEMLEAK=y
reports a memory leak:
sudo cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8881095f3ee0 (size 80):
comm "thermald", pid 837, jiffies 4294896698 (age 9867.428s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 0d 01 2d 00 00 00 00 00 ..........-.....
af 07 01 00 00 c9 ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000b50b9dd6>] kmem_cache_alloc+0x184/0x380
[<00000000fa8428c0>] acpi_os_acquire_object+0x2c/0x32
[<000000002cc0099f>] acpi_ps_alloc_op+0x65/0xe6
[<00000000335faf1b>] acpi_ps_get_next_arg+0x842/0x9ed
[<000000007afa2dee>] acpi_ps_parse_loop+0x718/0xee1
[<0000000010ce490e>] acpi_ps_parse_aml+0x261/0x7b2
[<00000000278d4c5f>] acpi_ps_execute_method+0x360/0x459
[<00000000ff7ad4ba>] acpi_ns_evaluate+0x595/0x810
[<0000000037ce3488>] acpi_evaluate_object+0x28b/0x5b2
[<000000001a800bbf>] acpi_run_osc+0x209/0x3d0
[<00000000776fbd43>] int3400_thermal_run_osc+0xed/0x180 [int3400_thermal]
[<00000000d6ec2302>] current_uuid_store+0x17c/0x1d0 [int3400_thermal]
[<00000000486cf3e6>] dev_attr_store+0x3e/0x60
[<00000000bf193027>] sysfs_kf_write+0x88/0xa0
[<00000000820b5cce>] kernfs_fop_write_iter+0x1c9/0x270
[<0000000062f8d35e>] vfs_write+0x5a5/0x750
Mr. Pandruvada asked me to bisect the bug, so I eventually did.
# first bad commit: [c7ff29763989bd09c433f73fae3c1e1c15d9cda4] thermal: int340x: Update OS policy capability handshake
Here is the git bisect log:
mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
git bisect start
# good: [b6abb62daa5511c4a3eaa30cbdb02544d1f10fa2] Linux 5.15.1
git bisect good b6abb62daa5511c4a3eaa30cbdb02544d1f10fa2
# bad: [e6f4ff3f91251f67b130c29f38673eb5702f88b9] Linux 6.0.3
git bisect bad e6f4ff3f91251f67b130c29f38673eb5702f88b9
# good: [8bb7eca972ad531c9b149c0a51ab43a417385813] Linux 5.15
git bisect good 8bb7eca972ad531c9b149c0a51ab43a417385813
# bad: [1464677662943738741500a6f16b85d36bbde2be] Merge tag 'platform-drivers-x86-v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
git bisect bad 1464677662943738741500a6f16b85d36bbde2be
# good: [8efd0d9c316af470377894a6a0f9ff63ce18c177] Merge tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect good 8efd0d9c316af470377894a6a0f9ff63ce18c177
# good: [aaa25a2fa7964d94690f6de5edd7164ca7d76555] Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect good aaa25a2fa7964d94690f6de5edd7164ca7d76555
# bad: [b4bc93bd76d4da32600795cd323c971f00a2e788] Merge tag 'arm-drivers-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad b4bc93bd76d4da32600795cd323c971f00a2e788
# bad: [ef510682af3dbe2f9cdae7126a1461c94e010967] Merge tag 'f2fs-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
git bisect bad ef510682af3dbe2f9cdae7126a1461c94e010967
# good: [a04b1bf574e1f4875ea91f5c62ca051666443200] Merge tag 'for-5.18/parisc-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect good a04b1bf574e1f4875ea91f5c62ca051666443200
# bad: [b080cee72ef355669cbc52ff55dc513d37433600] Merge tag 'for-5.18/io_uring-statx-2022-03-18' of git://git.kernel.dk/linux-block
git bisect bad b080cee72ef355669cbc52ff55dc513d37433600
# good: [02b82b02c34321dde10d003aafcd831a769b2a8a] Merge tag 'pm-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect good 02b82b02c34321dde10d003aafcd831a769b2a8a
# good: [0e03b8fd29363f2df44e2a7a176d486de550757a] crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST
git bisect good 0e03b8fd29363f2df44e2a7a176d486de550757a
# good: [3e504d2026eb6c8762cd6040ae57db166516824a] random: check for signal and try earlier when generating entropy
git bisect good 3e504d2026eb6c8762cd6040ae57db166516824a
# good: [5e929367468c8f97cd1ffb0417316cecfebef94b] io_uring: terminate manual loop iterator loop correctly for non-vecs
git bisect good 5e929367468c8f97cd1ffb0417316cecfebef94b
# bad: [2d6fc1455f3f383499e013ebc4b19ff49c53c15e] Merge branches 'thermal-powerclamp', 'thermal-int340x' and 'thermal-docs'
git bisect bad 2d6fc1455f3f383499e013ebc4b19ff49c53c15e
# good: [1d6aab36a26ba44b114d7f8a857c430c9e0c32c9] thermal/drivers/ti-soc-thermal: Remove unused function ti_thermal_get_temp()
git bisect good 1d6aab36a26ba44b114d7f8a857c430c9e0c32c9
# bad: [c7ff29763989bd09c433f73fae3c1e1c15d9cda4] thermal: int340x: Update OS policy capability handshake
git bisect bad c7ff29763989bd09c433f73fae3c1e1c15d9cda4
# good: [098c874e20be2a4cee3021aa9b3485ed5e1f4d5b] thermal: Replace acpi_bus_get_device()
git bisect good 098c874e20be2a4cee3021aa9b3485ed5e1f4d5b
# good: [668f69a5f863b877bc3ae129efe9a80b6f055141] thermal: int340x: Increase bitmap size
git bisect good 668f69a5f863b877bc3ae129efe9a80b6f055141
# first bad commit: [c7ff29763989bd09c433f73fae3c1e1c15d9cda4] thermal: int340x: Update OS policy capability handshake
You have new mail in /var/mail/mtodorov
mtodorov@domac:~/linux/kernel/linux_stable$
I was unable to locate the culprit in the patch myself.
Thank you very much for your attention.
I am available for all further questions.
Have a nice day :)
Regards,
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
tel. +385 (0)1 3711 451
mob. +385 91 57 88 355
Hi all,
[The text of my previous message was garbled for an unexplained reason,
so I am submitting the bug report again.]
A bug was discovered in the 6.0-rc3..rc7 kernels on Sep 27th 2022:
a kernel memory leak attributed to thermald, found with a mainline
build with
CONFIG_DEBUG_KMEMLEAK=y.
sudo cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff8881095f3ee0 (size 80):
comm "thermald", pid 837, jiffies 4294896698 (age 9867.428s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 0d 01 2d 00 00 00 00 00 ..........-.....
af 07 01 00 00 c9 ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000b50b9dd6>] kmem_cache_alloc+0x184/0x380
[<00000000fa8428c0>] acpi_os_acquire_object+0x2c/0x32
[<000000002cc0099f>] acpi_ps_alloc_op+0x65/0xe6
[<00000000335faf1b>] acpi_ps_get_next_arg+0x842/0x9ed
[<000000007afa2dee>] acpi_ps_parse_loop+0x718/0xee1
[<0000000010ce490e>] acpi_ps_parse_aml+0x261/0x7b2
[<00000000278d4c5f>] acpi_ps_execute_method+0x360/0x459
[<00000000ff7ad4ba>] acpi_ns_evaluate+0x595/0x810
[<0000000037ce3488>] acpi_evaluate_object+0x28b/0x5b2
[<000000001a800bbf>] acpi_run_osc+0x209/0x3d0
[<00000000776fbd43>] int3400_thermal_run_osc+0xed/0x180 [int3400_thermal]
[<00000000d6ec2302>] current_uuid_store+0x17c/0x1d0 [int3400_thermal]
[<00000000486cf3e6>] dev_attr_store+0x3e/0x60
[<00000000bf193027>] sysfs_kf_write+0x88/0xa0
[<00000000820b5cce>] kernfs_fop_write_iter+0x1c9/0x270
[<0000000062f8d35e>] vfs_write+0x5a5/0x750
Mr. Pandruvada requested a bisection of the bug, so I eventually did one.
The culprit seems to be the following commit (but I am unable to spot
the bug in it):
# first bad commit: [c7ff29763989bd09c433f73fae3c1e1c15d9cda4] thermal:
int340x: Update OS policy capability handshake
mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
git bisect start
# good: [b6abb62daa5511c4a3eaa30cbdb02544d1f10fa2] Linux 5.15.1
git bisect good b6abb62daa5511c4a3eaa30cbdb02544d1f10fa2
# bad: [e6f4ff3f91251f67b130c29f38673eb5702f88b9] Linux 6.0.3
git bisect bad e6f4ff3f91251f67b130c29f38673eb5702f88b9
# good: [8bb7eca972ad531c9b149c0a51ab43a417385813] Linux 5.15
git bisect good 8bb7eca972ad531c9b149c0a51ab43a417385813
# bad: [1464677662943738741500a6f16b85d36bbde2be] Merge tag
'platform-drivers-x86-v5.18-1' of
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
git bisect bad 1464677662943738741500a6f16b85d36bbde2be
# good: [8efd0d9c316af470377894a6a0f9ff63ce18c177] Merge tag
'5.17-net-next' of
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect good 8efd0d9c316af470377894a6a0f9ff63ce18c177
# good: [aaa25a2fa7964d94690f6de5edd7164ca7d76555] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect good aaa25a2fa7964d94690f6de5edd7164ca7d76555
# bad: [b4bc93bd76d4da32600795cd323c971f00a2e788] Merge tag
'arm-drivers-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect bad b4bc93bd76d4da32600795cd323c971f00a2e788
# bad: [ef510682af3dbe2f9cdae7126a1461c94e010967] Merge tag
'f2fs-for-5.18' of
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
git bisect bad ef510682af3dbe2f9cdae7126a1461c94e010967
# good: [a04b1bf574e1f4875ea91f5c62ca051666443200] Merge tag
'for-5.18/parisc-1' of
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect good a04b1bf574e1f4875ea91f5c62ca051666443200
# bad: [b080cee72ef355669cbc52ff55dc513d37433600] Merge tag
'for-5.18/io_uring-statx-2022-03-18' of git://git.kernel.dk/linux-block
git bisect bad b080cee72ef355669cbc52ff55dc513d37433600
# good: [02b82b02c34321dde10d003aafcd831a769b2a8a] Merge tag
'pm-5.18-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect good 02b82b02c34321dde10d003aafcd831a769b2a8a
# good: [0e03b8fd29363f2df44e2a7a176d486de550757a] crypto: xilinx - Turn
SHA into a tristate and allow COMPILE_TEST
git bisect good 0e03b8fd29363f2df44e2a7a176d486de550757a
# good: [3e504d2026eb6c8762cd6040ae57db166516824a] random: check for
signal and try earlier when generating entropy
git bisect good 3e504d2026eb6c8762cd6040ae57db166516824a
# good: [5e929367468c8f97cd1ffb0417316cecfebef94b] io_uring: terminate
manual loop iterator loop correctly for non-vecs
git bisect good 5e929367468c8f97cd1ffb0417316cecfebef94b
# bad: [2d6fc1455f3f383499e013ebc4b19ff49c53c15e] Merge branches
'thermal-powerclamp', 'thermal-int340x' and 'thermal-docs'
git bisect bad 2d6fc1455f3f383499e013ebc4b19ff49c53c15e
# good: [1d6aab36a26ba44b114d7f8a857c430c9e0c32c9]
thermal/drivers/ti-soc-thermal: Remove unused function ti_thermal_get_temp()
git bisect good 1d6aab36a26ba44b114d7f8a857c430c9e0c32c9
# bad: [c7ff29763989bd09c433f73fae3c1e1c15d9cda4] thermal: int340x:
Update OS policy capability handshake
git bisect bad c7ff29763989bd09c433f73fae3c1e1c15d9cda4
# good: [098c874e20be2a4cee3021aa9b3485ed5e1f4d5b] thermal: Replace
acpi_bus_get_device()
git bisect good 098c874e20be2a4cee3021aa9b3485ed5e1f4d5b
# good: [668f69a5f863b877bc3ae129efe9a80b6f055141] thermal: int340x:
Increase bitmap size
git bisect good 668f69a5f863b877bc3ae129efe9a80b6f055141
# first bad commit: [c7ff29763989bd09c433f73fae3c1e1c15d9cda4] thermal:
int340x: Update OS policy capability handshake
You have new mail in /var/mail/mtodorov
mtodorov@domac:~/linux/kernel/linux_stable$
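For readers unfamiliar with the procedure, the log above follows a binary search between a known-good and a known-bad revision. Below is a minimal sketch in C, assuming a linear history indexed by position (real git history is a graph, which git bisect handles itself). The names `first_bad`, `is_bad_fn` and `bad_from` are hypothetical and are not part of git.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test callback: returns nonzero if revision `idx` shows the bug. */
typedef int (*is_bad_fn)(size_t idx, void *ctx);

/*
 * Simplified bisection over a linear history.
 * Assumes revision `good` is known good, revision `bad` is known bad,
 * good < bad, and every revision after the first bad one is also bad.
 * Returns the index of the first bad revision.
 */
static size_t first_bad(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
	assert(good < bad);
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* corresponds to "git bisect bad" */
		else
			good = mid;	/* corresponds to "git bisect good" */
	}
	return bad;
}

/* Example callback: the bug is present from the revision stored in *ctx on. */
static int bad_from(size_t idx, void *ctx)
{
	return idx >= *(size_t *)ctx;
}
```

Each loop iteration halves the remaining range, which is why a range of many thousands of commits between 5.15 and 6.0.3 needed only a handful of test builds.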
I am available for any further questions or issues on the bug.
Have a nice day :)
Regards,
Mirsad
On 10/24/2022 3:13 PM, Mirsad Goran Todorovac wrote:
> ...
>
Hi Mirsad,
Thanks for the bisect.
On Mon, 2022-10-24 at 15:13 +0200, Mirsad Goran Todorovac wrote:
> Dear all,
>
> Around Sep 27th 2022 I've noticed in a mainline kernel built with
> CONFIG_DEBUG_KMEMLEAK=y
> that there actually is a leak:
>
> > sudo cat /sys/kernel/debug/kmemleak unreferenced object
> 0xffff8881095f3ee0 (size 80): comm "thermald", pid 837, jiffies
> 4294896698 (age 9867.428s) hex dump (first 32 bytes): 00 00 00 00 00
> 00
> 00 00 0d 01 2d 00 00 00 00 00 ..........-..... af 07 01 00 00 c9 ff
> ff
> 00 00 00 00 00 00 00 00 ................ backtrace:
> [<00000000b50b9dd6>]
> kmem_cache_alloc+0x184/0x380 [<00000000fa8428c0>]
> acpi_os_acquire_object+0x2c/0x32 [<000000002cc0099f>]
> acpi_ps_alloc_op+0x65/0xe6 [<00000000335faf1b>]
> acpi_ps_get_next_arg+0x842/0x9ed [<000000007afa2dee>]
> acpi_ps_parse_loop+0x718/0xee1 [<0000000010ce490e>]
> acpi_ps_parse_aml+0x261/0x7b2 [<00000000278d4c5f>]
> acpi_ps_execute_method+0x360/0x459 [<00000000ff7ad4ba>]
> acpi_ns_evaluate+0x595/0x810 [<0000000037ce3488>]
> acpi_evaluate_object+0x28b/0x5b2 [<000000001a800bbf>]
> acpi_run_osc+0x209/0x3d0 [<00000000776fbd43>]
> int3400_thermal_run_osc+0xed/0x180 [int3400_thermal]
> [<00000000d6ec2302>] current_uuid_store+0x17c/0x1d0 [int3400_thermal]
> [<00000000486cf3e6>] dev_attr_store+0x3e/0x60 [<00000000bf193027>]
> sysfs_kf_write+0x88/0xa0 [<00000000820b5cce>]
> kernfs_fop_write_iter+0x1c9/0x270 [<0000000062f8d35e>]
> vfs_write+0x5a5/0x750 Mr. Pandruvada required a bug bisect from me,
> so I
> have eventually made one. # first bad commit:
> [c7ff29763989bd09c433f73fae3c1e1c15d9cda4] thermal: int340x: Update
> OS
The bisect points to this patch because it is the first one to call
acpi_run_osc() in response to thermald requests.
But looking at the code, it frees the memory allocated by the
acpi_run_osc() call chain, just like any other caller.
	status = acpi_run_osc(handle, &context);
	if (ACPI_SUCCESS(status)) {
		ret = *((u32 *)(context.ret.pointer + 4));
		if (ret != *enable)
			result = -EPERM;
		kfree(context.ret.pointer);
	} else
		result = -EPERM;
As at other call sites, there is no kfree() when the call fails.
I think the call is failing on your system; you can search for "_OSC" in dmesg.
On some Dell systems this _OSC setting fails because of a BIOS issue.
Maybe you are hitting that case.
Just as a test, please apply the diff and see if the issue
is gone.
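To make the check in the snippet above concrete, here is a small userspace sketch in C. It assumes the returned buffer holds at least two 32-bit words and that the second one (byte offset 4) is the value the snippet compares against the requested setting. The helper name `check_osc_reply` is hypothetical; this is an illustration, not kernel code and not the diff mentioned above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace illustration only (hypothetical helper, not kernel code).
 * Mirrors the comparison in the snippet: read the 32-bit word at byte
 * offset 4 of the reply and compare it with the requested value.
 * Returns 0 on a match, -EPERM on a mismatch, and -EINVAL if the reply
 * is too short to contain that word.
 */
static int check_osc_reply(const void *reply, size_t len, uint32_t requested)
{
	uint32_t granted;

	assert(reply != NULL);	/* a NULL reply is a caller bug in this sketch */
	if (len < 4 + sizeof(granted))
		return -EINVAL;

	/* memcpy avoids unaligned reads and arithmetic on a void pointer. */
	memcpy(&granted, (const unsigned char *)reply + 4, sizeof(granted));

	return granted == requested ? 0 : -EPERM;
}
```

The length check is an addition of this sketch; the snippet above relies on the call having succeeded before it dereferences the buffer.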
Thanks,
Srinivas
Hi Srinivas,
On 24. 10. 2022. 17:51, srinivas pandruvada wrote:
> Hi Mirsad,
>
> Thanks for the bisect.
>
> This will say this patch as this patch is calling acpi_run_osc in
> response to thermald calls for the first time.
>
> But looking at code, this is freeing the memory allocated by
> acpi_run_osc() call chain as any other caller.
>
> status = acpi_run_osc(handle, &context);
> if (ACPI_SUCCESS(status)) {
> ret = *((u32 *)(context.ret.pointer + 4));
> if (ret != *enable)
> result = -EPERM;
>
> kfree(context.ret.pointer);
> } else
> result = -EPERM;
>
> There is no kfree when call failed as at other places.
> I think you are failing, you can search for "_OSC" in dmesg.
> On some Dell systems this OSC setting fails because of some BIOS issue.
> May be you are hitting that case.
> Just for the sake of test, please apply the diff and see if the issue
> is gone.
Thank you for the patch. Unfortunately, when applied to v6.0.3 it didn't
fix the issue.
marvin@marvin-IdeaPad-3-15ITL6:~$ uname -rms
Linux 6.0.3-18-fix01-mlk+ x86_64
marvin@marvin-IdeaPad-3-15ITL6:~$ sudo bash
[sudo] password for marvin:
root@marvin-IdeaPad-3-15ITL6:/home/marvin# cat /sys/kernel/debug/kmemleak
root@marvin-IdeaPad-3-15ITL6:/home/marvin# echo scan > /sys/kernel/debug/kmemleak
root@marvin-IdeaPad-3-15ITL6:/home/marvin# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff998b030c3370 (size 80):
comm "thermald", pid 824, jiffies 4294893654 (age 67.080s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 0d 01 2d 00 00 00 00 00 ..........-.....
af 07 01 c0 6f bc ff ff 00 00 00 00 00 00 00 00 ....o...........
backtrace:
[<00000000490225c2>] slab_post_alloc_hook+0x80/0x2e0
[<00000000dc142b33>] kmem_cache_alloc+0x166/0x2e0
[<00000000168f1071>] acpi_os_acquire_object+0x2c/0x32
[<00000000fcc615e1>] acpi_ps_alloc_op+0x4a/0x99
[<00000000fb475bb4>] acpi_ps_get_next_arg+0x611/0x761
[<000000009048d529>] acpi_ps_parse_loop+0x494/0x8d7
[<000000005b0bf086>] acpi_ps_parse_aml+0x1bb/0x561
[<000000007ab7e288>] acpi_ps_execute_method+0x20f/0x2d5
[<00000000c12fa6b7>] acpi_ns_evaluate+0x34d/0x4f3
[<000000001be94719>] acpi_evaluate_object+0x180/0x3ae
[<00000000423a7ad5>] acpi_run_osc+0x128/0x250
[<0000000040a72af8>] int3400_thermal_run_osc+0x6f/0xc0 [int3400_thermal]
[<00000000f8d59987>] current_uuid_store+0xe3/0x120 [int3400_thermal]
[<000000007e2e2d17>] dev_attr_store+0x14/0x30
[<00000000b824b589>] sysfs_kf_write+0x38/0x50
[<00000000beae69c1>] kernfs_fop_write_iter+0x146/0x1d0
root@marvin-IdeaPad-3-15ITL6:/home/marvin#
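When comparing kernels during a bisect, it can help to reduce the report to a single number. The following C sketch counts records in text read from /sys/kernel/debug/kmemleak by looking for lines that begin with "unreferenced object", the record header seen in the reports in this thread. The function name is hypothetical, and the sketch works on an in-memory string rather than opening the file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count kmemleak records in `text` (hypothetical helper).
 * A record is assumed to start with a line beginning with
 * "unreferenced object", as in the reports quoted in this thread.
 */
static size_t count_kmemleak_records(const char *text)
{
	static const char header[] = "unreferenced object";
	size_t count = 0;
	const char *line = text;

	assert(text != NULL);
	while (*line != '\0') {
		if (strncmp(line, header, sizeof(header) - 1) == 0)
			count++;
		/* Advance to the start of the next line, if there is one. */
		line = strchr(line, '\n');
		if (line == NULL)
			break;
		line++;
	}
	return count;
}
```

A count of zero after "echo scan" would mark a kernel as good for bisection purposes; any other value would mark it as bad, assuming no unrelated leaks are reported.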
The build process was as follows:
1573 10/24/2022 06:41:53 PM cd linux_stable
1574 10/24/2022 06:42:03 PM git checkout v6.0.3
1575 10/24/2022 06:42:44 PM cd ..
1576 10/24/2022 06:42:50 PM time rm -rf linux_stable_build; time cp -rp linux_stable linux_stable_build; \
time diff -ur linux_stable linux_stable_build; cd linux_stable_build
1577 10/24/2022 06:46:19 PM git apply ../thermald-20221024-01.diff
1578 10/24/2022 06:46:28 PM vi ../config-5.15.0-50-memleak
1579 10/24/2022 06:47:08 PM cp ../config-5.15.0-50-memleak .config
1580 10/24/2022 06:47:16 PM make olddefconfig
1581 10/24/2022 06:48:42 PM time nice make CC="ccache gcc" KBUILD_BUILD_TIMESTAMP="" -j10 deb-pkg; date
I think your patch definitely makes sense, but there's more to this
than meets the eye :-/
Hope this helps.
Thanks
Mirsad
On Mon, 2022-10-24 at 20:34 +0200, Mirsad Goran Todorovac wrote:
> Hi Srinivas,
>
> On 24. 10. 2022. 17:51, srinivas pandruvada wrote:
> > Hi Mirsad,
> >
> > Thanks for the bisect.
> >
> > This will say this patch as this patch is calling acpi_run_osc in
> > response to thermald calls for the first time.
> >
> > But looking at code, this is freeing the memory allocated by
> > acpi_run_osc() call chain as any other caller.
> >
> > status = acpi_run_osc(handle, &context);
> > if (ACPI_SUCCESS(status)) {
> > ret = *((u32 *)(context.ret.pointer + 4));
> > if (ret != *enable)
> > result = -EPERM;
> >
> > kfree(context.ret.pointer);
> > } else
> > result = -EPERM;
> >
> > There is no kfree when call failed as at other places.
> > I think you are failing, you can search for "_OSC" in dmesg.
> > On some Dell systems this OSC setting fails because of some BIOS
> > issue.
> > May be you are hitting that case.
> > Just for the sake of test, please apply the diff and see if the
> > issue
> > is gone.
>
> Thank you for the patch. Unfortunately, when applied to v6.0.3 it
> didn't
> fix the issue.
Thanks for the test. I have copied the acpi and acpica mailing lists.
Someone there may be able to tell us what this call is doing wrong here.
Thanks,
Srinivas
On 24. 10. 2022. 20:39, srinivas pandruvada wrote:
>> Thank you for the patch. Unfortunately, when applied to v6.0.3 it
>> didn't
>> fix the issue.
> Thanks for the test. I copied to acpi and acpica mailing list. Someone
> can tell us what is this call doing wrong here.
That seems like a prudent thing to do. It must be hard work to support
all of the hardware on the market ...
Maybe this will help (note, however, that the dmesg -l err output was the
same in the "git bisect good" and "git bisect bad" kernels!):
root@marvin-IdeaPad-3-15ITL6:~# dmesg -l err
[ 0.121673] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0], AE_NOT_FOUND (20220331/dswload2-163)
[ 0.121688] ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20220331/psobject-221)
[ 0.142742] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.DGPV], AE_NOT_FOUND (20220331/psargs-330)
[ 0.142751] ACPI Error: Aborting method \_SB.PC00.PEG0.PCRP._ON due to previous error (AE_NOT_FOUND) (20220331/psparse-531)
[ 0.308625] integrity: Problem loading X.509 certificate -65
[ 2.731846] mtd device must be supplied (device name is empty)
[ 3.226997] i801_smbus 0000:00:1f.4: Transaction timeout
[ 3.229085] i801_smbus 0000:00:1f.4: Failed terminating the transaction
[ 3.229194] i801_smbus 0000:00:1f.4: SMBus is busy, can't use it!
[ 3.515909] mtd device must be supplied (device name is empty)
[ 4.600624] ACPI BIOS Error (bug): Could not resolve symbol [\_TZ.ETMD], AE_NOT_FOUND (20220331/psargs-330)
[ 4.600741] ACPI Error: Aborting method \_SB.IETM._OSC due to previous error (AE_NOT_FOUND) (20220331/psparse-531)
[ 5.110999] Bluetooth: hci0: Malformed MSFT vendor event: 0x02
[ 5.173006] Bluetooth: hci0: HCI_REQ-0xfc1e
root@marvin-IdeaPad-3-15ITL6:~# dmesg | grep _OSC
[ 0.131652] ACPI: \_SB_.PR00: _OSC native thermal LVT Acked
[ 0.167416] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
[ 0.169119] acpi PNP0A08:00: _OSC: platform does not support [AER]
[ 0.172500] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
[ 4.600655] No Local Variables are initialized for Method [_OSC]
[ 4.600660] Initialized Arguments for Method [_OSC]: (4 arguments defined for method invocation)
[ 4.600741] ACPI Error: Aborting method \_SB.IETM._OSC due to previous error (AE_NOT_FOUND) (20220331/psparse-531)
root@marvin-IdeaPad-3-15ITL6:~#
>> marvin@marvin-IdeaPad-3-15ITL6:~$ uname -rms
>> Linux 6.0.3-18-fix01-mlk+ x86_64
>> marvin@marvin-IdeaPad-3-15ITL6:~$ sudo bash
>> [sudo] password for marvin:
>> root@marvin-IdeaPad-3-15ITL6:/home/marvin# cat
>> /sys/kernel/debug/kmemleak
>> root@marvin-IdeaPad-3-15ITL6:/home/marvin# echo scan >
>> /sys/kernel/debug/kmemleak
>> root@marvin-IdeaPad-3-15ITL6:/home/marvin# cat
>> /sys/kernel/debug/kmemleak
>> unreferenced object 0xffff998b030c3370 (size 80):
>> comm "thermald", pid 824, jiffies 4294893654 (age 67.080s)
>> hex dump (first 32 bytes):
>> 00 00 00 00 00 00 00 00 0d 01 2d 00 00 00 00 00 ..........-.....
>> af 07 01 c0 6f bc ff ff 00 00 00 00 00 00 00 00 ....o...........
>> backtrace:
>> [<00000000490225c2>] slab_post_alloc_hook+0x80/0x2e0
>> [<00000000dc142b33>] kmem_cache_alloc+0x166/0x2e0
>> [<00000000168f1071>] acpi_os_acquire_object+0x2c/0x32
>> [<00000000fcc615e1>] acpi_ps_alloc_op+0x4a/0x99
>> [<00000000fb475bb4>] acpi_ps_get_next_arg+0x611/0x761
>> [<000000009048d529>] acpi_ps_parse_loop+0x494/0x8d7
>> [<000000005b0bf086>] acpi_ps_parse_aml+0x1bb/0x561
>> [<000000007ab7e288>] acpi_ps_execute_method+0x20f/0x2d5
>> [<00000000c12fa6b7>] acpi_ns_evaluate+0x34d/0x4f3
>> [<000000001be94719>] acpi_evaluate_object+0x180/0x3ae
>> [<00000000423a7ad5>] acpi_run_osc+0x128/0x250
>> [<0000000040a72af8>] int3400_thermal_run_osc+0x6f/0xc0 [int3400_thermal]
>> [<00000000f8d59987>] current_uuid_store+0xe3/0x120 [int3400_thermal]
>> [<000000007e2e2d17>] dev_attr_store+0x14/0x30
>> [<00000000b824b589>] sysfs_kf_write+0x38/0x50
>> [<00000000beae69c1>] kernfs_fop_write_iter+0x146/0x1d0
>> root@marvin-IdeaPad-3-15ITL6:/home/marvin#
>>
>> The build process was as follows:
>>
>> 1573 10/24/2022 06:41:53 PM cd linux_stable
>> 1574 10/24/2022 06:42:03 PM git checkout v6.0.3
>> 1575 10/24/2022 06:42:44 PM cd ..
>> 1576 10/24/2022 06:42:50 PM time rm -rf linux_stable_build; time cp -rp linux_stable linux_stable_build; \
>> time diff -ur linux_stable linux_stable_build; cd linux_stable_build
>> 1577 10/24/2022 06:46:19 PM git apply ../thermald-20221024-01.diff
>> 1578 10/24/2022 06:46:28 PM vi ../config-5.15.0-50-memleak
>> 1579 10/24/2022 06:47:08 PM cp ../config-5.15.0-50-memleak .config
>> 1580 10/24/2022 06:47:16 PM make olddefconfig
>> 1581 10/24/2022 06:48:42 PM time nice make CC="ccache gcc" KBUILD_BUILD_TIMESTAMP="" -j10 deb-pkg; date
>>
>> I think your patch definitely makes sense, but there's more to this
>> than meets the eye :-/
>>
>> Hope this helps.
>>
>> Thanks
>> Mirsad
>>
>> --
>> Mirsad Goran Todorovac
>> Sistem inženjer
>> Grafički fakultet | Akademija likovnih umjetnosti
>> Sveučilište u Zagrebu
--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union
[Note: this mail is primarily sent for documentation purposes and/or for
regzbot, my Linux kernel regression tracking bot. That's why I removed
most or all folks from the list of recipients, but left any that looked
like mailing lists. These mails usually contain '#forregzbot' in the
subject, to make them easy to spot and filter out.]
[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]
Hi, this is your Linux kernel regression tracker.
On 24.10.22 15:13, Mirsad Goran Todorovac wrote:
> Dear all,
>
> Around Sep 27th 2022 I've noticed in a mainline kernel built with
> CONFIG_DEBUG_KMEMLEAK=y
> that there actually is a leak:
Thanks for the report. To make sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:
#regzbot ^introduced c7ff29763989bd
#regzbot title thermald regression (MEMLEAK)
#regzbot ignore-activity
This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/
Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the webpage above explains why this is
important for tracked regressions.
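As an illustration of the 'Link:' tag mentioned above, here is a minimal sketch that formats such a trailer from a report's Message-ID. It assumes the lore.kernel.org redirector URL form; the function name link_trailer and the Message-ID used in the note below are hypothetical placeholders, not taken from this thread.

```shell
# Print a 'Link:' trailer for a commit message, given the Message-ID
# of the mail that reported the regression (without angle brackets).
# Assumption: the report is archived on lore.kernel.org.
link_trailer() {
    printf 'Link: https://lore.kernel.org/r/%s/\n' "$1"
}
```

For a placeholder Message-ID such as report-id@example.com, link_trailer report-id@example.com prints a single line that can be pasted at the end of the fix's commit message.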
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.
On 10/26/2022 10:08 AM, Thorsten Leemhuis wrote:
> Thanks for the report. To make sure the issue below doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced c7ff29763989bd
> #regzbot title thermald regression (MEMLEAK)
> #regzbot ignore-activity
>
> This isn't a regression? This issue or a fix for it are already
> discussed somewhere else? It was fixed already? You want to clarify when
> the regression started to happen? Or point out I got the title or
> something else totally wrong? Then just reply -- ideally with also
> telling regzbot about it, as explained here:
> https://linux-regtracking.leemhuis.info/tracked-regression/
You're welcome, no thanks needed.
Is this really a regression? I can't tell whether this is a one-time memory
leak in thermald or whether it can be exploited to leak memory in a loop,
exhausting kernel memory and causing a denial of service or a kernel crash.
Thanks,
Mirsad
Dear all,
On 24. 10. 2022. 20:56, Mirsad Goran Todorovac wrote:
> On 24. 10. 2022. 20:39, srinivas pandruvada wrote:
>
>>> Thank you for the patch. Unfortunately, when applied to v6.0.3 it
>>> didn't
>>> fix the issue.
>> Thanks for the test. I copied to acpi and acpica mailing list. Someone
>> can tell us what is this call doing wrong here.
I have worse news: after every
# systemctl stop thermald
# systemctl start thermald
the number of leaks increases by one allocated block (apparently 80
bytes). The effect appears to be cumulative.
Please find the results of the MEMLEAK scan in the attachment.
In theory, a motivated adversary could exhaust kernel memory this way. At 80
bytes per cycle, a loop of 10 million thermald stops/starts leaks about 800 MB
(exhausting e.g. 8 GiB would take roughly 100 million cycles). On my laptop,
at 2 sec per stop+start, 10 million cycles would take approx. 230 days.
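The estimate above can be checked with shell integer arithmetic. This is a minimal sketch, assuming the figures quoted in this thread (80 bytes leaked per stop/start cycle, 2 seconds per cycle); the function names are hypothetical, and the real leak rate per cycle may differ.

```shell
# Assumptions taken from this thread, not measured here:
leak_bytes_per_cycle=80   # one 80-byte object leaked per stop/start
seconds_per_cycle=2       # time for one stop+start on the reporter's laptop

# Bytes leaked after $1 cycles.
leaked_bytes() {
    echo $(( $1 * leak_bytes_per_cycle ))
}

# Cycles needed to leak $1 bytes (rounded up).
cycles_to_leak() {
    echo $(( ($1 + leak_bytes_per_cycle - 1) / leak_bytes_per_cycle ))
}

# Whole days needed to run $1 cycles (rounded down).
days_for_cycles() {
    echo $(( $1 * seconds_per_cycle / 86400 ))
}
```

With these assumptions, leaked_bytes 10000000 prints 800000000 (about 800 MB), cycles_to_leak 8589934592 (8 GiB) prints 107374183, and days_for_cycles 10000000 prints 231.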
Hope this helps.
Mirsad
On Wed, 2022-10-26 at 19:52 +0200, Mirsad Goran Todorovac wrote:
> Dear all,
>
> On 24. 10. 2022. 20:56, Mirsad Goran Todorovac wrote:
> > On 24. 10. 2022. 20:39, srinivas pandruvada wrote:
> >
> > > > Thank you for the patch. Unfortunately, when applied to v6.0.3
> > > > it
> > > > didn't
> > > > fix the issue.
> > > Thanks for the test. I copied to acpi and acpica mailing list.
> > > Someone
> > > can tell us what is this call doing wrong here.
>
> I have worse news: after every
>
> # systemctl stop thermald
> # systemctl start thermald
>
> the number of leaks increases by one allocated block (apparently 80
> bytes). The effect appears to be cumulative.
>
> Please find the results of the MEMLEAK scan in the attachment.
>
> In theory, a motivated adversary could exhaust kernel memory this way. At 80
> bytes per cycle, a loop of 10 million thermald stops/starts leaks about 800 MB
> (exhausting e.g. 8 GiB would take roughly 100 million cycles).
Of course it needs to be debugged. To start/stop a service with systemctl you
need root access. If you have root access, there are other, worse things
that can be done.
Thanks,
Srinivas
On 27. 10. 2022. 00:48, srinivas pandruvada wrote:
> Of course it needs to be debugged. To start/stop a service with systemctl you
> need root access. If you have root access, there are other, worse things
> that can be done.
Indeed.
About fixing the bug: I tried to look at the source, but it is beyond my
abilities. I don't really have experience in debugging kernel drivers.
However, I've noticed something else.
The command:
for a in {1..100}; do
    echo "$a"
    systemctl stop thermald
    sleep 1
    systemctl start thermald
    sleep 1
done
... produces 180 memory leaks, listed in attachment 1. Apparently, most of
the thermald processes created 2 leaks each.
This differs from a test loop of 1000 iterations without the sleep commands,
which created only a handful of memleaks.
This sounds like a race condition rather than a deterministic bug.
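One way to see whether the leak count depends on timing would be to record the kmemleak count after every cycle instead of only at the end. A minimal sketch, assuming root access, a kernel built with CONFIG_DEBUG_KMEMLEAK=y and debugfs mounted at /sys/kernel/debug; count_leaks and cycle_and_count are hypothetical helper names, not part of thermald or the kernel.

```shell
# Count leak records in kmemleak-format text read from stdin.
# Each record starts with a line beginning "unreferenced object".
count_leaks() {
    grep -c '^unreferenced object' || true
}

# Run one stop/start cycle with a pause of $1 seconds (default 1)
# after each step, trigger a scan, and print the current leak count.
# Must be run as root; nothing is executed until this is called.
cycle_and_count() {
    systemctl stop thermald
    sleep "${1:-1}"
    systemctl start thermald
    sleep "${1:-1}"
    echo scan > /sys/kernel/debug/kmemleak
    count_leaks < /sys/kernel/debug/kmemleak
}
```

Calling cycle_and_count 1 and cycle_and_count 0 in separate runs would give per-cycle counts for the two cases compared above. Note that kmemleak may not report a freshly leaked object right away (it can take some seconds and more than one scan), so the counts may lag behind the cycles; that possibility should be ruled out before concluding there is a race.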
The leak occurs on the machine the bisect was done on: Lenovo IdeaPad 3 |
Intel Core i5 | Ubuntu 22.04. The output of lshw might be useful, so I have
attached it as attachment 2.
This came to me as I was riding my bike home, but other than this I have run
out of ideas for how to help you with debugging.
Thanks.
P.S.
I forgot another useful thing you mentioned: please find attached the
dmesg output.
Good luck!
Mirsad
On 26.10.22 14:32, Mirsad Goran Todorovac wrote:
> You're welcome, no thanks needed.
>
> Is this really a regression? I can't tell whether this is a one-time memory
> leak in thermald or whether it can be exploited to leak memory in a loop,
> exhausting kernel memory and causing a denial of service or a kernel crash.
#regzbot inconclusive: likely wasn't a regression