2019-01-18 15:39:33

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 18/01/2019 15:34, Catalin Marinas wrote:

> On Fri, Jan 18, 2019 at 02:36:46PM +0100, Marc Gonzalez wrote:
>
>> Trying to diagnose a separate issue, I enabled a raft of debugging options,
>> including kmemleak. However, it looks like kmemleak itself is crashing?
>>
>> We seem to be crashing on this code:
>>
>> kasan_disable_current();
>> pointer = *ptr;
>> kasan_enable_current();
>
> There was another regression reported recently:
>
> http://lkml.kernel.org/r/[email protected]
>
> See if reverting commit 9f1eb38e0e113 (mm, kmemleak: little optimization while
> scanning) fixes it.
>

[Drop LAKML, add LKML, add recipients]

Bug is easy to reproduce:

boot
mount -t debugfs nodev /sys/kernel/debug/
echo scan > /sys/kernel/debug/kmemleak

Unable to handle kernel paging request at virtual address ffffffc021e00000
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 635 Comm: exe Not tainted 5.0.0-rc1 #16
Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : scan_block+0x70/0x190
lr : scan_block+0x6c/0x190
sp : ffffff80133b3b20
x29: ffffff80133b3b20 x28: ffffffc0fdbaf018
x27: ffffffc022000000 x26: 0000000000000080
x25: ffffff80118bdf70 x24: ffffffc0f8cc8000
x23: ffffff8010bd8000 x22: ffffff8010bd8830
x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
x19: ffffffc021e00000 x18: 00000000000025fd
x17: 0000000000000200 x16: 0000000000000000
x15: ffffff8010c24dd8 x14: 00000000000025f9
x13: 00000000445b0e6c x12: ffffffc0f5a96658
x11: 0000000000000001 x10: ffffff8010bae688
x9 : ffffff8010baf000 x8 : ffffff8010bae688
x7 : 0000000000000003 x6 : 0000000000000000
x5 : ffffff801133ad60 x4 : 0000000000002878
x3 : ffffff8010c24d88 x2 : 7c512d102eca1300
x1 : ffffffc0f5a81b00 x0 : 0000000000000000
Process exe (pid: 635, stack limit = 0x(____ptrval____))
Call trace:
scan_block+0x70/0x190
scan_gray_list+0x108/0x1c0
kmemleak_scan+0x33c/0x7c0
kmemleak_write+0x410/0x4b0
full_proxy_write+0x68/0xa0
__vfs_write+0x60/0x190
vfs_write+0xac/0x1a0
ksys_write+0x6c/0xe0
__arm64_sys_write+0x24/0x30
el0_svc_handler+0xc0/0x160
el0_svc+0x8/0xc
Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
---[ end trace 8797ac2fea89abd6 ]---
note: exe[635] exited with preempt_count 2



Reverting 9f1eb38e0e1131e75cc4ac684391b25d70282589 does not help:

Unable to handle kernel paging request at virtual address ffffffc021e00000
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 636 Comm: exe Not tainted 5.0.0-rc1 #18
Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : scan_block+0x70/0x190
lr : scan_block+0x6c/0x190
sp : ffffff8013303b20
x29: ffffff8013303b20 x28: ffffffc0fdbaf018
x27: ffffffc022000000 x26: 0000000000000080
x25: ffffff80118bdf70 x24: ffffffc0f8cc8000
x23: ffffff8010bd8000 x22: ffffff8010bd8830
x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
x19: ffffffc021e00000 x18: 0000000000002704
x17: 0000000000000200 x16: 0000000000000000
x15: ffffff8010c24dd8 x14: 0000000000002700
x13: 0000000068ed2dfa x12: ffffffc0f5a39a58
x11: 0000000000000001 x10: ffffff8010bae688
x9 : ffffff8010baf000 x8 : ffffff8010bae688
x7 : 0000000000000003 x6 : 0000000000000000
x5 : ffffff80113223a0 x4 : 00000000000029c8
x3 : ffffff8010c24d88 x2 : 7c512d102eca1300
x1 : ffffffc0f59f9200 x0 : 0000000000000000
Process exe (pid: 636, stack limit = 0x(____ptrval____))
Call trace:
scan_block+0x70/0x190
scan_gray_list+0x108/0x1c0
kmemleak_scan+0x338/0x7c0
kmemleak_write+0x410/0x4b0
full_proxy_write+0x68/0xa0
__vfs_write+0x60/0x190
vfs_write+0xac/0x1a0
ksys_write+0x6c/0xe0
__arm64_sys_write+0x24/0x30
el0_svc_handler+0xc0/0x160
el0_svc+0x8/0xc
Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
---[ end trace 65933b4754c5cee7 ]---
note: exe[636] exited with preempt_count 2


Regards.


2019-01-18 16:16:52

by Qian Cai

[permalink] [raw]
Subject: Re: kmemleak panic



On 1/18/19 10:36 AM, Marc Gonzalez wrote:
> On 18/01/2019 15:34, Catalin Marinas wrote:
>
>> On Fri, Jan 18, 2019 at 02:36:46PM +0100, Marc Gonzalez wrote:
>>
>>> Trying to diagnose a separate issue, I enabled a raft of debugging options,
>>> including kmemleak. However, it looks like kmemleak itself is crashing?
>>>
>>> We seem to be crashing on this code:
>>>
>>> kasan_disable_current();
>>> pointer = *ptr;
>>> kasan_enable_current();
>>
>> There was another regression reported recently:
>>
>> http://lkml.kernel.org/r/[email protected]
>>
>> See if reverting commit 9f1eb38e0e113 (mm, kmemleak: little optimization while
>> scanning) fixes it.
>>
>
> [Drop LAKML, add LKML, add recipients]
>
> Bug is easy to reproduce:
>
> boot
> mount -t debugfs nodev /sys/kernel/debug/
> echo scan > /sys/kernel/debug/kmemleak
>
> Unable to handle kernel paging request at virtual address ffffffc021e00000
> Mem abort info:
> ESR = 0x96000006
> Exception class = DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> Data abort info:
> ISV = 0, ISS = 0x00000006
> CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
> [ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
> Internal error: Oops: 96000006 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 635 Comm: exe Not tainted 5.0.0-rc1 #16
> Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
> pstate: 80000085 (Nzcv daIf -PAN -UAO)
> pc : scan_block+0x70/0x190
> lr : scan_block+0x6c/0x190
> sp : ffffff80133b3b20
> x29: ffffff80133b3b20 x28: ffffffc0fdbaf018
> x27: ffffffc022000000 x26: 0000000000000080
> x25: ffffff80118bdf70 x24: ffffffc0f8cc8000
> x23: ffffff8010bd8000 x22: ffffff8010bd8830
> x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
> x19: ffffffc021e00000 x18: 00000000000025fd
> x17: 0000000000000200 x16: 0000000000000000
> x15: ffffff8010c24dd8 x14: 00000000000025f9
> x13: 00000000445b0e6c x12: ffffffc0f5a96658
> x11: 0000000000000001 x10: ffffff8010bae688
> x9 : ffffff8010baf000 x8 : ffffff8010bae688
> x7 : 0000000000000003 x6 : 0000000000000000
> x5 : ffffff801133ad60 x4 : 0000000000002878
> x3 : ffffff8010c24d88 x2 : 7c512d102eca1300
> x1 : ffffffc0f5a81b00 x0 : 0000000000000000
> Process exe (pid: 635, stack limit = 0x(____ptrval____))
> Call trace:
> scan_block+0x70/0x190
> scan_gray_list+0x108/0x1c0
> kmemleak_scan+0x33c/0x7c0
> kmemleak_write+0x410/0x4b0
> full_proxy_write+0x68/0xa0
> __vfs_write+0x60/0x190
> vfs_write+0xac/0x1a0
> ksys_write+0x6c/0xe0
> __arm64_sys_write+0x24/0x30
> el0_svc_handler+0xc0/0x160
> el0_svc+0x8/0xc
> Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
> ---[ end trace 8797ac2fea89abd6 ]---
> note: exe[635] exited with preempt_count 2
>
>
>
> Reverting 9f1eb38e0e1131e75cc4ac684391b25d70282589 does not help:

This looks like something different from the original "invalid PFNs from
pfn_to_online_page()" issue. What's your .config ?

2019-01-18 17:08:29

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 18/01/2019 17:14, Qian Cai wrote:

> This looks like something different from the original "invalid PFNs from
> pfn_to_online_page()" issue. What's your .config ?

Here's my defconfig:

# CONFIG_SWAP is not set
CONFIG_NO_HZ_IDLE=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_PREEMPT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=20
CONFIG_BLK_DEV_INITRD=y
CONFIG_ARCH_QCOM=y
# CONFIG_EFI is not set
# CONFIG_SUSPEND is not set
CONFIG_PM=y
# CONFIG_MQ_IOSCHED_DEADLINE is not set
# CONFIG_MQ_IOSCHED_KYBER is not set
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_BLK_DEV is not set
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_SERIO_SERPORT is not set
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_LEGACY_PTY_COUNT=16
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_MSM=y
CONFIG_SERIAL_MSM_CONSOLE=y
CONFIG_SERIAL_DEV_BUS=y
# CONFIG_HW_RANDOM is not set
# CONFIG_HWMON is not set
CONFIG_REGULATOR=y
CONFIG_REGULATOR_FIXED_VOLTAGE=y
CONFIG_REGULATOR_QCOM_SMD_RPM=y
# CONFIG_HID is not set
# CONFIG_USB_SUPPORT is not set
# CONFIG_COMMON_CLK_XGENE is not set
CONFIG_COMMON_CLK_QCOM=y
# CONFIG_QCOM_A53PLL is not set
# CONFIG_QCOM_CLK_APCS_MSM8916 is not set
CONFIG_QCOM_CLK_SMD_RPM=y
CONFIG_MSM_GCC_8998=y
CONFIG_HWSPINLOCK=y
CONFIG_HWSPINLOCK_QCOM=y
CONFIG_MAILBOX=y
CONFIG_QCOM_APCS_IPC=y
CONFIG_RPMSG_QCOM_GLINK_RPM=y
CONFIG_RPMSG_QCOM_GLINK_SMEM=y
CONFIG_RPMSG_QCOM_SMD=y
CONFIG_RPMSG_VIRTIO=y
CONFIG_QCOM_COMMAND_DB=y
CONFIG_QCOM_SMEM=y
CONFIG_QCOM_SMD_RPM=y
CONFIG_TMPFS=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=4000
CONFIG_FUNCTION_TRACER=y

2019-01-18 17:40:04

by Qian Cai

[permalink] [raw]
Subject: Re: kmemleak panic



On 1/18/19 12:05 PM, Marc Gonzalez wrote:
> On 18/01/2019 17:14, Qian Cai wrote:
>
>> This looks like something different from the original "invalid PFNs from
>> pfn_to_online_page()" issue. What's your .config ?
>
> Here's my defconfig:
>
> # CONFIG_SWAP is not set
> CONFIG_NO_HZ_IDLE=y
> CONFIG_HIGH_RES_TIMERS=y
> CONFIG_PREEMPT=y
> CONFIG_IKCONFIG=y
> CONFIG_IKCONFIG_PROC=y
> CONFIG_LOG_BUF_SHIFT=20
> CONFIG_BLK_DEV_INITRD=y
> CONFIG_ARCH_QCOM=y
> # CONFIG_EFI is not set
> # CONFIG_SUSPEND is not set
> CONFIG_PM=y
> # CONFIG_MQ_IOSCHED_DEADLINE is not set
> # CONFIG_MQ_IOSCHED_KYBER is not set
> CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
> CONFIG_DEVTMPFS=y
> CONFIG_DEVTMPFS_MOUNT=y
> # CONFIG_BLK_DEV is not set
> # CONFIG_INPUT_KEYBOARD is not set
> # CONFIG_INPUT_MOUSE is not set
> # CONFIG_SERIO_SERPORT is not set
> CONFIG_VT_HW_CONSOLE_BINDING=y
> CONFIG_LEGACY_PTY_COUNT=16
> CONFIG_SERIAL_8250=y
> CONFIG_SERIAL_MSM=y
> CONFIG_SERIAL_MSM_CONSOLE=y
> CONFIG_SERIAL_DEV_BUS=y
> # CONFIG_HW_RANDOM is not set
> # CONFIG_HWMON is not set
> CONFIG_REGULATOR=y
> CONFIG_REGULATOR_FIXED_VOLTAGE=y
> CONFIG_REGULATOR_QCOM_SMD_RPM=y
> # CONFIG_HID is not set
> # CONFIG_USB_SUPPORT is not set
> # CONFIG_COMMON_CLK_XGENE is not set
> CONFIG_COMMON_CLK_QCOM=y
> # CONFIG_QCOM_A53PLL is not set
> # CONFIG_QCOM_CLK_APCS_MSM8916 is not set
> CONFIG_QCOM_CLK_SMD_RPM=y
> CONFIG_MSM_GCC_8998=y
> CONFIG_HWSPINLOCK=y
> CONFIG_HWSPINLOCK_QCOM=y
> CONFIG_MAILBOX=y
> CONFIG_QCOM_APCS_IPC=y
> CONFIG_RPMSG_QCOM_GLINK_RPM=y
> CONFIG_RPMSG_QCOM_GLINK_SMEM=y
> CONFIG_RPMSG_QCOM_SMD=y
> CONFIG_RPMSG_VIRTIO=y
> CONFIG_QCOM_COMMAND_DB=y
> CONFIG_QCOM_SMEM=y
> CONFIG_QCOM_SMD_RPM=y
> CONFIG_TMPFS=y
> CONFIG_DEBUG_INFO=y
> CONFIG_DEBUG_KERNEL=y
> CONFIG_DEBUG_KMEMLEAK=y
> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=4000
> CONFIG_FUNCTION_TRACER=y
>

This is truncated. I'll need your whole .config file, so I can try to reproduce
it here on an arm64 ThunderX2 server.

2019-01-18 19:14:15

by Robin Murphy

[permalink] [raw]
Subject: Re: kmemleak panic

On 2019-01-18 3:36 pm, Marc Gonzalez wrote:
> Unable to handle kernel paging request at virtual address ffffffc021e00000

I can't help but notice that you seem to get the same address in 4
different logs - if it really is that deterministic then that's quite
the boon for debugging (FWIW my first thought is that it looks a lot
like a phys_to_virt() of something bogus). I've never used kmemleak
myself, but looking at where it's crashing it appears to think that that
address is a valid object - if I'm reading the docs correctly then I
guess the "dump" command ought to be able to tell you where it thinks
that object came from.

Robin.

2019-01-19 10:26:08

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 18/01/2019 18:38, Qian Cai wrote:

> On 1/18/19 12:05 PM, Marc Gonzalez wrote:
>
>> On 18/01/2019 17:14, Qian Cai wrote:
>>
>>> This looks like something different from the original "invalid PFNs from
>>> pfn_to_online_page()" issue. What's your .config ?
>>
>> Here's my defconfig:
>>
>> # CONFIG_SWAP is not set
>> CONFIG_NO_HZ_IDLE=y
>> CONFIG_HIGH_RES_TIMERS=y
>> CONFIG_PREEMPT=y
>> CONFIG_IKCONFIG=y
>> CONFIG_IKCONFIG_PROC=y
>> CONFIG_LOG_BUF_SHIFT=20
>> CONFIG_BLK_DEV_INITRD=y
>> CONFIG_ARCH_QCOM=y
>> # CONFIG_EFI is not set
>> # CONFIG_SUSPEND is not set
>> CONFIG_PM=y
>> # CONFIG_MQ_IOSCHED_DEADLINE is not set
>> # CONFIG_MQ_IOSCHED_KYBER is not set
>> CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
>> CONFIG_DEVTMPFS=y
>> CONFIG_DEVTMPFS_MOUNT=y
>> # CONFIG_BLK_DEV is not set
>> # CONFIG_INPUT_KEYBOARD is not set
>> # CONFIG_INPUT_MOUSE is not set
>> # CONFIG_SERIO_SERPORT is not set
>> CONFIG_VT_HW_CONSOLE_BINDING=y
>> CONFIG_LEGACY_PTY_COUNT=16
>> CONFIG_SERIAL_8250=y
>> CONFIG_SERIAL_MSM=y
>> CONFIG_SERIAL_MSM_CONSOLE=y
>> CONFIG_SERIAL_DEV_BUS=y
>> # CONFIG_HW_RANDOM is not set
>> # CONFIG_HWMON is not set
>> CONFIG_REGULATOR=y
>> CONFIG_REGULATOR_FIXED_VOLTAGE=y
>> CONFIG_REGULATOR_QCOM_SMD_RPM=y
>> # CONFIG_HID is not set
>> # CONFIG_USB_SUPPORT is not set
>> # CONFIG_COMMON_CLK_XGENE is not set
>> CONFIG_COMMON_CLK_QCOM=y
>> # CONFIG_QCOM_A53PLL is not set
>> # CONFIG_QCOM_CLK_APCS_MSM8916 is not set
>> CONFIG_QCOM_CLK_SMD_RPM=y
>> CONFIG_MSM_GCC_8998=y
>> CONFIG_HWSPINLOCK=y
>> CONFIG_HWSPINLOCK_QCOM=y
>> CONFIG_MAILBOX=y
>> CONFIG_QCOM_APCS_IPC=y
>> CONFIG_RPMSG_QCOM_GLINK_RPM=y
>> CONFIG_RPMSG_QCOM_GLINK_SMEM=y
>> CONFIG_RPMSG_QCOM_SMD=y
>> CONFIG_RPMSG_VIRTIO=y
>> CONFIG_QCOM_COMMAND_DB=y
>> CONFIG_QCOM_SMEM=y
>> CONFIG_QCOM_SMD_RPM=y
>> CONFIG_TMPFS=y
>> CONFIG_DEBUG_INFO=y
>> CONFIG_DEBUG_KERNEL=y
>> CONFIG_DEBUG_KMEMLEAK=y
>> CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=4000
>> CONFIG_FUNCTION_TRACER=y
>
> This is truncated. I'll need your whole .config file, so I can try to reproduce
> it here on an arm64 ThunderX2 server.

Smirk. It's not truncated, it's condensed ;-)

It's a defconfig, as generated by make savedefconfig.

https://stackoverflow.com/questions/27899104/how-to-create-a-defconfig-file-from-a-config

Save it to e.g. arch/arm64/configs/marc_defconfig and then you can
just run 'make marc_defconfig' to generate the .config

Regards.

2019-01-19 13:31:47

by Catalin Marinas

[permalink] [raw]
Subject: Re: kmemleak panic

On Fri, Jan 18, 2019 at 04:36:59PM +0100, Marc Gonzalez wrote:
> mount -t debugfs nodev /sys/kernel/debug/
> echo scan > /sys/kernel/debug/kmemleak
>
> Unable to handle kernel paging request at virtual address ffffffc021e00000
[...]
> Call trace:
> scan_block+0x70/0x190
> scan_gray_list+0x108/0x1c0
> kmemleak_scan+0x33c/0x7c0
> kmemleak_write+0x410/0x4b0

As per Robin's remark, this address seems to be pretty easy to
reproduce. It also happens via scan_gray_list() which indicates an
object kmemleak was informed about via kmemleak_alloc() (so this
excludes the pfn that Qian noticed).

Can you configure the kernel with CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN off
just to avoid the bug being triggered early and run:

mount -t debugfs nodev /sys/kernel/debug/
echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak

Then run another scan to make sure this is the address that triggered
the page fault:

echo scan > /sys/kernel/debug/kmemleak

The above should tell us where the object that kmemleak is trying to
scan came from.

Of course, ideally we should bisect this but I haven't been able to
reproduce it.

--
Catalin

2019-01-21 12:00:25

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 19/01/2019 14:28, Catalin Marinas wrote:

> As per Robin's remark, this address seems to be pretty easy to
> reproduce. It also happens via scan_gray_list() which indicates an
> object kmemleak was informed about via kmemleak_alloc() (so this
> excludes the pfn that Qian noticed).
>
> Can you configure the kernel with CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN off
> just to avoid the bug being triggered early and run:
>
> mount -t debugfs nodev /sys/kernel/debug/
> echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
>
> Then run another scan to make sure this is the address that triggered
> the page fault:
>
> echo scan > /sys/kernel/debug/kmemleak
>
> The above should tell us where the object that kmemleak is trying to
> scan came from.

Here are the remap requests on my system:

REMAP: PA=17a00000 VA=ffffff8010040000 SIZE=10000
REMAP: PA=17b00000 VA=ffffff8010c00000 SIZE=100000
REMAP: PA=17920000 VA=ffffff8010005000 SIZE=1000
REMAP: PA=17921000 VA=ffffff801000d000 SIZE=1000
REMAP: PA=17921000 VA=ffffff8010015000 SIZE=1000
REMAP: PA=01f40000 VA=ffffff8010fa0000 SIZE=20000
REMAP: PA=86000000 VA=ffffff8011000000 SIZE=200000
REMAP: PA=00100000 VA=ffffff8011300000 SIZE=b0000
REMAP: PA=17911000 VA=ffffff8010025000 SIZE=1000
REMAP: PA=00778000 VA=ffffff80115d0000 SIZE=7000
REMAP: PA=0c1b0000 VA=ffffff801002d000 SIZE=1000


# echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
kmemleak: Object 0xffffffc021e00000 (size 2097152):
kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
kmemleak: min_count = 0
kmemleak: count = 0
kmemleak: flags = 0x1
kmemleak: checksum = 0
kmemleak: backtrace:
kmemleak_alloc_phys+0x48/0x60
memblock_alloc_range_nid+0x8c/0xa4
memblock_alloc_base_nid+0x4c/0x60
__memblock_alloc_base+0x3c/0x4c
early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
fdt_init_reserved_mem+0x308/0x3ec
early_init_fdt_scan_reserved_mem+0x88/0xb0
arm64_memblock_init+0x1dc/0x254
setup_arch+0x1c8/0x4ec
start_kernel+0x84/0x44c
0xffffffffffffffff

# echo scan > /sys/kernel/debug/kmemleak
Unable to handle kernel paging request at virtual address ffffffc021e00000
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
CPU: 4 PID: 508 Comm: sh Tainted: G S 5.0.0-rc1 #11
Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : scan_block+0x70/0x180
lr : scan_block+0x6c/0x180
sp : ffffff8011a2bb40
x29: ffffff8011a2bb40 x28: ffffffc0f8375e80
x27: ffffff8010b8a000 x26: 0000000000000080
x25: ffffff8010b8a668 x24: ffffffc0f8cc0000
x23: ffffff8010b8a000 x22: ffffff8010a4b8c8
x21: ffffffc021e00ff9 x20: ffffffc021e01000
x19: ffffffc021e00000 x18: 0000000000000000
x17: 0000000000000200 x16: 0000000000000000
x15: ffffffffffffffff x14: ffffffff00000000
x13: ffffffffffffffff x12: 0000000000000020
x11: 0101010101010101 x10: 00000000004bfd20
x9 : 7f7f7f7f7f7f7f7f x8 : fefeff096d606272
x7 : 00000000dc3c0000 x6 : 0000000000000018
x5 : ffffffbf03f8efc8 x4 : ffffffbf03f8efc7
x3 : 5b36396f4e7d4000 x2 : ffffffc0f8cc0000
x1 : ffffffc0f6689880 x0 : 0000000000000000
Process sh (pid: 508, stack limit = 0x(____ptrval____))
Call trace:
scan_block+0x70/0x180
scan_gray_list+0xe0/0x190
kmemleak_scan+0x2bc/0x540
kmemleak_write+0x328/0x3d0
full_proxy_write+0x68/0xa0
__vfs_write+0x60/0x190
vfs_write+0xac/0x1a0
ksys_write+0x6c/0xe0
__arm64_sys_write+0x24/0x30
el0_svc_handler+0xb8/0x140
el0_svc+0x8/0xc
Code: f9000fb4 d503201f 97ffffd2 35000540 (f9400260)
---[ end trace 754cbd2624bb2b91 ]---
note: sh[508] exited with preempt_count 2


__reserved_mem_alloc_size() printed:
OF: reserved mem: allocated memory for 'rmtfs' node: base 0x00000000a1e00000, size 2 MiB

See git show c7833949564ec

The corresponding node is:

reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;

memory@85800000 {
reg = <0x0 0x85800000 0x0 0x800000>;
no-map;
};

smem_mem: smem-mem@86000000 {
reg = <0x0 0x86000000 0x0 0x200000>;
no-map;
};

memory@86200000 {
reg = <0x0 0x86200000 0x0 0x2600000>;
no-map;
};

rmtfs {
compatible = "qcom,rmtfs-mem";

size = <0x0 0x200000>;
alloc-ranges = <0x0 0xa0000000 0x0 0x2000000>;
no-map;

qcom,client-id = <1>;
qcom,vmid = <15>;
};
};

2019-01-21 12:21:40

by Robin Murphy

[permalink] [raw]
Subject: Re: kmemleak panic

On 21/01/2019 11:57, Marc Gonzalez wrote:
[...]
> # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
> kmemleak: Object 0xffffffc021e00000 (size 2097152):
> kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
> kmemleak: min_count = 0
> kmemleak: count = 0
> kmemleak: flags = 0x1
> kmemleak: checksum = 0
> kmemleak: backtrace:
> kmemleak_alloc_phys+0x48/0x60
> memblock_alloc_range_nid+0x8c/0xa4
> memblock_alloc_base_nid+0x4c/0x60
> __memblock_alloc_base+0x3c/0x4c
> early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
> fdt_init_reserved_mem+0x308/0x3ec
> early_init_fdt_scan_reserved_mem+0x88/0xb0
> arm64_memblock_init+0x1dc/0x254
> setup_arch+0x1c8/0x4ec
> start_kernel+0x84/0x44c
> 0xffffffffffffffff

OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
the linear map address of a no-map reservation, which unsurprisingly
turns out not to be mapped. Is there a way to tell kmemleak that it
can't scan within a particular object?

Robin.

2019-01-21 12:25:44

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 21/01/2019 12:57, Marc Gonzalez wrote:

> On 19/01/2019 14:28, Catalin Marinas wrote:
>
>> As per Robin's remark, this address seems to be pretty easy to
>> reproduce. It also happens via scan_gray_list() which indicates an
>> object kmemleak was informed about via kmemleak_alloc() (so this
>> excludes the pfn that Qian noticed).
>>
>> Can you configure the kernel with CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN off
>> just to avoid the bug being triggered early and run:
>>
>> mount -t debugfs nodev /sys/kernel/debug/
>> echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
>>
>> Then run another scan to make sure this is the address that triggered
>> the page fault:
>>
>> echo scan > /sys/kernel/debug/kmemleak
>>
>> The above should tell us where the object that kmemleak is trying to
>> scan came from.
>
> Here are the remap requests on my system:
>
> REMAP: PA=17a00000 VA=ffffff8010040000 SIZE=10000
> REMAP: PA=17b00000 VA=ffffff8010c00000 SIZE=100000
> REMAP: PA=17920000 VA=ffffff8010005000 SIZE=1000
> REMAP: PA=17921000 VA=ffffff801000d000 SIZE=1000
> REMAP: PA=17921000 VA=ffffff8010015000 SIZE=1000
> REMAP: PA=01f40000 VA=ffffff8010fa0000 SIZE=20000
> REMAP: PA=86000000 VA=ffffff8011000000 SIZE=200000
> REMAP: PA=00100000 VA=ffffff8011300000 SIZE=b0000
> REMAP: PA=17911000 VA=ffffff8010025000 SIZE=1000
> REMAP: PA=00778000 VA=ffffff80115d0000 SIZE=7000
> REMAP: PA=0c1b0000 VA=ffffff801002d000 SIZE=1000
>
>
> # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
> kmemleak: Object 0xffffffc021e00000 (size 2097152):
> kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
> kmemleak: min_count = 0
> kmemleak: count = 0
> kmemleak: flags = 0x1
> kmemleak: checksum = 0
> kmemleak: backtrace:
> kmemleak_alloc_phys+0x48/0x60
> memblock_alloc_range_nid+0x8c/0xa4
> memblock_alloc_base_nid+0x4c/0x60
> __memblock_alloc_base+0x3c/0x4c
> early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
> fdt_init_reserved_mem+0x308/0x3ec
> early_init_fdt_scan_reserved_mem+0x88/0xb0
> arm64_memblock_init+0x1dc/0x254
> setup_arch+0x1c8/0x4ec
> start_kernel+0x84/0x44c
> 0xffffffffffffffff
>
> # echo scan > /sys/kernel/debug/kmemleak
> Unable to handle kernel paging request at virtual address ffffffc021e00000
> Mem abort info:
> ESR = 0x96000006
> Exception class = DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> Data abort info:
> ISV = 0, ISS = 0x00000006
> CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
> [ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
> Internal error: Oops: 96000006 [#1] PREEMPT SMP
> CPU: 4 PID: 508 Comm: sh Tainted: G S 5.0.0-rc1 #11
> Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
> pstate: 80000085 (Nzcv daIf -PAN -UAO)
> pc : scan_block+0x70/0x180
> lr : scan_block+0x6c/0x180
> sp : ffffff8011a2bb40
> x29: ffffff8011a2bb40 x28: ffffffc0f8375e80
> x27: ffffff8010b8a000 x26: 0000000000000080
> x25: ffffff8010b8a668 x24: ffffffc0f8cc0000
> x23: ffffff8010b8a000 x22: ffffff8010a4b8c8
> x21: ffffffc021e00ff9 x20: ffffffc021e01000
> x19: ffffffc021e00000 x18: 0000000000000000
> x17: 0000000000000200 x16: 0000000000000000
> x15: ffffffffffffffff x14: ffffffff00000000
> x13: ffffffffffffffff x12: 0000000000000020
> x11: 0101010101010101 x10: 00000000004bfd20
> x9 : 7f7f7f7f7f7f7f7f x8 : fefeff096d606272
> x7 : 00000000dc3c0000 x6 : 0000000000000018
> x5 : ffffffbf03f8efc8 x4 : ffffffbf03f8efc7
> x3 : 5b36396f4e7d4000 x2 : ffffffc0f8cc0000
> x1 : ffffffc0f6689880 x0 : 0000000000000000
> Process sh (pid: 508, stack limit = 0x(____ptrval____))
> Call trace:
> scan_block+0x70/0x180
> scan_gray_list+0xe0/0x190
> kmemleak_scan+0x2bc/0x540
> kmemleak_write+0x328/0x3d0
> full_proxy_write+0x68/0xa0
> __vfs_write+0x60/0x190
> vfs_write+0xac/0x1a0
> ksys_write+0x6c/0xe0
> __arm64_sys_write+0x24/0x30
> el0_svc_handler+0xb8/0x140
> el0_svc+0x8/0xc
> Code: f9000fb4 d503201f 97ffffd2 35000540 (f9400260)
> ---[ end trace 754cbd2624bb2b91 ]---
> note: sh[508] exited with preempt_count 2
>
>
> __reserved_mem_alloc_size() printed:
> OF: reserved mem: allocated memory for 'rmtfs' node: base 0x00000000a1e00000, size 2 MiB
>
> See git show c7833949564ec
>
> The corresponding node is:
>
> reserved-memory {
> #address-cells = <2>;
> #size-cells = <2>;
> ranges;
>
> memory@85800000 {
> reg = <0x0 0x85800000 0x0 0x800000>;
> no-map;
> };
>
> smem_mem: smem-mem@86000000 {
> reg = <0x0 0x86000000 0x0 0x200000>;
> no-map;
> };
>
> memory@86200000 {
> reg = <0x0 0x86200000 0x0 0x2600000>;
> no-map;
> };
>
> rmtfs {
> compatible = "qcom,rmtfs-mem";
>
> size = <0x0 0x200000>;
> alloc-ranges = <0x0 0xa0000000 0x0 0x2000000>;
> no-map;
>
> qcom,client-id = <1>;
> qcom,vmid = <15>;
> };
> };

Complete boot log using 'loglevel=7 memblock=debug'
https://pastebin.ubuntu.com/p/2BN2csSYkx/

Regards.

2019-01-21 13:36:55

by Rob Herring

[permalink] [raw]
Subject: Re: kmemleak panic

On Mon, Jan 21, 2019 at 6:19 AM Robin Murphy <[email protected]> wrote:
>
> On 21/01/2019 11:57, Marc Gonzalez wrote:
> [...]
> > # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
> > kmemleak: Object 0xffffffc021e00000 (size 2097152):
> > kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
> > kmemleak: min_count = 0
> > kmemleak: count = 0
> > kmemleak: flags = 0x1
> > kmemleak: checksum = 0
> > kmemleak: backtrace:
> > kmemleak_alloc_phys+0x48/0x60
> > memblock_alloc_range_nid+0x8c/0xa4
> > memblock_alloc_base_nid+0x4c/0x60
> > __memblock_alloc_base+0x3c/0x4c
> > early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
> > fdt_init_reserved_mem+0x308/0x3ec
> > early_init_fdt_scan_reserved_mem+0x88/0xb0
> > arm64_memblock_init+0x1dc/0x254
> > setup_arch+0x1c8/0x4ec
> > start_kernel+0x84/0x44c
> > 0xffffffffffffffff
>
> OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
> the linear map address of a no-map reservation, which unsurprisingly
> turns out not to be mapped. Is there a way to tell kmemleak that it
> can't scan within a particular object?

There was this patch posted[1]. I never got a reply, so it hasn't been applied.

Rob

https://patchwork.ozlabs.org/patch/995367/

2019-01-21 13:57:13

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 21/01/2019 14:35, Rob Herring wrote:

> On Mon, Jan 21, 2019 at 6:19 AM Robin Murphy wrote:
>>
>> On 21/01/2019 11:57, Marc Gonzalez wrote:
>> [...]
>>> # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
>>> kmemleak: Object 0xffffffc021e00000 (size 2097152):
>>> kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
>>> kmemleak: min_count = 0
>>> kmemleak: count = 0
>>> kmemleak: flags = 0x1
>>> kmemleak: checksum = 0
>>> kmemleak: backtrace:
>>> kmemleak_alloc_phys+0x48/0x60
>>> memblock_alloc_range_nid+0x8c/0xa4
>>> memblock_alloc_base_nid+0x4c/0x60
>>> __memblock_alloc_base+0x3c/0x4c
>>> early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
>>> fdt_init_reserved_mem+0x308/0x3ec
>>> early_init_fdt_scan_reserved_mem+0x88/0xb0
>>> arm64_memblock_init+0x1dc/0x254
>>> setup_arch+0x1c8/0x4ec
>>> start_kernel+0x84/0x44c
>>> 0xffffffffffffffff
>>
>> OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
>> the linear map address of a no-map reservation, which unsurprisingly
>> turns out not to be mapped. Is there a way to tell kmemleak that it
>> can't scan within a particular object?
>
> There was this patch posted[1]. I never got a reply, so it hasn't been applied.
>
> Rob
>
> https://patchwork.ozlabs.org/patch/995367/

It is worth noting that the author's email address appears to be dead.
<[email protected]>: host hqemgate08.nvidia.com[216.228.121.117] said:
550 #5.1.0 Address rejected. (in reply to RCPT TO command)

Adding a few nvidia devs for comment.

2019-01-21 14:38:36

by Catalin Marinas

[permalink] [raw]
Subject: Re: kmemleak panic

On Mon, Jan 21, 2019 at 07:35:11AM -0600, Rob Herring wrote:
> On Mon, Jan 21, 2019 at 6:19 AM Robin Murphy <[email protected]> wrote:
> >
> > On 21/01/2019 11:57, Marc Gonzalez wrote:
> > [...]
> > > # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
> > > kmemleak: Object 0xffffffc021e00000 (size 2097152):
> > > kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
> > > kmemleak: min_count = 0
> > > kmemleak: count = 0
> > > kmemleak: flags = 0x1
> > > kmemleak: checksum = 0
> > > kmemleak: backtrace:
> > > kmemleak_alloc_phys+0x48/0x60
> > > memblock_alloc_range_nid+0x8c/0xa4
> > > memblock_alloc_base_nid+0x4c/0x60
> > > __memblock_alloc_base+0x3c/0x4c
> > > early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
> > > fdt_init_reserved_mem+0x308/0x3ec
> > > early_init_fdt_scan_reserved_mem+0x88/0xb0
> > > arm64_memblock_init+0x1dc/0x254
> > > setup_arch+0x1c8/0x4ec
> > > start_kernel+0x84/0x44c
> > > 0xffffffffffffffff
> >
> > OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
> > the linear map address of a no-map reservation, which unsurprisingly
> > turns out not to be mapped. Is there a way to tell kmemleak that it
> > can't scan within a particular object?
>
> There was this patch posted[1]. I never got a reply, so it hasn't been applied.
>
> https://patchwork.ozlabs.org/patch/995367/

Thanks Rob, I wasn't aware of this patch (or I just missed it at the
time).

I wonder whether kmemleak should simply remove ranges passed to
memblock_remove(), or at least mark them as no-scan.

--
Catalin

2019-01-21 15:44:04

by Rob Herring

[permalink] [raw]
Subject: Re: kmemleak panic

+Mike Rapoport

On Mon, Jan 21, 2019 at 8:37 AM Catalin Marinas <[email protected]> wrote:
>
> On Mon, Jan 21, 2019 at 07:35:11AM -0600, Rob Herring wrote:
> > On Mon, Jan 21, 2019 at 6:19 AM Robin Murphy <[email protected]> wrote:
> > >
> > > On 21/01/2019 11:57, Marc Gonzalez wrote:
> > > [...]
> > > > # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
> > > > kmemleak: Object 0xffffffc021e00000 (size 2097152):
> > > > kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
> > > > kmemleak: min_count = 0
> > > > kmemleak: count = 0
> > > > kmemleak: flags = 0x1
> > > > kmemleak: checksum = 0
> > > > kmemleak: backtrace:
> > > > kmemleak_alloc_phys+0x48/0x60
> > > > memblock_alloc_range_nid+0x8c/0xa4
> > > > memblock_alloc_base_nid+0x4c/0x60
> > > > __memblock_alloc_base+0x3c/0x4c
> > > > early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
> > > > fdt_init_reserved_mem+0x308/0x3ec
> > > > early_init_fdt_scan_reserved_mem+0x88/0xb0
> > > > arm64_memblock_init+0x1dc/0x254
> > > > setup_arch+0x1c8/0x4ec
> > > > start_kernel+0x84/0x44c
> > > > 0xffffffffffffffff
> > >
> > > OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
> > > the linear map address of a no-map reservation, which unsurprisingly
> > > turns out not to be mapped. Is there a way to tell kmemleak that it
> > > can't scan within a particular object?
> >
> > There was this patch posted[1]. I never got a reply, so it hasn't been applied.
> >
> > https://patchwork.ozlabs.org/patch/995367/
>
> Thanks Rob, I wasn't aware of this patch (or I just missed it at the
> time).
>
> I wonder whether kmemleak should simply remove ranges passed to
> memblock_remove(), or at least mark them as no-scan.

Seems reasonable to me, but of course that impacts a lot of other
cases. Maybe Mike R has some thoughts?

Rob

2019-01-21 15:55:31

by Robin Murphy

[permalink] [raw]
Subject: Re: kmemleak panic

On 21/01/2019 15:42, Rob Herring wrote:
> +Mike Rapoport
>
> On Mon, Jan 21, 2019 at 8:37 AM Catalin Marinas <[email protected]> wrote:
>>
>> On Mon, Jan 21, 2019 at 07:35:11AM -0600, Rob Herring wrote:
>>> On Mon, Jan 21, 2019 at 6:19 AM Robin Murphy <[email protected]> wrote:
>>>>
>>>> On 21/01/2019 11:57, Marc Gonzalez wrote:
>>>> [...]
>>>>> # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
>>>>> kmemleak: Object 0xffffffc021e00000 (size 2097152):
>>>>> kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
>>>>> kmemleak: min_count = 0
>>>>> kmemleak: count = 0
>>>>> kmemleak: flags = 0x1
>>>>> kmemleak: checksum = 0
>>>>> kmemleak: backtrace:
>>>>> kmemleak_alloc_phys+0x48/0x60
>>>>> memblock_alloc_range_nid+0x8c/0xa4
>>>>> memblock_alloc_base_nid+0x4c/0x60
>>>>> __memblock_alloc_base+0x3c/0x4c
>>>>> early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
>>>>> fdt_init_reserved_mem+0x308/0x3ec
>>>>> early_init_fdt_scan_reserved_mem+0x88/0xb0
>>>>> arm64_memblock_init+0x1dc/0x254
>>>>> setup_arch+0x1c8/0x4ec
>>>>> start_kernel+0x84/0x44c
>>>>> 0xffffffffffffffff
>>>>
>>>> OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
>>>> the linear map address of a no-map reservation, which unsurprisingly
>>>> turns out not to be mapped. Is there a way to tell kmemleak that it
>>>> can't scan within a particular object?
>>>
>>> There was this patch posted[1]. I never got a reply, so it hasn't been applied.
>>>
>>> https://patchwork.ozlabs.org/patch/995367/
>>
>> Thanks Rob, I wasn't aware of this patch (or I just missed it at the
>> time).
>>
>> I wonder whether kmemleak should simply remove ranges passed to
>> memblock_remove(), or at least mark them as no-scan.
>
> Seems reasonable to me, but of course that impacts a lot of other
> cases. Maybe Mike R has some thoughts?

In particular, might that risk crippling kmemleak on EFI arm64 EFI,
where we memblock_remove() the entire physical address space (but then
rebuild the memblock list from scratch)?

FWIW, from the reserved-memory angle I think that patch looks reasonable
as-is (modulo perhaps a kmemleak_no_scan_phys() wrapper for API
symmetry). MEMBLOCK_NOMAP is already a massive pain in the bum and I'd
really rather not introduce any more usage of it if at all possible.

Robin.

2019-01-21 17:46:34

by Mike Rapoport

[permalink] [raw]
Subject: Re: kmemleak panic

On Mon, Jan 21, 2019 at 09:42:07AM -0600, Rob Herring wrote:
> +Mike Rapoport
>
> On Mon, Jan 21, 2019 at 8:37 AM Catalin Marinas <[email protected]> wrote:
> >
> > On Mon, Jan 21, 2019 at 07:35:11AM -0600, Rob Herring wrote:
> > > On Mon, Jan 21, 2019 at 6:19 AM Robin Murphy <[email protected]> wrote:
> > > >
> > > > On 21/01/2019 11:57, Marc Gonzalez wrote:
> > > > [...]
> > > > > # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
> > > > > kmemleak: Object 0xffffffc021e00000 (size 2097152):
> > > > > kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
> > > > > kmemleak: min_count = 0
> > > > > kmemleak: count = 0
> > > > > kmemleak: flags = 0x1
> > > > > kmemleak: checksum = 0
> > > > > kmemleak: backtrace:
> > > > > kmemleak_alloc_phys+0x48/0x60
> > > > > memblock_alloc_range_nid+0x8c/0xa4
> > > > > memblock_alloc_base_nid+0x4c/0x60
> > > > > __memblock_alloc_base+0x3c/0x4c
> > > > > early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
> > > > > fdt_init_reserved_mem+0x308/0x3ec
> > > > > early_init_fdt_scan_reserved_mem+0x88/0xb0
> > > > > arm64_memblock_init+0x1dc/0x254
> > > > > setup_arch+0x1c8/0x4ec
> > > > > start_kernel+0x84/0x44c
> > > > > 0xffffffffffffffff
> > > >
> > > > OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
> > > > the linear map address of a no-map reservation, which unsurprisingly
> > > > turns out not to be mapped. Is there a way to tell kmemleak that it
> > > > can't scan within a particular object?
> > >
> > > There was this patch posted[1]. I never got a reply, so it hasn't been applied.
> > >
> > > https://patchwork.ozlabs.org/patch/995367/
> >
> > Thanks Rob, I wasn't aware of this patch (or I just missed it at the
> > time).
> >
> > I wonder whether kmemleak should simply remove ranges passed to
> > memblock_remove(), or at least mark them as no-scan.

I'm not sure that would be possible. Normal use of memblock_remove() is as
a counterpart of memblock_add() which does not involve kmemleak.
As memblock_remove() essentially hides range of the physical memory from
the system, it's not clear how it can communicate to kmemleak what region
should not be scanned.

> Seems reasonable to me, but of course that impacts a lot of other
> cases. Maybe Mike R has some thoughts?

If I understood correctly, the trouble comes from no-map range allocated in
early_init_dt_alloc_reserved_memory_arch().

There's indeed imbalance, because memblock_alloc() does kmemleak_alloc(), but
memblock_remove() does not do kmemleak_free().

I think the best way is to replace __memblock_alloc_base() with
memblock_find_in_range(), e.g something like:


diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index 1977ee0adcb1..6807a1cffe55 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
*/
end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
align = !align ? SMP_CACHE_BYTES : align;
- base = __memblock_alloc_base(size, align, end);
+ base = memblock_find_in_range(size, align, start, end);
if (!base)
return -ENOMEM;

- /*
- * Check if the allocated region fits in to start..end window
- */
- if (base < start) {
- memblock_free(base, size);
- return -ENOMEM;
- }
-
*res_base = base;
if (nomap)
return memblock_remove(base, size);
+ else
+ return memblock_reserve(base, size);
+
return 0;
}


> Rob
>

--
Sincerely yours,
Mike.


2019-01-22 08:22:12

by Prateek Patel

[permalink] [raw]
Subject: Re: kmemleak panic


On 1/21/2019 7:24 PM, Marc Gonzalez wrote:
> On 21/01/2019 14:35, Rob Herring wrote:
>
>> On Mon, Jan 21, 2019 at 6:19 AM Robin Murphy wrote:
>>> On 21/01/2019 11:57, Marc Gonzalez wrote:
>>> [...]
>>>> # echo dump=0xffffffc021e00000 > /sys/kernel/debug/kmemleak
>>>> kmemleak: Object 0xffffffc021e00000 (size 2097152):
>>>> kmemleak: comm "swapper/0", pid 0, jiffies 4294892296
>>>> kmemleak: min_count = 0
>>>> kmemleak: count = 0
>>>> kmemleak: flags = 0x1
>>>> kmemleak: checksum = 0
>>>> kmemleak: backtrace:
>>>> kmemleak_alloc_phys+0x48/0x60
>>>> memblock_alloc_range_nid+0x8c/0xa4
>>>> memblock_alloc_base_nid+0x4c/0x60
>>>> __memblock_alloc_base+0x3c/0x4c
>>>> early_init_dt_alloc_reserved_memory_arch+0x54/0xa4
>>>> fdt_init_reserved_mem+0x308/0x3ec
>>>> early_init_fdt_scan_reserved_mem+0x88/0xb0
>>>> arm64_memblock_init+0x1dc/0x254
>>>> setup_arch+0x1c8/0x4ec
>>>> start_kernel+0x84/0x44c
>>>> 0xffffffffffffffff
>>> OK, so via the __va(phys) call in kmemleak_alloc_phys(), you end up with
>>> the linear map address of a no-map reservation, which unsurprisingly
>>> turns out not to be mapped. Is there a way to tell kmemleak that it
>>> can't scan within a particular object?

Yes,  kmemleak_no_scan() do this which is done in patch
https://patchwork.ozlabs.org/patch/995367/

This was done to avoid kmemleak scanning on the blocks which are nomaped
and should not have linear mapping created in kernel page table.

>> There was this patch posted[1]. I never got a reply, so it hasn't been applied.
>>
>> Rob
>>
>> https://patchwork.ozlabs.org/patch/995367/
> It is worth noting that the author's email address appears to be dead.
> <[email protected]>: host hqemgate08.nvidia.com[216.228.121.117] said:
> 550 #5.1.0 Address rejected. (in reply to RCPT TO command)
>
> Adding a few nvidia devs for comment.

2019-01-22 14:04:36

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 21/01/2019 18:42, Mike Rapoport wrote:

> If I understood correctly, the trouble comes from no-map range allocated in
> early_init_dt_alloc_reserved_memory_arch().
>
> There's indeed imbalance, because memblock_alloc() does kmemleak_alloc(), but
> memblock_remove() does not do kmemleak_free().
>
> I think the best way is to replace __memblock_alloc_base() with
> memblock_find_in_range(), e.g something like:
>
>
> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
> index 1977ee0adcb1..6807a1cffe55 100644
> --- a/drivers/of/of_reserved_mem.c
> +++ b/drivers/of/of_reserved_mem.c
> @@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
> */
> end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
> align = !align ? SMP_CACHE_BYTES : align;
> - base = __memblock_alloc_base(size, align, end);
> + base = memblock_find_in_range(size, align, start, end);
> if (!base)
> return -ENOMEM;
>
> - /*
> - * Check if the allocated region fits in to start..end window
> - */
> - if (base < start) {
> - memblock_free(base, size);
> - return -ENOMEM;
> - }
> -
> *res_base = base;
> if (nomap)
> return memblock_remove(base, size);
> + else
> + return memblock_reserve(base, size);
> +
> return 0;
> }
>

Your patch solves the issue. \o/

2019-01-22 14:15:40

by Marc Gonzalez

[permalink] [raw]
Subject: Re: kmemleak panic

On 22/01/2019 15:02, Marc Gonzalez wrote:

> On 21/01/2019 18:42, Mike Rapoport wrote:
>
>> If I understood correctly, the trouble comes from no-map range allocated in
>> early_init_dt_alloc_reserved_memory_arch().
>>
>> There's indeed imbalance, because memblock_alloc() does kmemleak_alloc(), but
>> memblock_remove() does not do kmemleak_free().
>>
>> I think the best way is to replace __memblock_alloc_base() with
>> memblock_find_in_range(), e.g something like:
>>
>>
>> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
>> index 1977ee0adcb1..6807a1cffe55 100644
>> --- a/drivers/of/of_reserved_mem.c
>> +++ b/drivers/of/of_reserved_mem.c
>> @@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
>> */
>> end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
>> align = !align ? SMP_CACHE_BYTES : align;
>> - base = __memblock_alloc_base(size, align, end);
>> + base = memblock_find_in_range(size, align, start, end);
>> if (!base)
>> return -ENOMEM;
>>
>> - /*
>> - * Check if the allocated region fits in to start..end window
>> - */
>> - if (base < start) {
>> - memblock_free(base, size);
>> - return -ENOMEM;
>> - }
>> -
>> *res_base = base;
>> if (nomap)
>> return memblock_remove(base, size);
>> + else
>> + return memblock_reserve(base, size);
>> +
>> return 0;
>> }
>>
>
> Your patch solves the issue. \o/

[ Add nvidia devs, but drop [email protected] ]

2019-01-23 05:56:46

by Mike Rapoport

[permalink] [raw]
Subject: Re: kmemleak panic

On Tue, Jan 22, 2019 at 03:12:54PM +0100, Marc Gonzalez wrote:
> On 22/01/2019 15:02, Marc Gonzalez wrote:
>
> > On 21/01/2019 18:42, Mike Rapoport wrote:
> >
> >> If I understood correctly, the trouble comes from no-map range allocated in
> >> early_init_dt_alloc_reserved_memory_arch().
> >>
> >> There's indeed imbalance, because memblock_alloc() does kmemleak_alloc(), but
> >> memblock_remove() does not do kmemleak_free().
> >>
> >> I think the best way is to replace __memblock_alloc_base() with
> >> memblock_find_in_range(), e.g something like:
> >>
> >>
> >> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
> >> index 1977ee0adcb1..6807a1cffe55 100644
> >> --- a/drivers/of/of_reserved_mem.c
> >> +++ b/drivers/of/of_reserved_mem.c
> >> @@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
> >> */
> >> end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
> >> align = !align ? SMP_CACHE_BYTES : align;
> >> - base = __memblock_alloc_base(size, align, end);
> >> + base = memblock_find_in_range(size, align, start, end);
> >> if (!base)
> >> return -ENOMEM;
> >>
> >> - /*
> >> - * Check if the allocated region fits in to start..end window
> >> - */
> >> - if (base < start) {
> >> - memblock_free(base, size);
> >> - return -ENOMEM;
> >> - }
> >> -
> >> *res_base = base;
> >> if (nomap)
> >> return memblock_remove(base, size);
> >> + else
> >> + return memblock_reserve(base, size);
> >> +
> >> return 0;
> >> }
> >>
> >
> > Your patch solves the issue. \o/

Great :)

> [ Add nvidia devs, but drop [email protected] ]
>

Resending it as a formal patch now, I took a liberty to add your Tested-by.

From a847ca684db29a3c09e4dd2a8a008b35cf36e52f Mon Sep 17 00:00:00 2001
From: Mike Rapoport <[email protected]>
Date: Wed, 23 Jan 2019 07:38:50 +0200
Subject: [PATCH] of: fix kmemleak crash caused by imbalance in early memory
reservation

Marc Gonzalez reported the following kmemleak crash:

Unable to handle kernel paging request at virtual address ffffffc021e00000
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803,
pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 6 PID: 523 Comm: kmemleak Tainted: G S W 5.0.0-rc1 #13
Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : scan_block+0x70/0x190
lr : scan_block+0x6c/0x190
sp : ffffff8012e8bd20
x29: ffffff8012e8bd20 x28: ffffffc0fdbaf018
x27: ffffffc022000000 x26: 0000000000000080
x25: ffffff8011aadf70 x24: ffffffc0f8cc8000
x23: ffffff8010dc8000 x22: ffffff8010dc8830
x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
x19: ffffffc021e00000 x18: 0000000000002409
x17: 0000000000000200 x16: 0000000000000000
x15: ffffff8010e14dd8 x14: 0000000000002406
x13: 000000004c4dd0c6 x12: ffffffc0f77dad58
x11: 0000000000000001 x10: ffffff8010d9e688
x9 : ffffff8010d9f000 x8 : ffffff8010d9e688
x7 : 0000000000000002 x6 : 0000000000000000
x5 : ffffff8011511c20 x4 : 00000000000026d1
x3 : ffffff8010e14d88 x2 : 5b36396f4e7d4000
x1 : 0000000000208040 x0 : 0000000000000000
Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
Call trace:
scan_block+0x70/0x190
scan_gray_list+0x108/0x1c0
kmemleak_scan+0x33c/0x7c0
kmemleak_scan_thread+0x98/0xf0
kthread+0x11c/0x120
ret_from_fork+0x10/0x1c
Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
---[ end trace 176d6ed9d86a0c33 ]---
note: kmemleak[523] exited with preempt_count 2

The crash happens when a no-map area is allocated in
early_init_dt_alloc_reserved_memory_arch(). The allocated region is
registered with kmemleak, but it is then removed from memblock using
memblock_remove() that is not kmemleak-aware.

Replacing __memblock_alloc_base() with memblock_find_in_range() makes sure
that the allocated memory is not added to kmemleak and then
memblock_remove()'ing this memory is safe.

As a bonus, since memblock_find_in_range() ensures the allocation in the
specified range, the bounds check can be removed.

Signed-off-by: Mike Rapoport <[email protected]>
Tested-by: Marc Gonzalez <[email protected]>
---
drivers/of/of_reserved_mem.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index 1977ee0adcb1..6807a1cffe55 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
*/
end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
align = !align ? SMP_CACHE_BYTES : align;
- base = __memblock_alloc_base(size, align, end);
+ base = memblock_find_in_range(size, align, start, end);
if (!base)
return -ENOMEM;

- /*
- * Check if the allocated region fits in to start..end window
- */
- if (base < start) {
- memblock_free(base, size);
- return -ENOMEM;
- }
-
*res_base = base;
if (nomap)
return memblock_remove(base, size);
+ else
+ return memblock_reserve(base, size);
+
return 0;
}

--
2.7.4


2019-01-23 07:06:46

by Prateek Patel

[permalink] [raw]
Subject: Re: kmemleak panic


On 1/23/2019 11:24 AM, Mike Rapoport wrote:
> On Tue, Jan 22, 2019 at 03:12:54PM +0100, Marc Gonzalez wrote:
>> On 22/01/2019 15:02, Marc Gonzalez wrote:
>>
>>> On 21/01/2019 18:42, Mike Rapoport wrote:
>>>
>>>> If I understood correctly, the trouble comes from no-map range allocated in
>>>> early_init_dt_alloc_reserved_memory_arch().
>>>>
>>>> There's indeed imbalance, because memblock_alloc() does kmemleak_alloc(), but
>>>> memblock_remove() does not do kmemleak_free().
>>>>
>>>> I think the best way is to replace __memblock_alloc_base() with
>>>> memblock_find_in_range(), e.g something like:
>>>>
>>>>
>>>> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
>>>> index 1977ee0adcb1..6807a1cffe55 100644
>>>> --- a/drivers/of/of_reserved_mem.c
>>>> +++ b/drivers/of/of_reserved_mem.c
>>>> @@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
>>>> */
>>>> end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
>>>> align = !align ? SMP_CACHE_BYTES : align;
>>>> - base = __memblock_alloc_base(size, align, end);
>>>> + base = memblock_find_in_range(size, align, start, end);
>>>> if (!base)
>>>> return -ENOMEM;
>>>>
>>>> - /*
>>>> - * Check if the allocated region fits in to start..end window
>>>> - */
>>>> - if (base < start) {
>>>> - memblock_free(base, size);
>>>> - return -ENOMEM;
>>>> - }
>>>> -
>>>> *res_base = base;
>>>> if (nomap)
>>>> return memblock_remove(base, size);
>>>> + else
>>>> + return memblock_reserve(base, size);
>>>> +
>>>> return 0;
>>>> }
>>>>
>>> Your patch solves the issue. \o/
> Great :)
>
>> [ Add nvidia devs, but drop [email protected] ]
>>
> Resending it as a formal patch now, I took a liberty to add your Tested-by.
>
> From a847ca684db29a3c09e4dd2a8a008b35cf36e52f Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <[email protected]>
> Date: Wed, 23 Jan 2019 07:38:50 +0200
> Subject: [PATCH] of: fix kmemleak crash caused by imbalance in early memory
> reservation
>
> Marc Gonzalez reported the following kmemleak crash:
>
> Unable to handle kernel paging request at virtual address ffffffc021e00000
> Mem abort info:
> ESR = 0x96000006
> Exception class = DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> Data abort info:
> ISV = 0, ISS = 0x00000006
> CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
> [ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803,
> pmd=0000000000000000
> Internal error: Oops: 96000006 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 6 PID: 523 Comm: kmemleak Tainted: G S W 5.0.0-rc1 #13
> Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
> pstate: 80000085 (Nzcv daIf -PAN -UAO)
> pc : scan_block+0x70/0x190
> lr : scan_block+0x6c/0x190
> sp : ffffff8012e8bd20
> x29: ffffff8012e8bd20 x28: ffffffc0fdbaf018
> x27: ffffffc022000000 x26: 0000000000000080
> x25: ffffff8011aadf70 x24: ffffffc0f8cc8000
> x23: ffffff8010dc8000 x22: ffffff8010dc8830
> x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
> x19: ffffffc021e00000 x18: 0000000000002409
> x17: 0000000000000200 x16: 0000000000000000
> x15: ffffff8010e14dd8 x14: 0000000000002406
> x13: 000000004c4dd0c6 x12: ffffffc0f77dad58
> x11: 0000000000000001 x10: ffffff8010d9e688
> x9 : ffffff8010d9f000 x8 : ffffff8010d9e688
> x7 : 0000000000000002 x6 : 0000000000000000
> x5 : ffffff8011511c20 x4 : 00000000000026d1
> x3 : ffffff8010e14d88 x2 : 5b36396f4e7d4000
> x1 : 0000000000208040 x0 : 0000000000000000
> Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
> Call trace:
> scan_block+0x70/0x190
> scan_gray_list+0x108/0x1c0
> kmemleak_scan+0x33c/0x7c0
> kmemleak_scan_thread+0x98/0xf0
> kthread+0x11c/0x120
> ret_from_fork+0x10/0x1c
> Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
> ---[ end trace 176d6ed9d86a0c33 ]---
> note: kmemleak[523] exited with preempt_count 2
>
> The crash happens when a no-map area is allocated in
> early_init_dt_alloc_reserved_memory_arch(). The allocated region is
> registered with kmemleak, but it is then removed from memblock using
> memblock_remove() that is not kmemleak-aware.
>
> Replacing __memblock_alloc_base() with memblock_find_in_range() makes sure
> that the allocated memory is not added to kmemleak and then
> memblock_remove()'ing this memory is safe.
>
> As a bonus, since memblock_find_in_range() ensures the allocation in the
> specified range, the bounds check can be removed.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Marc Gonzalez <[email protected]>
> ---
> drivers/of/of_reserved_mem.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
> index 1977ee0adcb1..6807a1cffe55 100644
> --- a/drivers/of/of_reserved_mem.c
> +++ b/drivers/of/of_reserved_mem.c
> @@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
> */
> end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
> align = !align ? SMP_CACHE_BYTES : align;
> - base = __memblock_alloc_base(size, align, end);
> + base = memblock_find_in_range(size, align, start, end);
> if (!base)
> return -ENOMEM;
>
> - /*
> - * Check if the allocated region fits in to start..end window
> - */
> - if (base < start) {
> - memblock_free(base, size);
> - return -ENOMEM;
> - }
> -
> *res_base = base;
> if (nomap)
> return memblock_remove(base, size);
> + else
> + return memblock_reserve(base, size);
> +
> return 0;
> }
>

Thanks Mike for the patch. With this, skipping kmemleak scan:
https://patchwork.ozlabs.org/patch/995367/ is not required.


2019-01-23 07:29:44

by Marek Szyprowski

[permalink] [raw]
Subject: Re: kmemleak panic

Hi Mike,

On 2019-01-23 06:54, Mike Rapoport wrote:

...
> Resending it as a formal patch now, I took a liberty to add your Tested-by.
>
> >From a847ca684db29a3c09e4dd2a8a008b35cf36e52f Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <[email protected]>
> Date: Wed, 23 Jan 2019 07:38:50 +0200
> Subject: [PATCH] of: fix kmemleak crash caused by imbalance in early memory
> reservation
>
> Marc Gonzalez reported the following kmemleak crash:
>
> Unable to handle kernel paging request at virtual address ffffffc021e00000
> Mem abort info:
> ESR = 0x96000006
> Exception class = DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> Data abort info:
> ISV = 0, ISS = 0x00000006
> CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
> [ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803,
> pmd=0000000000000000
> Internal error: Oops: 96000006 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 6 PID: 523 Comm: kmemleak Tainted: G S W 5.0.0-rc1 #13
> Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
> pstate: 80000085 (Nzcv daIf -PAN -UAO)
> pc : scan_block+0x70/0x190
> lr : scan_block+0x6c/0x190
> sp : ffffff8012e8bd20
> x29: ffffff8012e8bd20 x28: ffffffc0fdbaf018
> x27: ffffffc022000000 x26: 0000000000000080
> x25: ffffff8011aadf70 x24: ffffffc0f8cc8000
> x23: ffffff8010dc8000 x22: ffffff8010dc8830
> x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
> x19: ffffffc021e00000 x18: 0000000000002409
> x17: 0000000000000200 x16: 0000000000000000
> x15: ffffff8010e14dd8 x14: 0000000000002406
> x13: 000000004c4dd0c6 x12: ffffffc0f77dad58
> x11: 0000000000000001 x10: ffffff8010d9e688
> x9 : ffffff8010d9f000 x8 : ffffff8010d9e688
> x7 : 0000000000000002 x6 : 0000000000000000
> x5 : ffffff8011511c20 x4 : 00000000000026d1
> x3 : ffffff8010e14d88 x2 : 5b36396f4e7d4000
> x1 : 0000000000208040 x0 : 0000000000000000
> Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
> Call trace:
> scan_block+0x70/0x190
> scan_gray_list+0x108/0x1c0
> kmemleak_scan+0x33c/0x7c0
> kmemleak_scan_thread+0x98/0xf0
> kthread+0x11c/0x120
> ret_from_fork+0x10/0x1c
> Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
> ---[ end trace 176d6ed9d86a0c33 ]---
> note: kmemleak[523] exited with preempt_count 2
>
> The crash happens when a no-map area is allocated in
> early_init_dt_alloc_reserved_memory_arch(). The allocated region is
> registered with kmemleak, but it is then removed from memblock using
> memblock_remove() that is not kmemleak-aware.
>
> Replacing __memblock_alloc_base() with memblock_find_in_range() makes sure
> that the allocated memory is not added to kmemleak and then
> memblock_remove()'ing this memory is safe.
>
> As a bonus, since memblock_find_in_range() ensures the allocation in the
> specified range, the bounds check can be removed.
>
> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Marc Gonzalez <[email protected]>

With the minor fix mentioned below, you can add my:

Acked-by: Marek Szyprowski <[email protected]>

> ---
> drivers/of/of_reserved_mem.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
> index 1977ee0adcb1..6807a1cffe55 100644
> --- a/drivers/of/of_reserved_mem.c
> +++ b/drivers/of/of_reserved_mem.c
> @@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
> */
> end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
> align = !align ? SMP_CACHE_BYTES : align;
> - base = __memblock_alloc_base(size, align, end);
> + base = memblock_find_in_range(size, align, start, end);

Please remove a comment about __memblock_alloc_base() above that block
of code. It is no longer needed after this change.

> if (!base)
> return -ENOMEM;
>
> - /*
> - * Check if the allocated region fits in to start..end window
> - */
> - if (base < start) {
> - memblock_free(base, size);
> - return -ENOMEM;
> - }
> -
> *res_base = base;
> if (nomap)
> return memblock_remove(base, size);
> + else
> + return memblock_reserve(base, size);
> +
> return 0;
> }
>

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland


2019-01-23 12:33:42

by Mike Rapoport

[permalink] [raw]
Subject: [PATCH v2] of: fix kmemleak crash (was: Re: kmemleak panic)

On Wed, Jan 23, 2019 at 08:28:03AM +0100, Marek Szyprowski wrote:
> Hi Mike,
>
> On 2019-01-23 06:54, Mike Rapoport wrote:
>
> > @@ -37,21 +37,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
> > */
> > end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
> > align = !align ? SMP_CACHE_BYTES : align;
> > - base = __memblock_alloc_base(size, align, end);
> > + base = memblock_find_in_range(size, align, start, end);
>
> Please remove a comment about __memblock_alloc_base() above that block
> of code. It is no longer needed after this change.

Huh, missed that, thanks!
Fixed now.

From 2f340afcc8cc81f1829a19f9c595a9995656e547 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <[email protected]>
Date: Wed, 23 Jan 2019 07:38:50 +0200
Subject: [PATCH v2] of: fix kmemleak crash caused by imbalance in early memory
reservation

Marc Gonzalez reported the following kmemleak crash:

Unable to handle kernel paging request at virtual address ffffffc021e00000
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803,
pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 6 PID: 523 Comm: kmemleak Tainted: G S W 5.0.0-rc1 #13
Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : scan_block+0x70/0x190
lr : scan_block+0x6c/0x190
sp : ffffff8012e8bd20
x29: ffffff8012e8bd20 x28: ffffffc0fdbaf018
x27: ffffffc022000000 x26: 0000000000000080
x25: ffffff8011aadf70 x24: ffffffc0f8cc8000
x23: ffffff8010dc8000 x22: ffffff8010dc8830
x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
x19: ffffffc021e00000 x18: 0000000000002409
x17: 0000000000000200 x16: 0000000000000000
x15: ffffff8010e14dd8 x14: 0000000000002406
x13: 000000004c4dd0c6 x12: ffffffc0f77dad58
x11: 0000000000000001 x10: ffffff8010d9e688
x9 : ffffff8010d9f000 x8 : ffffff8010d9e688
x7 : 0000000000000002 x6 : 0000000000000000
x5 : ffffff8011511c20 x4 : 00000000000026d1
x3 : ffffff8010e14d88 x2 : 5b36396f4e7d4000
x1 : 0000000000208040 x0 : 0000000000000000
Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
Call trace:
scan_block+0x70/0x190
scan_gray_list+0x108/0x1c0
kmemleak_scan+0x33c/0x7c0
kmemleak_scan_thread+0x98/0xf0
kthread+0x11c/0x120
ret_from_fork+0x10/0x1c
Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
---[ end trace 176d6ed9d86a0c33 ]---
note: kmemleak[523] exited with preempt_count 2

The crash happens when a no-map area is allocated in
early_init_dt_alloc_reserved_memory_arch(). The allocated region is
registered with kmemleak, but it is then removed from memblock using
memblock_remove() that is not kmemleak-aware.

Replacing __memblock_alloc_base() with memblock_find_in_range() makes sure
that the allocated memory is not added to kmemleak and then
memblock_remove()'ing this memory is safe.

As a bonus, since memblock_find_in_range() ensures the allocation in the
specified range, the bounds check can be removed.

Signed-off-by: Mike Rapoport <[email protected]>
Tested-by: Marc Gonzalez <[email protected]>
Acked-by: Marek Szyprowski <[email protected]>
---
drivers/of/of_reserved_mem.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index 1977ee0adcb1..2ae81604ffef 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -31,27 +31,19 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
phys_addr_t *res_base)
{
phys_addr_t base;
- /*
- * We use __memblock_alloc_base() because memblock_alloc_base()
- * panic()s on allocation failure.
- */
+
end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
align = !align ? SMP_CACHE_BYTES : align;
- base = __memblock_alloc_base(size, align, end);
+ base = memblock_find_in_range(size, align, start, end);
if (!base)
return -ENOMEM;

- /*
- * Check if the allocated region fits in to start..end window
- */
- if (base < start) {
- memblock_free(base, size);
- return -ENOMEM;
- }
-
*res_base = base;
if (nomap)
return memblock_remove(base, size);
+ else
+ return memblock_reserve(base, size);
+
return 0;
}

--
2.7.4




2019-02-01 16:25:34

by Marc Gonzalez

[permalink] [raw]
Subject: Re: [PATCH v2] of: fix kmemleak crash

On 23/01/2019 13:31, Mike Rapoport wrote:

> Signed-off-by: Mike Rapoport <[email protected]>
> Tested-by: Marc Gonzalez <[email protected]>
> Acked-by: Marek Szyprowski <[email protected]>
> ---
> drivers/of/of_reserved_mem.c | 18 +++++-------------
> 1 file changed, 5 insertions(+), 13 deletions(-)

Thanks for the patch, Mike.

Whose tree should this patch go through?

Regards.

2019-02-04 10:13:06

by Marc Gonzalez

[permalink] [raw]
Subject: Re: [PATCH v2] of: fix kmemleak crash

+ GKH

On 01/02/2019 17:23, Marc Gonzalez wrote:

> On 23/01/2019 13:31, Mike Rapoport wrote:
>
>> Signed-off-by: Mike Rapoport <[email protected]>
>> Tested-by: Marc Gonzalez <[email protected]>
>> Acked-by: Marek Szyprowski <[email protected]>
>> ---
>> drivers/of/of_reserved_mem.c | 18 +++++-------------
>> 1 file changed, 5 insertions(+), 13 deletions(-)
>
> Thanks for the patch, Mike.
>
> Whose tree should this patch go through?

By the way, I think we can add

Acked-by: Prateek Patel <[email protected]>
Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
Cc: [email protected] # 3.15+

Regards.

2019-02-04 14:46:56

by Marc Gonzalez

[permalink] [raw]
Subject: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

From: Mike Rapoport <[email protected]>

Marc Gonzalez reported the following kmemleak crash:

Unable to handle kernel paging request at virtual address ffffffc021e00000
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803,
pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 6 PID: 523 Comm: kmemleak Tainted: G S W 5.0.0-rc1 #13
Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : scan_block+0x70/0x190
lr : scan_block+0x6c/0x190
sp : ffffff8012e8bd20
x29: ffffff8012e8bd20 x28: ffffffc0fdbaf018
x27: ffffffc022000000 x26: 0000000000000080
x25: ffffff8011aadf70 x24: ffffffc0f8cc8000
x23: ffffff8010dc8000 x22: ffffff8010dc8830
x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
x19: ffffffc021e00000 x18: 0000000000002409
x17: 0000000000000200 x16: 0000000000000000
x15: ffffff8010e14dd8 x14: 0000000000002406
x13: 000000004c4dd0c6 x12: ffffffc0f77dad58
x11: 0000000000000001 x10: ffffff8010d9e688
x9 : ffffff8010d9f000 x8 : ffffff8010d9e688
x7 : 0000000000000002 x6 : 0000000000000000
x5 : ffffff8011511c20 x4 : 00000000000026d1
x3 : ffffff8010e14d88 x2 : 5b36396f4e7d4000
x1 : 0000000000208040 x0 : 0000000000000000
Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
Call trace:
scan_block+0x70/0x190
scan_gray_list+0x108/0x1c0
kmemleak_scan+0x33c/0x7c0
kmemleak_scan_thread+0x98/0xf0
kthread+0x11c/0x120
ret_from_fork+0x10/0x1c
Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
---[ end trace 176d6ed9d86a0c33 ]---
note: kmemleak[523] exited with preempt_count 2

The crash happens when a no-map area is allocated in
early_init_dt_alloc_reserved_memory_arch(). The allocated region is
registered with kmemleak, but it is then removed from memblock using
memblock_remove() that is not kmemleak-aware.

Replacing __memblock_alloc_base() with memblock_find_in_range() makes sure
that the allocated memory is not added to kmemleak and then
memblock_remove()'ing this memory is safe.

As a bonus, since memblock_find_in_range() ensures the allocation in the
specified range, the bounds check can be removed.

Cc: [email protected] # 3.15+
Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
Acked-by: Marek Szyprowski <[email protected]>
Acked-by: Prateek Patel <[email protected]>
Tested-by: Marc Gonzalez <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
---
Resend with DT CCed to reach robh's patch queue
I added CC: stable, Fixes, and Prateek's ack
Trim recipients list to minimize inconvenience
---
drivers/of/of_reserved_mem.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index 1977ee0adcb1..2ae81604ffef 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -31,27 +31,19 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
phys_addr_t *res_base)
{
phys_addr_t base;
- /*
- * We use __memblock_alloc_base() because memblock_alloc_base()
- * panic()s on allocation failure.
- */
+
end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
align = !align ? SMP_CACHE_BYTES : align;
- base = __memblock_alloc_base(size, align, end);
+ base = memblock_find_in_range(size, align, start, end);
if (!base)
return -ENOMEM;

- /*
- * Check if the allocated region fits in to start..end window
- */
- if (base < start) {
- memblock_free(base, size);
- return -ENOMEM;
- }
-
*res_base = base;
if (nomap)
return memblock_remove(base, size);
+ else
+ return memblock_reserve(base, size);
+
return 0;
}

--
2.7.4

2019-02-04 17:11:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v2] of: fix kmemleak crash

On Mon, Feb 04, 2019 at 11:10:39AM +0100, Marc Gonzalez wrote:
> + GKH
>
> On 01/02/2019 17:23, Marc Gonzalez wrote:
>
> > On 23/01/2019 13:31, Mike Rapoport wrote:
> >
> >> Signed-off-by: Mike Rapoport <[email protected]>
> >> Tested-by: Marc Gonzalez <[email protected]>
> >> Acked-by: Marek Szyprowski <[email protected]>
> >> ---
> >> drivers/of/of_reserved_mem.c | 18 +++++-------------
> >> 1 file changed, 5 insertions(+), 13 deletions(-)
> >
> > Thanks for the patch, Mike.
> >
> > Whose tree should this patch go through?
>
> By the way, I think we can add
>
> Acked-by: Prateek Patel <[email protected]>
> Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> Cc: [email protected] # 3.15+

What am I supposed to do with this?

confused,

greg k-h

2019-02-04 17:23:56

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH v2] of: fix kmemleak crash

On Mon, Feb 4, 2019 at 9:25 AM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Mon, Feb 04, 2019 at 11:10:39AM +0100, Marc Gonzalez wrote:
> > + GKH
> >
> > On 01/02/2019 17:23, Marc Gonzalez wrote:
> >
> > > On 23/01/2019 13:31, Mike Rapoport wrote:
> > >
> > >> Signed-off-by: Mike Rapoport <[email protected]>
> > >> Tested-by: Marc Gonzalez <[email protected]>
> > >> Acked-by: Marek Szyprowski <[email protected]>
> > >> ---
> > >> drivers/of/of_reserved_mem.c | 18 +++++-------------
> > >> 1 file changed, 5 insertions(+), 13 deletions(-)
> > >
> > > Thanks for the patch, Mike.
> > >
> > > Whose tree should this patch go through?
> >
> > By the way, I think we can add
> >
> > Acked-by: Prateek Patel <[email protected]>
> > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > Cc: [email protected] # 3.15+
>
> What am I supposed to do with this?

Nothing til I apply it.

Rob

2019-02-11 20:27:16

by Marc Gonzalez

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On 04/02/2019 15:37, Marc Gonzalez wrote:

> Cc: [email protected] # 3.15+
> Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> Acked-by: Marek Szyprowski <[email protected]>
> Acked-by: Prateek Patel <[email protected]>
> Tested-by: Marc Gonzalez <[email protected]>
> Signed-off-by: Mike Rapoport <[email protected]>
> ---
> Resend with DT CCed to reach robh's patch queue
> I added CC: stable, Fixes, and Prateek's ack
> Trim recipients list to minimize inconvenience

Mike, Stephen,

I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
"memblock: drop __memblock_alloc_base()"

It's definitely going to conflict with the proposed patch
over drivers/of/of_reserved_mem.c

Rob, what's the next step then?

Regards.

2019-02-12 16:06:42

by Rob Herring

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On Mon, Feb 11, 2019 at 10:47 AM Marc Gonzalez <[email protected]> wrote:
>
> On 04/02/2019 15:37, Marc Gonzalez wrote:
>
> > Cc: [email protected] # 3.15+
> > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > Acked-by: Marek Szyprowski <[email protected]>
> > Acked-by: Prateek Patel <[email protected]>
> > Tested-by: Marc Gonzalez <[email protected]>
> > Signed-off-by: Mike Rapoport <[email protected]>
> > ---
> > Resend with DT CCed to reach robh's patch queue
> > I added CC: stable, Fixes, and Prateek's ack
> > Trim recipients list to minimize inconvenience
>
> Mike, Stephen,
>
> I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
> "memblock: drop __memblock_alloc_base()"
>
> It's definitely going to conflict with the proposed patch
> over drivers/of/of_reserved_mem.c
>
> Rob, what's the next step then?

Rebase it on top of what's in linux-next and apply it to the tree
which has the above dependency. I'm guessing that is Andrew Morton's
tree.

Rob

2019-02-12 22:27:24

by Stephen Rothwell

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

Hi all,

On Tue, 12 Feb 2019 10:03:09 -0600 Rob Herring <[email protected]> wrote:
>
> On Mon, Feb 11, 2019 at 10:47 AM Marc Gonzalez <[email protected]> wrote:
> >
> > On 04/02/2019 15:37, Marc Gonzalez wrote:
> >
> > > Cc: [email protected] # 3.15+
> > > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > > Acked-by: Marek Szyprowski <[email protected]>
> > > Acked-by: Prateek Patel <[email protected]>
> > > Tested-by: Marc Gonzalez <[email protected]>
> > > Signed-off-by: Mike Rapoport <[email protected]>
> > > ---
> > > Resend with DT CCed to reach robh's patch queue
> > > I added CC: stable, Fixes, and Prateek's ack
> > > Trim recipients list to minimize inconvenience
> >
> > I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
> > "memblock: drop __memblock_alloc_base()"
> >
> > It's definitely going to conflict with the proposed patch
> > over drivers/of/of_reserved_mem.c
> >
> > Rob, what's the next step then?
>
> Rebase it on top of what's in linux-next and apply it to the tree
> which has the above dependency. I'm guessing that is Andrew Morton's
> tree.

Yeah, that is in Andrew's "post linux-next" patch series, so if you
rebase it on top of linux-next and then send it to Andrew with some
explanation.

...

Actually, if it is intended for the stable trees, then presumably it is
intended to go to Linus for the current release? In which case, the
patch in Andrew's tree will have to be changed to cope after your patch
appears in Linus' tree (and therefore, linux-next).

--
Cheers,
Stephen Rothwell


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2019-02-12 22:27:45

by Andrew Morton

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On Wed, 13 Feb 2019 08:50:28 +1100 Stephen Rothwell <[email protected]> wrote:

> > > It's definitely going to conflict with the proposed patch
> > > over drivers/of/of_reserved_mem.c
> > >
> > > Rob, what's the next step then?
> >
> > Rebase it on top of what's in linux-next and apply it to the tree
> > which has the above dependency. I'm guessing that is Andrew Morton's
> > tree.
>
> Yeah, that is in Andrew's "post linux-next" patch series, so if you
> rebase it on top of linux-next and then send it to Andrew with some
> explanation.
>
> ...
>
> Actually, if it is intended for the stable trees, then presumably it is
> intended to go to Linus for the current release? In which case, the
> patch in Andrew's tree will have to be changed to cope after your patch
> appears in Linus' tree (and therefore, linux-next).

Yup, just do whatever needs to be done and I'll figure it out ;)

2019-02-12 22:29:55

by Rob Herring

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On Tue, Feb 12, 2019 at 3:50 PM Stephen Rothwell <[email protected]> wrote:
>
> Hi all,
>
> On Tue, 12 Feb 2019 10:03:09 -0600 Rob Herring <[email protected]> wrote:
> >
> > On Mon, Feb 11, 2019 at 10:47 AM Marc Gonzalez <[email protected]> wrote:
> > >
> > > On 04/02/2019 15:37, Marc Gonzalez wrote:
> > >
> > > > Cc: [email protected] # 3.15+
> > > > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > > > Acked-by: Marek Szyprowski <[email protected]>
> > > > Acked-by: Prateek Patel <[email protected]>
> > > > Tested-by: Marc Gonzalez <[email protected]>
> > > > Signed-off-by: Mike Rapoport <[email protected]>
> > > > ---
> > > > Resend with DT CCed to reach robh's patch queue
> > > > I added CC: stable, Fixes, and Prateek's ack
> > > > Trim recipients list to minimize inconvenience
> > >
> > > I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
> > > "memblock: drop __memblock_alloc_base()"
> > >
> > > It's definitely going to conflict with the proposed patch
> > > over drivers/of/of_reserved_mem.c
> > >
> > > Rob, what's the next step then?
> >
> > Rebase it on top of what's in linux-next and apply it to the tree
> > which has the above dependency. I'm guessing that is Andrew Morton's
> > tree.
>
> Yeah, that is in Andrew's "post linux-next" patch series, so if you
> rebase it on top of linux-next and then send it to Andrew with some
> explanation.
>
> ...
>
> Actually, if it is intended for the stable trees, then presumably it is
> intended to go to Linus for the current release? In which case, the
> patch in Andrew's tree will have to be changed to cope after your patch
> appears in Linus' tree (and therefore, linux-next).

At this point in the cycle, I wasn't planning to send this for 5.0.
It's not fixing something introduced in 5.0 and it is a debug feature.

Rob

2019-02-13 09:52:04

by Mike Rapoport

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On Tue, Feb 12, 2019 at 04:12:24PM -0600, Rob Herring wrote:
> On Tue, Feb 12, 2019 at 3:50 PM Stephen Rothwell <[email protected]> wrote:
> >
> > Hi all,
> >
> > On Tue, 12 Feb 2019 10:03:09 -0600 Rob Herring <[email protected]> wrote:
> > >
> > > On Mon, Feb 11, 2019 at 10:47 AM Marc Gonzalez <[email protected]> wrote:
> > > >
> > > > On 04/02/2019 15:37, Marc Gonzalez wrote:
> > > >
> > > > > Cc: [email protected] # 3.15+
> > > > > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > > > > Acked-by: Marek Szyprowski <[email protected]>
> > > > > Acked-by: Prateek Patel <[email protected]>
> > > > > Tested-by: Marc Gonzalez <[email protected]>
> > > > > Signed-off-by: Mike Rapoport <[email protected]>
> > > > > ---
> > > > > Resend with DT CCed to reach robh's patch queue
> > > > > I added CC: stable, Fixes, and Prateek's ack
> > > > > Trim recipients list to minimize inconvenience
> > > >
> > > > I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
> > > > "memblock: drop __memblock_alloc_base()"
> > > >
> > > > It's definitely going to conflict with the proposed patch
> > > > over drivers/of/of_reserved_mem.c
> > > >
> > > > Rob, what's the next step then?
> > >
> > > Rebase it on top of what's in linux-next and apply it to the tree
> > > which has the above dependency. I'm guessing that is Andrew Morton's
> > > tree.
> >
> > Yeah, that is in Andrew's "post linux-next" patch series, so if you
> > rebase it on top of linux-next and then send it to Andrew with some
> > explanation.
> >
> > ...
> >
> > Actually, if it is intended for the stable trees, then presumably it is
> > intended to go to Linus for the current release? In which case, the
> > patch in Andrew's tree will have to be changed to cope after your patch
> > appears in Linus' tree (and therefore, linux-next).
>
> At this point in the cycle, I wasn't planning to send this for 5.0.
> It's not fixing something introduced in 5.0 and it is a debug feature.

Below is the version vs. current mmotm.

From 9ea6dceb46067d4f1cbbdbec1189c8496aa0a4bc Mon Sep 17 00:00:00 2001
From: Mike Rapoport <[email protected]>
Date: Mon, 4 Feb 2019 15:37:21 +0100
Subject: [PATCH] of: fix kmemleak crash caused by imbalance in early memory
reservation

Marc Gonzalez reported the following kmemleak crash:

Unable to handle kernel paging request at virtual address ffffffc021e00000
Mem abort info:
ESR = 0x96000006
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803,
pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 6 PID: 523 Comm: kmemleak Tainted: G S W 5.0.0-rc1 #13
Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
pstate: 80000085 (Nzcv daIf -PAN -UAO)
pc : scan_block+0x70/0x190
lr : scan_block+0x6c/0x190
sp : ffffff8012e8bd20
x29: ffffff8012e8bd20 x28: ffffffc0fdbaf018
x27: ffffffc022000000 x26: 0000000000000080
x25: ffffff8011aadf70 x24: ffffffc0f8cc8000
x23: ffffff8010dc8000 x22: ffffff8010dc8830
x21: ffffffc021e00ff9 x20: ffffffc0f8cc8050
x19: ffffffc021e00000 x18: 0000000000002409
x17: 0000000000000200 x16: 0000000000000000
x15: ffffff8010e14dd8 x14: 0000000000002406
x13: 000000004c4dd0c6 x12: ffffffc0f77dad58
x11: 0000000000000001 x10: ffffff8010d9e688
x9 : ffffff8010d9f000 x8 : ffffff8010d9e688
x7 : 0000000000000002 x6 : 0000000000000000
x5 : ffffff8011511c20 x4 : 00000000000026d1
x3 : ffffff8010e14d88 x2 : 5b36396f4e7d4000
x1 : 0000000000208040 x0 : 0000000000000000
Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
Call trace:
scan_block+0x70/0x190
scan_gray_list+0x108/0x1c0
kmemleak_scan+0x33c/0x7c0
kmemleak_scan_thread+0x98/0xf0
kthread+0x11c/0x120
ret_from_fork+0x10/0x1c
Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
---[ end trace 176d6ed9d86a0c33 ]---
note: kmemleak[523] exited with preempt_count 2

The crash happens when a no-map area is allocated in
early_init_dt_alloc_reserved_memory_arch(). The allocated region is
registered with kmemleak, but it is then removed from memblock using
memblock_remove() that is not kmemleak-aware.

Replacing __memblock_alloc_base() with memblock_find_in_range() makes sure
that the allocated memory is not added to kmemleak and then
memblock_remove()'ing this memory is safe.

As a bonus, since memblock_find_in_range() ensures the allocation in the
specified range, the bounds check can be removed.

Cc: [email protected] # 3.15+
Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
Acked-by: Marek Szyprowski <[email protected]>
Acked-by: Prateek Patel <[email protected]>
Tested-by: Marc Gonzalez <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
---
drivers/of/of_reserved_mem.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
index 78aa9eb..47971ab 100644
--- a/drivers/of/of_reserved_mem.c
+++ b/drivers/of/of_reserved_mem.c
@@ -34,21 +34,16 @@ int __init __weak early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,

end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
align = !align ? SMP_CACHE_BYTES : align;
- base = memblock_phys_alloc_range(size, align, 0, end);
+ base = memblock_find_in_range(size, align, start, end);
if (!base)
return -ENOMEM;

- /*
- * Check if the allocated region fits in to start..end window
- */
- if (base < start) {
- memblock_free(base, size);
- return -ENOMEM;
- }
-
*res_base = base;
if (nomap)
return memblock_remove(base, size);
+ else
+ return memblock_reserve(base, size);
+
return 0;
}

--
2.7.4


2019-02-13 13:31:44

by Marc Gonzalez

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On 13/02/2019 07:57, Mike Rapoport wrote:

> Below is the version vs. current mmotm.
>
> From 9ea6dceb46067d4f1cbbdbec1189c8496aa0a4bc Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <[email protected]>
> Date: Mon, 4 Feb 2019 15:37:21 +0100
> Subject: [PATCH] of: fix kmemleak crash caused by imbalance in early memory reservation

Out of curiosity, why don't you send as a proper v3?

> Marc Gonzalez reported the following kmemleak crash:
>
> [...]
>
> The crash happens when a no-map area is allocated in
> early_init_dt_alloc_reserved_memory_arch(). The allocated region is
> registered with kmemleak, but it is then removed from memblock using
> memblock_remove() that is not kmemleak-aware.
>
> Replacing __memblock_alloc_base() with memblock_find_in_range()

Nit: in this new version, you're replacing memblock_phys_alloc_range()
with memblock_find_in_range() so I don't know if the comment still
applies.

> makes sure that the allocated memory is not added to kmemleak and then
> memblock_remove()'ing this memory is safe.
>
> As a bonus, since memblock_find_in_range() ensures the allocation in the
> specified range, the bounds check can be removed.
>
> Cc: [email protected] # 3.15+
> Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> Acked-by: Marek Szyprowski <[email protected]>
> Acked-by: Prateek Patel <[email protected]>
> Tested-by: Marc Gonzalez <[email protected]>
> Signed-off-by: Mike Rapoport <[email protected]>

2019-02-13 16:36:24

by Mike Rapoport

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On Wed, Feb 13, 2019 at 10:27:48AM +0100, Marc Gonzalez wrote:
> On 13/02/2019 07:57, Mike Rapoport wrote:
>
> > Below is the version vs. current mmotm.
> >
> > From 9ea6dceb46067d4f1cbbdbec1189c8496aa0a4bc Mon Sep 17 00:00:00 2001
> > From: Mike Rapoport <[email protected]>
> > Date: Mon, 4 Feb 2019 15:37:21 +0100
> > Subject: [PATCH] of: fix kmemleak crash caused by imbalance in early memory reservation
>
> Out of curiosity, why don't you send as a proper v3?

Was too much in a hurry

> > Marc Gonzalez reported the following kmemleak crash:
> >
> > [...]
> >
> > The crash happens when a no-map area is allocated in
> > early_init_dt_alloc_reserved_memory_arch(). The allocated region is
> > registered with kmemleak, but it is then removed from memblock using
> > memblock_remove() that is not kmemleak-aware.
> >
> > Replacing __memblock_alloc_base() with memblock_find_in_range()
>
> Nit: in this new version, you're replacing memblock_phys_alloc_range()
> with memblock_find_in_range() so I don't know if the comment still
> applies.

and didn't check the outcome of blindly applying the patch :(

I'll send a proper v3 soon. Sorry for the noise.

> > makes sure that the allocated memory is not added to kmemleak and then
> > memblock_remove()'ing this memory is safe.
> >
> > As a bonus, since memblock_find_in_range() ensures the allocation in the
> > specified range, the bounds check can be removed.
> >
> > Cc: [email protected] # 3.15+
> > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > Acked-by: Marek Szyprowski <[email protected]>
> > Acked-by: Prateek Patel <[email protected]>
> > Tested-by: Marc Gonzalez <[email protected]>
> > Signed-off-by: Mike Rapoport <[email protected]>
>

--
Sincerely yours,
Mike.


2019-03-06 02:42:11

by Guenter Roeck

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On Tue, Feb 12, 2019 at 04:12:24PM -0600, Rob Herring wrote:
> On Tue, Feb 12, 2019 at 3:50 PM Stephen Rothwell <[email protected]> wrote:
> >
> > Hi all,
> >
> > On Tue, 12 Feb 2019 10:03:09 -0600 Rob Herring <[email protected]> wrote:
> > >
> > > On Mon, Feb 11, 2019 at 10:47 AM Marc Gonzalez <[email protected]> wrote:
> > > >
> > > > On 04/02/2019 15:37, Marc Gonzalez wrote:
> > > >
> > > > > Cc: [email protected] # 3.15+
> > > > > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > > > > Acked-by: Marek Szyprowski <[email protected]>
> > > > > Acked-by: Prateek Patel <[email protected]>
> > > > > Tested-by: Marc Gonzalez <[email protected]>
> > > > > Signed-off-by: Mike Rapoport <[email protected]>
> > > > > ---
> > > > > Resend with DT CCed to reach robh's patch queue
> > > > > I added CC: stable, Fixes, and Prateek's ack
> > > > > Trim recipients list to minimize inconvenience
> > > >
> > > > I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
> > > > "memblock: drop __memblock_alloc_base()"
> > > >
> > > > It's definitely going to conflict with the proposed patch
> > > > over drivers/of/of_reserved_mem.c
> > > >
> > > > Rob, what's the next step then?
> > >
> > > Rebase it on top of what's in linux-next and apply it to the tree
> > > which has the above dependency. I'm guessing that is Andrew Morton's
> > > tree.
> >
> > Yeah, that is in Andrew's "post linux-next" patch series, so if you
> > rebase it on top of linux-next and then send it to Andrew with some
> > explanation.
> >
> > ...
> >
> > Actually, if it is intended for the stable trees, then presumably it is
> > intended to go to Linus for the current release? In which case, the
> > patch in Andrew's tree will have to be changed to cope after your patch
> > appears in Linus' tree (and therefore, linux-next).
>
> At this point in the cycle, I wasn't planning to send this for 5.0.
> It's not fixing something introduced in 5.0 and it is a debug feature.
>
Hi Rob,

this may be a debug feature, but we do test our kernels with it enabled,
and the problem does affect our 4.19 branch (chromeos-4.19). Are you
suggesting that we should backport the fix into our branch and not send
the backport to -stable ?

No problem, just trying to avoid wasting our time.

Thanks,
Guenter

2019-03-06 15:51:34

by Rob Herring

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On Tue, Mar 5, 2019 at 8:12 PM Guenter Roeck <[email protected]> wrote:
>
> On Tue, Feb 12, 2019 at 04:12:24PM -0600, Rob Herring wrote:
> > On Tue, Feb 12, 2019 at 3:50 PM Stephen Rothwell <[email protected]> wrote:
> > >
> > > Hi all,
> > >
> > > On Tue, 12 Feb 2019 10:03:09 -0600 Rob Herring <[email protected]> wrote:
> > > >
> > > > On Mon, Feb 11, 2019 at 10:47 AM Marc Gonzalez <[email protected]> wrote:
> > > > >
> > > > > On 04/02/2019 15:37, Marc Gonzalez wrote:
> > > > >
> > > > > > Cc: [email protected] # 3.15+
> > > > > > Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
> > > > > > Acked-by: Marek Szyprowski <[email protected]>
> > > > > > Acked-by: Prateek Patel <[email protected]>
> > > > > > Tested-by: Marc Gonzalez <[email protected]>
> > > > > > Signed-off-by: Mike Rapoport <[email protected]>
> > > > > > ---
> > > > > > Resend with DT CCed to reach robh's patch queue
> > > > > > I added CC: stable, Fixes, and Prateek's ack
> > > > > > Trim recipients list to minimize inconvenience
> > > > >
> > > > > I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
> > > > > "memblock: drop __memblock_alloc_base()"
> > > > >
> > > > > It's definitely going to conflict with the proposed patch
> > > > > over drivers/of/of_reserved_mem.c
> > > > >
> > > > > Rob, what's the next step then?
> > > >
> > > > Rebase it on top of what's in linux-next and apply it to the tree
> > > > which has the above dependency. I'm guessing that is Andrew Morton's
> > > > tree.
> > >
> > > Yeah, that is in Andrew's "post linux-next" patch series, so if you
> > > rebase it on top of linux-next and then send it to Andrew with some
> > > explanation.
> > >
> > > ...
> > >
> > > Actually, if it is intended for the stable trees, then presumably it is
> > > intended to go to Linus for the current release? In which case, the
> > > patch in Andrew's tree will have to be changed to cope after your patch
> > > appears in Linus' tree (and therefore, linux-next).
> >
> > At this point in the cycle, I wasn't planning to send this for 5.0.
> > It's not fixing something introduced in 5.0 and it is a debug feature.
> >
> Hi Rob,
>
> this may be a debug feature, but we do test our kernels with it enabled,
> and the problem does affect our 4.19 branch (chromeos-4.19). Are you
> suggesting that we should backport the fix into our branch and not send
> the backport to -stable ?

No, not at all. Just that I wasn't going to add it to the probable
last 5.0-rc and would wait.

However, it's complicated by other memblock changes in 5.1 and not a
trivial backport. One of the versions on the list should be easier to
backport than what's in mainline (or going to be).

Rob

2019-03-06 19:05:44

by Guenter Roeck

[permalink] [raw]
Subject: Re: [RESEND PATCH v2] of: fix kmemleak crash caused by imbalance in early memory reservation

On 3/6/19 5:39 AM, Rob Herring wrote:
> On Tue, Mar 5, 2019 at 8:12 PM Guenter Roeck <[email protected]> wrote:
>>
>> On Tue, Feb 12, 2019 at 04:12:24PM -0600, Rob Herring wrote:
>>> On Tue, Feb 12, 2019 at 3:50 PM Stephen Rothwell <[email protected]> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> On Tue, 12 Feb 2019 10:03:09 -0600 Rob Herring <[email protected]> wrote:
>>>>>
>>>>> On Mon, Feb 11, 2019 at 10:47 AM Marc Gonzalez <[email protected]> wrote:
>>>>>>
>>>>>> On 04/02/2019 15:37, Marc Gonzalez wrote:
>>>>>>
>>>>>>> Cc: [email protected] # 3.15+
>>>>>>> Fixes: 3f0c820664483 ("drivers: of: add initialization code for dynamic reserved memory")
>>>>>>> Acked-by: Marek Szyprowski <[email protected]>
>>>>>>> Acked-by: Prateek Patel <[email protected]>
>>>>>>> Tested-by: Marc Gonzalez <[email protected]>
>>>>>>> Signed-off-by: Mike Rapoport <[email protected]>
>>>>>>> ---
>>>>>>> Resend with DT CCed to reach robh's patch queue
>>>>>>> I added CC: stable, Fixes, and Prateek's ack
>>>>>>> Trim recipients list to minimize inconvenience
>>>>>>
>>>>>> I'm confused over commit 3532b3b554a216f30edb841d29eef48521bdc592 in linux-next
>>>>>> "memblock: drop __memblock_alloc_base()"
>>>>>>
>>>>>> It's definitely going to conflict with the proposed patch
>>>>>> over drivers/of/of_reserved_mem.c
>>>>>>
>>>>>> Rob, what's the next step then?
>>>>>
>>>>> Rebase it on top of what's in linux-next and apply it to the tree
>>>>> which has the above dependency. I'm guessing that is Andrew Morton's
>>>>> tree.
>>>>
>>>> Yeah, that is in Andrew's "post linux-next" patch series, so if you
>>>> rebase it on top of linux-next and then send it to Andrew with some
>>>> explanation.
>>>>
>>>> ...
>>>>
>>>> Actually, if it is intended for the stable trees, then presumably it is
>>>> intended to go to Linus for the current release? In which case, the
>>>> patch in Andrew's tree will have to be changed to cope after your patch
>>>> appears in Linus' tree (and therefore, linux-next).
>>>
>>> At this point in the cycle, I wasn't planning to send this for 5.0.
>>> It's not fixing something introduced in 5.0 and it is a debug feature.
>>>
>> Hi Rob,
>>
>> this may be a debug feature, but we do test our kernels with it enabled,
>> and the problem does affect our 4.19 branch (chromeos-4.19). Are you
>> suggesting that we should backport the fix into our branch and not send
>> the backport to -stable ?
>
> No, not at all. Just that I wasn't going to add it to the probable
> last 5.0-rc and would wait.
>
> However, it's complicated by other memblock changes in 5.1 and not a
> trivial backport. One of the versions on the list should be easier to
> backport than what's in mainline (or going to be).
>

We went ahead and applied a backport of an older version of the patch series
to chromeos-4.19. We'll see how well that works, but so far it looks like
it fixes our problem.

Guenter