2019-12-19 17:35:40

by Scott Cheloha

[permalink] [raw]
Subject: Re: [PATCH v3] drivers/base/memory.c: cache blocks in radix tree to accelerate lookup

On Wed, Dec 18, 2019 at 10:00:49AM +0100, David Hildenbrand wrote:
> On 17.12.19 20:32, Scott Cheloha wrote:
> > Searching for a particular memory block by id is slow because each block
> > device is kept in an unsorted linked list on the subsystem bus.
> >
> > Lookup is much faster if we cache the blocks in a radix tree. Memory
> > subsystem initialization and hotplug/hotunplug is at least a little faster
> > for any machine with more than ~100 blocks, and the speedup grows with
> > the block count.
> >
> > Signed-off-by: Scott Cheloha <[email protected]>
> > Acked-by: David Hildenbrand <[email protected]>
> > ---
> > v2 incorporates suggestions from David Hildenbrand.
> >
> > v3 changes:
> > - Rebase atop "drivers/base/memory.c: drop the mem_sysfs_mutex"
> >
> > - Be conservative: don't use radix_tree_for_each_slot() in
> > walk_memory_blocks() yet. It introduces RCU which could
> > change behavior. Walking the tree "by hand" with
> > find_memory_block_by_id() is slower but keeps the patch
> > simple.
>
> Fine with me (splitting it out, e.g., into an addon patch), however, as
> readers/writers run mutually exclusive, there is nothing to worry about
> here. RCU will not make a difference.

Oh. In that case, can you make heads or tails of this CI failure
email I got about the v2 patch? I've inlined it at the end of this
mail. "Suspicious RCU usage". Unclear if it's a false positive. My
thinking was that I'd hold off on using radix_tree_for_each_slot() and
introducing a possible regression until I understood what was
triggering the robot.

Also, is there anyone else I should shop this patch to? I copied the
maintainers reported by scripts/get_maintainer.pl but are there others
who might be interested?

--

Here's that CI mail. I've stripped the attached config because it's
huge.

Date: Sun, 1 Dec 2019 23:05:23 +0800
From: kernel test robot <[email protected]>
To: Scott Cheloha <[email protected]>
Cc: 0day robot <[email protected]>, LKP <[email protected]>
Subject: 86dc301f7b ("drivers/base/memory.c: cache blocks in radix tree .."):
[ 1.341517] WARNING: suspicious RCU usage
Message-ID: <20191201150523.GE18573@shao2-debian>

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

https://github.com/0day-ci/linux/commits/Scott-Cheloha/drivers-base-memory-c-cache-blocks-in-radix-tree-to-accelerate-lookup/20191124-104557

commit 86dc301f7b4815d90e3a7843ffed655d64efe445
Author: Scott Cheloha <[email protected]>
AuthorDate: Thu Nov 21 13:59:52 2019 -0600
Commit: 0day robot <[email protected]>
CommitDate: Sun Nov 24 10:45:59 2019 +0800

drivers/base/memory.c: cache blocks in radix tree to accelerate lookup

Searching for a particular memory block by id is slow because each block
device is kept in an unsorted linked list on the subsystem bus.

Lookup is much faster if we cache the blocks in a radix tree. Memory
subsystem initialization and hotplug/hotunplug is at least a little faster
for any machine with more than ~100 blocks, and the speedup grows with
the block count.

Signed-off-by: Scott Cheloha <[email protected]>
Acked-by: David Hildenbrand <[email protected]>

0e4a459f56 tracing: Remove unnecessary DEBUG_FS dependency
86dc301f7b drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
+---------------------------------------------------------------------+------------+------------+
| | 0e4a459f56 | 86dc301f7b |
+---------------------------------------------------------------------+------------+------------+
| boot_successes | 8 | 0 |
| boot_failures | 25 | 11 |
| WARNING:possible_circular_locking_dependency_detected | 25 | 8 |
| WARNING:suspicious_RCU_usage | 0 | 11 |
| include/linux/radix-tree.h:#suspicious_rcu_dereference_check()usage | 0 | 11 |
+---------------------------------------------------------------------+------------+------------+

If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>

[ 1.335279] random: get_random_bytes called from kcmp_cookies_init+0x29/0x4c with crng_init=0
[ 1.336883] ACPI: bus type PCI registered
[ 1.338295] PCI: Using configuration type 1 for base access
[ 1.340735]
[ 1.341049] =============================
[ 1.341517] WARNING: suspicious RCU usage
[ 1.342266] 5.4.0-rc5-00070-g86dc301f7b481 #1 Tainted: G T
[ 1.343494] -----------------------------
[ 1.344226] include/linux/radix-tree.h:167 suspicious rcu_dereference_check() usage!
[ 1.345516]
[ 1.345516] other info that might help us debug this:
[ 1.345516]
[ 1.346962]
[ 1.346962] rcu_scheduler_active = 2, debug_locks = 1
[ 1.348134] no locks held by swapper/0/1.
[ 1.348866]
[ 1.348866] stack backtrace:
[ 1.349525] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G T 5.4.0-rc5-00070-g86dc301f7b481 #1
[ 1.351230] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 1.352720] Call Trace:
[ 1.353187] ? dump_stack+0x9a/0xde
[ 1.353507] ? node_access_release+0x19/0x19
[ 1.353507] ? walk_memory_blocks+0xe6/0x184
[ 1.353507] ? set_debug_rodata+0x20/0x20
[ 1.353507] ? link_mem_sections+0x39/0x3d
[ 1.353507] ? topology_init+0x74/0xc8
[ 1.353507] ? enable_cpu0_hotplug+0x15/0x15
[ 1.353507] ? do_one_initcall+0x13d/0x30a
[ 1.353507] ? kernel_init_freeable+0x18e/0x23b
[ 1.353507] ? rest_init+0x173/0x173
[ 1.353507] ? kernel_init+0x10/0x151
[ 1.353507] ? rest_init+0x173/0x173
[ 1.353507] ? ret_from_fork+0x3a/0x50
[ 1.410829] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 1.427389] cryptd: max_cpu_qlen set to 1000
[ 1.457792] ACPI: Added _OSI(Module Device)
[ 1.458615] ACPI: Added _OSI(Processor Device)
[ 1.459428] ACPI: Added _OSI(3.0 _SCP Extensions)

# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 1d5f18079cb10420c4e4b67f571b998ec861a20c 219d54332a09e8d8741c1e1982f5eae56099de85 --
git bisect good 00e5729fc7309fd1da26659b7dce0fcc0b46ab7e # 23:10 G 11 0 9 9 Merge 'linux-review/Daniel-W-S-Almeida/media-dummy_dvb_fe-register-adapter-frontend/20191127-043813' into devel-hourly-2019113015
git bisect bad da795b1ad310518d9100105235c9ba3ff4ee2524 # 23:32 B 0 11 27 0 Merge 'linux-review/Krzysztof-Kozlowski/media-Fix-Kconfig-indentation/20191122-055026' into devel-hourly-2019113015
git bisect bad 53f2832a6bf7103957815cef4ad58be22255ea7a # 00:21 B 0 7 24 0 Merge 'linux-review/Luc-Van-Oostenryck/misc-xilinx_sdfec-add-missing-__user-annotation/20191124-052749' into devel-hourly-2019113015
git bisect good 03a04e32f16a2c4bd267b48e8ab4aaf71c507050 # 00:55 G 11 0 8 8 Merge 'linux-review/Helmut-Grohne/mdio_bus-revert-inadvertent-error-code-change/20191124-171049' into devel-hourly-2019113015
git bisect bad 03403c59a591b12556bd0db1243eb0503112a0df # 02:22 B 0 10 26 0 Merge 'linux-review/Navid-Emamdoost/EDAC-Fix-memory-leak-in-i5100_init_one/20191124-103621' into devel-hourly-2019113015
git bisect good fae5c4b30fda35cbcc8389f432dcaac20f9b3a12 # 03:02 G 11 0 7 7 Merge 'linux-review/David-Gow/kunit-Always-print-actual-pointer-values-in-asserts/20191124-131742' into devel-hourly-2019113015
git bisect bad 1e0c725d2bb21789330a2fe1a360c37ae753eb18 # 04:01 B 0 10 26 0 Merge 'linux-review/Scott-Cheloha/drivers-base-memory-c-cache-blocks-in-radix-tree-to-accelerate-lookup/20191124-104557' into devel-hourly-2019113015
git bisect good 0902b87758e830e857e11f1538e50b218743f4b0 # 04:29 G 10 0 7 7 Merge 'linux-review/Julio-Faracco/drivers-net-virtio_net-Implement-a-dev_watchdog-handler/20191124-135051' into devel-hourly-2019113015
git bisect good 2d0c894673b55b51915ea858eabcf7c97c9b8ccb # 05:33 G 10 0 6 6 Merge 'linux-review/Maciej-enczykowski/net-ipv6-IPV6_TRANSPARENT-check-NET_RAW-prior-to-NET_ADMIN/20191124-120121' into devel-hourly-2019113015
git bisect good 3c095c856fb84271e56e1869aface37af1b078f1 # 05:58 G 11 0 10 10 Merge 'linux-review/Navid-Emamdoost/Bluetooth-Fix-memory-leak-in-hci_connect_le_scan/20191124-111255' into devel-hourly-2019113015
git bisect bad 86dc301f7b4815d90e3a7843ffed655d64efe445 # 06:39 B 0 11 27 0 drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
# first bad commit: [86dc301f7b4815d90e3a7843ffed655d64efe445] drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
git bisect good 0e4a459f56c32d3e52ae69a4b447db2f48a65f44 # 07:25 G 33 0 25 25 tracing: Remove unnecessary DEBUG_FS dependency
# extra tests with debug options
git bisect good 86dc301f7b4815d90e3a7843ffed655d64efe445 # 09:13 G 11 0 0 0 drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
# extra tests on head commit of linux-review/Scott-Cheloha/drivers-base-memory-c-cache-blocks-in-radix-tree-to-accelerate-lookup/20191124-104557
git bisect bad 86dc301f7b4815d90e3a7843ffed655d64efe445 # 09:16 B 0 11 27 0 drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
# bad: [86dc301f7b4815d90e3a7843ffed655d64efe445] drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
# extra tests on revert first bad commit
git bisect good 65bba365eabe04a22e5d68012a45b92eee26860c # 10:18 G 10 0 9 9 Revert "drivers/base/memory.c: cache blocks in radix tree to accelerate lookup"
# good: [65bba365eabe04a22e5d68012a45b92eee26860c] Revert "drivers/base/memory.c: cache blocks in radix tree to accelerate lookup"

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation


2019-12-20 10:53:26

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v3] drivers/base/memory.c: cache blocks in radix tree to accelerate lookup

On 19.12.19 18:33, Scott Cheloha wrote:
> On Wed, Dec 18, 2019 at 10:00:49AM +0100, David Hildenbrand wrote:
>> On 17.12.19 20:32, Scott Cheloha wrote:
>>> Searching for a particular memory block by id is slow because each block
>>> device is kept in an unsorted linked list on the subsystem bus.
>>>
>>> Lookup is much faster if we cache the blocks in a radix tree. Memory
>>> subsystem initialization and hotplug/hotunplug is at least a little faster
>>> for any machine with more than ~100 blocks, and the speedup grows with
>>> the block count.
>>>
>>> Signed-off-by: Scott Cheloha <[email protected]>
>>> Acked-by: David Hildenbrand <[email protected]>
>>> ---
>>> v2 incorporates suggestions from David Hildenbrand.
>>>
>>> v3 changes:
>>> - Rebase atop "drivers/base/memory.c: drop the mem_sysfs_mutex"
>>>
>>> - Be conservative: don't use radix_tree_for_each_slot() in
>>> walk_memory_blocks() yet. It introduces RCU which could
>>> change behavior. Walking the tree "by hand" with
>>> find_memory_block_by_id() is slower but keeps the patch
>>> simple.
>>
>> Fine with me (splitting it out, e.g., into an addon patch), however, as
>> readers/writers run mutually exclusive, there is nothing to worry about
>> here. RCU will not make a difference.
>
> Oh. In that case, can you make heads or tails of this CI failure
> email I got about the v2 patch? I've inlined it at the end of this
> mail. "Suspicious RCU usage". Unclear if it's a false positive. My
> thinking was that I'd hold off on using radix_tree_for_each_slot() and
> introducing a possible regression until I understood what was
> triggering the robot.

Ah, did not see that message. See below.

>
> Also, is there anyone else I should shop this patch to? I copied the
> maintainers reported by scripts/get_maintainer.pl but are there others
> who might be interested?

On memory hotplug (and related) things, it's usually a good idea to CC
Andrew (who picks up basically all MM stuff), Michal Hocko, and Oscar
Salvador. (cc-ing them)

>
> --
>
> Here's that CI mail. I've stripped the attached config because it's
> huge.
>
> Date: Sun, 1 Dec 2019 23:05:23 +0800
> From: kernel test robot <[email protected]>
> To: Scott Cheloha <[email protected]>
> Cc: 0day robot <[email protected]>, LKP <[email protected]>
> Subject: 86dc301f7b ("drivers/base/memory.c: cache blocks in radix tree .."):
> [ 1.341517] WARNING: suspicious RCU usage
> Message-ID: <20191201150523.GE18573@shao2-debian>
>
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://github.com/0day-ci/linux/commits/Scott-Cheloha/drivers-base-memory-c-cache-blocks-in-radix-tree-to-accelerate-lookup/20191124-104557
>
> commit 86dc301f7b4815d90e3a7843ffed655d64efe445
> Author: Scott Cheloha <[email protected]>
> AuthorDate: Thu Nov 21 13:59:52 2019 -0600
> Commit: 0day robot <[email protected]>
> CommitDate: Sun Nov 24 10:45:59 2019 +0800
>
> drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
>
> Searching for a particular memory block by id is slow because each block
> device is kept in an unsorted linked list on the subsystem bus.
>
> Lookup is much faster if we cache the blocks in a radix tree. Memory
> subsystem initialization and hotplug/hotunplug is at least a little faster
> for any machine with more than ~100 blocks, and the speedup grows with
> the block count.
>
> Signed-off-by: Scott Cheloha <[email protected]>
> Acked-by: David Hildenbrand <[email protected]>
>
> 0e4a459f56 tracing: Remove unnecessary DEBUG_FS dependency
> 86dc301f7b drivers/base/memory.c: cache blocks in radix tree to accelerate lookup
> +---------------------------------------------------------------------+------------+------------+
> | | 0e4a459f56 | 86dc301f7b |
> +---------------------------------------------------------------------+------------+------------+
> | boot_successes | 8 | 0 |
> | boot_failures | 25 | 11 |
> | WARNING:possible_circular_locking_dependency_detected | 25 | 8 |
> | WARNING:suspicious_RCU_usage | 0 | 11 |
> | include/linux/radix-tree.h:#suspicious_rcu_dereference_check()usage | 0 | 11 |
> +---------------------------------------------------------------------+------------+------------+
>
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot <[email protected]>
>
> [ 1.335279] random: get_random_bytes called from kcmp_cookies_init+0x29/0x4c with crng_init=0
> [ 1.336883] ACPI: bus type PCI registered
> [ 1.338295] PCI: Using configuration type 1 for base access
> [ 1.340735]
> [ 1.341049] =============================
> [ 1.341517] WARNING: suspicious RCU usage
> [ 1.342266] 5.4.0-rc5-00070-g86dc301f7b481 #1 Tainted: G T
> [ 1.343494] -----------------------------
> [ 1.344226] include/linux/radix-tree.h:167 suspicious rcu_dereference_check() usage!
> [ 1.345516]
> [ 1.345516] other info that might help us debug this:
> [ 1.345516]
> [ 1.346962]
> [ 1.346962] rcu_scheduler_active = 2, debug_locks = 1
> [ 1.348134] no locks held by swapper/0/1.
> [ 1.348866]
> [ 1.348866] stack backtrace:
> [ 1.349525] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G T 5.4.0-rc5-00070-g86dc301f7b481 #1
> [ 1.351230] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [ 1.352720] Call Trace:
> [ 1.353187] ? dump_stack+0x9a/0xde
> [ 1.353507] ? node_access_release+0x19/0x19
> [ 1.353507] ? walk_memory_blocks+0xe6/0x184
> [ 1.353507] ? set_debug_rodata+0x20/0x20
> [ 1.353507] ? link_mem_sections+0x39/0x3d
> [ 1.353507] ? topology_init+0x74/0xc8
> [ 1.353507] ? enable_cpu0_hotplug+0x15/0x15
> [ 1.353507] ? do_one_initcall+0x13d/0x30a
> [ 1.353507] ? kernel_init_freeable+0x18e/0x23b
> [ 1.353507] ? rest_init+0x173/0x173
> [ 1.353507] ? kernel_init+0x10/0x151
> [ 1.353507] ? rest_init+0x173/0x173
> [ 1.353507] ? ret_from_fork+0x3a/0x50
> [ 1.410829] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
> [ 1.427389] cryptd: max_cpu_qlen set to 1000
> [ 1.457792] ACPI: Added _OSI(Module Device)
> [ 1.458615] ACPI: Added _OSI(Processor Device)
> [ 1.459428] ACPI: Added _OSI(3.0 _SCP Extensions)
>


We get a complaint that we do a rcu_dereference() without proper protection.

We come via

rest_init()->kernel_init()->kernel_init_freeable()->do_basic_setup()->
do_initcalls()->do_initcall_level()->do_one_initcall()->
topology_init()->register_one_node()->link_mem_sections()->
walk_memory_blocks()

E.g., we add ACPI memory similarly via
...do_initcalls()...acpi_init()->acpi_scan_init()
also without holding the device hotplug lock. AFAIK, we can't get
races/concurrent hot(un)plug activity here (we even documented that
for ACPI). And if we would, the code would already be wrong ;)

I assume the simplest and cheapest way to suppress the warning would be
to add rcu_read_lock()/rcu_read_unlock() around the
radix_tree_for_each_slot().

Another way to suppress the warning would be to take the device hotplug
lock before calling register_one_node() in all arch specific code - but
we had a similar discussion due to the ACPI code back then and decided
to not take the device hotplug lock if not really necessary.

... but after all we would want a radix_tree_for_each_slot() variant
that does not perform such checks here. We don't hold any locks and
don't need any locks.

--
Thanks,

David / dhildenb