2018-02-28 19:06:29

by Frank Rowand

Subject: [PATCH v4 0/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

From: Frank Rowand <[email protected]>

Create a cache of the nodes that contain a phandle property. Use this
cache to find the node for a given phandle value instead of scanning
the devicetree to find the node. If the phandle value is not found
in the cache, of_find_node_by_phandle() will fall back to the tree
scan algorithm.

Size and performance data are in the patch 1/2 comments.

Changes since v3:
- of_populate_phandle_cache(): add check for failed memory allocation
- add patch 2/2 into series instead of as a standalone patch that was
dependent on patch 1/2 of this series

Changes since v2:
- add mask to calculation of phandle cache entry
  - which results in better overhead reduction for devicetrees with
    phandle properties not allocated in the monotonically increasing
    range of 1..n
  - due to mask, number of entries in cache potentially increased to
    next power of two
- minor fixes as suggested by reviewers
- no longer using live_tree_max_phandle() so do not move it from
drivers/of/resolver.c to drivers/of/base.c

Changes since v1:
- change short description from
of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
- rebase on v4.16-rc1
- reorder new functions in base.c to avoid forward declaration
- add locking around kfree(phandle_cache) for memory ordering
- add explicit check for non-null of phandle_cache in
of_find_node_by_phandle(). There is already a check for !handle,
which prevents accessing a null phandle_cache, but that dependency
is not obvious, so this check makes it more apparent.
- do not free phandle_cache if modules are enabled, so that
cached phandles will be available when modules are loaded



Frank Rowand (2):
of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
of: add early boot allocation of of_find_node_by_phandle() cache

drivers/of/base.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++--
drivers/of/fdt.c | 2 +
drivers/of/of_private.h | 5 ++
drivers/of/resolver.c | 3 --
4 files changed, 122 insertions(+), 7 deletions(-)

--
Frank Rowand <[email protected]>



2018-02-28 19:06:15

by Frank Rowand

Subject: [PATCH v4 2/2] of: add early boot allocation of of_find_node_by_phandle() cache

From: Frank Rowand <[email protected]>

The initial implementation of the of_find_node_by_phandle() cache
allocates the cache using kcalloc(). Add an early boot allocation
of the cache so it will be usable during early boot. Switch over
to the kcalloc() based cache once normal memory allocation
becomes available.

Signed-off-by: Frank Rowand <[email protected]>
---

The previous version of this patch was a standalone patch:
[PATCH] of: add early boot allocation of of_find_node_by_phandle() cache
that was dependent on patch 1/2 of this series.

Changes from previous versions:
- no changes


drivers/of/base.c | 33 +++++++++++++++++++++++++++++++++
drivers/of/fdt.c | 2 ++
drivers/of/of_private.h | 2 ++
3 files changed, 37 insertions(+)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index e71d157d7149..eeaa270a5135 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -16,9 +16,11 @@

#define pr_fmt(fmt) "OF: " fmt

+#include <linux/bootmem.h>
#include <linux/console.h>
#include <linux/ctype.h>
#include <linux/cpu.h>
+#include <linux/memblock.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_device.h>
@@ -134,6 +136,29 @@ static void of_populate_phandle_cache(void)
raw_spin_unlock_irqrestore(&devtree_lock, flags);
}

+void __init of_populate_phandle_cache_early(void)
+{
+	u32 cache_entries;
+	struct device_node *np;
+	u32 phandles = 0;
+	size_t size;
+
+	for_each_of_allnodes(np)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+			phandles++;
+
+	cache_entries = roundup_pow_of_two(phandles);
+	phandle_cache_mask = cache_entries - 1;
+
+	size = cache_entries * sizeof(*phandle_cache);
+	phandle_cache = memblock_virt_alloc(size, 4);
+	memset(phandle_cache, 0, size);
+
+	for_each_of_allnodes(np)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+			phandle_cache[np->phandle & phandle_cache_mask] = np;
+}
+
#ifndef CONFIG_MODULES
static int __init of_free_phandle_cache(void)
{
@@ -153,7 +178,15 @@ static int __init of_free_phandle_cache(void)

 void __init of_core_init(void)
 {
+	unsigned long flags;
 	struct device_node *np;
+	phys_addr_t size;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+	size = (phandle_cache_mask + 1) * sizeof(*phandle_cache);
+	memblock_free(__pa(phandle_cache), size);
+	phandle_cache = NULL;
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
 
 	of_populate_phandle_cache();

diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 84aa9d676375..cb320df23f26 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -1264,6 +1264,8 @@ void __init unflatten_device_tree(void)
 	of_alias_scan(early_init_dt_alloc_memory_arch);
 
 	unittest_unflatten_overlay_base();
+
+	of_populate_phandle_cache_early();
 }

/**
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index fa70650136b4..6720448c84cc 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -134,6 +134,8 @@ extern void __of_sysfs_remove_bin_file(struct device_node *np,
/* illegal phandle value (set when unresolved) */
#define OF_PHANDLE_ILLEGAL 0xdeadbeef

+extern void __init of_populate_phandle_cache_early(void);
+
/* iterators for transactions, used for overlays */
/* forward iterator */
#define for_each_transaction_entry(_oft, _te) \
--
Frank Rowand <[email protected]>


2018-02-28 19:06:49

by Frank Rowand

Subject: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

From: Frank Rowand <[email protected]>

Create a cache of the nodes that contain a phandle property. Use this
cache to find the node for a given phandle value instead of scanning
the devicetree to find the node. If the phandle value is not found
in the cache, of_find_node_by_phandle() will fall back to the tree
scan algorithm.

The cache is initialized in of_core_init().

The cache is freed via a late_initcall_sync() if modules are not
enabled.

If the devicetree is created by the dtc compiler, with all phandle
property values auto generated, then the size required by the cache
could be 4 * (1 + number of phandles) bytes. This results in an O(1)
node lookup cost for a given phandle value. Due to a concern that the
phandle property values might not be consistent with what is generated
by the dtc compiler, a mask has been added to the cache lookup algorithm.
To maintain the O(1) node lookup cost, the size of the cache has been
increased by rounding the number of entries up to the next power of
two.

The overhead of finding the devicetree node containing a given phandle
value has been noted by several people in the recent past, in some cases
with a patch to add a hashed index of devicetree nodes, based on the
phandle value of the node. One concern with this approach is the extra
space added to each node. This patch takes advantage of the phandle
property values auto generated by the dtc compiler, which begin with
one and monotonically increase by one, resulting in a range of 1..n
for n phandle values. This implementation should also provide a good
reduction of overhead for any range of phandle values that are mostly
in a monotonic range.

Performance measurements by Chintan Pandya <[email protected]>
of several implementations of patches that are similar to this one
suggest an expected reduction of boot time by ~400 ms for his test
system. If the cache size was decreased to 64 entries, the boot
time was reduced by ~340 ms. The measurements were made on a 4.9.73 kernel
for arch/arm64/boot/dts/qcom/sda670-mtp.dts, which contains 2371 nodes and
814 phandle values.

Reported-by: Chintan Pandya <[email protected]>
Signed-off-by: Frank Rowand <[email protected]>
---
Changes since v3:
- of_populate_phandle_cache(): add check for failed memory allocation

Changes since v2:
- add mask to calculation of phandle cache entry
  - which results in better overhead reduction for devicetrees with
    phandle properties not allocated in the monotonically increasing
    range of 1..n
  - due to mask, number of entries in cache potentially increased to
    next power of two
- minor fixes as suggested by reviewers
- no longer using live_tree_max_phandle() so do not move it from
drivers/of/resolver.c to drivers/of/base.c

Changes since v1:
- change short description from
of: cache phandle nodes to reduce cost of of_find_node_by_phandle()
- rebase on v4.16-rc1
- reorder new functions in base.c to avoid forward declaration
- add locking around kfree(phandle_cache) for memory ordering
- add explicit check for non-null of phandle_cache in
of_find_node_by_phandle(). There is already a check for !handle,
which prevents accessing a null phandle_cache, but that dependency
is not obvious, so this check makes it more apparent.
- do not free phandle_cache if modules are enabled, so that
cached phandles will be available when modules are loaded

drivers/of/base.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++---
drivers/of/of_private.h | 3 ++
drivers/of/resolver.c | 3 --
3 files changed, 85 insertions(+), 7 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index ad28de96e13f..e71d157d7149 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -91,10 +91,72 @@ int __weak of_node_to_nid(struct device_node *np)
}
#endif

+static struct device_node **phandle_cache;
+static u32 phandle_cache_mask;
+
+/*
+ * Assumptions behind phandle_cache implementation:
+ *   - phandle property values are in a contiguous range of 1..n
+ *
+ * If the assumptions do not hold, then
+ *   - the phandle lookup overhead reduction provided by the cache
+ *     will likely be less
+ */
+static void of_populate_phandle_cache(void)
+{
+	unsigned long flags;
+	u32 cache_entries;
+	struct device_node *np;
+	u32 phandles = 0;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+
+	kfree(phandle_cache);
+	phandle_cache = NULL;
+
+	for_each_of_allnodes(np)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+			phandles++;
+
+	cache_entries = roundup_pow_of_two(phandles);
+	phandle_cache_mask = cache_entries - 1;
+
+	phandle_cache = kcalloc(cache_entries, sizeof(*phandle_cache),
+				GFP_ATOMIC);
+	if (!phandle_cache)
+		goto out;
+
+	for_each_of_allnodes(np)
+		if (np->phandle && np->phandle != OF_PHANDLE_ILLEGAL)
+			phandle_cache[np->phandle & phandle_cache_mask] = np;
+
+out:
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
+}
+
+
+#ifndef CONFIG_MODULES
+static int __init of_free_phandle_cache(void)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&devtree_lock, flags);
+
+	kfree(phandle_cache);
+	phandle_cache = NULL;
+
+	raw_spin_unlock_irqrestore(&devtree_lock, flags);
+
+	return 0;
+}
+late_initcall_sync(of_free_phandle_cache);
+#endif
+
 void __init of_core_init(void)
 {
 	struct device_node *np;
 
+	of_populate_phandle_cache();
+
 	/* Create the kset, and register existing nodes */
 	mutex_lock(&of_mutex);
 	of_kset = kset_create_and_add("devicetree", NULL, firmware_kobj);
@@ -1021,16 +1083,32 @@ int of_modalias_node(struct device_node *node, char *modalias, int len)
  */
 struct device_node *of_find_node_by_phandle(phandle handle)
 {
-	struct device_node *np;
+	struct device_node *np = NULL;
 	unsigned long flags;
+	phandle masked_handle;
 
 	if (!handle)
 		return NULL;
 
 	raw_spin_lock_irqsave(&devtree_lock, flags);
-	for_each_of_allnodes(np)
-		if (np->phandle == handle)
-			break;
+
+	masked_handle = handle & phandle_cache_mask;
+
+	if (phandle_cache) {
+		if (phandle_cache[masked_handle] &&
+		    handle == phandle_cache[masked_handle]->phandle)
+			np = phandle_cache[masked_handle];
+	}
+
+	if (!np) {
+		for_each_of_allnodes(np)
+			if (np->phandle == handle) {
+				if (phandle_cache)
+					phandle_cache[masked_handle] = np;
+				break;
+			}
+	}
+
 	of_node_get(np);
 	raw_spin_unlock_irqrestore(&devtree_lock, flags);
 	return np;
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 0c609e7d0334..fa70650136b4 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -131,6 +131,9 @@ extern void __of_update_property_sysfs(struct device_node *np,
extern void __of_sysfs_remove_bin_file(struct device_node *np,
struct property *prop);

+/* illegal phandle value (set when unresolved) */
+#define OF_PHANDLE_ILLEGAL 0xdeadbeef
+
/* iterators for transactions, used for overlays */
/* forward iterator */
#define for_each_transaction_entry(_oft, _te) \
diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c
index 740d19bde601..b2ca8185c8c6 100644
--- a/drivers/of/resolver.c
+++ b/drivers/of/resolver.c
@@ -19,9 +19,6 @@

#include "of_private.h"

-/* illegal phandle value (set when unresolved) */
-#define OF_PHANDLE_ILLEGAL 0xdeadbeef
-
static phandle live_tree_max_phandle(void)
{
struct device_node *node;
--
Frank Rowand <[email protected]>


2018-02-28 19:32:29

by Andy Shevchenko

Subject: Re: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

On Wed, Feb 28, 2018 at 9:04 PM, <[email protected]> wrote:

> Create a cache of the nodes that contain a phandle property. Use this
> cache to find the node for a given phandle value instead of scanning
> the devicetree to find the node. If the phandle value is not found
> in the cache, of_find_node_by_phandle() will fall back to the tree
> scan algorithm.
>
> The cache is initialized in of_core_init().
>
> The cache is freed via a late_initcall_sync() if modules are not
> enabled.
>
> If the devicetree is created by the dtc compiler, with all phandle
> property values auto generated, then the size required by the cache
> could be 4 * (1 + number of phandles) bytes. This results in an O(1)
> node lookup cost for a given phandle value. Due to a concern that the
> phandle property values might not be consistent with what is generated
> by the dtc compiler, a mask has been added to the cache lookup algorithm.
> To maintain the O(1) node lookup cost, the size of the cache has been
> increased by rounding the number of entries up to the next power of
> two.
>
> The overhead of finding the devicetree node containing a given phandle
> value has been noted by several people in the recent past, in some cases
> with a patch to add a hashed index of devicetree nodes, based on the
> phandle value of the node. One concern with this approach is the extra
> space added to each node. This patch takes advantage of the phandle
> property values auto generated by the dtc compiler, which begin with
> one and monotonically increase by one, resulting in a range of 1..n
> for n phandle values. This implementation should also provide a good
> reduction of overhead for any range of phandle values that are mostly
> in a monotonic range.
>
> Performance measurements by Chintan Pandya <[email protected]>
> of several implementations of patches that are similar to this one
> suggest an expected reduction of boot time by ~400ms for his test
> system. If the cache size was decreased to 64 entries, the boot
> time was reduced by ~340 ms. The measurements were on a 4.9.73 kernel
> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
> 814 phandle values.

The question is: why is O(1) so important? Wouldn't O(log(n)) work?

Using a radix_tree(), I suppose, would allow the cache to be dynamically
extended or shrunk, which would work with DT overlays.

--
With Best Regards,
Andy Shevchenko

2018-02-28 19:47:28

by Frank Rowand

Subject: Re: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

On 02/28/18 11:31, Andy Shevchenko wrote:
> On Wed, Feb 28, 2018 at 9:04 PM, <[email protected]> wrote:
>
>> Create a cache of the nodes that contain a phandle property. Use this
>> cache to find the node for a given phandle value instead of scanning
>> the devicetree to find the node. If the phandle value is not found
>> in the cache, of_find_node_by_phandle() will fall back to the tree
>> scan algorithm.
>>
>> The cache is initialized in of_core_init().
>>
>> The cache is freed via a late_initcall_sync() if modules are not
>> enabled.
>>
>> If the devicetree is created by the dtc compiler, with all phandle
>> property values auto generated, then the size required by the cache
>> could be 4 * (1 + number of phandles) bytes. This results in an O(1)
>> node lookup cost for a given phandle value. Due to a concern that the
>> phandle property values might not be consistent with what is generated
>> by the dtc compiler, a mask has been added to the cache lookup algorithm.
>> To maintain the O(1) node lookup cost, the size of the cache has been
>> increased by rounding the number of entries up to the next power of
>> two.
>>
>> The overhead of finding the devicetree node containing a given phandle
>> value has been noted by several people in the recent past, in some cases
>> with a patch to add a hashed index of devicetree nodes, based on the
>> phandle value of the node. One concern with this approach is the extra
>> space added to each node. This patch takes advantage of the phandle
>> property values auto generated by the dtc compiler, which begin with
>> one and monotonically increase by one, resulting in a range of 1..n
>> for n phandle values. This implementation should also provide a good
>> reduction of overhead for any range of phandle values that are mostly
>> in a monotonic range.
>>
>> Performance measurements by Chintan Pandya <[email protected]>
>> of several implementations of patches that are similar to this one
>> suggest an expected reduction of boot time by ~400ms for his test
>> system. If the cache size was decreased to 64 entries, the boot
>> time was reduced by ~340 ms. The measurements were on a 4.9.73 kernel
>> for arch/arm64/boot/dts/qcom/sda670-mtp.dts, contains 2371 nodes and
>> 814 phandle values.
>
> The question is why O(1) is so important? O(log(n)) wouldn't work?

O(1) is not critical. It was just a nice side result.


> Using radix_tree() I suppose allows to dynamically extend or shrink
> the cache which would work with DT overlays.

The memory usage of the phandle cache in this patch is fairly small.
The memory overhead of a radix_tree() would not be justified.

2018-02-28 20:20:21

by Andy Shevchenko

Subject: Re: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

On Wed, Feb 28, 2018 at 9:44 PM, Frank Rowand <[email protected]> wrote:
> On 02/28/18 11:31, Andy Shevchenko wrote:
>> On Wed, Feb 28, 2018 at 9:04 PM, <[email protected]> wrote:

>> The question is why O(1) is so important? O(log(n)) wouldn't work?
>
> O(1) is not critical. It was just a nice side result.
>
>
>> Using radix_tree() I suppose allows to dynamically extend or shrink
>> the cache which would work with DT overlays.
>
> The memory usage of the phandle cache in this patch is fairly small.
> The memory overhead of a radix_tree() would not be justified.

OTOH the advantage I mentioned isn't a good argument?

--
With Best Regards,
Andy Shevchenko

2018-02-28 20:56:06

by Rob Herring

Subject: Re: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

On Wed, Feb 28, 2018 at 2:19 PM, Andy Shevchenko
<[email protected]> wrote:
> On Wed, Feb 28, 2018 at 9:44 PM, Frank Rowand <[email protected]> wrote:
>> On 02/28/18 11:31, Andy Shevchenko wrote:
>>> On Wed, Feb 28, 2018 at 9:04 PM, <[email protected]> wrote:
>
>>> The question is why O(1) is so important? O(log(n)) wouldn't work?
>>
>> O(1) is not critical. It was just a nice side result.
>>
>>
>>> Using radix_tree() I suppose allows to dynamically extend or shrink
>>> the cache which would work with DT overlays.
>>
>> The memory usage of the phandle cache in this patch is fairly small.
>> The memory overhead of a radix_tree() would not be justified.
>
> OTOH the advantage I mentioned isn't a good argument?

Yes, but still one that ignores memory usage. I'll take whatever
solution doesn't undo this[1].

Rob

[1] https://lwn.net/Articles/735839/

2018-02-28 20:59:50

by Frank Rowand

Subject: Re: [PATCH v4 1/2] of: cache phandle nodes to reduce cost of of_find_node_by_phandle()

On 02/28/18 12:19, Andy Shevchenko wrote:
> On Wed, Feb 28, 2018 at 9:44 PM, Frank Rowand <[email protected]> wrote:
>> On 02/28/18 11:31, Andy Shevchenko wrote:
>>> On Wed, Feb 28, 2018 at 9:04 PM, <[email protected]> wrote:
>
>>> The question is why O(1) is so important? O(log(n)) wouldn't work?
>>
>> O(1) is not critical. It was just a nice side result.
>>
>>
>>> Using radix_tree() I suppose allows to dynamically extend or shrink
>>> the cache which would work with DT overlays.
>>
>> The memory usage of the phandle cache in this patch is fairly small.
>> The memory overhead of a radix_tree() would not be justified.
>
> OTOH the advantage I mentioned isn't a good argument?

No. Deleting and re-creating the cache to resize it (when applying an
overlay) would be a rare event that happens as desired by the
overlay application code. There is no real gain from having extension
or shrinkage occur automatically, and if the overlay application code
desires the resizing it is trivial to implement (a single function
call).

2018-03-03 05:29:14

by Fengguang Wu

Subject: [of] b013aa45d2: kernel_BUG_at_arch/x86/mm/physaddr.c

FYI, we noticed the following commit (built with gcc-5):

commit: b013aa45d2168019984aec70e00d09cf0a4a00c5 ("of: add early boot allocation of of_find_node_by_phandle() cache")
url: https://github.com/0day-ci/linux/commits/frowand-list-gmail-com/of-cache-phandle-nodes-to-reduce-cost-of-of_find_node_by_phandle/20180303-090055
base: https://git.kernel.org/cgit/linux/kernel/git/robh/linux.git for-next

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -m 420M

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


+------------------------------------------+------------+------------+
| | 90e6f76859 | b013aa45d2 |
+------------------------------------------+------------+------------+
| boot_successes | 12 | 0 |
| boot_failures | 0 | 10 |
| kernel_BUG_at_arch/x86/mm/physaddr.c | 0 | 10 |
| invalid_opcode:#[##] | 0 | 10 |
| RIP:__phys_addr | 0 | 10 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 10 |
+------------------------------------------+------------+------------+



[ 0.039484] kernel BUG at arch/x86/mm/physaddr.c:27!
[ 0.039999] invalid opcode: 0000 [#1] PREEMPT PTI
[ 0.039999] CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-rc1-00008-gb013aa4 #1
[ 0.039999] RIP: 0010:__phys_addr+0x39/0x50
[ 0.039999] RSP: 0000:ffff880018de3ee0 EFLAGS: 00010087
[ 0.039999] RAX: 0000780000000000 RBX: 0000000000000001 RCX: 0000000000000002
[ 0.039999] RDX: 0000000080000000 RSI: ffffffff82e77239 RDI: 0000000000000000
[ 0.039999] RBP: ffff880018de3ee0 R08: 0000000000000000 R09: 0000000000000001
[ 0.039999] R10: 00000000000029cb R11: 63561fc2891644ad R12: 0000000000000286
[ 0.039999] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 0.039999] FS: 0000000000000000(0000) GS:ffffffff8309b000(0000) knlGS:0000000000000000
[ 0.039999] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.039999] CR2: 00000000ffffffff CR3: 0000000003074000 CR4: 00000000000006f0
[ 0.039999] Call Trace:
[ 0.039999] of_core_init+0x30/0x21f
[ 0.039999] driver_init+0x36/0x38
[ 0.039999] kernel_init_freeable+0x82/0x19f
[ 0.039999] ? rest_init+0x220/0x220
[ 0.039999] kernel_init+0xe/0x100
[ 0.039999] ret_from_fork+0x24/0x30
[ 0.039999] Code: 48 89 e5 72 28 48 b8 00 00 00 00 00 78 00 00 48 01 f8 48 39 c2 72 14 0f b6 0d fd 0c 45 02 48 89 c2 48 d3 ea 48 85 d2 75 02 5d c3 <0f> 0b 48 89 d0 48 03 05 2b ef 02 02 48 81 fa ff ff ff 1f 76 e9
[ 0.039999] RIP: __phys_addr+0x39/0x50 RSP: ffff880018de3ee0
[ 0.039999] ---[ end trace 56a848e98751a243 ]---


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
lkp


Attachments:
(No filename) (3.01 kB)
config-4.16.0-rc1-00008-gb013aa4 (117.07 kB)
job-script (4.13 kB)
dmesg.xz (5.81 kB)

2018-03-03 07:40:50

by kernel test robot

Subject: Re: [PATCH v4 2/2] of: add early boot allocation of of_find_node_by_phandle() cache

Hi Frank,

I love your patch! Yet something to improve:

[auto build test ERROR on robh/for-next]
[also build test ERROR on v4.16-rc3 next-20180302]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/frowand-list-gmail-com/of-cache-phandle-nodes-to-reduce-cost-of-of_find_node_by_phandle/20180303-090055
base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next
config: tile-allmodconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=tile

All errors (new ones prefixed by >>):

drivers//of/base.c: In function 'of_core_init':
>> drivers//of/base.c:187:2: error: implicit declaration of function 'memblock_free'; did you mean 'ptlock_free'? [-Werror=implicit-function-declaration]
memblock_free(__pa(phandle_cache), size);
^~~~~~~~~~~~~
ptlock_free
cc1: some warnings being treated as errors

vim +187 drivers//of/base.c

   178	
   179	void __init of_core_init(void)
   180	{
   181		unsigned long flags;
   182		struct device_node *np;
   183		phys_addr_t size;
   184	
   185		raw_spin_lock_irqsave(&devtree_lock, flags);
   186		size = (phandle_cache_mask + 1) * sizeof(*phandle_cache);
 > 187		memblock_free(__pa(phandle_cache), size);
   188		phandle_cache = NULL;
   189		raw_spin_unlock_irqrestore(&devtree_lock, flags);
   190	
   191		of_populate_phandle_cache();
   192	
   193		/* Create the kset, and register existing nodes */
   194		mutex_lock(&of_mutex);
   195		of_kset = kset_create_and_add("devicetree", NULL, firmware_kobj);
   196		if (!of_kset) {
   197			mutex_unlock(&of_mutex);
   198			pr_err("failed to register existing nodes\n");
   199			return;
   200		}
   201		for_each_of_allnodes(np)
   202			__of_attach_node_sysfs(np);
   203		mutex_unlock(&of_mutex);
   204	
   205		/* Symlink in /proc as required by userspace ABI */
   206		if (of_root)
   207			proc_symlink("device-tree", NULL, "/sys/firmware/devicetree/base");
   208	}
   209	

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (2.43 kB)
.config.gz (50.59 kB)