2018-07-12 21:01:17

by Frank Rowand

[permalink] [raw]
Subject: [PATCH] of: overlay: update phandle cache on overlay apply and remove

From: Frank Rowand <[email protected]>

A comment in the review of the patch adding the phandle cache said that
the cache would have to be updated when modules are applied and removed.
This patch implements the cache updates.

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Reported-by: Alan Tull <[email protected]>
Suggested-by: Alan Tull <[email protected]>
Signed-off-by: Frank Rowand <[email protected]>
---

Changes since RFC:
- update code comment to mention race condition avoidance

For the RFC version of this patch, the 0day test reported a general
protection fault from the KASAN runtime memory debugger on x86_64
in qemu. The GPF was in a devicetree unittest.

0day tested the patch on v4.17-rc1, with some other patches applied.
I was unable to replicate the GPF on v4.18-rc1 with just this patch
applied. I was also unable to replicate the GPF on a clone of the
v4.17-rc1 0day repository, using the 0day kernel config. I will
reply to this email with the 0day GPF report.


drivers/of/base.c | 6 +++---
drivers/of/of_private.h | 2 ++
drivers/of/overlay.c | 11 +++++++++++
3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 848f549164cd..466e3c8582f0 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -102,7 +102,7 @@ static u32 phandle_cache_mask;
* - the phandle lookup overhead reduction provided by the cache
* will likely be less
*/
-static void of_populate_phandle_cache(void)
+void of_populate_phandle_cache(void)
{
unsigned long flags;
u32 cache_entries;
@@ -134,8 +134,7 @@ static void of_populate_phandle_cache(void)
raw_spin_unlock_irqrestore(&devtree_lock, flags);
}

-#ifndef CONFIG_MODULES
-static int __init of_free_phandle_cache(void)
+int of_free_phandle_cache(void)
{
unsigned long flags;

@@ -148,6 +147,7 @@ static int __init of_free_phandle_cache(void)

return 0;
}
+#if !defined(CONFIG_MODULES)
late_initcall_sync(of_free_phandle_cache);
#endif

diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index 891d780c076a..216175d11d3d 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -79,6 +79,8 @@ int of_resolve_phandles(struct device_node *tree);
#if defined(CONFIG_OF_OVERLAY)
void of_overlay_mutex_lock(void);
void of_overlay_mutex_unlock(void);
+int of_free_phandle_cache(void);
+void of_populate_phandle_cache(void);
#else
static inline void of_overlay_mutex_lock(void) {};
static inline void of_overlay_mutex_unlock(void) {};
diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index 7baa53e5b1d7..eda57ef12fd0 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -804,6 +804,8 @@ static int of_overlay_apply(const void *fdt, struct device_node *tree,
goto err_free_overlay_changeset;
}

+ of_populate_phandle_cache();
+
ret = __of_changeset_apply_notify(&ovcs->cset);
if (ret)
pr_err("overlay changeset entry notify error %d\n", ret);
@@ -1046,8 +1048,17 @@ int of_overlay_remove(int *ovcs_id)

list_del(&ovcs->ovcs_list);

+ /*
+ * Disable phandle cache. Avoids race condition that would arise
+ * from removing cache entry when the associated node is deleted.
+ */
+ of_free_phandle_cache();
+
ret_apply = 0;
ret = __of_changeset_revert_entries(&ovcs->cset, &ret_apply);
+
+ of_populate_phandle_cache();
+
if (ret) {
if (ret_apply)
devicetree_state_flags |= DTSF_REVERT_FAIL;
--
Frank Rowand <[email protected]>



2018-07-12 21:09:47

by Frank Rowand

[permalink] [raw]
Subject: Re: [PATCH] of: overlay: update phandle cache on overlay apply and remove

On 07/12/18 14:00, [email protected] wrote:
> From: Frank Rowand <[email protected]>
>
> A comment in the review of the patch adding the phandle cache said that
> the cache would have to be updated when modules are applied and removed.
> This patch implements the cache updates.
>
> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
> Reported-by: Alan Tull <[email protected]>
> Suggested-by: Alan Tull <[email protected]>
> Signed-off-by: Frank Rowand <[email protected]>
> ---
>
> Changes since RFC:
> - update code comment to mention race condition avoidance
>
> For the RFC version of this patch, the 0day test reported a general
> protection fault from the KASAN runtime memory debugger on x86_64
> in qemu. The GPF was in a devicetree unittest.
>
> 0day tested the patch on v4.17-rc1, with some other patches applied.
> I was unable to replicate the GPF on v4.18-rc1 with just this patch
> applied. I was also unable to replicate the GPF on a clone of the
> v4.17-rc1 0day repository, using the 0day kernel config. I will
> reply to this email with the 0day GPF report.
>

< snip >

Here is the 0day GPF report for the RFC version of this patch:


From: kernel test robot [mailto:[email protected]]
Sent: Monday, June 18, 2018 8:45 PM
To: Rowand, Frank <[email protected]>
Cc: LKP <[email protected]>; 0day robot <[email protected]>
Subject: 000946c8fd ("of: overlay: update phandle cache on overlay .."): general protection fault: 0000 [#1] PREEMPT KASAN

Greetings,

0day kernel testing robot got the below dmesg and the first bad commit is

https://github.com/0day-ci/linux/commits/frowand-list-gmail-com/of-overlay-update-phandle-cache-on-overlay-apply-and-remove/20180618-000935

commit 000946c8fd3a9ffaeec3b3f52522afa932568cc4
Author: Frank Rowand <[email protected]>
AuthorDate: Sun Jun 17 09:03:36 2018 -0700
Commit: 0day robot <[email protected]>
CommitDate: Mon Jun 18 00:09:42 2018 +0800

of: overlay: update phandle cache on overlay apply and remove

A comment in the review of the patch adding the phandle cache said that
the cache would have to be updated when modules are applied and removed.
This patch implements the cache updates.

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Reported-by: Alan Tull <[email protected]>
Suggested-by: Alan Tull <[email protected]>
Signed-off-by: Frank Rowand <[email protected]>

44acf10587 dt-bindings: submitting-patches: add guidance on patch content and subject
000946c8fd of: overlay: update phandle cache on overlay apply and remove
+------------------------------------------------+------------+------------+
| | 44acf10587 | 000946c8fd |
+------------------------------------------------+------------+------------+
| boot_successes | 32 | 0 |
| boot_failures | 4 | 15 |
| IP-Config:Auto-configuration_of_network_failed | 4 | |
| general_protection_fault:#[##] | 0 | 15 |
| RIP:of_find_node_by_phandle | 0 | 15 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 15 |
+------------------------------------------------+------------+------------+

[ 11.267460] rtc_cmos 00:00: setting system clock to 2018-06-18 18:57:15 UTC (1529348235)
[ 11.270382] Duplicate name in testcase-data, renamed to "duplicate-name#1"
[ 11.279673] ### dt-test ### start of unittest - you will see error messages
[ 11.281198] kasan: CONFIG_KASAN_INLINE enabled
[ 11.281766] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 11.282585] general protection fault: 0000 [#1] PREEMPT KASAN
[ 11.283241] CPU: 0 PID: 1 Comm: swapper Not tainted 4.17.0-rc1-00037-g000946c #1
[ 11.284076] RIP: 0010:of_find_node_by_phandle+0x57/0x280
[ 11.284680] RSP: 0000:ffff88001a60fa90 EFLAGS: 00010002
[ 11.285253] RAX: dffffc0000000000 RBX: 0000000000000020 RCX: 0000000000000002
[ 11.286028] RDX: 0000000000000010 RSI: 0000000000000004 RDI: ffffffff842d19a4
[ 11.286805] RBP: ffff88001a60fad0 R08: dffffc0000000000 R09: 0000000000000000
[ 11.287588] R10: ffff88001a60f9e8 R11: 0000000000000001 R12: 0000000000000002
[ 11.288389] R13: 0000000000000002 R14: 0000000000000202 R15: ffff88001a60fbe8
[ 11.289190] FS: 0000000000000000(0000) GS:ffffffff83c4a000(0000) knlGS:0000000000000000
[ 11.290110] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 11.290734] CR2: 0000000000000000 CR3: 0000000003c22001 CR4: 00000000001606b0
[ 11.291501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 11.292306] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 11.293100] Call Trace:
[ 11.293391] ? _raw_spin_unlock_irqrestore+0x56/0x70
[ 11.293963] of_phandle_iterator_next+0x259/0x5c0
[ 11.294504] ? of_phandle_iterator_init+0x9b/0x220
[ 11.295035] ? of_phandle_iterator_init+0x220/0x220
[ 11.295569] ? of_find_node_opts_by_path+0x207/0x220
[ 11.296116] of_count_phandle_with_args+0x92/0xe0
[ 11.296649] ? of_parse_phandle_with_fixed_args+0x20/0x20
[ 11.297261] ? __this_cpu_preempt_check+0x13/0x20
[ 11.297802] ? of_find_node_opts_by_path+0x207/0x220
[ 11.298365] of_unittest+0x25e1/0x4c07
[ 11.298800] ? dt_alloc_memory+0x3d/0x3d
[ 11.299249] ? __this_cpu_preempt_check+0x13/0x20
[ 11.299790] ? trace_hardirqs_on_caller+0x4b8/0x510
[ 11.300325] ? _raw_spin_unlock_irqrestore+0x56/0x70
[ 11.300872] ? add_device_randomness+0x2e9/0x320
[ 11.301381] ? kset_register+0x31/0x50
[ 11.301801] ? dt_alloc_memory+0x3d/0x3d
[ 11.302235] do_one_initcall+0x3a1/0x806
[ 11.302675] ? start_kernel+0x7ee/0x7ee
[ 11.303103] ? rcu_read_lock_sched_held+0x140/0x160
[ 11.303626] kernel_init_freeable+0x3a1/0x511
[ 11.304105] ? rest_init+0xe0/0xe0
[ 11.304485] kernel_init+0xe/0x110
[ 11.304872] ? rest_init+0xe0/0xe0
[ 11.305252] ret_from_fork+0x24/0x30
[ 11.305658] Code: 93 3a 03 44 89 e9 23 0d 38 93 3a 03 49 89 c6 48 85 d2 74 79 89 c8 48 8d 1c c2 48 b8 00 00 00 00 00 fc ff df 48 89 de 48 c1 ee 03 <80> 3c 06 00 74 16 48 89 df 89 4d cc 48 89 55 d0 e8 d4 ab cc fe
[ 11.307863] RIP: of_find_node_by_phandle+0x57/0x280 RSP: ffff88001a60fa90
[ 11.308625] ---[ end trace 88de1812d7982342 ]---
[ 11.309153] Kernel panic - not syncing: Fatal exception

# HH:MM RESULT GOOD BAD GOOD_BUT_DIRTY DIRTY_NOT_BAD
git bisect start 34a5660c8b22ba1acd878e7c8709cdb29594e4c2 ce397d215ccd07b8ae3f71db689aedb85d56ab40 --
git bisect bad 90fae9145aab54a3d05b3e54fc24c885061d2a25 # 17:37 B 0 11 26 2 Merge 'linux-review/Hans-de-Goede/efifb-Copy-the-ACPI-BGRT-boot-graphics-to-the/20180617-233509' into devel-spot-201806181537
git bisect good c117ccf13ab8d450e25cd92d3fde6ce9db3cc7f3 # 17:48 G 11 0 0 0 Merge 'linux-review/Waiman-Long/cpuset-Enable-cpuset-controller-in-default-hierarchy/20180618-121839' into devel-spot-201806181537
git bisect good 7ce999a205ad0c2731f3c71ab0688d8a40e7bc92 # 17:58 G 11 0 0 0 Merge 'linux-review/efremov-linux-com/ARM-S3C24XX-fix-typo-in-guard-macro-for-s3c2412-h/20180618-035001' into devel-spot-201806181537
git bisect good 1b342e0111745f85c89f1888152342a8e7bedcaa # 18:15 G 11 0 0 0 Merge 'linux-review/efremov-linux-com/crypto-skcipher-remove-static-declaration-of-export-function/20180618-015246' into devel-spot-201806181537
git bisect bad b5022002fed7d8cfc25f51246e3a23e3f71abdf9 # 18:25 B 0 11 24 0 Merge 'linux-review/frowand-list-gmail-com/of-overlay-update-phandle-cache-on-overlay-apply-and-remove/20180618-000935' into devel-spot-201806181537
git bisect good 15a1dd0388aa2468fbf412da9848e25adc1ee520 # 18:37 G 11 0 0 0 Merge 'linux-review/Robert-Jarzmik/ARM-pxa-switch-to-DMA-slave-maps/20180618-010625' into devel-spot-201806181537
git bisect bad 000946c8fd3a9ffaeec3b3f52522afa932568cc4 # 18:57 B 0 11 24 0 of: overlay: update phandle cache on overlay apply and remove
# first bad commit: [000946c8fd3a9ffaeec3b3f52522afa932568cc4] of: overlay: update phandle cache on overlay apply and remove
git bisect good 44acf10587907ff77c28fd97906220b2d8eb4f05 # 19:07 G 31 0 1 1 dt-bindings: submitting-patches: add guidance on patch content and subject
# extra tests with debug options
git bisect bad 000946c8fd3a9ffaeec3b3f52522afa932568cc4 # 19:19 B 0 1 14 0 of: overlay: update phandle cache on overlay apply and remove
# extra tests on HEAD of linux-devel/devel-spot-201806181537
git bisect bad 34a5660c8b22ba1acd878e7c8709cdb29594e4c2 # 19:25 B 0 49 65 0 0day head guard for 'devel-spot-201806181537'
# extra tests on tree/branch linux-review/frowand-list-gmail-com/of-overlay-update-phandle-cache-on-overlay-apply-and-remove/20180618-000935
git bisect bad 000946c8fd3a9ffaeec3b3f52522afa932568cc4 # 19:26 B 0 15 28 0 of: overlay: update phandle cache on overlay apply and remove
# extra tests with first bad commit reverted
git bisect good 4fce31f65973a3bb06d1f05ae55c77833585eac5 # 19:44 G 11 0 0 0 Revert "of: overlay: update phandle cache on overlay apply and remove"

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/lkp Intel Corporation


reproduce-yocto-lkp-hsw01-4220180618185722x86_64-randconfig-s4-061816474.17.0-rc1-00037-g000946c1

#!/bin/bash

kernel=$1
initrd=yocto-trinity-x86_64.cgz

wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/yocto/$initrd

kvm=(
qemu-system-x86_64
-enable-kvm
-cpu Haswell,+smep
-kernel $kernel
-initrd $initrd
-m 512
-smp 1
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-no-reboot
-watchdog i6300esb
-watchdog-action debug
-rtc base=localtime
-serial stdio
-display none
-monitor null
)

append=(
root=/dev/ram0
hung_task_panic=1
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
net.ifnames=0
printk.devkmsg=on
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
drbd.minor_count=8
systemd.log_level=err
ignore_loglevel
console=tty0
earlyprintk=ttyS0,115200
console=ttyS0,115200
vga=normal
rw
drbd.minor_count=8
rcuperf.shutdown=0
)

"${kvm[@]}" -append "${append[*]}"


Attachments:
dmesg-yocto-lkp-hsw01-4220180618185722x86_64-randconfig-s4-061816474.17.0-rc1-00037-g000946c1.gz (14.53 kB)
dmesg-vm-lkp-wsx03-1G-2920180618192143x86_64-randconfig-s4-061816474.17.0-rc1-00036-g44acf101.gz (15.65 kB)
reproduce-yocto-lkp-hsw01-4220180618185722x86_64-randconfig-s4-061816474.17.0-rc1-00037-g000946c1 (971.00 B)
config-4.17.0-rc1-00037-g000946c (111.36 kB)
Download all attachments

2018-07-20 15:07:41

by Rob Herring (Arm)

[permalink] [raw]
Subject: Re: [PATCH] of: overlay: update phandle cache on overlay apply and remove

On Thu, Jul 12, 2018 at 02:00:07PM -0700, [email protected] wrote:
> From: Frank Rowand <[email protected]>
>
> A comment in the review of the patch adding the phandle cache said that
> the cache would have to be updated when modules are applied and removed.
> This patch implements the cache updates.
>
> Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
> Reported-by: Alan Tull <[email protected]>
> Suggested-by: Alan Tull <[email protected]>
> Signed-off-by: Frank Rowand <[email protected]>
> ---
>
> Changes since RFC:
> - update code comment to mention race condition avoidance
>
> For the RFC version of this patch, the 0day test reported a general
> protection fault from the KASAN runtime memory debugger on x86_64
> in qemu. The GPF was in a devicetree unittest.
>
> 0day tested the patch on v4.17-rc1, with some other patches applied.
> I was unable to replicate the GPF on v4.18-rc1 with just this patch
> applied. I was also unable to replicate the GPF on a clone of the
> v4.17-rc1 0day repository, using the 0day kernel config. I will
> reply to this email with the 0day GPF report.

Didn't see any 0-day issues, so I've applied and it is in Linus' tree
now.

Rob