2024-05-09 11:06:36

by Breno Leitao

[permalink] [raw]
Subject: [PATCH] debugobjects: Fix potential data race in debug_objects_maxchain

KCSAN has identified a potential data race in debugobjects, where the
global variable debug_objects_maxchain is accessed for both reading and
writing simultaneously in separate and parallel data paths. This results
in the following splat printed by KCSAN:

BUG: KCSAN: data-race in debug_check_no_obj_freed / debug_object_activate

write to 0xffffffff847ccfc8 of 4 bytes by task 734 on cpu 41:
debug_object_activate (lib/debugobjects.c:199
lib/debugobjects.c:564 lib/debugobjects.c:710)
call_rcu (kernel/rcu/rcu.h:227
kernel/rcu/tree.c:2719 kernel/rcu/tree.c:2838)
security_inode_free (security/security.c:1626)
__destroy_inode (./include/linux/fsnotify.h:222 fs/inode.c:287)
evict (fs/inode.c:310 fs/inode.c:682)
iput (fs/inode.c:1769)
dentry_unlink_inode (fs/dcache.c:401)
__dentry_kill (fs/dcache.c:?)
dput (fs/dcache.c:846)
__fput (fs/file_table.c:431)
____fput (fs/file_table.c:451)
task_work_run (kernel/task_work.c:181)
do_exit (kernel/exit.c:879)
do_group_exit (kernel/exit.c:1027)
__pfx___ia32_sys_exit_group (kernel/exit.c:1038)
x64_sys_call (arch/x86/entry/syscall_64.c:33)
do_syscall_64 (arch/x86/entry/common.c:?)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

read to 0xffffffff847ccfc8 of 4 bytes by task 384 on cpu 31:
debug_check_no_obj_freed (lib/debugobjects.c:1000 lib/debugobjects.c:1019)
kfree (mm/slub.c:2081 mm/slub.c:4280 mm/slub.c:4390)
percpu_ref_exit (lib/percpu-refcount.c:147)
css_free_rwork_fn (kernel/cgroup/cgroup.c:5357)
process_scheduled_works (kernel/workqueue.c:3272 kernel/workqueue.c:3348)
worker_thread (./include/linux/list.h:373
kernel/workqueue.c:955 kernel/workqueue.c:3430)
kthread (kernel/kthread.c:389)
ret_from_fork (arch/x86/kernel/process.c:153)
ret_from_fork_asm (arch/x86/entry/entry_64.S:257)

value changed: 0x00000070 -> 0x00000071

Include READ_ONCE()/WRITE_ONCE() annotations on the accesses to
debug_objects_maxchain to prevent potential data corruption, explicitly
indicating that this data is shared across two parallel data paths.

Signed-off-by: Breno Leitao <[email protected]>
---
lib/debugobjects.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index fb12a9bacd2f..fbd262aa6b29 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -195,8 +195,8 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
if (obj->object == addr)
return obj;
}
- if (cnt > debug_objects_maxchain)
- debug_objects_maxchain = cnt;
+ if (cnt > READ_ONCE(debug_objects_maxchain))
+ WRITE_ONCE(debug_objects_maxchain, cnt);

return NULL;
}
@@ -997,8 +997,8 @@ static void __debug_check_no_obj_freed(const void *address, unsigned long size)
}
raw_spin_unlock_irqrestore(&db->lock, flags);

- if (cnt > debug_objects_maxchain)
- debug_objects_maxchain = cnt;
+ if (cnt > READ_ONCE(debug_objects_maxchain))
+ WRITE_ONCE(debug_objects_maxchain, cnt);

objs_checked += cnt;
}
--
2.43.0



2024-05-24 19:27:02

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] debugobjects: Fix potential data race in debug_objects_maxchain

On Thu, May 09 2024 at 04:06, Breno Leitao wrote:
> - if (cnt > debug_objects_maxchain)
> - debug_objects_maxchain = cnt;
> + if (cnt > READ_ONCE(debug_objects_maxchain))
> + WRITE_ONCE(debug_objects_maxchain, cnt);

The data race is actually harmless as this is just used for debugfs
statistics.

I'm pretty sure that the rest of these debug variables are racy as
well. Can we please annotate all of them without waiting for KCSAN to
detect them one by one?

Thanks,

tglx