2022-05-09 16:29:25

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: [PATCH v6 2/4] module: update dependencies at try_module_get()

Sometimes, device drivers are bound into each other via try_module_get(),
making such references invisible when looking at /proc/modules or lsmod.

Add a function to allow setting up module references for such
cases, and call it when try_module_get() is used.

Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
---

See [PATCH v6 0/4] at: https://lore.kernel.org/all/[email protected]/

include/linux/module.h | 8 +++--
kernel/module/main.c | 73 ++++++++++++++++++++++++++++++++++--------
2 files changed, 64 insertions(+), 17 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index 77961f5773b6..a66b9be92ef5 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -618,12 +618,12 @@ extern void __module_get(struct module *module);

/* This is the Right Way to get a module: if it fails, it's being removed,
* so pretend it's not there. */
-extern bool try_module_get(struct module *module);
+extern bool try_module_get_owner(struct module *module, struct module *this);

extern void module_put(struct module *module);

#else /*!CONFIG_MODULE_UNLOAD*/
-static inline bool try_module_get(struct module *module)
+static inline bool try_module_get_owner(struct module *module, struct module *this)
{
return !module || module_is_live(module);
}
@@ -752,7 +752,7 @@ static inline void __module_get(struct module *module)
{
}

-static inline bool try_module_get(struct module *module)
+static inline bool try_module_get_owner(struct module *module, struct module *this)
{
return true;
}
@@ -893,6 +893,8 @@ static inline bool module_sig_ok(struct module *module)
}
#endif /* CONFIG_MODULE_SIG */

+#define try_module_get(mod) try_module_get_owner(mod, THIS_MODULE)
+
int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
struct module *, unsigned long),
void *data);
diff --git a/kernel/module/main.c b/kernel/module/main.c
index fe44d46c378b..6044aeba0f18 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -139,6 +139,24 @@ int unregister_module_notifier(struct notifier_block *nb)
}
EXPORT_SYMBOL(unregister_module_notifier);

+static bool __try_module_get(struct module *module)
+{
+ bool ret = true;
+
+ if (module) {
+ preempt_disable();
+ /* Note: here, we can fail to get a reference */
+ if (likely(module_is_live(module) &&
+ atomic_inc_not_zero(&module->refcnt) != 0))
+ trace_module_get(module, _RET_IP_);
+ else
+ ret = false;
+
+ preempt_enable();
+ }
+ return ret;
+}
+
/*
* We require a truly strong try_module_get(): 0 means success.
* Otherwise an error is returned due to ongoing or failed
@@ -149,7 +167,7 @@ static inline int strong_try_module_get(struct module *mod)
BUG_ON(mod && mod->state == MODULE_STATE_UNFORMED);
if (mod && mod->state == MODULE_STATE_COMING)
return -EBUSY;
- if (try_module_get(mod))
+ if (__try_module_get(mod))
return 0;
else
return -ENOENT;
@@ -620,6 +638,41 @@ static int ref_module(struct module *a, struct module *b)
return 0;
}

+static int ref_module_dependency(struct module *mod, struct module *this)
+{
+ int ret = 0;
+
+ if (!this || !mod || !this->name || !mod->holders_dir)
+ return -EINVAL;
+
+ if (mod == this)
+ return 0;
+
+ mutex_lock(&module_mutex);
+
+ if (already_uses(this, mod))
+ goto ret;
+
+ ret = strong_try_module_get(mod);
+ if (ret)
+ goto ret;
+
+ ret = add_module_usage(this, mod);
+ if (ret) {
+ module_put(mod);
+ goto ret;
+ }
+
+#ifdef CONFIG_MODULE_UNLOAD
+ ret = sysfs_create_link(mod->holders_dir,
+ &this->mkobj.kobj, this->name);
+#endif
+
+ret:
+ mutex_unlock(&module_mutex);
+ return ret;
+}
+
/* Clear the unload stuff of the module. */
static void module_unload_free(struct module *mod)
{
@@ -830,24 +883,16 @@ void __module_get(struct module *module)
}
EXPORT_SYMBOL(__module_get);

-bool try_module_get(struct module *module)
+bool try_module_get_owner(struct module *module, struct module *this)
{
- bool ret = true;
+ int ret = __try_module_get(module);

- if (module) {
- preempt_disable();
- /* Note: here, we can fail to get a reference */
- if (likely(module_is_live(module) &&
- atomic_inc_not_zero(&module->refcnt) != 0))
- trace_module_get(module, _RET_IP_);
- else
- ret = false;
+ if (ret)
+ ref_module_dependency(module, this);

- preempt_enable();
- }
return ret;
}
-EXPORT_SYMBOL(try_module_get);
+EXPORT_SYMBOL(try_module_get_owner);

void module_put(struct module *module)
{
--
2.35.3



2022-05-16 08:32:20

by kernel test robot

[permalink] [raw]
Subject: [module] a1f1245bef: BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c



Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: a1f1245bef4084ac5f6ccf3d075a3551476fe948 ("[Intel-gfx] [PATCH v6 2/4] module: update dependencies at try_module_get()")
url: https://github.com/intel-lab-lkp/linux/commits/Mauro-Carvalho-Chehab/Let-userspace-know-when-snd-hda-intel-needs-i915/20220510-002743
base: https://git.kernel.org/cgit/linux/kernel/git/mcgrof/linux.git modules-next
patch link: https://lore.kernel.org/alsa-devel/28a942f860ccdee05751dcccc74b70e9d64f2b94.1652113087.git.mchehab@kernel.org

in testcase: trinity
version: trinity-i386-4d2343bd-1_20200320
with following parameters:

runtime: 300s
group: group-03

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 21.287386][ T3487] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:280
[ 21.288810][ T3487] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3487, name: trinity-main
[ 21.291046][ T3487] preempt_count: 1, expected: 0
[ 21.292559][ T3487] CPU: 1 PID: 3487 Comm: trinity-main Not tainted 5.18.0-rc1-00031-ga1f1245bef40 #1
[ 21.294755][ T3487] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[ 21.297034][ T3487] Call Trace:
[ 21.298338][ T3487] <TASK>
[ 21.299507][ T3487] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 21.301012][ T3487] __might_resched.cold (kernel/sched/core.c:9734)
[ 21.302642][ T3487] mutex_lock (kernel/locking/mutex.c:280)
[ 21.304051][ T3487] try_module_get_owner (kernel/module/main.c:653 kernel/module/main.c:891)
[ 21.305665][ T3487] can_create (net/can/af_can.c:102 net/can/af_can.c:142) can
[ 21.307157][ T3487] __sock_create (net/socket.c:1469)
[ 21.308584][ T3487] __sys_socket (net/socket.c:1519 net/socket.c:1561)
[ 21.309971][ T3487] ? __might_fault (mm/memory.c:5324)
[ 21.311415][ T3487] ? _copy_from_user (arch/x86/include/asm/uaccess_64.h:46 arch/x86/include/asm/uaccess_64.h:52 lib/usercopy.c:16)
[ 21.312850][ T3487] __do_compat_sys_socketcall (net/compat.c:450)
[ 21.314418][ T3487] ? switch_fpu_return (arch/x86/kernel/fpu/context.h:50 arch/x86/kernel/fpu/context.h:77 arch/x86/kernel/fpu/core.c:755)
[ 21.315916][ T3487] __do_fast_syscall_32 (arch/x86/entry/common.c:112 arch/x86/entry/common.c:178)
[ 21.317406][ T3487] do_fast_syscall_32 (arch/x86/entry/common.c:203)
[ 21.319647][ T3487] entry_SYSENTER_compat_after_hwframe (arch/x86/entry/entry_64_compat.S:142)
[ 21.321308][ T3487] RIP: 0023:0xf7f92549
[ 21.322572][ T3487] Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
All code
========
0: 03 74 c0 01 add 0x1(%rax,%rax,8),%esi
4: 10 05 03 74 b8 01 adc %al,0x1b87403(%rip) # 0x1b8740d
a: 10 06 adc %al,(%rsi)
c: 03 74 b4 01 add 0x1(%rsp,%rsi,4),%esi
10: 10 07 adc %al,(%rdi)
12: 03 74 b0 01 add 0x1(%rax,%rsi,4),%esi
16: 10 08 adc %cl,(%rax)
18: 03 74 d8 01 add 0x1(%rax,%rbx,8),%esi
1c: 00 00 add %al,(%rax)
1e: 00 00 add %al,(%rax)
20: 00 51 52 add %dl,0x52(%rcx)
23: 55 push %rbp
24: 89 e5 mov %esp,%ebp
26: 0f 34 sysenter
28: cd 80 int $0x80
2a:* 5d pop %rbp <-- trapping instruction
2b: 5a pop %rdx
2c: 59 pop %rcx
2d: c3 retq
2e: 90 nop
2f: 90 nop
30: 90 nop
31: 90 nop
32: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
39: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi

Code starting with the faulting instruction
===========================================
0: 5d pop %rbp
1: 5a pop %rdx
2: 59 pop %rcx
3: c3 retq
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
f: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
[ 21.326895][ T3487] RSP: 002b:00000000ffcf9b90 EFLAGS: 00000206 ORIG_RAX: 0000000000000066
[ 21.328923][ T3487] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00000000ffcf9ba4
[ 21.330892][ T3487] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000003
[ 21.332797][ T3487] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
[ 21.334697][ T3487] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 21.336568][ T3487] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 21.338481][ T3487] </TASK>
[ 21.372109][ T3565] can: broadcast manager protocol
[ OK ] Started System Logging Service.
[ 23.041578][ T264] LKP: stdout: 243: Kernel tests: Boot OK!
[ 23.041588][ T264]
[ 28.843305][ T264] LKP: stdout: 243: HOSTNAME vm-snb-99, MAC 52:54:00:12:34:56, kernel 5.18.0-rc1-00031-ga1f1245bef40 1
[ 28.843315][ T264]
[ 29.128109][ T264] install debs round one: dpkg -i --force-confdef --force-depends /opt/deb/gawk_1%3a4.1.4+dfsg-1_i386.deb
[ 29.128119][ T264]
[ 29.135239][ T264] Selecting previously unselected package gawk.
[ 29.135247][ T264]
[ 29.143980][ T264] (Reading database ... 16210 files and directories currently installed.)
[ 29.143988][ T264]
[ 29.152286][ T264] Preparing to unpack .../gawk_1%3a4.1.4+dfsg-1_i386.deb ...
[ 29.152293][ T264]
[ 29.158858][ T264] Unpacking gawk (1:4.1.4+dfsg-1) ...
[ 29.158903][ T264]
[ 29.166758][ T264] Setting up gawk (1:4.1.4+dfsg-1) ...
[ 29.166765][ T264]
[ 36.360833][ T264] LKP: stdout: 243: /lkp/lkp/src/bin/run-lkp /lkp/jobs/scheduled/vm-snb-99/trinity-group-03-300s-debian-i386-20191205.cgz-a1f1245bef4084ac5f6ccf3d075a3551476fe948-20220512-45405-6v8k22-5.yaml
[ 36.360844][ T264]
[ 36.381667][ T264] RESULT_ROOT=/result/trinity/group-03-300s/vm-snb/debian-i386-20191205.cgz/x86_64-rhel-8.3/gcc-11/a1f1245bef4084ac5f6ccf3d075a3551476fe948/3
[ 36.381678][ T264]
[ 36.401159][ T264] job=/lkp/jobs/scheduled/vm-snb-99/trinity-group-03-300s-debian-i386-20191205.cgz-a1f1245bef4084ac5f6ccf3d075a3551476fe948-20220512-45405-6v8k22-5.yaml
[ 36.401169][ T264]
[ 41.230056][ T264] result_service: raw_upload, RESULT_MNT: /internal-lkp-server/result, RESULT_ROOT: /internal-lkp-server/result/trinity/group-03-300s/vm-snb/debian-i386-20191205.cgz/x86_64-rhel-8.3/gcc-11/a1f1245bef4084ac5f6ccf3d075a3551476fe948/3, TMP_RESULT_ROOT: /tmp/lkp/result
[ 41.230068][ T264]
[ 41.248856][ T264] run-job /lkp/jobs/scheduled/vm-snb-99/trinity-group-03-300s-debian-i386-20191205.cgz-a1f1245bef4084ac5f6ccf3d075a3551476fe948-20220512-45405-6v8k22-5.yaml
[ 41.248882][ T264]
[ 44.208341][ T264] /usr/bin/wget -q --timeout=1800 --tries=1 --local-encoding=UTF-8 http://internal-lkp-server:80/~lkp/cgi-bin/lkp-jobfile-append-var?job_file=/lkp/jobs/scheduled/vm-snb-99/trinity-group-03-300s-debian-i386-20191205.cgz-a1f1245bef4084ac5f6ccf3d075a3551476fe948-20220512-45405-6v8k22-5.yaml&job_state=running -O /dev/null
[ 44.208353][ T264]
[ 44.219857][ T264] target ucode:
[ 44.219891][ T264]
[ 44.228816][ T264] Seeding trinity by 167967812 based on vm-snb/debian-i386-20191205.cgz/x86_64-rhel-8.3
[ 44.228827][ T264]
[ 54.666581][ T264] 2022-05-12 02:09:19 chroot --userspec nobody:nogroup / trinity -q -q -l off -s 167967812 -N 999999999 -c add_key -c alarm -c bind -c brk -c chroot -c clock_getres -c clock_settime -c close -c execveat -c fadvise64_64 -c fchdir -c fchmodat -c fchown -c fchown16 -c fremovexattr -c fsmount -c fstatat64 -c futimesat -c getcpu -c getgid -c getsockopt -c gettid -c inotify_init1 -c io_uring_register -c ioctl -c listxattr -c llseek -c lsetxattr -c madvise -c migrate_pages -c mknod -c modify_ldt -c move_mount -c mq_getsetattr -c msync -c munlockall -c openat -c personality -c pidfd_open -c prlimit64 -c pwrite64 -c pwritev -c quotactl -c rename -c renameat2 -c rt_sigqueueinfo -c sched_getattr -c sched_getparam -c select -c semctl -c sendmmsg -c sendmsg -c sendto -c set_robust_list -c setfsgid16 -c setgid -c setgroups16 -c setitimer -c setresuid16 -c setuid -c sigaltstack -c signalfd -c statx -c te
[ 54.666593][ T264]
[ 60.531652][ T264] Trinity 2019.06 Dave Jones <[email protected]>
[ 60.531662][ T264]
[ 60.538969][ T264] shm:0xf7f88000-0x4734e14 (4 pages)
[ 60.538977][ T264]
[ 60.647920][ T264] [main] Marking syscall add_key (286) as to be enabled.
[ 60.647930][ T264]
[ 60.656357][ T264] [main] alarm is marked as AVOID. Skipping
[ 60.656366][ T264]
[ 60.665687][ T264] [main] Marking syscall alarm (27) as to be enabled.
[ 60.665695][ T264]
[ 60.673619][ T264] [main] Marking syscall bind (361) as to be enabled.
[ 60.673627][ T264]
[ 60.682931][ T264] [main] brk is marked as AVOID. Skipping
[ 60.682939][ T264]
[ 60.689625][ T264] [main] Marking syscall brk (45) as to be enabled.
[ 60.689633][ T264]
[ 60.697690][ T264] [main] Marking syscall chroot (61) as to be enabled.
[ 60.697698][ T264]
[ 61.189764][ T264] [main] Marking syscall clock_getres (266) as to be enabled.
[ 61.189774][ T264]
[ 61.299228][ T264] [main] Marking syscall clock_settime (264) as to be enabled.
[ 61.299239][ T264]
[ 61.307346][ T264] [main] close is marked as AVOID. Skipping
[ 61.307354][ T264]
[ 61.314936][ T264] [main] Marking syscall close (6) as to be enabled.
[ 61.314944][ T264]
[ 61.323575][ T264] [main] Marking syscall execveat (358) as to be enabled.
[ 61.323585][ T264]
[ 61.334213][ T264] [main] Marking syscall fadvise64_64 (272) as to be enabled.
[ 61.334222][ T264]
[ 61.344015][ T264] [main] Marking syscall fchdir (133) as to be enabled.
[ 61.344023][ T264]
[ 61.351111][ T264] [main] Marking syscall fchmodat (306) as to be enabled.
[ 61.351119][ T264]
[ 61.359986][ T264] [main] Marking syscall fchown (207) as to be enabled.
[ 61.359995][ T264]
[ 61.367978][ T264] [main] Marking syscall fchown16 (95) as to be enabled.
[ 61.367986][ T264]
[ 61.380250][ T264] [main] Marking syscall fremovexattr (237) as to be enabled.
[ 61.380259][ T264]
[ 61.391813][ T264] [main] Marking syscall fsmount (425) as to be enabled.
[ 61.391822][ T264]
[ 61.399349][ T264] [main] Marking syscall fstatat64 (300) as to be enabled.
[ 61.399357][ T264]
[ 61.411934][ T264] [main] Marking syscall futimesat (299) as to be enabled.


To reproduce:

# build kernel
cd linux
cp config-5.18.0-rc1-00031-ga1f1245bef40 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://01.org/lkp



Attachments:
(No filename) (11.80 kB)
config-5.18.0-rc1-00031-ga1f1245bef40 (165.16 kB)
job-script (4.64 kB)
dmesg.xz (15.81 kB)
Download all attachments