Hi,

One of the RFC 6803 key derivation kunit subtests is failing.

cki-project data warehouse: https://datawarehouse.cki-project.org/issue/2514

Arches: X86_64, ARM64, S390x, ppc64le
First Appeared: ~6.8.rc2

TRACE:
# Subtest: RFC 6803 key derivation
# RFC 6803 key derivation: ASSERTION FAILED at net/sunrpc/auth_gss/gss_krb5_test.c:63
Expected err == 0, but
err == -110 (0xffffffffffffff92)
not ok 1 Derive Kc subkey for camellia128-cts-cmac
ok 2 Derive Ke subkey for camellia128-cts-cmac
ok 3 Derive Ki subkey for camellia128-cts-cmac
ok 4 Derive Kc subkey for camellia256-cts-cmac
ok 5 Derive Ke subkey for camellia256-cts-cmac
ok 6 Derive Ki subkey for camellia256-cts-cmac
# RFC 6803 key derivation: pass:5 fail:1 skip:0 total:6
not ok 1 RFC 6803 key derivation
--
2.44.0
Hi,
On Thu, 07 Mar 2024, Nico Pache wrote:
> Hi,
>
> One of the RFC 6803 key derivation kunit subtests is failing.
>
> cki-project data warehouse : https://datawarehouse.cki-project.org/issue/2514
>
> Arches: X86_64, ARM64, S390x, ppc64le
> First Appeared: ~6.8.rc2
>
> TRACE:
> # Subtest: RFC 6803 key derivation
> # RFC 6803 key derivation: ASSERTION FAILED at net/sunrpc/auth_gss/gss_krb5_test.c:63
> Expected err == 0, but
> err == -110 (0xffffffffffffff92)
> not ok 1 Derive Kc subkey for camellia128-cts-cmac
> ok 2 Derive Ke subkey for camellia128-cts-cmac
> ok 3 Derive Ki subkey for camellia128-cts-cmac
> ok 4 Derive Kc subkey for camellia256-cts-cmac
> ok 5 Derive Ke subkey for camellia256-cts-cmac
> ok 6 Derive Ki subkey for camellia256-cts-cmac
> # RFC 6803 key derivation: pass:5 fail:1 skip:0 total:6
> not ok 1 RFC 6803 key derivation

This was broken by:
c72a870926c2 kunit: add ability to run tests after boot using debugfs

__kunit_test_suites_init() runs any time a kernel module is loaded, via
the "kunit_mod_nb" notifier_block... even if the kernel module has no
kunit tests. But now __kunit_test_suites_init() also locks a mutex,
which is a problem if a kunit test itself needs to load a kernel module
(which the gss_krb5_test module does).

This fixes it for me:

---8<---
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 088489856db8..18af9453632b 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -707,6 +707,9 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_
 {
 	unsigned int i;
 
+	if (num_suites == 0)
+		return 0;
+
 	if (!kunit_enabled() && num_suites > 0) {
 		pr_info("kunit: disabled\n");
 		return 0;
---8<---
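
To illustrate why the early return matters, here is a minimal user-space
sketch of the locking pattern. It is not the kernel code: a pthread mutex
stands in for "kunit_run_lock", and the names init_result,
suites_init_model() and nested_load_model() are hypothetical. The model
uses trylock so that the case that blocks in the kernel is reported as
INIT_WOULD_BLOCK instead of hanging, and it takes both locks from one
thread, whereas in the real traces the waiter is a second modprobe process.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the kernel's kunit_run_lock (assumption: a plain pthread mutex). */
static pthread_mutex_t run_lock = PTHREAD_MUTEX_INITIALIZER;

enum init_result { INIT_RAN, INIT_SKIPPED, INIT_WOULD_BLOCK };

/*
 * Hypothetical model of __kunit_test_suites_init().
 * with_guard selects whether the "num_suites == 0" early return is present.
 */
enum init_result suites_init_model(int num_suites, bool with_guard)
{
	/* The proposed fix: nothing to run, so never touch the lock. */
	if (with_guard && num_suites == 0)
		return INIT_SKIPPED;

	/* trylock instead of lock, so the blocking case is observable. */
	if (pthread_mutex_trylock(&run_lock) != 0)
		return INIT_WOULD_BLOCK;

	/* ... the test suites would run here ... */

	pthread_mutex_unlock(&run_lock);
	return INIT_RAN;
}

/*
 * Model a test that triggers a nested module load while the outer
 * module load still holds the lock. The nested module has no suites.
 */
enum init_result nested_load_model(bool with_guard)
{
	enum init_result r;

	pthread_mutex_lock(&run_lock);        /* outer load is running its tests */
	r = suites_init_model(0, with_guard); /* nested load, zero suites */
	pthread_mutex_unlock(&run_lock);
	return r;
}
```

Without the guard, nested_load_model(false) reports INIT_WOULD_BLOCK, which
corresponds to the stuck modprobe. With the guard, the nested load returns
INIT_SKIPPED without contending for the lock.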

More detail below:

Here's the modprobe command where I loaded the gss_krb5_test module. This
process has the "kunit_run_lock" mutex locked:

PID: 1468 TASK: ffff9aed0ac20000 CPU: 0 COMMAND: "modprobe"
#0 [ffffba974196f6f8] __schedule at ffffffff83fd85f5
#1 [ffffba974196f7b0] schedule at ffffffff83fd9672
#2 [ffffba974196f7c8] schedule_timeout at ffffffff83fe0308
#3 [ffffba974196f818] wait_for_completion_timeout at ffffffff83fda3d4
#4 [ffffba974196f878] kunit_try_catch_run at ffffffffc0d5e851 [kunit]
#5 [ffffba974196f8c8] kunit_run_tests at ffffffffc0d5c0ea [kunit]
#6 [ffffba974196fb78] __kunit_test_suites_init at ffffffffc0d5c9af [kunit]
#7 [ffffba974196fb98] kunit_module_notify at ffffffffc0d5ba4b [kunit]
#8 [ffffba974196fc08] notifier_call_chain at ffffffff8314647a
#9 [ffffba974196fc40] blocking_notifier_call_chain_robust at ffffffff83146565
#10 [ffffba974196fc88] load_module at ffffffff831e1935
#11 [ffffba974196fde8] __do_sys_init_module at ffffffff831e1fba
#12 [ffffba974196fec0] do_syscall_64 at ffffffff83fc3461
#13 [ffffba974196fee8] do_user_addr_fault at ffffffff830979df
#14 [ffffba974196ff28] exc_page_fault at ffffffff83fc9c7f
#15 [ffffba974196ff50] entry_SYSCALL_64_after_hwframe at ffffffff840000ea
RIP: 00007ff1f272b4ae RSP: 00007ffd45db8f68 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 000055bf4c0c4b20 RCX: 00007ff1f272b4ae
RDX: 000055bf4b204e79 RSI: 0000000000099691 RDI: 000055bf4cbfd130
RBP: 00007ffd45db9020 R8: 000055bf4c0c4010 R9: 0000000000000007
R10: 0000000000000001 R11: 0000000000000246 R12: 000055bf4b204e79
R13: 0000000000040000 R14: 000055bf4c0c4c50 R15: 000055bf4c0c4390
ORIG_RAX: 00000000000000af CS: 0033 SS: 002b

Here's the kunit test case running. It's trying to allocate "cmac(camellia)"
via crypto_alloc_shash():

PID: 1508 TASK: ffff9aed155d0000 CPU: 1 COMMAND: "kunit_try_catch"
#0 [ffffba974194fba0] __schedule at ffffffff83fd85f5
#1 [ffffba974194fc58] schedule at ffffffff83fd9672
#2 [ffffba974194fc70] schedule_timeout at ffffffff83fe0308
#3 [ffffba974194fcc0] wait_for_completion_killable_timeout at ffffffff83fda708
#4 [ffffba974194fd20] crypto_larval_wait at ffffffff83747fb4
#5 [ffffba974194fd38] crypto_alg_mod_lookup at ffffffff83748252
#6 [ffffba974194fd70] crypto_alloc_tfm_node at ffffffff83748492
#7 [ffffba974194fdb0] krb5_kdf_feedback_cmac at ffffffffc0d76bb2 [rpcsec_gss_krb5]
#8 [ffffba974194fe30] kdf_case at ffffffffc0d800a8 [gss_krb5_test]
#9 [ffffba974194fe80] kunit_try_run_case at ffffffffc0d5bb54 [kunit]
#10 [ffffba974194fee8] kunit_generic_run_threadfn_adapter at ffffffffc0d5e797 [kunit]
#11 [ffffba974194fef8] kthread at ffffffff8313eda5
#12 [ffffba974194ff30] ret_from_fork at ffffffff830414a1
#13 [ffffba974194ff50] ret_from_fork_asm at ffffffff830039ab
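
This trace shows the test thread sleeping in
wait_for_completion_killable_timeout(), which appears consistent with the
"err == -110" in the report: in the generic Linux errno numbering (an
assumption that holds for the architectures listed in the report), 110 is
ETIMEDOUT. A small sketch of decoding the hex value from the KUnit log,
where decode_err() and GENERIC_ETIMEDOUT are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: asm-generic errno value for ETIMEDOUT. */
#define GENERIC_ETIMEDOUT 110

/* Reinterpret the 64-bit hex value printed by KUnit as a signed error code. */
int64_t decode_err(uint64_t raw)
{
	return (int64_t)raw;
}

/* True if the raw value from the log is -ETIMEDOUT in the generic numbering. */
int is_timeout(uint64_t raw)
{
	return decode_err(raw) == -GENERIC_ETIMEDOUT;
}
```

Passing 0xffffffffffffff92 from the failed assertion to decode_err() gives
-110.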

Here the crypto manager is trying to modprobe the camellia kernel module via a
usermodehelper call:

PID: 1511 TASK: ffff9aed04630000 CPU: 3 COMMAND: "cryptomgr_probe"
#0 [ffffba974195fb88] __schedule at ffffffff83fd85f5
#1 [ffffba974195fc40] schedule at ffffffff83fd9672
#2 [ffffba974195fc58] schedule_timeout at ffffffff83fe03c1
#3 [ffffba974195fca8] wait_for_completion_state at ffffffff83fdb06d
#4 [ffffba974195fd18] call_usermodehelper_exec at ffffffff83130313
#5 [ffffba974195fd68] __request_module at ffffffff831e325d
#6 [ffffba974195fe28] crypto_alg_mod_lookup at ffffffff83748220
#7 [ffffba974195fe60] crypto_grab_spawn at ffffffff83749ff7
#8 [ffffba974195fe98] cmac_create at ffffffff8375c2f0
#9 [ffffba974195fed8] cryptomgr_probe at ffffffff83754a93
#10 [ffffba974195fef8] kthread at ffffffff8313eda5
#11 [ffffba974195ff30] ret_from_fork at ffffffff830414a1
#12 [ffffba974195ff50] ret_from_fork_asm at ffffffff830039ab

And here's the resulting modprobe command, which is stuck waiting on the
"kunit_run_lock" mutex:

PID: 1512 TASK: ffff9aed143fafc0 CPU: 2 COMMAND: "modprobe"
#0 [ffffba9741957990] __schedule at ffffffff83fd85f5
#1 [ffffba9741957a48] schedule at ffffffff83fd9672
#2 [ffffba9741957a60] schedule_preempt_disabled at ffffffff83fd9cb5
#3 [ffffba9741957a68] __mutex_lock.constprop.0 at ffffffff83fdc57a
#4 [ffffba9741957ae8] __kunit_test_suites_init at ffffffffc0d5c95a [kunit]
#5 [ffffba9741957b08] kunit_module_notify at ffffffffc0d5ba4b [kunit]
#6 [ffffba9741957b78] notifier_call_chain at ffffffff8314647a
#7 [ffffba9741957bb0] blocking_notifier_call_chain_robust at ffffffff83146565
#8 [ffffba9741957bf8] load_module at ffffffff831e1935
#9 [ffffba9741957d58] __do_sys_init_module at ffffffff831e1fba
#10 [ffffba9741957e30] do_syscall_64 at ffffffff83fc3461
#11 [ffffba9741957e48] __vm_munmap at ffffffff833bcdeb
#12 [ffffba9741957ee8] do_syscall_64 at ffffffff83fc3470
#13 [ffffba9741957f50] entry_SYSCALL_64_after_hwframe at ffffffff840000ea
RIP: 00007f8ba092b4ae RSP: 00007ffc771e0378 RFLAGS: 00000246
RAX: ffffffffffffffda RBX: 00005572137e6e40 RCX: 00007f8ba092b4ae
RDX: 0000557211c4de79 RSI: 0000000000080451 RDI: 00007f8b9ff90010
RBP: 00007ffc771e0430 R8: 00005572137e6010 R9: 0000000000000007
R10: 0000000000000001 R11: 0000000000000246 R12: 0000557211c4de79
R13: 0000000000040000 R14: 00005572137e73b0 R15: 00005572137e6400
ORIG_RAX: 00000000000000af CS: 0033 SS: 002b

The camellia module doesn't even have any kunit tests, so __kunit_test_suites_init()
is waiting to lock the "kunit_run_lock" mutex for nothing:

crash> module -o | grep num_kunit
[0x478] int num_kunit_init_suites;
[0x488] int num_kunit_suites;
crash> mod | grep camellia
ffffffffc0da15c0 camellia_x86_64 ffffffffc0d99000 57344 (not loaded) [CONFIG_KALLSYMS]
crash> px 0xffffffffc0da15c0+0x478
$1 = 0xffffffffc0da1a38
crash> px 0xffffffffc0da15c0+0x488
$2 = 0xffffffffc0da1a48
crash> rd 0xffffffffc0da1a38
ffffffffc0da1a38: 0000000000000000 ........
crash> rd 0xffffffffc0da1a48
ffffffffc0da1a48: 0000000000000000 ........
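
The address arithmetic in the crash session above can be written out as a
short C sketch. The base address and field offsets are copied from the
output above; field_addr() is a hypothetical helper, not a crash or kernel
API.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the crash output above. */
#define CAMELLIA_MODULE_BASE      0xffffffffc0da15c0ULL /* struct module address */
#define OFF_NUM_KUNIT_INIT_SUITES 0x478ULL
#define OFF_NUM_KUNIT_SUITES      0x488ULL

/* Address of a field, given the struct's base address and the field's byte offset. */
uint64_t field_addr(uint64_t base, uint64_t offset)
{
	return base + offset;
}
```

field_addr(CAMELLIA_MODULE_BASE, OFF_NUM_KUNIT_INIT_SUITES) gives
0xffffffffc0da1a38 and field_addr(CAMELLIA_MODULE_BASE, OFF_NUM_KUNIT_SUITES)
gives 0xffffffffc0da1a48, the two addresses read with "rd" above. Both
read back as zero, so the module has no suites.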
-Scott
On Tue, Mar 19, 2024 at 11:51 AM Scott Mayhew <[email protected]> wrote:
>
> Hi,
>
> On Thu, 07 Mar 2024, Nico Pache wrote:
>
> > Hi,
> >
> > One of the RFC 6803 key derivation kunit subtests is failing.
> >
> > cki-project data warehouse : https://datawarehouse.cki-project.org/issue/2514
> >
> > Arches: X86_64, ARM64, S390x, ppc64le
> > First Appeared: ~6.8.rc2
> >
> > TRACE:
> > # Subtest: RFC 6803 key derivation
> > # RFC 6803 key derivation: ASSERTION FAILED at net/sunrpc/auth_gss/gss_krb5_test.c:63
> > Expected err == 0, but
> > err == -110 (0xffffffffffffff92)
> > not ok 1 Derive Kc subkey for camellia128-cts-cmac
> > ok 2 Derive Ke subkey for camellia128-cts-cmac
> > ok 3 Derive Ki subkey for camellia128-cts-cmac
> > ok 4 Derive Kc subkey for camellia256-cts-cmac
> > ok 5 Derive Ke subkey for camellia256-cts-cmac
> > ok 6 Derive Ki subkey for camellia256-cts-cmac
> > # RFC 6803 key derivation: pass:5 fail:1 skip:0 total:6
> > not ok 1 RFC 6803 key derivation
>
> This was broken by:
> c72a870926c2 kunit: add ability to run tests after boot using debugfs
>
> __kunit_test_suites_init() runs any time a kernel module is loaded, via
> the "kunit_mod_nb" notifier_block... even if the kernel module has no
> kunit tests. But now __kunit_test_suites_init() also locks a mutex,
> which is a problem if a kunit test itself needs to load a kernel module
> (which the gss_krb5_test module does).
>
> This fixes it for me:
>
> ---8<---
> diff --git a/lib/kunit/test.c b/lib/kunit/test.c
> index 088489856db8..18af9453632b 100644
> --- a/lib/kunit/test.c
> +++ b/lib/kunit/test.c
> @@ -707,6 +707,9 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_
>  {
>  	unsigned int i;
>
> +	if (num_suites == 0)
> +		return 0;
> +
>  	if (!kunit_enabled() && num_suites > 0) {
>  		pr_info("kunit: disabled\n");
>  		return 0;
> ---8<---
>
Nice find! Would you mind posting a patch?
-- Nico
On Wed, 20 Mar 2024, Nico Pache wrote:
> On Tue, Mar 19, 2024 at 11:51 AM Scott Mayhew <[email protected]> wrote:
> >
> > Hi,
> >
> > On Thu, 07 Mar 2024, Nico Pache wrote:
> >
> > > Hi,
> > >
> > > One of the RFC 6803 key derivation kunit subtests is failing.
> > >
> > > cki-project data warehouse : https://datawarehouse.cki-project.org/issue/2514
> > >
> > > Arches: X86_64, ARM64, S390x, ppc64le
> > > First Appeared: ~6.8.rc2
> > >
> > > TRACE:
> > > # Subtest: RFC 6803 key derivation
> > > # RFC 6803 key derivation: ASSERTION FAILED at net/sunrpc/auth_gss/gss_krb5_test.c:63
> > > Expected err == 0, but
> > > err == -110 (0xffffffffffffff92)
> > > not ok 1 Derive Kc subkey for camellia128-cts-cmac
> > > ok 2 Derive Ke subkey for camellia128-cts-cmac
> > > ok 3 Derive Ki subkey for camellia128-cts-cmac
> > > ok 4 Derive Kc subkey for camellia256-cts-cmac
> > > ok 5 Derive Ke subkey for camellia256-cts-cmac
> > > ok 6 Derive Ki subkey for camellia256-cts-cmac
> > > # RFC 6803 key derivation: pass:5 fail:1 skip:0 total:6
> > > not ok 1 RFC 6803 key derivation
> >
> > This was broken by:
> > c72a870926c2 kunit: add ability to run tests after boot using debugfs
> >
> > __kunit_test_suites_init() runs any time a kernel module is loaded, via
> > the "kunit_mod_nb" notifier_block... even if the kernel module has no
> > kunit tests. But now __kunit_test_suites_init() also locks a mutex,
> > which is a problem if a kunit test itself needs to load a kernel module
> > (which the gss_krb5_test module does).
> >
> > This fixes it for me:
> >
> > ---8<---
> > diff --git a/lib/kunit/test.c b/lib/kunit/test.c
> > index 088489856db8..18af9453632b 100644
> > --- a/lib/kunit/test.c
> > +++ b/lib/kunit/test.c
> > @@ -707,6 +707,9 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_
> >  {
> >  	unsigned int i;
> >
> > +	if (num_suites == 0)
> > +		return 0;
> > +
> >  	if (!kunit_enabled() && num_suites > 0) {
> >  		pr_info("kunit: disabled\n");
> >  		return 0;
> > ---8<---
> >
> Nice find! Would you mind posting a patch?
Yes - while testing I also found an issue with the RFC 8009 encryption
test and I wanted to be sure it wasn't related to this issue before
posting the patch.
-Scott
Commit c72a870926c2 added a mutex to prevent kunit tests from running
concurrently. Unfortunately that mutex gets locked during module load
regardless of whether the module actually has any kunit tests. This
causes a problem for kunit tests that might need to load other kernel
modules (e.g. gss_krb5_test loading the camellia module).

So check to see if there are actually any tests to run before locking
the kunit_run_lock mutex.

Fixes: c72a870926c2 ("kunit: add ability to run tests after boot using debugfs")
Reported-by: Nico Pache <[email protected]>
Signed-off-by: Scott Mayhew <[email protected]>
---
 lib/kunit/test.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 1d1475578515..b8514dbb337c 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -712,6 +712,9 @@ int __kunit_test_suites_init(struct kunit_suite * const * const suites, int num_
 {
 	unsigned int i;
 
+	if (num_suites == 0)
+		return 0;
+
 	if (!kunit_enabled() && num_suites > 0) {
 		pr_info("kunit: disabled\n");
 		return 0;
--
2.43.0
On Thu, Mar 21, 2024 at 10:32 AM Scott Mayhew <[email protected]> wrote:
>
> Commit c72a870926c2 added a mutex to prevent kunit tests from running
> concurrently. Unfortunately that mutex gets locked during module load
> regardless of whether the module actually has any kunit tests. This
> causes a problem for kunit tests that might need to load other kernel
> modules (e.g. gss_krb5_test loading the camellia module).
>
> So check to see if there are actually any tests to run before locking
> the kunit_run_lock mutex.
>
> Fixes: c72a870926c2 ("kunit: add ability to run tests after boot using debugfs")
> Reported-by: Nico Pache <[email protected]>
> Signed-off-by: Scott Mayhew <[email protected]>
Hi!
Sorry about this bug. Thanks for the patch! We should definitely add this check.
Reviewed-by: Rae Moar <[email protected]>
Thanks!
-Rae
On Thu, 21 Mar 2024 at 22:32, Scott Mayhew <[email protected]> wrote:
>
> Commit c72a870926c2 added a mutex to prevent kunit tests from running
> concurrently. Unfortunately that mutex gets locked during module load
> regardless of whether the module actually has any kunit tests. This
> causes a problem for kunit tests that might need to load other kernel
> modules (e.g. gss_krb5_test loading the camellia module).
>
> So check to see if there are actually any tests to run before locking
> the kunit_run_lock mutex.
>
> Fixes: c72a870926c2 ("kunit: add ability to run tests after boot using debugfs")
> Reported-by: Nico Pache <[email protected]>
> Signed-off-by: Scott Mayhew <[email protected]>
> ---
Thanks, this works well here, and is a good idea anyway.
Reviewed-by: David Gow <[email protected]>
Cheers,
-- David
On Sat, 23 Mar 2024, David Gow wrote:
> On Thu, 21 Mar 2024 at 22:32, Scott Mayhew <[email protected]> wrote:
> >
> > Commit c72a870926c2 added a mutex to prevent kunit tests from running
> > concurrently. Unfortunately that mutex gets locked during module load
> > regardless of whether the module actually has any kunit tests. This
> > causes a problem for kunit tests that might need to load other kernel
> > modules (e.g. gss_krb5_test loading the camellia module).
> >
> > So check to see if there are actually any tests to run before locking
> > the kunit_run_lock mutex.
> >
> > Fixes: c72a870926c2 ("kunit: add ability to run tests after boot using debugfs")
> > Reported-by: Nico Pache <[email protected]>
> > Signed-off-by: Scott Mayhew <[email protected]>
> > ---
>
> Thanks, this works well here, and is a good idea anyway.
>
> Reviewed-by: David Gow <[email protected]>
>
Brendan, David,
Is there a reason this patch hasn't been merged?
-Scott
On Tue, 30 Apr 2024 at 21:58, Scott Mayhew <[email protected]> wrote:
>
> On Sat, 23 Mar 2024, David Gow wrote:
>
> > On Thu, 21 Mar 2024 at 22:32, Scott Mayhew <[email protected]> wrote:
> > >
> > > Commit c72a870926c2 added a mutex to prevent kunit tests from running
> > > concurrently. Unfortunately that mutex gets locked during module load
> > > regardless of whether the module actually has any kunit tests. This
> > > causes a problem for kunit tests that might need to load other kernel
> > > modules (e.g. gss_krb5_test loading the camellia module).
> > >
> > > So check to see if there are actually any tests to run before locking
> > > the kunit_run_lock mutex.
> > >
> > > Fixes: c72a870926c2 ("kunit: add ability to run tests after boot using debugfs")
> > > Reported-by: Nico Pache <[email protected]>
> > > Signed-off-by: Scott Mayhew <[email protected]>
> > > ---
> >
> > Thanks, this works well here, and is a good idea anyway.
> >
> > Reviewed-by: David Gow <[email protected]>
> >
>
> Brendan, David,
>
> Is there a reason this patch hasn't been merged?
>
> -Scott
>
Sorry: it totally slipped through the net. Thanks for the reminder,
it's merged now:
https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/commit/?h=kunit&id=2168e528f8679881df7487309f3444a121b2b544
Cheers,
-- David