2023-06-26 12:08:25

by You Kangren

[permalink] [raw]
Subject: [PATCH] posix-timers: Release the acquired lock before returning

Release the acquired lock before returning to prevent errors

Signed-off-by: You Kangren <[email protected]>
---
kernel/time/posix-timers.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index b924f0f096fa..8ab383cba0af 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -599,6 +599,7 @@ static struct k_itimer *__lock_timer(timer_t timer_id, unsigned long *flags)
*/
if (timr->it_signal == current->signal) {
rcu_read_unlock();
+ spin_unlock_irqrestore(&timr->it_lock, *flags);
return timr;
}
spin_unlock_irqrestore(&timr->it_lock, *flags);
@@ -1066,9 +1067,10 @@ static void itimer_delete(struct k_itimer *timer)
* do_exit() only for the last thread of the thread group.
* So no other task can access and delete that timer.
*/
- if (WARN_ON_ONCE(timer_wait_running(timer, &flags) != timer))
+ if (WARN_ON_ONCE(timer_wait_running(timer, &flags) != timer)) {
+ spin_unlock_irqrestore(&timer->it_lock, flags);
return;
-
+ }
goto retry_delete;
}
list_del(&timer->list);
--
2.39.0



2023-06-26 15:08:57

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] posix-timers: Release the acquired lock before returning

On Mon, Jun 26 2023 at 20:00, You Kangren wrote:
> Release the acquired lock before returning to prevent errors

Seriously?

> @@ -599,6 +599,7 @@ static struct k_itimer *__lock_timer(timer_t timer_id, unsigned long *flags)
> */
> if (timr->it_signal == current->signal) {
> rcu_read_unlock();
> + spin_unlock_irqrestore(&timr->it_lock, *flags);
> return timr;

The name of this function is __lock_timer() so it better returns with
the timer locked, no?

> }
> spin_unlock_irqrestore(&timr->it_lock, *flags);
> @@ -1066,9 +1067,10 @@ static void itimer_delete(struct k_itimer *timer)
> * do_exit() only for the last thread of the thread group.
> * So no other task can access and delete that timer.
> */
> - if (WARN_ON_ONCE(timer_wait_running(timer, &flags) != timer))
> + if (WARN_ON_ONCE(timer_wait_running(timer, &flags) != timer)) {
> + spin_unlock_irqrestore(&timer->it_lock, flags);

There is no lock held in that case.

Thanks,

tglx

2023-06-28 14:16:25

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] posix-timers: Release the acquired lock before returning



Hello,

kernel test robot noticed "WARNING:bad_unlock_balance_detected" on:

commit: a33a94dbedf738c4e9e4b264883fbfd9a01b1e29 ("[PATCH] posix-timers: Release the acquired lock before returning")
url: https://github.com/intel-lab-lkp/linux/commits/You-Kangren/posix-timers-Release-the-acquired-lock-before-returning/20230626-200132
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git d2b32be7debd6c0deeae95844997bd89fbe4769d
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH] posix-timers: Release the acquired lock before returning

in testcase: trinity
version: trinity-i386-abe9de86-1_20230429
with following parameters:

runtime: 600s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]


[ 40.691875][ T648] WARNING: bad unlock balance detected!
[ 40.692294][ T648] 6.4.0-rc4-00041-ga33a94dbedf7 #17 Not tainted
[ 40.692764][ T648] -------------------------------------
[ 40.693174][ T648] trinity-c1/648 is trying to release lock (&posix_cpu_timers_key) at:
[ 40.693798][ T648] do_timer_gettime (kbuild/src/rand-4/kernel/time/posix-timers.c:716)
[ 40.694256][ T648] but there are no more locks to release!
[ 40.694687][ T648]
[ 40.694687][ T648] other info that might help us debug this:
[ 40.695297][ T648] no locks held by trinity-c1/648.
[ 40.695683][ T648]
[ 40.695683][ T648] stack backtrace:
[ 40.696130][ T648] CPU: 1 PID: 648 Comm: trinity-c1 Not tainted 6.4.0-rc4-00041-ga33a94dbedf7 #17 52a599974c7ffcbcfb7dedc40a8a8b5005ce7907
[ 40.697075][ T648] Call Trace:
[ 40.697333][ T648] dump_stack_lvl (kbuild/src/rand-4/lib/dump_stack.c:107 (discriminator 4))
[ 40.697681][ T648] dump_stack (kbuild/src/rand-4/lib/dump_stack.c:114)
[ 40.697992][ T648] print_unlock_imbalance_bug (kbuild/src/rand-4/kernel/locking/lockdep.c:5145 kbuild/src/rand-4/kernel/locking/lockdep.c:5121)
[ 40.698432][ T648] ? do_timer_gettime (kbuild/src/rand-4/kernel/time/posix-timers.c:716)
[ 40.698806][ T648] ? do_timer_gettime (kbuild/src/rand-4/kernel/time/posix-timers.c:716)
[ 40.699182][ T648] ? do_timer_gettime (kbuild/src/rand-4/kernel/time/posix-timers.c:716)
[ 40.699555][ T648] __lock_release+0x61/0x111
[ 40.699955][ T648] ? do_timer_gettime (kbuild/src/rand-4/kernel/time/posix-timers.c:716)
[ 40.700334][ T648] lock_release (kbuild/src/rand-4/kernel/locking/lockdep.c:467 kbuild/src/rand-4/kernel/locking/lockdep.c:5727)
[ 40.700667][ T648] _raw_spin_unlock_irqrestore (kbuild/src/rand-4/include/linux/spinlock_api_smp.h:150 kbuild/src/rand-4/kernel/locking/spinlock.c:194)
[ 40.701104][ T648] do_timer_gettime (kbuild/src/rand-4/kernel/time/posix-timers.c:716)
[ 40.701460][ T648] __ia32_sys_timer_gettime32 (kbuild/src/rand-4/kernel/time/posix-timers.c:741 kbuild/src/rand-4/kernel/time/posix-timers.c:735 kbuild/src/rand-4/kernel/time/posix-timers.c:735)
[ 40.701881][ T648] __do_fast_syscall_32 (kbuild/src/rand-4/arch/x86/entry/common.c:112 kbuild/src/rand-4/arch/x86/entry/common.c:178)
[ 40.706369][ T648] do_fast_syscall_32 (kbuild/src/rand-4/arch/x86/entry/common.c:203)
[ 40.706751][ T648] do_SYSENTER_32 (kbuild/src/rand-4/arch/x86/entry/common.c:247)
[ 40.707100][ T648] entry_SYSENTER_32 (kbuild/src/rand-4/arch/x86/entry/entry_32.S:867)
[ 40.707473][ T648] EIP: 0x37f7256d
[ 40.707755][ T648] Code: c4 01 10 03 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
All code
========
0: c4 01 10 03 (bad)
4: 03 74 c0 01 add 0x1(%rax,%rax,8),%esi
8: 10 05 03 74 b8 01 adc %al,0x1b87403(%rip) # 0x1b87411
e: 10 06 adc %al,(%rsi)
10: 03 74 b4 01 add 0x1(%rsp,%rsi,4),%esi
14: 10 07 adc %al,(%rdi)
16: 03 74 b0 01 add 0x1(%rax,%rsi,4),%esi
1a: 10 08 adc %cl,(%rax)
1c: 03 74 d8 01 add 0x1(%rax,%rbx,8),%esi
20: 00 51 52 add %dl,0x52(%rcx)
23: 55 push %rbp
24: 89 e5 mov %esp,%ebp
26: 0f 34 sysenter
28: cd 80 int $0x80
2a:* 5d pop %rbp <-- trapping instruction
2b: 5a pop %rdx
2c: 59 pop %rcx
2d: c3 retq
2e: 90 nop
2f: 90 nop
30: 90 nop
31: 90 nop
32: 8d 76 00 lea 0x0(%rsi),%esi
35: 58 pop %rax
36: b8 77 00 00 00 mov $0x77,%eax
3b: cd 80 int $0x80
3d: 90 nop
3e: 8d .byte 0x8d
3f: 76 .byte 0x76

Code starting with the faulting instruction
===========================================
0: 5d pop %rbp
1: 5a pop %rdx
2: 59 pop %rcx
3: c3 retq
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 8d 76 00 lea 0x0(%rsi),%esi
b: 58 pop %rax
c: b8 77 00 00 00 mov $0x77,%eax
11: cd 80 int $0x80
13: 90 nop
14: 8d .byte 0x8d
15: 76 .byte 0x76
[ 40.709247][ T648] EAX: ffffffda EBX: 00000000 ECX: 369e1000 EDX: 000000c8
[ 40.709776][ T648] ESI: fffffffe EDI: 0000cfa5 EBP: 0000d3ee ESP: 3ffe77cc
[ 40.710325][ T648] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000216
[ 40.710964][ T648] ------------[ cut here ]------------
[ 40.711429][ T648] raw_local_irq_restore() called with IRQs enabled
[ 40.711945][ T648] WARNING: CPU: 1 PID: 648 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore (kbuild/src/rand-4/kernel/locking/irqflag-debug.c:10 (discriminator 3))
[ 40.712739][ T648] Modules linked in: rose mpls_router phonet bluetooth ecdh_generic rfkill can_raw cn scsi_transport_iscsi scsi_mod atm netrom appletalk ax25 evbug crc32c_intel input_leds evdev button fuse autofs4
[ 40.714207][ T648] CPU: 1 PID: 648 Comm: trinity-c1 Not tainted 6.4.0-rc4-00041-ga33a94dbedf7 #17 52a599974c7ffcbcfb7dedc40a8a8b5005ce7907
[ 40.715159][ T648] EIP: warn_bogus_irq_restore (kbuild/src/rand-4/kernel/locking/irqflag-debug.c:10 (discriminator 3))
[ 40.715579][ T648] Code: 5b 89 fa 5e 5f 5d 8d 67 f8 5f 31 c9 c3 80 3d ab 1d 1b 42 00 75 1b 55 89 e5 c6 05 ab 1d 1b 42 01 68 28 1a bc 41 e8 a5 b7 82 ff <0f> 0b 58 c9 31 c0 c3 31 c0 c3 55 89 e5 56 53 64 8b 1d 80 d2 2a 42
All code
========
0: 5b pop %rbx
1: 89 fa mov %edi,%edx
3: 5e pop %rsi
4: 5f pop %rdi
5: 5d pop %rbp
6: 8d 67 f8 lea -0x8(%rdi),%esp
9: 5f pop %rdi
a: 31 c9 xor %ecx,%ecx
c: c3 retq
d: 80 3d ab 1d 1b 42 00 cmpb $0x0,0x421b1dab(%rip) # 0x421b1dbf
14: 75 1b jne 0x31
16: 55 push %rbp
17: 89 e5 mov %esp,%ebp
19: c6 05 ab 1d 1b 42 01 movb $0x1,0x421b1dab(%rip) # 0x421b1dcb
20: 68 28 1a bc 41 pushq $0x41bc1a28
25: e8 a5 b7 82 ff callq 0xffffffffff82b7cf
2a:* 0f 0b ud2 <-- trapping instruction
2c: 58 pop %rax
2d: c9 leaveq
2e: 31 c0 xor %eax,%eax
30: c3 retq
31: 31 c0 xor %eax,%eax
33: c3 retq
34: 55 push %rbp
35: 89 e5 mov %esp,%ebp
37: 56 push %rsi
38: 53 push %rbx
39: 64 8b 1d 80 d2 2a 42 mov %fs:0x422ad280(%rip),%ebx # 0x422ad2c0

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 58 pop %rax
3: c9 leaveq
4: 31 c0 xor %eax,%eax
6: c3 retq
7: 31 c0 xor %eax,%eax
9: c3 retq
a: 55 push %rbp
b: 89 e5 mov %esp,%ebp
d: 56 push %rsi
e: 53 push %rbx
f: 64 8b 1d 80 d2 2a 42 mov %fs:0x422ad280(%rip),%ebx # 0x422ad296
[ 40.717015][ T648] EAX: 00000000 EBX: 531a37e0 ECX: 00000000 EDX: 00000000
[ 40.717541][ T648] ESI: 00000202 EDI: bec97f78 EBP: bec97f30 ESP: bec97f2c
[ 40.718085][ T648] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246
[ 40.718678][ T648] CR0: 80050033 CR2: 00ef2f74 CR3: 7e6d5ac0 CR4: 000406b0
[ 40.719224][ T648] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 40.719766][ T648] DR6: fffe0ff0 DR7: 00000400
[ 40.720126][ T648] Call Trace:
[ 40.720383][ T648] ? show_regs (kbuild/src/rand-4/arch/x86/kernel/dumpstack.c:479 kbuild/src/rand-4/arch/x86/kernel/dumpstack.c:465)
[ 40.720707][ T648] ? warn_bogus_irq_restore (kbuild/src/rand-4/kernel/locking/irqflag-debug.c:10 (discriminator 3))
[ 40.721129][ T648] ? __warn (kbuild/src/rand-4/kernel/panic.c:673)
[ 40.721436][ T648] ? report_bug (kbuild/src/rand-4/lib/bug.c:201 kbuild/src/rand-4/lib/bug.c:219)
[ 40.721769][ T648] ? warn_bogus_irq_restore (kbuild/src/rand-4/kernel/locking/irqflag-debug.c:10 (discriminator 3))
[ 40.722178][ T648] ? exc_overflow (kbuild/src/rand-4/arch/x86/kernel/traps.c:337)
[ 40.722519][ T648] ? handle_bug (kbuild/src/rand-4/arch/x86/kernel/traps.c:324)
[ 40.722847][ T648] ? exc_invalid_op (kbuild/src/rand-4/arch/x86/kernel/traps.c:345 (discriminator 1))
[ 40.723198][ T648] ? handle_exception (kbuild/src/rand-4/arch/x86/entry/entry_32.S:1083)
[ 40.723575][ T648] ? exc_overflow (kbuild/src/rand-4/arch/x86/kernel/traps.c:337)
[ 40.723915][ T648] ? warn_bogus_irq_restore (kbuild/src/rand-4/kernel/locking/irqflag-debug.c:10 (discriminator 3))
[ 40.724309][ T648] ? exc_overflow (kbuild/src/rand-4/arch/x86/kernel/traps.c:337)
[ 40.724647][ T648] ? warn_bogus_irq_restore (kbuild/src/rand-4/kernel/locking/irqflag-debug.c:10 (discriminator 3))
[ 40.725053][ T648] _raw_spin_unlock_irqrestore (kbuild/src/rand-4/arch/x86/include/asm/irqflags.h:42 kbuild/src/rand-4/arch/x86/include/asm/irqflags.h:77 kbuild/src/rand-4/arch/x86/include/asm/irqflags.h:135 kbuild/src/rand-4/include/linux/spinlock_api_smp.h:151 kbuild/src/rand-4/kernel/locking/spinlock.c:194)
[ 40.725483][ T648] do_timer_gettime (kbuild/src/rand-4/kernel/time/posix-timers.c:716)
[ 40.725842][ T648] __ia32_sys_timer_gettime32 (kbuild/src/rand-4/kernel/time/posix-timers.c:741 kbuild/src/rand-4/kernel/time/posix-timers.c:735 kbuild/src/rand-4/kernel/time/posix-timers.c:735)
[ 40.726271][ T648] __do_fast_syscall_32 (kbuild/src/rand-4/arch/x86/entry/common.c:112 kbuild/src/rand-4/arch/x86/entry/common.c:178)
[ 40.726656][ T648] do_fast_syscall_32 (kbuild/src/rand-4/arch/x86/entry/common.c:203)
[ 40.727024][ T648] do_SYSENTER_32 (kbuild/src/rand-4/arch/x86/entry/common.c:247)
[ 40.727370][ T648] entry_SYSENTER_32 (kbuild/src/rand-4/arch/x86/entry/entry_32.S:867)
[ 40.727741][ T648] EIP: 0x37f7256d
[ 40.728023][ T648] Code: c4 01 10 03 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
All code
========
0: c4 01 10 03 (bad)
4: 03 74 c0 01 add 0x1(%rax,%rax,8),%esi
8: 10 05 03 74 b8 01 adc %al,0x1b87403(%rip) # 0x1b87411
e: 10 06 adc %al,(%rsi)
10: 03 74 b4 01 add 0x1(%rsp,%rsi,4),%esi
14: 10 07 adc %al,(%rdi)
16: 03 74 b0 01 add 0x1(%rax,%rsi,4),%esi
1a: 10 08 adc %cl,(%rax)
1c: 03 74 d8 01 add 0x1(%rax,%rbx,8),%esi
20: 00 51 52 add %dl,0x52(%rcx)
23: 55 push %rbp
24: 89 e5 mov %esp,%ebp
26: 0f 34 sysenter
28: cd 80 int $0x80
2a:* 5d pop %rbp <-- trapping instruction
2b: 5a pop %rdx
2c: 59 pop %rcx
2d: c3 retq
2e: 90 nop
2f: 90 nop
30: 90 nop
31: 90 nop
32: 8d 76 00 lea 0x0(%rsi),%esi
35: 58 pop %rax
36: b8 77 00 00 00 mov $0x77,%eax
3b: cd 80 int $0x80
3d: 90 nop
3e: 8d .byte 0x8d
3f: 76 .byte 0x76

Code starting with the faulting instruction
===========================================
0: 5d pop %rbp
1: 5a pop %rdx
2: 59 pop %rcx
3: c3 retq
4: 90 nop
5: 90 nop
6: 90 nop
7: 90 nop
8: 8d 76 00 lea 0x0(%rsi),%esi
b: 58 pop %rax
c: b8 77 00 00 00 mov $0x77,%eax
11: cd 80 int $0x80
13: 90 nop
14: 8d .byte 0x8d
15: 76 .byte 0x76


To reproduce:

# build kernel
cd linux
cp config-6.4.0-rc4-00041-ga33a94dbedf7 .config
make HOSTCC=gcc-12 CC=gcc-12 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-12 CC=gcc-12 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



Attachments:
(No filename) (14.87 kB)
config-6.4.0-rc4-00041-ga33a94dbedf7 (151.86 kB)
job-script (4.94 kB)
dmesg.xz (17.77 kB)
Download all attachments