I got below panic when doing fuzz test:
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
CPU: 2 PID: 1 Comm: systemd Kdump: loaded Not tainted 6.1.0-rc1 #1
Hardware name: LENOVO 20L5A07XCD/20L5A07XCD, BIOS N24ET56W (1.31 ) 02/19/2020
Call Trace:
[ 157.210356] dump_stack_lvl+0x49/0x63
[ 157.210364] dump_stack+0x10/0x16
[ 157.210368] panic+0x10c/0x299
[ 157.210375] do_exit.cold+0x15/0x15
[ 157.210381] do_group_exit+0x35/0x90
[ 157.210386] get_signal+0x910/0x960
[ 157.210390] ? signal_wake_up_state+0x2e/0x40
[ 157.210396] ? complete_signal+0xd0/0x2c0
[ 157.210402] arch_do_signal_or_restart+0x37/0x7c0
[ 157.210408] ? send_signal_locked+0xf5/0x140
[ 157.210416] exit_to_user_mode_prepare+0x133/0x180
[ 157.210423] irqentry_exit_to_user_mode+0x9/0x20
[ 157.210428] noist_exc_debug+0xea/0x150
[ 157.210433] asm_exc_debug+0x34/0x40
[ 157.210438] RIP: 0033:0x7fcf2a8e51c9
[ 157.210442] Code: ff ff 73 01 c3 48 8b 0d c5 7c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 ba 00 00 00 <0f> 05 c3 0f 1f 40 00 f3 0f 1e fa b8 ea 00 00 00 0f 05 48 3d 01 f0
[ 157.210446] RSP: 002b:00007ffd7dc44678 EFLAGS: 00000302
[ 157.210451] RAX: 00000000000000ba RBX: 000055f7c0363170 RCX: 000055f7c04d2820
[ 157.210454] RDX: 00000000ffffffff RSI: ffffffffffffffff RDI: 000055f7c0363170
[ 157.210457] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000001dd0
[ 157.210460] R10: 00007ffd7ddc9090 R11: 000000000000d7da R12: 0000000000000001
[ 157.210463] R13: ffffffffffffffff R14: 000055f7bf3557c1 R15: 0000000000000000
If a task attaches init task and is single stepping it, when this task
exits, ptrace value will be cleaned. It causes SIGNAL_UNKILLABLE flag
cleaned, and init task will lose the protection. Init task maybe be killed
by SIGTRAP signal because of stepping enabled. So syscall tracing and
stepping should be turned off for protecting init task before ptrace value
is cleaned.
Signed-off-by: chen zhang <[email protected]>
---
kernel/ptrace.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 54482193e1ed..e7c41154b31e 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -130,6 +130,9 @@ void __ptrace_unlink(struct task_struct *child)
put_cred(old_cred);
spin_lock(&child->sighand->siglock);
+ if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
+ unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
+ user_disable_single_step(child);
child->ptrace = 0;
/*
* Clear all pending traps and TRAPPING. TRAPPING should be
--
2.25.1
On 10/24, chen zhang wrote:
>
> I got below panic when doing fuzz test:
>
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
> CPU: 2 PID: 1 Comm: systemd Kdump: loaded Not tainted 6.1.0-rc1 #1
> Hardware name: LENOVO 20L5A07XCD/20L5A07XCD, BIOS N24ET56W (1.31 ) 02/19/2020
> Call Trace:
> [ 157.210356] dump_stack_lvl+0x49/0x63
> [ 157.210364] dump_stack+0x10/0x16
> [ 157.210368] panic+0x10c/0x299
> [ 157.210375] do_exit.cold+0x15/0x15
> [ 157.210381] do_group_exit+0x35/0x90
> [ 157.210386] get_signal+0x910/0x960
> [ 157.210390] ? signal_wake_up_state+0x2e/0x40
> [ 157.210396] ? complete_signal+0xd0/0x2c0
> [ 157.210402] arch_do_signal_or_restart+0x37/0x7c0
> [ 157.210408] ? send_signal_locked+0xf5/0x140
> [ 157.210416] exit_to_user_mode_prepare+0x133/0x180
> [ 157.210423] irqentry_exit_to_user_mode+0x9/0x20
> [ 157.210428] noist_exc_debug+0xea/0x150
> [ 157.210433] asm_exc_debug+0x34/0x40
> [ 157.210438] RIP: 0033:0x7fcf2a8e51c9
> [ 157.210442] Code: ff ff 73 01 c3 48 8b 0d c5 7c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 ba 00 00 00 <0f> 05 c3 0f 1f 40 00 f3 0f 1e fa b8 ea 00 00 00 0f 05 48 3d 01 f0
> [ 157.210446] RSP: 002b:00007ffd7dc44678 EFLAGS: 00000302
> [ 157.210451] RAX: 00000000000000ba RBX: 000055f7c0363170 RCX: 000055f7c04d2820
> [ 157.210454] RDX: 00000000ffffffff RSI: ffffffffffffffff RDI: 000055f7c0363170
> [ 157.210457] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000001dd0
> [ 157.210460] R10: 00007ffd7ddc9090 R11: 000000000000d7da R12: 0000000000000001
> [ 157.210463] R13: ffffffffffffffff R14: 000055f7bf3557c1 R15: 0000000000000000
>
> If a task attaches init task and is single stepping it, when this task
> exits, ptrace value will be cleaned. It causes SIGNAL_UNKILLABLE flag
> cleaned, and init task will lose the protection. Init task maybe be killed
> by SIGTRAP signal because of stepping enabled.
Well. If debugger just exits then the tracee (init or a regular process) can be
killed. This is the application bug. Debugger should not exit unexpectedly, that
is all.
The kernel can't "cleanup" the state of the tracee anyway. Say, debugger installs
a breakpoint and exits. The tracee will be killed once it hits this bp. What can
the kernel do?
Not to mention I don't understand how your patch can actually help. If nothing
else,
- debugger does ptrace(PTRACE_SINGLESTEP), this wakes the tracee up
- the tracee enters force_sig_info_to_task(SIGTRAP) after single step
- debugger exits, __ptrace_unlink() clears ptrace/TIF_SINGLESTEP
- force_sig_info_to_task() clears SIGNAL_UNKILLABLE, the traced init
will be killed.
Oleg.
Hi chen,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v6.1-rc2 next-20221024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
patch link: https://lore.kernel.org/r/20221024094759.11434-1-chenzhang%40kylinos.cn
patch subject: [PATCH] ptrace: disable single step in __ptrace_unlink for protecting init task
config: mips-bcm47xx_defconfig
compiler: mipsel-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/30b32cd2ca59b28cda083521c0689decf9fe6243
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
git checkout 30b32cd2ca59b28cda083521c0689decf9fe6243
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=mips SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
In file included from arch/mips/include/asm/bug.h:5,
from include/linux/bug.h:5,
from include/linux/thread_info.h:13,
from include/asm-generic/current.h:5,
from ./arch/mips/include/generated/asm/current.h:1,
from include/linux/sched.h:12,
from kernel/ptrace.c:13:
kernel/ptrace.c: In function '__ptrace_unlink':
>> kernel/ptrace.c:134:55: error: '_TIF_SINGLESTEP' undeclared (first use in this function); did you mean 'PTRACE_SINGLESTEP'?
134 | unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
| ^~~~~~~~~~~~~~~
include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
78 | # define unlikely(x) __builtin_expect(!!(x), 0)
| ^
kernel/ptrace.c:134:55: note: each undeclared identifier is reported only once for each function it appears in
134 | unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
| ^~~~~~~~~~~~~~~
include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
78 | # define unlikely(x) __builtin_expect(!!(x), 0)
| ^
vim +134 kernel/ptrace.c
> 13 #include <linux/sched.h>
14 #include <linux/sched/mm.h>
15 #include <linux/sched/coredump.h>
16 #include <linux/sched/task.h>
17 #include <linux/errno.h>
18 #include <linux/mm.h>
19 #include <linux/highmem.h>
20 #include <linux/pagemap.h>
21 #include <linux/ptrace.h>
22 #include <linux/security.h>
23 #include <linux/signal.h>
24 #include <linux/uio.h>
25 #include <linux/audit.h>
26 #include <linux/pid_namespace.h>
27 #include <linux/syscalls.h>
28 #include <linux/uaccess.h>
29 #include <linux/regset.h>
30 #include <linux/hw_breakpoint.h>
31 #include <linux/cn_proc.h>
32 #include <linux/compat.h>
33 #include <linux/sched/signal.h>
34 #include <linux/minmax.h>
35
36 #include <asm/syscall.h> /* for syscall_get_* */
37
38 /*
39 * Access another process' address space via ptrace.
40 * Source/target buffer must be kernel space,
41 * Do not walk the page table directly, use get_user_pages
42 */
43 int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
44 void *buf, int len, unsigned int gup_flags)
45 {
46 struct mm_struct *mm;
47 int ret;
48
49 mm = get_task_mm(tsk);
50 if (!mm)
51 return 0;
52
53 if (!tsk->ptrace ||
54 (current != tsk->parent) ||
55 ((get_dumpable(mm) != SUID_DUMP_USER) &&
56 !ptracer_capable(tsk, mm->user_ns))) {
57 mmput(mm);
58 return 0;
59 }
60
61 ret = __access_remote_vm(mm, addr, buf, len, gup_flags);
62 mmput(mm);
63
64 return ret;
65 }
66
67
68 void __ptrace_link(struct task_struct *child, struct task_struct *new_parent,
69 const struct cred *ptracer_cred)
70 {
71 BUG_ON(!list_empty(&child->ptrace_entry));
72 list_add(&child->ptrace_entry, &new_parent->ptraced);
73 child->parent = new_parent;
74 child->ptracer_cred = get_cred(ptracer_cred);
75 }
76
77 /*
78 * ptrace a task: make the debugger its new parent and
79 * move it to the ptrace list.
80 *
81 * Must be called with the tasklist lock write-held.
82 */
83 static void ptrace_link(struct task_struct *child, struct task_struct *new_parent)
84 {
85 __ptrace_link(child, new_parent, current_cred());
86 }
87
88 /**
89 * __ptrace_unlink - unlink ptracee and restore its execution state
90 * @child: ptracee to be unlinked
91 *
92 * Remove @child from the ptrace list, move it back to the original parent,
93 * and restore the execution state so that it conforms to the group stop
94 * state.
95 *
96 * Unlinking can happen via two paths - explicit PTRACE_DETACH or ptracer
97 * exiting. For PTRACE_DETACH, unless the ptracee has been killed between
98 * ptrace_check_attach() and here, it's guaranteed to be in TASK_TRACED.
99 * If the ptracer is exiting, the ptracee can be in any state.
100 *
101 * After detach, the ptracee should be in a state which conforms to the
102 * group stop. If the group is stopped or in the process of stopping, the
103 * ptracee should be put into TASK_STOPPED; otherwise, it should be woken
104 * up from TASK_TRACED.
105 *
106 * If the ptracee is in TASK_TRACED and needs to be moved to TASK_STOPPED,
107 * it goes through TRACED -> RUNNING -> STOPPED transition which is similar
108 * to but in the opposite direction of what happens while attaching to a
109 * stopped task. However, in this direction, the intermediate RUNNING
110 * state is not hidden even from the current ptracer and if it immediately
111 * re-attaches and performs a WNOHANG wait(2), it may fail.
112 *
113 * CONTEXT:
114 * write_lock_irq(tasklist_lock)
115 */
116 void __ptrace_unlink(struct task_struct *child)
117 {
118 const struct cred *old_cred;
119 BUG_ON(!child->ptrace);
120
121 clear_task_syscall_work(child, SYSCALL_TRACE);
122 #if defined(CONFIG_GENERIC_ENTRY) || defined(TIF_SYSCALL_EMU)
123 clear_task_syscall_work(child, SYSCALL_EMU);
124 #endif
125
126 child->parent = child->real_parent;
127 list_del_init(&child->ptrace_entry);
128 old_cred = child->ptracer_cred;
129 child->ptracer_cred = NULL;
130 put_cred(old_cred);
131
132 spin_lock(&child->sighand->siglock);
133 if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
> 134 unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
135 user_disable_single_step(child);
136 child->ptrace = 0;
137 /*
138 * Clear all pending traps and TRAPPING. TRAPPING should be
139 * cleared regardless of JOBCTL_STOP_PENDING. Do it explicitly.
140 */
141 task_clear_jobctl_pending(child, JOBCTL_TRAP_MASK);
142 task_clear_jobctl_trapping(child);
143
144 /*
145 * Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
146 * @child isn't dead.
147 */
148 if (!(child->flags & PF_EXITING) &&
149 (child->signal->flags & SIGNAL_STOP_STOPPED ||
150 child->signal->group_stop_count)) {
151 child->jobctl |= JOBCTL_STOP_PENDING;
152
153 /*
154 * This is only possible if this thread was cloned by the
155 * traced task running in the stopped group, set the signal
156 * for the future reports.
157 * FIXME: we should change ptrace_init_task() to handle this
158 * case.
159 */
160 if (!(child->jobctl & JOBCTL_STOP_SIGMASK))
161 child->jobctl |= SIGSTOP;
162 }
163
164 /*
165 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
166 * @child in the butt. Note that @resume should be used iff @child
167 * is in TASK_TRACED; otherwise, we might unduly disrupt
168 * TASK_KILLABLE sleeps.
169 */
170 if (child->jobctl & JOBCTL_STOP_PENDING || task_is_traced(child))
171 ptrace_signal_wake_up(child, true);
172
173 spin_unlock(&child->sighand->siglock);
174 }
175
--
0-DAY CI Kernel Test Service
https://01.org/lkp
Hi chen,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v6.1-rc2 next-20221024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
patch link: https://lore.kernel.org/r/20221024094759.11434-1-chenzhang%40kylinos.cn
patch subject: [PATCH] ptrace: disable single step in __ptrace_unlink for protecting init task
config: arm-versatile_defconfig
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 791a7ae1ba3efd6bca96338e10ffde557ba83920)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm cross compiling tool for clang build
# apt-get install binutils-arm-linux-gnueabi
# https://github.com/intel-lab-lkp/linux/commit/30b32cd2ca59b28cda083521c0689decf9fe6243
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
git checkout 30b32cd2ca59b28cda083521c0689decf9fe6243
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
>> kernel/ptrace.c:134:48: error: use of undeclared identifier '_TIF_SINGLESTEP'
unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
^
1 error generated.
vim +/_TIF_SINGLESTEP +134 kernel/ptrace.c
125
126 child->parent = child->real_parent;
127 list_del_init(&child->ptrace_entry);
128 old_cred = child->ptracer_cred;
129 child->ptracer_cred = NULL;
130 put_cred(old_cred);
131
132 spin_lock(&child->sighand->siglock);
133 if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
> 134 unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
135 user_disable_single_step(child);
136 child->ptrace = 0;
137 /*
138 * Clear all pending traps and TRAPPING. TRAPPING should be
139 * cleared regardless of JOBCTL_STOP_PENDING. Do it explicitly.
140 */
141 task_clear_jobctl_pending(child, JOBCTL_TRAP_MASK);
142 task_clear_jobctl_trapping(child);
143
144 /*
145 * Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
146 * @child isn't dead.
147 */
148 if (!(child->flags & PF_EXITING) &&
149 (child->signal->flags & SIGNAL_STOP_STOPPED ||
150 child->signal->group_stop_count)) {
151 child->jobctl |= JOBCTL_STOP_PENDING;
152
153 /*
154 * This is only possible if this thread was cloned by the
155 * traced task running in the stopped group, set the signal
156 * for the future reports.
157 * FIXME: we should change ptrace_init_task() to handle this
158 * case.
159 */
160 if (!(child->jobctl & JOBCTL_STOP_SIGMASK))
161 child->jobctl |= SIGSTOP;
162 }
163
164 /*
165 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
166 * @child in the butt. Note that @resume should be used iff @child
167 * is in TASK_TRACED; otherwise, we might unduly disrupt
168 * TASK_KILLABLE sleeps.
169 */
170 if (child->jobctl & JOBCTL_STOP_PENDING || task_is_traced(child))
171 ptrace_signal_wake_up(child, true);
172
173 spin_unlock(&child->sighand->siglock);
174 }
175
--
0-DAY CI Kernel Test Service
https://01.org/lkp