Syscall user dispatch makes it possible to cleanly intercept system
calls from user-land. However, most transparent checkpoint software
presently leverages some combination of ptrace and system call
injection to place software in a ready-to-checkpoint state.
If Syscall User Dispatch is enabled when the task is quiesced,
injected system calls are themselves intercepted and dispatched to
the task's signal handler.
This patch set implements three features to enable software such as
CRIU to cleanly interpose upon software leveraging syscall user dispatch.
- Implement PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH, akin to a similar
feature for SECCOMP. This allows a ptracer to temporarily disable
syscall user dispatch, making syscall injection possible.
- Implement an fs/proc extension that reports whether Syscall User
Dispatch is being used in /proc/<pid>/status. A similar value is present
for SECCOMP, and is used to determine whether special logic is
needed during checkpoint/resume.
- Implement a getter interface for Syscall User Dispatch config info.
To resume successfully, the checkpoint/resume software has to
save and restore this information. Presently this configuration
is write-only, with no way for C/R software to save it.
Signed-off-by: Gregory Price <[email protected]>
Gregory Price (3):
ptrace,syscall_user_dispatch: Implement Syscall User Dispatch
Suspension
fs/proc/array: Add Syscall User Dispatch to proc status
prctl,syscall_user_dispatch: add a getter for configuration info
.../admin-guide/syscall-user-dispatch.rst | 18 +++++++
fs/proc/array.c | 8 +++
include/linux/ptrace.h | 2 +
include/linux/syscall_user_dispatch.h | 7 +++
include/uapi/linux/prctl.h | 3 ++
include/uapi/linux/ptrace.h | 6 ++-
kernel/entry/syscall_user_dispatch.c | 19 +++++++
kernel/ptrace.c | 5 ++
kernel/sys.c | 4 ++
.../syscall_user_dispatch/sud_test.c | 54 +++++++++++++++++++
10 files changed, 125 insertions(+), 1 deletion(-)
--
2.37.3
This patch implements a simple getter interface for syscall user dispatch
configuration info.
To support checkpoint/resume of a syscall user dispatch process,
the prctl settings for syscall user dispatch must be fetchable.
Presently, these settings are write-only, making it impossible to
implement transparent checkpoint (coordination with the software is
required).
As Syscall User Dispatch is explicitly not for secure-container
development, exposing the configuration state via prctl does not
violate the original design intent.
Signed-off-by: Gregory Price <[email protected]>
---
.../admin-guide/syscall-user-dispatch.rst | 18 +++++++
include/linux/syscall_user_dispatch.h | 7 +++
include/uapi/linux/prctl.h | 3 ++
kernel/entry/syscall_user_dispatch.c | 14 +++++
kernel/sys.c | 4 ++
.../syscall_user_dispatch/sud_test.c | 54 +++++++++++++++++++
6 files changed, 100 insertions(+)
diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
index 60314953c728..8b2c8b6441b7 100644
--- a/Documentation/admin-guide/syscall-user-dispatch.rst
+++ b/Documentation/admin-guide/syscall-user-dispatch.rst
@@ -45,6 +45,10 @@ only the syscall dispatcher address and the userspace key.
As the ABI of these intercepted syscalls is unknown to Linux, these
syscalls are not instrumentable via ptrace or the syscall tracepoints.
+A getter interface is supplied for the purpose of userland
+checkpoint/restore software being able to suspend and restore the
+current state of the system.
+
Interface
---------
@@ -73,6 +77,20 @@ thread-wide, without the need to invoke the kernel directly. selector
can be set to SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK.
Any other value should terminate the program with a SIGSYS.
+
+A thread can fetch the current Syscall User Dispatch configuration with the following prctl:
+
+ prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>)
+
+<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::
+
+ struct syscall_user_dispatch {
+ char __user *selector;
+ unsigned long offset;
+ unsigned long len;
+ bool on_dispatch;
+ };
+
Security Notes
--------------
diff --git a/include/linux/syscall_user_dispatch.h b/include/linux/syscall_user_dispatch.h
index a0ae443fb7df..aab25e5b6496 100644
--- a/include/linux/syscall_user_dispatch.h
+++ b/include/linux/syscall_user_dispatch.h
@@ -16,6 +16,7 @@ struct syscall_user_dispatch {
bool on_dispatch;
};
+int get_syscall_user_dispatch(struct syscall_user_dispatch __user *usd);
int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
unsigned long len, char __user *selector);
@@ -25,6 +26,12 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
#else
struct syscall_user_dispatch {};
+static inline int get_syscall_user_dispatch(
+ struct syscall_user_dispatch __user *usd)
+{
+ return -EINVAL;
+}
+
static inline int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
unsigned long len, char __user *selector)
{
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index a5e06dcbba13..221c0e369cc0 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -284,4 +284,7 @@ struct prctl_mm_map {
#define PR_SET_VMA 0x53564d41
# define PR_SET_VMA_ANON_NAME 0
+/* Get Syscall User Dispatch configuration settings */
+#define PR_GET_SYSCALL_USER_DISPATCH 65
+
#endif /* _LINUX_PRCTL_H */
diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index f097c06224c9..71441664571a 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -73,6 +73,20 @@ bool syscall_user_dispatch(struct pt_regs *regs)
return true;
}
+int get_syscall_user_dispatch(struct syscall_user_dispatch __user *usd)
+{
+ struct syscall_user_dispatch *sd = &current->syscall_dispatch;
+
+ if (usd) {
+ if (copy_to_user(usd, sd, sizeof(*sd)))
+ return -EFAULT;
+ } else {
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
unsigned long len, char __user *selector)
{
diff --git a/kernel/sys.c b/kernel/sys.c
index 5fd54bf0e886..b762c49fc424 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2618,6 +2618,10 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
error = set_syscall_user_dispatch(arg2, arg3, arg4,
(char __user *) arg5);
break;
+ case PR_GET_SYSCALL_USER_DISPATCH:
+ error = get_syscall_user_dispatch(
+ (struct syscall_user_dispatch __user *) arg2);
+ break;
#ifdef CONFIG_SCHED_CORE
case PR_SCHED_CORE:
error = sched_core_share_pid(arg2, arg3, arg4, arg5);
diff --git a/tools/testing/selftests/syscall_user_dispatch/sud_test.c b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
index b5d592d4099e..555912f3c192 100644
--- a/tools/testing/selftests/syscall_user_dispatch/sud_test.c
+++ b/tools/testing/selftests/syscall_user_dispatch/sud_test.c
@@ -35,6 +35,16 @@
#define SYSCALL_DISPATCH_ON(x) ((x) = SYSCALL_DISPATCH_FILTER_BLOCK)
#define SYSCALL_DISPATCH_OFF(x) ((x) = SYSCALL_DISPATCH_FILTER_ALLOW)
+#ifndef PR_GET_SYSCALL_USER_DISPATCH
+#define PR_GET_SYSCALL_USER_DISPATCH 65
+#endif
+struct syscall_user_dispatch {
+ char *selector;
+ unsigned long offset;
+ unsigned long len;
+ bool on_dispatch;
+};
+
/* Test Summary:
*
* - dispatch_trigger_sigsys: Verify if PR_SET_SYSCALL_USER_DISPATCH is
@@ -309,4 +319,48 @@ TEST(direct_dispatch_range)
}
}
+
+TEST(get_dispatch_settings)
+{
+ int ret = 0;
+ struct syscall_user_dispatch usd;
+
+ glob_sel = SYSCALL_DISPATCH_FILTER_ALLOW;
+
+ /* Check the negative paths - bad user pointer */
+ ret = prctl(PR_GET_SYSCALL_USER_DISPATCH, NULL);
+ ASSERT_EQ(-1, ret) {
+ TH_LOG("Kernel reported success when passed a NULL pointer");
+ }
+ ASSERT_EQ(EINVAL, errno);
+
+ /* Get the settings prior to it being activated */
+ ret = prctl(PR_GET_SYSCALL_USER_DISPATCH, &usd);
+ ASSERT_EQ(0, ret) {
+ TH_LOG("Kernel failed to fetch syscall user dispatch settings");
+ }
+
+ /* Make sure selector is off prior to prctl. */
+ SYSCALL_DISPATCH_OFF(glob_sel);
+ ret = prctl(PR_SET_SYSCALL_USER_DISPATCH, PR_SYS_DISPATCH_ON, 0, 0L, &glob_sel);
+ ASSERT_EQ(0, ret) {
+ TH_LOG("Failed to enable Syscall User Dispatch");
+ }
+
+ /* sanity check the settings */
+ ret = prctl(PR_GET_SYSCALL_USER_DISPATCH, &usd);
+ ASSERT_EQ(0, ret) {
+ TH_LOG("Failed to get Syscall User Dispatch settings");
+ }
+ ASSERT_EQ(&glob_sel, usd.selector) {
+ TH_LOG("Selector is an unexpected pointer");
+ }
+ ASSERT_EQ(0, usd.offset) {
+ TH_LOG("Offset is an unexpected value");
+ }
+ ASSERT_EQ(0, usd.len) {
+ TH_LOG("Length is an unexpected value");
+ }
+}
+
TEST_HARNESS_MAIN
--
2.37.3
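For illustration only, the save-and-restore use of the getter described in this patch could be sketched as below. This is not part of the patch. `struct sud_config` mirrors the fields of `struct syscall_user_dispatch` from the documentation hunk, while `struct sud_set_args` and `sud_restore_args()` are hypothetical names. Treating a non-NULL selector as "dispatch enabled" is an assumption borrowed from the /proc check in this series; a configuration enabled with a NULL selector would not be captured this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mode values as defined next to PR_SET_SYSCALL_USER_DISPATCH in
 * include/uapi/linux/prctl.h; renamed here to avoid header clashes. */
#define SKETCH_PR_SYS_DISPATCH_OFF 0
#define SKETCH_PR_SYS_DISPATCH_ON  1

/* Userspace mirror of the fields the getter is documented to return. */
struct sud_config {
	char *selector;
	unsigned long offset;
	unsigned long len;
	bool on_dispatch;
};

/* Hypothetical container for the arguments of the matching
 * prctl(PR_SET_SYSCALL_USER_DISPATCH, mode, offset, len, selector). */
struct sud_set_args {
	unsigned long mode;
	unsigned long offset;
	unsigned long len;
	char *selector;
};

/* Derive the restore-time arguments from a saved configuration.
 * Assumption: a non-NULL selector means dispatch was enabled. */
static struct sud_set_args sud_restore_args(const struct sud_config *saved)
{
	/* With mode OFF, offset, len and selector are all left at zero. */
	struct sud_set_args args = { SKETCH_PR_SYS_DISPATCH_OFF, 0, 0, NULL };

	assert(saved != NULL);

	if (saved->selector) {
		args.mode = SKETCH_PR_SYS_DISPATCH_ON;
		args.offset = saved->offset;
		args.len = saved->len;
		args.selector = saved->selector;
	}
	return args;
}
```

A restorer would pass the four fields of the result to PR_SET_SYSCALL_USER_DISPATCH from within the restored thread. The on_dispatch field is deliberately not replayed in this sketch.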
If a dispatch selector has been configured for Syscall User Dispatch,
report Syscall User Dispatch as configured in /proc/<pid>/status.
This provides an indicator to userland checkpoint/restart software that
it must manage special signal conditions (similar to SECCOMP).
Signed-off-by: Gregory Price <[email protected]>
---
fs/proc/array.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 49283b8103c7..c85cdb4c137c 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -428,6 +428,13 @@ static inline void task_thp_status(struct seq_file *m, struct mm_struct *mm)
seq_printf(m, "THP_enabled:\t%d\n", thp_enabled);
}
+static inline void task_syscall_user_dispatch(struct seq_file *m,
+ struct task_struct *p)
+{
+ seq_put_decimal_ull(m, "\nSyscall_user_dispatch:\t",
+ (p->syscall_dispatch.selector != NULL));
+}
+
int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
@@ -451,6 +458,7 @@ int proc_pid_status(struct seq_file *m, struct pid_namespace *ns,
task_cpus_allowed(m, task);
cpuset_task_status_allowed(m, task);
task_context_switch_counts(m, task);
+ task_syscall_user_dispatch(m, task);
return 0;
}
--
2.37.3
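As a rough sketch of how checkpoint software might consume the new status field, the helper below scans the text of a status file for the `Syscall_user_dispatch:` key added by this patch. The function name `sud_status_from_text()` is hypothetical, and reading the file itself is left out so that the parsing logic stands alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scan status-file text line by line for the Syscall_user_dispatch field.
 * Returns 1 or 0 for the reported value, or -1 if the field is absent
 * (for example on a kernel without this patch).
 */
static int sud_status_from_text(const char *status)
{
	static const char key[] = "Syscall_user_dispatch:";
	const char *line = status;

	assert(status != NULL);

	while (line && *line) {
		if (strncmp(line, key, sizeof(key) - 1) == 0)
			/* strtol() skips the tab that follows the key. */
			return strtol(line + sizeof(key) - 1, NULL, 10) != 0;

		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Returning -1 for a missing field lets the caller distinguish "not configured" from "kernel does not report it", which matters when the same tool runs on older kernels.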
Add PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH to ptrace options, and
modify Syscall User Dispatch to suspend interception when the option is set.
This is modeled after the SUSPEND_SECCOMP feature, which suspends
SECCOMP interposition. Without this, system calls that software like
CRIU injects into a process are intercepted by Syscall User Dispatch,
either causing a crash (due to blocked signals) or the delivery of
those signals to a ptracer (not the intended behavior).
Since Syscall User Dispatch is not a privileged feature, a check
for permissions is not required. However, attempting to set this
option when CONFIG_CHECKPOINT_RESTORE is not enabled should be
disallowed, as its intended use is checkpoint/resume.
Signed-off-by: Gregory Price <[email protected]>
---
include/linux/ptrace.h | 2 ++
include/uapi/linux/ptrace.h | 6 +++++-
kernel/entry/syscall_user_dispatch.c | 5 +++++
kernel/ptrace.c | 5 +++++
4 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index eaaef3ffec22..461ae5c99d57 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -45,6 +45,8 @@ extern int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
#define PT_EXITKILL (PTRACE_O_EXITKILL << PT_OPT_FLAG_SHIFT)
#define PT_SUSPEND_SECCOMP (PTRACE_O_SUSPEND_SECCOMP << PT_OPT_FLAG_SHIFT)
+#define PT_SUSPEND_SYSCALL_USER_DISPATCH \
+ (PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH << PT_OPT_FLAG_SHIFT)
extern long arch_ptrace(struct task_struct *child, long request,
unsigned long addr, unsigned long data);
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index 195ae64a8c87..ba9e3f19a22c 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -146,9 +146,13 @@ struct ptrace_rseq_configuration {
/* eventless options */
#define PTRACE_O_EXITKILL (1 << 20)
#define PTRACE_O_SUSPEND_SECCOMP (1 << 21)
+#define PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH (1 << 22)
#define PTRACE_O_MASK (\
- 0x000000ff | PTRACE_O_EXITKILL | PTRACE_O_SUSPEND_SECCOMP)
+ 0x000000ff | \
+ PTRACE_O_EXITKILL | \
+ PTRACE_O_SUSPEND_SECCOMP | \
+ PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH)
#include <asm/ptrace.h>
diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index 0b6379adff6b..f097c06224c9 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -8,6 +8,7 @@
#include <linux/uaccess.h>
#include <linux/signal.h>
#include <linux/elf.h>
+#include <linux/ptrace.h>
#include <linux/sched/signal.h>
#include <linux/sched/task_stack.h>
@@ -36,6 +37,10 @@ bool syscall_user_dispatch(struct pt_regs *regs)
struct syscall_user_dispatch *sd = &current->syscall_dispatch;
char state;
+ if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
+ unlikely(current->ptrace & PT_SUSPEND_SYSCALL_USER_DISPATCH))
+ return false;
+
if (likely(instruction_pointer(regs) - sd->offset < sd->len))
return false;
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 54482193e1ed..a6ad815bd4be 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -370,6 +370,11 @@ static int check_ptrace_options(unsigned long data)
if (data & ~(unsigned long)PTRACE_O_MASK)
return -EINVAL;
+ if (unlikely(data & PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH)) {
+ if (!IS_ENABLED(CONFIG_CHECKPOINT_RESTORE))
+ return -EINVAL;
+ }
+
if (unlikely(data & PTRACE_O_SUSPEND_SECCOMP)) {
if (!IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) ||
!IS_ENABLED(CONFIG_SECCOMP))
--
2.37.3
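The option validation in this patch can be modeled in userspace as follows. This is a simplified sketch, not kernel code: the `MODEL_` constants copy the bit values from the uapi hunk above, `check_options_model()` is a hypothetical name, the `checkpoint_restore` parameter stands in for the kernel configuration test, and the additional SECCOMP-specific checks are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Bit values copied from the include/uapi/linux/ptrace.h hunk. */
#define MODEL_O_EXITKILL        (1UL << 20)
#define MODEL_O_SUSPEND_SECCOMP (1UL << 21)
#define MODEL_O_SUSPEND_SUD     (1UL << 22)
#define MODEL_O_MASK (0x000000ffUL | MODEL_O_EXITKILL | \
		      MODEL_O_SUSPEND_SECCOMP | MODEL_O_SUSPEND_SUD)

/*
 * Userspace model of the check: unknown bits are rejected, and the new
 * suspend option is accepted only when checkpoint/restore support is
 * available (passed in as a flag here).
 */
static int check_options_model(unsigned long data, bool checkpoint_restore)
{
	if (data & ~MODEL_O_MASK)
		return -EINVAL;

	if ((data & MODEL_O_SUSPEND_SUD) && !checkpoint_restore)
		return -EINVAL;

	return 0;
}
```

The model makes the intent of the mask change visible: bit 22 is only a valid option because it was added to the mask, and it is still refused when checkpoint/restore support is absent.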
On Mon, Jan 09, 2023 at 10:33:48AM -0500, Gregory Price wrote:
> This patch implements simple getter interface for syscall user dispatch
> configuration info.
>
> To support checkpoint/resume of a syscall user dispatch process,
> the prctl settings for syscall user dispatch must be fetchable.
> Presently, these settings are write-only, making it impossible to
> implement transparent checkpoint (coordination with the software is
> required).
>
> As Syscall User Dispatch is explicitly not for secure-container
> development, exposing the configuration state via prctl does not
> violate the original design intent.
>
> Signed-off-by: Gregory Price <[email protected]>
> ---
> .../admin-guide/syscall-user-dispatch.rst | 18 +++++++
> include/linux/syscall_user_dispatch.h | 7 +++
> include/uapi/linux/prctl.h | 3 ++
> kernel/entry/syscall_user_dispatch.c | 14 +++++
> kernel/sys.c | 4 ++
> .../syscall_user_dispatch/sud_test.c | 54 +++++++++++++++++++
> 6 files changed, 100 insertions(+)
>
> diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> index 60314953c728..8b2c8b6441b7 100644
> --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> +++ b/Documentation/admin-guide/syscall-user-dispatch.rst
> @@ -45,6 +45,10 @@ only the syscall dispatcher address and the userspace key.
> As the ABI of these intercepted syscalls is unknown to Linux, these
> syscalls are not instrumentable via ptrace or the syscall tracepoints.
>
> +A getter interface is supplied for the purpose of userland
> +checkpoint/restore software being able to suspend and restore the
> +current state of the system.
> +
> Interface
> ---------
>
> @@ -73,6 +77,20 @@ thread-wide, without the need to invoke the kernel directly. selector
> can be set to SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK.
> Any other value should terminate the program with a SIGSYS.
>
> +
> +A thread can fetch the current Syscall User Dispatch configuration with the following prctl:
> +
> + prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>))
> +
> +<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::
syscall_user_dispatch.h isn't a part of uapi, so I am not sure that it
is a good idea to use it here.
For criu, it is much more convenient to have a ptrace interface to get
this sort of parameter. prctl requires executing a system call in the
context of the target process. That is tricky, so we want to minimize
the number of such calls.
Thanks,
Andrei
On Thu, Jan 12, 2023 at 10:15:39AM -0800, Andrei Vagin wrote:
> On Mon, Jan 09, 2023 at 10:33:48AM -0500, Gregory Price wrote:
> > This patch implements simple getter interface for syscall user dispatch
> > configuration info.
> >
> > To support checkpoint/resume of a syscall user dispatch process,
> > the prctl settings for syscall user dispatch must be fetchable.
> > Presently, these settings are write-only, making it impossible to
> > implement transparent checkpoint (coordination with the software is
> > required).
> >
> > As Syscall User Dispatch is explicitly not for secure-container
> > development, exposing the configuration state via prctl does not
> > violate the original design intent.
> >
> > Signed-off-by: Gregory Price <[email protected]>
> > ---
> > .../admin-guide/syscall-user-dispatch.rst | 18 +++++++
> > include/linux/syscall_user_dispatch.h | 7 +++
> > include/uapi/linux/prctl.h | 3 ++
> > kernel/entry/syscall_user_dispatch.c | 14 +++++
> > kernel/sys.c | 4 ++
> > .../syscall_user_dispatch/sud_test.c | 54 +++++++++++++++++++
> > 6 files changed, 100 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> > index 60314953c728..8b2c8b6441b7 100644
> > --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> > +++ b/Documentation/admin-guide/syscall-user-dispatch.rst
> > @@ -45,6 +45,10 @@ only the syscall dispatcher address and the userspace key.
> > As the ABI of these intercepted syscalls is unknown to Linux, these
> > syscalls are not instrumentable via ptrace or the syscall tracepoints.
> >
> > +A getter interface is supplied for the purpose of userland
> > +checkpoint/restore software being able to suspend and restore the
> > +current state of the system.
> > +
> > Interface
> > ---------
> >
> > @@ -73,6 +77,20 @@ thread-wide, without the need to invoke the kernel directly. selector
> > can be set to SYSCALL_DISPATCH_FILTER_ALLOW or SYSCALL_DISPATCH_FILTER_BLOCK.
> > Any other value should terminate the program with a SIGSYS.
> >
> > +
> > +A thread can fetch the current Syscall User Dispatch configuration with the following prctl:
> > +
> > + prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>))
> > +
> > +<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::
>
> syscall_user_dispatch.h isn't a part of uapi, so I am not sure that it
> is a good idea to use it here.
>
> For criu, it is much more convenient to have a ptrace interface to get
> this sort of parameter. prctl requires executing a system call in the
> context of the target process. That is tricky, so we want to minimize
> the number of such calls.
>
> Thanks,
> Andrei
Thank you for the feedback.
I think you're right. A ptrace interface for this seems more in line
with the SECCOMP filter exporting that CRIU uses too.
I'll look at implementing that instead.
On Mon, Jan 09, 2023 at 10:33:48AM -0500, Gregory Price wrote:
> This patch implements simple getter interface for syscall user dispatch
> configuration info.
s/This patch implements/Implement/
> +
> +A thread can fetch the current Syscall User Dispatch configuration with the following prctl:
This should end with a double colon (::) to make the code below a code
block, consistent with the syscall_user_dispatch definition below.
> +
> + prctl(PR_GET_SYSCALL_USER_DISPATCH, <dispatch_config>))
> +
> +<dispatch_config> is a pointer to a ``struct syscall_user_dispatch`` as defined in ``linux/include/linux/syscall_user_dispatch.h``::
> +
> + struct syscall_user_dispatch {
> + char __user *selector;
> + unsigned long offset;
> + unsigned long len;
> + bool on_dispatch;
> + };
> +
Thanks.
Hi Gregory,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on shuah-kselftest/next]
[also build test ERROR on shuah-kselftest/fixes ebiederm-user-namespace/for-next linus/master v6.2-rc4 next-20230117]
[cannot apply to tip/core/entry]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Gregory-Price/ptrace-syscall_user_dispatch-Implement-Syscall-User-Dispatch-Suspension/20230109-233954
base: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git next
patch link: https://lore.kernel.org/r/20230109153348.5625-3-gregory.price%40memverge.com
patch subject: [PATCH 2/3] fs/proc/array: Add Syscall User Dispatch to proc status
config: arm-pxa168_defconfig
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 4196ca3278f78c6e19246e54ab0ecb364e37d66a)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm cross compiling tool for clang build
# apt-get install binutils-arm-linux-gnueabi
# https://github.com/intel-lab-lkp/linux/commit/f6bd5bdbe4e444c678e756d5b8b50e07ea4ccec5
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Gregory-Price/ptrace-syscall_user_dispatch-Implement-Syscall-User-Dispatch-Suspension/20230109-233954
git checkout f6bd5bdbe4e444c678e756d5b8b50e07ea4ccec5
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
>> fs/proc/array.c:435:29: error: no member named 'selector' in 'struct syscall_user_dispatch'
(p->syscall_dispatch.selector != NULL));
~~~~~~~~~~~~~~~~~~~ ^
1 error generated.
vim +435 fs/proc/array.c
430
431 static inline void task_syscall_user_dispatch(struct seq_file *m,
432 struct task_struct *p)
433 {
434 seq_put_decimal_ull(m, "\nSyscall_user_dispatch:\t",
> 435 (p->syscall_dispatch.selector != NULL));
436 }
437
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
On Mon, Jan 09, 2023 at 10:33:46AM -0500, Gregory Price wrote:
> @@ -36,6 +37,10 @@ bool syscall_user_dispatch(struct pt_regs *regs)
> struct syscall_user_dispatch *sd = &current->syscall_dispatch;
> char state;
>
> + if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
> + unlikely(current->ptrace & PT_SUSPEND_SYSCALL_USER_DISPATCH))
> + return false;
> +
> if (likely(instruction_pointer(regs) - sd->offset < sd->len))
> return false;
>
So by making syscall_user_dispatch() return false, we'll make
syscall_trace_enter() continue to handle things, and supposedly you want
to land in ptrace_report_syscall_entry(), right?
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 54482193e1ed..a6ad815bd4be 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -370,6 +370,11 @@ static int check_ptrace_options(unsigned long data)
> if (data & ~(unsigned long)PTRACE_O_MASK)
> return -EINVAL;
>
> + if (unlikely(data & PTRACE_O_SUSPEND_SYSCALL_USER_DISPATCH)) {
> + if (!IS_ENABLED(CONFIG_CHECKPOINT_RESTORE))
> + return -EINVAL;
> + }
Should setting this then not also depend on having
SYSCALL_WORK_SYSCALL_TRACE set? Because without that, you get 'funny'
things.
On Wed, Jan 18, 2023 at 02:41:00PM -0500, Gregory Price wrote:
> ---------- Forwarded message ---------
> From: Peter Zijlstra <[email protected]>
> Date: Wed, Jan 18, 2023 at 12:16 PM
> Subject: Re: [PATCH 1/3] ptrace,syscall_user_dispatch: Implement Syscall
> User Dispatch Suspension
> To: Gregory Price <[email protected]>
>
>
> On Mon, Jan 09, 2023 at 10:33:46AM -0500, Gregory Price wrote:
> > @@ -36,6 +37,10 @@ bool syscall_user_dispatch(struct pt_regs *regs)
> > struct syscall_user_dispatch *sd = &current->syscall_dispatch;
> > char state;
> >
> > + if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
> > + unlikely(current->ptrace & PT_SUSPEND_SYSCALL_USER_DISPATCH))
> > + return false;
> > +
> > if (likely(instruction_pointer(regs) - sd->offset < sd->len))
> > return false;
> >
>
> So by making syscall_user_dispatch() return false, we'll make
> syscall_trace_enter() continue to handle things, and supposedly you want
> to land in ptrace_report_syscall_entry(), right?
>
> ... snip ...
>
> Should setting this then not also depend on having
> SYSCALL_WORK_SYSCALL_TRACE set? Because without that, you get 'funny'
> things.
Hm, this is an interesting question. My thoughts are that I want the
process to handle the syscall as-if syscall user dispatch was not
present at all, regardless of SYSCALL_TRACE.
This is because some software, like CRIU, actually injects syscalls to
run in the context of the software in an effort to collect resources.
So I actually *want* those 'funny' things to occur, because they're most
likely intentional. I don't necessarily want to intercept system calls
that subsequently occur (although I might).
So if this feature required SYSCALL_TRACE, you would no longer be able
to inject system calls a la CRIU.
That is also my understanding of the SUSPEND_SECCOMP feature: it is
intended specifically to allow *otherwise disallowed* syscalls to be
injected into the process and SECCOMP bypassed (SUSPEND_SECCOMP
requires root for exactly this reason).
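The behavior argued for above, where a suspended task handles syscalls as if Syscall User Dispatch were not present, can be illustrated with a small userspace model of the dispatch decision. This is a sketch under stated assumptions rather than the kernel function: `struct sud_model` and `sud_would_intercept()` are hypothetical names, dispatch is assumed to be enabled for the task, and the vDSO sigreturn exception and the fatal handling of invalid selector values are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Selector values as defined in include/uapi/linux/prctl.h. */
#define SUD_FILTER_ALLOW 0
#define SUD_FILTER_BLOCK 1

/* Hypothetical userspace stand-in for the per-task dispatch state. */
struct sud_model {
	const char *selector;	/* selector byte, may be NULL */
	unsigned long offset;	/* start of the always-allowed region */
	unsigned long len;	/* length of the always-allowed region */
};

/*
 * Returns true if a syscall issued at instruction pointer ip would be
 * redirected to the SIGSYS handler, false if it runs normally.
 */
static bool sud_would_intercept(const struct sud_model *sd,
				unsigned long ip, bool suspended)
{
	assert(sd != NULL);

	/* Suspension by the ptracer: behave as if dispatch were absent. */
	if (suspended)
		return false;

	/* Unsigned wraparound makes this a single range check. */
	if (ip - sd->offset < sd->len)
		return false;

	/* A selector set to ALLOW lets the syscall through. */
	if (sd->selector && *sd->selector == SUD_FILTER_ALLOW)
		return false;

	return true;
}
```

Because the suspension test comes first, an injected syscall is never dispatched regardless of where it executes or what the selector holds, which is the property syscall injection relies on.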
On Wed, Jan 18, 2023 at 02:49:31PM -0500, Gregory Price wrote:
> On Wed, Jan 18, 2023 at 02:41:00PM -0500, Gregory Price wrote:
> > ---------- Forwarded message ---------
> > From: Peter Zijlstra <[email protected]>
> > Date: Wed, Jan 18, 2023 at 12:16 PM
> > Subject: Re: [PATCH 1/3] ptrace,syscall_user_dispatch: Implement Syscall
> > User Dispatch Suspension
> > To: Gregory Price <[email protected]>
> >
> >
> > On Mon, Jan 09, 2023 at 10:33:46AM -0500, Gregory Price wrote:
> > > @@ -36,6 +37,10 @@ bool syscall_user_dispatch(struct pt_regs *regs)
> > > struct syscall_user_dispatch *sd = &current->syscall_dispatch;
> > > char state;
> > >
> > > + if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
> > > + unlikely(current->ptrace & PT_SUSPEND_SYSCALL_USER_DISPATCH))
> > > + return false;
> > > +
> > > if (likely(instruction_pointer(regs) - sd->offset < sd->len))
> > > return false;
> > >
> >
> > So by making syscall_user_dispatch() return false, we'll make
> > syscall_trace_enter() continue to handle things, and supposedly you want
> > to land in ptrace_report_syscall_entry(), right?
> >
> > ... snip ...
> >
> > Should setting this then not also depend on having
> > SYSCALL_WORK_SYSCALL_TRACE set? Because without that, you get 'funny'
> > things.
>
> Hm, this is an interesting question. My thoughts are that I want the
> process to handle the syscall as-if syscall user dispatch was not
> present at all, regardless of SYSCALL_TRACE.
>
> This is because some software, like CRIU, actually injects syscalls to
> run in the context of the software in an effort to collect resources.
Oh, right. I used to know that.
> So I actually *want* those 'funny' things to occur, because they're most
> likely intentional. I don't necessarily want to intercept system calls
> that subsequently occur (although I might).
>
> So if this feature required SYSCALL_TRACE, you would no longer be able
> to inject system calls a la CRIU.
Yeah, I suppose you're right. It makes it a very sharp instrument, but I
suppose you get what you asked for.