Error injection is a useful mechanism to fail arbitrary kernel
functions. However, it is often hard to guarantee an error propagates
appropriately to user space programs. By injecting into syscalls, we can
return arbitrary values to user space directly; this increases
flexibility and robustness in testing, allowing us to test user space
error paths effectively.
The following script, for example, fails calls to sys_open() from a
given pid:
from bcc import BPF
from sys import argv
pid = argv[1]
prog = r"""
int kprobe__SyS_open(struct pt_regs *ctx, const char *pathname, int flags)
{
u32 pid = bpf_get_current_pid_tgid();
if (pid == %s)
bpf_override_return(ctx, -ENOENT);
return 0;
}
""" % pid
b = BPF(text = prog)
while 1:
b.perf_buffer_poll()
This patch whitelists all syscalls defined with SYSCALL_DEFINE for error
injection.
Signed-off-by: Howard McLauchlan <[email protected]>
---
based on 4.16-rc5
include/linux/syscalls.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index a78186d826d7..e8c6d63ace78 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
#define SYSCALL_DEFINE0(sname) \
SYSCALL_METADATA(_##sname, 0); \
+ asmlinkage long sys_##sname(void); \
+ ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
asmlinkage long sys_##sname(void)
#define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
@@ -210,6 +212,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
#define __SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
__attribute__((alias(__stringify(SyS##name)))); \
+ ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
--
2.14.1
On Tue, Mar 13, 2018 at 04:16:27PM -0700, Howard McLauchlan wrote:
> Error injection is a useful mechanism to fail arbitrary kernel
> functions. However, it is often hard to guarantee an error propagates
> appropriately to user space programs. By injecting into syscalls, we can
> return arbitrary values to user space directly; this increases
> flexibility and robustness in testing, allowing us to test user space
> error paths effectively.
>
> The following script, for example, fails calls to sys_open() from a
> given pid:
>
> from bcc import BPF
> from sys import argv
>
> pid = argv[1]
>
> prog = r"""
>
> int kprobe__SyS_open(struct pt_regs *ctx, const char *pathname, int flags)
> {
> u32 pid = bpf_get_current_pid_tgid();
> if (pid == %s)
> bpf_override_return(ctx, -ENOENT);
> return 0;
> }
> """ % pid
>
> b = BPF(text = prog)
> while 1:
> b.perf_buffer_poll()
>
> This patch whitelists all syscalls defined with SYSCALL_DEFINE for error
> injection.
>
> Signed-off-by: Howard McLauchlan <[email protected]>
> ---
> based on 4.16-rc5
> include/linux/syscalls.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index a78186d826d7..e8c6d63ace78 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>
> #define SYSCALL_DEFINE0(sname) \
> SYSCALL_METADATA(_##sname, 0); \
> + asmlinkage long sys_##sname(void); \
> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
> asmlinkage long sys_##sname(void)
>
> #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
> @@ -210,6 +212,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
> #define __SYSCALL_DEFINEx(x, name, ...) \
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> __attribute__((alias(__stringify(SyS##name)))); \
> + ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
> static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> --
> 2.14.1
>
Adding a few more people to Cc
On 3/13/18 4:45 PM, Omar Sandoval wrote:
> On Tue, Mar 13, 2018 at 04:16:27PM -0700, Howard McLauchlan wrote:
>> Error injection is a useful mechanism to fail arbitrary kernel
>> functions. However, it is often hard to guarantee an error propagates
>> appropriately to user space programs. By injecting into syscalls, we can
>> return arbitrary values to user space directly; this increases
>> flexibility and robustness in testing, allowing us to test user space
>> error paths effectively.
>>
>> The following script, for example, fails calls to sys_open() from a
>> given pid:
>>
>> from bcc import BPF
>> from sys import argv
>>
>> pid = argv[1]
>>
>> prog = r"""
>>
>> int kprobe__SyS_open(struct pt_regs *ctx, const char *pathname, int flags)
>> {
>> u32 pid = bpf_get_current_pid_tgid();
>> if (pid == %s)
>> bpf_override_return(ctx, -ENOENT);
>> return 0;
>> }
>> """ % pid
>>
>> b = BPF(text = prog)
>> while 1:
>> b.perf_buffer_poll()
>>
>> This patch whitelists all syscalls defined with SYSCALL_DEFINE for error
>> injection.
>>
>> Signed-off-by: Howard McLauchlan <[email protected]>
>> ---
>> based on 4.16-rc5
>> include/linux/syscalls.h | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>> index a78186d826d7..e8c6d63ace78 100644
>> --- a/include/linux/syscalls.h
>> +++ b/include/linux/syscalls.h
>> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>>
>> #define SYSCALL_DEFINE0(sname) \
>> SYSCALL_METADATA(_##sname, 0); \
>> + asmlinkage long sys_##sname(void); \
>> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
>> asmlinkage long sys_##sname(void)
duplication of asmlinkage in the above?
>>
>> #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
>> @@ -210,6 +212,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>> #define __SYSCALL_DEFINEx(x, name, ...) \
>> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
>> __attribute__((alias(__stringify(SyS##name)))); \
>> + ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
>> static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>> --
>> 2.14.1
>>
>
> Adding a few more people to Cc
>
On Tue, Mar 13, 2018 at 11:16 PM, Howard McLauchlan <[email protected]> wrote:
> Error injection is a useful mechanism to fail arbitrary kernel
> functions. However, it is often hard to guarantee an error propagates
> appropriately to user space programs. By injecting into syscalls, we can
> return arbitrary values to user space directly; this increases
> flexibility and robustness in testing, allowing us to test user space
> error paths effectively.
Temporary NAK IMO. Specifically:
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index a78186d826d7..e8c6d63ace78 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>
> #define SYSCALL_DEFINE0(sname) \
> SYSCALL_METADATA(_##sname, 0); \
> + asmlinkage long sys_##sname(void); \
> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
sys_xyz() is not just the syscall itself; it's also a helper that's
used for entirely silly reasons by various bits of kernel code for
quite a few syscalls. Fortunately, Dominik has patches to fix that,
and Linus is even considering pulling them for 4.16. This patch will
most likely conflict with the final result of Dominik's series.
Can you and Dominik coordinate a bit to get this patch or its
equivalent landed on top of Dominik's work? It might make sense for
Dominik to just add this patch to his series so it can land with the
rest of it. Dominik, Ingo, what do you think?
--Andy
On 03/13/2018 04:49 PM, Yonghong Song wrote:
>
>
> On 3/13/18 4:45 PM, Omar Sandoval wrote:
>> On Tue, Mar 13, 2018 at 04:16:27PM -0700, Howard McLauchlan wrote:
>>> Error injection is a useful mechanism to fail arbitrary kernel
>>> functions. However, it is often hard to guarantee an error propagates
>>> appropriately to user space programs. By injecting into syscalls, we can
>>> return arbitrary values to user space directly; this increases
>>> flexibility and robustness in testing, allowing us to test user space
>>> error paths effectively.
>>>
>>> The following script, for example, fails calls to sys_open() from a
>>> given pid:
>>>
>>> from bcc import BPF
>>> from sys import argv
>>>
>>> pid = argv[1]
>>>
>>> prog = r"""
>>>
>>> int kprobe__SyS_open(struct pt_regs *ctx, const char *pathname, int flags)
>>> {
>>> u32 pid = bpf_get_current_pid_tgid();
>>> if (pid == %s)
>>> bpf_override_return(ctx, -ENOENT);
>>> return 0;
>>> }
>>> """ % pid
>>>
>>> b = BPF(text = prog)
>>> while 1:
>>> b.perf_buffer_poll()
>>>
>>> This patch whitelists all syscalls defined with SYSCALL_DEFINE for error
>>> injection.
>>>
>>> Signed-off-by: Howard McLauchlan <[email protected]>
>>> ---
>>> based on 4.16-rc5
>>> include/linux/syscalls.h | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>>> index a78186d826d7..e8c6d63ace78 100644
>>> --- a/include/linux/syscalls.h
>>> +++ b/include/linux/syscalls.h
>>> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>>> #define SYSCALL_DEFINE0(sname) \
>>> SYSCALL_METADATA(_##sname, 0); \
>>> + asmlinkage long sys_##sname(void); \
>>> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
>>> asmlinkage long sys_##sname(void)
>
> duplication of asmlinkage in the above?
>
The pre-declaration is necessary to ensure ALLOW_ERROR_INJECTION works appropriately. There can be syscalls
that are not pre-declared elsewhere which will fail compilation if not declared in this block.
>>> #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
>>> @@ -210,6 +212,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>>> #define __SYSCALL_DEFINEx(x, name, ...) \
>>> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
>>> __attribute__((alias(__stringify(SyS##name)))); \
>>> + ALLOW_ERROR_INJECTION(sys##name, ERRNO); \
>>> static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
>>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>>> --
>>> 2.14.1
>>>
>>
>> Adding a few more people to Cc
>>
On 03/13/2018 04:56 PM, Andy Lutomirski wrote:
> On Tue, Mar 13, 2018 at 11:16 PM, Howard McLauchlan <[email protected]> wrote:
>> Error injection is a useful mechanism to fail arbitrary kernel
>> functions. However, it is often hard to guarantee an error propagates
>> appropriately to user space programs. By injecting into syscalls, we can
>> return arbitrary values to user space directly; this increases
>> flexibility and robustness in testing, allowing us to test user space
>> error paths effectively.
>
> Temporary NAK IMO. Specifically:
>
>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>> index a78186d826d7..e8c6d63ace78 100644
>> --- a/include/linux/syscalls.h
>> +++ b/include/linux/syscalls.h
>> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>>
>> #define SYSCALL_DEFINE0(sname) \
>> SYSCALL_METADATA(_##sname, 0); \
>> + asmlinkage long sys_##sname(void); \
>> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
>
> sys_xyz() is not just the syscall itself; it's also a helper that's
> used for entirely silly reasons by various bits of kernel code for
> quite a few syscalls. Fortunately, Dominik has patches to fix that,
> and Linus is even considering pulling them for 4.16. This patch will
> most likely conflict with the final result of Dominik's series.
>
> Can you and Dominik coordinate a bit to get this patch or its
> equivalent landed on top of Dominik's work? It might make sense for
> Dominik to just add this patch to his series so it can land with the
> rest of it. Dominik, Ingo, what do you think?
>
> --Andy
>
Dominik,
This patch applies cleanly on top of your patch series. Is there anything you'd need from me to get this in on top of your work?
Howard
On Fri, Mar 16, 2018 at 03:55:04PM -0700, Howard McLauchlan wrote:
> On 03/13/2018 04:56 PM, Andy Lutomirski wrote:
> > On Tue, Mar 13, 2018 at 11:16 PM, Howard McLauchlan <[email protected]> wrote:
> >> Error injection is a useful mechanism to fail arbitrary kernel
> >> functions. However, it is often hard to guarantee an error propagates
> >> appropriately to user space programs. By injecting into syscalls, we can
> >> return arbitrary values to user space directly; this increases
> >> flexibility and robustness in testing, allowing us to test user space
> >> error paths effectively.
> >
> > Temporary NAK IMO. Specifically:
> >
> >> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> >> index a78186d826d7..e8c6d63ace78 100644
> >> --- a/include/linux/syscalls.h
> >> +++ b/include/linux/syscalls.h
> >> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
> >>
> >> #define SYSCALL_DEFINE0(sname) \
> >> SYSCALL_METADATA(_##sname, 0); \
> >> + asmlinkage long sys_##sname(void); \
> >> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
> >
> > sys_xyz() is not just the syscall itself; it's also a helper that's
> > used for entirely silly reasons by various bits of kernel code for
> > quite a few syscalls. Fortunately, Dominik has patches to fix that,
> > and Linus is even considering pulling them for 4.16. This patch will
> > most likely conflict with the final result of Dominik's series.
> >
> > Can you and Dominik coordinate a bit to get this patch or its
> > equivalent landed on top of Dominik's work? It might make sense for
> > Dominik to just add this patch to his series so it can land with the
> > rest of it. Dominik, Ingo, what do you think?
> >
> > --Andy
> >
>
> Dominik,
>
> This patch applies cleanly on top of your patch series. Is there anything you'd need from me to get this in on top of your work?
Howard,
would this form part of the kernel<->userspace interface and therefore needs
to be kept stable? If so, this patch should wait until the arch-specific
syscall calling convention is agreed upon.
Moreover, the patches I sent out already do not cover all syscalls yet.
Until all in-kernel users of sys_*() are gone (or at least outside arch/),
I'd prefer to postpone this patch.
Thanks,
Dominik
On Sun, Mar 18, 2018 at 6:47 AM, Dominik Brodowski
<[email protected]> wrote:
> On Fri, Mar 16, 2018 at 03:55:04PM -0700, Howard McLauchlan wrote:
>> On 03/13/2018 04:56 PM, Andy Lutomirski wrote:
>> > On Tue, Mar 13, 2018 at 11:16 PM, Howard McLauchlan <[email protected]> wrote:
>> >> Error injection is a useful mechanism to fail arbitrary kernel
>> >> functions. However, it is often hard to guarantee an error propagates
>> >> appropriately to user space programs. By injecting into syscalls, we can
>> >> return arbitrary values to user space directly; this increases
>> >> flexibility and robustness in testing, allowing us to test user space
>> >> error paths effectively.
>> >
>> > Temporary NAK IMO. Specifically:
>> >
>> >> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>> >> index a78186d826d7..e8c6d63ace78 100644
>> >> --- a/include/linux/syscalls.h
>> >> +++ b/include/linux/syscalls.h
>> >> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>> >>
>> >> #define SYSCALL_DEFINE0(sname) \
>> >> SYSCALL_METADATA(_##sname, 0); \
>> >> + asmlinkage long sys_##sname(void); \
>> >> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
>> >
>> > sys_xyz() is not just the syscall itself; it's also a helper that's
>> > used for entirely silly reasons by various bits of kernel code for
>> > quite a few syscalls. Fortunately, Dominik has patches to fix that,
>> > and Linus is even considering pulling them for 4.16. This patch will
>> > most likely conflict with the final result of Dominik's series.
>> >
>> > Can you and Dominik coordinate a bit to get this patch or its
>> > equivalent landed on top of Dominik's work? It might make sense for
>> > Dominik to just add this patch to his series so it can land with the
>> > rest of it. Dominik, Ingo, what do you think?
>> >
>> > --Andy
>> >
>>
>> Dominik,
>>
>> This patch applies cleanly on top of your patch series. Is there anything you'd need from me to get this in on top of your work?
>
> Howard,
>
> would this form part of the kernel<->userspace interface and therefore needs
> to be kept stable? If so, this patch should wait until the arch-specific
> syscall calling convention is agreed upon.
>
> Moreover, the patches I sent out already do not cover all syscalls yet.
> Until all in-kernel users of sys_*() are gone (or at least outside arch/),
> I'd prefer to postpone this patch.
>
I was assuming that this ALLOW_ERROR_INJECTION thing is *not*
considered stable ABI. We should be free to change the way that the
syscall entry code calls syscalls whenever we like.
If you want a stable syscall error injection mechanism, make it work
like seccomp instead, please.
On 03/18/2018 07:13 PM, Andy Lutomirski wrote:
> On Sun, Mar 18, 2018 at 6:47 AM, Dominik Brodowski
> <[email protected]> wrote:
>> On Fri, Mar 16, 2018 at 03:55:04PM -0700, Howard McLauchlan wrote:
>>> On 03/13/2018 04:56 PM, Andy Lutomirski wrote:
>>>> On Tue, Mar 13, 2018 at 11:16 PM, Howard McLauchlan <[email protected]> wrote:
>>>>> Error injection is a useful mechanism to fail arbitrary kernel
>>>>> functions. However, it is often hard to guarantee an error propagates
>>>>> appropriately to user space programs. By injecting into syscalls, we can
>>>>> return arbitrary values to user space directly; this increases
>>>>> flexibility and robustness in testing, allowing us to test user space
>>>>> error paths effectively.
>>>>
>>>> Temporary NAK IMO. Specifically:
>>>>
>>>>> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
>>>>> index a78186d826d7..e8c6d63ace78 100644
>>>>> --- a/include/linux/syscalls.h
>>>>> +++ b/include/linux/syscalls.h
>>>>> @@ -191,6 +191,8 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
>>>>>
>>>>> #define SYSCALL_DEFINE0(sname) \
>>>>> SYSCALL_METADATA(_##sname, 0); \
>>>>> + asmlinkage long sys_##sname(void); \
>>>>> + ALLOW_ERROR_INJECTION(sys_##sname, ERRNO); \
>>>>
>>>> sys_xyz() is not just the syscall itself; it's also a helper that's
>>>> used for entirely silly reasons by various bits of kernel code for
>>>> quite a few syscalls. Fortunately, Dominik has patches to fix that,
>>>> and Linus is even considering pulling them for 4.16. This patch will
>>>> most likely conflict with the final result of Dominik's series.
>>>>
>>>> Can you and Dominik coordinate a bit to get this patch or its
>>>> equivalent landed on top of Dominik's work? It might make sense for
>>>> Dominik to just add this patch to his series so it can land with the
>>>> rest of it. Dominik, Ingo, what do you think?
>>>>
>>>> --Andy
>>>>
>>>
>>> Dominik,
>>>
>>> This patch applies cleanly on top of your patch series. Is there anything you'd need from me to get this in on top of your work?
>>
>> Howard,
>>
>> would this form part of the kernel<->userspace interface and therefore needs
>> to be kept stable? If so, this patch should wait until the arch-specific
>> syscall calling convention is agreed upon.
>>
>> Moreover, the patches I sent out already do not cover all syscalls yet.
>> Until all in-kernel users of sys_*() are gone (or at least outside arch/),
>> I'd prefer to postpone this patch.
>>
>
> I was assuming that this ALLOW_ERROR_INJECTION thing is *not*
> considered stable ABI. We should be free to change the way that the
> syscall entry code calls syscalls whenever we like.
>
> If you want a stable syscall error injection mechanism, make it work
> like seccomp instead, please.
>
This is not supposed to be considered stable. It's for debug purposes only and
would normally be configured off.
Howard