2021-11-23 20:56:18

by Maciej Żenczykowski

[permalink] [raw]
Subject: [PATCH bpf-next] bpf: allow readonly direct path access for skfilter

From: Maciej Żenczykowski <[email protected]>

skfilter bpf programs can read the packet directly via llvm.bpf.load.byte/
/half/word which are 8/16/32-bit primitive bpf instructions and thus
behave basically as well as DPA reads. But there is no 64-bit equivalent,
due to the support for the equivalent 64-bit bpf opcode never having been
added (unclear why, there was a patch posted).
DPA uses a slightly different mechanism, so doesn't suffer this limitation.

Using 64-bit reads, 128-bit ipv6 address comparisons can be done in just
2 steps, instead of the 4 steps needed with llvm.bpf.word.

This should hopefully allow simpler (less instructions, and possibly less
logic and maybe even less jumps) programs. Less jumps may also mean vastly
faster bpf verifier times (it can be exponential in the number of jumps...).

This can be particularly important when trying to do something like scan
a netlink message for a pattern (2000 iteration loop) to decide whether
a message should be dropped, or delivered to userspace (thus waking it up).

I'm requiring CAP_NET_ADMIN because I'm not sure of the security
implications...

Tested: only build tested
Signed-off-by: Maciej Żenczykowski <[email protected]>
---
kernel/bpf/verifier.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 331b170d9fcc..0c2e25fb9844 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3258,6 +3258,11 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
enum bpf_prog_type prog_type = resolve_prog_type(env->prog);

switch (prog_type) {
+ case BPF_PROG_TYPE_SOCKET_FILTER:
+ if (meta || !capable(CAP_NET_ADMIN))
+ return false;
+ fallthrough;
+
/* Program types only with direct read access go here! */
case BPF_PROG_TYPE_LWT_IN:
case BPF_PROG_TYPE_LWT_OUT:
--
2.34.0.rc2.393.gf8c9666880-goog



2021-11-23 23:02:48

by Maciej Żenczykowski

[permalink] [raw]
Subject: Re: [PATCH bpf-next] bpf: allow readonly direct path access for skfilter

Note: this is more of an RFC... question in patch format... is this
even a good idea?

On Tue, Nov 23, 2021 at 12:56 PM Maciej Żenczykowski
<[email protected]> wrote:
>
> From: Maciej Żenczykowski <[email protected]>
>
> skfilter bpf programs can read the packet directly via llvm.bpf.load.byte/
> /half/word which are 8/16/32-bit primitive bpf instructions and thus
> behave basically as well as DPA reads. But there is no 64-bit equivalent,
> due to the support for the equivalent 64-bit bpf opcode never having been
> added (unclear why, there was a patch posted).
> DPA uses a slightly different mechanism, so doesn't suffer this limitation.
>
> Using 64-bit reads, 128-bit ipv6 address comparisons can be done in just
> 2 steps, instead of the 4 steps needed with llvm.bpf.word.
>
> This should hopefully allow simpler (less instructions, and possibly less
> logic and maybe even less jumps) programs. Less jumps may also mean vastly
> faster bpf verifier times (it can be exponential in the number of jumps...).
>
> This can be particularly important when trying to do something like scan
> a netlink message for a pattern (2000 iteration loop) to decide whether
> a message should be dropped, or delivered to userspace (thus waking it up).
>
> I'm requiring CAP_NET_ADMIN because I'm not sure of the security
> implications...
>
> Tested: only build tested
> Signed-off-by: Maciej Żenczykowski <[email protected]>
> ---
> kernel/bpf/verifier.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 331b170d9fcc..0c2e25fb9844 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3258,6 +3258,11 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
> enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
>
> switch (prog_type) {
> + case BPF_PROG_TYPE_SOCKET_FILTER:
> + if (meta || !capable(CAP_NET_ADMIN))
> + return false;
> + fallthrough;
> +
> /* Program types only with direct read access go here! */
> case BPF_PROG_TYPE_LWT_IN:
> case BPF_PROG_TYPE_LWT_OUT:
> --
> 2.34.0.rc2.393.gf8c9666880-goog
>

2021-11-27 02:16:15

by Song Liu

[permalink] [raw]
Subject: Re: [PATCH bpf-next] bpf: allow readonly direct path access for skfilter

On Tue, Nov 23, 2021 at 3:02 PM Maciej Żenczykowski <[email protected]> wrote:
>
> Note: this is more of an RFC... question in patch format... is this
> even a good idea?
>
> On Tue, Nov 23, 2021 at 12:56 PM Maciej Żenczykowski
> <[email protected]> wrote:
> >
> > From: Maciej Żenczykowski <[email protected]>
> >
> > skfilter bpf programs can read the packet directly via llvm.bpf.load.byte/
> > /half/word which are 8/16/32-bit primitive bpf instructions and thus
> > behave basically as well as DPA reads. But there is no 64-bit equivalent,
> > due to the support for the equivalent 64-bit bpf opcode never having been
> > added (unclear why, there was a patch posted).
> > DPA uses a slightly different mechanism, so doesn't suffer this limitation.
> >
> > Using 64-bit reads, 128-bit ipv6 address comparisons can be done in just
> > 2 steps, instead of the 4 steps needed with llvm.bpf.word.
> >
> > This should hopefully allow simpler (less instructions, and possibly less
> > logic and maybe even less jumps) programs. Less jumps may also mean vastly
> > faster bpf verifier times (it can be exponential in the number of jumps...).
> >
> > This can be particularly important when trying to do something like scan
> > a netlink message for a pattern (2000 iteration loop) to decide whether
> > a message should be dropped, or delivered to userspace (thus waking it up).
> >
> > I'm requiring CAP_NET_ADMIN because I'm not sure of the security
> > implications...

I don't know BPF_PROG_TYPE_SOCKET_FILTER very well, but the patch
seems reasonable to me. It will be great if we can show the performance
impact with a benchmark or a selftests.

Thanks,
Song

2021-11-30 01:43:12

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH bpf-next] bpf: allow readonly direct path access for skfilter

On Tue, Nov 23, 2021 at 12:56 PM Maciej Żenczykowski
<[email protected]> wrote:
>
> From: Maciej Żenczykowski <[email protected]>
>
> skfilter bpf programs can read the packet directly via llvm.bpf.load.byte/
> /half/word which are 8/16/32-bit primitive bpf instructions and thus
> behave basically as well as DPA reads. But there is no 64-bit equivalent,
> due to the support for the equivalent 64-bit bpf opcode never having been
> added (unclear why, there was a patch posted).
> DPA uses a slightly different mechanism, so doesn't suffer this limitation.
>
> Using 64-bit reads, 128-bit ipv6 address comparisons can be done in just
> 2 steps, instead of the 4 steps needed with llvm.bpf.word.

llvm.bpf.word is a pseudo instruction.
It's actually a function call for classic bpf.
See bpf_gen_ld_abs.
We used to have ugly special cases for them in JITs,
but then got rid of it.
Don't use them if performance is a requirement.

> This should hopefully allow simpler (less instructions, and possibly less
> logic and maybe even less jumps) programs. Less jumps may also mean vastly
> faster bpf verifier times (it can be exponential in the number of jumps...).
>
> This can be particularly important when trying to do something like scan
> a netlink message for a pattern (2000 iteration loop) to decide whether
> a message should be dropped, or delivered to userspace (thus waking it up).
>
> I'm requiring CAP_NET_ADMIN because I'm not sure of the security
> implications...
>
> Tested: only build tested
> Signed-off-by: Maciej Żenczykowski <[email protected]>
> ---
> kernel/bpf/verifier.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 331b170d9fcc..0c2e25fb9844 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3258,6 +3258,11 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
> enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
>
> switch (prog_type) {
> + case BPF_PROG_TYPE_SOCKET_FILTER:
> + if (meta || !capable(CAP_NET_ADMIN))
> + return false;

probably needs CAP_BPF too.

Other than that I think it's fine.