On Mon, Sep 21, 2020 at 7:35 AM YiFei Zhu <[email protected]> wrote:
[...]
> We do this by creating a per-task bitmap of permitted syscalls.
> If seccomp filter is invoked we check if it is cached and if so
> directly return allow. Else we call into the cBPF filter, and if
> the result is an allow then we cache the results.
What? Why? We already have code to statically evaluate the filter for
all syscall numbers. We should be using the results of that instead of
re-running the filter and separately caching the results.
> The cache is per-task
Please don't. The static results are per-filter, so the bitmask(s)
should also be per-filter and immutable.
> minimize thread-synchronization issues in
> the hot path of cache lookup
There should be no need for synchronization because those bitmasks
should be immutable.
> and to avoid different architecture
> numbers sharing the same cache.
There should be separate caches for separate architectures, and we
should precompute the results for all architectures. (We only have
around 2 different architectures max, so it's completely reasonable to
precompute and store all that.)
> To account for one thread changing the filter for another thread of
> the same process, the per-task struct also contains a pointer to
> the filter the cache is built on. When the cache lookup uses a
> different filter then the last lookup, the per-task cache bitmap is
> cleared.
Unnecessary complexity, we don't need that if we make the bitmasks immutable.
On Mon, Sep 21, 2020 at 1:09 PM Jann Horn <[email protected]> wrote:
>
> On Mon, Sep 21, 2020 at 7:35 AM YiFei Zhu <[email protected]> wrote:
> [...]
> > We do this by creating a per-task bitmap of permitted syscalls.
> > If seccomp filter is invoked we check if it is cached and if so
> > directly return allow. Else we call into the cBPF filter, and if
> > the result is an allow then we cache the results.
>
> What? Why? We already have code to statically evaluate the filter for
> all syscall numbers. We should be using the results of that instead of
> re-running the filter and separately caching the results.
>
> > The cache is per-task
>
> Please don't. The static results are per-filter, so the bitmask(s)
> should also be per-filter and immutable.
I do agree that an immutable bitmask is faster and easier to reason
about its correctness. However, I did not find the "code to statically
evaluate the filter for all syscall numbers" while reading seccomp.c.
Would you give me a pointer to that and I will see how to best make
use of it?
YiFei Zhu
On Tue, Sep 22, 2020 at 12:51 AM YiFei Zhu <[email protected]> wrote:
> On Mon, Sep 21, 2020 at 1:09 PM Jann Horn <[email protected]> wrote:
> >
> > On Mon, Sep 21, 2020 at 7:35 AM YiFei Zhu <[email protected]> wrote:
> > [...]
> > > We do this by creating a per-task bitmap of permitted syscalls.
> > > If seccomp filter is invoked we check if it is cached and if so
> > > directly return allow. Else we call into the cBPF filter, and if
> > > the result is an allow then we cache the results.
> >
> > What? Why? We already have code to statically evaluate the filter for
> > all syscall numbers. We should be using the results of that instead of
> > re-running the filter and separately caching the results.
> >
> > > The cache is per-task
> >
> > Please don't. The static results are per-filter, so the bitmask(s)
> > should also be per-filter and immutable.
>
> I do agree that an immutable bitmask is faster and easier to reason
> about its correctness. However, I did not find the "code to statically
> evaluate the filter for all syscall numbers" while reading seccomp.c.
> Would you give me a pointer to that and I will see how to best make
> use of it?
I'm talking about the code you're adding in the other patch ("[RFC
PATCH seccomp 1/2] seccomp/cache: Add "emulator" to check if filter is
arg-dependent"). Sorry, that was a bit unclear.
On Mon, Sep 21, 2020 at 5:58 PM Jann Horn <[email protected]> wrote:
> > I do agree that an immutable bitmask is faster and easier to reason
> > about its correctness. However, I did not find the "code to statically
> > evaluate the filter for all syscall numbers" while reading seccomp.c.
> > Would you give me a pointer to that and I will see how to best make
> > use of it?
>
> I'm talking about the code you're adding in the other patch ("[RFC
> PATCH seccomp 1/2] seccomp/cache: Add "emulator" to check if filter is
> arg-dependent"). Sorry, that was a bit unclear.
I see, building an immutable accept bitmask when preparing and then
just use that when running it. I guess if the arch number issue is
resolved this should be more doable. Will do.
YiFei Zhu