2020-09-08 11:55:09

by Vlastimil Babka

Subject: Re: [PATCH RFC 00/10] KFENCE: A low-overhead sampling-based memory safety error detector

On 9/7/20 3:40 PM, Marco Elver wrote:
> This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
> low-overhead sampling-based memory safety error detector of heap
> use-after-free, invalid-free, and out-of-bounds access errors. This
> series enables KFENCE for the x86 and arm64 architectures, and adds
> KFENCE hooks to the SLAB and SLUB allocators.
>
> KFENCE is designed to be enabled in production kernels, and has near
> zero performance overhead. Compared to KASAN, KFENCE trades performance
> for precision. The main motivation behind KFENCE's design is that with
> enough total uptime KFENCE will detect bugs in code paths not typically
> exercised by non-production test workloads. One way to quickly achieve a
> large enough total uptime is to deploy the tool across a large fleet of
> machines.

Looks nice!

> KFENCE objects each reside on a dedicated page, at either the left or
> right page boundaries. The pages to the left and right of the object
> page are "guard pages", whose attributes are changed to a protected
> state, and cause page faults on any attempted access to them. Such page
> faults are then intercepted by KFENCE, which handles the fault
> gracefully by reporting a memory access error.
>
> Guarded allocations are set up based on a sample interval (can be set
> via kfence.sample_interval). After expiration of the sample interval, a
> guarded allocation from the KFENCE object pool is returned to the main
> allocator (SLAB or SLUB). At this point, the timer is reset, and the
> next allocation is set up after the expiration of the interval.
>
> To enable/disable a KFENCE allocation through the main allocator's
> fast-path without overhead, KFENCE relies on static branches via the
> static keys infrastructure. The static branch is toggled to redirect the
> allocation to KFENCE.

Toggling a static branch is AFAIK quite disruptive (PeterZ can probably tell
you more), and with the default 100ms sample interval, I'd think it's not good
to toggle it so often. Did you measure what performance you would get if the
static key were used only for long-term toggling of the whole feature (at boot
time or even at runtime), while the decision "am I in a sample interval right
now?" was a normal test behind this static key? Thanks.
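
The alternative asked about above can be sketched in userspace C as
follows. This is only an illustrative model, not code from the series:
feature_enabled stands in for a static key that is flipped once (at boot
or at runtime), and sample_gate, sample_interval_expired() and
choose_alloc_path() are hypothetical names. The per-allocation decision
is an ordinary atomic test that is reached only when the feature is on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum alloc_path { ALLOC_NORMAL, ALLOC_GUARDED };

/* Hypothetical stand-in for a static key that is toggled only rarely. */
static bool feature_enabled;

/* Hypothetical gate: 0 means "the next allocation should be sampled". */
static atomic_int sample_gate = 1;

/* Would be called from a timer each time the sample interval expires. */
static void sample_interval_expired(void)
{
	/* Assumption: the timer is armed only after the feature is enabled. */
	assert(feature_enabled);
	atomic_store(&sample_gate, 0);
}

/* Decide which allocator services the current allocation. */
static enum alloc_path choose_alloc_path(void)
{
	int expected = 0;

	/* In the kernel this would be the static branch, not a plain load. */
	if (!feature_enabled)
		return ALLOC_NORMAL;

	/* Normal test: the first caller to see an open gate closes it again. */
	if (atomic_compare_exchange_strong(&sample_gate, &expected, 1))
		return ALLOC_GUARDED;

	return ALLOC_NORMAL;
}
```

With this shape, exactly one allocation per expired interval takes the
guarded path, and no code is patched at sampling frequency. The cost is
one extra load and compare on every allocation while the feature is on.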

> We have verified by running synthetic benchmarks (sysbench I/O,
> hackbench) that a kernel with KFENCE is performance-neutral compared to
> a non-KFENCE baseline kernel.
>
> KFENCE is inspired by GWP-ASan [1], a userspace tool with similar
> properties. The name "KFENCE" is a homage to the Electric Fence Malloc
> Debugger [2].
>
> For more details, see Documentation/dev-tools/kfence.rst added in the
> series -- also viewable here:
>
> https://raw.githubusercontent.com/google/kasan/kfence/Documentation/dev-tools/kfence.rst
>
> [1] http://llvm.org/docs/GwpAsan.html
> [2] https://linux.die.net/man/3/efence
>
> Alexander Potapenko (6):
> mm: add Kernel Electric-Fence infrastructure
> x86, kfence: enable KFENCE for x86
> mm, kfence: insert KFENCE hooks for SLAB
> mm, kfence: insert KFENCE hooks for SLUB
> kfence, kasan: make KFENCE compatible with KASAN
> kfence, kmemleak: make KFENCE compatible with KMEMLEAK
>
> Marco Elver (4):
> arm64, kfence: enable KFENCE for ARM64
> kfence, lockdep: make KFENCE compatible with lockdep
> kfence, Documentation: add KFENCE documentation
> kfence: add test suite
>
> Documentation/dev-tools/index.rst | 1 +
> Documentation/dev-tools/kfence.rst | 285 +++++++++++
> MAINTAINERS | 11 +
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/kfence.h | 39 ++
> arch/arm64/mm/fault.c | 4 +
> arch/x86/Kconfig | 2 +
> arch/x86/include/asm/kfence.h | 60 +++
> arch/x86/mm/fault.c | 4 +
> include/linux/kfence.h | 174 +++++++
> init/main.c | 2 +
> kernel/locking/lockdep.c | 8 +
> lib/Kconfig.debug | 1 +
> lib/Kconfig.kfence | 70 +++
> mm/Makefile | 1 +
> mm/kasan/common.c | 7 +
> mm/kfence/Makefile | 6 +
> mm/kfence/core.c | 730 +++++++++++++++++++++++++++
> mm/kfence/kfence-test.c | 777 +++++++++++++++++++++++++++++
> mm/kfence/kfence.h | 104 ++++
> mm/kfence/report.c | 201 ++++++++
> mm/kmemleak.c | 11 +
> mm/slab.c | 46 +-
> mm/slab_common.c | 6 +-
> mm/slub.c | 72 ++-
> 25 files changed, 2591 insertions(+), 32 deletions(-)
> create mode 100644 Documentation/dev-tools/kfence.rst
> create mode 100644 arch/arm64/include/asm/kfence.h
> create mode 100644 arch/x86/include/asm/kfence.h
> create mode 100644 include/linux/kfence.h
> create mode 100644 lib/Kconfig.kfence
> create mode 100644 mm/kfence/Makefile
> create mode 100644 mm/kfence/core.c
> create mode 100644 mm/kfence/kfence-test.c
> create mode 100644 mm/kfence/kfence.h
> create mode 100644 mm/kfence/report.c
>


2020-09-08 17:46:06

by Alexander Potapenko

Subject: Re: [PATCH RFC 00/10] KFENCE: A low-overhead sampling-based memory safety error detector

> Toggling a static branch is AFAIK quite disruptive (PeterZ can probably tell
> you more), and with the default 100ms sample interval, I'd think it's not good
> to toggle it so often. Did you measure what performance you would get if the
> static key were used only for long-term toggling of the whole feature (at boot
> time or even at runtime), while the decision "am I in a sample interval right
> now?" was a normal test behind this static key? Thanks.

100ms is the default that we use for testing, but for production it
should be fine to pick a longer interval (e.g. 1 second or more).
We haven't noticed any performance impact with either 100ms or larger values.

Regarding using normal branches, they are quite expensive.
E.g. at some point we used to have a branch in slab_free() to check
whether the freed object belonged to the KFENCE pool.
When the pool address was loaded from memory, this resulted in a
non-zero performance penalty.

As for enabling the whole feature at runtime, our intention is to let
the users have it enabled by default, otherwise someone will need to
tell every machine in the fleet when the feature is to be enabled.


--
Alexander Potapenko

2020-09-08 20:03:32

by Vlastimil Babka

Subject: Re: [PATCH RFC 00/10] KFENCE: A low-overhead sampling-based memory safety error detector

On 9/8/20 2:16 PM, Alexander Potapenko wrote:
>> Toggling a static branch is AFAIK quite disruptive (PeterZ can probably tell
>> you more), and with the default 100ms sample interval, I'd think it's not good
>> to toggle it so often. Did you measure what performance you would get if the
>> static key were used only for long-term toggling of the whole feature (at boot
>> time or even at runtime), while the decision "am I in a sample interval right
>> now?" was a normal test behind this static key? Thanks.
>
> 100ms is the default that we use for testing, but for production it
> should be fine to pick a longer interval (e.g. 1 second or more).
> We haven't noticed any performance impact with either 100ms or larger values.

Hmm, I see.

> Regarding using normal branches, they are quite expensive.
> E.g. at some point we used to have a branch in slab_free() to check
> whether the freed object belonged to the KFENCE pool.
> When the pool address was loaded from memory, this resulted in a
> non-zero performance penalty.

Well yeah, if the checks involve extra cache misses, that adds up. But AFAICS
you can't avoid that kind of check with a static key anyway (am I looking at
is_kfence_address() correctly?) because some kfence-allocated objects will exist
even after the sampling period has ended, right?
So AFAICS kfence_alloc() is the only user of the static key, and I wonder if it
really makes such a difference there.

> As for enabling the whole feature at runtime, our intention is to let
> the users have it enabled by default, otherwise someone will need to
> tell every machine in the fleet when the feature is to be enabled.

Sure, but I guess there are tools that make the effort no different between 1
machine and a fleet.

I'll try to explain my general purpose distro-kernel POV. What I like e.g. about
debug_pagealloc and page_owner (and I contributed to that state of these
features) is that a distro kernel can be shipped with them compiled in, but they
are disabled via static key and thus have no overhead until a user enables them
on boot, without a need to replace the kernel with a debug one first. Users can
enable them for their own debugging, or when asked by somebody from the distro
assisting with the debugging.

I think KFENCE has similar potential and could work the same way - always
compiled in, but a static key would eliminate everything, even the
is_kfence_address() checks, until it is enabled (but then it would probably
be a one-way street for the rest of the kernel's uptime). Some distro users
would decide to enable it always, some not, but they could be advised to when
needed. So the existing static key could be repurposed for this, or if it's
really worth having the current one to control just the sampling period, then
there would be two?

Thanks.


2020-09-08 20:04:10

by Marco Elver

Subject: Re: [PATCH RFC 00/10] KFENCE: A low-overhead sampling-based memory safety error detector

On Tue, Sep 08, 2020 at 04:40PM +0200, Vlastimil Babka wrote:
> On 9/8/20 2:16 PM, Alexander Potapenko wrote:
> >> Toggling a static branch is AFAIK quite disruptive (PeterZ can probably tell
> >> you more), and with the default 100ms sample interval, I'd think it's not good
> >> to toggle it so often. Did you measure what performance you would get if the
> >> static key were used only for long-term toggling of the whole feature (at boot
> >> time or even at runtime), while the decision "am I in a sample interval right
> >> now?" was a normal test behind this static key? Thanks.
> >
> > 100ms is the default that we use for testing, but for production it
> > should be fine to pick a longer interval (e.g. 1 second or more).
> > We haven't noticed any performance impact with either 100ms or larger values.
>
> Hmm, I see.

To add to this, we initially also weren't sure what the results of
toggling the static branches at varying intervals would be. In the end we
were pleasantly surprised: our benchmarking results consistently showed no
noticeable slowdown above 100ms (it is somewhat noticeable in the range
of 1-10ms, but tolerable if you wanted to go there).

I think we were initially, just like you might be, misled about the
time scales here. 100ms is a really long time for a computer.

> > Regarding using normal branches, they are quite expensive.
> > E.g. at some point we used to have a branch in slab_free() to check
> > whether the freed object belonged to the KFENCE pool.
> > When the pool address was loaded from memory, this resulted in a
> > non-zero performance penalty.
>
> Well yeah, if the checks involve extra cache misses, that adds up. But AFAICS
> you can't avoid that kind of check with a static key anyway (am I looking at
> is_kfence_address() correctly?) because some kfence-allocated objects will exist
> even after the sampling period has ended, right?
> So AFAICS kfence_alloc() is the only user of the static key, and I wonder if it
> really makes such a difference there.

The really important bit here is to differentiate between fast-paths and
slow-paths!

We insert kfence_alloc() into the allocator fast-paths, which is where
the majority of cost would be. On the other hand, the major user of
is_kfence_address(), kfence_free(), is only inserted into the slow-path.

As a result, is_kfence_address() usage has negligible cost (esp. if the
statically allocated pool is used) -- we benchmarked this quite
extensively.
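
To illustrate why such a check can be cheap with a statically allocated
pool, here is a rough userspace sketch. It is not the series'
is_kfence_address(); pool, POOL_SIZE and addr_in_pool() are hypothetical
names, and the pool size is made up. Because the pool is a static array,
its base address is a link-time constant and does not have to be loaded
from memory before the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pool size: 64 pages of 4 KiB each. */
#define POOL_SIZE (64 * 4096)

/* Statically allocated pool; its address is known at link time. */
static char pool[POOL_SIZE];

/*
 * Range check done with a single unsigned comparison: for addresses
 * below the pool the subtraction wraps around to a huge value, so one
 * "<" covers both the lower and the upper bound.
 */
static bool addr_in_pool(const void *addr)
{
	return (uintptr_t)addr - (uintptr_t)pool < (uintptr_t)POOL_SIZE;
}
```

A free path could call addr_in_pool(ptr) first and fall through to the
regular allocator when it returns false.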

> > As for enabling the whole feature at runtime, our intention is to let
> > the users have it enabled by default, otherwise someone will need to
> > tell every machine in the fleet when the feature is to be enabled.
>
> Sure, but I guess there are tools that make the effort no different between 1
> machine and a fleet.
>
> I'll try to explain my general purpose distro-kernel POV. What I like e.g. about
> debug_pagealloc and page_owner (and I contributed to that state of these
> features) is that a distro kernel can be shipped with them compiled in, but they
> are disabled via static key and thus have no overhead until a user enables them
> on boot, without a need to replace the kernel with a debug one first. Users can
> enable them for their own debugging, or when asked by somebody from the distro
> assisting with the debugging.
>
> I think KFENCE has similar potential and could work the same way - always
> compiled in, but a static key would eliminate everything, even the
> is_kfence_address() checks,

[ See my answer for the cost of is_kfence_address() above. In short,
until we add is_kfence_address() to fast-paths, introducing yet
another static branch would be premature optimization. ]

> until it is enabled (but then it would probably
> be a one-way street for the rest of the kernel's uptime). Some distro users
> would decide to enable it always, some not, but they could be advised to when
> needed. So the existing static key could be repurposed for this, or if it's
> really worth having the current one to control just the sampling period, then
> there would be two?

You can already do this. Just set CONFIG_KFENCE_SAMPLE_INTERVAL=0. When
you decide to enable it, set kfence.sample_interval=<somenumber> as a
boot parameter.

I'll add something to that effect into Documentation/dev-tools/kfence.rst.
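
As a rough illustration of the two settings mentioned above (the value
100 is only an example, matching the 100ms testing default discussed
earlier in this thread; it is not a recommendation):

```
# Build time (kernel .config): KFENCE compiled in, sampling off by default
CONFIG_KFENCE=y
CONFIG_KFENCE_SAMPLE_INTERVAL=0

# Boot time (kernel command line): turn sampling on, interval in ms
kfence.sample_interval=100
```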

Thanks,
-- Marco