2021-10-23 17:23:01

by YE Chengfeng

Subject: [PATCH] mm/kfence: fix null pointer dereference on pointer meta

The pointer meta returned by addr_to_metadata() could be NULL, so
there is a potential NULL pointer dereference. Fix this by adding a
NULL check before the dereference.

Fixes: 0ce20dd8 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Chengfeng Ye <[email protected]>
---
mm/kfence/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 7a97db8bc8e7..7d2ec787e921 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -811,7 +811,7 @@ void __kfence_free(void *addr)
 	 * objects once it has been freed. meta->cache may be NULL if the cache
 	 * was destroyed.
 	 */
-	if (unlikely(meta->cache && (meta->cache->flags & SLAB_TYPESAFE_BY_RCU)))
+	if (unlikely(meta && meta->cache && (meta->cache->flags & SLAB_TYPESAFE_BY_RCU)))
 		call_rcu(&meta->rcu_head, rcu_guarded_free);
 	else
 		kfence_guarded_free(addr, meta, false);
--
2.17.1


2021-10-23 18:52:10

by Marco Elver

Subject: Re: [PATCH] mm/kfence: fix null pointer dereference on pointer meta

On Sat, 23 Oct 2021 at 19:20, Chengfeng Ye <[email protected]> wrote:
> The pointer meta returned by addr_to_metadata() could be NULL, so
> there is a potential NULL pointer dereference. Fix this by adding a
> NULL check before the dereference.
>
> Fixes: 0ce20dd8 ("mm: add Kernel Electric-Fence infrastructure")
> Signed-off-by: Chengfeng Ye <[email protected]>
> ---
> mm/kfence/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 7a97db8bc8e7..7d2ec787e921 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -811,7 +811,7 @@ void __kfence_free(void *addr)
>  	 * objects once it has been freed. meta->cache may be NULL if the cache
>  	 * was destroyed.
>  	 */
> -	if (unlikely(meta->cache && (meta->cache->flags & SLAB_TYPESAFE_BY_RCU)))
> +	if (unlikely(meta && meta->cache && (meta->cache->flags & SLAB_TYPESAFE_BY_RCU)))
>  		call_rcu(&meta->rcu_head, rcu_guarded_free);
>  	else
>  		kfence_guarded_free(addr, meta, false);

Sorry -- Nack. What bug did you encounter?

Please see [1], and I'm afraid this attempt makes even less sense
because if it were (hypothetically) NULL like you say we just call
kfence_guarded_free() and crash there.

[1] https://lkml.kernel.org/r/CANpmjNMcgUsdvXrvQHn+-y1w-z-6QAS+WJ27RB2DCnVxORRcuw@mail.gmail.com
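
To illustrate why the extra "meta &&" test would not prevent a crash, here
is a minimal userspace sketch of the patched condition. All names
(model_meta, model_cache, model_patched_branch, MODEL_TYPESAFE_BY_RCU) are
hypothetical stand-ins and not kernel code; the sketch only reports which
branch a given meta would take. A NULL meta skips the RCU branch and falls
through to the free routine, which, as noted above, would then dereference
it anyway.

```c
#include <stddef.h>

#define MODEL_TYPESAFE_BY_RCU 0x1u	/* stand-in for SLAB_TYPESAFE_BY_RCU */

struct model_cache { unsigned int flags; };
struct model_meta  { struct model_cache *cache; };

enum model_path { MODEL_PATH_RCU, MODEL_PATH_GUARDED_FREE };

/* Mirrors the shape of the patched condition: which branch is taken? */
static enum model_path model_patched_branch(const struct model_meta *meta)
{
	if (meta && meta->cache && (meta->cache->flags & MODEL_TYPESAFE_BY_RCU))
		return MODEL_PATH_RCU;

	/* A NULL meta lands here and would be handed on to the free routine. */
	return MODEL_PATH_GUARDED_FREE;
}
```

Calling model_patched_branch(NULL) yields MODEL_PATH_GUARDED_FREE, so in
this model the added check only moves the point of failure.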

However, what I wrote in [1] equally applies here:

> [...]
> Adding a check like this could also hide genuine bugs, as meta should
> never be NULL in __kfence_free(). If it is, we'd like to see a crash.
>
> Did you read kfence_free() in include/linux/kfence.h? It already
> prevents __kfence_free() being called with a non-KFENCE address.
>
> Without a more thorough explanation, Nack.

May I ask which static analysis tool keeps flagging this?

Thanks,
-- Marco

2021-10-25 06:39:02

by Marco Elver

Subject: Re: [PATCH] mm/kfence: fix null pointer dereference on pointer meta

On Sat, 23 Oct 2021 at 21:22, YE Chengfeng <[email protected]> wrote:
[...]
> Thanks for your reply. This was reported by a static analysis tool
> developed by us. It only checks dataflow and does not understand more
> complex semantics. I didn't know whether it was a real bug, so I sent
> the patch just in case. It seems that if the index is incorrect,
> addr_to_metadata() will also return a NULL pointer, and I don't know
> whether this is checked by the upper-level functions.
[...]
> And you are right: if it is a NULL pointer, the root cause should be
> in the upper-level function. I think you could add a NULL check such
> as assert(meta != null) if you want; this would suppress this kind of
> false positive report. Anyway, I don't think it is a good idea to just
> let this NULL pointer dereference happen, even though it is not a big
> deal. Adding some handling for this case may be better, for example
> printing an error message.

It's a little more complicated than this: the negative index may
happen when called with an object in range R = [__kfence_pool,
__kfence_pool+(PAGE_SIZE*2)-1]. The first thing to note is that this
address range is never returned by KFENCE as a valid object because
both pages are "guard pages".
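
The index arithmetic behind this can be sketched in userspace as follows.
This is a simplified model and not the kernel source: the constants
(MODEL_PAGE_SIZE, MODEL_NUM_OBJECTS) are assumed values, the pool is
represented by byte offsets rather than real addresses, and all model_*
names are hypothetical. It assumes two pages per object with one leading
pair of guard pages, so any offset inside the range R maps to index -1 and
yields NULL.

```c
#include <stddef.h>

/* Assumed values for illustration; the real ones depend on the kernel config. */
#define MODEL_PAGE_SIZE   4096UL
#define MODEL_NUM_OBJECTS 255L
#define MODEL_POOL_SIZE   ((MODEL_NUM_OBJECTS + 1) * 2 * MODEL_PAGE_SIZE)

struct model_metadata { int unused; };

static struct model_metadata model_metadata[MODEL_NUM_OBJECTS];

/*
 * Map a byte offset into the pool to a metadata slot, or NULL.
 * Only offsets are used, so no real pool memory is ever touched.
 */
static struct model_metadata *model_offset_to_metadata(unsigned long offset)
{
	long index;

	if (offset >= MODEL_POOL_SIZE)
		return NULL;	/* not inside the pool at all */

	/* Two pages per object; the leading pair of guard pages gives -1. */
	index = (long)(offset / (MODEL_PAGE_SIZE * 2)) - 1;
	if (index < 0 || index >= MODEL_NUM_OBJECTS)
		return NULL;

	return &model_metadata[index];
}
```

In this model, offsets 0 through (2 * MODEL_PAGE_SIZE) - 1 return NULL,
and the first offset past them maps to slot 0.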

Secondly, while calling kfence_free(R) will result in the NULL-deref,
such a call is either buggy or malicious because it's only
meant to be called from the allocators' kfree slow-path (slub.c and
slab.c). Calling kfree(R) _does not_ lead to the kfree slow-path which
calls kfence_free(), because the first 2 pages in KFENCE's pool do not
have PageSlab nor page->slab_cache set.

You can try it yourself by randomly doing a kfree(__kfence_pool)
somewhere, and observing that nothing happens.

As you can see, encountering the NULL-deref in __kfence_free() really
should be impossible, unless something really bad is happening (e.g.
malicious invocation, corrupt memory, bad CPU, etc.).

And regarding assert(meta != null) you mentioned: the kernel does not
have asserts, and the closest we have to asserts are WARN_ON() and
BUG_ON(). The latter is closest to the assert() you may be
familiar with from user space. However, its use is heavily
discouraged: unlike user space, the kernel crashing takes the whole
machine down. Therefore, the kernel wants to handle errors as
gracefully as possible, i.e. recover where possible.
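
The difference between the two styles can be sketched in userspace like
this. The helpers model_warn_on() and model_bug_on() are hypothetical
stand-ins written for this illustration; they are not the kernel macros
and only imitate their general behaviour (report and continue versus
report and stop).

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for WARN_ON(): report, then let the caller continue. */
static bool model_warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/* Hypothetical stand-in for BUG_ON(): report, then stop immediately. */
static void model_bug_on(bool cond, const char *what)
{
	if (cond) {
		fprintf(stderr, "BUG: %s\n", what);
		abort();
	}
}

/* Recover-and-continue style: returns -1 instead of touching a NULL pointer. */
static int model_read_checked(const int *ptr)
{
	if (model_warn_on(ptr == NULL, "ptr is NULL"))
		return -1;
	return *ptr;
}
```

A caller of model_read_checked() keeps running after the warning, which
is only appropriate when continuing is known to be safe; model_bug_on()
with a true condition ends the program, much like assert() in user space.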

However, something like BUG_ON(!ptr) is quite redundant, because a
NULL-deref always crashes the kernel and also prints a helpful call
trace.

But as reasoned above, this really shouldn't happen in our case. And if it
does, we'd _really_ want to know about it (just crash) -- we either
have a serious bug somewhere, or something more malicious is
happening. Therefore, handling this case more gracefully, be it with a
WARN_ON() or otherwise, does not seem appropriate as I couldn't say if
it's safe to recover and continue execution in such a state.

The same is true for any other place in the kernel handling pointers:
if a NULL-deref really isn't expected, often it makes more sense to
crash rather than continue in an unknown bad state potentially
corrupting more data.

Thanks,
-- Marco