2020-10-15 03:48:36

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] fs: fix NULL dereference due to data race in prepend_path()

On Wed, Oct 14, 2020 at 01:45:28PM -0700, Andrii Nakryiko wrote:
> Fix data race in prepend_path() with re-reading mnt->mnt_ns twice without
> holding the lock. is_mounted() does check for NULL, but is_anon_ns(mnt->mnt_ns)
> might re-read the pointer again which could be NULL already, if in between
> reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets mnt->mnt_ns
> to NULL.

Cute... What config/compiler has resulted in that? I agree with the analysis, but
I really hate the open-coded (and completely unexplained) use of IS_ERR_OR_NULL()
there.

> - if (is_mounted(vfsmnt) && !is_anon_ns(mnt->mnt_ns))
> + mnt_ns = READ_ONCE(mnt->mnt_ns);
> + /* open-coded is_mounted() to use local mnt_ns */
> + if (!IS_ERR_OR_NULL(mnt_ns) && !is_anon_ns(mnt_ns))
> error = 1; // absolute root
> else
> error = 2; // detached or not attached yet

Better turn that into an inlined helper in fs/mount.h, next to is_mounted(), IMO,
and kill that IS_ERR_OR_NULL garbage there. What that thing does is
if (ns == NULL || ns == MNT_NS_INTERNAL)
and it's *not* on any kind of hot path to warrant that kind of microoptimizations.

So let's make that

static inline bool is_real_ns(struct mnt_namespace *mnt_ns)
{
return mnt_ns && mnt_ns != MNT_NS_INTERNAL;
}

turn is_mounted(m) into is_real_ns(m->mnt_ns) and replace that line in your fix
with
if (is_real_ns(mnt_ns) && !is_anon_ns(mnt_ns))

Objections?


2020-10-15 07:34:42

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH] fs: fix NULL dereference due to data race in prepend_path()

On Wed, Oct 14, 2020 at 4:08 PM Al Viro <[email protected]> wrote:
>
> On Wed, Oct 14, 2020 at 01:45:28PM -0700, Andrii Nakryiko wrote:
> > Fix data race in prepend_path() with re-reading mnt->mnt_ns twice without
> > holding the lock. is_mounted() does check for NULL, but is_anon_ns(mnt->mnt_ns)
> > might re-read the pointer again which could be NULL already, if in between
> > reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets mnt->mnt_ns
> > to NULL.
>
> Cute... What config/compiler has resulted in that? I agree with the analysis, but

Don't know for sure, but nothing special or exotic, a typical Facebook
production kernel config and some version of GCC (I didn't check
exactly which one).

Just given enough servers in the fleet, with time and intensive
workloads races like this, however unlikely, do surface up pretty
regularly.

> I really hate the open-coded (and completely unexplained) use of IS_ERR_OR_NULL()
> there.
>
> > - if (is_mounted(vfsmnt) && !is_anon_ns(mnt->mnt_ns))
> > + mnt_ns = READ_ONCE(mnt->mnt_ns);
> > + /* open-coded is_mounted() to use local mnt_ns */
> > + if (!IS_ERR_OR_NULL(mnt_ns) && !is_anon_ns(mnt_ns))
> > error = 1; // absolute root
> > else
> > error = 2; // detached or not attached yet
>
> Better turn that into an inlined helper in fs/mount.h, next to is_mounted(), IMO,
> and kill that IS_ERR_OR_NULL garbage there. What that thing does is
> if (ns == NULL || ns == MNT_NS_INTERNAL)
> and it's *not* on any kind of hot path to warrant that kind of microoptimizations.

Sounds good. I didn't know code well enough to infer NULL ||
MNT_NS_INTERNAL instead of IS_ERR_OR_NULL from is_mounted(), so just
open-coded the latter.

>
> So let's make that
>
> static inline bool is_real_ns(struct mnt_namespace *mnt_ns)
> {
> return mnt_ns && mnt_ns != MNT_NS_INTERNAL;
> }
>
> turn is_mounted(m) into is_real_ns(m->mnt_ns) and replace that line in your fix
> with
> if (is_real_ns(mnt_ns) && !is_anon_ns(mnt_ns))
>
> Objections?

Nope, I'll send a follow-up patch, thanks.