LinuxLists.cc - BUG: KCSAN: data-race in add_device

2022-02-08 06:15:58

Subject: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

Dear Linux folks,

Running Linux 5.17-rc2+ in QEMU 6.2.0 the Kernel Concurrency Sanitizer
reports the race below:

qemu-system-x86_64 -cpu host -smp cpus=3 -m 2G -enable-kvm -usb
-device usb-host,vendorid=0x8087,productid=0x0a2a -drive
file=/dev/shm/debian-64.img,format=raw,if=virtio -net nic -net
user,hostfwd=tcp::22223-:22 -kernel /dev/shm/bzImage -append
"root=/dev/vda1 rw quiet"

```
[ 3.383861]
==================================================================
[ 3.385662] BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290
[ 3.387142]
[ 3.387905] race at unknown origin, with read to 0xffff9ec881ad3008
of 1 bytes by task 44 on cpu 0:
[ 3.389778] add_device_randomness+0x20d/0x290
[ 3.389789] release_task+0x231/0xb20
[ 3.389800] wait_consider_task+0x883/0x17d0
[ 3.389809] do_wait+0x2b4/0x4d0
[ 3.389818] kernel_wait+0x71/0xd0
[ 3.389827] call_usermodehelper_exec_work+0xa3/0xb0
[ 3.389838] process_one_work+0x477/0x7d0
[ 3.389845] worker_thread+0x2ec/0x810
[ 3.389851] kthread+0x16d/0x1a0
[ 3.389860] ret_from_fork+0x22/0x30
[ 3.389868]
[ 3.390623] value changed: 0x00 -> 0x43
[ 3.391594]
[ 3.392363] Reported by Kernel Concurrency Sanitizer on:
[ 3.394131] CPU: 0 PID: 44 Comm: kworker/u6:2 Not tainted
5.17.0-rc2-00353-g90c9e950c0de #34
[ 3.394151] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.15.0-1 04/01/2014
[ 3.394190] Workqueue: events_unbound call_usermodehelper_exec_work
```

Please find the output of `dmesg` attached.

Kind regards,

Paul

Attachments:

linux-5.17-rc2-kcsan-add_device_randomness.txt (28.67 kB)

2022-02-08 13:23:04

by Jann Horn

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

+KCSAN people

On Mon, Feb 7, 2022 at 7:42 PM Jason A. Donenfeld <[email protected]> wrote:
> Thanks for the report. I assume that this is actually an old bug. Do
> you have a vmlinux or a random.o from this kernel you could send me to
> double check? Without that, my best guess, which I'd say I have
> relatively high confidence about,

Maybe KCSAN should go through the same instruction-bytes-dumping thing
as normal BUG() does? That might be helpful for cases like this...

2022-02-08 15:10:04

by Jann Horn

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On Mon, Feb 7, 2022 at 7:42 PM Jason A. Donenfeld <[email protected]> wrote:
> Thanks for the report. I assume that this is actually an old bug. Do
> you have a vmlinux or a random.o from this kernel you could send me to
> double check? Without that, my best guess, which I'd say I have
> relatively high confidence about, is that the "1 byte read" is
> actually a `movzx eax, cs:lfsr` referring to the `static u8 lfsr`
> here, which gets inlined into add_device_randomness:
>
> static int crng_slow_load(const u8 *cp, size_t len)
> {
> unsigned long flags;
> static u8 lfsr = 1;
>
> This was added in 2008 with dc12baacb95f ("random: use a different
> mixing algorithm for add_device_randomness()"). My understanding is
> that the race here isn't super problematic as we're in kind of a half
> assed "low effort" phase anyway. But I'll give it some thought. I'm
> CCing Jann as well who reported the original issue that motivated that
> change.

But the "lfsr" variable is never accessed outside the part of this
method that holds a global spinlock. So that can't really be it,
right?

2022-02-09 04:11:09

by Jason A. Donenfeld

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

Hi Paul,

Thanks for the report. I assume that this is actually an old bug. Do
you have a vmlinux or a random.o from this kernel you could send me to
double check? Without that, my best guess, which I'd say I have
relatively high confidence about, is that the "1 byte read" is
actually a `movzx eax, cs:lfsr` referring to the `static u8 lfsr`
here, which gets inlined into add_device_randomness:

static int crng_slow_load(const u8 *cp, size_t len)
{
unsigned long flags;
static u8 lfsr = 1;

This was added in 2008 with dc12baacb95f ("random: use a different
mixing algorithm for add_device_randomness()"). My understanding is
that the race here isn't super problematic as we're in kind of a half
assed "low effort" phase anyway. But I'll give it some thought. I'm
CCing Jann as well who reported the original issue that motivated that
change.

Regards,
Jason

2022-02-09 07:11:39

by Jason A. Donenfeld

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

Hi Marco,

On 2/8/22, Marco Elver <[email protected]> wrote:
> Jason - if you're interested in KCSAN data race reports in some
> subsystems you maintain (I see a few in Wireguard), let me know, and
> I'll release them from syzbot's moderation queue. The way we're trying
> to do it with KCSAN is that we pre-moderate and ask maintainers if
> they're happy to be forwarded all reports that syzbot finds (currently
> some Networking and RCU, though the latter finds almost all data races
> via KCSAN-enabled rcutorture).

Oh that'd be great. Please feel free to forward whatever for WireGuard
or random.c to [email protected] and I'll gladly try to fix what needs
fixing.

Jason

2022-02-09 08:28:31

by Jann Horn

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On Mon, Feb 7, 2022 at 10:50 PM Jann Horn <[email protected]> wrote:
>
> On Mon, Feb 7, 2022 at 10:45 PM Jann Horn <[email protected]> wrote:
> > On Mon, Feb 7, 2022 at 7:42 PM Jason A. Donenfeld <[email protected]> wrote:
> > > Thanks for the report. I assume that this is actually an old bug. Do
> > > you have a vmlinux or a random.o from this kernel you could send me to
> > > double check? Without that, my best guess, which I'd say I have
> > > relatively high confidence about, is that the "1 byte read" is
> > > actually a `movzx eax, cs:lfsr` referring to the `static u8 lfsr`
> > > here, which gets inlined into add_device_randomness:
> > >
> > > static int crng_slow_load(const u8 *cp, size_t len)
> > > {
> > > unsigned long flags;
> > > static u8 lfsr = 1;
> > >
> > > This was added in 2008 with dc12baacb95f ("random: use a different
> > > mixing algorithm for add_device_randomness()"). My understanding is
> > > that the race here isn't super problematic as we're in kind of a half
> > > assed "low effort" phase anyway. But I'll give it some thought. I'm
> > > CCing Jann as well who reported the original issue that motivated that
> > > change.
> >
> > But the "lfsr" variable is never accessed outside the part of this
> > method that holds a global spinlock. So that can't really be it,
> > right?
>
> There is a data race in crng_ready(), it just loads from "crng_init"
> without READ_ONCE()... maybe that's what KCSAN is noticing?

(Actually, maybe it should technically even be a smp_load_acquire()?
Or do we have anything else that ensures that we can't get ordering
issues here?)

2022-02-09 09:20:07

by Marco Elver

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On Mon, 7 Feb 2022 at 22:49, Jann Horn <[email protected]> wrote:
> +KCSAN people
>
> On Mon, Feb 7, 2022 at 7:42 PM Jason A. Donenfeld <[email protected]> wrote:
> > Thanks for the report. I assume that this is actually an old bug. Do
> > you have a vmlinux or a random.o from this kernel you could send me to
> > double check? Without that, my best guess, which I'd say I have
> > relatively high confidence about,
>
> Maybe KCSAN should go through the same instruction-bytes-dumping thing
> as normal BUG() does? That might be helpful for cases like this...

A BUG() on x86 actually generates a ud2, and somewhere along the way
it uses pt_regs in show_opcodes(). Generating KCSAN stack traces is
very different, and there's no pt_regs because it's going through
compiler instrumentation.

In general, I wouldn't spend much time on one-sided non-symbolized
KCSAN reports, unless it's obvious what's going on. I've been thinking
of making CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n the default, because
the one-sided race reports are not very useful. We need to see what
we're racing against. With the normal reports where both threads'
stack traces are shown it's usually much easier to narrow down what's
happening even in the absence of symbolized stack traces.

My suggestion would be to try and get a normal "2-sided" data race report.

I also haven't found something similar in my pile of data race reports
sitting in syzbot moderation.

Jason - if you're interested in KCSAN data race reports in some
subsystems you maintain (I see a few in Wireguard), let me know, and
I'll release them from syzbot's moderation queue. The way we're trying
to do it with KCSAN is that we pre-moderate and ask maintainers if
they're happy to be forwarded all reports that syzbot finds (currently
some Networking and RCU, though the latter finds almost all data races
via KCSAN-enabled rcutorture).

Thanks,
-- Marco

2022-02-09 10:22:04

by Jann Horn

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On Mon, Feb 7, 2022 at 10:45 PM Jann Horn <[email protected]> wrote:
> On Mon, Feb 7, 2022 at 7:42 PM Jason A. Donenfeld <[email protected]> wrote:
> > Thanks for the report. I assume that this is actually an old bug. Do
> > you have a vmlinux or a random.o from this kernel you could send me to
> > double check? Without that, my best guess, which I'd say I have
> > relatively high confidence about, is that the "1 byte read" is
> > actually a `movzx eax, cs:lfsr` referring to the `static u8 lfsr`
> > here, which gets inlined into add_device_randomness:
> >
> > static int crng_slow_load(const u8 *cp, size_t len)
> > {
> > unsigned long flags;
> > static u8 lfsr = 1;
> >
> > This was added in 2008 with dc12baacb95f ("random: use a different
> > mixing algorithm for add_device_randomness()"). My understanding is
> > that the race here isn't super problematic as we're in kind of a half
> > assed "low effort" phase anyway. But I'll give it some thought. I'm
> > CCing Jann as well who reported the original issue that motivated that
> > change.
>
> But the "lfsr" variable is never accessed outside the part of this
> method that holds a global spinlock. So that can't really be it,
> right?

There is a data race in crng_ready(), it just loads from "crng_init"
without READ_ONCE()... maybe that's what KCSAN is noticing?

2022-02-09 11:16:51

by Marco Elver

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On Tue, 8 Feb 2022 at 01:36, Jason A. Donenfeld <[email protected]> wrote:
>
> Hi Marco,
>
> On 2/8/22, Marco Elver <[email protected]> wrote:
> > Jason - if you're interested in KCSAN data race reports in some
> > subsystems you maintain (I see a few in Wireguard), let me know, and
> > I'll release them from syzbot's moderation queue. The way we're trying
> > to do it with KCSAN is that we pre-moderate and ask maintainers if
> > they're happy to be forwarded all reports that syzbot finds (currently
> > some Networking and RCU, though the latter finds almost all data races
> > via KCSAN-enabled rcutorture).
>
> Oh that'd be great. Please feel free to forward whatever for WireGuard
> or random.c to [email protected] and I'll gladly try to fix what needs
> fixing.

Great! I've released everything I could see right now.

Thanks,
-- Marco

2022-02-09 11:52:58

by Marco Elver

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On Tue, 8 Feb 2022 at 01:10, Marco Elver <[email protected]> wrote:
>
> On Mon, 7 Feb 2022 at 22:49, Jann Horn <[email protected]> wrote:
> > +KCSAN people
> >
> > On Mon, Feb 7, 2022 at 7:42 PM Jason A. Donenfeld <[email protected]> wrote:
> > > Thanks for the report. I assume that this is actually an old bug. Do
> > > you have a vmlinux or a random.o from this kernel you could send me to
> > > double check? Without that, my best guess, which I'd say I have
> > > relatively high confidence about,
> >
> > Maybe KCSAN should go through the same instruction-bytes-dumping thing
> > as normal BUG() does? That might be helpful for cases like this...
>
> A BUG() on x86 actually generates a ud2, and somewhere along the way
> it uses pt_regs in show_opcodes(). Generating KCSAN stack traces is
> very different, and there's no pt_regs because it's going through
> compiler instrumentation.
>
> In general, I wouldn't spend much time on one-sided non-symbolized
> KCSAN reports, unless it's obvious what's going on. I've been thinking
> of making CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n the default, because

(That should have been KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=n ... copy-paste error.)

> the one-sided race reports are not very useful. We need to see what
> we're racing against. With the normal reports where both threads'
> stack traces are shown it's usually much easier to narrow down what's
> happening even in the absence of symbolized stack traces.
>
> My suggestion would be to try and get a normal "2-sided" data race report.
>
> I also haven't found something similar in my pile of data race reports
> sitting in syzbot moderation.
>
> Jason - if you're interested in KCSAN data race reports in some
> subsystems you maintain (I see a few in Wireguard), let me know, and
> I'll release them from syzbot's moderation queue. The way we're trying
> to do it with KCSAN is that we pre-moderate and ask maintainers if
> they're happy to be forwarded all reports that syzbot finds (currently
> some Networking and RCU, though the latter finds almost all data races
> via KCSAN-enabled rcutorture).
>
> Thanks,
> -- Marco

2022-02-09 11:56:04

by Jason A. Donenfeld

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On Mon, Feb 7, 2022 at 10:50 PM Jann Horn <[email protected]> wrote:
> > But the "lfsr" variable is never accessed outside the part of this
> > method that holds a global spinlock. So that can't really be it,
> > right?

Oh, hm, yea. Seems likely.

>
> There is a data race in crng_ready(), it just loads from "crng_init"
> without READ_ONCE()... maybe that's what KCSAN is noticing?

There are lots of data races in crng_ready(), which Dominik has been
fixing up gradually (which is why I CC'd him). However, crng_init is 4
bytes, and that crng_init is read first with a cmpl that will read the
whole thing. Maybe KCSAN's reporting is wrong? But it also says 0x00
-> 0x43, which isn't one of the assigned values of crng_init. Also,
0x20d seems quite far into the body of add_device_randomness, whereas
crng_ready() is checked sort of early on.

Hopefully Paul will send us his vmlinux.

Another possibility is that this is happening on the u8 read of the
buffer *input*, of what release_task->__exit_signal is passing it:

add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
sizeof(unsigned long long));

I haven't yet looked at the locking around tsk->se.sum_exec_runtime.

2022-02-15 09:44:06

by Paul Menzel

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

Dear Jason,

Thank you for your quick response, and sorry for the late reply.

Am 07.02.22 um 19:42 schrieb Jason A. Donenfeld:

> Thanks for the report. I assume that this is actually an old bug. Do
> you have a vmlinux or a random.o from this kernel you could send me to
> double check?

I had unfortunately already overwritten these files. I was unable to
fully reproduce the same trace with a build from the same commit, and
only got:

```
[…]
[ 12.819273] BUG: KCSAN: data-race in _mix_pool_bytes+0x5d/0x1c0

[ 12.822917] race at unknown origin, with read to 0xffffa21c07003009
of 1 bytes by task 1035 on cpu 0:
[ 12.823580] _mix_pool_bytes+0x5d/0x1c0
[ 12.823678] add_device_randomness+0x97/0x290
[ 12.823691] release_task+0x231/0xb20
[ 12.823735] wait_consider_task+0x883/0x17d0
[ 12.823755] do_wait+0x35f/0x4d0
[ 12.823768] kernel_wait4+0xdb/0x180
[ 12.823778] __do_sys_wait4+0x85/0x90
[ 12.823787] __x64_sys_wait4+0x49/0x50
[ 12.823797] do_syscall_64+0x39/0x80
[ 12.823819] entry_SYSCALL_64_after_hwframe+0x44/0xae

[ 12.824028] value changed: 0x56 -> 0xc1

[ 12.824415] Reported by Kernel Concurrency Sanitizer on:
[ 12.824688] CPU: 0 PID: 1035 Comm: postfix-script Not tainted
5.17.0-rc2-00352-g90c9e950c0de #40
[ 12.824705] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.15.0-1 04/01/2014
[…]
```

So instead of the former (used as subject line)

add_device_randomness+0x20d/0x290

there is

add_device_randomness+0x97/0x290

I uploaded `random.o` [1].

Kind regards,

Paul

[1]: https://owww.molgen.mpg.de/~pmenzel/random.o

2022-02-15 12:24:10

by Jason A. Donenfeld

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

Hi Paul,

Thanks for the new report. This looks like likely the same sort of
issue as before -- it's on a 1 byte read of the data that's being
passed to add_device_randomness by release_task(). It looks like there
might be a race in release_task():

void release_task(struct task_struct *p)
{
[...]
cgroup_release(p);

write_lock_irq(&tasklist_lock);
ptrace_release_task(p);
thread_pid = get_pid(p->thread_pid);
__exit_signal(p);
[...]
}
static void __exit_signal(struct task_struct *tsk)
{
[...]
add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
sizeof(unsigned long long));
[...]
}

Apparently the data that is being read by add_device_randomness() is
being modified while it's being read. This would be whatever is in
`tsk->se.sum_exec_runtime`.

I'm not sure what's happening there, if this is "normal" and the task
can be scheduled out while exiting, causing the schedule to add to
sum_exec_runtime, or what. CCing some people who seem to have touched
this code, and maybe that'll help illuminate things.

Thanks,
Jason

2022-02-15 13:00:47

by Oleg Nesterov

[permalink] [raw]

Subject: Re: BUG: KCSAN: data-race in add_device_randomness+0x20d/0x290

On 02/15, Jason A. Donenfeld wrote:
>
> void release_task(struct task_struct *p)
> {
> [...]
> cgroup_release(p);
>
> write_lock_irq(&tasklist_lock);
> ptrace_release_task(p);
> thread_pid = get_pid(p->thread_pid);
> __exit_signal(p);
> [...]
> }
> static void __exit_signal(struct task_struct *tsk)
> {
> [...]
> add_device_randomness((const void*) &tsk->se.sum_exec_runtime,
> sizeof(unsigned long long));
> [...]
> }
>
> Apparently the data that is being read by add_device_randomness() is
> being modified while it's being read. This would be whatever is in
> `tsk->se.sum_exec_runtime`.
>
> I'm not sure what's happening there, if this is "normal" and the task
> can be scheduled out while exiting, causing the schedule to add to
> sum_exec_runtime, or what.

Yes, this is normal.

The parent can call release_task(child) right after the child has passed
exit_notify(), the exiting child still runs and can be scheduled out until
it does the last schedule from do_task_dead().

Oleg.