2018-10-04 15:48:15

by Eric Dumazet

Subject: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

syzbot was able to trigger rcu stalls by calling write()
with large number of bytes.

Add a cond_resched() in the loop to avoid this.

Link: https://lkml.org/lkml/2018/8/23/1106
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: [email protected]
Cc: Dmitry Torokhov <[email protected]>
Cc: [email protected]
---
drivers/input/mousedev.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
 		mousedev_generate_response(client, c);

 		spin_unlock_irq(&client->packet_lock);
+		cond_resched();
 	}

 	kill_fasync(&client->fasync, SIGIO, POLL_IN);
--
2.19.0.605.g01d371f741-goog



2018-10-04 19:01:36

by Dmitry Torokhov

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

Hi Eric,

On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> syzbot was able to trigger rcu stalls by calling write()
> with large number of bytes.
>
> Add a cond_resched() in the loop to avoid this.

I think this simply masks a deeper issue. The code fetches characters
from userspace in a loop, takes a lock, quickly places response in an
output buffer, and releases interrupt. I do not see why this should
cause stalls as we do not hold spinlock/interrupts off for extended
period of time.

Adding Paul so he can straighten me out...

>
> Link: https://lkml.org/lkml/2018/8/23/1106
> Signed-off-by: Eric Dumazet <[email protected]>
> Reported-by: [email protected]
> Cc: Dmitry Torokhov <[email protected]>
> Cc: [email protected]
> ---
> drivers/input/mousedev.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> --- a/drivers/input/mousedev.c
> +++ b/drivers/input/mousedev.c
> @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> mousedev_generate_response(client, c);
>
> spin_unlock_irq(&client->packet_lock);
> + cond_resched();
> }
>
> kill_fasync(&client->fasync, SIGIO, POLL_IN);
> --
> 2.19.0.605.g01d371f741-goog
>

Thanks.

--
Dmitry

2018-10-04 19:29:30

by Eric Dumazet

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
<[email protected]> wrote:
>
> Hi Eric,
>
> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > syzbot was able to trigger rcu stalls by calling write()
> > with large number of bytes.
> >
> > Add a cond_resched() in the loop to avoid this.
>
> I think this simply masks a deeper issue. The code fetches characters
> from userspace in a loop, takes a lock, quickly places response in an
> output buffer, and releases interrupt. I do not see why this should
> cause stalls as we do not hold spinlock/interrupts off for extended
> period of time.
>
> Adding Paul so he can straighten me out...
>

Well...

write(fd, buffer, 0x7FFF0000);

Takes between 20 seconds and 2 minutes depending on CONFIG options ....

So either apply my patch, or add a limit on the max count, and
possibly break legitimate user space ?

I dunno...
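
For a sense of scale, the figures above can be turned into a rough per-byte cost. This is only a back-of-the-envelope sketch: the helper name ns_per_byte() is made up for illustration, and the 20 second and 2 minute totals are simply the two ends of the range quoted above.

```c
#include <stdint.h>

/* Byte count passed to write() in the example above. */
#define WRITE_COUNT 0x7FFF0000ULL

/*
 * Average nanoseconds spent per byte (one loop iteration per byte)
 * if the whole write() takes total_seconds. Integer division, so the
 * result is rounded down.
 */
static uint64_t ns_per_byte(uint64_t total_seconds)
{
	return total_seconds * 1000000000ULL / WRITE_COUNT;
}
```

With these numbers each iteration costs roughly 9 ns at the fast end and 55 ns at the slow end, so no single iteration is slow. The total is long only because about 2.1 billion iterations run back to back without a schedule point.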

> >
> > Link: https://lkml.org/lkml/2018/8/23/1106
> > Signed-off-by: Eric Dumazet <[email protected]>
> > Reported-by: [email protected]
> > Cc: Dmitry Torokhov <[email protected]>
> > Cc: [email protected]
> > ---
> > drivers/input/mousedev.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> > --- a/drivers/input/mousedev.c
> > +++ b/drivers/input/mousedev.c
> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> > mousedev_generate_response(client, c);
> >
> > spin_unlock_irq(&client->packet_lock);
> > + cond_resched();
> > }
> >
> > kill_fasync(&client->fasync, SIGIO, POLL_IN);
> > --
> > 2.19.0.605.g01d371f741-goog
> >
>
> Thanks.
>
> --
> Dmitry

2018-10-04 19:34:52

by Paul E. McKenney

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

On Thu, Oct 04, 2018 at 11:59:49AM -0700, Dmitry Torokhov wrote:
> Hi Eric,
>
> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > syzbot was able to trigger rcu stalls by calling write()
> > with large number of bytes.
> >
> > Add a cond_resched() in the loop to avoid this.
>
> I think this simply masks a deeper issue. The code fetches characters
> from userspace in a loop, takes a lock, quickly places response in an
> output buffer, and releases interrupt. I do not see why this should
> cause stalls as we do not hold spinlock/interrupts off for extended
> period of time.
>
> Adding Paul so he can straighten me out...

If you are running a !PREEMPT kernel, then you need the cond_resched()
to allow the scheduler to choose someone else to run if needed and
to let RCU know that grace periods can end. Without the cond_resched(),
if you stay in that loop long enough you will get excessive scheduling
latencies and eventually even RCU CPU stall warning splats.

In a PREEMPT (instead of !PREEMPT) kernel, you would be right. When
preemption is enabled, the scheduler can preempt and RCU can sense
lack of readers from the scheduling-clock interrupt handler. Which
is why cond_resched() is nothingness in a PREEMPT kernel.

But because people run !PREEMPT as well as PREEMPT kernels, if that loop
can run for a long time, you need that cond_resched().

Thanx, Paul
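
The loop shape being discussed can be modelled in user space roughly as follows. This is a sketch under stated assumptions, not the driver code: model_lock(), model_unlock(), model_cond_resched() and model_write() are hypothetical stand-ins for spin_lock_irq(), spin_unlock_irq(), cond_resched() and mousedev_write(), and the counter only records that a schedule point was reached with the lock dropped.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
static int lock_held;                 /* models client->packet_lock */
static unsigned long resched_points;  /* schedule points reached with the lock dropped */

static void model_lock(void)   { lock_held = 1; }
static void model_unlock(void) { lock_held = 0; }

/* Models cond_resched(): only counts as a schedule point if no lock is held. */
static void model_cond_resched(void)
{
	if (!lock_held)
		resched_points++;
}

/*
 * Same shape as the patched loop: one short critical section per byte,
 * then a schedule point before the next byte is fetched.
 */
static size_t model_write(const unsigned char *buf, size_t count)
{
	size_t i;
	unsigned char last = 0;

	for (i = 0; i < count; i++) {
		model_lock();
		last = buf[i];          /* stands in for generating the response */
		model_unlock();
		model_cond_resched();   /* lock already dropped here */
	}
	(void)last;
	return count;
}
```

The point of the placement is that the schedule point sits after the unlock, so it is reached once per byte and never while the lock is held.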

> > Link: https://lkml.org/lkml/2018/8/23/1106
> > Signed-off-by: Eric Dumazet <[email protected]>
> > Reported-by: [email protected]
> > Cc: Dmitry Torokhov <[email protected]>
> > Cc: [email protected]
> > ---
> > drivers/input/mousedev.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> > --- a/drivers/input/mousedev.c
> > +++ b/drivers/input/mousedev.c
> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> > mousedev_generate_response(client, c);
> >
> > spin_unlock_irq(&client->packet_lock);
> > + cond_resched();
> > }
> >
> > kill_fasync(&client->fasync, SIGIO, POLL_IN);
> > --
> > 2.19.0.605.g01d371f741-goog
> >
>
> Thanks.
>
> --
> Dmitry
>


2018-10-04 19:38:06

by Paul E. McKenney

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

On Thu, Oct 04, 2018 at 12:28:56PM -0700, Eric Dumazet wrote:
> On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
> <[email protected]> wrote:
> >
> > Hi Eric,
> >
> > On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > > syzbot was able to trigger rcu stalls by calling write()
> > > with large number of bytes.
> > >
> > > Add a cond_resched() in the loop to avoid this.
> >
> > I think this simply masks a deeper issue. The code fetches characters
> > from userspace in a loop, takes a lock, quickly places response in an
> > output buffer, and releases interrupt. I do not see why this should
> > cause stalls as we do not hold spinlock/interrupts off for extended
> > period of time.
> >
> > Adding Paul so he can straighten me out...
> >
>
> Well...
>
> write(fd, buffer, 0x7FFF0000);
>
> Takes between 20 seconds and 2 minutes depending on CONFIG options ....

And two minutes would get you an RCU CPU stall warning, even on distro
kernels that set the stall-warning time to a full minute (as opposed
to 21 seconds in mainline).
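
As a simplified illustration of that comparison (a sketch only: the macro and function names are invented here, and real stall detection is tied to grace-period progress rather than a plain duration check):

```c
#include <stdbool.h>

/* Stall-warning timeouts mentioned above, in seconds. */
#define STALL_TIMEOUT_MAINLINE_SECS 21U
#define STALL_TIMEOUT_DISTRO_SECS   60U

/*
 * Simplified model: a loop that runs for loop_secs without a schedule
 * point on a !PREEMPT kernel outlasts the stall-warning timeout.
 */
static bool outlasts_stall_timeout(unsigned int loop_secs,
				   unsigned int timeout_secs)
{
	return loop_secs > timeout_secs;
}
```

Under this model the two-minute case exceeds both timeouts, while the 20-second case stays just under the mainline value.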

> So either apply my patch, or add a limit on the max count, and
> possibly break legitimate user space ?
>
> I dunno...

I vote for Eric's patch. In fact:

Reviewed-by: Paul E. McKenney <[email protected]>

> > > Link: https://lkml.org/lkml/2018/8/23/1106
> > > Signed-off-by: Eric Dumazet <[email protected]>
> > > Reported-by: [email protected]
> > > Cc: Dmitry Torokhov <[email protected]>
> > > Cc: [email protected]
> > > ---
> > > drivers/input/mousedev.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> > > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> > > --- a/drivers/input/mousedev.c
> > > +++ b/drivers/input/mousedev.c
> > > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> > > mousedev_generate_response(client, c);
> > >
> > > spin_unlock_irq(&client->packet_lock);
> > > + cond_resched();
> > > }
> > >
> > > kill_fasync(&client->fasync, SIGIO, POLL_IN);
> > > --
> > > 2.19.0.605.g01d371f741-goog
> > >
> >
> > Thanks.
> >
> > --
> > Dmitry
>


2018-10-04 19:38:39

by Dmitry Torokhov

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

On October 4, 2018 12:28:56 PM PDT, Eric Dumazet <[email protected]> wrote:
>On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
><[email protected]> wrote:
>>
>> Hi Eric,
>>
>> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
>> > syzbot was able to trigger rcu stalls by calling write()
>> > with large number of bytes.
>> >
>> > Add a cond_resched() in the loop to avoid this.
>>
>> I think this simply masks a deeper issue. The code fetches characters
>> from userspace in a loop, takes a lock, quickly places response in an
>> output buffer, and releases interrupt. I do not see why this should
>> cause stalls as we do not hold spinlock/interrupts off for extended
>> period of time.
>>
>> Adding Paul so he can straighten me out...
>>
>
>Well...
>
>write(fd, buffer, 0x7FFF0000);
>
>Takes between 20 seconds and 2 minutes depending on CONFIG options ....

That's fine even if it takes a couple of years. We are not holding spinlock for the entirety of this time, so we should get bumped off CPU at some point.

>
>So either apply my patch, or add a limit on the max count, and
>possibly break legitimate user space ?

Legitimate users write a single character at a time and read the response, so exiting after, let's say, 32 bytes would be fine. But I still want to understand why we have to do that.

>
>I dunno...
>
>> >
>> > Link: https://lkml.org/lkml/2018/8/23/1106
>> > Signed-off-by: Eric Dumazet <[email protected]>
>> > Reported-by: [email protected]
>> > Cc: Dmitry Torokhov <[email protected]>
>> > Cc: [email protected]
>> > ---
>> > drivers/input/mousedev.c | 1 +
>> > 1 file changed, 1 insertion(+)
>> >
>> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
>> > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
>> > --- a/drivers/input/mousedev.c
>> > +++ b/drivers/input/mousedev.c
>> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
>> > mousedev_generate_response(client, c);
>> >
>> > spin_unlock_irq(&client->packet_lock);
>> > + cond_resched();
>> > }
>> >
>> > kill_fasync(&client->fasync, SIGIO, POLL_IN);
>> > --
>> > 2.19.0.605.g01d371f741-goog
>> >
>>
>> Thanks.
>>
>> --
>> Dmitry


Thanks.

--
Dmitry

2018-10-04 19:46:16

by Eric Dumazet

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

On Thu, Oct 4, 2018 at 12:38 PM Dmitry Torokhov
<[email protected]> wrote:
>
> On October 4, 2018 12:28:56 PM PDT, Eric Dumazet <[email protected]> wrote:
> >On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
> ><[email protected]> wrote:
> >>
> >> Hi Eric,
> >>
> >> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> >> > syzbot was able to trigger rcu stalls by calling write()
> >> > with large number of bytes.
> >> >
> >> > Add a cond_resched() in the loop to avoid this.
> >>
> >> I think this simply masks a deeper issue. The code fetches characters
> >> from userspace in a loop, takes a lock, quickly places response in an
> >> output buffer, and releases interrupt. I do not see why this should
> >> cause stalls as we do not hold spinlock/interrupts off for extended
> >> period of time.
> >>
> >> Adding Paul so he can straighten me out...
> >>
> >
> >Well...
> >
> >write(fd, buffer, 0x7FFF0000);
> >
> >Takes between 20 seconds and 2 minutes depending on CONFIG options ....
>
> That's fine even if it takes a couple of years. We are not holding spinlock for the entirety of this time, so we should get bumped off CPU at some point.

Well, you are saying that we could get rid of all cond_resched() calls
in the kernel.

You should send patches asap ;)

>
> >
> >So either apply my patch, or add a limit on the max count, and
> >possibly break legitimate user space ?
>
> Legitimate users write a single character at a time and read the response, so exiting after, let's say, 32 bytes would be fine. But I still want to understand why we have to do that.
>
> >
> >I dunno...
> >
> >> >
> >> > Link: https://lkml.org/lkml/2018/8/23/1106
> >> > Signed-off-by: Eric Dumazet <[email protected]>
> >> > Reported-by: [email protected]
> >> > Cc: Dmitry Torokhov <[email protected]>
> >> > Cc: [email protected]
> >> > ---
> >> > drivers/input/mousedev.c | 1 +
> >> > 1 file changed, 1 insertion(+)
> >> >
> >> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> >> > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> >> > --- a/drivers/input/mousedev.c
> >> > +++ b/drivers/input/mousedev.c
> >> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> >> > mousedev_generate_response(client, c);
> >> >
> >> > spin_unlock_irq(&client->packet_lock);
> >> > + cond_resched();
> >> > }
> >> >
> >> > kill_fasync(&client->fasync, SIGIO, POLL_IN);
> >> > --
> >> > 2.19.0.605.g01d371f741-goog
> >> >
> >>
> >> Thanks.
> >>
> >> --
> >> Dmitry
>
>
> Thanks.
>
> --
> Dmitry

2018-10-04 22:55:22

by Dmitry Torokhov

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()

On Thu, Oct 04, 2018 at 12:34:07PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 04, 2018 at 11:59:49AM -0700, Dmitry Torokhov wrote:
> > Hi Eric,
> >
> > On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > > syzbot was able to trigger rcu stalls by calling write()
> > > with large number of bytes.
> > >
> > > Add a cond_resched() in the loop to avoid this.
> >
> > I think this simply masks a deeper issue. The code fetches characters
> > from userspace in a loop, takes a lock, quickly places response in an
> > output buffer, and releases interrupt. I do not see why this should
> > cause stalls as we do not hold spinlock/interrupts off for extended
> > period of time.
> >
> > Adding Paul so he can straighten me out...
>
> If you are running a !PREEMPT kernel, then you need the cond_resched()
> to allow the scheduler to choose someone else to run if needed and
> to let RCU know that grace periods can end. Without the cond_resched(),
> if you stay in that loop long enough you will get excessive scheduling
> latencies and eventually even RCU CPU stall warning splats.
>
> In a PREEMPT (instead of !PREEMPT) kernel, you would be right. When
> preemption is enabled, the scheduler can preempt and RCU can sense
> lack of readers from the scheduling-clock interrupt handler. Which
> is why cond_resched() is nothingness in a PREEMPT kernel.
>
> But because people run !PREEMPT as well as PREEMPT kernels, if that loop
> can run for a long time, you need that cond_resched().

OK, I see. I'll apply the patch then.

I think evdev.c needs similar treatment as it will keep looping while
there is data...

Thanks.

--
Dmitry

2018-10-04 23:01:49

by Eric Dumazet

Subject: Re: [PATCH] Input: mousedev - add a schedule point in mousedev_write()



On 10/04/2018 03:54 PM, Dmitry Torokhov wrote:

> OK, I see. I'll apply the patch then.

Thanks !

>
> I think evdev.c needs similar treatment as it will keep looping while
> there is data...

Yeah, presumably other drivers need care as well :/