2017-04-20 09:00:53

by Jesper Dangaard Brouer

Subject: Heads-up: two regressions in v4.11-rc series

Hi Linus,

Just wanted to give a heads-up on two regressions in the 4.11-rc series.

(1) page allocator optimization revert

Mel Gorman and I have been playing with optimizing the page allocator,
but Tariq spotted that we caused a regression for (NIC) drivers that
refill DMA RX rings in softirq context.

The end result was a revert, and this is waiting in AKPM's quilt queue:
http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch


(2) Busy softirq can cause userspace not to be scheduled

I bisected the problem to a499a5a14dbd ("sched/cputime: Increment
kcpustat directly on irqtime account"). See email thread with
Subject: Bisected softirq accounting issue in v4.11-rc1~170^2~28
http://lkml.kernel.org/r/[email protected]

I don't know the scheduler code well enough to fix this, and will have
to rely on others to figure out this scheduler regression.

To make it clear: I'm only seeing this scheduler regression when a
remote host is sending a very large number of network packets towards
the kernel, which keeps NAPI/softirq busy all the time. A possible
hint: "top" only shows this load in the "si" column, while on v4.10
"top" also blames "ksoftirqd/N". In addition, the cputime that "ps"
reports for ksoftirqd (0:00) seems wrong.
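
For anyone who wants to compare the two views mentioned above without
relying on "top" and "ps", here is a minimal sketch, not taken from the
bisection itself. It assumes the standard /proc/stat "cpu" line layout
and the /proc/<pid>/stat field order documented in proc(5); the helper
names (parse_cpu_line, softirq_share, task_cputime_ticks) are
hypothetical and chosen only for illustration.

```python
# Field order of the aggregate "cpu" line in /proc/stat, per proc(5).
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(stat_text):
    """Return a dict of tick counters from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(CPU_FIELDS, map(int, parts[1:])))
    raise ValueError("no aggregate 'cpu' line found")


def softirq_share(before, after):
    """Fraction of elapsed ticks spent in softirq between two samples."""
    delta = {k: after[k] - before[k] for k in before}
    # guest and guest_nice are already counted inside user and nice.
    total = sum(v for k, v in delta.items()
                if k not in ("guest", "guest_nice"))
    return delta["softirq"] / total if total else 0.0


def task_cputime_ticks(pid_stat_text):
    """Return (utime, stime) in clock ticks from /proc/<pid>/stat text."""
    # The comm field is wrapped in parentheses and may contain spaces,
    # so split after the last ')'. The first remaining token is field 3
    # (state); utime is field 14 and stime is field 15.
    rest = pid_stat_text.rsplit(")", 1)[1].split()
    return int(rest[11]), int(rest[12])
```

To use it, sample open("/proc/stat").read() twice a few seconds apart
under load and pass both results through parse_cpu_line() into
softirq_share(); then read /proc/<pid>/stat for the ksoftirqd/N thread
and check whether its stime advances over the same interval.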


--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer


2017-04-20 13:25:54

by Frederic Weisbecker

Subject: Re: Heads-up: two regressions in v4.11-rc series

On Thu, Apr 20, 2017 at 11:00:42AM +0200, Jesper Dangaard Brouer wrote:
> Hi Linus,
>
> Just wanted to give a heads-up on two regressions in 4.11-rc series.
>
> (1) page allocator optimization revert
>
> Mel Gorman and I have been playing with optimizing the page allocator,
> but Tariq spotted that we caused a regression for (NIC) drivers that
> refill DMA RX rings in softirq context.
>
> The end result was a revert, and this is waiting in AKPM's quilt queue:
> http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch
>
>
> (2) Busy softirq can cause userspace not to be scheduled
>
> I bisected the problem to a499a5a14dbd ("sched/cputime: Increment
> kcpustat directly on irqtime account"). See email thread with
> Subject: Bisected softirq accounting issue in v4.11-rc1~170^2~28
> http://lkml.kernel.org/r/[email protected]
>
> I don't know the scheduler code well enough to fix this, and will have
> to rely on others to figure out this scheduler regression.
>
> To make it clear: I'm only seeing this scheduler regression when a
> remote host is sending many many network packets, towards the kernel
> which keeps NAPI/softirq busy all the time. A possible hint: tool
> "top" only shows this in "si" column, while on v4.10 "top" also blames
> "ksoftirqd/N", plus "ps" reported cputime (0:00) seems wrong for ksoftirqd.

(I'm currently working on reproducing that one.)

2017-04-20 14:30:43

by Mel Gorman

Subject: Re: Heads-up: two regressions in v4.11-rc series

On Thu, Apr 20, 2017 at 11:00:42AM +0200, Jesper Dangaard Brouer wrote:
> Hi Linus,
>
> Just wanted to give a heads-up on two regressions in 4.11-rc series.
>
> (1) page allocator optimization revert
>
> Mel Gorman and I have been playing with optimizing the page allocator,
> but Tariq spotted that we caused a regression for (NIC) drivers that
> refill DMA RX rings in softirq context.
>
> The end result was a revert, and this is waiting in AKPM's quilt queue:
> http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch
>

I flagged to Andrew that this should go in for 4.11 or, if there are
concerns about how close to the release we are, for 4.11-stable. At
worst, I can resubmit it to -stable myself after it gets merged in the
next window if it falls between the cracks.

--
Mel Gorman
SUSE Labs

2017-04-21 19:00:13

by Linus Torvalds

Subject: Re: Heads-up: two regressions in v4.11-rc series

On Thu, Apr 20, 2017 at 7:30 AM, Mel Gorman <[email protected]> wrote:
>> The end result was a revert, and this is waiting in AKPM's quilt queue:
>> http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch
>>
>
> This was flagged to Andrew that it should go in for either 4.11 or if
> there were concerns about how close to the release we are then put it in
> for 4.11-stable. At worst, I can do a resubmit to -stable myself after
> it gets merged in the next window if it falls between the cracks.

This got merged (commit d34b0733b452: "Revert "mm, page_alloc: only
use per-cpu allocator for irq-safe requests"").

The other issue (caused by commit a499a5a14dbd: "sched/cputime:
Increment kcpustat directly on irqtime account") is still open.

Frederic? Revert? But I guess it's something we can delay for
backporting; it's presumably not possible to hit maliciously except by
an attacker on a fast local network, and then only as an effective DoS.

Linus

2017-04-21 20:33:19

by Frederic Weisbecker

Subject: Re: Heads-up: two regressions in v4.11-rc series

On Fri, Apr 21, 2017 at 10:52:29AM -0700, Linus Torvalds wrote:
> On Thu, Apr 20, 2017 at 7:30 AM, Mel Gorman <[email protected]> wrote:
> >> The end result was a revert, and this is waiting in AKPM's quilt queue:
> >> http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch
> >>
> >
> > This was flagged to Andrew that it should go in for either 4.11 or if
> > there were concerns about how close to the release we are then put it in
> > for 4.11-stable. At worst, I can do a resubmit to -stable myself after
> > it gets merged in the next window if it falls between the cracks.
>
> This got merged (commit d34b0733b452: "Revert "mm, page_alloc: only
> use per-cpu allocator for irq-safe requests"").
>
> The other issue (caused by commit a499a5a14dbd: "sched/cputime:
> Increment kcpustat directly on irqtime account") is still open.
>
> Frederic? Revert? But I guess it's something we can delay for
> backporting, it's presumably not possible to hit maliciously except on
> some fast local network attacker just causing an effective DoS.

I can't speak to the security impact. But indeed I think we should
rather delay it for backporting if we can't manage to fix it in the
coming days. A revert is not really an option, because you can't revert
this patch alone: it's part of a series of ~30 commits that removed
cputime_t, and it sits in the middle of that series, so the commits
that come after it depend on it and those that come before it don't
make sense alone.

But I'll fix this ASAP.

Thanks.