Date: Thu, 20 Apr 2017 11:00:42 +0200
From: Jesper Dangaard Brouer
To: Linus Torvalds, Andrew Morton, Frederic Weisbecker
Cc: brouer@redhat.com, Mel Gorman, Tariq Toukan, LKML, linux-mm,
    "netdev@vger.kernel.org", peterz@infradead.org
Subject: Heads-up: two regressions in v4.11-rc series
Message-ID: <20170420110042.73d01e0f@redhat.com>

Hi Linus,

Just wanted to give a heads-up on two regressions in 4.11-rc series.

(1) page allocator optimization revert

Mel Gorman and I have been playing with optimizing the page allocator,
but Tariq spotted that we caused a regression for (NIC) drivers that
refill DMA RX rings in softirq context.
The end result was a revert, and this is waiting in AKPM's quilt queue:
 http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch

(2) Busy softirq can cause userspace not to be scheduled

I bisected the problem to a499a5a14dbd ("sched/cputime: Increment
kcpustat directly on irqtime account"). See email thread with
 Subject: Bisected softirq accounting issue in v4.11-rc1~170^2~28
 http://lkml.kernel.org/r/20170328101403.34a82fbf@redhat.com

I don't know the scheduler code well enough to fix this, and will have
to rely on others to figure out this scheduler regression.

To make it clear: I'm only seeing this scheduler regression when a
remote host is sending many, many network packets towards the kernel,
which keeps NAPI/softirq busy all the time. A possible hint: the tool
"top" only shows this in the "si" column, while on v4.10 "top" also
blames "ksoftirqd/N"; in addition, the cputime reported by "ps" (0:00)
seems wrong for ksoftirqd.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer