Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752646AbdLMKyz (ORCPT ); Wed, 13 Dec 2017 05:54:55 -0500 Received: from mail.linuxfoundation.org ([140.211.169.12]:40754 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751853AbdLMKyw (ORCPT ); Wed, 13 Dec 2017 05:54:52 -0500 Date: Wed, 13 Dec 2017 11:54:56 +0100 From: Greg KH To: Rabin Vincent Cc: fweisbec@gmail.com, riel@redhat.com, mingo@redhat.com, peterz@infradead.org, stable@vger.kernel.org, linux-kernel@vger.kernel.org, Rabin Vincent Subject: Re: [PATCH for v4.9-stable] sched: fix softirq time accounting Message-ID: <20171213105456.GA5236@kroah.com> References: <1513159876-5125-1-git-send-email-rabin.vincent@axis.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1513159876-5125-1-git-send-email-rabin.vincent@axis.com> User-Agent: Mutt/1.9.1 (2017-09-22) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2519 Lines: 72 On Wed, Dec 13, 2017 at 11:11:16AM +0100, Rabin Vincent wrote: > From: Rabin Vincent > > softirq time accounting is broken on v4.9.x if ksoftirqd runs. > > With > CONFIG_IRQ_TIME_ACCOUNTING=y > # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set > > this test code: > > struct tasklet_struct tasklet; > > static void delay_tasklet(unsigned long data) > { > udelay(10); > tasklet_schedule(&tasklet); > } > > tasklet_init(&tasklet, delay_tasklet, 0); > tasklet_schedule(&tasklet); > > results in: > > $ while :; do grep cpu0 /proc/stat; done > cpu0 5 0 80 25 16 107 1 0 0 0 > cpu0 5 0 80 25 16 107 0 0 0 0 > cpu0 5 0 80 25 16 107 0 0 0 0 > cpu0 5 0 80 25 16 107 0 0 0 0 > cpu0 5 0 81 25 16 107 0 0 0 0 > cpu0 5 0 81 25 16 107 1 0 0 0 > cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0 > cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0 > cpu0 5 0 81 25 16 108 18446744073709551615 0 0 0 > cpu0 5 0 81 25 16 108 0 0 0 0 > cpu0 6 0 81 25 16 108 0 0 0 0 > cpu0 6 0 81 25 16 108 0 0 0 0 > > As can be seen, the softirq numbers are totally bogus. > > When ksoftirq is running, irqtime_account_process_tick() increments > cpustat[CPUSTAT_SOFTIRQ]. This causes the "nsecs_to_cputime64(irqtime) > - cpustat[CPUSTAT_SOFTIRQ]" calculation in irqtime_account_update() to > underflow the next time a softirq is handled leading to the above > values. > > The underflow bug was added by 57430218317e5b280 ("sched/cputime: Count > actually elapsed irq & softirq time"). > > But ksoftirqd accounting was wrong even in earlier kernels. In earlier > kernels, after a ksoftirq run, the kernel would simply stop accounting > softirq time spent outside of ksoftirqd until that (accumulated) time > exceeded the time for which ksofirqd previously had run. > > Fix both the underflow and the wrong accounting by using a counter > specifically for the non-ksoftirqd softirq case. > > This code has been fixed in current mainline by a499a5a14db > ("sched/cputime: Increment kcpustat directly on irqtime account") [note > also the followup 25e2d8c1b9e327e ("sched/cputime: Fix ksoftirqd cputime > accounting regression")], but that patch is a part of the many changes > for eliminating of cputime_t so it does not seem suitable for backport. I _really_ only want to take the exact upstream patches, as every time we do something like what you are proposing to do here, we get it wrong. Seriously, our track record is horrible. Like 90% wrong. Can you try just those two patches instead? thanks, greg k-h