Date: Tue, 22 Nov 2016 07:11:55 +0100
From: Martin Schwidefsky
To: Frederic Weisbecker
Cc: LKML, Tony Luck, Wanpeng Li, Peter Zijlstra, Michael Ellerman,
    Heiko Carstens, Benjamin Herrenschmidt, Thomas Gleixner,
    Paul Mackerras, Ingo Molnar, Fenghua Yu, Rik van Riel,
    Stanislaw Gruszka
Subject: Re: [PATCH 00/36] cputime: Convert core use of cputime_t to nsecs
In-Reply-To: <20161121162003.GB7554@lerouge>
References: <1479406123-24785-1-git-send-email-fweisbec@gmail.com>
    <20161118130846.7da515cc@mschwide>
    <20161118144700.GA31560@lerouge>
    <20161121075956.2b36b3e3@mschwide>
    <20161121162003.GB7554@lerouge>
Message-Id: <20161122071155.3e6b3e35@mschwide>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 21 Nov 2016 17:20:06 +0100
Frederic Weisbecker wrote:

> On Mon, Nov 21, 2016 at 07:59:56AM +0100, Martin Schwidefsky wrote:
> > On Fri, 18 Nov 2016 15:47:02 +0100
> > Frederic Weisbecker wrote:
> > > > The do_account_vtime function is called once per jiffy and once per task
> > > > switch. HZ is usually set to 100 for s390, the conversion once per jiffy
> > > > would not be so bad, but the call on the scheduling path *will* hurt.
> > >
> > > I don't think we need to flush on task switch. If we maintain the accumulators
> > > on the task/thread struct instead of per-cpu, then the remaining time after
> > > task switch out will be accounted on the next tick after the next task switch in.
> >
> > You cannot properly calculate steal time if you allow sleeping tasks to sit on
> > up to 5*HZ worth of cpu time.
>
> Ah, you mean that when the task goes to sleep, we shouldn't miss more than one
> tick worth of system/user time but the steal time can be much higher, right?

No, it is worse than that. Consider a task going to sleep just before a tick
arrives. It will have almost a full HZ time-slice in its task-specific
accounting numbers. After the switch another task with a different set of
accounting numbers is running. The tick will not push the cputime for the work
done in the last HZ period. Depending on what the new task has in its
accounting numbers, the steal time calculation can give you anything. Repeat
the whole thing with any number of tasks and the missing cputime can get
really large. Now get one of these processes back at the beginning of a time
slice and you can get nearly 200% worth of cputime in one tick. Switch to the
next task with missing cputime at the start of the new tick and you can get
many ticks with too much cputime. Not doing accounting on task switch is just
broken.

> > I think we *have* to do accounting on task switch.
> > At least on s390, likely on powerpc as well. Why not make that an option for
> > the architecture with the yet-to-be-written accumulating code.
>
> Ok, how about doing the accumulation and always account on task switch for now,
> we'll see later if it's worth having such an option.

I am convinced that we need it. The prototype patch does it for s390.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
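
For illustration only, the scenario described above can be replayed with a
toy model. This is a sketch under stated assumptions, not kernel code: the
names (simulate, TICK_US, SLICES) are hypothetical, there is a single CPU
with no real steal, each run slice ends either at a task switch or exactly
on a tick, and steal is assumed to be computed as the wall-clock time since
the last accounting point minus the cputime pushed at that point.

```python
TICK_US = 10_000  # one tick at HZ=100, in microseconds (assumed for the example)


def simulate(slices, flush_on_switch):
    """Replay run slices on one CPU and return (cputime_us, steal_us) per tick.

    slices: list of (task_name, run_us), executed back to back. Every slice
    ends either on a tick boundary or at a task switch.
    flush_on_switch: if True, pending cputime is accounted at every slice
    end; if False, only the task running when the tick arrives is accounted.
    """
    pending = {}            # per-task cputime accumulated but not yet accounted
    now = last_account = 0  # wall clock and time of the last accounting point
    cpu = steal = 0         # totals reported for the tick in progress
    report = []
    for name, run_us in slices:
        pending[name] = pending.get(name, 0) + run_us
        now += run_us
        at_tick = now % TICK_US == 0
        if at_tick or flush_on_switch:
            flushed = pending.pop(name, 0)
            cpu += flushed
            # Assumed steal formula: elapsed wall time minus accounted cputime.
            steal += (now - last_account) - flushed
            last_account = now
        if at_tick:
            report.append((cpu, steal))
            cpu = steal = 0
    return report


# Task A runs for almost a whole tick and sleeps just before the tick arrives;
# task B is running when the tick fires; A then gets the whole next tick.
SLICES = [("A", 9_900), ("B", 100), ("A", 10_000)]

for mode in (True, False):
    print("flush_on_switch =", mode, simulate(SLICES, flush_on_switch=mode))
```

With accounting on task switch, each tick in this model reports one tick of
cputime and no steal. Without it, the first tick reports almost a full tick
of steal that never happened, and the second tick reports 19900 us of cputime
in a 10000 us tick (the "nearly 200%" case) together with negative steal.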