Message-ID: <1498778847.6130.8.camel@redhat.com>
Subject: Re: [PATCH 5/5] sched: Accumulate vtime on top of nsec clocksource
From: Rik van Riel
To: Frederic Weisbecker, LKML
Cc: Wanpeng Li, Peter Zijlstra, Thomas Gleixner, Luiz Capitulino, Ingo Molnar
Date: Thu, 29 Jun 2017 19:27:27 -0400
In-Reply-To: <1498756511-11714-6-git-send-email-fweisbec@gmail.com>
References: <1498756511-11714-1-git-send-email-fweisbec@gmail.com> <1498756511-11714-6-git-send-email-fweisbec@gmail.com>
Organization: Red Hat, Inc

On Thu, 2017-06-29 at 19:15 +0200, Frederic Weisbecker wrote:
> From: Wanpeng Li
>
> Currently the cputime source used by vtime is jiffies. When we cross
> a context boundary and jiffies have changed since the last snapshot, the
> pending cputime is accounted to the switching out context.
>
> This system works ok if the ticks are not aligned across CPUs. If they
> instead are aligned (ie: all fire at the same time) and the CPUs run in
> userspace, the jiffies change is only observed on tick exit and therefore
> the user cputime is accounted as system cputime. This is because the
> CPU that maintains timekeeping fires its tick at the same time as the
> others. It updates jiffies in the middle of the tick and the other CPUs
> see that update on IRQ exit:
>
>     CPU 0 (timekeeper)                  CPU 1
>     -------------------              -------------
>                       jiffies = N
>     ...                              run in userspace for a jiffy
>     tick entry                       tick entry (sees jiffies = N)
>     set jiffies = N + 1
>     tick exit                        tick exit (sees jiffies = N + 1)
>                                                 account 1 jiffy as stime
>
> Fix this with using a nanosec clock source instead of jiffies. The
> cputime is then accumulated and flushed everytime the pending delta
> reaches a jiffy in order to mitigate the accounting overhead.

Glad to hear this could be done without dramatically
increasing the accounting overhead!

> [fweisbec: changelog, rebase on struct vtime, field renames, add delta
> on cputime readers, keep idle vtime as-is (low overhead accounting),
> harmonize clock sources]
>
> Reported-by: Luiz Capitulino
> Suggested-by: Thomas Gleixner
> Not-Yet-Signed-off-by: Wanpeng Li
> Cc: Rik van Riel
> Cc: Peter Zijlstra
> Cc: Thomas Gleixner
> Cc: Wanpeng Li
> Cc: Ingo Molnar
> Cc: Luiz Capitulino
> Signed-off-by: Frederic Weisbecker

Acked-by: Rik van Riel
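
For anyone following the thread without the diff at hand, the accumulate-and-flush
idea from the changelog can be sketched in plain userspace C roughly as below.
This is only an illustration, not the code from the patch: the struct, the
function names and SKETCH_TICK_NSEC are all hypothetical, and HZ=1000 is assumed
so that one jiffy is 1,000,000 ns.

```c
#include <stdint.h>

/* Assumption: HZ=1000, so one jiffy corresponds to 1,000,000 ns. */
#define SKETCH_TICK_NSEC 1000000ULL

/* Hypothetical per-task state; not the kernel's struct vtime. */
struct vtime_sketch {
	uint64_t starttime;   /* ns timestamp of the last snapshot */
	uint64_t pending;     /* ns accumulated but not yet flushed */
	uint64_t accounted;   /* ns already handed to the accounting core */
	unsigned int flushes; /* how many times the expensive path ran */
};

/* Return the ns elapsed since the last snapshot and take a new snapshot. */
static uint64_t sketch_take_delta(struct vtime_sketch *vt, uint64_t now_ns)
{
	uint64_t delta = now_ns - vt->starttime;

	vt->starttime = now_ns;
	return delta;
}

/*
 * Called at a context boundary. The cheap path only adds to 'pending';
 * the (stand-in) expensive accounting runs once a jiffy's worth is pending.
 */
static void sketch_account(struct vtime_sketch *vt, uint64_t now_ns)
{
	vt->pending += sketch_take_delta(vt, now_ns);
	if (vt->pending >= SKETCH_TICK_NSEC) {
		vt->accounted += vt->pending;
		vt->pending = 0;
		vt->flushes++;
	}
}
```

With a zero-initialized struct, calling sketch_account() at 400000 ns and
900000 ns only grows 'pending'; a third call at 1300000 ns crosses the
assumed jiffy threshold and flushes all 1300000 ns at once. Because the
delta comes from a nanosecond timestamp taken at the boundary itself, it
does not depend on when another CPU happens to update jiffies.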