Date: Fri, 25 May 2018 04:56:25 +0200
From: Frederic Weisbecker
To: Yauheni Kaliuta
Cc: Luiz Capitulino, Ingo Molnar, LKML, Peter Zijlstra, Chris Metcalf,
    Thomas Gleixner, Christoph Lameter, "Paul E. McKenney", Wanpeng Li,
    Mike Galbraith, Rik van Riel
Subject: Re: [GIT PULL] isolation: 1Hz residual tick offloading v4
Message-ID: <20180525025624.GB22082@lerouge>
References: <1516320140-13189-1-git-send-email-frederic@kernel.org> <20180124104608.038fb212@redhat.com> <20180129011024.GA2942@lerouge>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 22, 2018 at 10:10:19PM +0300, Yauheni Kaliuta wrote:
> Hi, Frederic!
>
> >>>>> On Mon, 29 Jan 2018 02:10:26 +0100, Frederic Weisbecker wrote:
> > On Wed, Jan 24, 2018 at 10:46:08AM -0500, Luiz Capitulino wrote:
>
> [...]
>
> >> Since the 1Hz tick offload worked for you, I must be missing
> >> a way to disable this timer or the kernel is thinking my CPU
> >> has unstable TSC (which it doesn't AFAIK).
>
> > It's beyond the scope of this patchset but indeed that's
> > right, I run my kernels with tsc=reliable because my CPUs
> > don't have the TSC_RELIABLE flag. That's the only way I found
> > to shutdown the tick completely on my test machine, otherwise
> > I keep having that clocksource watchdog.
>
> [...]
>
> Thanks, it helps. But I have accounting problem:
>
> if I run user busy loop on the nohz cpu, the task accounting works
> correctly (top shows the task takes 100% cpu), but cpu accounting is
> wrong (cpu is 100% idle, in the per-core view as well).
>
> If I understand correctly, the stats are updated by account_user_time()
> -> task_group_account_field() but there is no call for it in case of
> offloading (it is called from irqtime_account_process_tick,
> account_process_tick, vtime_user_exit).

Ah I forgot about kcpustat accounting. I remember I wanted to fix that a few
years ago but I forgot about it when I removed the last tick. That thing was
lurking behind 1Hz.

>
> Moreover, task_group_account_field() uses __this_cpu_add() which will be
> wrong for offloading.
>
> For testing I used kcpustat_cpu(task_cpu(p)) in
> task_group_account_field() and added call account_user_time(curr, delta)
> to the sched_tick_remote() what fixes it for me, but what would be the
> proper fix?

Yeah, unfortunately that's unsafe. Task accounting is not designed for remote
updates. You could race with an update from another CPU, especially the local
updater.

I fear we need to take the same approach as task cputime, which uses a
seqcount for updates. Then the reader would fetch the kcpustat values plus the
vtime delta of the task currently executing.

Things can get complicated once we dive into corner cases: CPUTIME_IRQ,
CPUTIME_SOFTIRQ, and CPUTIME_STEAL. At least we don't need to care about
CPUTIME_IDLE and CPUTIME_IOWAIT, which have their own delta.

I'm trying that.

Thanks.
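
To make the seqcount idea a bit more concrete, here is a rough user-space
sketch in C11, not kernel code. It assumes a single writer per CPU (the local
updater) and any number of remote readers. The struct kcpustat_sketch, the
kcpustat_*() helpers, the VTIME_NONE marker and the two-entry enum are all
hypothetical names made up for this illustration; the real kernel has its own
seqcount API and more cputime indices. The writer flushes elapsed time into
the stats under an odd sequence number, and the reader adds the not yet
flushed delta of the running state, retrying if it raced with the writer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical subset of the kernel's cputime indices, for illustration only. */
enum { CPUTIME_USER, CPUTIME_SYSTEM, NR_STATS };
#define VTIME_NONE (-1) /* nothing is being accumulated (e.g. idle) */

struct kcpustat_sketch {
	atomic_uint seq;                    /* odd while an update is in flight */
	_Atomic uint64_t cpustat[NR_STATS]; /* time already flushed, in ns */
	_Atomic uint64_t vtime_start;       /* when the current state began */
	atomic_int vtime_state;             /* index being accumulated, or VTIME_NONE */
};

void kcpustat_init(struct kcpustat_sketch *k)
{
	atomic_init(&k->seq, 0);
	for (int i = 0; i < NR_STATS; i++)
		atomic_init(&k->cpustat[i], 0);
	atomic_init(&k->vtime_start, 0);
	atomic_init(&k->vtime_state, VTIME_NONE);
}

/* Writer side: assumed to be called only by the owning CPU (single writer). */
void kcpustat_switch_state(struct kcpustat_sketch *k, int new_state, uint64_t now)
{
	assert(new_state >= VTIME_NONE && new_state < NR_STATS);

	/* Make the sequence odd so that concurrent readers know to retry. */
	unsigned int seq = atomic_load_explicit(&k->seq, memory_order_relaxed);
	atomic_store_explicit(&k->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	/* Flush the time spent in the old state, then start the new one. */
	int old = atomic_load_explicit(&k->vtime_state, memory_order_relaxed);
	if (old != VTIME_NONE) {
		uint64_t start = atomic_load_explicit(&k->vtime_start, memory_order_relaxed);
		uint64_t sum = atomic_load_explicit(&k->cpustat[old], memory_order_relaxed);
		atomic_store_explicit(&k->cpustat[old], sum + (now - start), memory_order_relaxed);
	}
	atomic_store_explicit(&k->vtime_start, now, memory_order_relaxed);
	atomic_store_explicit(&k->vtime_state, new_state, memory_order_relaxed);

	/* Back to an even sequence: the update is complete. */
	atomic_store_explicit(&k->seq, seq + 2, memory_order_release);
}

/* Reader side: any CPU may call this; it retries if it raced with the writer. */
uint64_t kcpustat_read(struct kcpustat_sketch *k, int field, uint64_t now)
{
	assert(field >= 0 && field < NR_STATS);

	unsigned int s0, s1;
	uint64_t val;

	do {
		s0 = atomic_load_explicit(&k->seq, memory_order_acquire);
		/* Flushed value plus the pending delta of the running state. */
		val = atomic_load_explicit(&k->cpustat[field], memory_order_relaxed);
		if (atomic_load_explicit(&k->vtime_state, memory_order_relaxed) == field)
			val += now - atomic_load_explicit(&k->vtime_start, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s1 = atomic_load_explicit(&k->seq, memory_order_relaxed);
	} while (s0 != s1 || (s0 & 1));

	return val;
}
```

With this shape, a remote reader never writes to another CPU's stats, which
avoids the race with the local updater. A busy user loop on a tickless CPU
would still show up in the reader's result through the pending delta, even
though nothing has been flushed yet.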