Subject: Re: [PATCH] sched/cputime: Ensure correct utime and stime proportion
From: Xunlei Pang
To: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker, Tejun Heo
Cc: linux-kernel@vger.kernel.org
Date: Tue, 26 Jun 2018 20:19:49 +0800
References: <20180622071542.61569-1-xlpang@linux.alibaba.com>
In-Reply-To: <20180622071542.61569-1-xlpang@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On
6/22/18 3:15 PM, Xunlei Pang wrote:
> We use per-cgroup cpu usage statistics similar to "cgroup rstat",
> and encountered a problem that user and sys usages are wrongly
> split sometimes.
>
> Run tasks with some random run-sleep pattern for a long time, and
> when tick-based time and scheduler sum_exec_runtime hugely drifts
> apart(scheduler sum_exec_runtime is less than tick-based time),
> the current implementation of cputime_adjust() will produce less
> sys usage than the actual use after changing to run a different
> workload pattern with high sys. This is because total tick-based
> utime and stime are used to split the total sum_exec_runtime.
>
> Same problem exists on utime and stime from "/proc//stat".
>
> [Example]
> Run some random run-sleep patterns for minutes, then change to run
> high sys pattern, and watch.
> 1) standard "top"(which is the correct one):
> 4.6 us, 94.5 sy, 0.0 ni, 0.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> 2) our tool parsing utime and stime from "/proc//stat":
> 20.5 usr, 78.4 sys
> We can see "20.5 usr" displayed in 2) was incorrect, it recovers
> gradually with time: 9.7 usr, 89.5 sys
>

High sys usually means something abnormal is happening on the kernel
path, and a wrong value may hide real issues, so the reported sys usage
should be fairly reliable. Our per-cgroup statistics hit this problem
easily.

Hi Peter, any comments on this patch?
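To make the arithmetic behind the quoted description concrete, here is a
minimal userspace sketch. It is a simplified model of the proportional
split described above (total sum_exec_runtime divided according to the
cumulative tick-based utime:stime ratio), not the kernel's
cputime_adjust() code. The function names and all numbers are
hypothetical and chosen only for illustration.

```python
def split_runtime(rtime, utime_ticks, stime_ticks):
    """Split precise runtime by the cumulative tick-based utime:stime ratio.

    Simplified model only; names and units are illustrative assumptions.
    Returns (utime, stime) with utime + stime == rtime.
    """
    if stime_ticks == 0:
        return rtime, 0
    if utime_ticks == 0:
        return 0, rtime
    stime = rtime * stime_ticks // (utime_ticks + stime_ticks)
    return rtime - stime, stime


def interval_share(before, after):
    """Percent usr/sys seen by a tool that diffs two (utime, stime) samples."""
    d_usr = after[0] - before[0]
    d_sys = after[1] - before[1]
    total = d_usr + d_sys
    return 100.0 * d_usr / total, 100.0 * d_sys / total


# Hypothetical history: after a long run-sleep phase the ticks say
# 600 usr / 400 sys, while sum_exec_runtime is only 500 (drifted apart).
before = split_runtime(500, 600, 400)

# Then a high-sys interval adds 5 usr ticks, 95 sys ticks and 100 units
# of precise runtime.
after = split_runtime(600, 605, 495)

usr_pct, sys_pct = interval_share(before, after)
print(f"reported for the interval: {usr_pct:.1f} usr, {sys_pct:.1f} sys")
print("tick-based for the interval: 5.0 usr, 95.0 sys")
```

With these made-up inputs the diffed samples report 30.0 usr and 70.0 sys
for an interval whose ticks were 5 usr and 95 sys. The old history
dominates the ratio, so the reported sys share is too low and only
converges as more high-sys samples accumulate, which matches the
"recovers gradually" behaviour in the example above.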