Subject: Re: [PATCH 2/2] sched/fair: Advance global expiration when period timer is restarted
From: Xunlei Pang
Reply-To: xlpang@linux.alibaba.com
To: Cong Wang
Cc: Peter Zijlstra, Ingo Molnar, Ben Segall, LKML
Date: Tue, 19 Jun 2018 13:46:31 +0800
References: <20180618091657.21939-1-xlpang@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6/19/18 12:36 PM, Cong
Wang wrote:
> On Mon, Jun 18, 2018 at 2:16 AM, Xunlei Pang wrote:
>> I noticed the group frequently got throttled even though it consumed
>> little cpu, which caused jitter in the response time of some of
>> our business containers that enable cpu quota.
>>
>> It's very easy to reproduce:
>> mkdir /sys/fs/cgroup/cpu/test
>> cd /sys/fs/cgroup/cpu/test
>> echo 100000 > cpu.cfs_quota_us
>> echo $$ > tasks
>> then repeat:
>> cat cpu.stat |grep nr_throttled // nr_throttled will increase
>>
>> After some analysis, we found that cfs_rq::runtime_remaining will
>> be cleared by expire_cfs_rq_runtime() due to two equal but stale
>> "cfs_{b|q}->runtime_expires" after period timer is re-armed.
>>
>> The global expiration should be advanced accordingly when the
>> bandwidth period timer is restarted.
>>
>
> I observed the same problem and already sent some patches:
>
> https://lkml.org/lkml/2018/5/22/37
> https://lkml.org/lkml/2018/5/22/38
> https://lkml.org/lkml/2018/5/22/35
>

It looks like they are related to a large slice setting and unused
slack with a large number of cpus; the issue I described is a
little different.
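
To make the stale-deadline case from the quoted description concrete, here is a minimal user-space sketch in C. It is not the kernel code: the model_* names, the 4 ms tick, the 5 ms slice, the timestamps, and the loop that advances the global deadline are all assumptions chosen for illustration. It only models the behaviour the description attributes to expire_cfs_rq_runtime(): when the local and global runtime_expires are equal and already in the past, the local runtime_remaining is cleared.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_cfs_b {
    uint64_t runtime_expires;    /* global deadline, in ns */
};

struct model_cfs_rq {
    uint64_t runtime_expires;    /* local copy of the deadline, in ns */
    int64_t  runtime_remaining;  /* local quota left, in ns */
};

#define MODEL_TICK_NSEC 4000000ULL  /* assumed tick length, illustrative */

/* Expiry check as described in the thread (simplified model). */
static void model_expire_runtime(struct model_cfs_rq *rq,
                                 const struct model_cfs_b *b, uint64_t now)
{
    if ((int64_t)(now - rq->runtime_expires) < 0)
        return;                                  /* local deadline still ahead */
    if (rq->runtime_remaining < 0)
        return;                                  /* nothing left to expire */
    if (rq->runtime_expires != b->runtime_expires)
        rq->runtime_expires += MODEL_TICK_NSEC;  /* global moved on: extend */
    else
        rq->runtime_remaining = 0;               /* equal deadlines: expire */
}

/*
 * Restart the (modelled) period timer at `now` after it was idle, hand a
 * slice to the local queue, run one expiry check, and return what is left.
 * `advance` selects whether the global deadline is moved past `now` on
 * restart (an assumed way of "advancing the global expiration").
 */
static int64_t model_restart_then_check(int advance)
{
    const uint64_t period = 100000000ULL;   /* 100 ms period */
    const uint64_t now    = 350000000ULL;   /* timer restarted at 350 ms */
    struct model_cfs_b  b  = { .runtime_expires = period };  /* stale */
    struct model_cfs_rq rq = { 0, 0 };

    if (advance) {
        while ((int64_t)(now - b.runtime_expires) >= 0)
            b.runtime_expires += period;
    }

    /* The local queue takes a slice and copies the global deadline. */
    rq.runtime_remaining = 5000000;         /* 5 ms slice */
    rq.runtime_expires   = b.runtime_expires;

    model_expire_runtime(&rq, &b, now + 1000000ULL);  /* check 1 ms later */
    return rq.runtime_remaining;
}
```

In this model, model_restart_then_check(0) returns 0 (the freshly assigned slice is wiped because both deadlines are equal and stale), while model_restart_then_check(1) returns 5000000 (the slice survives because the deadline is ahead of the clock).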