Date: Tue, 6 Aug 2019 11:24:18 +0800
From: Aaron Lu <aaron.lu@linux.alibaba.com>
To: Tim Chen
Cc: Julien Desfossez, "Li, Aubrey", Aubrey Li, Subhra Mazumdar,
    Vineeth Remanan Pillai, Nishanth Aravamudan, Peter Zijlstra,
    Ingo Molnar, Thomas Gleixner, Paul Turner, Linus Torvalds,
    Linux List Kernel Mailing, Frédéric Weisbecker, Kees Cook,
    Greg Kerr, Phil Auld, Valentin Schneider, Mel Gorman,
    Pawan Gupta, Paolo Bonzini
Subject: Re: [RFC PATCH v3 00/16] Core scheduling v3
Message-ID: <20190806032418.GA54717@aaronlu>
References: <20190613032246.GA17752@sinkpad>
 <20190619183302.GA6775@sinkpad>
 <20190718100714.GA469@aaronlu>
 <20190725143003.GA992@aaronlu>
 <20190726152101.GA27884@sinkpad>
 <7dc86e3c-aa3f-905f-3745-01181a3b0dac@linux.intel.com>
 <20190802153715.GA18075@sinkpad>

On Mon, Aug 05, 2019 at 08:55:28AM -0700, Tim Chen wrote:
> On 8/2/19 8:37 AM, Julien Desfossez wrote:
> > We tested both Aaron's and Tim's patches and here are our results.
> >
> > Test setup:
> > - 2 1-thread sysbench instances, one running the cpu benchmark, the
> >   other the mem benchmark
> > - both started at the same time
> > - both are pinned on the same core (2 hardware threads)
> > - 10 30-second runs
> > - test script: https://paste.debian.net/plainh/834cf45c
> > - only showing the CPU events/sec (higher is better)
> > - tested 4 tag configurations:
> >   - no tag
> >   - sysbench mem untagged, sysbench cpu tagged
> >   - sysbench mem tagged, sysbench cpu untagged
> >   - both tagged with a different tag
> > - "Alone" is the sysbench cpu benchmark running alone on the core,
> >   no tag
> > - "nosmt" is both sysbench instances pinned on the same hardware
> >   thread, no tag
> > - "Tim's full patchset + sched" is an experiment with Tim's patchset
> >   combined with Aaron's "hack patch" to get rid of the remaining deep
> >   idle cases
> > - In all test cases, both tasks can run simultaneously (which was not
> >   the case without those patches), but the standard deviation is a
> >   pretty good indicator of the fairness/consistency.
>
> Thanks for testing the patches and giving such detailed data.

Thanks Julien.

> I came to realize that for my scheme, the accumulated deficit of
> forced idle could be wiped out in one execution of a task on the
> forced idle cpu, with the update of the min_vruntime, even if the
> execution time could be far less than the accumulated deficit.
>
> That's probably one reason my scheme didn't achieve fairness.

I've been thinking about whether we should consider core wide tenant
fairness.

Let's say there are 3 tasks on the 2 threads' runqueues of the same
core: 2 tasks (e.g. A1, A2) belong to tenant A and the 3rd, B1, belongs
to another tenant B. Assume A1 and B1 are queued on the same thread and
A2 on the other thread. When we decide priority for A1 and B1, shall we
also consider A2's vruntime? i.e. shall we consider A1 and A2 as a
whole since they belong to the same tenant?

I tend to think we should make fairness per core per tenant, instead of
per thread(cpu) per task(sched entity). What do you guys think?

Implementation of the idea is a mess to me, as I feel I'm duplicating
the existing per cpu per sched_entity enqueue/update vruntime/dequeue
logic for the per core per tenant stuff.
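
To make the idea a bit more concrete, here is a rough user-space sketch
of the comparison I have in mind. It is not kernel code: the names
(struct toy_se, tenant_core_vruntime(), pick_next()) are made up, and
simply summing raw vruntime is just one possible way of combining a
tenant's entities. In the real thing the vruntime of the two siblings
would first have to be made comparable, e.g. against a core wide
min_vruntime, which is ignored here.

#include <stdio.h>

/*
 * Toy illustration only, not kernel code.  struct toy_se,
 * tenant_core_vruntime() and pick_next() are invented names.
 */
struct toy_se {
	const char *name;
	int tenant;			/* tag identifying the tenant        */
	int cpu;			/* which sibling of the core it's on */
	unsigned long long vruntime;
};

/* Three tasks on one core (two siblings): A1 and B1 on cpu0, A2 on cpu1. */
static struct toy_se core_tasks[] = {
	{ "A1", 0, 0,  800 },
	{ "B1", 1, 0, 1000 },
	{ "A2", 0, 1,  500 },
};
#define NR_TASKS (sizeof(core_tasks) / sizeof(core_tasks[0]))

/*
 * Core wide vruntime of a tenant: combine the vruntime of the tenant's
 * entities queued on either sibling of the core.  (Summing is just one
 * possible choice of aggregation.)
 */
static unsigned long long tenant_core_vruntime(int tenant)
{
	unsigned long long sum = 0;
	unsigned int i;

	for (i = 0; i < NR_TASKS; i++)
		if (core_tasks[i].tenant == tenant)
			sum += core_tasks[i].vruntime;
	return sum;
}

/*
 * Pick the next task on @cpu: instead of comparing the candidates' own
 * vruntime (per cpu per sched entity), compare their tenants' core wide
 * vruntime (per core per tenant).
 */
static struct toy_se *pick_next(int cpu)
{
	struct toy_se *best = NULL;
	unsigned int i;

	for (i = 0; i < NR_TASKS; i++) {
		struct toy_se *se = &core_tasks[i];

		if (se->cpu != cpu)
			continue;
		if (!best || tenant_core_vruntime(se->tenant) <
			     tenant_core_vruntime(best->tenant))
			best = se;
	}
	return best;
}

int main(void)
{
	struct toy_se *se = pick_next(0);

	/*
	 * A per-task comparison on cpu0 would pick A1 (800 < 1000), but
	 * tenant A has already accumulated vruntime through A2 on the
	 * sibling (800 + 500 = 1300 > 1000), so the per core per tenant
	 * comparison picks B1.
	 */
	printf("pick_next(cpu0) -> %s (tenant %d, tenant vruntime %llu)\n",
	       se->name, se->tenant, tenant_core_vruntime(se->tenant));
	return 0;
}

Whether summing is the right aggregation, or whether a per-tenant core
wide vruntime should instead be maintained and updated on every
enqueue/dequeue like a sched entity, is exactly the part I find messy.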