Date: Thu, 9 May 2019 10:11:44 +0800
From: Aaron Lu
To: Julien Desfossez
Cc: Vineeth Remanan Pillai, Phil Auld, Nishanth Aravamudan, Peter Zijlstra,
    Tim Chen, mingo@kernel.org, tglx@linutronix.de, pjt@google.com,
    torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
    subhra.mazumdar@oracle.com, fweisbec@gmail.com, keescook@chromium.org,
    kerrnel@google.com, Aaron Lu, Aubrey Li, Valentin Schneider, Mel Gorman,
    Pawan Gupta, Paolo Bonzini
Subject: Re: [RFC PATCH v2 00/17] Core scheduling v2
Message-ID: <20190509021144.GA24577@aaronlu>
In-Reply-To: <20190508174909.GA18516@sinkpad>
References: <20190423180238.GG22260@pauld.bos.csb>
 <20190423184527.6230-1-vpillai@digitalocean.com>
 <20190429035320.GB128241@aaronlu>
 <20190506193937.GA10264@sinkpad>
 <20190508023009.GA89792@aaronlu>
 <20190508174909.GA18516@sinkpad>

On Wed, May 08, 2019 at 01:49:09PM -0400, Julien Desfossez wrote:
> On 08-May-2019 10:30:09 AM, Aaron Lu wrote:
> > On Mon, May 06, 2019 at 03:39:37PM -0400, Julien Desfossez wrote:
> > > On 29-Apr-2019 11:53:21 AM, Aaron Lu wrote:
> > > > This is what I have used to make sure no two unmatched tasks are
> > > > scheduled on the same core: (on top of v1, I think it's easier to just
> > > > show the diff instead of commenting on various places of the patches :-)
> > >
> > > We imported this fix in v2 and made some small changes and optimizations
> > > (with and without Peter's fix from https://lkml.org/lkml/2019/4/26/658),
> > > and in both cases the performance problem where the core can end up
> >
> > By 'core', do you mean a logical CPU (hyperthread) or the entire core?
>
> No, I really meant the entire core.
>
> I'm sorry, I should have added a little bit more context. This relates
> to a performance issue we saw in v1 and discussed here:
> https://lore.kernel.org/lkml/20190410150116.GI2490@worktop.programming.kicks-ass.net/T/#mb9f1f54a99bac468fc5c55b06a9da306ff48e90b
>
> We proposed a fix that solved this, Peter came up with a better one
> (https://lkml.org/lkml/2019/4/26/658), but if we add your isolation fix
> as posted above, the same problem reappears. Hope this clarifies your
> ask.

It's clear now, thanks.

I don't immediately see how my isolation fix would make your fix stop
working; I will need to check. But I'm busy with other things, so it
will take a while.

> I hope that we did not miss anything crucial while integrating your fix
> on top of v2 + Peter's fix. The changes are conceptually similar, but we
> refactored it slightly to make the logic clear. Please have a look and
> let us know.

I suppose you already have a branch that has all the bits in it? I
wonder if you can share that branch somewhere, so I can start working on
top of it and make sure we are on the same page.

Also, it would be good if you could share the workload, cmdline options,
how many workers need to be started, etc. to reproduce this issue.

Thanks.
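
For context, the "isolation fix" discussed in this thread is about enforcing
the core-scheduling invariant that two tasks may run on SMT siblings of the
same core only when their core-scheduling cookies match. Below is a minimal,
standalone sketch of that invariant only; the task names, cookie values and
helper are made up for illustration and this is not the actual kernel patch.

/*
 * Sketch of the invariant: tasks may share a core iff their cookies match.
 * Cookie 0 is treated here as "untagged"; untagged tasks match each other.
 */
#include <stdbool.h>
#include <stdio.h>

struct task {
	const char *comm;          /* task name, for printing only */
	unsigned long core_cookie; /* 0 == untagged */
};

/* Hypothetical helper: allow co-scheduling only when cookies are equal. */
static bool cookies_match(const struct task *a, const struct task *b)
{
	return a->core_cookie == b->core_cookie;
}

int main(void)
{
	struct task vm1_vcpu0 = { "vm1-vcpu0", 0x1 };
	struct task vm1_vcpu1 = { "vm1-vcpu1", 0x1 };
	struct task vm2_vcpu0 = { "vm2-vcpu0", 0x2 };

	printf("%s + %s on one core: %s\n", vm1_vcpu0.comm, vm1_vcpu1.comm,
	       cookies_match(&vm1_vcpu0, &vm1_vcpu1) ? "allowed" : "forbidden");
	printf("%s + %s on one core: %s\n", vm1_vcpu0.comm, vm2_vcpu0.comm,
	       cookies_match(&vm1_vcpu0, &vm2_vcpu0) ? "allowed" : "forbidden");
	return 0;
}

The performance problem referenced above arises from how strictly this check
is enforced at pick time: if a sibling is forced idle whenever no matching
task exists, an entire core can end up underutilized, which is the behavior
Peter's fix and the refactored v2 changes were trying to address.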