From: Julien Desfossez
To: Peter Zijlstra
Cc: Julien Desfossez, mingo@kernel.org, tglx@linutronix.de, pjt@google.com, tim.c.chen@linux.intel.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, subhra.mazumdar@oracle.com, fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Vineeth Pillai, Nishanth Aravamudan
Subject: Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access
Date: Fri, 22 Mar 2019 16:59:30 -0400
Message-Id: <1553288370-4167-1-git-send-email-jdesfossez@digitalocean.com>
In-Reply-To: <20190322133448.GT6058@hirez.programming.kicks-ass.net>
References: <20190322133448.GT6058@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 22, 2019 at 9:34 AM Peter Zijlstra wrote:
> On Thu, Mar 21, 2019 at 05:20:17PM -0400, Julien Desfossez wrote:
> > On further investigation, we could see that the contention is mostly in
> > the way rq locks are taken. With this patchset, we lock the whole core if
> > cpu.tag is set for at least one cgroup. Due to this, __schedule() is more
> > or less serialized for the core and that contributes to the performance
> > loss that we are seeing. We also saw that newidle_balance() takes a
> > considerably long time in load_balance() due to the rq spinlock
> > contention. Do you think it would help if the core-wide locking was only
> > performed when absolutely needed?
>
> Something like that could be done, but then you end up with 2 locks,
> something which I was hoping to avoid.
>
> Basically you keep rq->lock as it exists today, but add something like
> rq->core->core_lock, you then have to take that second lock (nested
> under rq->lock) for every scheduling action involving a tagged task.
>
> It makes things complicated though; because now my head hurts thinking
> about pick_next_task().
>
> (this can obviously do away with the whole rq->lock wrappery)
>
> Also, completely untested..

We tried it and it dies within 30ms of enabling the tag on 2 VMs :-)
Now after trying to debug this my head hurts as well!

We'll continue trying to figure this out, but if you want to take a look,
the full dmesg is here: https://paste.debian.net/plainh/0b8f87f3

Thanks,

Julien
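
For readers following the thread, the lock nesting described in the quoted
reply (keep rq->lock, take a per-core lock under it only for tagged tasks)
can be modelled in user space roughly as below. This is only a sketch under
stated assumptions, not the patch discussed above and not kernel code:
struct rq_model, struct core_model, schedule_one() and run_two_siblings()
are hypothetical names, pthread mutexes stand in for the kernel spinlocks,
and the pick_next_task() complexity mentioned in the thread is not modelled
at all.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model of the two-lock idea; not kernel code. */
struct core_model {
    pthread_mutex_t core_lock;   /* stands in for rq->core->core_lock */
    long tagged_picks;           /* core-wide state, guarded by core_lock */
};

struct rq_model {
    pthread_mutex_t lock;        /* stands in for rq->lock */
    struct core_model *core;     /* shared by both siblings of the core */
    long picks;                  /* per-runqueue state, guarded by lock */
};

enum { ROUNDS = 100000 };

/* One scheduling action: core_lock is taken only for a tagged task,
 * and always nested under the runqueue's own lock. */
static void schedule_one(struct rq_model *rq, bool tagged)
{
    pthread_mutex_lock(&rq->lock);
    if (tagged) {
        pthread_mutex_lock(&rq->core->core_lock);
        rq->core->tagged_picks++;
        pthread_mutex_unlock(&rq->core->core_lock);
    }
    rq->picks++;
    pthread_mutex_unlock(&rq->lock);
}

/* Each sibling alternates tagged and untagged scheduling actions. */
static void *sibling(void *arg)
{
    struct rq_model *rq = arg;

    for (long i = 0; i < ROUNDS; i++)
        schedule_one(rq, i % 2 == 0);
    return NULL;
}

/* Run two siblings sharing one core; returns the core-wide tagged count. */
static long run_two_siblings(void)
{
    struct core_model core = { .tagged_picks = 0 };
    struct rq_model rq[2];
    pthread_t t[2];

    pthread_mutex_init(&core.core_lock, NULL);
    for (int i = 0; i < 2; i++) {
        pthread_mutex_init(&rq[i].lock, NULL);
        rq[i].core = &core;
        rq[i].picks = 0;
    }
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, sibling, &rq[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    assert(rq[0].picks == ROUNDS && rq[1].picks == ROUNDS);
    return core.tagged_picks;
}
```

In this model untagged actions only ever touch the per-runqueue lock, so the
two siblings serialize against each other only on the tagged half of their
work. Because core_lock is always the innermost lock, the ordering stays
consistent across siblings.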