From: Jan H. Schönherr
Subject: [RFC 00/60] Coscheduling for Linux
To: Nishanth Aravamudan
Cc: Ingo Molnar, Peter Zijlstra, linux-kernel@vger.kernel.org
References: <20180907214047.26914-1-jschoenh@amazon.de> <20180912002449.GA21797@breakout>
Message-ID: <89b4f0cd-d324-14bd-3991-576de9849e34@amazon.de>
Date: Thu, 13 Sep 2018 01:18:14 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/12/2018 09:34 PM, Jan H. Schönherr wrote:
> That said, I see a hang, too. It seems to happen, when there is a
> cpu.scheduled!=0 group that is not a direct child of the root task group.
> You seem to have "/sys/fs/cgroup/cpu/machine" as an intermediate group.
> (The case ==0 within !=0 within the root task group works for me.)
>
> I'm going to dive into the code.

With the patch below (which technically changes patch 55/60), the hang I
experienced is gone. Please let me know if it works for you as well.

Regards
Jan

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8da2033596ff..2d8b3f9a275f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7189,23 +7189,26 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
 		while (!(cfs_rq = is_same_group(se, pse))) {
 			int se_depth = se->depth;
 			int pse_depth = pse->depth;
+			bool work = false;
 
-			if (se_depth <= pse_depth && leader_of(pse) == cpu) {
+			if (se_depth <= pse_depth && __leader_of(pse) == cpu) {
 				put_prev_entity(cfs_rq_of(pse), pse);
 				pse = parent_entity(pse);
+				work = true;
 			}
-			if (se_depth >= pse_depth && leader_of(se) == cpu) {
+			if (se_depth >= pse_depth && __leader_of(se) == cpu) {
 				set_next_entity(cfs_rq_of(se), se);
 				se = parent_entity(se);
+				work = true;
 			}
-			if (leader_of(pse) != cpu && leader_of(se) != cpu)
+			if (!work)
 				break;
 		}
 
-		if (leader_of(pse) == cpu)
-			put_prev_entity(cfs_rq, pse);
-		if (leader_of(se) == cpu)
-			set_next_entity(cfs_rq, se);
+		if (__leader_of(pse) == cpu)
+			put_prev_entity(cfs_rq_of(pse), pse);
+		if (__leader_of(se) == cpu)
+			set_next_entity(cfs_rq_of(se), se);
 	}
 
 	goto done;
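
The effect of the new `work` flag on loop termination can be seen in a small user-space model. This is only a sketch under loud assumptions: `struct toy_se`, its `leader` field, `walk()`, `demo_stuck_case()` and the iteration cap are all hypothetical stand-ins, not kernel code, and the mail does not say that this is the exact scenario behind the reported hang. The model only shows that the old exit test can leave the loop spinning when neither branch makes progress, whereas "break if no work was done" always terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a scheduling entity; NOT the kernel's struct sched_entity. */
struct toy_se {
	int depth;             /* distance from the root of the hierarchy */
	int leader;            /* CPU leading this entity (hypothetical field) */
	struct toy_se *parent; /* NULL for the root */
};

enum { MAX_ITERATIONS = 64 }; /* cap so the model cannot spin forever */

/*
 * Walk se and pse upwards until they meet, mimicking the shape of the loop
 * in the diff. Returns the number of iterations, or -1 if the cap is hit
 * (where a loop without a cap would hang).
 */
static int walk(struct toy_se *se, struct toy_se *pse, int cpu, bool patched)
{
	int iterations = 0;

	while (se != pse) {
		int se_depth = se->depth;
		int pse_depth = pse->depth;
		bool work = false;

		if (++iterations > MAX_ITERATIONS)
			return -1;

		if (se_depth <= pse_depth && pse->leader == cpu) {
			pse = pse->parent;
			work = true;
		}
		if (se_depth >= pse_depth && se->leader == cpu) {
			se = se->parent;
			work = true;
		}
		assert(se && pse); /* both chains share a root in this model */

		/* patched: stop when nothing moved; old: stop when neither is led by cpu */
		if (patched ? !work : (pse->leader != cpu && se->leader != cpu))
			break;
	}
	return iterations;
}

/*
 * se is shallower and led by this CPU, pse is deeper and led by another CPU:
 * neither branch fires, and the old exit test does not fire either.
 */
static int demo_stuck_case(bool patched)
{
	struct toy_se root = { 0, 0, NULL };
	struct toy_se a = { 1, 0, &root };     /* se: depth 1, led by CPU 0 */
	struct toy_se b_mid = { 1, 1, &root };
	struct toy_se b = { 2, 1, &b_mid };    /* pse: depth 2, led by CPU 1 */

	return walk(&a, &b, 0, patched);
}
```

In this model, `demo_stuck_case(false)` runs into the iteration cap, while `demo_stuck_case(true)` leaves the loop after a single pass.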