From: Tim Chen <tim.c.chen@linux.intel.com>
To: Subhra Mazumdar, Julien Desfossez, Peter Zijlstra, mingo@kernel.org, tglx@linutronix.de, pjt@google.com, torvalds@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Vineeth Pillai, Nishanth Aravamudan, Pawan Gupta, Aubrey
Subject: Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access
Date: Fri, 22 Mar 2019 16:28:39 -0700
Message-ID: <69e2eea0-51de-bdcd-cdda-ce5cd841786d@linux.intel.com>
In-Reply-To: <15f3f7e6-5dce-6bbf-30af-7cffbd7bb0c3@oracle.com>
References: <20190218173514.064516553@infradead.org> <1552923710-30933-1-git-send-email-jdesfossez@digitalocean.com> <15f3f7e6-5dce-6bbf-30af-7cffbd7bb0c3@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/19/19 7:29 PM, Subhra Mazumdar wrote:
>
> On 3/18/19 8:41 AM, Julien Desfossez wrote:
>> The case where we try to acquire the lock on 2 runqueues belonging to 2
>> different cores requires the rq_lockp wrapper as well otherwise we
>> frequently deadlock in there.
>>
>> This fixes the crash reported in
>> 1552577311-8218-1-git-send-email-jdesfossez@digitalocean.com
>>
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 76fee56..71bb71f 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -2078,7 +2078,7 @@ static inline void double_rq_lock(struct rq *rq1, struct rq *rq2)
>>           raw_spin_lock(rq_lockp(rq1));
>>           __acquire(rq2->lock);    /* Fake it out ;) */
>>       } else {
>> -        if (rq1 < rq2) {
>> +        if (rq_lockp(rq1) < rq_lockp(rq2)) {
>>               raw_spin_lock(rq_lockp(rq1));
>>               raw_spin_lock_nested(rq_lockp(rq2), SINGLE_DEPTH_NESTING);
>>           } else {

Pawan was seeing occasional crashes and lockups that are avoided by
the change below. We're adding more tracing to find out why
pick_next_entity() is returning NULL.

Tim

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5349ebedc645..4c7f353b8900 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7031,6 +7031,8 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
 		}
 
 		se = pick_next_entity(cfs_rq, curr);
+		if (!se)
+			return NULL;
 		cfs_rq = group_cfs_rq(se);
 	} while (cfs_rq);
 
@@ -7070,6 +7072,8 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
 
 	do {
 		se = pick_next_entity(cfs_rq, NULL);
+		if (!se)
+			return NULL;
 		set_next_entity(cfs_rq, se);
 		cfs_rq = group_cfs_rq(se);
 	} while (cfs_rq);
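
For anyone following the quoted double_rq_lock() hunk: it changes the
ordering key from the runqueue address to the lock address. Below is a
rough userspace sketch of why that can matter once sibling runqueues
share one lock. Everything in it is an assumption made for illustration
and not kernel code: the demo_* names, the four-runqueue layout
(a0, b0, a1, b1 in memory, with a1 using a0's lock and b1 using b0's),
and the atomic_flag spinlock standing in for raw_spinlock_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct demo_lock {
	atomic_flag held;
};

struct demo_rq {
	struct demo_lock own_lock;	/* per-runqueue lock */
	struct demo_lock *core_lock;	/* non-NULL: lock shared with a sibling */
};

/* Analogue of rq_lockp(): the lock that actually protects this runqueue. */
static struct demo_lock *demo_rq_lockp(struct demo_rq *rq)
{
	return rq->core_lock ? rq->core_lock : &rq->own_lock;
}

static void demo_lock_acquire(struct demo_lock *l)
{
	while (atomic_flag_test_and_set(&l->held))
		;	/* spin until the flag was previously clear */
}

static void demo_lock_release(struct demo_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* Old rule: the lock taken first when ordering by runqueue address. */
static struct demo_lock *first_by_rq_addr(struct demo_rq *a, struct demo_rq *b)
{
	return (uintptr_t)a < (uintptr_t)b ? demo_rq_lockp(a) : demo_rq_lockp(b);
}

/* New rule: the lock taken first when ordering by lock address. */
static struct demo_lock *first_by_lock_addr(struct demo_rq *a, struct demo_rq *b)
{
	struct demo_lock *la = demo_rq_lockp(a), *lb = demo_rq_lockp(b);

	return (uintptr_t)la < (uintptr_t)lb ? la : lb;
}

/* Take both locks in lock-address order; take it once if it is shared. */
static void demo_double_rq_lock(struct demo_rq *a, struct demo_rq *b)
{
	struct demo_lock *la = demo_rq_lockp(a), *lb = demo_rq_lockp(b);

	if (la == lb) {
		demo_lock_acquire(la);
		return;
	}
	struct demo_lock *first = first_by_lock_addr(a, b);
	demo_lock_acquire(first);
	demo_lock_acquire(first == la ? lb : la);
}

static void demo_double_rq_unlock(struct demo_rq *a, struct demo_rq *b)
{
	struct demo_lock *la = demo_rq_lockp(a), *lb = demo_rq_lockp(b);

	demo_lock_release(la);
	if (la != lb)
		demo_lock_release(lb);
}

/* Memory order a0, b0, a1, b1; a1 shares a0's lock, b1 shares b0's. */
static void demo_setup(struct demo_rq rqs[4])
{
	for (int i = 0; i < 4; i++) {
		atomic_flag_clear(&rqs[i].own_lock.held);
		rqs[i].core_lock = NULL;
	}
	rqs[2].core_lock = &rqs[0].own_lock;
	rqs[3].core_lock = &rqs[1].own_lock;
}
```

In this layout, ordering by runqueue address makes the pair (a0, b1)
start with core A's lock while the pair (b0, a1) starts with core B's
lock, which is the classic ABBA pattern. Ordering by lock address gives
both pairs the same first lock. Whether this is exactly the deadlock
Julien hit is not something the sketch can confirm.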