Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp500152ybh; Sat, 18 Jul 2020 10:31:49 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxbS+gCsXWrOXLggDhs1ZnHEKMckIEmOa64aTWldodcL/WPXnzSONjVihj6zJWaamaiFqSy X-Received: by 2002:a17:906:d106:: with SMTP id b6mr14242675ejz.125.1595093509603; Sat, 18 Jul 2020 10:31:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595093509; cv=none; d=google.com; s=arc-20160816; b=gmPhSBMH+MbB09nCeJdAKIjlbXy+S16ANI3eqGcply3kPr/09cvidUFwAd+r4OWWEh PYOI5nWi8iJFG9Fw3Q1gyQ2n/jxKxgESbWDxeQ/91sfGRE+bWKQospNLKFCWfndiwweV 826jBeJ/ngrVo+8R1B/VMeScbRiQDwTMoXv+UJaoyxJG20+sktfaSos+T5FSfAmA8JB5 HafHyqr2L9K9GKFyBXFXLRfgHwRtUULheBsWSkZc174zDKdpiOEoUr+nZEpSkEXALIsL SGi52t5/npaGOy0A5owSK/3wpA+Wg197TaX5bkvwiwIeGIEB3ckt7fYdesGgCwbk8roH AC5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=JMAp/sDYH2ueTLJSKr7aYQ3kPtyH5wY8UTV+mJwXCSg=; b=fpKwghlZ54W1mRbhrE802J51MXc3jMxd6WhMlZWhSdZ4j0exgCa+8fCLkoUlfJkTQG yCVE7Y7+D5B2sQ24lNy5fG5IZGe4M15WEXB6iPjnTy76AdOb7qK7bXYt3AsZhactUrYT 0ut+uNH/pAn1DpX1obqIzgIi7HTQ8FziMv/g9QG94vryLaE6BVg0lbFGO8zbCE7cTWNN W5UWniiQxNV1fPM7jod2Rpex9R7B4dA0w6jkH0uBmFp7S+09PYxhp08Txw4DirLCQwzp LKxv5vXx5LWxL/8IankwJwnCxLAXAucmCCPJ0mJg50kaVzFAEu/MxMgiCJxopJMtjSzI 3atg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=bTpWNY2M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id ef10si7168029ejb.519.2020.07.18.10.31.09; Sat, 18 Jul 2020 10:31:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=bTpWNY2M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727853AbgGRRFN (ORCPT + 99 others); Sat, 18 Jul 2020 13:05:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726604AbgGRRFM (ORCPT ); Sat, 18 Jul 2020 13:05:12 -0400 Received: from mail-qv1-xf41.google.com (mail-qv1-xf41.google.com [IPv6:2607:f8b0:4864:20::f41]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9EE64C0619D2 for ; Sat, 18 Jul 2020 10:05:12 -0700 (PDT) Received: by mail-qv1-xf41.google.com with SMTP id o2so3117490qvk.6 for ; Sat, 18 Jul 2020 10:05:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=JMAp/sDYH2ueTLJSKr7aYQ3kPtyH5wY8UTV+mJwXCSg=; b=bTpWNY2Mxn8TGfdXtFJXcME+IGrzur5jfEsZVxSjrWgxJeAK1Q8A+Ol5jvLhiHfYun OH3leANhpPB+GgSjpPuUVVt9AsApzQpdj1UqQ8Uv9q+vRzpibG02G1oYIuqh02pKyOqm /Lfxk/60gSN0JMycBrcfm3hozkf1AvQV7IHT8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=JMAp/sDYH2ueTLJSKr7aYQ3kPtyH5wY8UTV+mJwXCSg=; b=et2ZlFGpNBrOXHg7BjnA8tCTts0TV/8A90nHQAmADJiov8aveS9d/hVEl1KbCjbzsL T7fYU2nCESx8VkRXjufaerS5wkUhfD1TB+7ld7OmgFgn7XkO376KomDm1VWwjAP5r/yK hF9Un9LbABLYtALIAKfxrWKePIrXCBIuDC8Z3GVHZJ9Ox4UMqqGzyEwVq/EA6cTjzGZW ac88LajlbZm3ifGJ3EMqcGS6xMdFwfEi48cxKN1KaWiL5CojMGtnOsDZ/ly763VL4wTj 9o39mahSwEOU0/YDz9ZpY8J6WDTobTKJr/1NgAmEzjWxElo0ZYpAl9T0ftjm3jJfVPw6 FzNA== X-Gm-Message-State: AOAM530/nkrHWO9r8cgYmIJb4I7T9QCGzZlBD/Spe2cjS9p+05uyAcAf tuh+9W+RuIoDd7ZRGP+otvyoiw== X-Received: by 2002:a0c:e008:: with SMTP id j8mr14546159qvk.87.1595091911589; Sat, 18 Jul 2020 10:05:11 -0700 (PDT) Received: from localhost ([2620:15c:6:12:cad3:ffff:feb3:bd59]) by smtp.gmail.com with ESMTPSA id i20sm1188748qka.17.2020.07.18.10.05.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 18 Jul 2020 10:05:10 -0700 (PDT) Date: Sat, 18 Jul 2020 13:05:10 -0400 From: Joel Fernandes To: Thomas Gleixner Cc: "Li, Aubrey" , Vineeth Remanan Pillai , Nishanth Aravamudan , Julien Desfossez , Peter Zijlstra , Tim Chen , mingo@kernel.org, pjt@google.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, subhra.mazumdar@oracle.com, fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Phil Auld , Aaron Lu , Aubrey Li , Valentin Schneider , Mel Gorman , Pawan Gupta , Paolo Bonzini , vineethrp@gmail.com, Chen Yu , Christian Brauner , Tim Chen , "Paul E . McKenney" Subject: Re: [RFC PATCH 14/16] irq: Add support for core-wide protection of IRQ and softirq Message-ID: <20200718170510.GA2162553@google.com> References: <87y2nhpvap.fsf@nanos.tec.linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87y2nhpvap.fsf@nanos.tec.linutronix.de> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Jul 18, 2020 at 01:37:50AM +0200, Thomas Gleixner wrote: > "Li, Aubrey" writes: > > On 2020/7/1 5:32, Vineeth Remanan Pillai wrote: > > > > We saw a lot of soft lockups on the screen when we tested v6. > > > > [ 186.527883] watchdog: BUG: soft lockup - CPU#86 stuck for 22s! [uperf:5551] > > [ 186.535884] watchdog: BUG: soft lockup - CPU#87 stuck for 22s! [uperf:5444] > > [ 186.555883] watchdog: BUG: soft lockup - CPU#89 stuck for 22s! [uperf:5547] > > [ 187.547884] rcu: INFO: rcu_sched self-detected stall on CPU > > [ 187.553760] rcu: 40-....: (14997 ticks this GP) idle=49a/1/0x4000000000000002 softirq=1711/1711 fqs=7279 > > [ 187.564685] NMI watchdog: Watchdog detected hard LOCKUP on cpu 14 > > [ 187.564723] NMI watchdog: Watchdog detected hard LOCKUP on cpu 38 > > > > The problem is gone when we reverted this patch. We are running multiple > > uperf threads(equal to cpu number) in a cgroup with coresched enabled. > > This is 100% reproducible on our side. > > ROTFL. I just predicted that from staring at the patch .... Yes, sorry. It got fixed in the below patch which I was about to share before you sent this email. However, it still has (not yet) addressed the new comments from you we have to discuss (will reply to that email). Its purely a dirty test patch but it makes Aubrey's deadlock go away. Basically, I moved the waiting code to prepare_exit_to_usermode() which removes the lock vs wait dependencies. I will work on moving the code to the right place based on the suggestions in the other email. https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=sched/core-sched-fix-irq-deadlocks-v1&id=8857a261f3305887b063001c6c899869206667b6 thanks, - Joel