Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp1041295ybi;
        Wed, 19 Jun 2019 12:20:37 -0700 (PDT)
X-Google-Smtp-Source: APXvYqwSCho3an0Z6J6D/NdJ6c9f4YqgKMrmELcMOOOcXTE+DBLP/O3WORUup9Erk2KkyUdU/gNy
X-Received: by 2002:a65:44c2:: with SMTP id g2mr9140197pgs.378.1560972037498;
        Wed, 19 Jun 2019 12:20:37 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1560972037; cv=none;
        d=google.com; s=arc-20160816;
        b=QSvYBQbTmo1XdYRa9QnySCF9Nb9cAoFZ7ZladZc9p2gkqQsnJ6V7x4bQEf1MKUFuTw
         luN4WJN01wD+SX3gG4jGmlYicj6dHS5L+cKTbS7HL9UYKGHT2mIZbYbOmOKor/9OKFTx
         MbjgHPV0pcqk+VRD/dN3Jaw4/LBmCGVQQr9Pg8EtJiy7YCMu68bOmODCFrQ2MEoVndWK
         razShWjurJLjI/Z34AxdmohteXCFlqDb3K60wEGZHWa6/tqgL+9nP9IcymGcJ2ddCPnj
         H9TWdKZDAM+hWxITZRnVKSmFtk8d34tELX4DwRwrZ7bvHCsUdBXA3wVDeJ8c+PH8pvW7
         Sv+g==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
        h=list-id:precedence:sender:date:subject:user-agent:message-id
         :references:cc:in-reply-to:from:to:content-transfer-encoding
         :mime-version;
        bh=oCZoWOC3JAx+45umZVSdvnlwIXsz+WL2+H3efyNYhng=;
        b=gxxoB0knAoqPdbsEsAo1z8MCnFvVQk4n+8MDqYqaXBnBG4S6yn26HyVCVj1nvumcy/
         j4BWuiL1OPagsqJpCPznhOSzJC5YsYFuCf1SYFLbK/QS5LHx5Mys8M9Hka8Dwpg2iogN
         UWfc8+2MFCYxh5pjrUb9o3J9d5A9+ktAnNlE9XYBOXMys+IUcouqar1o7C5bv+SUkrqr
         U0Ql/qew50YcQzeVEXYUxKlxde8WIRf6IG20cPWDtpiJMuvnC6Tt38322Si6lgiZy597
         KOsXBZ0I+Sbr2QAinXBNPGxDCBOkQ0iBP+BGO10svW0fsY9Uk3X3MGO2z2o+SaOpbb2i
         6SZQ==
ARC-Authentication-Results: i=1; mx.google.com;
        spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Return-Path:
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
        by mx.google.com with ESMTP id u6si3893861pga.360.2019.06.19.12.20.21;
        Wed, 19 Jun 2019 12:20:37 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67;
Authentication-Results: mx.google.com;
        spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1730312AbfFSTTp convert rfc822-to-8bit (ORCPT + 99 others);
        Wed, 19 Jun 2019 15:19:45 -0400
Received: from mail.fireflyinternet.com ([109.228.58.192]:63786 "EHLO
        fireflyinternet.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org
        with ESMTP id S1726449AbfFSTTp (ORCPT );
        Wed, 19 Jun 2019 15:19:45 -0400
X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138;
Received: from localhost (unverified [78.156.65.138])
        by fireflyinternet.com (Firefly Internet (M1)) with ESMTP (TLS) id 16957649-1500050
        for multiple; Wed, 19 Jun 2019 20:19:36 +0100
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8BIT
To: Linus Torvalds
From: Chris Wilson
In-Reply-To:
Cc: Linux List Kernel Mailing
References: <156094799629.21217.4574572565333265288@skylake-alporthouse-com>
Message-ID: <156097197830.664.13418742301997062555@skylake-alporthouse-com>
User-Agent: alot/0.6
Subject: Re: NMI hardlock stacktrace deadlock [was Re: Linux 5.2-rc5]
Date: Wed, 19 Jun 2019 20:19:38 +0100
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Quoting Linus Torvalds (2019-06-19 19:49:37)
> On Wed, Jun 19, 2019 at 5:40 AM Chris Wilson wrote:
> >
> > I haven't bisected this, but with the merge of rc5 into our CI we
> > started hitting an issue that resulted in a oops and the NMI watchdog
> > firing as we dumped the ftrace.
>
> Do you have the oops itself at all?

An example at
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6310/fi-kbl-x1275/dmesg0.log
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6310/fi-kbl-x1275/boot0.log

The bug causing the oops is clearly a driver problem. The rc5 fallout
just seems to be because of some shrinker changes affecting some object
reaping that were unfortunately still active.

What perturbed the CI team was the machine failed to panic & reboot.
-Chris