From: Karol Herbst
Date: Fri, 17 Jul 2020 02:43:52 +0200
Subject: Re: nouveau regression with 5.7 caused by "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
To: Bjorn Helgaas
Cc: Linux PCI, Mika Westerberg, Ben Skeggs, Bjorn Helgaas, Lyude Paul, nouveau, dri-devel, Patrick Volkerding, LKML, Kai-Heng Feng, Sasha Levin
In-Reply-To: <20200716235440.GA675421@bjorn-Precision-5520>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 17, 2020 at 1:54 AM Bjorn Helgaas wrote:
>
> [+cc Sasha -- stable kernel regression]
> [+cc Patrick, Kai-Heng, LKML]
>
> On Fri, Jul 17, 2020 at 12:10:39AM +0200, Karol Herbst wrote:
> > On Tue, Jul 7, 2020 at 9:30 PM Karol Herbst wrote:
> > >
> > > Hi everybody,
> > >
> > > with the mentioned commit Nouveau isn't able to load firmware onto the
> > > GPU on one of my systems here.
> > > Even though the issue doesn't always
> > > happen I am quite confident this is the commit breaking it.
> > >
> > > I am still digging into the issue and trying to figure out what
> > > exactly breaks, but it shows up in different ways. Either we are not
> > > able to boot the engines on the GPU or the GPU becomes unresponsive.
> > > Btw, this is also a system where our runtime power management issue
> > > shows up, so maybe there is indeed something funky with the bridge
> > > controller.
> > >
> > > Just pinging you in case you have an idea on how this could break Nouveau
> > >
> > > most of the times it shows up like this:
> > > nouveau 0000:01:00.0: acr: AHESASC binary failed
> > >
> > > Sometimes it works at boot and fails at runtime resuming with random
> > > faults. So I will be investigating a bit more, but yeah... I am super
> > > sure the commit triggered this issue, no idea if it actually causes
> > > it.
> >
> > so yeah.. I reverted that locally and never ran into issues again.
> > Still valid on latest 5.7. So can we get this reverted or properly
> > fixed? This breaks runtime pm for us on at least some hardware.
>
> Yeah, that stinks. We had another similar report from Patrick:
>
> https://lore.kernel.org/r/CAErSpo5sTeK_my1dEhWp7aHD0xOp87+oHYWkTjbL7ALgDbXo-Q@mail.gmail.com
>
> Apparently the problem is ec411e02b7a2 ("PCI/PM: Assume ports without
> DLL Link Active train links in 100 ms"), which Patrick found was
> backported to v5.4.49 as 828b192c57e8, and you found was backported to
> v5.7.6 as afaff825e3a4.
>
> Oddly, Patrick reported that v5.7.7 worked correctly, even though it
> still contains afaff825e3a4.
>
> I guess in the absence of any other clues we'll have to revert it.
> I hate to do that because that means we'll have slow resume of
> Thunderbolt-connected devices again, but that's better than having
> GPUs completely broken.
>
> Could you and Patrick open bugzilla.kernel.org reports, attach dmesg
> logs and "sudo lspci -vv" output, and add the URLs to Kai-Heng's
> original report at https://bugzilla.kernel.org/show_bug.cgi?id=206837
> and to this thread?
>
> There must be a way to fix the slow resume problem without breaking
> the GPUs.
>

I wouldn't be surprised if this is related to the Intel bridge we
check against for Nouveau. I still have to check on another laptop with
the same bridge, where our workaround was required as well, but I
wouldn't be surprised if it shows the same problem.

Will get you the information from both systems tomorrow then.

> Bjorn
>
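
For anyone collecting the dmesg logs requested above, here is a minimal sketch of how a saved log could be checked for the failure quoted in this thread. The helper name `find_acr_failures` and the sample log line (including its timestamp) are made up for illustration. Only the message text "acr: AHESASC binary failed" comes from the report, and the pattern assumes the message always has that exact form.

```python
import re
from typing import List

# Hypothetical helper, not part of any kernel or nouveau tooling.
# The pattern mirrors the message quoted in the report:
#   nouveau 0000:01:00.0: acr: AHESASC binary failed
ACR_FAILURE = re.compile(r"nouveau (\S+): acr: AHESASC binary failed")


def find_acr_failures(dmesg_text: str) -> List[str]:
    """Return the PCI address from every matching failure line."""
    return [match.group(1) for match in ACR_FAILURE.finditer(dmesg_text)]


if __name__ == "__main__":
    # Illustrative sample only; the timestamp prefix is invented.
    sample = "[   12.345678] nouveau 0000:01:00.0: acr: AHESASC binary failed\n"
    print(find_acr_failures(sample))
```

In practice the input would be the text of a saved dmesg log rather than a sample string. An empty result only means this particular message did not appear, not that the GPU came up correctly.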