From: Alex Deucher
Date: Fri, 3 Jun 2022 11:49:31 -0400
Subject: Re: (REGRESSION bisected) Re: amdgpu errors (VM fault / GPU fault detected) with 5.19 merge window snapshots
To: Michal Kubecek
Cc: "Yang, Philip", amd-gfx list, Alex Deucher, Felix Kuehling, Christian König, LKML
References: <20220527090039.pdrazo5e6mwgo3d3@lion.mk-sys.cz> <20220527124459.mfo4tjdsjohamsvy@lion.mk-sys.cz> <20220602142254.2ck7dw7u3xlzdnt2@lion.mk-sys.cz>
In-Reply-To: <20220602142254.2ck7dw7u3xlzdnt2@lion.mk-sys.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 2, 2022 at 10:22 AM Michal Kubecek wrote:
>
> On Thu, Jun 02, 2022 at 09:58:22AM -0400, Alex Deucher wrote:
> > On Fri, May 27, 2022 at 8:58 AM Michal Kubecek wrote:
> > > On Fri, May 27, 2022 at 11:00:39AM +0200, Michal Kubecek wrote:
> > > > Hello,
> > > >
> > > > while testing 5.19 merge window snapshots (commits babf0bb978e3 and
> > > > 7e284070abe5), I keep getting errors like below. I have not seen them
> > > > with 5.18 final or older.
> > > >
> > > > ------------------------------------------------------------------------
> > > > [ 247.150333] gmc_v8_0_process_interrupt: 46 callbacks suppressed
> > > > [ 247.150336] amdgpu 0000:0c:00.0: amdgpu: GPU fault detected: 147 0x00020802 for process firefox pid 6101 thread firefox:cs0 pid 6116
> > > > [ 247.150339] amdgpu 0000:0c:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00107800
> > > > [ 247.150340] amdgpu 0000:0c:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0D008002
> > > > [ 247.150341] amdgpu 0000:0c:00.0: amdgpu: VM fault (0x02, vmid 6, pasid 32780) at page 1079296, write from 'TC2' (0x54433200) (8)
> > > [...]
> > > > [ 249.925909] amdgpu 0000:0c:00.0: amdgpu: IH ring buffer overflow (0x000844C0, 0x00004A00, 0x000044D0)
> > > > [ 250.434986] [drm] Fence fallback timer expired on ring sdma0
> > > > [ 466.621568] gmc_v8_0_process_interrupt: 122 callbacks suppressed
> > > [...]
> > > > ------------------------------------------------------------------------
> > > >
> > > > There does not seem to be any apparent immediate problem with graphics
> > > > but when running commit babf0bb978e3, there seemed to be a noticeable
> > > > lag in some operations, e.g. when moving a window or repainting large
> > > > part of the terminal window in konsole (no idea if it's related).
> > > >
> > > > My GPU is Radeon Pro WX 2100 (1002:6995). What other information should
> > > > I collect to help debugging the issue?
> > >
> > > Bisected to commit 5255e146c99a ("drm/amdgpu: rework TLB flushing").
> > > There seem to be later commits depending on it so I did not test
> > > a revert on top of current mainline.
> > >
> > > I should also mention that most commits tested as "bad" during the
> > > bisect did behave much worse than current mainline (errors starting as
> > > early as with sddm, visibly damaged screen content, sometimes even
> > > crashes). But all of them issued messages similar to those above into
> > > kernel log.
> >
> > Can you verify that the kernel you tested has this patch:
> > https://cgit.freedesktop.org/drm/drm/commit/?id=5be323562c6a699d38430bc068a3fd192be8ed0d
>
> Yes, both of them:
>
> mike@lion:~/work/git/kernel-upstream> git merge-base --is-ancestor 5be323562c6a babf0bb978e3 && echo yes
> yes
>
> (7e284070abe5 is a later mainline snapshot so it also contains
> 5be323562c6a)
>
> But it's likely that commit 5be323562c6a fixed most of the problem and
> only some corner case was left as most bisect steps had many more error
> messages and some even crashed before I was able to even log into KDE.
> Compared to that, the mainline snapshots show much fewer errors, no
> distorted picture and no crash; on the other hand, applications like
> firefox or stellarium seem to trigger the errors quite consistently.

This patch should help:
https://patchwork.freedesktop.org/patch/488258/

Alex

>
> Michal
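
Side note for anyone reading the quoted log: the numbers in the fault lines appear to be related in a simple way. The sketch below is only an illustration based on the values printed above, not driver code. It assumes that the "at page" number is VM_CONTEXT1_PROTECTION_FAULT_ADDR shown in decimal, and that the hex value after 'TC2' is the same name packed as four ASCII bytes; the variable names are made up for the example.

```python
# Values copied from the quoted kernel log lines.
fault_addr = 0x00107800   # VM_CONTEXT1_PROTECTION_FAULT_ADDR
client_id = 0x54433200    # hex value printed after 'TC2' in the VM fault line

# Assumption: the "at page" number is the fault address register in decimal.
page = fault_addr

# Assumption: the client ID holds four ASCII bytes, most significant byte
# first, padded with NUL bytes on the right.
client_name = client_id.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")

print(page, client_name)  # prints: 1079296 TC2
```

Both results match what the log reports ("at page 1079296" and 'TC2'), so the assumptions are at least consistent with this one fault.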