From: Alex Deucher
Date: Thu, 29 Dec 2022 12:31:03 -0500
Subject: Re: [PATCH v2 00/11] Recover from failure to probe GPU
To: Mario Limonciello
Cc: Javier Martinez Canillas, Alex Deucher, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, Carlos Soriano Sanchez, christian.koenig@amd.com
References: <20221228163102.468-1-mario.limonciello@amd.com>
In-Reply-To: <20221228163102.468-1-mario.limonciello@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Patches 1-10 are:
Reviewed-by: Alex Deucher

On Wed, Dec 28, 2022 at 11:31 AM Mario Limonciello wrote:
>
> One of the first things that KMS drivers do during initialization is
> destroy the system firmware framebuffer by means of
> `drm_aperture_remove_conflicting_pci_framebuffers`
>
> This means that if for any reason the GPU failed to probe the user
> will be stuck with at best a screen frozen at the last thing that
> was shown before the KMS driver continued its probe.
>
> The problem is most pronounced when new GPU support is introduced
> because users will need to have a recent linux-firmware snapshot
> on their system when they boot a kernel with matching support.
>
> However the problem is further exacerbated in the case of amdgpu because
> it has migrated to "IP discovery" where amdgpu will attempt to load
> on "ALL" AMD GPUs even if the driver is missing support for IP blocks
> contained in that GPU.
>
> IP discovery requires some probing and isn't run until after the
> framebuffer has been destroyed.
>
> This means a situation can occur where a user purchases a new GPU not
> yet supported by a distribution and when booting the installer it will
> "freeze" even if the distribution doesn't have the matching kernel support
> for those IP blocks.
>
> The perfect example of this is Ubuntu 22.10 and the new dGPUs just
> launched by AMD. The installation media ships with kernel 5.19 (which
> has IP discovery) but the amdgpu support for those IP blocks landed in
> kernel 6.0. The matching linux-firmware was released after 22.10's launch.
> The screen will freeze without nomodeset. Even if a user manages to install
> and then upgrades to kernel 6.0 after install they'll still have the
> problem of missing firmware, and the same experience.
>
> This is quite jarring for users, particularly if they don't know
> that they have to use "nomodeset" to install.
>
> To help the situation, make changes to GPU discovery:
> 1) Delay releasing the firmware framebuffer until after IP discovery has
> completed. This will help the situation of an older kernel that doesn't
> yet support the IP blocks probing a new GPU.
> 2) Request loading all PSP, VCN, SDMA, MES and GC microcode into memory
> during IP discovery. This will help the situation of a new enough kernel for
> the IP discovery phase to otherwise pass but missing microcode from
> linux-firmware.git.
>
> Not all requested firmware will be loaded during IP discovery as some of it
> will require larger driver architecture changes. For example SMU firmware
> isn't loaded on certain products, but that's not known until later on when
> the early_init phase of the SMU load occurs.
>
> v1->v2:
>  * Take the suggestion from v1 thread to delay the framebuffer release until
>    IP discovery is done. This patch is CC'd to stable so that older stable
>    kernels with IP discovery won't try to probe unknown IP.
>  * Drop changes to drm aperture.
>  * Fetch SDMA, VCN, MES, GC and PSP microcode during IP discovery.
>
> Mario Limonciello (11):
>   drm/amd: Delay removal of the firmware framebuffer
>   drm/amd: Add a legacy mapping to "amdgpu_ucode_ip_version_decode"
>   drm/amd: Convert SMUv11 microcode init to use
>     `amdgpu_ucode_ip_version_decode`
>   drm/amd: Convert SMU v13 to use `amdgpu_ucode_ip_version_decode`
>   drm/amd: Request SDMA microcode during IP discovery
>   drm/amd: Request VCN microcode during IP discovery
>   drm/amd: Request MES microcode during IP discovery
>   drm/amd: Request GFX9 microcode during IP discovery
>   drm/amd: Request GFX10 microcode during IP discovery
>   drm/amd: Request GFX11 microcode during IP discovery
>   drm/amd: Request PSP microcode during IP discovery
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |   8 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 590 +++++++++++++++++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |   6 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c       |   2 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c      |   9 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h      |   2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c     | 208 ++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c       |  85 +--
>  drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c        | 180 +-----
>  drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c        |  64 +-
>  drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c         | 143 +----
>  drivers/gpu/drm/amd/amdgpu/mes_v10_1.c        |  28 -
>  drivers/gpu/drm/amd/amdgpu/mes_v11_0.c        |  25 +-
>  drivers/gpu/drm/amd/amdgpu/psp_v10_0.c        | 106 +---
>  drivers/gpu/drm/amd/amdgpu/psp_v11_0.c        | 165 +----
>  drivers/gpu/drm/amd/amdgpu/psp_v12_0.c        | 102 +--
>  drivers/gpu/drm/amd/amdgpu/psp_v13_0.c        |  82 ---
>  drivers/gpu/drm/amd/amdgpu/psp_v13_0_4.c      |  36 --
>  drivers/gpu/drm/amd/amdgpu/psp_v3_1.c         |  36 --
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c        |  61 +-
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c        |  42 +-
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c        |  65 +-
>  drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c        |  30 +-
>  .../gpu/drm/amd/pm/swsmu/smu11/smu_v11_0.c    |  35 +-
>  .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0.c    |  12 +-
>  25 files changed, 919 insertions(+), 1203 deletions(-)
>
>
> base-commit: de9a71e391a92841582ca3008e7b127a0b8ccf41
> --
> 2.34.1
>
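
For anyone skimming the thread, the ordering change in items 1) and 2) of the cover letter can be illustrated with a small userspace C model. This is only a sketch and not the driver code: `struct fake_gpu`, `fake_ip_discovery`, `fake_release_fw_framebuffer`, `probe_old` and `probe_new` are hypothetical names made up for illustration, and the choice of `-ENODEV` and `-ENOENT` as error codes is an assumption rather than something taken from the series.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a GPU being probed; not a real amdgpu structure. */
struct fake_gpu {
	bool ip_blocks_known;      /* does this kernel recognise every IP block? */
	bool firmware_present;     /* is the needed microcode available? */
	bool fw_framebuffer_alive; /* is the firmware framebuffer still usable? */
};

/*
 * Stand-in for IP discovery plus the early microcode requests.
 * Fails if the kernel lacks IP block support or the firmware is missing.
 */
static int fake_ip_discovery(const struct fake_gpu *gpu)
{
	if (!gpu->ip_blocks_known)
		return -ENODEV; /* assumed error code, for illustration only */
	if (!gpu->firmware_present)
		return -ENOENT; /* assumed error code, for illustration only */
	return 0;
}

/* Stand-in for tearing down the firmware framebuffer. */
static void fake_release_fw_framebuffer(struct fake_gpu *gpu)
{
	gpu->fw_framebuffer_alive = false;
}

/* Old ordering: the framebuffer is destroyed before discovery can fail. */
static int probe_old(struct fake_gpu *gpu)
{
	fake_release_fw_framebuffer(gpu);
	return fake_ip_discovery(gpu);
}

/* New ordering: the framebuffer is released only once discovery succeeded. */
static int probe_new(struct fake_gpu *gpu)
{
	int ret = fake_ip_discovery(gpu);

	if (ret)
		return ret; /* firmware framebuffer is left untouched */

	fake_release_fw_framebuffer(gpu);
	return 0;
}
```

In this model, a GPU with unknown IP blocks or missing firmware fails in both orderings, but only `probe_new` leaves `fw_framebuffer_alive` set, which corresponds to the user keeping a working display instead of a frozen screen.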