From: Alex Deucher
Date: Mon, 24 Jan 2022 12:04:18 -0500
Subject: Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
To: James Turner
Cc: "Lazar, Lijo", Thorsten Leemhuis, "Deucher, Alexander",
    "regressions@lists.linux.dev", "kvm@vger.kernel.org", Greg KH,
    "Pan, Xinhui", LKML, "amd-gfx@lists.freedesktop.org",
    Alex Williamson, "Koenig, Christian"
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jan 22, 2022 at 4:38 PM James Turner wrote:
>
> Hi Lijo,
>
> > Could you provide the pp_dpm_* values in sysfs with and without the
> > patch? Also, could you try forcing PCIE to gen3 (through pp_dpm_pcie)
> > if it's not in gen3 when the issue happens?
>
> AFAICT, I can't access those values while the AMD GPU PCI devices are
> bound to `vfio-pci`. However, I can at least access the link speed and
> width elsewhere in sysfs. So, I gathered what information I could for
> two different cases:
>
> - With the PCI devices bound to `vfio-pci`. With this configuration, I
>   can start the VM, but the `pp_dpm_*` values are not available since
>   the devices are bound to `vfio-pci` instead of `amdgpu`.
>
> - Without the PCI devices bound to `vfio-pci` (i.e. after removing the
>   `vfio-pci.ids=...` kernel command line argument). With this
>   configuration, I can access the `pp_dpm_*` values, since the PCI
>   devices are bound to `amdgpu`. However, I cannot use the VM. If I try
>   to start the VM, the display (both the external monitors attached to
>   the AMD GPU and the built-in laptop display attached to the Intel
>   iGPU) completely freezes.
>
> The output shown below was identical for both the good commit:
> f1688bd69ec4 ("drm/amd/amdgpu:save psp ring wptr to avoid attack")
> and the commit which introduced the issue:
> f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")
>
> Note that the PCI link speed increased to 8.0 GT/s when the GPU was
> under heavy load for both versions, but the clock speeds of the GPU were
> different under load. (For the good commit, it was 1295 MHz; for the bad
> commit, it was 501 MHz.)
>

Are the ATIF and ATCS ACPI methods available in the guest VM? They are
required for this platform to work correctly from a power standpoint.
One thing that f9b7f3703ff9 did was to get those ACPI methods executed
on certain platforms where they had not been previously due to a bug in
the original implementation. If the windows driver doesn't interact
with them, it could cause performance issues. It may have worked by
accident before because the ACPI interfaces may not have been called,
leading the windows driver to believe this was a standalone dGPU
rather than one integrated into a power/thermal limited platform.

Alex

>
> # With the PCI devices bound to `vfio-pci`
>
> ## Before starting the VM
>
> % ls /sys/module/amdgpu/drivers/pci:amdgpu
> module bind new_id remove_id uevent unbind
>
> % find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
> /sys/bus/pci/devices/0000:01:00.0/current_link_width
> 8
> /sys/bus/pci/devices/0000:01:00.0/current_link_speed
> 8.0 GT/s PCIe
>
> ## While running the VM, before placing the AMD GPU under heavy load
>
> % find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
> /sys/bus/pci/devices/0000:01:00.0/current_link_width
> 8
> /sys/bus/pci/devices/0000:01:00.0/current_link_speed
> 2.5 GT/s PCIe
>
> ## While running the VM, with the AMD GPU under heavy load
>
> % find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
> /sys/bus/pci/devices/0000:01:00.0/current_link_width
> 8
> /sys/bus/pci/devices/0000:01:00.0/current_link_speed
> 8.0 GT/s PCIe
>
> ## While running the VM, after stopping the heavy load on the AMD GPU
>
> % find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
> /sys/bus/pci/devices/0000:01:00.0/current_link_width
> 8
> /sys/bus/pci/devices/0000:01:00.0/current_link_speed
> 2.5 GT/s PCIe
>
> ## After stopping the VM
>
> % find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
> /sys/bus/pci/devices/0000:01:00.0/current_link_width
> 8
> /sys/bus/pci/devices/0000:01:00.0/current_link_speed
> 2.5 GT/s PCIe
>
>
> # Without the PCI devices bound to `vfio-pci`
>
> % ls /sys/module/amdgpu/drivers/pci:amdgpu
> 0000:01:00.0 module bind new_id remove_id uevent unbind
>
> % for f in /sys/module/amdgpu/drivers/pci:amdgpu/*/pp_dpm_*; do echo "$f"; cat "$f"; echo; done
> /sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_mclk
> 0: 300Mhz
> 1: 625Mhz
> 2: 1500Mhz *
>
> /sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_pcie
> 0: 2.5GT/s, x8
> 1: 8.0GT/s, x16 *
>
> /sys/module/amdgpu/drivers/pci:amdgpu/0000:01:00.0/pp_dpm_sclk
> 0: 214Mhz
> 1: 501Mhz
> 2: 850Mhz
> 3: 1034Mhz
> 4: 1144Mhz
> 5: 1228Mhz
> 6: 1275Mhz
> 7: 1295Mhz *
>
> % find /sys/bus/pci/devices/0000:01:00.0/ -type f -name 'current_link*' -print -exec cat {} \;
> /sys/bus/pci/devices/0000:01:00.0/current_link_width
> 8
> /sys/bus/pci/devices/0000:01:00.0/current_link_speed
> 8.0 GT/s PCIe
>
>
> James
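
For anyone repeating this comparison on the host, the `pp_dpm_*` listings quoted above can be reduced to "which level is currently selected" with a few lines of Python. This is only a minimal sketch: it assumes the line layout shown in the quoted output (`index: value`, with a trailing `*` marking the selected level), and the function name `active_level` and the sample constants are hypothetical helpers, not part of any amdgpu tooling.

```python
import re
from typing import Optional, Tuple

# Matches lines such as "7: 1295Mhz *" or "1: 8.0GT/s, x16 *".
# Group 1 = level index, group 2 = level description, group 3 = "*" if selected.
_LEVEL_RE = re.compile(r"^\s*(\d+):\s*(.+?)\s*(\*)?\s*$")


def active_level(pp_dpm_text: str) -> Optional[Tuple[int, str]]:
    """Return (index, description) of the level marked with '*', or None.

    Hypothetical helper: it only parses text in the layout quoted in the
    email; it does not read sysfs itself.
    """
    for line in pp_dpm_text.splitlines():
        match = _LEVEL_RE.match(line)
        if match and match.group(3):
            return int(match.group(1)), match.group(2)
    return None


# Sample text copied from the pp_dpm_sclk and pp_dpm_pcie output in the email.
SCLK_SAMPLE = """\
0: 214Mhz
1: 501Mhz
2: 850Mhz
3: 1034Mhz
4: 1144Mhz
5: 1228Mhz
6: 1275Mhz
7: 1295Mhz *
"""

PCIE_SAMPLE = """\
0: 2.5GT/s, x8
1: 8.0GT/s, x16 *
"""

if __name__ == "__main__":
    print("sclk:", active_level(SCLK_SAMPLE))
    print("pcie:", active_level(PCIE_SAMPLE))
```

On a host where the device is bound to `amdgpu`, the same function could be fed the contents of a `pp_dpm_sclk` file read from the sysfs path shown above. It is of no use while the device is bound to `vfio-pci`, since those files are not present in that configuration.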