2022-09-21 17:02:22

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.19 01/16] Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

From: Vitaly Kuznetsov <[email protected]>

[ Upstream commit f0880e2cb7e1f8039a048fdd01ce45ab77247221 ]

Passed through PCI device sometimes misbehave on Gen1 VMs when Hyper-V
DRM driver is also loaded. Looking at IOMEM assignment, we can see e.g.

$ cat /proc/iomem
...
f8000000-fffbffff : PCI Bus 0000:00
f8000000-fbffffff : 0000:00:08.0
f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
...
fe0000000-fffffffff : PCI Bus 0000:00
fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
fe0000000-fe07fffff : 2ba2:00:02.0
fe0000000-fe07fffff : mlx4_core

the interesting part is the 'f8000000' region as it is actually the
VM's framebuffer:

$ lspci -v
...
0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA (prog-if 00 [VGA controller])
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
...

hv_vmbus: registering driver hyperv_drm
hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?

Note: "Cannot request framebuffer" is not a fatal error in
hyperv_setup_gen1() as the code assumes there's some other framebuffer
device there but we actually have some other PCI device (mlx4 in this
case) config space there!

The problem appears to be that vmbus_allocate_mmio() can use dedicated
framebuffer region to serve any MMIO request from any device. The
semantics one might assume of a parameter named "fb_overlap_ok"
aren't implemented because !fb_overlap_ok essentially has no effect.
The existing semantics are really "prefer_fb_overlap". This patch
implements the expected and needed semantics, which is to not allocate
from the frame buffer space when !fb_overlap_ok.

Note, Gen2 VMs are usually unaffected by the issue because
framebuffer region is already taken by EFI fb (in case kernel supports
it) but Gen1 VMs may have this region unclaimed by the time Hyper-V PCI
pass-through driver tries allocating MMIO space if Hyper-V DRM/FB drivers
load after it. Devices can be brought up in any sequence so let's
resolve the issue by always ignoring 'fb_mmio' region for non-FB
requests, even if the region is unclaimed.

Reviewed-by: Michael Kelley <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Wei Liu <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hv/vmbus_drv.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 547ae334e5cd..027029efb008 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2309,7 +2309,7 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
bool fb_overlap_ok)
{
struct resource *iter, *shadow;
- resource_size_t range_min, range_max, start;
+ resource_size_t range_min, range_max, start, end;
const char *dev_n = dev_name(&device_obj->device);
int retval;

@@ -2344,6 +2344,14 @@ int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
range_max = iter->end;
start = (range_min + align - 1) & ~(align - 1);
for (; start + size - 1 <= range_max; start += align) {
+ end = start + size - 1;
+
+ /* Skip the whole fb_mmio region if not fb_overlap_ok */
+ if (!fb_overlap_ok && fb_mmio &&
+ (((start >= fb_mmio->start) && (start <= fb_mmio->end)) ||
+ ((end >= fb_mmio->start) && (end <= fb_mmio->end))))
+ continue;
+
shadow = __request_region(iter, start, size, NULL,
IORESOURCE_BUSY);
if (!shadow)
--
2.35.1


2022-09-21 17:07:36

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.19 07/16] drm/amdgpu: use dirty framebuffer helper

From: Hamza Mahfooz <[email protected]>

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index 4dfd6724b3ca..3451147beda3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -35,6 +35,7 @@
#include <linux/pci.h>
#include <linux/pm_runtime.h>
#include <drm/drm_crtc_helper.h>
+#include <drm/drm_damage_helper.h>
#include <drm/drm_edid.h>
#include <drm/drm_gem_framebuffer_helper.h>
#include <drm/drm_fb_helper.h>
@@ -493,6 +494,7 @@ bool amdgpu_display_ddc_probe(struct amdgpu_connector *amdgpu_connector,
static const struct drm_framebuffer_funcs amdgpu_fb_funcs = {
.destroy = drm_gem_fb_destroy,
.create_handle = drm_gem_fb_create_handle,
+ .dirty = drm_atomic_helper_dirtyfb,
};

uint32_t amdgpu_display_supported_domains(struct amdgpu_device *adev,
--
2.35.1

2022-09-21 17:09:34

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.19 04/16] drm/gma500: Fix (vblank) IRQs not working after suspend/resume

From: Hans de Goede <[email protected]>

[ Upstream commit 235fdbc32d559db21e580f85035c59372704f09e ]

Fix gnome-shell (and other page-flip users) hanging after suspend/resume
because of the gma500's IRQs not working.

This fixes 2 problems with the IRQ handling:

1. gma_power_off() calls gma_irq_uninstall() which does a free_irq(), but
gma_power_on() called gma_irq_preinstall() + gma_irq_postinstall() which
do not call request_irq. Replace the pre- + post-install calls with
gma_irq_install() which does prep + request + post.

2. After fixing 1. IRQs still do not work on a Packard Bell Dot SC (Intel
Atom N2600, cedarview) netbook.

Cederview uses MSI interrupts and it seems that the BIOS re-configures
things back to normal APIC based interrupts during S3 suspend. There is
some MSI PCI-config registers save/restore code which tries to deal with
this, but on the Packard Bell Dot SC this is not sufficient to restore
MSI IRQ functionality after a suspend/resume.

Replace the PCI-config registers save/restore with pci_disable_msi() on
suspend + pci_enable_msi() on resume. Fixing e.g. gnome-shell hanging.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Patrik Jakobsson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/gma500/cdv_device.c | 4 +---
drivers/gpu/drm/gma500/oaktrail_device.c | 5 +----
drivers/gpu/drm/gma500/power.c | 8 +-------
drivers/gpu/drm/gma500/psb_drv.c | 2 +-
drivers/gpu/drm/gma500/psb_drv.h | 5 +----
drivers/gpu/drm/gma500/psb_irq.c | 15 ++++++++++++---
drivers/gpu/drm/gma500/psb_irq.h | 2 +-
7 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c
index dd32b484dd82..ce96234f3df2 100644
--- a/drivers/gpu/drm/gma500/cdv_device.c
+++ b/drivers/gpu/drm/gma500/cdv_device.c
@@ -581,11 +581,9 @@ static const struct psb_offset cdv_regmap[2] = {
static int cdv_chip_setup(struct drm_device *dev)
{
struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
- struct pci_dev *pdev = to_pci_dev(dev->dev);
INIT_WORK(&dev_priv->hotplug_work, cdv_hotplug_work_func);

- if (pci_enable_msi(pdev))
- dev_warn(dev->dev, "Enabling MSI failed!\n");
+ dev_priv->use_msi = true;
dev_priv->regmap = cdv_regmap;
gma_get_core_freq(dev);
psb_intel_opregion_init(dev);
diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c
index 5923a9c89312..f90e628cb482 100644
--- a/drivers/gpu/drm/gma500/oaktrail_device.c
+++ b/drivers/gpu/drm/gma500/oaktrail_device.c
@@ -501,12 +501,9 @@ static const struct psb_offset oaktrail_regmap[2] = {
static int oaktrail_chip_setup(struct drm_device *dev)
{
struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
- struct pci_dev *pdev = to_pci_dev(dev->dev);
int ret;

- if (pci_enable_msi(pdev))
- dev_warn(dev->dev, "Enabling MSI failed!\n");
-
+ dev_priv->use_msi = true;
dev_priv->regmap = oaktrail_regmap;

ret = mid_chip_setup(dev);
diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c
index b91de6d36e41..66873085d450 100644
--- a/drivers/gpu/drm/gma500/power.c
+++ b/drivers/gpu/drm/gma500/power.c
@@ -139,8 +139,6 @@ static void gma_suspend_pci(struct pci_dev *pdev)
dev_priv->regs.saveBSM = bsm;
pci_read_config_dword(pdev, 0xFC, &vbt);
dev_priv->regs.saveVBT = vbt;
- pci_read_config_dword(pdev, PSB_PCIx_MSI_ADDR_LOC, &dev_priv->msi_addr);
- pci_read_config_dword(pdev, PSB_PCIx_MSI_DATA_LOC, &dev_priv->msi_data);

pci_disable_device(pdev);
pci_set_power_state(pdev, PCI_D3hot);
@@ -168,9 +166,6 @@ static bool gma_resume_pci(struct pci_dev *pdev)
pci_restore_state(pdev);
pci_write_config_dword(pdev, 0x5c, dev_priv->regs.saveBSM);
pci_write_config_dword(pdev, 0xFC, dev_priv->regs.saveVBT);
- /* restoring MSI address and data in PCIx space */
- pci_write_config_dword(pdev, PSB_PCIx_MSI_ADDR_LOC, dev_priv->msi_addr);
- pci_write_config_dword(pdev, PSB_PCIx_MSI_DATA_LOC, dev_priv->msi_data);
ret = pci_enable_device(pdev);

if (ret != 0)
@@ -223,8 +218,7 @@ int gma_power_resume(struct device *_dev)
mutex_lock(&power_mutex);
gma_resume_pci(pdev);
gma_resume_display(pdev);
- gma_irq_preinstall(dev);
- gma_irq_postinstall(dev);
+ gma_irq_install(dev);
mutex_unlock(&power_mutex);
return 0;
}
diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index 1d8744f3e702..54e756b48606 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -383,7 +383,7 @@ static int psb_driver_load(struct drm_device *dev, unsigned long flags)
PSB_WVDC32(0xFFFFFFFF, PSB_INT_MASK_R);
spin_unlock_irqrestore(&dev_priv->irqmask_lock, irqflags);

- gma_irq_install(dev, pdev->irq);
+ gma_irq_install(dev);

dev->max_vblank_count = 0xffffff; /* only 24 bits of frame count */

diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 0ddfec1a0851..4c3fc5eaf6ad 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -490,6 +490,7 @@ struct drm_psb_private {
int rpm_enabled;

/* MID specific */
+ bool use_msi;
bool has_gct;
struct oaktrail_gct_data gct_data;

@@ -499,10 +500,6 @@ struct drm_psb_private {
/* Register state */
struct psb_save_area regs;

- /* MSI reg save */
- uint32_t msi_addr;
- uint32_t msi_data;
-
/* Hotplug handling */
struct work_struct hotplug_work;

diff --git a/drivers/gpu/drm/gma500/psb_irq.c b/drivers/gpu/drm/gma500/psb_irq.c
index e6e6d61bbeab..038f18ed0a95 100644
--- a/drivers/gpu/drm/gma500/psb_irq.c
+++ b/drivers/gpu/drm/gma500/psb_irq.c
@@ -316,17 +316,24 @@ void gma_irq_postinstall(struct drm_device *dev)
spin_unlock_irqrestore(&dev_priv->irqmask_lock, irqflags);
}

-int gma_irq_install(struct drm_device *dev, unsigned int irq)
+int gma_irq_install(struct drm_device *dev)
{
+ struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
+ struct pci_dev *pdev = to_pci_dev(dev->dev);
int ret;

- if (irq == IRQ_NOTCONNECTED)
+ if (dev_priv->use_msi && pci_enable_msi(pdev)) {
+ dev_warn(dev->dev, "Enabling MSI failed!\n");
+ dev_priv->use_msi = false;
+ }
+
+ if (pdev->irq == IRQ_NOTCONNECTED)
return -ENOTCONN;

gma_irq_preinstall(dev);

/* PCI devices require shared interrupts. */
- ret = request_irq(irq, gma_irq_handler, IRQF_SHARED, dev->driver->name, dev);
+ ret = request_irq(pdev->irq, gma_irq_handler, IRQF_SHARED, dev->driver->name, dev);
if (ret)
return ret;

@@ -369,6 +376,8 @@ void gma_irq_uninstall(struct drm_device *dev)
spin_unlock_irqrestore(&dev_priv->irqmask_lock, irqflags);

free_irq(pdev->irq, dev);
+ if (dev_priv->use_msi)
+ pci_disable_msi(pdev);
}

int gma_crtc_enable_vblank(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/gma500/psb_irq.h b/drivers/gpu/drm/gma500/psb_irq.h
index b51e395194ff..7648f69824a5 100644
--- a/drivers/gpu/drm/gma500/psb_irq.h
+++ b/drivers/gpu/drm/gma500/psb_irq.h
@@ -17,7 +17,7 @@ struct drm_device;

void gma_irq_preinstall(struct drm_device *dev);
void gma_irq_postinstall(struct drm_device *dev);
-int gma_irq_install(struct drm_device *dev, unsigned int irq);
+int gma_irq_install(struct drm_device *dev);
void gma_irq_uninstall(struct drm_device *dev);

int gma_crtc_enable_vblank(struct drm_crtc *crtc);
--
2.35.1

2022-09-21 17:11:53

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.19 10/16] drm/amdgpu: Skip reset error status for psp v13_0_0

From: Candice Li <[email protected]>

[ Upstream commit 86875d558b91cb46f43be112799c06ecce60ec1e ]

No need to reset error status since only umc ras supported on psp v13_0_0.

Signed-off-by: Candice Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
index dac202ae864d..9193ca5d6fe7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -1805,7 +1805,8 @@ static void amdgpu_ras_log_on_err_counter(struct amdgpu_device *adev)
amdgpu_ras_query_error_status(adev, &info);

if (adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 2) &&
- adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4)) {
+ adev->ip_versions[MP0_HWIP][0] != IP_VERSION(11, 0, 4) &&
+ adev->ip_versions[MP0_HWIP][0] != IP_VERSION(13, 0, 0)) {
if (amdgpu_ras_reset_error_status(adev, info.head.block))
dev_warn(adev->dev, "Failed to reset error counter and error status");
}
--
2.35.1

2022-09-21 17:55:41

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.19 08/16] drm/amdgpu: change the alignment size of TMR BO to 1M

From: Yang Wang <[email protected]>

[ Upstream commit 36de13fdb04abef3ee03ade5129ab146de63983b ]

align TMR BO size TO tmr size is not necessary,
modify the size to 1M to avoid re-create BO fail
when serious VRAM fragmentation.

v2:
add new macro PSP_TMR_ALIGNMENT for TMR BO alignment size

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 2b00f8fe15a8..7b8d4484c3c4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -748,7 +748,7 @@ static int psp_tmr_init(struct psp_context *psp)
}

pptr = amdgpu_sriov_vf(psp->adev) ? &tmr_buf : NULL;
- ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, PSP_TMR_SIZE(psp->adev),
+ ret = amdgpu_bo_create_kernel(psp->adev, tmr_size, PSP_TMR_ALIGNMENT,
AMDGPU_GEM_DOMAIN_VRAM,
&psp->tmr_bo, &psp->tmr_mc_addr, pptr);

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index e431f4994931..cd366c7f311f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -36,6 +36,7 @@
#define PSP_CMD_BUFFER_SIZE 0x1000
#define PSP_1_MEG 0x100000
#define PSP_TMR_SIZE(adev) ((adev)->asic_type == CHIP_ALDEBARAN ? 0x800000 : 0x400000)
+#define PSP_TMR_ALIGNMENT 0x100000
#define PSP_FW_NAME_LEN 0x24

enum psp_shared_mem_size {
--
2.35.1

2022-09-21 18:00:19

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.19 01/16] Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

Sasha Levin <[email protected]> writes:

> From: Vitaly Kuznetsov <[email protected]>
>
> [ Upstream commit f0880e2cb7e1f8039a048fdd01ce45ab77247221 ]
>

(this comment applies to all stable branches)

While this change seems to be worthy on its own, the underlying issue
with Gen1 Hyper-V VMs won't be resolved without

commit 2a8a8afba0c3053d0ea8686182f6b2104293037e
Author: Vitaly Kuznetsov <[email protected]>
Date: Sat Aug 27 15:03:44 2022 +0200

Drivers: hv: Always reserve framebuffer region for Gen1 VMs

as 'fb_mmio' is still going to be unset in some cases without it.

--
Vitaly

2022-09-25 03:31:01

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.19 01/16] Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

On Wed, Sep 21, 2022 at 06:17:34PM +0200, Vitaly Kuznetsov wrote:
>Sasha Levin <[email protected]> writes:
>
>> From: Vitaly Kuznetsov <[email protected]>
>>
>> [ Upstream commit f0880e2cb7e1f8039a048fdd01ce45ab77247221 ]
>>
>
>(this comment applies to all stable branches)
>
>While this change seems to be worthy on its own, the underlying issue
>with Gen1 Hyper-V VMs won't be resolved without
>
>commit 2a8a8afba0c3053d0ea8686182f6b2104293037e
>Author: Vitaly Kuznetsov <[email protected]>
>Date: Sat Aug 27 15:03:44 2022 +0200
>
> Drivers: hv: Always reserve framebuffer region for Gen1 VMs
>
>as 'fb_mmio' is still going to be unset in some cases without it.

Which seems to fail building. Backports welcome :)

--
Thanks,
Sasha

2022-10-20 05:41:32

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.19 07/16] drm/amdgpu: use dirty framebuffer helper

On Wed, Sep 21, 2022 at 11:53:23AM -0400, Sasha Levin wrote:
> From: Hamza Mahfooz <[email protected]>
>
> [ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]
>
> Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
> drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
> struct.
>
> Signed-off-by: Hamza Mahfooz <[email protected]>
> Acked-by: Alex Deucher <[email protected]>
> Signed-off-by: Alex Deucher <[email protected]>
> Signed-off-by: Sasha Levin <[email protected]>

I just spent a long time bisecting a hard-to-reproduce regression to this
commit, only to find that a revert was just queued this week.

Why was this commit backported to stable in the first place? It didn't have Cc
stable, and it didn't claim to be fixing anything.

- Eric