2023-01-10 14:07:08

by Christian König

Subject: Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

On 10.01.23 at 14:51, Jason Gunthorpe wrote:
> On Tue, Jan 10, 2023 at 02:45:30PM +0100, Christian König wrote:
>
>> Since this is a device integrated in the CPU it could be that the ACS/ATS
>> functionalities are controlled by the BIOS and can be enabled/disabled
>> there. But this should always enable/disable both.
> This sounds like a GPU driver bug then, it should tolerate PASID being
> unavailable because of BIOS issues/whatever and not black screen on
> boot?

Yeah, potentially. Could I get a full "sudo lspci -vvvv -s $bus_id" +
dmesg of that device?

Thanks,
Christian.

>
> Jason


2023-01-10 21:21:32

by Matt Fagnani

Subject: Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

Christian,

I'm attaching the output of sudo lspci -vvvv. I'm not sure what $bus_id
is in this case. I guess it might be 00 in 00:00.0. I attached the dmesg
from previous boots with 6.2-rc1 at
https://bugzilla.kernel.org/show_bug.cgi?id=216865#c2 as I mentioned at
https://lore.kernel.org/all/52583644-d875-a454-7288-8b00ea0566ae@bell.net/
and 6.2-rc2 + Vasant's patch with rd.driver.blacklist=amdgpu on the
kernel command line at
https://lore.kernel.org/all/ff26929d-9fb0-3c85-2594-dc2937c1ba9a@bell.net/
I'm using the Radeon R5 integrated GPU which is called Wani in lspci and
Carrizo in dmesg. The CPU is AMD A10-9620P which is Bristol Ridge or
Excavator+ according to
https://en.wikipedia.org/wiki/List_of_AMD_accelerated_processing_units
I'm using the internal Elan touchscreen in the laptop. I'm not using the
HDMI port for an external monitor or audio which I think is called
Kabini HDMI/DP Audio in lspci.

Thanks,

Matt

On 1/10/23 08:56, Christian König wrote:
> On 10.01.23 at 14:51, Jason Gunthorpe wrote:
>> On Tue, Jan 10, 2023 at 02:45:30PM +0100, Christian König wrote:
>>
>>> Since this is a device integrated in the CPU it could be that the
>>> ACS/ATS
>>> functionalities are controlled by the BIOS and can be enabled/disabled
>>> there. But this should always enable/disable both.
>> This sounds like a GPU driver bug then, it should tolerate PASID being
>> unavailable because of BIOS issues/whatever and not black screen on
>> boot?
>
> Yeah, potentially. Could I get a full "sudo lspci -vvvv -s $bus_id" +
> dmesg of that device?
>
> Thanks,
> Christian.
>
>>
>> Jason
>


Attachments:
lspci-vvvv-1.txt (39.67 kB)
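
For reference, the $bus_id placeholder in the request quoted above stands for the full PCI address that lspci prints at the start of each device line (bus:device.function), not only the bus number. A minimal sketch of how the GPU's address could be found and passed to -s follows. The helper name gpu_bus_ids is hypothetical, and the class strings it matches are assumed to be the usual lspci labels for display devices; adjust them if the device is listed differently.

```shell
# Hypothetical helper: read `lspci` output on stdin and print the PCI
# address (first field, e.g. bus:device.function) of every line whose
# class label looks like a display device.
gpu_bus_ids() {
    awk '/VGA compatible controller|Display controller|3D controller/ { print $1 }'
}

# Typical use on a live system (requires pciutils):
#   for id in $(lspci | gpu_bus_ids); do
#       sudo lspci -vvvv -s "$id"
#   done
```

Running lspci without -s, as was done for the attachment, dumps every device, which still contains the requested information but is longer to read through.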

2023-01-11 08:54:55

by Christian König

Subject: Re: [regression, bisected, pci/iommu] Bug 216865 - Black screen when amdgpu started during 6.2-rc1 boot with AMD IOMMU enabled

Hi Matt,

After reading a bit into the topic, I think I know what's going on here.

The assumption that you need ACS to enable PASID handling is simply
incorrect.

I'm going to send a revert of the offending patch with an in-depth
description of the problem.

Thanks,
Christian.

On 10.01.23 at 21:51, Matt Fagnani wrote:
> Christian,
>
> I'm attaching the output of sudo lspci -vvvv. I'm not sure what
> $bus_id is in this case. I guess it might be 00 in 00:00.0. I attached
> the dmesg from previous boots with 6.2-rc1 at
> https://bugzilla.kernel.org/show_bug.cgi?id=216865#c2
> as I mentioned at
> https://lore.kernel.org/all/52583644-d875-a454-7288-8b00ea0566ae@bell.net/
> and 6.2-rc2 + Vasant's patch with rd.driver.blacklist=amdgpu on the
> kernel command line at
> https://lore.kernel.org/all/ff26929d-9fb0-3c85-2594-dc2937c1ba9a@bell.net/
> I'm using the Radeon R5 integrated GPU which is called Wani in lspci
> and Carrizo in dmesg. The CPU is AMD A10-9620P which is Bristol Ridge
> or Excavator+ according to
> https://en.wikipedia.org/wiki/List_of_AMD_accelerated_processing_units
> I'm using the internal Elan touchscreen in the laptop. I'm not using
> the HDMI port for an external monitor or audio which I think is called
> Kabini HDMI/DP Audio in lspci.
>
> Thanks,
>
> Matt
>
> On 1/10/23 08:56, Christian König wrote:
>> On 10.01.23 at 14:51, Jason Gunthorpe wrote:
>>> On Tue, Jan 10, 2023 at 02:45:30PM +0100, Christian König wrote:
>>>
>>>> Since this is a device integrated in the CPU it could be that the
>>>> ACS/ATS
>>>> functionalities are controlled by the BIOS and can be enabled/disabled
>>>> there. But this should always enable/disable both.
>>> This sounds like a GPU driver bug then, it should tolerate PASID being
>>> unavailable because of BIOS issues/whatever and not black screen on
>>> boot?
>>
>> Yeah, potentially. Could I get a full "sudo lspci -vvvv -s $bus_id" +
>> dmesg of that device?
>>
>> Thanks,
>> Christian.
>>
>>>
>>> Jason
>>