2016-10-31 10:36:18

by Martin Kepplinger

Subject: [BUG][REGRESSION] mangled display since -rc1

so guys,

I can't believe that nobody else is hitting this: since -rc1, Nautilus' list
of elements, Firefox' website window, or just photos in eog (probably
among many other things) are mangled. Please have a look at the screenshot
of nautilus.

This is the same on an i3 laptop with Intel graphics and an i7 with nouveau
graphics. I bisected and the problem is this merge:
first bad commit: [56e520c7a0a490b63b042b047ec9659fc08762a4] Merge tag
'iommu-updates-v4.9' of
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Two things I'd ask of you if this isn't already a known problem to you:

* I failed to bisect into this merge, but I could easily have gone about it
the wrong way, so I'd appreciate any advice on how to bisect into this.
Strangely, running joro's
https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/commit/?h=iommu-updates-v4.9&id=13a08259187c5cd3f63d98efa159ab42976d85a4
(referenced here:
https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/tag/?h=iommu-updates-v4.9)
is good (?)

* Please add anybody you know is involved to CC.

I'm happy to test patches too, of course.

Anyhow. This is a bad regression that prevents me from running 4.9. Just
so you know.

so long,
martin


Attachments:
nautilus_mangled.png (57.52 kB)

2016-10-31 10:40:09

by Martin Kepplinger

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

In case the screenshot doesn't make it to you, here it is:
https://postimg.org/image/5wl2wemt9/


On 31.10.2016 11:36, Martin Kepplinger wrote:
> so guys,
>
> I can't believe that nobody hits this: Since -rc1 Nautilus' list of
> elements or Firefox' website window or just photos in eog (probably
> among many more things) is mangled. Please have a look at the
> screenshot of nautilus.
>
> This is the same on an i3 laptop with Intel graphics and an i7 with
> nouveau graphics. I bisected and the problem is this merge:
> first bad commit: [56e520c7a0a490b63b042b047ec9659fc08762a4] Merge tag
> 'iommu-updates-v4.9' of
> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
>
> Two things I'd ask of you if this isn't already a known problem to you:
>
> * I failed bisecting into this merge but I could easily have tried it
> totally wrong, so I'd appreciate any advice on how to bisect into
> this.
> Strangely, running joro's
> https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/commit/?h=iommu-updates-v4.9&id=13a08259187c5cd3f63d98efa159ab42976d85a4
> (referenced here:
> https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/tag/?h=iommu-updates-v4.9)
> is good (?)
>
> * Please add anybody you know is involved to CC.
>
> I'm happy to test patches too, of course.
>
> Anyhow. This is a bad regression that prevents me from running 4.9.
> Just so you know.
>
> so long,
> martin

2016-10-31 14:20:46

by Jörg Rödel

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On Mon, Oct 31, 2016 at 11:40:06AM +0100, Martin Kepplinger wrote:
> In case the screenshot doesn't make it to you, here it is:
> https://postimg.org/image/5wl2wemt9/

Can you please send a boot-dmesg and 'lspci -v'? We need more
information about your system first.


Joerg

2016-10-31 20:45:04

by Martin Kepplinger

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On 2016-10-31 at 16:21, Joerg Roedel wrote:
> On Mon, Oct 31, 2016 at 11:40:06AM +0100, Martin Kepplinger wrote:
>> In case the screenshot doesn't make it to you, here it is:
>> https://postimg.org/image/5wl2wemt9/
>
> Can you please send a boot-dmesg and 'lspci -v'? We need more
> information about your system first.
>
>
> Joerg
>

This is one machine booting a bad kernel. I could provide another
example later this week.

martin


Attachments:
dmesg_boot_1.txt (50.07 kB)
lspci_1.txt (5.66 kB)

2016-10-31 20:54:03

by Jörg Rödel

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On Mon, Oct 31, 2016 at 09:44:51PM +0100, Martin Kepplinger wrote:
> This is one machine booting a bad kernel. I could provide another
> example later this week.

You have an Intel system without any IOMMU (enabled), otherwise you
would have a DMAR-ACPI table, but there is none:

> [ 0.000000] ACPI: RSDP 0x00000000000FE020 000024 (v02 ACRSYS)
> [ 0.000000] ACPI: XSDT 0x00000000A6FFE210 00008C (v01 ACRSYS ACRPRDCT 00000001 01000013)
> [ 0.000000] ACPI: FACP 0x00000000A6FFB000 0000F4 (v04 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: DSDT 0x00000000A6FEC000 00B903 (v01 ACRSYS ACRPRDCT 00000000 1025 00040000)
> [ 0.000000] ACPI: FACS 0x00000000A6FBB000 000040
> [ 0.000000] ACPI: FACS 0x00000000A6FBB000 000040
> [ 0.000000] ACPI: UEFI 0x00000000A6FFD000 000236 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: ASF! 0x00000000A6FFC000 0000A5 (v32 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: HPET 0x00000000A6FFA000 000038 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: APIC 0x00000000A6FF9000 00008C (v02 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: MCFG 0x00000000A6FF8000 00003C (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: SLIC 0x00000000A6FEB000 000176 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: SSDT 0x00000000A6FEA000 0006FE (v01 ACRSYS ACRPRDCT 00001000 1025 00040000)
> [ 0.000000] ACPI: BOOT 0x00000000A6FE8000 000028 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: ASPT 0x00000000A6FE3000 000034 (v07 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: FPDT 0x00000000A6FE1000 000044 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
> [ 0.000000] ACPI: SSDT 0x00000000A6FE0000 00079A (v01 ACRSYS ACRPRDCT 00003000 1025 00040000)
> [ 0.000000] ACPI: SSDT 0x00000000A6FDF000 000A92 (v01 ACRSYS ACRPRDCT 00003000 1025 00040000)

So it is pretty unlikely that any change in IOMMU code causes your
issue. Not sure why your bisecting ended up there. You also have an
Intel GPU in the system:

> 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
> Subsystem: Acer Incorporated [ALI] Device 0748
> Flags: bus master, fast devsel, latency 0, IRQ 28
> Memory at c0000000 (64-bit, non-prefetchable) [size=4M]
> Memory at b0000000 (64-bit, prefetchable) [size=256M]
> I/O ports at 2000 [size=64]
> [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
> Capabilities: <access denied>
> Kernel driver in use: i915

My best guess is that some changes in the i915 driver cause your issue.
I have added the i915 maintainers to the Cc list; maybe they have an idea.


Joerg

2016-11-01 11:27:27

by Martin Kepplinger

Subject: Re: [BUG][REGRESSION] mangled display since -rc1



On 31 October 2016 22:54:54 CET, Joerg Roedel <[email protected]> wrote:
>On Mon, Oct 31, 2016 at 09:44:51PM +0100, Martin Kepplinger wrote:
>> This is one machine booting a bad kernel. I could provide another
>> example later this week.
>
>You have an Intel system without any IOMMU (enabled), otherwise you
>would have a DMAR-ACPI table, but there is none:
>
>> [ 0.000000] ACPI: RSDP 0x00000000000FE020 000024 (v02 ACRSYS)
>> [ 0.000000] ACPI: XSDT 0x00000000A6FFE210 00008C (v01 ACRSYS ACRPRDCT 00000001 01000013)
>> [ 0.000000] ACPI: FACP 0x00000000A6FFB000 0000F4 (v04 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: DSDT 0x00000000A6FEC000 00B903 (v01 ACRSYS ACRPRDCT 00000000 1025 00040000)
>> [ 0.000000] ACPI: FACS 0x00000000A6FBB000 000040
>> [ 0.000000] ACPI: FACS 0x00000000A6FBB000 000040
>> [ 0.000000] ACPI: UEFI 0x00000000A6FFD000 000236 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: ASF! 0x00000000A6FFC000 0000A5 (v32 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: HPET 0x00000000A6FFA000 000038 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: APIC 0x00000000A6FF9000 00008C (v02 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: MCFG 0x00000000A6FF8000 00003C (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: SLIC 0x00000000A6FEB000 000176 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: SSDT 0x00000000A6FEA000 0006FE (v01 ACRSYS ACRPRDCT 00001000 1025 00040000)
>> [ 0.000000] ACPI: BOOT 0x00000000A6FE8000 000028 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: ASPT 0x00000000A6FE3000 000034 (v07 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: FPDT 0x00000000A6FE1000 000044 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>> [ 0.000000] ACPI: SSDT 0x00000000A6FE0000 00079A (v01 ACRSYS ACRPRDCT 00003000 1025 00040000)
>> [ 0.000000] ACPI: SSDT 0x00000000A6FDF000 000A92 (v01 ACRSYS ACRPRDCT 00003000 1025 00040000)
>
>So it is pretty unlikely that any change in IOMMU code causes your
>issue. Not sure why your bisecting ended up there. You also have an
>Intel GPU in the system:
>
>> 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
>> Subsystem: Acer Incorporated [ALI] Device 0748
>> Flags: bus master, fast devsel, latency 0, IRQ 28
>> Memory at c0000000 (64-bit, non-prefetchable) [size=4M]
>> Memory at b0000000 (64-bit, prefetchable) [size=256M]
>> I/O ports at 2000 [size=64]
>> [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
>> Capabilities: <access denied>
>> Kernel driver in use: i915
>
>My best guess is that some changes in the i915 driver cause your issue,
>I add the maintainers of i915 to the cc-list, maybe they have an idea.
>
>
> Joerg


I'll come up with a nouveau system example and it was quite easy to bisect. To quote the merge commit msg:

This also required some changes outside of the IOMMU code, but these are acked by the respective maintainers.

Any help on bisecting into it would be awesome.

2016-11-01 11:47:53

by Jani Nikula

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On Tue, 01 Nov 2016, Martin Kepplinger <[email protected]> wrote:
> On 31 October 2016 22:54:54 CET, Joerg Roedel <[email protected]> wrote:
>>On Mon, Oct 31, 2016 at 09:44:51PM +0100, Martin Kepplinger wrote:
>>> This is one machine booting a bad kernel. I could provide another
>>> example later this week.
>>
>>You have an Intel system without any IOMMU (enabled), otherwise you
>>would have a DMAR-ACPI table, but there is none:
>>
>>> [ 0.000000] ACPI: RSDP 0x00000000000FE020 000024 (v02 ACRSYS)
>>> [ 0.000000] ACPI: XSDT 0x00000000A6FFE210 00008C (v01 ACRSYS ACRPRDCT 00000001 01000013)
>>> [ 0.000000] ACPI: FACP 0x00000000A6FFB000 0000F4 (v04 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: DSDT 0x00000000A6FEC000 00B903 (v01 ACRSYS ACRPRDCT 00000000 1025 00040000)
>>> [ 0.000000] ACPI: FACS 0x00000000A6FBB000 000040
>>> [ 0.000000] ACPI: FACS 0x00000000A6FBB000 000040
>>> [ 0.000000] ACPI: UEFI 0x00000000A6FFD000 000236 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: ASF! 0x00000000A6FFC000 0000A5 (v32 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: HPET 0x00000000A6FFA000 000038 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: APIC 0x00000000A6FF9000 00008C (v02 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: MCFG 0x00000000A6FF8000 00003C (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: SLIC 0x00000000A6FEB000 000176 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: SSDT 0x00000000A6FEA000 0006FE (v01 ACRSYS ACRPRDCT 00001000 1025 00040000)
>>> [ 0.000000] ACPI: BOOT 0x00000000A6FE8000 000028 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: ASPT 0x00000000A6FE3000 000034 (v07 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: FPDT 0x00000000A6FE1000 000044 (v01 ACRSYS ACRPRDCT 00000001 1025 00040000)
>>> [ 0.000000] ACPI: SSDT 0x00000000A6FE0000 00079A (v01 ACRSYS ACRPRDCT 00003000 1025 00040000)
>>> [ 0.000000] ACPI: SSDT 0x00000000A6FDF000 000A92 (v01 ACRSYS ACRPRDCT 00003000 1025 00040000)
>>
>>So it is pretty unlikely that any change in IOMMU code causes your
>>issue. Not sure why your bisecting ended up there. You also have an
>>Intel GPU in the system:
>>
>>> 00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
>>> Subsystem: Acer Incorporated [ALI] Device 0748
>>> Flags: bus master, fast devsel, latency 0, IRQ 28
>>> Memory at c0000000 (64-bit, non-prefetchable) [size=4M]
>>> Memory at b0000000 (64-bit, prefetchable) [size=256M]
>>> I/O ports at 2000 [size=64]
>>> [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
>>> Capabilities: <access denied>
>>> Kernel driver in use: i915
>>
>>My best guess is that some changes in the i915 driver cause your issue,
>>I add the maintainers of i915 to the cc-list, maybe they have an idea.
>>
>>
>> Joerg
>
>
> I'll come up with a nouveau system example and it was quite easy to bisect. To quote the merge commit msg:
>
> This also required some changes outside of the IOMMU code, but these are acked by the respective maintainers.
>
> Any help on bisecting into it would be awesome.

So the information here is pretty scarce. Please file a bug at [1],
describe the problem, perhaps add the drm.debug=14 module parameter, and
attach the dmesg covering everything from boot until the problem reproduces.

BR,
Jani.

[1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel


--
Jani Nikula, Intel Open Source Technology Center

2016-11-06 11:21:40

by Thorsten Leemhuis

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On 01.11.2016 12:47, Jani Nikula wrote:
> On Tue, 01 Nov 2016, Martin Kepplinger <[email protected]> wrote:
>> I'll come up with a nouveau system example and it was quite easy to bisect. To quote the merge commit msg:
>> This also required some changes outside of the IOMMU code, but these are acked by the respective maintainers.
>> Any help on bisecting into it would be awesome.
> So the information here is pretty scarce. Please file a bug at [1],
> describe the problem, perhaps add drm.debug=14 module parameter and
> attach dmesg from boot to reproducing the problem.

Martin, did you do that and can point me to the bug? And if not: Any
news on the issue?

FWIW: I added this report to the list of regressions for Linux 4.9. I'll
watch this thread for further updates on this issue to document progress
in my weekly reports. Please let me know via [email protected]
in case the discussion moves to a different place (bugzilla or another
mail thread for example).

tia! Ciao, Thorsten

2016-11-06 11:43:30

by Martin Kepplinger

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On 2016-11-06 at 12:21, Thorsten Leemhuis wrote:
> On 01.11.2016 12:47, Jani Nikula wrote:
>> On Tue, 01 Nov 2016, Martin Kepplinger <[email protected]> wrote:
>>> I'll come up with a nouveau system example and it was quite easy to bisect. To quote the merge commit msg:
>>> This also required some changes outside of the IOMMU code, but these are acked by the respective maintainers.
>>> Any help on bisecting into it would be awesome.
>> So the information here is pretty scarce. Please file a bug at [1],
>> describe the problem, perhaps add drm.debug=14 module parameter and
>> attach dmesg from boot to reproducing the problem.
>
> Martin, did you do that and can point me to the bug? And if not: Any
> news on the issue?
>
> FWIW: I added this report to the list of regressions for Linux 4.9. I'll
> watch this thread for further updates on this issue to document progress
> in my weekly reports. Please let me know via [email protected]
> in case the discussion moves to a different place (bugzilla or another
> mail thread for example).
>
> tia! Ciao, Thorsten
>

I have not filed a bug in Bugzilla yet. I haven't given up hope that we can
fix this here before the release. I've ignored it the last few days though.

Again: This is the first bad commit:

[56e520c7a0a490b63b042b047ec9659fc08762a4] Merge tag
'iommu-updates-v4.9' of
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

If you know how to bisect into this, please share your thoughts. Linus'
merge here is bigger than iommu-updates-v4.9, I think, and that's what I
really have to look at. I'm not sure how though.

One part of the merge is indeed iommu-updates-v4.9:

That's the tag:
https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/tag/?h=iommu-updates-v4.9

and the tagged object:
https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/commit/?h=iommu-updates-v4.9&id=13a08259187c5cd3f63d98efa159ab42976d85a4

I'll have to think about how to check the iommu-updates-v4.9 part of the
merge again... to eliminate something.

2016-11-06 13:02:49

by Martin Kepplinger

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On 2016-11-06 at 12:43, Martin Kepplinger wrote:
> On 2016-11-06 at 12:21, Thorsten Leemhuis wrote:
>> On 01.11.2016 12:47, Jani Nikula wrote:
>>> On Tue, 01 Nov 2016, Martin Kepplinger <[email protected]> wrote:
>>>> I'll come up with a nouveau system example and it was quite easy to bisect. To quote the merge commit msg:
>>>> This also required some changes outside of the IOMMU code, but these are acked by the respective maintainers.
>>>> Any help on bisecting into it would be awesome.
>>> So the information here is pretty scarce. Please file a bug at [1],
>>> describe the problem, perhaps add drm.debug=14 module parameter and
>>> attach dmesg from boot to reproducing the problem.
>>
>> Martin, did you do that and can point me to the bug? And if not: Any
>> news on the issue?
>>
>> FWIW: I added this report to the list of regressions for Linux 4.9. I'll
>> watch this thread for further updates on this issue to document progress
>> in my weekly reports. Please let me know via [email protected]
>> in case the discussion moves to a different place (bugzilla or another
>> mail thread for example).
>>
>> tia! Ciao, Thorsten
>>
>
> I did not file a bug in bugzilla yet. I haven't given up that we can fix
> this here before the release. I've ignored it the last few days though.
>
> Again: This is the first bad commit:
>
> [56e520c7a0a490b63b042b047ec9659fc08762a4] Merge tag
> 'iommu-updates-v4.9' of
> git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
>
> If you know how to bisect into this, please share your thoughts. Linus'
> merge here is bigger than iommu-updates-v4.9, I think, and that's what I
> really have to look at. I'm not sure how though.
>
> One part of the merge is indeed iommu-updates-v4.9:
>
> That's the tag:
> https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/tag/?h=iommu-updates-v4.9
>
> and the tagged object:
> https://git.kernel.org/cgit/linux/kernel/git/joro/iommu.git/commit/?h=iommu-updates-v4.9&id=13a08259187c5cd3f63d98efa159ab42976d85a4
>
> I'll have to think about how to check the iommu-updates-v4.9 part of the
> merge again... to eliminate something.
>
-rc4 is still bad. All of the above is still relevant :(

2016-11-07 08:24:51

by Jani Nikula

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On Sun, 06 Nov 2016, Martin Kepplinger <[email protected]> wrote:
> I did not file a bug in bugzilla yet. I haven't given up that we can fix
> this here before the release. I've ignored it the last few days though.

You say it like filing the bug report and having the bug fixed are
mutually exclusive things...

Pretty please? It's easier for us to direct folks at the bug, with
history and logs in one place. I realize only Daniel and me were Cc'd
here, not intel-gfx list.

Also, please double check your bisect. Not sure why the finger points at
i915 when the bisect points at iommu merge.


BR,
Jani.


--
Jani Nikula, Intel Open Source Technology Center

2016-11-07 15:34:46

by Martin Kepplinger

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On 2016-11-07 at 09:24, Jani Nikula wrote:
> On Sun, 06 Nov 2016, Martin Kepplinger <[email protected]> wrote:
>> I did not file a bug in bugzilla yet. I haven't given up that we can fix
>> this here before the release. I've ignored it the last few days though.
>
> You say it like filing the bug report and having the bug fixed are
> mutually exclusive things...
>
> Pretty please? It's easier for us to direct folks at the bug, with
> history and logs in one place. I realize only Daniel and me were Cc'd
> here, not intel-gfx list.
>
> Also, please double check your bisect. Not sure why the finger points at
> i915 when the bisect points at iommu merge.
>
>
> BR,
> Jani.
>
>

Chris Clayton wrote to me off list, and the patch mentioned below fixes the
problem for me too, as it does for others. I hope it makes its way into the
tree soon:


-------- Forwarded Message --------
Subject: Fwd: Re: Redraw issues on i915 on 4.9-rc
Date: Mon, 7 Nov 2016 13:48:14 +0000
From: Chris Clayton <[email protected]>
To: [email protected]

Hi Martin.


I can't contact you through LKML because I'm not subscribed, but I've
been working with Chris Wilson, one of the Intel DRM developers, to
analyse and fix the corruption. We've got a patch that fixes it for me
and Norbert, who also reported the problem. The patch is at the bottom
of this message.

Hope it helps.

Chris


-------- Forwarded Message --------
Subject: Re: Redraw issues on i915 on 4.9-rc
Date: Mon, 7 Nov 2016 09:25:59 +0000
From: Chris Wilson <[email protected]>
To: Chris Clayton <[email protected]>
CC: Norbert Preining <[email protected]>

On Mon, Nov 07, 2016 at 09:16:38AM +0000, Chris Clayton wrote:
> Hello again.
>
> I wasn't at all happy about the last bisect I did, so I've run it again
> and this time spent at least 30 minutes using my system before marking
> a kernel as good. I've also noticed that when I boot a bad kernel, the
> graphics associated with three of my desktop icons do not get drawn, so
> that helps.
>
> The outcome of the bisection is:
>
> a61007a83a4671da77210790997d5c8c92ed87ea is the first bad commit
> commit a61007a83a4671da77210790997d5c8c92ed87ea
> Author: Chris Wilson <[email protected]>
> Date: Thu Aug 18 17:17:02 2016 +0100
>
> drm/i915: Fix partial GGTT faulting

That's just the enabling patch, everything was meant to be in place by
then.

Oh noes, care to try:


diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c642385bb236..a52b40bbac6f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1837,7 +1837,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 		/* Use a partial view if it is bigger than available space */
 		chunk_size = MIN_CHUNK_PAGES;
 		if (i915_gem_object_is_tiled(obj))
-			chunk_size = max(chunk_size, tile_row_pages(obj));
+			chunk_size = roundup(chunk_size, tile_row_pages(obj));
 		memset(&view, 0, sizeof(view));
 		view.type = I915_GGTT_VIEW_PARTIAL;

--
Chris Wilson, Intel Open Source Technology Centre
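
Editorial note on the patch above: a minimal userspace sketch of how max() and roundup() differ for the chunk size. The helper names max_pages(), roundup_pages() and compare_chunk_sizes() are hypothetical stand-ins, not the kernel macros. The value 256 for MIN_CHUNK_PAGES (1 MiB in 4 KiB pages) and the 15-page tile row are assumptions chosen for illustration; the thread states neither. The sketch only shows the arithmetic; that a chunk ending in the middle of a tile row is what produced the corruption is presumably the reasoning behind the fix, but the thread does not spell it out.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 1 MiB expressed in 4 KiB pages. The thread does not give the value. */
#define MIN_CHUNK_PAGES 256u

/* Hypothetical userspace stand-in for the kernel's max(), unsigned int only. */
static unsigned int max_pages(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Hypothetical stand-in for roundup(): smallest multiple of y that is >= x. */
static unsigned int roundup_pages(unsigned int x, unsigned int y)
{
	assert(y != 0);
	return ((x + y - 1) / y) * y;
}

/* Print both results for a given (hypothetical) tile row size in pages. */
static void compare_chunk_sizes(unsigned int tile_row_pages)
{
	unsigned int before = max_pages(MIN_CHUNK_PAGES, tile_row_pages);
	unsigned int after = roundup_pages(MIN_CHUNK_PAGES, tile_row_pages);

	/* A non-zero remainder means the chunk ends partway through a tile row. */
	printf("row=%u pages: max -> %u (remainder %u), roundup -> %u (remainder %u)\n",
	       tile_row_pages, before, before % tile_row_pages,
	       after, after % tile_row_pages);
}
```

Calling compare_chunk_sizes(15) from a main() shows the point: max() keeps the chunk at 256 pages, which is not a whole number of 15-page rows, while roundup() grows it to 270 pages, which is. When the row size divides 256 evenly (for example 16), both give the same result.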

2016-11-07 16:01:29

by Martin Steigerwald

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

Hello.

On Monday, 7 November 2016, 16:34:35 CET, Martin Kepplinger wrote:
> On 2016-11-07 at 09:24, Jani Nikula wrote:
> > On Sun, 06 Nov 2016, Martin Kepplinger <[email protected]> wrote:
> >> I did not file a bug in bugzilla yet. I haven't given up that we can fix
> >> this here before the release. I've ignored it the last few days though.
> >
> > You say it like filing the bug report and having the bug fixed are
> > mutually exclusive things...
> >
> > Pretty please? It's easier for us to direct folks at the bug, with
> > history and logs in one place. I realize only Daniel and me were Cc'd
> > here, not intel-gfx list.
> >
> > Also, please double check your bisect. Not sure why the finger points at
> > i915 when the bisect points at iommu merge.
[…]
> Chris Clayton wrote to me off list, and the patch mentioned below fixes the
> problem for me too, as it does for others. I hope it makes its way into the
> tree soon:

With 4.9-rc4 I have corruptions that look like the ones reported in this
thread.

I reported my findings in the LKML thread about 4.9-rc4 and in Bugzilla, in
a bug report with an attachment that shows the same type of corruption as
in this thread. The patch

https://patchwork.freedesktop.org/patch/116808/

mentioned in the other bug report and in the following LKML thread does not
fix the issue for me:
Re: [REGRESSION] Linux 4.9-rc4: gfx glitches on Intel Sandybridge (was: Re:
Linux 4.9-rc4)
https://lkml.org/lkml/2016/11/6/70

https://bugzilla.kernel.org/show_bug.cgi?id=177701#c4


In my case it looks like this:

https://martin-steigerwald.de/tmp/display-issues-with-kernel-4.9-rc4.png


I have a busy week, so I won't do any bisecting at the moment. I am happy to
test another patch during breaks in the training I am holding, but please
point me specifically to the patch to test. Thank you.


In general I am a bit confused about:

1) when do I use the bugtracker

2) when to just post on LKML

3) and which bugtracker to use? bugzilla.kernel.org versus the freedesktop one
in this case. See:

http://lkml.iu.edu/hypermail/linux/kernel/1611.0/03126.html

4) how to determine whether a bug report matches my case or not. In this case
it's easy enough considering the screenshots.

(I am using this list archive because the threading view on lkml.org seems
broken.)

This bug is already being discussed in three places right now; I bet it makes
sense to settle on one place. I'd opt for Bugzilla, but I would like to use my
MUA to access it for simple comments.

Thanks,
Martin

> -------- Forwarded Message --------
> Subject: Fwd: Re: Redraw issues on i915 on 4.9-rc
> Date: Mon, 7 Nov 2016 13:48:14 +0000
> From: Chris Clayton <[email protected]>
> To: [email protected]
>
> Hi Martin.
>
>
> I can't contact you through LKML because I'm not subscribed, but I've
> been working with Chris Wilson, one of the Intel
> DRM developers to analyse and fix the corruption. We've got a patch that
> fixes it for me and Norbert who also reported
> the problem. The patch is at the bottom of this message.
>
> Hope it helps.
>
> Chris
>
>
> -------- Forwarded Message --------
> Subject: Re: Redraw issues on i915 on 4.9-rc
> Date: Mon, 7 Nov 2016 09:25:59 +0000
> From: Chris Wilson <[email protected]>
> To: Chris Clayton <[email protected]>
> CC: Norbert Preining <[email protected]>
>
> On Mon, Nov 07, 2016 at 09:16:38AM +0000, Chris Clayton wrote:
> > Hello again.
> >
> > I wasn't at all happy about the last bisect I did, so I've run it again
> > and this time spent at least 30 minutes using my system before marking
> > a kernel as good. I've also noticed that when I boot a bad kernel, the
> > graphics associated with three of my desktop icons do not get drawn, so
> > that helps.
> >
> > The outcome of the bisection is:
> >
> > a61007a83a4671da77210790997d5c8c92ed87ea is the first bad commit
> > commit a61007a83a4671da77210790997d5c8c92ed87ea
> > Author: Chris Wilson <[email protected]>
> > Date: Thu Aug 18 17:17:02 2016 +0100
> >
> > drm/i915: Fix partial GGTT faulting
>
> That's just the enabling patch, everything was meant to be in place by
> then.
>
> Oh noes, care to try:
>
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c642385bb236..a52b40bbac6f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1837,7 +1837,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
>  		/* Use a partial view if it is bigger than available space */
>  		chunk_size = MIN_CHUNK_PAGES;
>  		if (i915_gem_object_is_tiled(obj))
> -			chunk_size = max(chunk_size, tile_row_pages(obj));
> +			chunk_size = roundup(chunk_size, tile_row_pages(obj));
>  		memset(&view, 0, sizeof(view));
>  		view.type = I915_GGTT_VIEW_PARTIAL;


--
Martin

2016-11-07 16:08:17

by Martin Kepplinger

Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On 2016-11-07 at 17:01, Martin Steigerwald wrote:
> Hello.
>
> On Monday, 7 November 2016, 16:34:35 CET, Martin Kepplinger wrote:
>> On 2016-11-07 at 09:24, Jani Nikula wrote:
>>> On Sun, 06 Nov 2016, Martin Kepplinger <[email protected]> wrote:
>>>> I did not file a bug in bugzilla yet. I haven't given up that we can fix
>>>> this here before the release. I've ignored it the last few days though.
>>>
>>> You say it like filing the bug report and having the bug fixed are
>>> mutually exclusive things...
>>>
>>> Pretty please? It's easier for us to direct folks at the bug, with
>>> history and logs in one place. I realize only Daniel and me were Cc'd
>>> here, not intel-gfx list.
>>>
>>> Also, please double check your bisect. Not sure why the finger points at
>>> i915 when the bisect points at iommu merge.
> […]
>> Chris Clayton wrote to me off list, and the patch mentioned below fixes the
>> problem for me too, as it does for others. I hope it makes its way into the
>> tree soon:
>
> With 4.9-rc4 I have corruptions that look like the ones reported in this
> thread.
>
> I reported my finding on LKML thread about 4.9-rc4. And in Bugzilla in a bug
> report with an attachment that shows the same type of corruptions as here in
> this thread:
>
> https://patchwork.freedesktop.org/patch/116808/
>
> mentioned in the other bug report and the following LKML thread does not fix
> the issue for me:
>
> Re: [REGRESSION] Linux 4.9-rc4: gfx glitches on Intel Sandybridge (was: Re:
> Linux 4.9-rc4)
> https://lkml.org/lkml/2016/11/6/70
>
> https://bugzilla.kernel.org/show_bug.cgi?id=177701#c4
>
>
> In my case it looks like this:
>
> https://martin-steigerwald.de/tmp/display-issues-with-kernel-4.9-rc4.png
>
>
> I have a busy week, so I won't do any bisecting at the moment. I am happy to
> test another patch during breaks between holding the training, but please
> point me specifically to what patch to test. Thank you.

This one. I just replaced max with roundup manually:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c642385bb236..a52b40bbac6f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1837,7 +1837,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
 		/* Use a partial view if it is bigger than available space */
 		chunk_size = MIN_CHUNK_PAGES;
 		if (i915_gem_object_is_tiled(obj))
-			chunk_size = max(chunk_size, tile_row_pages(obj));
+			chunk_size = roundup(chunk_size, tile_row_pages(obj));
 		memset(&view, 0, sizeof(view));
 		view.type = I915_GGTT_VIEW_PARTIAL;


>
>
> In general I am a bit confused about:
>
> 1) when do I use the bugtracker
>
> 2) when to just post on LKML
>
> 3) and which bugtracker to use? bugzilla.kernel.org versus the freedesktop one
> in this case. See:
>
> http://lkml.iu.edu/hypermail/linux/kernel/1611.0/03126.html
>
> 4) how to determine whether a bug report matches my case or not. In that case
> its easy enough considering the screenshots.
>
> (using this list archive as threading view in lkml.org seems broken)
>
> This bug is already being discussed in three places right now, I bet it makes
> sense to settle on one place. I'd opt for Bugzilla but I would like to use my
> MUA to access it for simple comments.

whatever, it's a mess :) that's what the kernel summit is for, right?

>
> Thanks,
> Martin
>
>> -------- Forwarded Message --------
>> Subject: Fwd: Re: Redraw issues on i915 on 4.9-rc
>> Date: Mon, 7 Nov 2016 13:48:14 +0000
>> From: Chris Clayton <[email protected]>
>> To: [email protected]
>>
>> Hi Martin.
>>
>>
>> I can't contact you through LKML because I'm not subscribed, but I've
>> been working with Chris Wilson, one of the Intel
>> DRM developers to analyse and fix the corruption. We've got a patch that
>> fixes it for me and Norbert who also reported
>> the problem. The patch is at the bottom of this message.
>>
>> Hope it helps.
>>
>> Chris
>>
>>
>> -------- Forwarded Message --------
>> Subject: Re: Redraw issues on i915 on 4.9-rc
>> Date: Mon, 7 Nov 2016 09:25:59 +0000
>> From: Chris Wilson <[email protected]>
>> To: Chris Clayton <[email protected]>
>> CC: Norbert Preining <[email protected]>
>>
>> On Mon, Nov 07, 2016 at 09:16:38AM +0000, Chris Clayton wrote:
>>> Hello again.
>>>
>>> I wasn't at all happy about the last bisect I did, so I've run it again
>>> and this time spent at least 30 minutes using my system before marking a
>>> kernel as good. I've also noticed that when I boot a bad kernel, the
>>> graphics associated with three of my desktop icons do not get drawn, so
>>> that helps.
>>>
>>> The outcome of the bisection is:
>>>
>>> a61007a83a4671da77210790997d5c8c92ed87ea is the first bad commit
>>> commit a61007a83a4671da77210790997d5c8c92ed87ea
>>> Author: Chris Wilson <[email protected]>
>>> Date: Thu Aug 18 17:17:02 2016 +0100
>>>
>>> drm/i915: Fix partial GGTT faulting
>>
>> That's just the enabling patch, everything was meant to be in place by
>> then.
>>
>> Oh noes, care to try:
>>
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index c642385bb236..a52b40bbac6f 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1837,7 +1837,7 @@ int i915_gem_fault(struct vm_area_struct *area, struct vm_fault *vmf)
>>  		/* Use a partial view if it is bigger than available space */
>>  		chunk_size = MIN_CHUNK_PAGES;
>>  		if (i915_gem_object_is_tiled(obj))
>> -			chunk_size = max(chunk_size, tile_row_pages(obj));
>> +			chunk_size = roundup(chunk_size, tile_row_pages(obj));
>>  		memset(&view, 0, sizeof(view));
>>  		view.type = I915_GGTT_VIEW_PARTIAL;
>
>

2016-11-07 17:37:44

by Jani Nikula

[permalink] [raw]
Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On Mon, 07 Nov 2016, Martin Kepplinger <[email protected]> wrote:
> On 2016-11-07 at 17:01, Martin Steigerwald wrote:
>> Hello.
>>
>> On Monday, 7 November 2016, 16:34:35 CET, Martin Kepplinger wrote:
>>> Chris Clayton wrote off list and the mentioned patch fixes the problem
>>> for me too, as it does for others. I hope it makes its way into the tree
>>> soon:
>>
>> With 4.9-rc4 I have corruptions that look like the ones reported in this
>> thread.
>>
>> I reported my finding on the LKML thread about 4.9-rc4. And in Bugzilla in a bug
>> report with an attachment that shows the same type of corruptions as here in
>> this thread:
>>
>> https://patchwork.freedesktop.org/patch/116808/
>>
>> mentioned in the other bug report and the following LKML thread does not fix
>> the issue for me:
>>
>> Re: [REGRESSION] Linux 4.9-rc4: gfx glitches on Intel Sandybridge (was: Re:
>> Linux 4.9-rc4)
>> https://lkml.org/lkml/2016/11/6/70
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=177701#c4

That bug conflates two issues, with the main issue being unrelated to
corruption.

>> In my case it looks like this:
>>
>> https://martin-steigerwald.de/tmp/display-issues-with-kernel-4.9-rc4.png
>>
>>
>> I have a busy week, so I won't do any bisecting at the moment. I am
>> happy to test another patch during breaks between holding the
>> training, but please point me specifically to what patch to
>> test. Thank you.
>
> this one: I just replaced max with roundup manually:

As I wrote in another thread, the fix has now been pushed to
drm-intel-fixes branch of http://cgit.freedesktop.org/drm-intel. It's
-rc4 plus half a dozen fixes, including "drm/i915: Round tile chunks up
for constructing partial VMAs" which should fix the corruption.

Please try that, and report back. If it doesn't fix the issue, please
file a bug at the freedesktop.org bugzilla.

BR,
Jani.

--
Jani Nikula, Intel Open Source Technology Center

2016-11-13 21:33:02

by Martin Kepplinger

[permalink] [raw]
Subject: Re: [BUG][REGRESSION] mangled display since -rc1

On 2016-11-07 at 18:36, Jani Nikula wrote:
> On Mon, 07 Nov 2016, Martin Kepplinger <[email protected]> wrote:
>> On 2016-11-07 at 17:01, Martin Steigerwald wrote:
>>> Hello.
>>>
>>> On Monday, 7 November 2016, 16:34:35 CET, Martin Kepplinger wrote:
>>>> Chris Clayton wrote off list and the mentioned patch fixes the problem
>>>> for me too, as it does for others. I hope it makes its way into the tree
>>>> soon:
>>>
>>> With 4.9-rc4 I have corruptions that look like the ones reported in this
>>> thread.
>>>
>>> I reported my finding on the LKML thread about 4.9-rc4. And in Bugzilla in a bug
>>> report with an attachment that shows the same type of corruptions as here in
>>> this thread:
>>>
>>> https://patchwork.freedesktop.org/patch/116808/
>>>
>>> mentioned in the other bug report and the following LKML thread does not fix
>>> the issue for me:
>>>
>>> Re: [REGRESSION] Linux 4.9-rc4: gfx glitches on Intel Sandybridge (was: Re:
>>> Linux 4.9-rc4)
>>> https://lkml.org/lkml/2016/11/6/70
>>>
>>> https://bugzilla.kernel.org/show_bug.cgi?id=177701#c4
>
> That bug conflates two issues, with the main issue being unrelated to
> corruption.
>
>>> In my case it looks like this:
>>>
>>> https://martin-steigerwald.de/tmp/display-issues-with-kernel-4.9-rc4.png
>>>
>>>
>>> I have a busy week, so I won't do any bisecting at the moment. I am
>>> happy to test another patch during breaks between holding the
>>> training, but please point me specifically to what patch to
>>> test. Thank you.
>>
>> this one: I just replaced max with roundup manually:
>
> As I wrote in another thread, the fix has now been pushed to
> drm-intel-fixes branch of http://cgit.freedesktop.org/drm-intel. It's
> -rc4 plus half a dozen fixes, including "drm/i915: Round tile chunks up
> for constructing partial VMAs" which should fix the corruption.
>
> Please try that, and report back. If it doesn't fix the issue, please
> file a bug at the freedesktop.org bugzilla.
>
> BR,
> Jani.
>
Fixed in -rc5, thanks!

martin