2022-06-22 11:59:18

by Zdenek Kabelac

Subject: i915: crash with 5.19-rc2

Hello

On somewhat oldish hardware (T61, 4G, C2D), I've now witnessed a new crash with Xorg.

(It happened while reopening an iconified Firefox window, running the 'standard'
rawhide -nodebug kernel 5.19.0-0.rc2.21.fc37.x86_64.)

 page:00000000577758b3 refcount:0 mapcount:0 mapping:0000000000000000
index:0x1 pfn:0x1192cc
 flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
 raw: 0017ffffc0000000 ffffe683c47171c8 ffff8fa3f79377a8 0000000000000000
 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
 page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
 ------------[ cut here ]------------
 kernel BUG at mm/shmem.c:708!
 invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 1 PID: 42896 Comm: Xorg Not tainted 5.19.0-0.rc2.21.fc37.x86_64 #1
 Hardware name: LENOVO 6464CTO/6464CTO, BIOS 7LETC9WW (2.29 ) 03/18/2011
 RIP: 0010:shmem_add_to_page_cache+0x48e/0x500
 Code: 01 0f 84 0a fc ff ff 48 8d 4a ff 31 d2 48 39 cb 0f 85 ff fb ff ff e9
f6 fb ff ff 48 c7 c6 70 01 64 bb 48 89 df e8 f2 99 01 00 <0f> 0b 48 c7 c6 a0
1b 64 bb 48 89 df e8 e1 99 01 00 0f 0b 48 8b 13
 RSP: 0018:ffff9ce7c047f6b0 EFLAGS: 00010286
 RAX: 000000000000003f RBX: ffffe683c464b300 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: ffffffffbb67b8e8 RDI: 00000000ffffffff
 RBP: 0000000000023f97 R08: ffffffffbca122a0 R09: 64656b636f6c5f74
 R10: 747365745f6f696c R11: 6f6621284f494c4f R12: 00000000001120d4
 R13: ffff8fa2c6ae7890 R14: ffffe683c464b300 R15: 0000000000000001
 FS:  00007fc1cea31380(0000) GS:ffff8fa3f7900000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f6972e228c8 CR3: 0000000104ba8000 CR4: 00000000000006e0
 Call Trace:
 <TASK>
 shmem_swapin_folio+0x274/0x980
 shmem_getpage_gfp+0x234/0x990
 shmem_read_mapping_page_gfp+0x36/0xf0
 shmem_sg_alloc_table+0x11b/0x250 [i915]
 shmem_get_pages+0xaa/0x310 [i915]
 __i915_gem_object_get_pages+0x31/0x40 [i915]
 i915_vma_pin_ww+0x69d/0x920 [i915]
 eb_validate_vmas+0x17d/0x7a0 [i915]
 ? eb_pin_engine+0x262/0x2d0 [i915]
 i915_gem_do_execbuffer+0xd43/0x2c00 [i915]
 ? refill_obj_stock+0x102/0x1a0
 ? unix_stream_read_generic+0x1ea/0xa60
 ? unix_stream_read_generic+0x1ea/0xa60
 ? _raw_spin_lock_irqsave+0x23/0x50
 ? _atomic_dec_and_lock_irqsave+0x38/0x60
 ? __active_retire+0xb7/0x100 [i915]
 ? _raw_spin_unlock_irqrestore+0x23/0x40
 ? dma_fence_signal+0x39/0x50
 ? dma_resv_iter_walk_unlocked.part.0+0x164/0x170
 i915_gem_execbuffer2_ioctl+0x115/0x270 [i915]
 ? i915_gem_do_execbuffer+0x2c00/0x2c00 [i915]
 drm_ioctl_kernel+0x9b/0x140
 ? __check_object_size+0x47/0x260
 drm_ioctl+0x21c/0x410
 ? i915_gem_do_execbuffer+0x2c00/0x2c00 [i915]
 ? exit_to_user_mode_prepare+0x17d/0x1f0
 __x64_sys_ioctl+0x8a/0xc0
 do_syscall_64+0x58/0x80
 ? syscall_exit_to_user_mode+0x17/0x40
 ? do_syscall_64+0x67/0x80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
 RIP: 0033:0x7fc1cf28da9f
 Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44
24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff
ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
 RSP: 002b:00007ffe5f52e1c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
 RAX: ffffffffffffffda RBX: 00007ffe5f52e250 RCX: 00007fc1cf28da9f
 RDX: 00007ffe5f52e250 RSI: 0000000040406469 RDI: 000000000000000d
 RBP: 000000000000000d R08: 00007fc1ce938410 R09: 00007fc1cf2fa4c0
 R10: 0000000000000000 R11: 0000000000000246 R12: 000055e2dde0d340
 R13: 0000000000000114 R14: 00007ffe5f52e250 R15: 00007fc1cdc49000
 </TASK>
 Modules linked in: tls rfcomm snd_seq_dummy snd_hrtimer xt_CHECKSUM
xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle
ip6table_nat ip6table_filter ip6_tables iptable_mangle iptable_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables bridge stp
llc bnep binfmt_misc btusb btrtl btbcm btintel btmtk bluetooth ecdh_generic
snd_hda_codec_analog snd_hda_codec_generic iwl3945 iwlegacy coretemp kvm_intel
mac80211 snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi libarc4 kvm
iTCO_wdt snd_hda_codec intel_pmc_bxt iTCO_vendor_support snd_hda_core
snd_hwdep snd_seq snd_seq_device cfg80211 irqbypass snd_pcm thinkpad_acpi
pcspkr joydev i2c_i801 snd_timer i2c_smbus ledtrig_audio wmi_bmof r592
platform_profile snd memstick rfkill lpc_ich soundcore acpi_cpufreq nfsd
auth_rpcgss nfs_acl lockd grace sunrpc fuse zram i915 sdhci_pci cqhci sdhci
drm_buddy drm_display_helper e1000e mmc_core cec serio_raw yenta_socket ttm
wmi video ata_generic pata_acpi
 scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
 ---[ end trace 0000000000000000 ]---
 RIP: 0010:shmem_add_to_page_cache+0x48e/0x500
 Code: 01 0f 84 0a fc ff ff 48 8d 4a ff 31 d2 48 39 cb 0f 85 ff fb ff ff e9
f6 fb ff ff 48 c7 c6 70 01 64 bb 48 89 df e8 f2 99 01 00 <0f> 0b 48 c7 c6 a0
1b 64 bb 48 89 df e8 e1 99 01 00 0f 0b 48 8b 13
 RSP: 0018:ffff9ce7c047f6b0 EFLAGS: 00010286
 RAX: 000000000000003f RBX: ffffe683c464b300 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: ffffffffbb67b8e8 RDI: 00000000ffffffff
 RBP: 0000000000023f97 R08: ffffffffbca122a0 R09: 64656b636f6c5f74
 R10: 747365745f6f696c R11: 6f6621284f494c4f R12: 00000000001120d4
 R13: ffff8fa2c6ae7890 R14: ffffe683c464b300 R15: 0000000000000001
 FS:  00007fc1cea31380(0000) GS:ffff8fa3f7900000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f6972e228c8 CR3: 0000000104ba8000 CR4: 00000000000006e0


Regards


Zdenek
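
A small aside on the register dump above: R09, R10 and R11 contain bytes that read as printable ASCII. The sketch below decodes them as little-endian 64-bit values, which is how x86_64 stores them. The fragments match parts of the assertion text printed in the log; the guess that they are leftovers from formatting that message is an assumption, not something the report states.

```python
import struct


def reg_to_ascii(value: int) -> str:
    """Interpret a 64-bit register value as 8 little-endian bytes of ASCII.

    Non-printable bytes are shown as '.'.
    """
    raw = struct.pack("<Q", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Register values copied from the oops in the report.
regs = {
    "R09": 0x64656B636F6C5F74,
    "R10": 0x747365745F6F696C,
    "R11": 0x6F6621284F494C4F,
}

decoded = {name: reg_to_ascii(value) for name, value in regs.items()}

for name in ("R11", "R10", "R09"):
    print(name, repr(decoded[name]))
```

The three fragments are pieces of "VM_BUG_ON_FOLIO(!folio_test_locked(folio))", so the registers add no information beyond what the log already prints.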


2022-06-22 21:38:10

by Rodrigo Vivi

Subject: Re: [Intel-gfx] i915: crash with 5.19-rc2


Hi Zdenek,

On Wed, Jun 22, 2022 at 01:18:42PM +0200, Zdenek Kabelac wrote:
> Hello
>
> While somewhat oldish hw (T61, 4G, C2D) - I've now witnessed new crash with Xorg:
>
> (happened while reopening iconified Firefox window  - running 'standard'
> rawhide -nodebug kernel 5.19.0-0.rc2.21.fc37.x86_64)

Any bisect possible?

If possible, could you please file a bug?
https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs

I know, I know, the account requirement :/
Even a report on the main kernel bugzilla is probably better than the email here.

Thanks,
Rodrigo

>
>  page:00000000577758b3 refcount:0 mapcount:0 mapping:0000000000000000
> index:0x1 pfn:0x1192cc
>  flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
>  raw: 0017ffffc0000000 ffffe683c47171c8 ffff8fa3f79377a8 0000000000000000
>  raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>  page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
>  ------------[ cut here ]------------
>  kernel BUG at mm/shmem.c:708!
>  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 1 PID: 42896 Comm: Xorg Not tainted 5.19.0-0.rc2.21.fc37.x86_64 #1
>  Hardware name: LENOVO 6464CTO/6464CTO, BIOS 7LETC9WW (2.29 ) 03/18/2011
>  RIP: 0010:shmem_add_to_page_cache+0x48e/0x500
>  Code: 01 0f 84 0a fc ff ff 48 8d 4a ff 31 d2 48 39 cb 0f 85 ff fb ff ff e9
> f6 fb ff ff 48 c7 c6 70 01 64 bb 48 89 df e8 f2 99 01 00 <0f> 0b 48 c7 c6 a0
> 1b 64 bb 48 89 df e8 e1 99 01 00 0f 0b 48 8b 13
>  RSP: 0018:ffff9ce7c047f6b0 EFLAGS: 00010286
>  RAX: 000000000000003f RBX: ffffe683c464b300 RCX: 0000000000000000
>  RDX: 0000000000000001 RSI: ffffffffbb67b8e8 RDI: 00000000ffffffff
>  RBP: 0000000000023f97 R08: ffffffffbca122a0 R09: 64656b636f6c5f74
>  R10: 747365745f6f696c R11: 6f6621284f494c4f R12: 00000000001120d4
>  R13: ffff8fa2c6ae7890 R14: ffffe683c464b300 R15: 0000000000000001
>  FS:  00007fc1cea31380(0000) GS:ffff8fa3f7900000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00007f6972e228c8 CR3: 0000000104ba8000 CR4: 00000000000006e0
>  Call Trace:
>  <TASK>
>  shmem_swapin_folio+0x274/0x980
>  shmem_getpage_gfp+0x234/0x990
>  shmem_read_mapping_page_gfp+0x36/0xf0
>  shmem_sg_alloc_table+0x11b/0x250 [i915]
>  shmem_get_pages+0xaa/0x310 [i915]
>  __i915_gem_object_get_pages+0x31/0x40 [i915]
>  i915_vma_pin_ww+0x69d/0x920 [i915]
>  eb_validate_vmas+0x17d/0x7a0 [i915]
>  ? eb_pin_engine+0x262/0x2d0 [i915]
>  i915_gem_do_execbuffer+0xd43/0x2c00 [i915]
>  ? refill_obj_stock+0x102/0x1a0
>  ? unix_stream_read_generic+0x1ea/0xa60
>  ? unix_stream_read_generic+0x1ea/0xa60
>  ? _raw_spin_lock_irqsave+0x23/0x50
>  ? _atomic_dec_and_lock_irqsave+0x38/0x60
>  ? __active_retire+0xb7/0x100 [i915]
>  ? _raw_spin_unlock_irqrestore+0x23/0x40
>  ? dma_fence_signal+0x39/0x50
>  ? dma_resv_iter_walk_unlocked.part.0+0x164/0x170
>  i915_gem_execbuffer2_ioctl+0x115/0x270 [i915]
>  ? i915_gem_do_execbuffer+0x2c00/0x2c00 [i915]
>  drm_ioctl_kernel+0x9b/0x140
>  ? __check_object_size+0x47/0x260
>  drm_ioctl+0x21c/0x410
>  ? i915_gem_do_execbuffer+0x2c00/0x2c00 [i915]
>  ? exit_to_user_mode_prepare+0x17d/0x1f0
>  __x64_sys_ioctl+0x8a/0xc0
>  do_syscall_64+0x58/0x80
>  ? syscall_exit_to_user_mode+0x17/0x40
>  ? do_syscall_64+0x67/0x80
>  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>  RIP: 0033:0x7fc1cf28da9f
>  Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44
> 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff
> ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
>  RSP: 002b:00007ffe5f52e1c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>  RAX: ffffffffffffffda RBX: 00007ffe5f52e250 RCX: 00007fc1cf28da9f
>  RDX: 00007ffe5f52e250 RSI: 0000000040406469 RDI: 000000000000000d
>  RBP: 000000000000000d R08: 00007fc1ce938410 R09: 00007fc1cf2fa4c0
>  R10: 0000000000000000 R11: 0000000000000246 R12: 000055e2dde0d340
>  R13: 0000000000000114 R14: 00007ffe5f52e250 R15: 00007fc1cdc49000
>  </TASK>
>  Modules linked in: tls rfcomm snd_seq_dummy snd_hrtimer xt_CHECKSUM
> xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 ip6table_mangle
> ip6table_nat ip6table_filter ip6_tables iptable_mangle iptable_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables bridge
> stp llc bnep binfmt_misc btusb btrtl btbcm btintel btmtk bluetooth
> ecdh_generic snd_hda_codec_analog snd_hda_codec_generic iwl3945 iwlegacy
> coretemp kvm_intel mac80211 snd_hda_intel snd_intel_dspcfg
> snd_intel_sdw_acpi libarc4 kvm iTCO_wdt snd_hda_codec intel_pmc_bxt
> iTCO_vendor_support snd_hda_core snd_hwdep snd_seq snd_seq_device cfg80211
> irqbypass snd_pcm thinkpad_acpi pcspkr joydev i2c_i801 snd_timer i2c_smbus
> ledtrig_audio wmi_bmof r592 platform_profile snd memstick rfkill lpc_ich
> soundcore acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc fuse zram
> i915 sdhci_pci cqhci sdhci drm_buddy drm_display_helper e1000e mmc_core cec
> serio_raw yenta_socket ttm wmi video ata_generic pata_acpi
>  scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
>  ---[ end trace 0000000000000000 ]---
>  RIP: 0010:shmem_add_to_page_cache+0x48e/0x500
>  Code: 01 0f 84 0a fc ff ff 48 8d 4a ff 31 d2 48 39 cb 0f 85 ff fb ff ff e9
> f6 fb ff ff 48 c7 c6 70 01 64 bb 48 89 df e8 f2 99 01 00 <0f> 0b 48 c7 c6 a0
> 1b 64 bb 48 89 df e8 e1 99 01 00 0f 0b 48 8b 13
>  RSP: 0018:ffff9ce7c047f6b0 EFLAGS: 00010286
>  RAX: 000000000000003f RBX: ffffe683c464b300 RCX: 0000000000000000
>  RDX: 0000000000000001 RSI: ffffffffbb67b8e8 RDI: 00000000ffffffff
>  RBP: 0000000000023f97 R08: ffffffffbca122a0 R09: 64656b636f6c5f74
>  R10: 747365745f6f696c R11: 6f6621284f494c4f R12: 00000000001120d4
>  R13: ffff8fa2c6ae7890 R14: ffffe683c464b300 R15: 0000000000000001
>  FS:  00007fc1cea31380(0000) GS:ffff8fa3f7900000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00007f6972e228c8 CR3: 0000000104ba8000 CR4: 00000000000006e0
>
>
> Regards
>
>
> Zdenek
>

2022-06-23 10:18:18

by Zdenek Kabelac

Subject: Re: [Intel-gfx] i915: crash with 5.19-rc2

On 22. 06. 22 at 22:46, Rodrigo Vivi wrote:
> Hi Zdenek,
>
> On Wed, Jun 22, 2022 at 01:18:42PM +0200, Zdenek Kabelac wrote:
>> Hello
>>
>> While somewhat oldish hw (T61, 4G, C2D) - I've now witnessed new crash with Xorg:
>>
>> (happened while reopening iconified Firefox window  - running 'standard'
>> rawhide -nodebug kernel 5.19.0-0.rc2.21.fc37.x86_64)
> any bisect possible?
>
> if possible, could you please file a bug?
> https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs
>
> I know I know, the account requirement :/
> also on main kernel bugzilla is probably better than the email here.


Hi


So far a bisect does not seem doable, since this crash happened after
several days of uptime and has happened just once so far (and I'm already on
-rc3 now).

If I find a more reliable way to hit this crash, I may try bisecting.
Meanwhile I just hope someone will have an idea of what has changed recently (I had
not seen such a crash with 5.18). I should add that I've been seeing
some GPU restarts lately that cause only a 'temporary hang' of the Xorg
desktop, but that's not such a big deal.

Zdenek

2022-08-10 10:08:40

by Zdenek Kabelac

Subject: Re: i915: crash with 5.19-rc2

On 22. 06. 22 at 13:18, Zdenek Kabelac wrote:
> Hello
>
> While somewhat oldish hw (T61, 4G, C2D) - I've now witnessed new crash with
> Xorg:
>
> (happened while reopening iconified Firefox window  - running 'standard'
> rawhide -nodebug kernel 5.19.0-0.rc2.21.fc37.x86_64)
>

Hello


Ok, I think I now know what is behind this BUG/crash of Intel graphics;
interestingly, it took me a few weeks to realize it.

With some Rawhide update I had installed the 'zram-generator' package
to use zram swap, to ease the memory pressure from Firefox & Thunderbird a bit
on this 4G RAM laptop. All worked fine. However, a side effect of using zram
swap turned out to be this occasional kernel BUG.

Since I stopped using zram swap, the machine has run for 2 weeks without a
single deadlock, with single or dual monitor setups and many suspends
& resumes in between.

So I'm nearly 100% sure that zram usage is triggering this issue. I
know this laptop is old and low on memory, so I'm not sure whether it's
worth fixing; maybe a good enough solution is to warn the user that
this old hardware should not be combined with zram. I'm open to
testing a fix, although I still don't have a simple way to trigger
this issue within a short period of time.

Maybe the driver fails to mark some pages as pinned in memory, so that zram
can't swap them out?


>  page:00000000577758b3 refcount:0 mapcount:0 mapping:0000000000000000
> index:0x1 pfn:0x1192cc
>  flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
>  raw: 0017ffffc0000000 ffffe683c47171c8 ffff8fa3f79377a8 0000000000000000
>  raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
>  page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
>  ------------[ cut here ]------------
>  kernel BUG at mm/shmem.c:708!
>  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>  CPU: 1 PID: 42896 Comm: Xorg Not tainted 5.19.0-0.rc2.21.fc37.x86_64 #1
>  Hardware name: LENOVO 6464CTO/6464CTO, BIOS 7LETC9WW (2.29 ) 03/18/2011
>  RIP: 0010:shmem_add_to_page_cache+0x48e/0x500
>  Code: 01 0f 84 0a fc ff ff 48 8d 4a ff 31 d2 48 39 cb 0f 85 ff fb ff ff e9
> f6 fb ff ff 48 c7 c6 70 01 64 bb 48 89 df e8 f2 99 01 00 <0f> 0b 48 c7 c6 a0
> 1b 64 bb 48 89 df e8 e1 99 01 00 0f 0b 48 8b 13
>  RSP: 0018:ffff9ce7c047f6b0 EFLAGS: 00010286
>  RAX: 000000000000003f RBX: ffffe683c464b300 RCX: 0000000000000000
>  RDX: 0000000000000001 RSI: ffffffffbb67b8e8 RDI: 00000000ffffffff
>  RBP: 0000000000023f97 R08: ffffffffbca122a0 R09: 64656b636f6c5f74
>  R10: 747365745f6f696c R11: 6f6621284f494c4f R12: 00000000001120d4
>  R13: ffff8fa2c6ae7890 R14: ffffe683c464b300 R15: 0000000000000001
>  FS:  00007fc1cea31380(0000) GS:ffff8fa3f7900000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00007f6972e228c8 CR3: 0000000104ba8000 CR4: 00000000000006e0
>  Call Trace:
>  <TASK>
>  shmem_swapin_folio+0x274/0x980
>  shmem_getpage_gfp+0x234/0x990
>  shmem_read_mapping_page_gfp+0x36/0xf0
>  shmem_sg_alloc_table+0x11b/0x250 [i915]



Regards


Zdenek
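
For anyone trying to reproduce this, it may help to confirm whether zram swap is actually active before and after removing 'zram-generator'. The following is a minimal sketch that parses text in the format of /proc/swaps and lists zram-backed swap devices. The helper name `zram_swap_devices` and the sample contents are made up for illustration; the sample is not output from the reporter's machine.

```python
def zram_swap_devices(swaps_text: str) -> list:
    """Return swap device paths under /dev/zram* from /proc/swaps-style text."""
    devices = []
    # The first line of /proc/swaps is a column header, so skip it.
    for line in swaps_text.splitlines()[1:]:
        fields = line.split()
        if fields and fields[0].startswith("/dev/zram"):
            devices.append(fields[0])
    return devices


# Hypothetical sample in the layout of /proc/swaps (illustrative values only).
SAMPLE = (
    "Filename  Type       Size     Used  Priority\n"
    "/dev/zram0  partition  4194300  0     100\n"
    "/dev/sda2   partition  8388604  0     -2\n"
)

found = zram_swap_devices(SAMPLE)
print(found)
```

On a live system the same function could be fed the contents of /proc/swaps, for example `zram_swap_devices(open("/proc/swaps").read())`; an empty list would mean no zram swap device is in use.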


2022-08-10 16:00:27

by Hugh Dickins

Subject: Re: i915: crash with 5.19-rc2

On Wed, 10 Aug 2022, Zdenek Kabelac wrote:
> On 22. 06. 22 at 13:18, Zdenek Kabelac wrote:
> > Hello
> >
> > While somewhat oldish hw (T61, 4G, C2D) - I've now witnessed new crash with
> > Xorg:
> >
> > (happened while reopening iconified Firefox window  - running 'standard'
> > rawhide -nodebug kernel 5.19.0-0.rc2.21.fc37.x86_64)
> >
>
> Hello
>
>
> Ok, I think I now know what is behind this BUG/crash of Intel graphics;
> interestingly, it took me a few weeks to realize it.
>
> With some Rawhide update I had installed the 'zram-generator' package
> to use zram swap, to ease the memory pressure from Firefox & Thunderbird a bit
> on this 4G RAM laptop. All worked fine. However, a side effect of using zram
> swap turned out to be this occasional kernel BUG.
>
> Since I stopped using zram swap, the machine has run for 2 weeks without a
> single deadlock, with single or dual monitor setups and many suspends
> & resumes in between.
>
> So I'm nearly 100% sure that zram usage is triggering this issue. I
> know this laptop is old and low on memory, so I'm not sure whether it's
> worth fixing; maybe a good enough solution is to warn the user that
> this old hardware should not be combined with zram. I'm open to
> testing a fix, although I still don't have a simple way to trigger
> this issue within a short period of time.
>
> Maybe the driver fails to mark some pages as pinned in memory, so that zram
> can't swap them out?
>
>
> >  page:00000000577758b3 refcount:0 mapcount:0 mapping:0000000000000000
> > index:0x1 pfn:0x1192cc
> >  flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
> >  raw: 0017ffffc0000000 ffffe683c47171c8 ffff8fa3f79377a8 0000000000000000
> >  raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
> >  page dumped because: VM_BUG_ON_FOLIO(!folio_test_locked(folio))
> >  ------------[ cut here ]------------
> >  kernel BUG at mm/shmem.c:708!
> >  invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> >  CPU: 1 PID: 42896 Comm: Xorg Not tainted 5.19.0-0.rc2.21.fc37.x86_64 #1
> >  Hardware name: LENOVO 6464CTO/6464CTO, BIOS 7LETC9WW (2.29 ) 03/18/2011
> >  RIP: 0010:shmem_add_to_page_cache+0x48e/0x500
> >  Code: 01 0f 84 0a fc ff ff 48 8d 4a ff 31 d2 48 39 cb 0f 85 ff fb ff ff e9
> > f6 fb ff ff 48 c7 c6 70 01 64 bb 48 89 df e8 f2 99 01 00 <0f> 0b 48 c7 c6 a0
> > 1b 64 bb 48 89 df e8 e1 99 01 00 0f 0b 48 8b 13
> >  RSP: 0018:ffff9ce7c047f6b0 EFLAGS: 00010286
> >  RAX: 000000000000003f RBX: ffffe683c464b300 RCX: 0000000000000000
> >  RDX: 0000000000000001 RSI: ffffffffbb67b8e8 RDI: 00000000ffffffff
> >  RBP: 0000000000023f97 R08: ffffffffbca122a0 R09: 64656b636f6c5f74
> >  R10: 747365745f6f696c R11: 6f6621284f494c4f R12: 00000000001120d4
> >  R13: ffff8fa2c6ae7890 R14: ffffe683c464b300 R15: 0000000000000001
> >  FS:  00007fc1cea31380(0000) GS:ffff8fa3f7900000(0000)
> > knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 00007f6972e228c8 CR3: 0000000104ba8000 CR4: 00000000000006e0
> >  Call Trace:
> >  <TASK>
> >  shmem_swapin_folio+0x274/0x980
> >  shmem_getpage_gfp+0x234/0x990
> >  shmem_read_mapping_page_gfp+0x36/0xf0
> >  shmem_sg_alloc_table+0x11b/0x250 [i915]

Sorry, I never noticed your original report in June.

This is not a bug in zram or i915, but what Matthew fixes in
https://lore.kernel.org/lkml/[email protected]/

I am a little surprised to see it hitting i915, since I had thought it
could only affect gma500: but looks like 965gm has similar limitations,
and so I expect that's what's on your laptop there.

Hugh
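
One way to relate the page dump to the "limitations" mentioned above is to convert the reported pfn into a physical address. The sketch below does that under two labeled assumptions that the thread does not confirm: the page size is 4 KiB (the usual x86_64 base page size), and the limitation in question is that the GPU needs pages below 4 GiB.

```python
PAGE_SHIFT = 12          # assumption: 4 KiB base pages
FOUR_GIB = 1 << 32       # assumption: the relevant limit is the 4 GiB boundary


def pfn_to_phys(pfn: int) -> int:
    """Convert a page frame number to the physical address of the page start."""
    return pfn << PAGE_SHIFT


def above_4gib(pfn: int) -> bool:
    """True if the page starts at or above the 4 GiB physical boundary."""
    return pfn_to_phys(pfn) >= FOUR_GIB


pfn = 0x1192CC  # from "index:0x1 pfn:0x1192cc" in the page dump
print(hex(pfn_to_phys(pfn)), above_4gib(pfn))
```

Under those assumptions the dumped page sits above 4 GiB, which would be consistent with a page that first has to be replaced by one the device can address. Whether that is the exact path hit here is not stated in the thread.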

2022-08-10 16:48:41

by Matthew Wilcox

Subject: Re: i915: crash with 5.19-rc2

On Wed, Aug 10, 2022 at 08:55:32AM -0700, Hugh Dickins wrote:
> This is not a bug in zram or i915, but what Matthew fixes in
> https://lore.kernel.org/lkml/[email protected]/

Thanks for tracking that down, Hugh. Nice to know it's a crash rather
than a data corruption. The fix is in Andrew's tree, so I think it's
already destined for upstream soon.

Andrew, I have two fixes that I don't see in your tree:

https://lore.kernel.org/linux-mm/[email protected]/T/#u
https://lore.kernel.org/linux-mm/[email protected]/T/#u

The first is of minor importance, the second I believe Hugh has hit in
his testing.

2022-08-10 22:27:11

by Andrew Morton

Subject: Re: i915: crash with 5.19-rc2

On Wed, 10 Aug 2022 17:09:37 +0100 Matthew Wilcox <[email protected]> wrote:

> On Wed, Aug 10, 2022 at 08:55:32AM -0700, Hugh Dickins wrote:
> > This is not a bug in zram or i915, but what Matthew fixes in
> > https://lore.kernel.org/lkml/[email protected]/
>
> Thanks for tracking that down, Hugh. Nice to know it's a crash rather
> than a data corruption. The fix is in Andrew's tree, so I think it's
> already destined for upstream soon.

Yes, that's in the hotfixes queue for sending upstream Fridayish.

> Andrew, I have two fixes that I don't see in your tree:

Is it expected to be in my tree? It's a huge v1 patch series on which
I wasn't cc'ed?

> https://lore.kernel.org/linux-mm/[email protected]/T/#u
> https://lore.kernel.org/linux-mm/[email protected]/T/#u
>
> The first is of minor importance, the second I believe Hugh has hit in
> his testing.

In that case the second patch should be pulled out of that series, have
its changelog made to describe the runtime effects and have a Cc:stable
added, please.