2022-11-17 20:19:08

by Mirsad Todorovac

Subject: RCU stalls in squashfs_readahead()

Dear all,

I've noticed two similar RCU stalls with nearly identical call traces, both involving squashfs_readahead().

Configuration is (excerpt):

CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE=16000
# CONFIG_DEBUG_KMEMLEAK_TEST is not set
# CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_KASAN_OUTLINE=y
# CONFIG_KASAN_INLINE is not set
CONFIG_KASAN_STACK=y
# CONFIG_KASAN_VMALLOC is not set
# CONFIG_KASAN_MODULE_TEST is not set

Do you think this is worth a bisect? I am not yet certain I can reproduce the stall; however, it occurs under heavy memory use and
heavy swapping (I was experimenting with the efficiency of the multi-gen LRU in 6.1 with multimedia under heavy load and low
memory ...).
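If it helps with reproduction, the access pattern boils down to reading every file on a compressed (squashfs) mount while the machine is already swapping. A minimal sketch of that reader half (the mount point and the parallel memory hog are my assumptions, not exact commands from this report):

```shell
# Sketch of one half of a reproducer: read every regular file under a
# squashfs mount so squashfs_readahead()/xz decompression are exercised
# repeatedly. Run it while something else keeps memory under pressure.
read_all_files() {
    # $1 is the mount point, e.g. a snap image under /snap (assumption).
    find "$1" -type f 2>/dev/null |
    while IFS= read -r f; do
        # Discard the data; the point is only to drive readahead.
        cat -- "$f" > /dev/null 2>&1 || true
    done
    echo "done reading $1"
}

# Assumed usage, alongside a memory hog such as stress-ng --vm:
#   read_all_files /snap/core20/current
```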

Here is an excerpt from dmesg:

[ 1326.301544] perf: interrupt took too long (3181 > 3178), lowering kernel.perf_event_max_sample_rate to 62750
[ 1347.269247] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 3-.... } 6 jiffies s: 1317 root: 0x8/.
[ 1347.269294] rcu: blocking rcu_node structures (internal RCU debug):
[ 1347.269310] Sending NMI from CPU 2 to CPUs 3:
[ 1347.269329] NMI backtrace for cpu 3
[ 1347.269343] CPU: 3 PID: 7907 Comm: Isolated Web Co Not tainted 6.1.0-rc5 #1
[ 1347.269357] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN49WW 07/21/2022
[ 1347.269363] RIP: 0010:__asan_load8+0x3e/0xb0
[ 1347.269386] Code: b8 00 00 00 00 00 00 00 ff eb 0a 48 b8 00 00 00 00 00 80 ff ff 48 39 f8 77 44 48 8d 47 07 48 89 c2 83 e2 07 48
83 fa 07 75 1c <48> ba 00 00 00 00 00 fc ff df 48 c1 e8 03 0f b6 04 10 84 c0 75 30
[ 1347.269398] RSP: 0018:ffff888102eef288 EFLAGS: 00000246
[ 1347.269410] RAX: ffff888109280047 RBX: ffff8881157da0ca RCX: ffffffff8fca8434
[ 1347.269419] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff888109280040
[ 1347.269427] RBP: ffff888102eef288 R08: 00000000000003ae R09: ffffed1021250de8
[ 1347.269435] R10: ffff888109286f39 R11: ffffed1021250de7 R12: 0000000000000155
[ 1347.269443] R13: ffffc9000122ff18 R14: ffff888109280000 R15: ffff888109280000
[ 1347.269451] FS: 00007efe541e9380(0000) GS:ffff888252f80000(0000) knlGS:0000000000000000
[ 1347.269462] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1347.269470] CR2: 00007fa7a15e2000 CR3: 000000013100a003 CR4: 0000000000770ee0
[ 1347.269479] PKRU: 55555554
[ 1347.269484] Call Trace:
[ 1347.269488] <TASK>
[ 1347.269494] lzma_main+0x3b4/0x12b0
[ 1347.269517] lzma2_lzma+0x2dd/0x460
[ 1347.269535] xz_dec_lzma2_run+0x11e/0xba0
[ 1347.269549] ? __asan_load4+0x8/0xb0
[ 1347.269564] ? crc32_le_base+0x179/0x250
[ 1347.269581] xz_dec_run+0x57b/0x11a0
[ 1347.269600] squashfs_xz_uncompress+0x196/0x370
[ 1347.269620] squashfs_decompress+0x8e/0xd0
[ 1347.269633] ? lzo_uncompress+0x400/0x400
[ 1347.269648] squashfs_read_data+0x1e6/0x900
[ 1347.269662] ? kmemleak_alloc+0x4b/0x80
[ 1347.269678] ? squashfs_bio_read.isra.0+0x230/0x230
[ 1347.269694] ? squashfs_page_actor_init_special+0x1bb/0x230
[ 1347.269710] squashfs_readahead+0xa86/0xe70
[ 1347.269722] ? put_pages_list+0x1e0/0x1e0
[ 1347.269747] ? squashfs_fill_page+0x190/0x190
[ 1347.269763] ? psi_task_change+0x106/0x130
[ 1347.269781] ? raw_spin_rq_unlock+0x17/0x60
[ 1347.269796] read_pages+0x12d/0x530
[ 1347.269814] ? file_ra_state_init+0x60/0x60
[ 1347.269829] ? xas_free_nodes+0x170/0x170
[ 1347.269844] ? filemap_alloc_folio+0xcf/0x120
[ 1347.269863] page_cache_ra_unbounded+0x1aa/0x280
[ 1347.269878] ? __rcu_read_unlock+0x57/0x270
[ 1347.269898] do_page_cache_ra+0x7c/0x90
[ 1347.269915] page_cache_ra_order+0x3b4/0x420
[ 1347.269936] filemap_fault+0x849/0xe40
[ 1347.269951] ? read_cache_page_gfp+0x90/0x90
[ 1347.269970] __do_fault+0x76/0x1e0
[ 1347.269984] do_fault+0x1e5/0x6c0
[ 1347.269998] __handle_mm_fault+0x905/0x1740
[ 1347.270015] ? copy_page_range+0x2130/0x2130
[ 1347.270028] ? mas_find+0x100/0x100
[ 1347.270050] handle_mm_fault+0x11c/0x3b0
[ 1347.270065] do_user_addr_fault+0x261/0x8b0
[ 1347.270084] exc_page_fault+0x61/0xf0
[ 1347.270101] asm_exc_page_fault+0x27/0x30
[ 1347.270115] RIP: 0033:0x7efe544ce8b4
[ 1347.270126] Code: d5 c4 e1 f9 7e d0 c4 e1 f9 7e d7 48 c1 f8 35 48 c1 ff 2e 48 89 c1 48 89 fe b8 ff 01 00 00 29 c8 83 e6 7f 48 8d
0d ec 45 07 00 <c5> fb 10 04 f1 48 98 48 8d 0d fe 44 07 00 c5 fb 59 0c c1 c5 f3 59
[ 1347.270137] RSP: 002b:00007ffc019a6e30 EFLAGS: 00010206
[ 1347.270147] RAX: 0000000000000005 RBX: 00007ffc019a6f50 RCX: 00007efe54542ea0
[ 1347.270156] RDX: ffffffffbfefe12e RSI: 000000000000007b RDI: 000000000000fd7b
[ 1347.270164] RBP: 00007ffc019a6f58 R08: 00007ffc019a6f40 R09: 00000000006f71c0
[ 1347.270173] R10: 00007efe547fc500 R11: 00007efe54459b60 R12: 00007ffc019a7060
[ 1347.270181] R13: 00007ffc019a6f48 R14: 00007ffc019a7040 R15: 00007ffc019a6f40
[ 1347.270199] </TASK>


[ 1534.865687] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 3-.... } 7 jiffies s: 1393 root: 0x8/.
[ 1534.865725] rcu: blocking rcu_node structures (internal RCU debug):
[ 1534.865735] Sending NMI from CPU 2 to CPUs 3:
[ 1534.865752] NMI backtrace for cpu 3
[ 1534.865762] CPU: 3 PID: 9629 Comm: FS Broker 9627 Not tainted 6.1.0-rc5 #1
[ 1534.865776] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN49WW 07/21/2022
[ 1534.865782] RIP: 0010:__asan_load4+0x8/0xb0
[ 1534.865802] Code: cc cc cc cc 48 c1 e8 03 80 3c 10 00 75 e9 5d c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 55 48 89
e5 4c 8b 45 08 <48> 83 ff fb 77 63 eb 0f 0f 1f 00 48 b8 00 00 00 00 00 00 00 ff eb
[ 1534.865811] RSP: 0018:ffff88811fd3f288 EFLAGS: 00000246
[ 1534.865823] RAX: 0000000000000000 RBX: 000000000001d378 RCX: dffffc0000000000
[ 1534.865831] RDX: 0000000000000003 RSI: 000000000000077f RDI: ffff888109280080
[ 1534.865838] RBP: ffff88811fd3f288 R08: ffffffff8fca8622 R09: 000000000001d380
[ 1534.865847] R10: ffff888114694020 R11: ffffed102843b9ff R12: ffff888109280006
[ 1534.865854] R13: ffff888109280000 R14: ffff888109280000 R15: 0000000002bbab0b
[ 1534.865861] FS: 00007fe923eff700(0000) GS:ffff888252f80000(0000) knlGS:0000000000000000
[ 1534.865871] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1534.865878] CR2: 00007fe90fe90ac0 CR3: 0000000138d5c003 CR4: 0000000000770ee0
[ 1534.865885] PKRU: 55555554
[ 1534.865889] Call Trace:
[ 1534.865893] <TASK>
[ 1534.865898] lzma_main+0x5a2/0x12b0
[ 1534.865920] lzma2_lzma+0x2dd/0x460
[ 1534.865935] xz_dec_lzma2_run+0x11e/0xba0
[ 1534.865948] ? crc32_le_base+0x58/0x250
[ 1534.865961] ? crc32_le_base+0x1e3/0x250
[ 1534.865977] xz_dec_run+0x57b/0x11a0
[ 1534.865995] squashfs_xz_uncompress+0x196/0x370
[ 1534.866011] squashfs_decompress+0x8e/0xd0
[ 1534.866022] ? lzo_uncompress+0x400/0x400
[ 1534.866034] squashfs_read_data+0x1e6/0x900
[ 1534.866047] ? kmemleak_alloc+0x4b/0x80
[ 1534.866061] ? squashfs_bio_read.isra.0+0x230/0x230
[ 1534.866074] ? squashfs_page_actor_init_special+0x1bb/0x230
[ 1534.866087] squashfs_readahead+0xa86/0xe70
[ 1534.866099] ? xas_create+0x141/0x600
[ 1534.866119] ? squashfs_fill_page+0x190/0x190
[ 1534.866136] ? psi_task_change+0x106/0x130
[ 1534.866155] ? raw_spin_rq_unlock+0x17/0x60
[ 1534.866171] read_pages+0x12d/0x530
[ 1534.866186] ? __kasan_check_read+0x11/0x20
[ 1534.866200] ? folio_batch_add_and_move+0x8d/0xa0
[ 1534.866217] ? file_ra_state_init+0x60/0x60
[ 1534.866231] ? folio_add_lru+0x47/0x70
[ 1534.866249] ? filemap_alloc_folio+0xcf/0x120
[ 1534.866267] page_cache_ra_unbounded+0x1eb/0x280
[ 1534.866280] ? __rcu_read_unlock+0x57/0x270
[ 1534.866297] do_page_cache_ra+0x7c/0x90
[ 1534.866312] page_cache_ra_order+0x3b4/0x420
[ 1534.866330] filemap_fault+0x849/0xe40
[ 1534.866343] ? read_cache_page_gfp+0x90/0x90
[ 1534.866359] __do_fault+0x76/0x1e0
[ 1534.866372] do_fault+0x1e5/0x6c0
[ 1534.866384] __handle_mm_fault+0x905/0x1740
[ 1534.866399] ? copy_page_range+0x2130/0x2130
[ 1534.866412] ? mas_find+0x100/0x100
[ 1534.866428] handle_mm_fault+0x11c/0x3b0
[ 1534.866443] do_user_addr_fault+0x261/0x8b0
[ 1534.866456] ? exit_to_user_mode_prepare+0x9b/0x190
[ 1534.866472] exc_page_fault+0x61/0xf0
[ 1534.866486] asm_exc_page_fault+0x27/0x30
[ 1534.866498] RIP: 0033:0x7fe93673dd00
[ 1534.866515] Code: Unable to access opcode bytes at 0x7fe93673dcd6.
[ 1534.866521] RSP: 002b:00007fe923efeea8 EFLAGS: 00010246
[ 1534.866532] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[ 1534.866539] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
[ 1534.866545] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fe837100098
[ 1534.866553] R10: 00007fe868300de8 R11: 0000000000000217 R12: 00007fff95860b2e
[ 1534.866561] R13: 00007fff95860b2f R14: 00007fff95860bf8 R15: 00007fe923efef80
[ 1534.866576] </TASK>

Thank you,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union


Attachments:
config-6.1.0-rc5.xz (55.37 kB)
dmesg.log.xz (25.52 kB)
lshw.txt.xz (5.13 kB)

2022-11-17 23:44:56

by Mirsad Todorovac

Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

Hi,

While trying to bisect, I've found another bug that predates the introduction of squashfs_readahead(), but it shares a common
denominator: squashfs_decompress() and squashfs_xz_uncompress().

Frankly, I need advice on how to handle a situation like this.

Obviously, the best place to test the RCU problem with squashfs_readahead() is the commit where it was introduced, plus Phillip's patches [012]/3.
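Since any automated bisect needs a way to tell a "bad" boot from a "good" one, a predicate along these lines could serve as the classifier for "git bisect run" (this helper is my own sketch, not something from the kernel tree; it assumes the dmesg of the boot under test has been saved to a file):

```shell
# Hypothetical classifier for an automated bisect of the stall.
# git bisect run treats exit 0 as "good" and 1..124 as "bad".

detects_stall() {
    # Every stall report in the logs above carries this rcu_preempt marker.
    grep -q 'rcu_preempt detected expedited stalls' "$1"
}

classify_boot() {
    # $1: path to the saved dmesg of the kernel under test (assumption).
    if detects_stall "$1"; then
        echo "stall signature found: bad"
        return 1
    fi
    echo "no stall signature: good"
    return 0
}

# Assumed usage after booting each bisect kernel and capturing its log:
#   classify_boot /path/to/dmesg.log; exit $?
```

Commits affected by the older, unrelated squashfs_xz_uncompress() bug could then be marked with "git bisect skip" rather than good/bad, so the two problems do not get conflated.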

Thanks,
Mirsad

P.S.

This is an excerpt from the attached dmesg log:

[ 25.338169] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 4-... } 6 jiffies s: 373 root: 0x10/.
[ 25.338202] rcu: blocking rcu_node structures (internal RCU debug):
[ 25.338210] Task dump for CPU 4:
[ 25.338216] task:canonical-livep state:R running task stack: 0 pid: 913 ppid: 1 flags:0x00004008
[ 25.338237] Call Trace:
[ 25.338243] <TASK>
[ 25.338249] ? __kasan_check_write+0x14/0x20
[ 25.338272] ? _raw_spin_lock_irqsave+0x93/0xf0
[ 25.338293] ? sysvec_apic_timer_interrupt+0x89/0x90
[ 25.338311] ? asm_sysvec_apic_timer_interrupt+0x20/0x30
[ 25.338327] ? sysvec_apic_timer_interrupt+0x89/0x90
[ 25.338342] ? sysvec_apic_timer_interrupt+0x89/0x90
[ 25.338359] ? asm_sysvec_apic_timer_interrupt+0x20/0x30
[ 25.338377] ? lzma_len+0x1fb/0x3b0
[ 25.338395] ? dict_repeat+0xec/0x120
[ 25.338413] ? lzma_main+0x10d/0x1260
[ 25.338437] ? lzma2_lzma+0x2a4/0x3b0
[ 25.338457] ? xz_dec_lzma2_run+0x11f/0xb90
[ 25.338483] ? xz_dec_run+0x346/0x11e0
[ 25.338506] ? squashfs_xz_uncompress+0x135/0x330
[ 25.338530] ? lzo_init+0xd0/0xd0
[ 25.338546] ? squashfs_decompress+0x88/0xc0
[ 25.338565] ? squashfs_read_data+0x1e5/0x8e0
[ 25.338582] ? __filemap_get_folio+0x37e/0x540
[ 25.338602] ? squashfs_bio_read.isra.2+0x220/0x220
[ 25.338628] ? squashfs_readpage_block+0x493/0x950
[ 25.338650] ? squashfs_read_folio+0xa4c/0x1140
[ 25.338671] ? xas_nomem+0x2f/0x100
[ 25.338689] ? squashfs_copy_cache+0x360/0x360
[ 25.338707] ? folio_wait_bit+0x3c0/0x3c0
[ 25.338725] ? __kasan_check_read+0x11/0x20
[ 25.338739] ? pagevec_add_and_need_flush+0x8c/0xa0
[ 25.338757] ? folio_add_lru+0x42/0x70
[ 25.338772] ? filemap_add_folio+0xd4/0x130
[ 25.338788] ? add_to_page_cache_locked+0xa0/0xa0
[ 25.338806] ? filemap_read_folio.isra.62+0xbe/0x4d0
[ 25.338822] ? __kasan_check_write+0x14/0x20
[ 25.338838] ? filemap_page_mkwrite+0x6a0/0x6a0
[ 25.338856] ? filemap_add_folio+0x130/0x130
[ 25.338879] ? filemap_fault+0x1041/0x1310
[ 25.338895] ? __kasan_check_read+0x11/0x20
[ 25.338918] ? read_cache_page_gfp+0xd0/0xd0
[ 25.338932] ? __rcu_read_unlock+0x55/0x260
[ 25.338953] ? try_to_wake_up+0x3c2/0xa10
[ 25.338972] ? __filemap_get_folio+0x540/0x540
[ 25.338985] ? __pmd+0x10/0x10
[ 25.339002] ? __do_fault+0x7c/0x1b0
[ 25.339019] ? do_fault+0x1ce/0x6a0
[ 25.339036] ? __handle_mm_fault+0x9cb/0x1490
[ 25.339057] ? copy_page_range+0x1b90/0x1b90
[ 25.339073] ? kernel_fpu_begin_mask+0x160/0x160
[ 25.339092] ? __ia32_sys_futex_time32+0x300/0x300
[ 25.339119] ? handle_mm_fault+0x11b/0x380
[ 25.339138] ? do_user_addr_fault+0x258/0x810
[ 25.339157] ? exc_page_fault+0x60/0xe0
[ 25.339173] ? asm_exc_page_fault+0x2c/0x40
[ 25.339194] </TASK>



On 17. 11. 2022. 20:21, Mirsad Goran Todorovac wrote:
> [...]

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union


Attachments:
dmesg-5.19-rc1.log.xz (26.81 kB)

2022-11-17 23:46:52

by Mirsad Todorovac

Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

Dear Mr. Lougher,

On 18. 11. 2022. 00:05, Mirsad Goran Todorovac wrote:
> [...]
>> [ 1534.866280]  ? __rcu_read_unlock+0x57/0x270
>> [ 1534.866297]  do_page_cache_ra+0x7c/0x90
>> [ 1534.866312]  page_cache_ra_order+0x3b4/0x420
>> [ 1534.866330]  filemap_fault+0x849/0xe40
>> [ 1534.866343]  ? read_cache_page_gfp+0x90/0x90
>> [ 1534.866359]  __do_fault+0x76/0x1e0
>> [ 1534.866372]  do_fault+0x1e5/0x6c0
>> [ 1534.866384]  __handle_mm_fault+0x905/0x1740
>> [ 1534.866399]  ? copy_page_range+0x2130/0x2130
>> [ 1534.866412]  ? mas_find+0x100/0x100
>> [ 1534.866428]  handle_mm_fault+0x11c/0x3b0
>> [ 1534.866443]  do_user_addr_fault+0x261/0x8b0
>> [ 1534.866456]  ? exit_to_user_mode_prepare+0x9b/0x190
>> [ 1534.866472]  exc_page_fault+0x61/0xf0
>> [ 1534.866486]  asm_exc_page_fault+0x27/0x30
>> [ 1534.866498] RIP: 0033:0x7fe93673dd00
>> [ 1534.866515] Code: Unable to access opcode bytes at 0x7fe93673dcd6.
>> [ 1534.866521] RSP: 002b:00007fe923efeea8 EFLAGS: 00010246
>> [ 1534.866532] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
>> [ 1534.866539] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
>> [ 1534.866545] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fe837100098
>> [ 1534.866553] R10: 00007fe868300de8 R11: 0000000000000217 R12: 00007fff95860b2e
>> [ 1534.866561] R13: 00007fff95860b2f R14: 00007fff95860bf8 R15: 00007fe923efef80
>> [ 1534.866576]  </TASK>

Against which exact state of the tree should your patches phillip_[012]_3.patch be applied?
In https://lore.kernel.org/lkml/[email protected]/#Z31fs:squashfs:file.c
you say that the patch remedies commit 8fc78b6fe24c:

Fixes: 8fc78b6fe24c ("squashfs: implement readahead")

However, the patch does not apply on top of that commit:

marvin@marvin-IdeaPad-3-15ITL6:~/linux/kernel/linux_stable_build$ git apply --ignore-whitespace ../phillip_0_3.patch
error: patch failed: fs/squashfs/file.c:593
error: fs/squashfs/file.c: patch does not apply
error: patch failed: fs/squashfs/page_actor.c:71
error: fs/squashfs/page_actor.c: patch does not apply
error: patch failed: fs/squashfs/page_actor.h:29
error: fs/squashfs/page_actor.h: patch does not apply

marvin@marvin-IdeaPad-3-15ITL6:~/linux/kernel/linux_stable_build$ git show | head
commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
Author: Hsin-Yi Wang <[email protected]>
Date: Fri Jun 17 16:38:13 2022 +0800

squashfs: implement readahead

Implement readahead callback for squashfs. It will read datablocks which
cover pages in readahead request. For a few cases it will not mark page
as uptodate, including:

marvin@marvin-IdeaPad-3-15ITL6:~/linux/kernel/linux_stable_build$
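When a patch fails to apply like this, one way to locate the commit it was actually generated against is to walk candidate commits and test each with `git apply --check`. Below is a minimal, self-contained sketch of that technique in a throwaway repository; the file names and commits are invented for illustration and are not the real squashfs tree or the phillip_*.patch files:

```shell
# Demo: find the base commit of a patch by probing with `git apply --check`.
# Everything here is hypothetical; only the technique matters.
tmp=$(mktemp -d)
cd "$tmp" && git init -q demo && cd demo
git config user.email demo@example.com
git config user.name demo

printf 'line one\n' > file.c
git add file.c && git commit -qm v1           # the patch's real base

printf 'line one\nline two\n' > file.c
git diff > ../fix.patch                       # patch generated against v1
git checkout -q -- file.c

printf 'rewritten\n' > file.c                 # v2 diverges from the base
git add file.c && git commit -qm v2

# At v2 the hunk context no longer matches, so the patch is rejected:
at_v2=$(git apply --check ../fix.patch 2>/dev/null && echo applies || echo fails)

# Walking back one commit finds the base where it applies cleanly:
git checkout -q HEAD~1
at_v1=$(git apply --check ../fix.patch 2>/dev/null && echo applies || echo fails)

echo "v2: $at_v2, v1: $at_v1"
```

In the real case one would check out 8fc78b6fe24c (and nearby commits) and run `git apply --check` there before committing to a full rebuild.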

I need this information for bisecting the RCU stall bug in squashfs_readahead(),
and to see whether it happens immediately at that commit; it would give me a
lower bound for "git bisect".
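Once good and bad endpoints are known, the bisect itself can be automated with `git bisect run`, which drives the search using any script that exits 0 for a good commit and non-zero for a bad one. Here is a toy, self-contained sketch in a throwaway repository (the repo and the "bug" are invented; in the real case the endpoints would be something like 8fc78b6fe24c and a known-bad later commit, and the run script would build and boot-test the kernel):

```shell
# Toy `git bisect run` demo: commit c4 "introduces the bug"
# (the tracked value reaches 4), and bisect pinpoints it.
tmp=$(mktemp -d)
cd "$tmp" && git init -q
git config user.email demo@example.com
git config user.name demo
for i in 1 2 3 4 5; do
    echo "$i" > n
    git add n && git commit -qm "c$i"
done

git bisect start HEAD HEAD~4 >/dev/null 2>&1   # HEAD (c5) bad, c1 good
git bisect run sh -c 'test "$(cat n)" -lt 4' >/dev/null 2>&1

# The first bad commit is recorded in refs/bisect/bad until reset:
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```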

Thank you,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union


2022-11-18 06:40:20

by Phillip Lougher

[permalink] [raw]
Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

On 17/11/2022 23:05, Mirsad Goran Todorovac wrote:
> Hi,
>
> While trying to bisect, I've found another bug that predated the
> introduction of squashfs_readahead(), but it has
> a common denominator in squashfs_decompress() and squashfs_xz_uncompress().

Wrong, the stall is happening in the XZ decompressor library, which
is *not* in Squashfs.

This reported stall in the decompressor code is likely a symptom of you
deliberately thrashing your system. When the system thrashes, everything
starts to happen very slowly: the system spends a lot of its time doing
page I/O, and the CPU spends a lot of time in CPU-intensive code such as
the XZ decompressor library.

So the fact the stall is being hit here is a symptom and not
a cause. The decompressor code is likely running slowly due to
thrashing and waiting on paged-out buffers. This is not indicative
of any bug, merely a system running slowly due to overload.

As I said, this is not a Squashfs issue, because the code running when
the stall takes place isn't in Squashfs.

The people responsible for the RCU code should have much more insight
into what happens when the system is thrashing, and how that can throw
up false positives. I believe this is an instance of perfectly correct
code, running slowly due to thrashing, being incorrectly flagged as
looping.

CC'ing Paul E. McKenney <[email protected]>

Phillip



>
> Frankly, I need advice on how to handle a situation like this.
>
> Obviously, the best place to test the RCU problem with
> squashfs_readahead() is where it is introduced + Phillip's patches [012]/3.
>
> Thanks,
> Mirsad
>
> P.S.
>
> This is excerpt from dmesg log attached:
>
> [   25.338169] rcu: INFO: rcu_preempt detected expedited stalls on
> CPUs/tasks: { 4-... } 6 jiffies s: 373 root: 0x10/.
> [   25.338202] rcu: blocking rcu_node structures (internal RCU debug):
> [   25.338210] Task dump for CPU 4:
> [   25.338216] task:canonical-livep state:R  running task     stack:
> 0 pid:  913 ppid:     1 flags:0x00004008
> [   25.338237] Call Trace:
> [   25.338243]  <TASK>
> [   25.338249]  ? __kasan_check_write+0x14/0x20
> [   25.338272]  ? _raw_spin_lock_irqsave+0x93/0xf0
> [   25.338293]  ? sysvec_apic_timer_interrupt+0x89/0x90
> [   25.338311]  ? asm_sysvec_apic_timer_interrupt+0x20/0x30
> [   25.338327]  ? sysvec_apic_timer_interrupt+0x89/0x90
> [   25.338342]  ? sysvec_apic_timer_interrupt+0x89/0x90
> [   25.338359]  ? asm_sysvec_apic_timer_interrupt+0x20/0x30
> [   25.338377]  ? lzma_len+0x1fb/0x3b0
> [   25.338395]  ? dict_repeat+0xec/0x120
> [   25.338413]  ? lzma_main+0x10d/0x1260
> [   25.338437]  ? lzma2_lzma+0x2a4/0x3b0
> [   25.338457]  ? xz_dec_lzma2_run+0x11f/0xb90
> [   25.338483]  ? xz_dec_run+0x346/0x11e0
> [   25.338506]  ? squashfs_xz_uncompress+0x135/0x330
> [   25.338530]  ? lzo_init+0xd0/0xd0
> [   25.338546]  ? squashfs_decompress+0x88/0xc0
> [   25.338565]  ? squashfs_read_data+0x1e5/0x8e0
> [   25.338582]  ? __filemap_get_folio+0x37e/0x540
> [   25.338602]  ? squashfs_bio_read.isra.2+0x220/0x220
> [   25.338628]  ? squashfs_readpage_block+0x493/0x950
> [   25.338650]  ? squashfs_read_folio+0xa4c/0x1140
> [   25.338671]  ? xas_nomem+0x2f/0x100
> [   25.338689]  ? squashfs_copy_cache+0x360/0x360
> [   25.338707]  ? folio_wait_bit+0x3c0/0x3c0
> [   25.338725]  ? __kasan_check_read+0x11/0x20
> [   25.338739]  ? pagevec_add_and_need_flush+0x8c/0xa0
> [   25.338757]  ? folio_add_lru+0x42/0x70
> [   25.338772]  ? filemap_add_folio+0xd4/0x130
> [   25.338788]  ? add_to_page_cache_locked+0xa0/0xa0
> [   25.338806]  ? filemap_read_folio.isra.62+0xbe/0x4d0
> [   25.338822]  ? __kasan_check_write+0x14/0x20
> [   25.338838]  ? filemap_page_mkwrite+0x6a0/0x6a0
> [   25.338856]  ? filemap_add_folio+0x130/0x130
> [   25.338879]  ? filemap_fault+0x1041/0x1310
> [   25.338895]  ? __kasan_check_read+0x11/0x20
> [   25.338918]  ? read_cache_page_gfp+0xd0/0xd0
> [   25.338932]  ? __rcu_read_unlock+0x55/0x260
> [   25.338953]  ? try_to_wake_up+0x3c2/0xa10
> [   25.338972]  ? __filemap_get_folio+0x540/0x540
> [   25.338985]  ? __pmd+0x10/0x10
> [   25.339002]  ? __do_fault+0x7c/0x1b0
> [   25.339019]  ? do_fault+0x1ce/0x6a0
> [   25.339036]  ? __handle_mm_fault+0x9cb/0x1490
> [   25.339057]  ? copy_page_range+0x1b90/0x1b90
> [   25.339073]  ? kernel_fpu_begin_mask+0x160/0x160
> [   25.339092]  ? __ia32_sys_futex_time32+0x300/0x300
> [   25.339119]  ? handle_mm_fault+0x11b/0x380
> [   25.339138]  ? do_user_addr_fault+0x258/0x810
> [   25.339157]  ? exc_page_fault+0x60/0xe0
> [   25.339173]  ? asm_exc_page_fault+0x2c/0x40
> [   25.339194]  </TASK>
>
>
>
> On 17. 11. 2022. 20:21, Mirsad Goran Todorovac wrote:
>> [original report and dmesg excerpts trimmed; quoted in full earlier in the thread]


2022-11-18 16:08:30

by Paul E. McKenney

[permalink] [raw]
Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

On Fri, Nov 18, 2022 at 06:11:16AM +0000, Phillip Lougher wrote:
> [Phillip's reply and nested quotes trimmed; see above]
> > [   25.338169] rcu: INFO: rcu_preempt detected expedited stalls on
> > CPUs/tasks: { 4-... } 6 jiffies s: 373 root: 0x10/.

The "6 jiffies" leads me to believe that this kernel was built with
CONFIG_ANDROID=y, but also with the short expedited grace-period commit,
that is, with CONFIG_RCU_EXP_CPU_STALL_TIMEOUT set to 20 milliseconds
and with HZ set to something like 250. This means that the kernel
will complain if an RCU read-side critical section exceeds about 20
milliseconds. This is a fine thing if you are actually running Android,
but a bit extreme for general usage.

If my guesses are correct, please apply this commit from mainline:

1045a06724f3 ("remove CONFIG_ANDROID")

With this commit in place, CONFIG_RCU_EXP_CPU_STALL_TIMEOUT will default
to 21000 (that is, 21 seconds) and things will be much better.
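A back-of-envelope check of those numbers (the values below are assumed from the description above, not read from the reporter's config): at HZ=250 a jiffy is 4 ms, so a 20 ms expedited stall timeout is only about 5 jiffies, which lines up with the "6 jiffies" and "7 jiffies" figures in the stall reports:

```shell
# Expedited stall timeout expressed in jiffies, under the assumed
# Android defaults (20 ms timeout, HZ=250) vs. the 21 s general default.
timeout_ms=20
hz=250
jiffies=$(( timeout_ms * hz / 1000 ))
echo "${jiffies} jiffies at the 20 ms Android default"
fixed_jiffies=$(( 21000 * hz / 1000 ))
echo "${fixed_jiffies} jiffies at the 21 s general default"
```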

Thanx, Paul

> > [remainder of quoted traces and earlier messages trimmed]
> > > [ 1534.866217]  ? file_ra_state_init+0x60/0x60
> > > [ 1534.866231]  ? folio_add_lru+0x47/0x70
> > > [ 1534.866249]  ? filemap_alloc_folio+0xcf/0x120
> > > [ 1534.866267]  page_cache_ra_unbounded+0x1eb/0x280
> > > [ 1534.866280]  ? __rcu_read_unlock+0x57/0x270
> > > [ 1534.866297]  do_page_cache_ra+0x7c/0x90
> > > [ 1534.866312]  page_cache_ra_order+0x3b4/0x420
> > > [ 1534.866330]  filemap_fault+0x849/0xe40
> > > [ 1534.866343]  ? read_cache_page_gfp+0x90/0x90
> > > [ 1534.866359]  __do_fault+0x76/0x1e0
> > > [ 1534.866372]  do_fault+0x1e5/0x6c0
> > > [ 1534.866384]  __handle_mm_fault+0x905/0x1740
> > > [ 1534.866399]  ? copy_page_range+0x2130/0x2130
> > > [ 1534.866412]  ? mas_find+0x100/0x100
> > > [ 1534.866428]  handle_mm_fault+0x11c/0x3b0
> > > [ 1534.866443]  do_user_addr_fault+0x261/0x8b0
> > > [ 1534.866456]  ? exit_to_user_mode_prepare+0x9b/0x190
> > > [ 1534.866472]  exc_page_fault+0x61/0xf0
> > > [ 1534.866486]  asm_exc_page_fault+0x27/0x30
> > > [ 1534.866498] RIP: 0033:0x7fe93673dd00
> > > [ 1534.866515] Code: Unable to access opcode bytes at 0x7fe93673dcd6.
> > > [ 1534.866521] RSP: 002b:00007fe923efeea8 EFLAGS: 00010246
> > > [ 1534.866532] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> > > 0000000000000001
> > > [ 1534.866539] RDX: 0000000000000001 RSI: 0000000000000000 RDI:
> > > 0000000000000000
> > > [ 1534.866545] RBP: 0000000000000000 R08: 0000000000000000 R09:
> > > 00007fe837100098
> > > [ 1534.866553] R10: 00007fe868300de8 R11: 0000000000000217 R12:
> > > 00007fff95860b2e
> > > [ 1534.866561] R13: 00007fff95860b2f R14: 00007fff95860bf8 R15:
> > > 00007fe923efef80
> > > [ 1534.866576]  </TASK>
> > >
> > > Thank you,
> > > Mirsad
> > >
> > > --
> > > Mirsad Goran Todorovac
> > > Sistem inženjer
> > > Grafički fakultet | Akademija likovnih umjetnosti
> > > Sveučilište u Zagrebu
> >
> > --
> > Mirsad Goran Todorovac
> > Sistem inženjer
> > Grafički fakultet | Akademija likovnih umjetnosti
> > Sveučilište u Zagrebu
>

2022-11-18 16:31:45

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

On 18. 11. 2022. 07:11, Phillip Lougher wrote:
> On 17/11/2022 23:05, Mirsad Goran Todorovac wrote:
>> Hi,
>>
>> While trying to bisect, I've found another bug that predated the introduction of squashfs_readahead(), but it has
>> a common denominator in squashfs_decompress() and squashfs_xz_uncompress().
>
> Wrong, the stall is happening in the XZ decompressor library, which
> is *not* in Squashfs.
>
> This reported stall in the decompressor code is likely a symptom of you
> deliberately thrashing your system.  When the system thrashes everything
> starts to happen very slowly, and the system will spend a lot of
> its time doing page I/O, and the CPU will spend a lot of time in
> any CPU intensive code like the XZ decompressor library.
>
> So the fact the stall is being hit here is a symptom and not
> a cause.  The decompressor code is likely running slowly due to
> thrashing and waiting on paged-out buffers.  This is not indicative
> of any bug, merely a system running slowly due to overload.
>
> As I said, this is not a Squashfs issue, because the code when the
> stall takes place isn't in Squashfs.
>
> The people responsible for the rcu code should have a lot more insight
> about what happens when the system is thrashing, and how this will
> throw up false positives.  In this I believe this is an instance of
> perfectly correct code running slowly due to thrashing incorrectly
> being flagged as looping.
>
> CC'ing Paul E. McKenney <[email protected]>
>
> Phillip
>
>
>
>>
>> Frankly, I need an advice on how to handle a situation like this.
>>
>> Obviously, the best place to test the RCU problem with squashfs_readahead() is where it is introduced + Phillip's patches [012]/3.
>>
>> Thanks,
>> Mirsad
>>
>> P.S.
>>
>> This is excerpt from dmesg log attached:
>>
>> [   25.338169] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 4-... } 6 jiffies s: 373 root: 0x10/.
>> [   25.338202] rcu: blocking rcu_node structures (internal RCU debug):
>> [   25.338210] Task dump for CPU 4:
>> [   25.338216] task:canonical-livep state:R  running task     stack: 0 pid:  913 ppid:     1 flags:0x00004008
>> [   25.338237] Call Trace:
>> [   25.338243]  <TASK>
>> [   25.338249]  ? __kasan_check_write+0x14/0x20
>> [   25.338272]  ? _raw_spin_lock_irqsave+0x93/0xf0
>> [   25.338293]  ? sysvec_apic_timer_interrupt+0x89/0x90
>> [   25.338311]  ? asm_sysvec_apic_timer_interrupt+0x20/0x30
>> [   25.338327]  ? sysvec_apic_timer_interrupt+0x89/0x90
>> [   25.338342]  ? sysvec_apic_timer_interrupt+0x89/0x90
>> [   25.338359]  ? asm_sysvec_apic_timer_interrupt+0x20/0x30
>> [   25.338377]  ? lzma_len+0x1fb/0x3b0
>> [   25.338395]  ? dict_repeat+0xec/0x120
>> [   25.338413]  ? lzma_main+0x10d/0x1260
>> [   25.338437]  ? lzma2_lzma+0x2a4/0x3b0
>> [   25.338457]  ? xz_dec_lzma2_run+0x11f/0xb90
>> [   25.338483]  ? xz_dec_run+0x346/0x11e0
>> [   25.338506]  ? squashfs_xz_uncompress+0x135/0x330
>> [   25.338530]  ? lzo_init+0xd0/0xd0
>> [   25.338546]  ? squashfs_decompress+0x88/0xc0
>> [   25.338565]  ? squashfs_read_data+0x1e5/0x8e0
>> [   25.338582]  ? __filemap_get_folio+0x37e/0x540
>> [   25.338602]  ? squashfs_bio_read.isra.2+0x220/0x220
>> [   25.338628]  ? squashfs_readpage_block+0x493/0x950
>> [   25.338650]  ? squashfs_read_folio+0xa4c/0x1140
>> [   25.338671]  ? xas_nomem+0x2f/0x100
>> [   25.338689]  ? squashfs_copy_cache+0x360/0x360
>> [   25.338707]  ? folio_wait_bit+0x3c0/0x3c0
>> [   25.338725]  ? __kasan_check_read+0x11/0x20
>> [   25.338739]  ? pagevec_add_and_need_flush+0x8c/0xa0
>> [   25.338757]  ? folio_add_lru+0x42/0x70
>> [   25.338772]  ? filemap_add_folio+0xd4/0x130
>> [   25.338788]  ? add_to_page_cache_locked+0xa0/0xa0
>> [   25.338806]  ? filemap_read_folio.isra.62+0xbe/0x4d0
>> [   25.338822]  ? __kasan_check_write+0x14/0x20
>> [   25.338838]  ? filemap_page_mkwrite+0x6a0/0x6a0
>> [   25.338856]  ? filemap_add_folio+0x130/0x130
>> [   25.338879]  ? filemap_fault+0x1041/0x1310
>> [   25.338895]  ? __kasan_check_read+0x11/0x20
>> [   25.338918]  ? read_cache_page_gfp+0xd0/0xd0
>> [   25.338932]  ? __rcu_read_unlock+0x55/0x260
>> [   25.338953]  ? try_to_wake_up+0x3c2/0xa10
>> [   25.338972]  ? __filemap_get_folio+0x540/0x540
>> [   25.338985]  ? __pmd+0x10/0x10
>> [   25.339002]  ? __do_fault+0x7c/0x1b0
>> [   25.339019]  ? do_fault+0x1ce/0x6a0
>> [   25.339036]  ? __handle_mm_fault+0x9cb/0x1490
>> [   25.339057]  ? copy_page_range+0x1b90/0x1b90
>> [   25.339073]  ? kernel_fpu_begin_mask+0x160/0x160
>> [   25.339092]  ? __ia32_sys_futex_time32+0x300/0x300
>> [   25.339119]  ? handle_mm_fault+0x11b/0x380
>> [   25.339138]  ? do_user_addr_fault+0x258/0x810
>> [   25.339157]  ? exc_page_fault+0x60/0xe0
>> [   25.339173]  ? asm_exc_page_fault+0x2c/0x40
>> [   25.339194]  </TASK>
>>
>>
>>
>> On 17. 11. 2022. 20:21, Mirsad Goran Todorovac wrote:
>>> Dear all,
>>>
>>> I've noticed two similar RCU stalls with the similar call trace, both involving squashfs_readeahead().
>>>
>>> Configuration is (excerpt):
>>>
>>> CONFIG_KASAN_SHADOW_OFFSET=0xdffffc0000000000
>>> CONFIG_HAVE_DEBUG_KMEMLEAK=y
>>> CONFIG_DEBUG_KMEMLEAK=y
>>> CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE=16000
>>> # CONFIG_DEBUG_KMEMLEAK_TEST is not set
>>> # CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is not set
>>> CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y
>>> CONFIG_HAVE_ARCH_KASAN=y
>>> CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
>>> CONFIG_CC_HAS_KASAN_GENERIC=y
>>> CONFIG_KASAN=y
>>> CONFIG_KASAN_GENERIC=y
>>> CONFIG_KASAN_OUTLINE=y
>>> # CONFIG_KASAN_INLINE is not set
>>> CONFIG_KASAN_STACK=y
>>> # CONFIG_KASAN_VMALLOC is not set
>>> # CONFIG_KASAN_MODULE_TEST is not set
>>>
>>> Do you think this is worthy of a bisect? I am not yet certain about being able to reproduce the stall, however it occurs at
>>> great memory use and heavy swapping (I thought of experimenting with efficiency of multi-gen LRU in 6.1 with multimedia under
>>> heavy loads and low memory ...).
>>>
>>> Here is the excerpt from dmesg:
>>>
>>> [ 1326.301544] perf: interrupt took too long (3181 > 3178), lowering kernel.perf_event_max_sample_rate to 62750
>>> [ 1347.269247] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 3-.... } 6 jiffies s: 1317 root: 0x8/.
>>> [ 1347.269294] rcu: blocking rcu_node structures (internal RCU debug):
>>> [ 1347.269310] Sending NMI from CPU 2 to CPUs 3:
>>> [ 1347.269329] NMI backtrace for cpu 3
>>> [ 1347.269343] CPU: 3 PID: 7907 Comm: Isolated Web Co Not tainted 6.1.0-rc5 #1
>>> [ 1347.269357] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN49WW 07/21/2022
>>> [ 1347.269363] RIP: 0010:__asan_load8+0x3e/0xb0
>>> [ 1347.269386] Code: b8 00 00 00 00 00 00 00 ff eb 0a 48 b8 00 00 00 00 00 80 ff ff 48 39 f8 77 44 48 8d 47 07 48 89 c2 83 e2 07
>>> 48 83 fa 07 75 1c <48> ba 00 00 00 00 00 fc ff df 48 c1 e8 03 0f b6 04 10 84 c0 75 30
>>> [ 1347.269398] RSP: 0018:ffff888102eef288 EFLAGS: 00000246
>>> [ 1347.269410] RAX: ffff888109280047 RBX: ffff8881157da0ca RCX: ffffffff8fca8434
>>> [ 1347.269419] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff888109280040
>>> [ 1347.269427] RBP: ffff888102eef288 R08: 00000000000003ae R09: ffffed1021250de8
>>> [ 1347.269435] R10: ffff888109286f39 R11: ffffed1021250de7 R12: 0000000000000155
>>> [ 1347.269443] R13: ffffc9000122ff18 R14: ffff888109280000 R15: ffff888109280000
>>> [ 1347.269451] FS:  00007efe541e9380(0000) GS:ffff888252f80000(0000) knlGS:0000000000000000
>>> [ 1347.269462] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 1347.269470] CR2: 00007fa7a15e2000 CR3: 000000013100a003 CR4: 0000000000770ee0
>>> [ 1347.269479] PKRU: 55555554
>>> [ 1347.269484] Call Trace:
>>> [ 1347.269488]  <TASK>
>>> [ 1347.269494]  lzma_main+0x3b4/0x12b0
>>> [ 1347.269517]  lzma2_lzma+0x2dd/0x460
>>> [ 1347.269535]  xz_dec_lzma2_run+0x11e/0xba0
>>> [ 1347.269549]  ? __asan_load4+0x8/0xb0
>>> [ 1347.269564]  ? crc32_le_base+0x179/0x250
>>> [ 1347.269581]  xz_dec_run+0x57b/0x11a0
>>> [ 1347.269600]  squashfs_xz_uncompress+0x196/0x370
>>> [ 1347.269620]  squashfs_decompress+0x8e/0xd0
>>> [ 1347.269633]  ? lzo_uncompress+0x400/0x400
>>> [ 1347.269648]  squashfs_read_data+0x1e6/0x900
>>> [ 1347.269662]  ? kmemleak_alloc+0x4b/0x80
>>> [ 1347.269678]  ? squashfs_bio_read.isra.0+0x230/0x230
>>> [ 1347.269694]  ? squashfs_page_actor_init_special+0x1bb/0x230
>>> [ 1347.269710]  squashfs_readahead+0xa86/0xe70
>>> [ 1347.269722]  ? put_pages_list+0x1e0/0x1e0
>>> [ 1347.269747]  ? squashfs_fill_page+0x190/0x190
>>> [ 1347.269763]  ? psi_task_change+0x106/0x130
>>> [ 1347.269781]  ? raw_spin_rq_unlock+0x17/0x60
>>> [ 1347.269796]  read_pages+0x12d/0x530
>>> [ 1347.269814]  ? file_ra_state_init+0x60/0x60
>>> [ 1347.269829]  ? xas_free_nodes+0x170/0x170
>>> [ 1347.269844]  ? filemap_alloc_folio+0xcf/0x120
>>> [ 1347.269863]  page_cache_ra_unbounded+0x1aa/0x280
>>> [ 1347.269878]  ? __rcu_read_unlock+0x57/0x270
>>> [ 1347.269898]  do_page_cache_ra+0x7c/0x90
>>> [ 1347.269915]  page_cache_ra_order+0x3b4/0x420
>>> [ 1347.269936]  filemap_fault+0x849/0xe40
>>> [ 1347.269951]  ? read_cache_page_gfp+0x90/0x90
>>> [ 1347.269970]  __do_fault+0x76/0x1e0
>>> [ 1347.269984]  do_fault+0x1e5/0x6c0
>>> [ 1347.269998]  __handle_mm_fault+0x905/0x1740
>>> [ 1347.270015]  ? copy_page_range+0x2130/0x2130
>>> [ 1347.270028]  ? mas_find+0x100/0x100
>>> [ 1347.270050]  handle_mm_fault+0x11c/0x3b0
>>> [ 1347.270065]  do_user_addr_fault+0x261/0x8b0
>>> [ 1347.270084]  exc_page_fault+0x61/0xf0
>>> [ 1347.270101]  asm_exc_page_fault+0x27/0x30
>>> [ 1347.270115] RIP: 0033:0x7efe544ce8b4
>>> [ 1347.270126] Code: d5 c4 e1 f9 7e d0 c4 e1 f9 7e d7 48 c1 f8 35 48 c1 ff 2e 48 89 c1 48 89 fe b8 ff 01 00 00 29 c8 83 e6 7f 48
>>> 8d 0d ec 45 07 00 <c5> fb 10 04 f1 48 98 48 8d 0d fe 44 07 00 c5 fb 59 0c c1 c5 f3 59
>>> [ 1347.270137] RSP: 002b:00007ffc019a6e30 EFLAGS: 00010206
>>> [ 1347.270147] RAX: 0000000000000005 RBX: 00007ffc019a6f50 RCX: 00007efe54542ea0
>>> [ 1347.270156] RDX: ffffffffbfefe12e RSI: 000000000000007b RDI: 000000000000fd7b
>>> [ 1347.270164] RBP: 00007ffc019a6f58 R08: 00007ffc019a6f40 R09: 00000000006f71c0
>>> [ 1347.270173] R10: 00007efe547fc500 R11: 00007efe54459b60 R12: 00007ffc019a7060
>>> [ 1347.270181] R13: 00007ffc019a6f48 R14: 00007ffc019a7040 R15: 00007ffc019a6f40
>>> [ 1347.270199]  </TASK>
>>>
>>>
>>> [ 1534.865687] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 3-.... } 7 jiffies s: 1393 root: 0x8/.
>>> [ 1534.865725] rcu: blocking rcu_node structures (internal RCU debug):
>>> [ 1534.865735] Sending NMI from CPU 2 to CPUs 3:
>>> [ 1534.865752] NMI backtrace for cpu 3
>>> [ 1534.865762] CPU: 3 PID: 9629 Comm: FS Broker 9627 Not tainted 6.1.0-rc5 #1
>>> [ 1534.865776] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN49WW 07/21/2022
>>> [ 1534.865782] RIP: 0010:__asan_load4+0x8/0xb0
>>> [ 1534.865802] Code: cc cc cc cc 48 c1 e8 03 80 3c 10 00 75 e9 5d c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 55 48
>>> 89 e5 4c 8b 45 08 <48> 83 ff fb 77 63 eb 0f 0f 1f 00 48 b8 00 00 00 00 00 00 00 ff eb
>>> [ 1534.865811] RSP: 0018:ffff88811fd3f288 EFLAGS: 00000246
>>> [ 1534.865823] RAX: 0000000000000000 RBX: 000000000001d378 RCX: dffffc0000000000
>>> [ 1534.865831] RDX: 0000000000000003 RSI: 000000000000077f RDI: ffff888109280080
>>> [ 1534.865838] RBP: ffff88811fd3f288 R08: ffffffff8fca8622 R09: 000000000001d380
>>> [ 1534.865847] R10: ffff888114694020 R11: ffffed102843b9ff R12: ffff888109280006
>>> [ 1534.865854] R13: ffff888109280000 R14: ffff888109280000 R15: 0000000002bbab0b
>>> [ 1534.865861] FS:  00007fe923eff700(0000) GS:ffff888252f80000(0000) knlGS:0000000000000000
>>> [ 1534.865871] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 1534.865878] CR2: 00007fe90fe90ac0 CR3: 0000000138d5c003 CR4: 0000000000770ee0
>>> [ 1534.865885] PKRU: 55555554
>>> [ 1534.865889] Call Trace:
>>> [ 1534.865893]  <TASK>
>>> [ 1534.865898]  lzma_main+0x5a2/0x12b0
>>> [ 1534.865920]  lzma2_lzma+0x2dd/0x460
>>> [ 1534.865935]  xz_dec_lzma2_run+0x11e/0xba0
>>> [ 1534.865948]  ? crc32_le_base+0x58/0x250
>>> [ 1534.865961]  ? crc32_le_base+0x1e3/0x250
>>> [ 1534.865977]  xz_dec_run+0x57b/0x11a0
>>> [ 1534.865995]  squashfs_xz_uncompress+0x196/0x370
>>> [ 1534.866011]  squashfs_decompress+0x8e/0xd0
>>> [ 1534.866022]  ? lzo_uncompress+0x400/0x400
>>> [ 1534.866034]  squashfs_read_data+0x1e6/0x900
>>> [ 1534.866047]  ? kmemleak_alloc+0x4b/0x80
>>> [ 1534.866061]  ? squashfs_bio_read.isra.0+0x230/0x230
>>> [ 1534.866074]  ? squashfs_page_actor_init_special+0x1bb/0x230
>>> [ 1534.866087]  squashfs_readahead+0xa86/0xe70
>>> [ 1534.866099]  ? xas_create+0x141/0x600
>>> [ 1534.866119]  ? squashfs_fill_page+0x190/0x190
>>> [ 1534.866136]  ? psi_task_change+0x106/0x130
>>> [ 1534.866155]  ? raw_spin_rq_unlock+0x17/0x60
>>> [ 1534.866171]  read_pages+0x12d/0x530
>>> [ 1534.866186]  ? __kasan_check_read+0x11/0x20
>>> [ 1534.866200]  ? folio_batch_add_and_move+0x8d/0xa0
>>> [ 1534.866217]  ? file_ra_state_init+0x60/0x60
>>> [ 1534.866231]  ? folio_add_lru+0x47/0x70
>>> [ 1534.866249]  ? filemap_alloc_folio+0xcf/0x120
>>> [ 1534.866267]  page_cache_ra_unbounded+0x1eb/0x280
>>> [ 1534.866280]  ? __rcu_read_unlock+0x57/0x270
>>> [ 1534.866297]  do_page_cache_ra+0x7c/0x90
>>> [ 1534.866312]  page_cache_ra_order+0x3b4/0x420
>>> [ 1534.866330]  filemap_fault+0x849/0xe40
>>> [ 1534.866343]  ? read_cache_page_gfp+0x90/0x90
>>> [ 1534.866359]  __do_fault+0x76/0x1e0
>>> [ 1534.866372]  do_fault+0x1e5/0x6c0
>>> [ 1534.866384]  __handle_mm_fault+0x905/0x1740
>>> [ 1534.866399]  ? copy_page_range+0x2130/0x2130
>>> [ 1534.866412]  ? mas_find+0x100/0x100
>>> [ 1534.866428]  handle_mm_fault+0x11c/0x3b0
>>> [ 1534.866443]  do_user_addr_fault+0x261/0x8b0
>>> [ 1534.866456]  ? exit_to_user_mode_prepare+0x9b/0x190
>>> [ 1534.866472]  exc_page_fault+0x61/0xf0
>>> [ 1534.866486]  asm_exc_page_fault+0x27/0x30
>>> [ 1534.866498] RIP: 0033:0x7fe93673dd00
>>> [ 1534.866515] Code: Unable to access opcode bytes at 0x7fe93673dcd6.
>>> [ 1534.866521] RSP: 002b:00007fe923efeea8 EFLAGS: 00010246
>>> [ 1534.866532] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
>>> [ 1534.866539] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
>>> [ 1534.866545] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fe837100098
>>> [ 1534.866553] R10: 00007fe868300de8 R11: 0000000000000217 R12: 00007fff95860b2e
>>> [ 1534.866561] R13: 00007fff95860b2f R14: 00007fff95860bf8 R15: 00007fe923efef80
>>> [ 1534.866576]  </TASK>

Hello Phillip,

I have considered your theory, and I usually admit when I'm wrong, but this is the very
same combination of programs, windows and tabs used when bisecting squashfs_readahead(),
only this time run under a kernel with CONFIG_DEBUG_KMEMLEAK and CONFIG_KASAN turned
on. So I don't think it's thrashing the system.

MGLRU was supposed to make smarter decisions about evicting unused memory pages from the
cache than the 5.10 code did, so my intent was to see how that is working, too.

However, I've browsed through the source, and the stall does actually happen when xz_dec_run()
is called from squashfs_xz_uncompress().

I think that 30 windows with 5-30 tabs each is an example of normal use nowadays.

Plus, with v5.10 and the same config it doesn't throw stalls (only the unrelated systemd-udevd warning):

[ 7.333220] Missing case (clock == 269720)
[ 7.333341] WARNING: CPU: 4 PID: 214 at drivers/gpu/drm/i915/display/intel_dpll_mgr.c:2979 icl_get_dplls+0x948/0x11a0 [i915]
[ 7.333343] Modules linked in: i915(+) nvme nvme_core i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
cec input_leds rc_core drm serio_raw xhci_pci vm
d xhci_pci_renesas mac_hid video pinctrl_tigerlake
[ 7.333368] CPU: 4 PID: 214 Comm: systemd-udevd Not tainted 5.10.0-05-2c85ebc57b3e-squashfs-stall #1
[ 7.333370] Hardware name: LENOVO 82H8/LNVNB161216, BIOS GGCN49WW 07/21/2022

So, my conclusion is that KASAN is operable in 5.10, and the problem with xz decompression was
introduced somewhere between 5.10.0 and 5.19-rc1.

What can lock up a core for a dozen jiffies on my laptop could knock it down for a couple of
minutes on a heavily loaded cloud server.

I lack the knowledge of how to view the history of commits and modifications to a single file,
hence the confusion.

However, I did not even blame your code for this stall; I thought you might be interested, since
in the RCU stall report it is called from your squashfs_readahead().

Thorsten somewhat discouraged me by saying that there are 950 issues on the list to bisect.
I thought that most of them require a Google cloud to reproduce, while here is one we could bisect
ourselves, and I already know the developer.

So, at no point was I blaming you or demeaning you. I thought about how I would react to a bug in
my own code, and concluded that I would be glad that the code became more elegant and closer to
the "golden cut" of programming, if there is such a thing.

Have a nice day!

Thanks
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union


Subject: RE: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())



> -----Original Message-----
> From: Phillip Lougher <[email protected]>
> Sent: Friday, November 18, 2022 12:11 AM
> To: Mirsad Goran Todorovac <[email protected]>; LKML <linux-
> [email protected]>; Paul E. McKenney <[email protected]>
> Cc: [email protected]; Thorsten Leemhuis
> <[email protected]>
> Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in
> squashfs_readahead())
>
> On 17/11/2022 23:05, Mirsad Goran Todorovac wrote:
> > Hi,
> >
> > While trying to bisect, I've found another bug that predated the
> > introduction of squashfs_readahead(), but it has
> > a common denominator in squashfs_decompress() and
> squashfs_xz_uncompress().
>
> Wrong, the stall is happening in the XZ decompressor library, which
> is *not* in Squashfs.
>
> This reported stall in the decompressor code is likely a symptom of you
> deliberately thrashing your system. When the system thrashes everything
> starts to happen very slowly, and the system will spend a lot of
> its time doing page I/O, and the CPU will spend a lot of time in
> any CPU intensive code like the XZ decompressor library.
>
> So the fact the stall is being hit here is a symptom and not
> a cause. The decompressor code is likely running slowly due to
> thrashing and waiting on paged-out buffers. This is not indicative
> of any bug, merely a system running slowly due to overload.
>
> As I said, this is not a Squashfs issue, because the code when the
> stall takes place isn't in Squashfs.
>
> The people responsible for the rcu code should have a lot more insight
> about what happens when the system is thrashing, and how this will
> throw up false positives. In this I believe this is an instance of
> perfectly correct code running slowly due to thrashing incorrectly
> being flagged as looping.
>
> CC'ing Paul E. McKenney <[email protected]>
>
> Phillip

How big can these readahead sizes be? Should one of the loops include
cond_resched() calls?

2022-11-20 17:02:10

by Mirsad Todorovac

Subject: Re: BUG: BISECTED: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

On 18. 11. 2022. 16:55, Paul E. McKenney wrote:
> On Fri, Nov 18, 2022 at 06:11:16AM +0000, Phillip Lougher wrote:
>> On 17/11/2022 23:05, Mirsad Goran Todorovac wrote:
>>> Hi,
>>>
>>> While trying to bisect, I've found another bug that predated the
>>> introduction of squashfs_readahead(), but it has
>>> a common denominator in squashfs_decompress() and squashfs_xz_uncompress().
>>
>> Wrong, the stall is happening in the XZ decompressor library, which
>> is *not* in Squashfs.
>>
>> This reported stall in the decompressor code is likely a symptom of you
>> deliberately thrashing your system. When the system thrashes everything
>> starts to happen very slowly, and the system will spend a lot of
>> its time doing page I/O, and the CPU will spend a lot of time in
>> any CPU intensive code like the XZ decompressor library.
>>
>> So the fact the stall is being hit here is a symptom and not
>> a cause. The decompressor code is likely running slowly due to
>> thrashing and waiting on paged-out buffers. This is not indicative
>> of any bug, merely a system running slowly due to overload.
>>
>> As I said, this is not a Squashfs issue, because the code when the
>> stall takes place isn't in Squashfs.
>>
>> The people responsible for the rcu code should have a lot more insight
>> about what happens when the system is thrashing, and how this will
>> throw up false positives. In this I believe this is an instance of
>> perfectly correct code running slowly due to thrashing incorrectly
>> being flagged as looping.
>>
>> CC'ing Paul E. McKenney <[email protected]>
>>
>>> Frankly, I need an advice on how to handle a situation like this.
>>>
>>> Obviously, the best place to test the RCU problem with
>>> squashfs_readahead() is where it is introduced + Phillip's patches
>>> [012]/3.
>>>
>>> Thanks,
>>> Mirsad
>>>
>>> P.S.
>>>
>>> This is excerpt from dmesg log attached:
>>>
>>> [   25.338169] rcu: INFO: rcu_preempt detected expedited stalls on
>>> CPUs/tasks: { 4-... } 6 jiffies s: 373 root: 0x10/.
>
> The "6 jiffies" leads me to believe that this kernel was built with
> CONFIG_ANDROID=y, but also with the short expedited grace-period commit,
> that is, with CONFIG_RCU_EXP_CPU_STALL_TIMEOUT set to 20 milliseconds
> and with HZ set to something like 250. This means that the kernel
> will complain if an RCU read-side critical section exceeeds about 20
> milliseconds. This is a fine thing if you are actually running on an
> Android, but a bit extreme for general usage.
>
> If my guesses are correct, please apply this commit from mainline:
>
> 1045a06724f3 ("remove CONFIG_ANDROID")
>
> With this commit in place, CONFIG_RCU_EXP_CPU_STALL_TIMEOUT will default
> to 21000 (that is, 21 seconds) and things will be much better.
>
> Thanx, Paul

Dear Paul,

Actually, I was able to reproduce the described stalls and bisect the commit that started causing
them (I have already explained how 5.10 didn't show any under the exact same saved set of windows
and tabs).

# bad: [1e57930e9f4083ad5854ab6eadffe790a8167fb4] Merge tag 'rcu.2022.05.19a' of
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
git bisect bad 1e57930e9f4083ad5854ab6eadffe790a8167fb4
# good: [f993aed406eaf968ba3867a76bb46c95336a33d0] Merge tag 'block-5.18-2022-05-18' of git://git.kernel.dk/linux-block
git bisect good f993aed406eaf968ba3867a76bb46c95336a33d0
# good: [bf2431021c8cfe52c866d7bf640ced03a35ebe91] Merge tag 'efi-next-for-v5.19' of
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
git bisect good bf2431021c8cfe52c866d7bf640ced03a35ebe91
# good: [be05ee54378d451e1d60196921566d6087f3079f] Merge branches 'docs.2022.04.20a', 'fixes.2022.04.20a', 'nocb.2022.04.11b',
'rcu-tasks.2022.04.11b', 'srcu.2022.
05.03a', 'torture.2022.04.11b', 'torture-tasks.2022.04.20a' and 'torturescript.2022.04.20a' into HEAD
git bisect good be05ee54378d451e1d60196921566d6087f3079f
# good: [023033fe343cdf2ba83ab762f8de69241c7fc086] tools/nolibc/types: define PATH_MAX and MAXPATHLEN
git bisect good 023033fe343cdf2ba83ab762f8de69241c7fc086
# good: [96980b833a21c6dc29d0dfdc8f211fb8a10256a7] tools/nolibc/string: do not use __builtin_strlen() at -O0
git bisect good 96980b833a21c6dc29d0dfdc8f211fb8a10256a7
# good: [0e0ff638400be8f497a35b51a4751fd823f6bd6a] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
git bisect good 0e0ff638400be8f497a35b51a4751fd823f6bd6a
# good: [b2f02e9cdaade3ce565e6973ac313d2f814447f2] Merge tag 'lkmm.2022.05.20a' of
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
git bisect good b2f02e9cdaade3ce565e6973ac313d2f814447f2
# good: [9621fbee44df940e2e1b94b0676460a538dffefa] rcu: Move expedited grace period (GP) work to RT kthread_worker
git bisect good 9621fbee44df940e2e1b94b0676460a538dffefa
# bad: [ce13389053a347aa9f8ffbfda2238352536e15c9] Merge branch 'exp.2022.05.11a' into HEAD
git bisect bad ce13389053a347aa9f8ffbfda2238352536e15c9
# first bad commit: [ce13389053a347aa9f8ffbfda2238352536e15c9] Merge branch 'exp.2022.05.11a' into HEAD

The bug looks like this:

[ 57.876820] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-... } 6 jiffies s: 545 root: 0x1/.
[ 57.876854] rcu: blocking rcu_node structures (internal RCU debug):
[ 57.876862] Task dump for CPU 0:
[ 57.876867] task:snap-store state:R running task stack: 0 pid: 2678 ppid: 2230 flags:0x00004008
[ 57.876883] Call Trace:
[ 57.876888] <TASK>
[ 57.876893] ? _raw_spin_lock_irqsave+0x93/0xf0
[ 57.876909] ? sysvec_apic_timer_interrupt+0x89/0x90
[ 57.876920] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 57.876930] ? sysvec_apic_timer_interrupt+0x89/0x90
[ 57.876939] ? sysvec_apic_timer_interrupt+0x89/0x90
[ 57.876948] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 57.876959] ? lzma_len+0x1fb/0x3b0
[ 57.876971] ? dict_repeat+0xb3/0x120
[ 57.876981] ? lzma_main+0x943/0x1260
[ 57.876995] ? lzma2_lzma+0x2a4/0x3b0
[ 57.877007] ? xz_dec_lzma2_run+0x11f/0xb90
[ 57.877018] ? crc32_le_base+0x186/0x250
[ 57.877032] ? xz_dec_run+0x346/0x11e0
[ 57.877046] ? squashfs_xz_uncompress+0x135/0x330
[ 57.877060] ? lzo_init+0xd0/0xd0
[ 57.877069] ? squashfs_decompress+0x80/0xb0
[ 57.877079] ? squashfs_read_data+0x1eb/0x920
[ 57.877090] ? __filemap_get_folio+0x37e/0x540
[ 57.877104] ? squashfs_bio_read.isra.2+0x2a0/0x2a0
[ 57.877118] ? squashfs_readpage_block+0x493/0x950
[ 57.877131] ? squashfs_readpage+0xa52/0x1140
[ 57.877143] ? xas_nomem+0x2f/0x100
[ 57.877154] ? squashfs_copy_cache+0x360/0x360
[ 57.877164] ? folio_wait_bit+0x3c0/0x3c0
[ 57.877175] ? __kasan_check_read+0x11/0x20
[ 57.877186] ? pagevec_add_and_need_flush+0x8c/0xa0
[ 57.877197] ? folio_add_lru+0x42/0x70
[ 57.877206] ? filemap_add_folio+0xd4/0x130
[ 57.877216] ? add_to_page_cache_locked+0xa0/0xa0
[ 57.877228] ? filemap_read_folio.isra.62+0xbe/0x4d0
[ 57.877238] ? __kasan_check_write+0x14/0x20
[ 57.877248] ? filemap_page_mkwrite+0x6a0/0x6a0
[ 57.877260] ? filemap_add_folio+0x130/0x130
[ 57.877269] ? try_charge_memcg+0x4cc/0xa90
[ 57.877281] ? filemap_fault+0xc9a/0xe50
[ 57.877293] ? generic_file_read_iter+0x230/0x230
[ 57.877305] ? __rcu_read_unlock+0x55/0x260
[ 57.877317] ? __do_fault+0x7c/0x1b0
[ 57.877329] ? __handle_mm_fault+0x173b/0x1970
[ 57.877340] ? copy_page_range+0x1890/0x1890
[ 57.877356] ? handle_mm_fault+0x11b/0x380
[ 57.877366] ? do_user_addr_fault+0x258/0x810
[ 57.877380] ? exc_page_fault+0x60/0xe0
[ 57.877389] ? asm_exc_page_fault+0x8/0x30
[ 57.877398] ? asm_exc_page_fault+0x1e/0x30
[ 57.877410] </TASK>

Please find complete dmesg, my lshw and bisect attached.

I am not competent to estimate how this commit causes this kind of problem,
or whether stalls of up to 20 jiffies are worth eradicating.

Hope this helps.

Thank you,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union


Attachments:
config-6.1.0-rc5.xz (55.37 kB)
dmesg-5.18.0-rc1-31a-ce13389053a3.log.xz (1.04 kB)
dmesg-5.18.0-rc1-31a-ce13389053a3-full.log.xz (21.95 kB)
lshw.txt.xz (5.13 kB)
squashfs-stall-bisect-full.log.xz (1.52 kB)

2022-11-20 18:53:56

by Paul E. McKenney

Subject: Re: BUG: BISECTED: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

On Sun, Nov 20, 2022 at 04:49:37PM +0100, Mirsad Goran Todorovac wrote:
> On 18. 11. 2022. 16:55, Paul E. McKenney wrote:
> > On Fri, Nov 18, 2022 at 06:11:16AM +0000, Phillip Lougher wrote:
> > > On 17/11/2022 23:05, Mirsad Goran Todorovac wrote:
> > > > Hi,
> > > >
> > > > While trying to bisect, I've found another bug that predated the
> > > > introduction of squashfs_readahead(), but it has
> > > > a common denominator in squashfs_decompress() and squashfs_xz_uncompress().
> > >
> > > Wrong, the stall is happening in the XZ decompressor library, which
> > > is *not* in Squashfs.
> > >
> > > This reported stall in the decompressor code is likely a symptom of you
> > > deliberately thrashing your system. When the system thrashes everything
> > > starts to happen very slowly, and the system will spend a lot of
> > > its time doing page I/O, and the CPU will spend a lot of time in
> > > any CPU intensive code like the XZ decompressor library.
> > >
> > > So the fact the stall is being hit here is a symptom and not
> > > a cause. The decompressor code is likely running slowly due to
> > > thrashing and waiting on paged-out buffers. This is not indicative
> > > of any bug, merely a system running slowly due to overload.
> > >
> > > As I said, this is not a Squashfs issue, because the code running
> > > when the stall takes place isn't in Squashfs.
> > >
> > > The people responsible for the rcu code should have a lot more insight
> > > about what happens when the system is thrashing, and how this will
> > > throw up false positives. I believe this is an instance of perfectly
> > > correct code, running slowly due to thrashing, being incorrectly
> > > flagged as looping.
> > >
> > > CC'ing Paul E. McKenney <[email protected]>
> > >
> > > > Frankly, I need an advice on how to handle a situation like this.
> > > >
> > > > Obviously, the best place to test the RCU problem with
> > > > squashfs_readahead() is where it is introduced + Phillip's patches
> > > > [012]/3.
> > > >
> > > > Thanks,
> > > > Mirsad
> > > >
> > > > P.S.
> > > >
> > > > This is excerpt from dmesg log attached:
> > > >
> > > > [   25.338169] rcu: INFO: rcu_preempt detected expedited stalls on
> > > > CPUs/tasks: { 4-... } 6 jiffies s: 373 root: 0x10/.
> >
> > The "6 jiffies" leads me to believe that this kernel was built with
> > CONFIG_ANDROID=y, but also with the short expedited grace-period commit,
> > that is, with CONFIG_RCU_EXP_CPU_STALL_TIMEOUT set to 20 milliseconds
> > and with HZ set to something like 250. This means that the kernel
> > will complain if an RCU read-side critical section exceeds about 20
> > milliseconds. This is a fine thing if you are actually running on an
> > Android, but a bit extreme for general usage.
> >
> > If my guesses are correct, please apply this commit from mainline:
> >
> > 1045a06724f3 ("remove CONFIG_ANDROID")
> >
> > With this commit in place, CONFIG_RCU_EXP_CPU_STALL_TIMEOUT will default
> > to 21000 (that is, 21 seconds) and things will be much better.

Please apply the commit called out above and rerun your test.

But if you still cannot bring yourself to do that, more interspersed
below...

> Dear Paul,
>
> Actually, I was able to reproduce and bisect the commit that started causing
> described stalls (I have already explained how 5.10 didn't show any under the
> exact same saved windows and tabs set.)
>
> # bad: [1e57930e9f4083ad5854ab6eadffe790a8167fb4] Merge tag
> 'rcu.2022.05.19a' of
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
> git bisect bad 1e57930e9f4083ad5854ab6eadffe790a8167fb4
> # good: [f993aed406eaf968ba3867a76bb46c95336a33d0] Merge tag 'block-5.18-2022-05-18' of git://git.kernel.dk/linux-block
> git bisect good f993aed406eaf968ba3867a76bb46c95336a33d0
> # good: [bf2431021c8cfe52c866d7bf640ced03a35ebe91] Merge tag
> 'efi-next-for-v5.19' of
> git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
> git bisect good bf2431021c8cfe52c866d7bf640ced03a35ebe91
> # good: [be05ee54378d451e1d60196921566d6087f3079f] Merge branches
> 'docs.2022.04.20a', 'fixes.2022.04.20a', 'nocb.2022.04.11b',
> 'rcu-tasks.2022.04.11b', 'srcu.2022.
> 05.03a', 'torture.2022.04.11b', 'torture-tasks.2022.04.20a' and 'torturescript.2022.04.20a' into HEAD
> git bisect good be05ee54378d451e1d60196921566d6087f3079f
> # good: [023033fe343cdf2ba83ab762f8de69241c7fc086] tools/nolibc/types: define PATH_MAX and MAXPATHLEN
> git bisect good 023033fe343cdf2ba83ab762f8de69241c7fc086
> # good: [96980b833a21c6dc29d0dfdc8f211fb8a10256a7] tools/nolibc/string: do not use __builtin_strlen() at -O0
> git bisect good 96980b833a21c6dc29d0dfdc8f211fb8a10256a7
> # good: [0e0ff638400be8f497a35b51a4751fd823f6bd6a] tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
> git bisect good 0e0ff638400be8f497a35b51a4751fd823f6bd6a
> # good: [b2f02e9cdaade3ce565e6973ac313d2f814447f2] Merge tag
> 'lkmm.2022.05.20a' of
> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
> git bisect good b2f02e9cdaade3ce565e6973ac313d2f814447f2
> # good: [9621fbee44df940e2e1b94b0676460a538dffefa] rcu: Move expedited grace period (GP) work to RT kthread_worker
> git bisect good 9621fbee44df940e2e1b94b0676460a538dffefa
> # bad: [ce13389053a347aa9f8ffbfda2238352536e15c9] Merge branch 'exp.2022.05.11a' into HEAD
> git bisect bad ce13389053a347aa9f8ffbfda2238352536e15c9
> # first bad commit: [ce13389053a347aa9f8ffbfda2238352536e15c9] Merge branch 'exp.2022.05.11a' into HEAD

More precisely, this commit:

28b3ae426598 ("rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT")

The idea was to cause Android-based systems to have 20-millisecond
expedited RCU CPU stall warnings, but to leave other kernels at the
traditional 21-second value. Again, the "6 jiffies" below indicates
that you are running a kernel with CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
and CONFIG_ANDROID=y.

Except that a lot of distributions (including Debian and Fedora) that
have nothing to do with Android nevertheless build their kernels with
CONFIG_ANDROID=y, which no doubt seemed like a good idea at the time.

For more information, please see https://lwn.net/Articles/899743/

The upshot of the discussion was this same commit that I pointed out
earlier that removes CONFIG_ANDROID completely:

1045a06724f3 ("remove CONFIG_ANDROID")

Again, please apply this commit and rerun your test.
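(For reference, after that commit is applied one would expect the
rebuilt kernel's .config to read roughly as follows; this is my
illustration of Paul's description, and the exact value depends on
your other Kconfig choices:)

```
# CONFIG_ANDROID no longer exists after commit 1045a06724f3
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=21000
```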

> The bug looks like this:
>
> [ 57.876820] rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-... } 6 jiffies s: 545 root: 0x1/.

Yes, again "6 jiffies" which is roughly the 20 milliseconds set for
kernels built with CONFIG_ANDROID=y and HZ=250 that have this commit:

28b3ae426598 ("rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT")

But not this commit:

1045a06724f3 ("remove CONFIG_ANDROID")

Yet again, please apply this commit and rerun your test.

> [ 57.876854] rcu: blocking rcu_node structures (internal RCU debug):
> [ 57.876862] Task dump for CPU 0:
> [ 57.876867] task:snap-store state:R running task stack: 0 pid: 2678 ppid: 2230 flags:0x00004008
> [ 57.876883] Call Trace:
> [ 57.876888] <TASK>
> [ 57.876893] ? _raw_spin_lock_irqsave+0x93/0xf0
> [ 57.876909] ? sysvec_apic_timer_interrupt+0x89/0x90
> [ 57.876920] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
> [ 57.876930] ? sysvec_apic_timer_interrupt+0x89/0x90
> [ 57.876939] ? sysvec_apic_timer_interrupt+0x89/0x90
> [ 57.876948] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
> [ 57.876959] ? lzma_len+0x1fb/0x3b0
> [ 57.876971] ? dict_repeat+0xb3/0x120
> [ 57.876981] ? lzma_main+0x943/0x1260
> [ 57.876995] ? lzma2_lzma+0x2a4/0x3b0
> [ 57.877007] ? xz_dec_lzma2_run+0x11f/0xb90
> [ 57.877018] ? crc32_le_base+0x186/0x250
> [ 57.877032] ? xz_dec_run+0x346/0x11e0
> [ 57.877046] ? squashfs_xz_uncompress+0x135/0x330
> [ 57.877060] ? lzo_init+0xd0/0xd0
> [ 57.877069] ? squashfs_decompress+0x80/0xb0
> [ 57.877079] ? squashfs_read_data+0x1eb/0x920
> [ 57.877090] ? __filemap_get_folio+0x37e/0x540
> [ 57.877104] ? squashfs_bio_read.isra.2+0x2a0/0x2a0
> [ 57.877118] ? squashfs_readpage_block+0x493/0x950
> [ 57.877131] ? squashfs_readpage+0xa52/0x1140
> [ 57.877143] ? xas_nomem+0x2f/0x100
> [ 57.877154] ? squashfs_copy_cache+0x360/0x360
> [ 57.877164] ? folio_wait_bit+0x3c0/0x3c0
> [ 57.877175] ? __kasan_check_read+0x11/0x20
> [ 57.877186] ? pagevec_add_and_need_flush+0x8c/0xa0
> [ 57.877197] ? folio_add_lru+0x42/0x70
> [ 57.877206] ? filemap_add_folio+0xd4/0x130
> [ 57.877216] ? add_to_page_cache_locked+0xa0/0xa0
> [ 57.877228] ? filemap_read_folio.isra.62+0xbe/0x4d0
> [ 57.877238] ? __kasan_check_write+0x14/0x20
> [ 57.877248] ? filemap_page_mkwrite+0x6a0/0x6a0
> [ 57.877260] ? filemap_add_folio+0x130/0x130
> [ 57.877269] ? try_charge_memcg+0x4cc/0xa90
> [ 57.877281] ? filemap_fault+0xc9a/0xe50
> [ 57.877293] ? generic_file_read_iter+0x230/0x230
> [ 57.877305] ? __rcu_read_unlock+0x55/0x260
> [ 57.877317] ? __do_fault+0x7c/0x1b0
> [ 57.877329] ? __handle_mm_fault+0x173b/0x1970
> [ 57.877340] ? copy_page_range+0x1890/0x1890
> [ 57.877356] ? handle_mm_fault+0x11b/0x380
> [ 57.877366] ? do_user_addr_fault+0x258/0x810
> [ 57.877380] ? exc_page_fault+0x60/0xe0
> [ 57.877389] ? asm_exc_page_fault+0x8/0x30
> [ 57.877398] ? asm_exc_page_fault+0x1e/0x30
> [ 57.877410] </TASK>
>
> Please find the complete dmesg, my lshw output, and the bisect log attached.
>
> I am not competent to estimate how this commit causes problems of this kind,
> or whether stalls of up to 20 jiffies are worth eradicating.

The Android folks believe that they are worth eradicating on Android
platforms, as you can see from the commit log on:

28b3ae426598 ("rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT")

But this would be excessive on many other platforms, especially those
with large quantities of CPUs, devices, or memory.

> Hope this helps.

Likewise!

Thanx, Paul

> Thank you,
> Mirsad
>
> --
> Mirsad Goran Todorovac
> Sistem inženjer
> Grafički fakultet | Akademija likovnih umjetnosti
> Sveučilište u Zagrebu
> --
> System engineer
> Faculty of Graphic Arts | Academy of Fine Arts
> University of Zagreb, Republic of Croatia
> The European Union


2022-12-20 10:57:05

by Mirsad Todorovac

Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in squashfs_readahead())

On 11/18/22 17:51, Elliott, Robert (Servers) wrote:
>
>
>> -----Original Message-----
>> From: Phillip Lougher <[email protected]>
>> Sent: Friday, November 18, 2022 12:11 AM
>> To: Mirsad Goran Todorovac <[email protected]>; LKML <linux-
>> [email protected]>; Paul E. McKenney <[email protected]>
>> Cc: [email protected]; Thorsten Leemhuis
>> <[email protected]>
>> Subject: Re: BUG: in squashfs_xz_uncompress() (Was: RCU stalls in
>> squashfs_readahead())
>>
>> On 17/11/2022 23:05, Mirsad Goran Todorovac wrote:
>>> Hi,
>>>
>>> While trying to bisect, I've found another bug that predated the
>>> introduction of squashfs_readahead(), but it has
>>> a common denominator in squashfs_decompress() and
>> squashfs_xz_uncompress().
>>
>> Wrong, the stall is happening in the XZ decompressor library, which
>> is *not* in Squashfs.
>>
>> This reported stall in the decompressor code is likely a symptom of you
>> deliberately thrashing your system. When the system thrashes everything
>> starts to happen very slowly, and the system will spend a lot of
>> its time doing page I/O, and the CPU will spend a lot of time in
>> any CPU intensive code like the XZ decompressor library.
>>
>> So the fact the stall is being hit here is a symptom and not
>> a cause. The decompressor code is likely running slowly due to
>> thrashing and waiting on paged-out buffers. This is not indicative
>> of any bug, merely a system running slowly due to overload.
>>
>> As I said, this is not a Squashfs issue, because the code running
>> when the stall takes place isn't in Squashfs.
>>
>> The people responsible for the rcu code should have a lot more insight
>> about what happens when the system is thrashing, and how this will
>> throw up false positives. I believe this is an instance of perfectly
>> correct code, running slowly due to thrashing, being incorrectly
>> flagged as looping.
>>
>> CC'ing Paul E. McKenney <[email protected]>
>>
>> Phillip
>
> How big can these readahead sizes be? Should one of the loops include
> cond_resched() calls?

Please allow me to assert that the 6.1.0+ kernel (this morning's 6 AM
Berlin-time build of Torvalds' tree) built with CONFIG_KMEMLEAK=y,
CONFIG_KASAN=y, CONFIG_LRU_GEN=y (multi-gen LRU) and
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=20 does not exhibit the previously
seen RCU stalls, even with a timeout as low as 20 ms.

So I guess kudos go to the MG-LRU developers, unless Mr. Lougher has
done something efficient in the meantime.

My $0.02!

Thank you,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia