2022-07-26 12:35:54

by Mikhail Gavrilov

Subject: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

Hi guys.
Whenever I write intensively to a btrfs volume, the message "BUG:
MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel log.

[46729.134549] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[46729.134557] turning off the locking correctness validator.
[46729.134559] Please attach the output of /proc/lock_stat to the bug report
[46729.134561] CPU: 22 PID: 166516 Comm: ThreadPoolForeg Tainted: G W L -------- --- 5.19.0-0.rc7.20220722git68e77ffbfd06.56.fc37.x86_64 #1
[46729.134566] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022
[46729.134569] Call Trace:
[46729.134572] <TASK>
[46729.134576] dump_stack_lvl+0x5b/0x77
[46729.134583] __lock_acquire.cold+0x167/0x29e
[46729.134594] lock_acquire+0xce/0x2d0
[46729.134599] ? btrfs_reserve_extent+0xbd/0x250
[46729.134606] ? btrfs_get_alloc_profile+0x17e/0x240
[46729.134611] btrfs_get_alloc_profile+0x19c/0x240
[46729.134614] ? btrfs_reserve_extent+0xbd/0x250
[46729.134618] btrfs_reserve_extent+0xbd/0x250
[46729.134629] btrfs_alloc_tree_block+0xa3/0x510
[46729.134635] ? release_extent_buffer+0xa7/0xe0
[46729.134643] split_node+0x131/0x3d0
[46729.134652] btrfs_search_slot+0x2f3/0xc30
[46729.134659] ? btrfs_insert_inode_ref+0x50/0x3b0
[46729.134664] btrfs_insert_empty_items+0x31/0x70
[46729.134669] btrfs_insert_inode_ref+0x99/0x3b0
[46729.134678] btrfs_rename2+0x317/0x1510
[46729.134690] ? vfs_rename+0x49d/0xd20
[46729.134693] ? btrfs_symlink+0x460/0x460
[46729.134696] vfs_rename+0x49d/0xd20
[46729.134705] ? do_renameat2+0x4a0/0x510
[46729.134709] do_renameat2+0x4a0/0x510
[46729.134720] __x64_sys_rename+0x3f/0x50
[46729.134724] do_syscall_64+0x5b/0x80
[46729.134729] ? memcg_slab_free_hook+0x1fd/0x2e0
[46729.134735] ? do_faccessat+0x111/0x260
[46729.134739] ? kmem_cache_free+0x379/0x3d0
[46729.134744] ? lock_is_held_type+0xe8/0x140
[46729.134749] ? do_syscall_64+0x67/0x80
[46729.134752] ? lockdep_hardirqs_on+0x7d/0x100
[46729.134757] ? do_syscall_64+0x67/0x80
[46729.134760] ? asm_exc_page_fault+0x22/0x30
[46729.134764] ? lockdep_hardirqs_on+0x7d/0x100
[46729.134768] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[46729.134773] RIP: 0033:0x7fd2a29b5afb
[46729.134798] Code: e8 7a 27 0a 00 f7 d8 19 c0 5b c3 0f 1f 40 00 b8 ff ff ff ff 5b c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 52 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f1 82 17 00 f7 d8
[46729.134801] RSP: 002b:00007fd25b70a5a8 EFLAGS: 00000282 ORIG_RAX: 0000000000000052
[46729.134805] RAX: ffffffffffffffda RBX: 00007fd25b70a5e0 RCX: 00007fd2a29b5afb
[46729.134808] RDX: 0000000000000000 RSI: 00003ba01ef60820 RDI: 00003ba00e4b2da0
[46729.134810] RBP: 00007fd25b70a660 R08: 0000000000000000 R09: 00007fd25b70a570
[46729.134812] R10: 00007ffd36b1f080 R11: 0000000000000282 R12: 00007fd25b70a5b8
[46729.134815] R13: 00003ba00e4b2da0 R14: 00007fd25b70a6c4 R15: 00003ba01ef60820
[46729.134823] </TASK>

In this regard, I want to ask, is this really a bug?
The kernel version is 5.19-rc7.

Here's the full kernel log: https://pastebin.com/hYWH7RHu
Here's /proc/lock_stat: https://pastebin.com/ex5w0QW9

--
Best Regards,
Mike Gavrilov.


2022-07-26 16:49:51

by David Sterba

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> Hi guys.
> Always with intensive writing on a btrfs volume, the message "BUG:
> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.

Increase the config value of LOCKDEP_CHAINS_BITS; the default is 16, and
18 tends to work.

2022-07-26 19:43:35

by Chris Murphy

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!



On Tue, Jul 26, 2022, at 12:42 PM, David Sterba wrote:
> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>> Hi guys.
>> Always with intensive writing on a btrfs volume, the message "BUG:
>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>
> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> tends to work.

Fedora is using 17. I'll make a request to bump it to 18. Thanks.

--
Chris Murphy

2022-07-26 20:00:52

by Chris Murphy

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!



On Tue, Jul 26, 2022, at 3:19 PM, Chris Murphy wrote:
> On Tue, Jul 26, 2022, at 12:42 PM, David Sterba wrote:
>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>>> Hi guys.
>>> Always with intensive writing on a btrfs volume, the message "BUG:
>>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>
>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>> tends to work.
>
> Fedora is using 17. I'll make a request to bump it to 18. Thanks.

Should it be 18 across all archs? Or is it OK to only bump x86_64?

--
Chris Murphy

2022-07-26 21:00:59

by David Sterba

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Tue, Jul 26, 2022 at 03:21:32PM -0400, Chris Murphy wrote:
>
>
> On Tue, Jul 26, 2022, at 3:19 PM, Chris Murphy wrote:
> > On Tue, Jul 26, 2022, at 12:42 PM, David Sterba wrote:
> >> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> >>> Hi guys.
> >>> Always with intensive writing on a btrfs volume, the message "BUG:
> >>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> >>
> >> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> >> tends to work.
> >
> > Fedora is using 17. I'll make a request to bump it to 18. Thanks.
>
> Should it be 18 across all archs? Or is it OK to only bump x86_64?

I think it applies to all architectures equally, but I'm no lockdep
expert.

2022-08-03 19:34:58

by Mikhail Gavrilov

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Tue, Jul 26, 2022 at 9:47 PM David Sterba wrote:
>
> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > Hi guys.
> > Always with intensive writing on a btrfs volume, the message "BUG:
> > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>
> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> tends to work.

I confirm that after bumping LOCKDEP_CHAINS_BITS to 18, I have not seen
this kernel bug message again during several days of continuous writing
to the btrfs partition (various files, about 10 TB in total).
Tetsuo, I saw that your commit 5dc33592e95534dc8455ce3e9baaaf3dae0fff82 [1]
set the default value of LOCKDEP_CHAINS_BITS to 16.
Why not increase the default LOCKDEP_CHAINS_BITS to 18?
Thanks.


[1] https://github.com/torvalds/linux/blame/master/lib/Kconfig.debug#L1387

--
Best Regards,
Mike Gavrilov.

2022-08-03 20:15:20

by Chris Murphy

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!



On Wed, Aug 3, 2022, at 3:28 PM, Mikhail Gavrilov wrote:
> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
>>
>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>> > Hi guys.
>> > Always with intensive writing on a btrfs volume, the message "BUG:
>> > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>
>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>> tends to work.
>
> I confirm that after bumping LOCKDEP_CHAINS_BITS to 18 several days of
> continuous writing on the BTRFS partition with different files with a
> total size of 10Tb I didn't see this kernel bug message again.
> Tetsuo, I saw your commit 5dc33592e95534dc8455ce3e9baaaf3dae0fff82 [1]
> set for LOCKDEP_CHAINS_BITS default value 16.
> Why not increase LOCKDEP_CHAINS_BITS to 18 by default?

This will be making it into Fedora debug kernels, which have lockdep enabled on them, starting with 5.20 series, which are now building in koji.
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921




--
Chris Murphy

2022-08-04 08:12:22

by Mikhail Gavrilov

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Thu, Aug 4, 2022 at 1:01 AM Chris Murphy wrote:
>
> This will be making it into Fedora debug kernels, which have lockdep enabled on them, starting with 5.20 series, which are now building in koji.
> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921

I saw this change, but it would be good if users of all other
distributions could benefit too.

--
Best Regards,
Mike Gavrilov.

2022-08-04 11:29:17

by Tetsuo Handa

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On 2022/08/04 16:35, Mikhail Gavrilov wrote:
> On Thu, Aug 4, 2022 at 1:01 AM Chris Murphy <[email protected]> wrote:
>>
>> This will be making it into Fedora debug kernels, which have lockdep enabled on them, starting with 5.20 series, which are now building in koji.
>> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
>
> I saw this change, but it would be good if users of all other
> distributions will be happy too.
>

I'm not a lockdep maintainer.

Please submit a patch to the lockdep maintainers and persuade them
to change the default value. ;-)


2023-01-24 20:28:09

by Mikhail Gavrilov

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Tue, Jul 26, 2022 at 9:47 PM David Sterba wrote:
>
> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > Hi guys.
> > Always with intensive writing on a btrfs volume, the message "BUG:
> > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>
> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> tends to work.

Hi,
Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.

❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
CONFIG_LOCKDEP_CHAINS_BITS=18

[88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
[88685.088124] turning off the locking correctness validator.
[88685.088133] Please attach the output of /proc/lock_stat to the bug report
[88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
[88685.088154] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022

What's next? Increase this value to 19?
The current full kernel log and lock_stat are attached below.

--
Best Regards,
Mike Gavrilov.


Attachments:
dmesg.tar.xz (35.89 kB)
lock_stat.tar.xz (94.65 kB)

2023-01-25 17:21:07

by David Sterba

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
> >
> > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > Hi guys.
> > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> >
> > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > tends to work.
>
> Hi,
> Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
>
> ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> CONFIG_LOCKDEP_CHAINS_BITS=18
>
> [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> [88685.088124] turning off the locking correctness validator.
> [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
> ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> [88685.088154] Hardware name: System manufacturer System Product
> Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
>
> What's next? Increase this value to 19?

Yes, though increasing the value is a workaround so you may see the
warning again.

2023-01-26 09:48:02

by Mikhail Gavrilov

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Wed, Jan 25, 2023 at 10:21 PM David Sterba wrote:
>
> On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
> > >
> > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > Hi guys.
> > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > >
> > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > tends to work.
> >
> > Hi,
> > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> >
> > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > CONFIG_LOCKDEP_CHAINS_BITS=18
> >
> > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > [88685.088124] turning off the locking correctness validator.
> > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
> > ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > [88685.088154] Hardware name: System manufacturer System Product
> > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> >
> > What's next? Increase this value to 19?
>
> Yes, though increasing the value is a workaround so you may see the
> warning again.

Does this warning make any sense if we just ignore it and increase the
threshold value every time?
Maybe set it to 99 right away? Or remove the check altogether?

--
Best Regards,
Mike Gavrilov.

2023-01-26 17:39:28

by Boqun Feng

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

[Cc lock folks]

On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> On Wed, Jan 25, 2023 at 10:21 PM David Sterba <[email protected]> wrote:
> >
> > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
> > > >
> > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > Hi guys.
> > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > >
> > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > tends to work.
> > >
> > > Hi,
> > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > >
> > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > >
> > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > [88685.088124] turning off the locking correctness validator.
> > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
> > > ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > [88685.088154] Hardware name: System manufacturer System Product
> > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > >
> > > What's next? Increase this value to 19?
> >
> > Yes, though increasing the value is a workaround so you may see the
> > warning again.
>
> Is there any sense in this WARNING if we would ignore it and every
> time increase the threshold value?

Lockdep uses a statically allocated array to track lock holding chains, to
avoid dynamic memory allocation in its own code. So if you see the
warning, it means your test has more combinations of lock holdings than
the array can record. In other words, you have reached a resource limit,
and in that sense it makes sense to just ignore it and increase the
value: you want to give lockdep enough resources to work, right?

> May Be set 99 right away? Or remove such a check condition?

That requires 2^99 * 5 * sizeof(u16) bytes of memory for the lock holding
chains array...
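
To put numbers on that formula, here is a minimal user-space sketch in C.
It is not kernel code: it assumes only that the array size is
2^bits * 5 * sizeof(u16), as stated above, and the helper names
(chain_hlocks_bytes, print_chain_hlocks_sizes) are made up for this
illustration. A double is used because 2^99 does not fit in a 64-bit
integer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Multiplier from the formula above: 2^bits * 5 * sizeof(u16). */
#define HLOCKS_PER_CHAIN 5

/*
 * Approximate size in bytes of an array sized by that formula.
 * Hypothetical helper, for illustration only.
 */
static double chain_hlocks_bytes(unsigned int bits)
{
    double bytes = (double)HLOCKS_PER_CHAIN * sizeof(uint16_t);

    /* Stay well inside the range a double can represent. */
    assert(bits <= 1000);

    /* Each extra bit doubles the size; doubling is exact in a double. */
    for (unsigned int i = 0; i < bits; i++)
        bytes *= 2.0;
    return bytes;
}

/* Print the estimate for the values mentioned in this thread. */
static void print_chain_hlocks_sizes(void)
{
    static const unsigned int bits[] = { 16, 18, 19, 30, 99 };

    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++)
        printf("bits=%2u -> %.4g bytes\n", bits[i],
               chain_hlocks_bytes(bits[i]));
}
```

Under this assumption, 16 bits works out to 640 KiB, 18 bits to 2.5 MiB,
and 30 bits to 10 GiB for this one array alone; other lockdep tables
sized from the same option are not counted here.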

However, a few other options we can try in lockdep are:

* warn but do not turn off lockdep: the lock holding chain is
only a cache of the lock holding combinations lockdep has ever
seen; we also record the dependency in the graph. Without the
lock holding chain, lockdep can still work, just slower.

* allow dynamic memory allocation in lockdep: I think this might
be OK since we have lockdep_recursion to avoid the lockdep code ->
mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
missing something. And even if we allow it, the memory usage
doesn't change: you will still need that amount of memory to
track lock holding chains.

I'm not sure whether these options are better than just increasing the
number. Maybe, to unblock you ASAP, you can try making it 30, and make
sure you have enough memory to test.

Regards,
Boqun

>
> --
> Best Regards,
> Mike Gavrilov.

2023-01-26 18:31:37

by Waiman Long

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On 1/26/23 12:38, Boqun Feng wrote:
> [Cc lock folks]
>
> On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
>> On Wed, Jan 25, 2023 at 10:21 PM David Sterba <[email protected]> wrote:
>>> On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
>>>> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
>>>>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>>>>>> Hi guys.
>>>>>> Always with intensive writing on a btrfs volume, the message "BUG:
>>>>>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>>>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>>>>> tends to work.
>>>> Hi,
>>>> Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
>>>> low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
>>>>
>>>> ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
>>>> CONFIG_LOCKDEP_CHAINS_BITS=18
>>>>
>>>> [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
>>>> [88685.088124] turning off the locking correctness validator.
>>>> [88685.088133] Please attach the output of /proc/lock_stat to the bug report
>>>> [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
>>>> ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
>>>> [88685.088154] Hardware name: System manufacturer System Product
>>>> Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
>>>>
>>>> What's next? Increase this value to 19?
>>> Yes, though increasing the value is a workaround so you may see the
>>> warning again.
>> Is there any sense in this WARNING if we would ignore it and every
>> time increase the threshold value?
> Lockdep uses static allocated array to track lock holdings chains to
> avoid dynmaic memory allocation in its own code. So if you see the
> warning it means your test has more combination of lock holdings than
> the array can record. In other words, you reach the resource limitation,
> and in that sense it makes sense to just ignore it and increase the
> value: you want to give lockdep enough resource to work, right?
>
>> May Be set 99 right away? Or remove such a check condition?
> That requires having 2^99 * 5 * sizeof(u16) memory for lock holding
> chains array..

Note that every increment of LOCKDEP_CHAINS_BITS doubles the storage
space. With 99, that will likely exceed the total amount of memory you
have in your system.

Boqun, where does the figure of 5 come from? It is just a simple u16 array
of size MAX_LOCKDEP_CHAIN_HLOCKS. The chain_hlocks array stores the lock
chains that show up in the lockdep splats and in the /proc/lockdep*
files. Each chain is of variable size. As we add a new lock to a chain,
we have to repeatedly deallocate and reallocate a larger chain buffer.
That causes fragmentation in chain_hlocks[]. So if we have a
very long lock chain, the allocation may fail because the largest free
block is smaller than the requested chain length. There may be enough
free space in chain_hlocks, but it is just too fragmented to be useful.
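
The fragmentation effect described above can be illustrated with a small
user-space sketch in C. It is a toy model, not lockdep's actual allocator:
the 16-slot pool, the struct and every function name here are hypothetical.
It only shows how the total free space can exceed a request while no
contiguous run is long enough to hold it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 16 /* hypothetical tiny stand-in for chain_hlocks[] */

/* used[i] is true when slot i currently belongs to some chain. */
struct pool {
    bool used[POOL_SIZE];
};

/* Total number of free slots, wherever they are. */
static size_t pool_total_free(const struct pool *p)
{
    size_t n = 0;

    assert(p != NULL);
    for (size_t i = 0; i < POOL_SIZE; i++)
        if (!p->used[i])
            n++;
    return n;
}

/* Length of the longest run of adjacent free slots. */
static size_t pool_largest_free_run(const struct pool *p)
{
    size_t best = 0, run = 0;

    assert(p != NULL);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        run = p->used[i] ? 0 : run + 1;
        if (run > best)
            best = run;
    }
    return best;
}

/* A chain needs contiguous slots, so only the longest run matters. */
static bool pool_can_fit_chain(const struct pool *p, size_t chain_len)
{
    assert(chain_len > 0);
    return pool_largest_free_run(p) >= chain_len;
}

/* Build a fragmented pool: every other slot is still in use. */
static struct pool make_fragmented_pool(void)
{
    struct pool p;

    for (size_t i = 0; i < POOL_SIZE; i++)
        p.used[i] = (i % 2 == 0);
    return p;
}
```

With make_fragmented_pool(), half of the slots are free, yet a chain of
length 4 cannot be placed because no two free slots are adjacent.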

Maybe we should figure out a better way to handle this fragmentation. In
the meantime, the easiest way forward is just to increase
LOCKDEP_CHAINS_BITS by 1.

>
> However, a few other options we can try in lockdep are:
>
> * warn but not turn off the lockdep: the lock holding chain is
> only a cache for what lock holding combination lockdep has ever
> see, we also record the dependency in the graph. Without the
> lock holding chain, lockdep can still work but just slower.
>
> * allow dynmaic memory allocation in lockdep: I think this might
> be OK since we have lockdep_recursion to avoid lockdep code ->
> mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> missing something. And even we allow it, the use of memory
> doesn't change, you will still need that amout of memory to
> track lock holding chains.

It is not just the issue of calling the memory allocator. There is also
the issue of copying data from the old chain_hlocks to the new one: the
old one may be updated during the copying process unless we can freeze
everything else.

Cheers,
Longman


2023-01-26 19:00:38

by Boqun Feng

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Thu, Jan 26, 2023 at 01:30:34PM -0500, Waiman Long wrote:
> On 1/26/23 12:38, Boqun Feng wrote:
> > [Cc lock folks]
> >
> > On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <[email protected]> wrote:
> > > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
> > > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > > Hi guys.
> > > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > > tends to work.
> > > > > Hi,
> > > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > > >
> > > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > > >
> > > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > > [88685.088124] turning off the locking correctness validator.
> > > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
> > > > > ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > > >
> > > > > What's next? Increase this value to 19?
> > > > Yes, though increasing the value is a workaround so you may see the
> > > > warning again.
> > > Is there any sense in this WARNING if we would ignore it and every
> > > time increase the threshold value?
> > Lockdep uses static allocated array to track lock holdings chains to
> > avoid dynmaic memory allocation in its own code. So if you see the
> > warning it means your test has more combination of lock holdings than
> > the array can record. In other words, you reach the resource limitation,
> > and in that sense it makes sense to just ignore it and increase the
> > value: you want to give lockdep enough resource to work, right?
> >
> > > May Be set 99 right away? Or remove such a check condition?
> > That requires having 2^99 * 5 * sizeof(u16) memory for lock holding
> > chains array..
>
> Note that every increment of LOCKDEP_CHAINS_BITS double the storage space.
> With 99, that will likely exceed the total amount of memory you have in your
> system.
>
> Boqun, where does the 5 figure come from. It is just a simple u16 array of

#define MAX_LOCKDEP_CHAINS_BITS CONFIG_LOCKDEP_CHAINS_BITS
#define MAX_LOCKDEP_CHAINS (1UL << MAX_LOCKDEP_CHAINS_BITS)

#define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)

I think the last one means we assume the average length of a lock chain
is 5; in other words, on average, a task holds at most 5 locks. I don't
know where the 5 came from either, but it's there ;-)
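
Taking the three macros above at face value, a short user-space sketch in C
shows what the "average of 5" budget implies. The config value of 16 is
plugged in only as an example (it is the default mentioned earlier in this
thread), and the two helper functions are hypothetical. Fragmentation and
any other bookkeeping are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Example value only; 16 is the default mentioned in this thread. */
#define CONFIG_LOCKDEP_CHAINS_BITS 16

/* The three macros as quoted above. */
#define MAX_LOCKDEP_CHAINS_BITS CONFIG_LOCKDEP_CHAINS_BITS
#define MAX_LOCKDEP_CHAINS (1UL << MAX_LOCKDEP_CHAINS_BITS)

#define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)

/*
 * How many chains of a given average length fit into the hlocks
 * budget. Hypothetical helper, for illustration only.
 */
static unsigned long chains_that_fit(unsigned long avg_chain_len)
{
    assert(avg_chain_len > 0);
    return MAX_LOCKDEP_CHAIN_HLOCKS / avg_chain_len;
}

/* True when the hlocks budget runs out before the chain count does. */
static bool hlocks_exhausted_first(unsigned long avg_chain_len)
{
    return chains_that_fit(avg_chain_len) < MAX_LOCKDEP_CHAINS;
}
```

With these numbers, chains averaging 5 entries exactly fill the budget
(65536 chains), while a workload averaging 8 entries per chain would run
out of hlocks space after 40960 chains, well before the chain limit.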

Regards,
Boqun

> size MAX_LOCKDEP_CHAIN_HLOCKS. The chain_hlocks array stores the lock chains
> that show up in the lockdep splats and in the /proc/lockdep* files. Each
> chain is variable size. As we add new lock into the chain, we have to
> repeatedly deallocate and reallocate a larger chain buffer. That will cause
> fragmentation in the chain_hlocks[]. So if we have a very long lock chain,
> the allocation may fail because the largest free block is smaller than the
> requested chain length. There may be enough free space in chain_hlocks, but
> it is just too fragmented to be useful.
>
> Maybe we should figure out a better way to handle this fragmentation. In the
> mean time, the easiest way forward is just to increase the
> LOCKDEP_CHAINS_BITS by 1.
>
> >
> > However, a few other options we can try in lockdep are:
> >
> > * warn but not turn off the lockdep: the lock holding chain is
> > only a cache for what lock holding combination lockdep has ever
> > see, we also record the dependency in the graph. Without the
> > lock holding chain, lockdep can still work but just slower.
> >
> > * allow dynmaic memory allocation in lockdep: I think this might
> > be OK since we have lockdep_recursion to avoid lockdep code ->
> > mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> > missing something. And even we allow it, the use of memory
> > doesn't change, you will still need that amout of memory to
> > track lock holding chains.
>
> It is not just the issue of calling the memory allocator. There is also the
> issue of copying data from old chain_hlocks to new one while the old one may
> be updated during the copying process unless we can freeze everything else.
>
> Cheers,
> Longman
>

2023-01-26 19:08:18

by Waiman Long

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On 1/26/23 13:59, Boqun Feng wrote:
> On Thu, Jan 26, 2023 at 01:30:34PM -0500, Waiman Long wrote:
>> On 1/26/23 12:38, Boqun Feng wrote:
>>> [Cc lock folks]
>>>
>>> On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
>>>> On Wed, Jan 25, 2023 at 10:21 PM David Sterba <[email protected]> wrote:
>>>>> On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
>>>>>> On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
>>>>>>> On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
>>>>>>>> Hi guys.
>>>>>>>> Always with intensive writing on a btrfs volume, the message "BUG:
>>>>>>>> MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
>>>>>>> Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
>>>>>>> tends to work.
>>>>>> Hi,
>>>>>> Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
>>>>>> low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
>>>>>>
>>>>>> ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
>>>>>> CONFIG_LOCKDEP_CHAINS_BITS=18
>>>>>>
>>>>>> [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
>>>>>> [88685.088124] turning off the locking correctness validator.
>>>>>> [88685.088133] Please attach the output of /proc/lock_stat to the bug report
>>>>>> [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
>>>>>> ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
>>>>>> [88685.088154] Hardware name: System manufacturer System Product
>>>>>> Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
>>>>>>
>>>>>> What's next? Increase this value to 19?
>>>>> Yes, though increasing the value is a workaround so you may see the
>>>>> warning again.
>>>> Is there any sense in this WARNING if we would ignore it and every
>>>> time increase the threshold value?
>>> Lockdep uses static allocated array to track lock holdings chains to
>>> avoid dynmaic memory allocation in its own code. So if you see the
>>> warning it means your test has more combination of lock holdings than
>>> the array can record. In other words, you reach the resource limitation,
>>> and in that sense it makes sense to just ignore it and increase the
>>> value: you want to give lockdep enough resource to work, right?
>>>
>>>> May Be set 99 right away? Or remove such a check condition?
>>> That requires having 2^99 * 5 * sizeof(u16) memory for lock holding
>>> chains array..
>> Note that every increment of LOCKDEP_CHAINS_BITS double the storage space.
>> With 99, that will likely exceed the total amount of memory you have in your
>> system.
>>
>> Boqun, where does the 5 figure come from. It is just a simple u16 array of
> #define MAX_LOCKDEP_CHAINS_BITS CONFIG_LOCKDEP_CHAINS_BITS
> #define MAX_LOCKDEP_CHAINS (1UL << MAX_LOCKDEP_CHAINS_BITS)
>
> #define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)
>
> I think the last one means we think the average length of a lock chain
> is 5, in other words, in average, a task hold at most 5 locks. I don't
> know where the 5 came from either, but it's there ;-)

You are right. I missed that when I looked. So 5 is assumed to be the
average length of a lock chain.

Thanks,
Longman


2023-01-26 22:43:10

by Mikhail Gavrilov

Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Thu, Jan 26, 2023 at 10:39 PM Boqun Feng wrote:
>
> [Cc lock folks]
>
> On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <[email protected]> wrote:
> > >
> > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > Hi guys.
> > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > >
> > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > tends to work.
> > > >
> > > > Hi,
> > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > >
> > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > >
> > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > [88685.088124] turning off the locking correctness validator.
> > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
> > > > ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > >
> > > > What's next? Increase this value to 19?
> > >
> > > Yes, though increasing the value is a workaround so you may see the
> > > warning again.
> >
> > Is there any sense in this WARNING if we would ignore it and every
> > time increase the threshold value?
>
> Lockdep uses static allocated array to track lock holdings chains to
> avoid dynmaic memory allocation in its own code. So if you see the
> warning it means your test has more combination of lock holdings than
> the array can record. In other words, you reach the resource limitation,
> and in that sense it makes sense to just ignore it and increase the
> value: you want to give lockdep enough resource to work, right?

It is needed for btrfs to work correctly. David, am I right?

>
> > May Be set 99 right away? Or remove such a check condition?
>
> That requires having 2^99 * 5 * sizeof(u16) memory for lock holding
> chains array..
>
> However, a few other options we can try in lockdep are:
>
> * warn but not turn off the lockdep: the lock holding chain is
> only a cache for what lock holding combination lockdep has ever
> seen, we also record the dependency in the graph. Without the
> lock holding chain, lockdep can still work but just slower.
>
> * allow dynamic memory allocation in lockdep: I think this might
> be OK since we have lockdep_recursion to avoid lockdep code ->
> mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> missing something. And even if we allow it, the use of memory
> doesn't change, you will still need that amount of memory to
> track lock holding chains.
>
> I'm not sure whether these options are better than just increasing the
> number, maybe to unblock you ASAP, you can try making it 30 and make sure
> you have enough memory to test.

As for simply increasing LOCKDEP_CHAINS_BITS by 1: where should this
be done? In the vanilla kernel on kernel.org? In a specific distribution?
Or must users rebuild the kernel themselves? Increasing
LOCKDEP_CHAINS_BITS by 1 may be the most reliable solution, but it is
difficult to distribute to end users, because the point of using a packaged
distribution is lost (users would have to change LOCKDEP_CHAINS_BITS in
the config and rebuild the kernel themselves).

It would be great if the chosen value simply worked always and
everywhere. 30? OK! But as I understand it, btrfs gives no
guarantees about this. David, am I right?

Anyway, thank you for keeping the conversation going.

--
Best Regards,
Mike Gavrilov.

2023-01-26 22:52:42

by Boqun Feng

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Fri, Jan 27, 2023 at 03:42:52AM +0500, Mikhail Gavrilov wrote:
> On Thu, Jan 26, 2023 at 10:39 PM Boqun Feng <[email protected]> wrote:
> >
> > [Cc lock folks]
> >
> > On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <[email protected]> wrote:
> > > >
> > > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > > Hi guys.
> > > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > > >
> > > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > > tends to work.
> > > > >
> > > > > Hi,
> > > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > > >
> > > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > > >
> > > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > > [88685.088124] turning off the locking correctness validator.
> > > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
> > > > > ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > > >
> > > > > What's next? Increase this value to 19?
> > > >
> > > > Yes, though increasing the value is a workaround so you may see the
> > > > warning again.
> > >
> > > Is there any sense in this WARNING if we would ignore it and every
> > > time increase the threshold value?
> >
> > Lockdep uses a statically allocated array to track lock holding chains to
> > avoid dynamic memory allocation in its own code. So if you see the
> > warning, it means your test has more combinations of lock holdings than
> > the array can record. In other words, you have hit a resource limit,
> > and in that sense it makes sense to just ignore it and increase the
> > value: you want to give lockdep enough resources to work, right?
>
> It is needed for correct working btrfs. David, am I right?
>
> >
> > > May Be set 99 right away? Or remove such a check condition?
> >
> > That requires having 2^99 * 5 * sizeof(u16) memory for lock holding
> > chains array..
> >
> > However, a few other options we can try in lockdep are:
> >
> > * warn but not turn off the lockdep: the lock holding chain is
> > only a cache for what lock holding combination lockdep has ever
> > seen, we also record the dependency in the graph. Without the
> > lock holding chain, lockdep can still work but just slower.
> >
> > * allow dynamic memory allocation in lockdep: I think this might
> > be OK since we have lockdep_recursion to avoid lockdep code ->
> > mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> > missing something. And even if we allow it, the use of memory
> > doesn't change, you will still need that amount of memory to
> > track lock holding chains.
> >
> > I'm not sure whether these options are better than just increasing the
> > number, maybe to unblock you ASAP, you can try making it 30 and make sure
> > you have enough memory to test.
>
> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> be done? In vanilla kernel on kernel.org? In a specific distribution?
> or the user must rebuild the kernel himself? Maybe increase
> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> to distribute to end users because the meaning of using packaged
> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> config and rebuild the kernel by yourself).
>

Lockdep is a dev tool that helps find deadlocks, and it adds
overhead when enabled. Although it's possible, I doubt anyone will run a
LOCKDEP-enabled kernel in a production environment. In other words, it's
a debug/test-kernel-only option.

Regards,
Boqun

> It would be great if the chosen value would simply work always
> everywhere. 30? ok! But as I understand, btrfs does not have any
> guarantees for this. David, am I right?
>
> Anyway, thank you for keeping the conversation going.
>
> --
> Best Regards,
> Mike Gavrilov.

2023-01-26 23:50:25

by Boqun Feng

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Fri, Jan 27, 2023 at 03:42:52AM +0500, Mikhail Gavrilov wrote:
> On Thu, Jan 26, 2023 at 10:39 PM Boqun Feng <[email protected]> wrote:
> >
> > [Cc lock folks]
> >
> > On Thu, Jan 26, 2023 at 02:47:42PM +0500, Mikhail Gavrilov wrote:
> > > On Wed, Jan 25, 2023 at 10:21 PM David Sterba <[email protected]> wrote:
> > > >
> > > > On Wed, Jan 25, 2023 at 01:27:48AM +0500, Mikhail Gavrilov wrote:
> > > > > On Tue, Jul 26, 2022 at 9:47 PM David Sterba <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Jul 26, 2022 at 05:32:54PM +0500, Mikhail Gavrilov wrote:
> > > > > > > Hi guys.
> > > > > > > Always with intensive writing on a btrfs volume, the message "BUG:
> > > > > > > MAX_LOCKDEP_CHAIN_HLOCKS too low!" appears in the kernel logs.
> > > > > >
> > > > > > Increase the config value of LOCKDEP_CHAINS_BITS, default is 16, 18
> > > > > > tends to work.
> > > > >
> > > > > Hi,
> > > > > Today I was able to get the message "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too
> > > > > low!" again even with LOCKDEP_CHAINS_BITS=18 and kernel 6.2-rc5.
> > > > >
> > > > > ❯ cat /boot/config-`uname -r` | grep LOCKDEP_CHAINS_BITS
> > > > > CONFIG_LOCKDEP_CHAINS_BITS=18
> > > > >
> > > > > [88685.088099] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
> > > > > [88685.088124] turning off the locking correctness validator.
> > > > > [88685.088133] Please attach the output of /proc/lock_stat to the bug report
> > > > > [88685.088142] CPU: 14 PID: 1749746 Comm: mv Tainted: G W L
> > > > > ------- --- 6.2.0-0.rc5.20230123git2475bf0250de.38.fc38.x86_64 #1
> > > > > [88685.088154] Hardware name: System manufacturer System Product
> > > > > Name/ROG STRIX X570-I GAMING, BIOS 4408 10/28/2022
> > > > >
> > > > > What's next? Increase this value to 19?
> > > >
> > > > Yes, though increasing the value is a workaround so you may see the
> > > > warning again.
> > >
> > > Is there any sense in this WARNING if we would ignore it and every
> > > time increase the threshold value?
> >
> > Lockdep uses a statically allocated array to track lock holding chains to
> > avoid dynamic memory allocation in its own code. So if you see the
> > warning, it means your test has more combinations of lock holdings than
> > the array can record. In other words, you have hit a resource limit,
> > and in that sense it makes sense to just ignore it and increase the
> > value: you want to give lockdep enough resources to work, right?
>
> It is needed for correct working btrfs. David, am I right?
>

Lockdep is not needed for btrfs to work correctly in production. It's a
tool that helps btrfs developers find deadlocks in
development/test/debug environments. End users, i.e. the users of the Linux
kernel, don't need it.

Regards,
Boqun

> >
> > > May Be set 99 right away? Or remove such a check condition?
> >
> > That requires having 2^99 * 5 * sizeof(u16) memory for lock holding
> > chains array..
> >
> > However, a few other options we can try in lockdep are:
> >
> > * warn but not turn off the lockdep: the lock holding chain is
> > only a cache for what lock holding combination lockdep has ever
> > seen, we also record the dependency in the graph. Without the
> > lock holding chain, lockdep can still work but just slower.
> >
> > * allow dynamic memory allocation in lockdep: I think this might
> > be OK since we have lockdep_recursion to avoid lockdep code ->
> > mm code -> lockdep code -> mm code ... deadlock. But maybe I'm
> > missing something. And even if we allow it, the use of memory
> > doesn't change, you will still need that amount of memory to
> > track lock holding chains.
> >
> > I'm not sure whether these options are better than just increasing the
> > number, maybe to unblock you ASAP, you can try making it 30 and make sure
> > you have enough memory to test.
>
> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> be done? In vanilla kernel on kernel.org? In a specific distribution?
> or the user must rebuild the kernel himself? Maybe increase
> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> to distribute to end users because the meaning of using packaged
> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> config and rebuild the kernel by yourself).
>
> It would be great if the chosen value would simply work always
> everywhere. 30? ok! But as I understand, btrfs does not have any
> guarantees for this. David, am I right?
>
> Anyway, thank you for keeping the conversation going.
>
> --
> Best Regards,
> Mike Gavrilov.

2023-01-27 00:21:52

by Waiman Long

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!


On 1/26/23 17:42, Mikhail Gavrilov wrote:
>> I'm not sure whether these options are better than just increasing the
>> number, maybe to unblock your ASAP, you can try make it 30 and make sure
>> you have large enough memory to test.
> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> be done? In vanilla kernel on kernel.org? In a specific distribution?
> or the user must rebuild the kernel himself? Maybe increase
> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> to distribute to end users because the meaning of using packaged
> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> config and rebuild the kernel by yourself).

Note that lockdep is typically only enabled in a debug kernel shipped by
a distro because of the high performance overhead. The non-debug kernel
doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
for testing on the debug kernel, you can file a ticket with the distro
asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.

Cheers,
Longman
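
To gauge what a bigger value costs before asking for it, here is a rough sketch of the arithmetic. It assumes only the sizing formula quoted in this thread (2^bits * 5 * sizeof(u16) for the lock holding chains array); other lockdep structures are not counted, and the helper names are made up for illustration.

```python
# Rough estimate of the lock holding chains array size for a given
# CONFIG_LOCKDEP_CHAINS_BITS value.
# Assumption: size = 2**bits * 5 * sizeof(u16), the formula quoted in
# this thread. Other lockdep structures are not included.

U16_BYTES = 2          # sizeof(u16)
HLOCKS_PER_CHAIN = 5   # the "* 5" factor from the quoted formula


def chain_hlocks_bytes(bits: int) -> int:
    """Bytes needed for the chains array at the given bit setting."""
    return (1 << bits) * HLOCKS_PER_CHAIN * U16_BYTES


def human(n: float) -> str:
    """Format a byte count using binary units, capped at GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if n < 1024 or unit == "GiB":
            return f"{n:g} {unit}"
        n /= 1024


for bits in (16, 17, 18, 19, 20, 30):
    print(f"CHAINS_BITS={bits}: {human(chain_hlocks_bytes(bits))}")
```

Under that assumption each extra bit doubles the array, so going from 18 to 19 is cheap, while 30 needs on the order of 10 GiB for this one array.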


2023-01-27 03:38:26

by Chris Murphy

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!



On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
> On 1/26/23 17:42, Mikhail Gavrilov wrote:
>>> I'm not sure whether these options are better than just increasing the
>>> number, maybe to unblock your ASAP, you can try make it 30 and make sure
>>> you have large enough memory to test.
>> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
>> be done? In vanilla kernel on kernel.org? In a specific distribution?
>> or the user must rebuild the kernel himself? Maybe increase
>> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
>> to distribute to end users because the meaning of using packaged
>> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
>> config and rebuild the kernel by yourself).
>
> Note that lockdep is typically only enabled in a debug kernel shipped by
> a distro because of the high performance overhead. The non-debug kernel
> doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
> when testing on the debug kernel, you can file a ticket to the distro
> asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
> your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.

Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS from 17 to 18 just 6 months ago for debug kernels.
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921

If 19 is the recommended value, I don't mind sending an MR for it. But if the idea is we're going to be back here talking about bumping it to 20 in six months, I'd like to avoid that.



--
Chris Murphy

2023-01-27 04:08:16

by Boqun Feng

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
>
>
> On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
> > On 1/26/23 17:42, Mikhail Gavrilov wrote:
> >>> I'm not sure whether these options are better than just increasing the
> >>> number, maybe to unblock your ASAP, you can try make it 30 and make sure
> >>> you have large enough memory to test.
> >> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> >> be done? In vanilla kernel on kernel.org? In a specific distribution?
> >> or the user must rebuild the kernel himself? Maybe increase
> >> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> >> to distribute to end users because the meaning of using packaged
> >> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> >> config and rebuild the kernel by yourself).
> >
> > Note that lockdep is typically only enabled in a debug kernel shipped by
> > a distro because of the high performance overhead. The non-debug kernel
> > doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
> > when testing on the debug kernel, you can file a ticket to the distro
> > asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
> > your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
>
> Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
>
> If 19 the recommended value I don't mind sending an MR for it. But if
> the idea is we're going to be back here talking about bumping it to 20
> in six months, I'd like to avoid that.
>

How about a boot parameter then?

Regards,
Boqun

>
>
> --
> Chris Murphy

2023-01-27 05:36:37

by Mikhail Gavrilov

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On Fri, Jan 27, 2023 at 9:08 AM Boqun Feng <[email protected]> wrote:
>
> On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
> >
> >
> > On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
> > > On 1/26/23 17:42, Mikhail Gavrilov wrote:
> > >>> I'm not sure whether these options are better than just increasing the
> > >>> number, maybe to unblock your ASAP, you can try make it 30 and make sure
> > >>> you have large enough memory to test.
> > >> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
> > >> be done? In vanilla kernel on kernel.org? In a specific distribution?
> > >> or the user must rebuild the kernel himself? Maybe increase
> > >> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
> > >> to distribute to end users because the meaning of using packaged
> > >> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
> > >> config and rebuild the kernel by yourself).
> > >
> > > Note that lockdep is typically only enabled in a debug kernel shipped by
> > > a distro because of the high performance overhead. The non-debug kernel
> > > doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
> > > when testing on the debug kernel, you can file a ticket to the distro
> > > asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
> > > your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
> >
> > Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
> > https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
> >
> > If 19 the recommended value I don't mind sending an MR for it. But if
> > the idea is we're going to be back here talking about bumping it to 20
> > in six months, I'd like to avoid that.
> >
>
> How about a boot parameter then?

I would like this option.
It is better than rebuilding the kernel yourself or asking the
distribution's maintainers to increase this value.

Thanks.

--
Best Regards,
Mike Gavrilov.

2023-01-27 14:27:14

by Waiman Long

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

On 1/26/23 23:07, Boqun Feng wrote:
> On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
>>
>> On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
>>> On 1/26/23 17:42, Mikhail Gavrilov wrote:
>>>>> I'm not sure whether these options are better than just increasing the
>>>>> number, maybe to unblock your ASAP, you can try make it 30 and make sure
>>>>> you have large enough memory to test.
>>>> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
>>>> be done? In vanilla kernel on kernel.org? In a specific distribution?
>>>> or the user must rebuild the kernel himself? Maybe increase
>>>> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
>>>> to distribute to end users because the meaning of using packaged
>>>> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
>>>> config and rebuild the kernel by yourself).
>>> Note that lockdep is typically only enabled in a debug kernel shipped by
>>> a distro because of the high performance overhead. The non-debug kernel
>>> doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
>>> when testing on the debug kernel, you can file a ticket to the distro
>>> asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
>>> your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
>> Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
>> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
>>
>> If 19 the recommended value I don't mind sending an MR for it. But if
>> the idea is we're going to be back here talking about bumping it to 20
>> in six months, I'd like to avoid that.
>>
> How about a boot parameter then?

A boot parameter doesn't work for a statically allocated array whose size
is determined at compile time. Dynamic memory allocation isn't available yet
at early boot, when lockdep is already in use.

Cheers,
Longman


2023-01-27 15:34:06

by Chris Murphy

[permalink] [raw]
Subject: Re: BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!



On Fri, Jan 27, 2023, at 9:26 AM, Waiman Long wrote:
> On 1/26/23 23:07, Boqun Feng wrote:
>> On Thu, Jan 26, 2023 at 10:37:56PM -0500, Chris Murphy wrote:
>>>
>>> On Thu, Jan 26, 2023, at 7:20 PM, Waiman Long wrote:
>>>> On 1/26/23 17:42, Mikhail Gavrilov wrote:
>>>>>> I'm not sure whether these options are better than just increasing the
>>>>>> number, maybe to unblock your ASAP, you can try make it 30 and make sure
>>>>>> you have large enough memory to test.
>>>>> About just to increase the LOCKDEP_CHAINS_BITS by 1. Where should this
>>>>> be done? In vanilla kernel on kernel.org? In a specific distribution?
>>>>> or the user must rebuild the kernel himself? Maybe increase
>>>>> LOCKDEP_CHAINS_BITS by 1 is most reliable solution, but it difficult
>>>>> to distribute to end users because the meaning of using packaged
>>>>> distributions is lost (user should change LOCKDEP_CHAINS_BITS in
>>>>> config and rebuild the kernel by yourself).
>>>> Note that lockdep is typically only enabled in a debug kernel shipped by
>>>> a distro because of the high performance overhead. The non-debug kernel
>>>> doesn't have lockdep enabled. When LOCKDEP_CHAINS_BITS isn't big enough
>>>> when testing on the debug kernel, you can file a ticket to the distro
>>>> asking for an increase in CONFIG_LOCKDEP_CHAINS_BITS. Or you can build
>>>> your own debug kernel with a bigger CONFIG_LOCKDEP_CHAINS_BITS.
>>> Fedora bumped CONFIG_LOCKDEP_CHAINS_BITS=17 to 18 just 6 months ago for debug kernels.
>>> https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1921
>>>
>>> If 19 the recommended value I don't mind sending an MR for it. But if
>>> the idea is we're going to be back here talking about bumping it to 20
>>> in six months, I'd like to avoid that.
>>>
>> How about a boot parameter then?
>
> A boot parameter doesn't work for a statically allocated array which is
> determined at compile time. Dynamic memory allocation isn't enabled yet
> at early boot when lockdep will be used.

Also, at least in Fedora Rawhide, where the mainline debug kernels appear, they mostly get used non-interactively with automated tests. So if we're going to discover lockdep issues, we need the kernel logs to be reliable at the time those tests are run, and we don't have a practical way of adding another boot parameter just for these tests.

Anyway I went ahead and submitted an MR to bump this to 19.
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/2271



--
Chris Murphy
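
For anyone checking which value a given debug kernel was actually built with, the grep shown earlier in the thread can also be scripted. The following is a minimal sketch, not an official tool: the function name and the sample text are hypothetical, and it only assumes the usual KEY=value layout of a kernel config file.

```python
import re


def lockdep_chains_bits(config_text: str):
    """Return CONFIG_LOCKDEP_CHAINS_BITS from kernel config text, or None.

    Assumes one KEY=value setting per line, as in /boot/config-* files.
    """
    match = re.search(
        r"^CONFIG_LOCKDEP_CHAINS_BITS=(\d+)$", config_text, re.MULTILINE
    )
    return int(match.group(1)) if match else None


# Hypothetical sample; in practice the text would be read from the
# config file of the running kernel, as with the grep in this thread.
sample = "CONFIG_LOCKDEP=y\nCONFIG_LOCKDEP_CHAINS_BITS=18\n"
print(lockdep_chains_bits(sample))
```

The pattern matches the exact option name, so a misspelled key such as CONFIG_LOCKDEP_CHAIN_BITS yields None rather than a misleading value.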