2021-01-20 10:46:59

by Tetsuo Handa

Subject: [PATCH v4 (resend)] lockdep: Allow tuning tracing capacity constants.

Since syzkaller keeps running test cases until the kernel crashes, it
tends to exercise far more locking dependencies than normal systems do.
As a result, syzbot reports that fuzz testing was terminated due to
hitting the upper limits lockdep can track [1] [2] [3].

Peter Zijlstra does not want to allow tuning these limits via kernel
config options, since such a change discourages thinking. But analysis
via /proc/lockdep* did not reveal any obvious culprit [4] [5]. It is
possible that the many hundreds of kn->active lock instances contribute
to these problems to some degree, but there is no way to verify whether
these instances are created to protect the same callback functions.
Unless Peter provides a way to key these instances by "which callback
functions the lock instance will protect" (identified by something like
an MD5 hash of the string representations of those callback functions)
rather than by a plain serial number, I don't think we can verify the
culprit.

[1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
[2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
[3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
[4] https://lkml.kernel.org/r/[email protected]
[5] https://lkml.kernel.org/r/[email protected]

Reported-by: syzbot <[email protected]>
Reported-by: syzbot <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Acked-by: Dmitry Vyukov <[email protected]>
---
 kernel/locking/lockdep.c           |  2 +-
 kernel/locking/lockdep_internals.h |  8 +++---
 lib/Kconfig.debug                  | 40 ++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c1418b47f625..c0553872668a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1391,7 +1391,7 @@ static int add_lock_to_list(struct lock_class *this,
 /*
  * For good efficiency of modular, we use power of 2
  */
-#define MAX_CIRCULAR_QUEUE_SIZE 4096UL
+#define MAX_CIRCULAR_QUEUE_SIZE (1UL << CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS)
 #define CQ_MASK (MAX_CIRCULAR_QUEUE_SIZE-1)

/*
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index de49f9e1c11b..ecb8662e7a4e 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -99,16 +99,16 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 #define MAX_STACK_TRACE_ENTRIES 262144UL
 #define STACK_TRACE_HASH_SIZE 8192
 #else
-#define MAX_LOCKDEP_ENTRIES 32768UL
+#define MAX_LOCKDEP_ENTRIES (1UL << CONFIG_LOCKDEP_BITS)
 
-#define MAX_LOCKDEP_CHAINS_BITS 16
+#define MAX_LOCKDEP_CHAINS_BITS CONFIG_LOCKDEP_CHAINS_BITS
 
 /*
  * Stack-trace: tightly packed array of stack backtrace
  * addresses. Protected by the hash_lock.
  */
-#define MAX_STACK_TRACE_ENTRIES 524288UL
-#define STACK_TRACE_HASH_SIZE 16384
+#define MAX_STACK_TRACE_ENTRIES (1UL << CONFIG_LOCKDEP_STACK_TRACE_BITS)
+#define STACK_TRACE_HASH_SIZE (1 << CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS)
 #endif

/*
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7937265ef879..4cb84b499636 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1332,6 +1332,46 @@ config LOCKDEP
 config LOCKDEP_SMALL
 	bool
 
+config LOCKDEP_BITS
+	int "Bitsize for MAX_LOCKDEP_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 15
+	help
+	  Try increasing this value if you hit "BUG: MAX_LOCKDEP_ENTRIES too low!" message.
+
+config LOCKDEP_CHAINS_BITS
+	int "Bitsize for MAX_LOCKDEP_CHAINS"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 16
+	help
+	  Try increasing this value if you hit "BUG: MAX_LOCKDEP_CHAINS too low!" message.
+
+config LOCKDEP_STACK_TRACE_BITS
+	int "Bitsize for MAX_STACK_TRACE_ENTRIES"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 19
+	help
+	  Try increasing this value if you hit "BUG: MAX_STACK_TRACE_ENTRIES too low!" message.
+
+config LOCKDEP_STACK_TRACE_HASH_BITS
+	int "Bitsize for STACK_TRACE_HASH_SIZE"
+	depends on LOCKDEP && !LOCKDEP_SMALL
+	range 10 30
+	default 14
+	help
+	  Try increasing this value if you need large MAX_STACK_TRACE_ENTRIES.
+
+config LOCKDEP_CIRCULAR_QUEUE_BITS
+	int "Bitsize for elements in circular_queue struct"
+	depends on LOCKDEP
+	range 10 30
+	default 12
+	help
+	  Try increasing this value if you hit "lockdep bfs error:-1" warning due to __cq_enqueue() failure.
+
 config DEBUG_LOCKDEP
 	bool "Lock dependency engine debugging"
 	depends on DEBUG_KERNEL && LOCKDEP
--
2.18.4
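The Kconfig defaults in the patch are chosen so that a default build keeps
the previous hard-coded limits. A quick sketch checking that arithmetic
(values taken from the patch; the dict names are just labels, not kernel
symbols):

```python
# Each tunable's default bit count, and the hard-coded constant it replaces.
# MAX_LOCKDEP_CHAINS_BITS is used directly as a shift count elsewhere in
# lockdep (the chain table holds 1 << MAX_LOCKDEP_CHAINS_BITS entries);
# the other four feed a "1 << bits" expression in the patch itself.
defaults = {
    "LOCKDEP_BITS": 15,                   # MAX_LOCKDEP_ENTRIES was 32768UL
    "LOCKDEP_CHAINS_BITS": 16,            # MAX_LOCKDEP_CHAINS was 1 << 16
    "LOCKDEP_STACK_TRACE_BITS": 19,       # MAX_STACK_TRACE_ENTRIES was 524288UL
    "LOCKDEP_STACK_TRACE_HASH_BITS": 14,  # STACK_TRACE_HASH_SIZE was 16384
    "LOCKDEP_CIRCULAR_QUEUE_BITS": 12,    # MAX_CIRCULAR_QUEUE_SIZE was 4096UL
}
old_limits = {
    "LOCKDEP_BITS": 32768,
    "LOCKDEP_CHAINS_BITS": 65536,
    "LOCKDEP_STACK_TRACE_BITS": 524288,
    "LOCKDEP_STACK_TRACE_HASH_BITS": 16384,
    "LOCKDEP_CIRCULAR_QUEUE_BITS": 4096,
}
for name, bits in defaults.items():
    # Verify the shifted default reproduces the old constant exactly.
    assert (1 << bits) == old_limits[name], name
print("all defaults match the previous hard-coded limits")
```

So users who never touch the new options see identical behavior; only
those who hit the "too low!" messages need to raise a value.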


2021-01-20 12:05:42

by Dmitry Vyukov

Subject: Re: [PATCH v4 (resend)] lockdep: Allow tuning tracing capacity constants.

On Wed, Jan 20, 2021 at 11:12 AM Tetsuo Handa
<[email protected]> wrote:
>
> [...]
>
> Reported-by: syzbot <[email protected]>
> Reported-by: syzbot <[email protected]>
> Reported-by: syzbot <[email protected]>
> Signed-off-by: Tetsuo Handa <[email protected]>
> Acked-by: Dmitry Vyukov <[email protected]>

Thanks for your persistence!
I still support this. And assessment of lockdep stats on overflow
seems to confirm it's just a very large lock graph triggered by
syzkaller.
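That assessment can be sketched as follows (illustrative only: the line
layout matches the "counter [max: limit]" style of /proc/lockdep_stats
output, but the sample counter value below is made up):

```python
def parse_stat(line):
    """Parse one '/proc/lockdep_stats'-style line into (name, used, max)."""
    name, rest = line.split(":", 1)
    # Strip the "[max: N]" decoration so only the two integers remain.
    used, cap = (int(tok) for tok in
                 rest.replace("[max:", " ").replace("]", " ").split())
    return name.strip(), used, cap

# Hypothetical reading shortly before an overflow report.
sample = " direct dependencies:                 31000 [max: 32768]"
name, used, cap = parse_stat(sample)
print(f"{name}: {used}/{cap} ({100 * used // cap}% full)")
```

A counter sitting near its limit with no single dominant lock class is
consistent with "just a very large graph" rather than a leak.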



2021-02-01 13:26:44

by Tetsuo Handa

Subject: Re: [PATCH v4 (resend)] lockdep: Allow tuning tracing capacity constants.

Hello, Andrew and Linus.

We are stuck because Peter cannot respond.
I think it is time to send this patch to linux-next. What do you think?

On 2021/01/20 19:18, Dmitry Vyukov wrote:
> On Wed, Jan 20, 2021 at 11:12 AM Tetsuo Handa
> <[email protected]> wrote:
>> [...]
>
> Thanks for your persistence!
> I still support this. And assessment of lockdep stats on overflow
> seems to confirm it's just a very large lock graph triggered by
> syzkaller.
>