2022-08-18 12:12:14

by Vincent Whitchurch

[permalink] [raw]
Subject: [PATCH] lockdep: Panic on warning if panic_on_warn is set

There does not seem to be any way to get the system to panic if a
lockdep warning is emitted, since those warnings don't use the normal
WARN() infrastructure. Panicking on any lockdep warning can be
desirable when the kernel is being run in a controlled environment
solely for the purpose of testing. Make lockdep respect panic_on_warn
to allow this, similar to KASAN and others.

Signed-off-by: Vincent Whitchurch <[email protected]>
---
kernel/locking/lockdep.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 64a13eb56078..d184bba02630 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -124,6 +124,12 @@ static __always_inline bool lockdep_enabled(void)
return true;
}

+static void lockdep_panic(void)
+{
+ if (panic_on_warn)
+ panic("panic_on_warn set ...\n");
+}
+
/*
* lockdep_lock: protects the lockdep graph, the hashes and the
* class/list/hash allocators.
@@ -977,6 +983,7 @@ static bool assign_lock_key(struct lockdep_map *lock)
pr_err("you didn't initialize this object before use?\n");
pr_err("turning off the locking correctness validator.\n");
dump_stack();
+ lockdep_panic();
return false;
}

@@ -2051,6 +2058,7 @@ static noinline void print_circular_bug(struct lock_list *this,

printk("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

static noinline void print_bfs_bug(int ret)
@@ -2607,6 +2615,7 @@ print_bad_irq_dependency(struct task_struct *curr,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

static const char *state_names[] = {
@@ -2986,6 +2995,7 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

/*
@@ -3583,6 +3593,7 @@ static void print_collision(struct task_struct *curr,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}
#endif

@@ -3959,6 +3970,7 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

/*
@@ -4038,6 +4050,7 @@ print_irq_inversion_bug(struct task_struct *curr,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

/*
@@ -4703,6 +4716,7 @@ print_lock_invalid_wait_context(struct task_struct *curr,

pr_warn("stack backtrace:\n");
dump_stack();
+ lockdep_panic();

return 0;
}
@@ -4892,6 +4906,7 @@ print_lock_nested_lock_not_held(struct task_struct *curr,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

static int __lock_is_held(const struct lockdep_map *lock, int read);
@@ -5104,6 +5119,7 @@ static void print_unlock_imbalance_bug(struct task_struct *curr,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

static noinstr int match_held_lock(const struct held_lock *hlock,
@@ -5795,6 +5811,7 @@ static void print_lock_contention_bug(struct task_struct *curr,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

static void
@@ -6420,6 +6437,7 @@ print_freed_lock_bug(struct task_struct *curr, const void *mem_from,

pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

static inline int not_in_range(const void* mem_from, unsigned long mem_len,
@@ -6475,6 +6493,7 @@ static void print_held_locks_bug(void)
lockdep_print_held_locks(current);
pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}

void debug_check_no_locks_held(void)
@@ -6593,5 +6612,6 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
lockdep_print_held_locks(curr);
pr_warn("\nstack backtrace:\n");
dump_stack();
+ lockdep_panic();
}
EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
--
2.34.1


2022-08-18 21:58:06

by Boqun Feng

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Panic on warning if panic_on_warn is set

Hi,

On Thu, Aug 18, 2022 at 01:42:58PM +0200, Vincent Whitchurch wrote:
> There does not seem to be any way to get the system to panic if a
> lockdep warning is emitted, since those warnings don't use the normal
> WARN() infrastructure. Panicking on any lockdep warning can be
> desirable when the kernel is being run in a controlled environment
> solely for the purpose of testing. Make lockdep respect panic_on_warn
> to allow this, similar to KASAN and others.
>

I'm not completely against this, but could you explain why you want to
panic on lockdep warning? I assume you want to have a kdump so that you
can understand the lock bugs closely? But lockdep discovers lock issue
possiblity, so it's not an after-the-fact detector. In other words, when
lockdep warns, the deadlock cases don't happen in the meanwhile. And
also lockdep tries very hard to print useful information to locate the
issues.

This patch add lockdep_panic() to a few places, and it's a pain for
maintaining. So why do you want to panic on lockdep warning?

Regards,
Boqun

> Signed-off-by: Vincent Whitchurch <[email protected]>
> ---
> kernel/locking/lockdep.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 64a13eb56078..d184bba02630 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -124,6 +124,12 @@ static __always_inline bool lockdep_enabled(void)
> return true;
> }
>
> +static void lockdep_panic(void)
> +{
> + if (panic_on_warn)
> + panic("panic_on_warn set ...\n");
> +}
> +
> /*
> * lockdep_lock: protects the lockdep graph, the hashes and the
> * class/list/hash allocators.
> @@ -977,6 +983,7 @@ static bool assign_lock_key(struct lockdep_map *lock)
> pr_err("you didn't initialize this object before use?\n");
> pr_err("turning off the locking correctness validator.\n");
> dump_stack();
> + lockdep_panic();
> return false;
> }
>
> @@ -2051,6 +2058,7 @@ static noinline void print_circular_bug(struct lock_list *this,
>
> printk("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> static noinline void print_bfs_bug(int ret)
> @@ -2607,6 +2615,7 @@ print_bad_irq_dependency(struct task_struct *curr,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> static const char *state_names[] = {
> @@ -2986,6 +2995,7 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> /*
> @@ -3583,6 +3593,7 @@ static void print_collision(struct task_struct *curr,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
> #endif
>
> @@ -3959,6 +3970,7 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> /*
> @@ -4038,6 +4050,7 @@ print_irq_inversion_bug(struct task_struct *curr,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> /*
> @@ -4703,6 +4716,7 @@ print_lock_invalid_wait_context(struct task_struct *curr,
>
> pr_warn("stack backtrace:\n");
> dump_stack();
> + lockdep_panic();
>
> return 0;
> }
> @@ -4892,6 +4906,7 @@ print_lock_nested_lock_not_held(struct task_struct *curr,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> static int __lock_is_held(const struct lockdep_map *lock, int read);
> @@ -5104,6 +5119,7 @@ static void print_unlock_imbalance_bug(struct task_struct *curr,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> static noinstr int match_held_lock(const struct held_lock *hlock,
> @@ -5795,6 +5811,7 @@ static void print_lock_contention_bug(struct task_struct *curr,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> static void
> @@ -6420,6 +6437,7 @@ print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
>
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> static inline int not_in_range(const void* mem_from, unsigned long mem_len,
> @@ -6475,6 +6493,7 @@ static void print_held_locks_bug(void)
> lockdep_print_held_locks(current);
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
>
> void debug_check_no_locks_held(void)
> @@ -6593,5 +6612,6 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
> lockdep_print_held_locks(curr);
> pr_warn("\nstack backtrace:\n");
> dump_stack();
> + lockdep_panic();
> }
> EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
> --
> 2.34.1
>

2022-08-19 11:05:21

by Vincent Whitchurch

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Panic on warning if panic_on_warn is set

On Thu, Aug 18, 2022 at 11:49:17PM +0200, Boqun Feng wrote:
> On Thu, Aug 18, 2022 at 01:42:58PM +0200, Vincent Whitchurch wrote:
> > There does not seem to be any way to get the system to panic if a
> > lockdep warning is emitted, since those warnings don't use the normal
> > WARN() infrastructure. Panicking on any lockdep warning can be
> > desirable when the kernel is being run in a controlled environment
> > solely for the purpose of testing. Make lockdep respect panic_on_warn
> > to allow this, similar to KASAN and others.
> >
>
> I'm not completely against this, but could you explain why you want to
> panic on lockdep warning? I assume you want to have a kdump so that you
> can understand the lock bugs closely? But lockdep discovers lock issue
> possiblity, so it's not an after-the-fact detector. In other words, when
> lockdep warns, the deadlock cases don't happen in the meanwhile. And
> also lockdep tries very hard to print useful information to locate the
> issues.

I'm not trying to obtain a kdump in this case. I test device drivers
under UML[0] and I want to make the tests stop and fail immediately if
the driver triggers any kind of problem which results in splats in the
log. I achieve this using panic_on_warn, panic_on_taint, and oops=panic
which result in a panic and an error exit code from UML.

[0] https://lore.kernel.org/lkml/[email protected]/

For lockdep, without this patch, I would be forced to parse the logs
after each test to determine if the test trigger a lockdep splat or not.

> This patch add lockdep_panic() to a few places, and it's a pain for
> maintaining. So why do you want to panic on lockdep warning?

It's adding the call to a lot of places since there is no existing
common function indicating the end of a lockdep warning. I can move the
already duplicated dump_stack() calls into the new function too so that
some code is removed. The "stack backtrace" could possible be
consolidated too, but one of the call sites uses printk instead of
pr_warn so I wasn't sure if it was OK to change that to a warn too.

2022-08-20 05:45:11

by Boqun Feng

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Panic on warning if panic_on_warn is set

On Fri, Aug 19, 2022 at 12:59:56PM +0200, Vincent Whitchurch wrote:
> On Thu, Aug 18, 2022 at 11:49:17PM +0200, Boqun Feng wrote:
> > On Thu, Aug 18, 2022 at 01:42:58PM +0200, Vincent Whitchurch wrote:
> > > There does not seem to be any way to get the system to panic if a
> > > lockdep warning is emitted, since those warnings don't use the normal
> > > WARN() infrastructure. Panicking on any lockdep warning can be
> > > desirable when the kernel is being run in a controlled environment
> > > solely for the purpose of testing. Make lockdep respect panic_on_warn
> > > to allow this, similar to KASAN and others.
> > >
> >
> > I'm not completely against this, but could you explain why you want to
> > panic on lockdep warning? I assume you want to have a kdump so that you
> > can understand the lock bugs closely? But lockdep discovers lock issue
> > possiblity, so it's not an after-the-fact detector. In other words, when
> > lockdep warns, the deadlock cases don't happen in the meanwhile. And
> > also lockdep tries very hard to print useful information to locate the
> > issues.
>
> I'm not trying to obtain a kdump in this case. I test device drivers
> under UML[0] and I want to make the tests stop and fail immediately if
> the driver triggers any kind of problem which results in splats in the
> log. I achieve this using panic_on_warn, panic_on_taint, and oops=panic
> which result in a panic and an error exit code from UML.
>
> [0] https://lore.kernel.org/lkml/[email protected]/
>
> For lockdep, without this patch, I would be forced to parse the logs
> after each test to determine if the test trigger a lockdep splat or not.
>

In that case, would a standard line with every lockdep warning help? For
example:

[...] A LOCKDEP issue detected.

Two reasons I don't think making lockdep warning as panic is a good
idea:

* We don't know what other CIs expect, given today lockdep doesn't panic
with panic_on_warn, this patch is a change of behaviors to them, and
it may break their setups/scripts.

* As I said, lockdep warnings are different than other warnings, and
panicking doesn't provide more information for debugging.

So I think an extra line helping scripts to parse may be better.

Work for you?

Regards,
Boqun

> > This patch add lockdep_panic() to a few places, and it's a pain for
> > maintaining. So why do you want to panic on lockdep warning?
>
> It's adding the call to a lot of places since there is no existing
> common function indicating the end of a lockdep warning. I can move the
> already duplicated dump_stack() calls into the new function too so that
> some code is removed. The "stack backtrace" could possible be
> consolidated too, but one of the call sites uses printk instead of
> pr_warn so I wasn't sure if it was OK to change that to a warn too.

2022-08-22 13:06:39

by Vincent Whitchurch

[permalink] [raw]
Subject: Re: [PATCH] lockdep: Panic on warning if panic_on_warn is set

On Sat, Aug 20, 2022 at 07:18:37AM +0200, Boqun Feng wrote:
> > I'm not trying to obtain a kdump in this case. I test device drivers
> > under UML[0] and I want to make the tests stop and fail immediately if
> > the driver triggers any kind of problem which results in splats in the
> > log. I achieve this using panic_on_warn, panic_on_taint, and oops=panic
> > which result in a panic and an error exit code from UML.
> >
> > [0] https://lore.kernel.org/lkml/[email protected]/
> >
> > For lockdep, without this patch, I would be forced to parse the logs
> > after each test to determine if the test trigger a lockdep splat or not.
>
> In that case, would a standard line with every lockdep warning help? For
> example:
>
> [...] A LOCKDEP issue detected.
>
> Two reasons I don't think making lockdep warning as panic is a good
> idea:
>
> * We don't know what other CIs expect, given today lockdep doesn't panic
> with panic_on_warn, this patch is a change of behaviors to them, and
> it may break their setups/scripts.

Perhaps we could add a module parameter instead, so that the behaviour
can be enabled with lockdep.panic=1 or similar? Then no existing setups
will be affected.

> * As I said, lockdep warnings are different than other warnings, and
> panicking doesn't provide more information for debugging.
>
> So I think an extra line helping scripts to parse may be better.
>
> Work for you?

For my use case, the extra line isn't needed. If I must parse the logs,
I can already do it with the existing prints. But I'm trying to avoid
having to parse the logs altogether.