2019-04-02 16:06:25

by Frederic Weisbecker

Subject: [PATCH 0/4] lockdep cleanups and optimizations.

These are lockdep bits that got accumulated in my softirq queue but
are independent.

Patches 1 and 2 are cleanups. 3 and 4 are IRQ locking validation
optimizations.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
lockdep/core

HEAD: dd2ea3a0bffdc3e140dd4af085bae7f9f9a08d6d

Thanks,
Frederic
---

Frederic Weisbecker (4):
locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
locking/lockdep: Map remaining magic numbers to lock usage mask names
locking/lockdep: Use expanded masks on find_usage_*() functions
locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()


kernel/locking/lockdep.c | 249 ++++++++++++++++++++++++-------------
kernel/locking/lockdep_internals.h | 6 +
2 files changed, 172 insertions(+), 83 deletions(-)


2019-04-02 16:05:36

by Frederic Weisbecker

Subject: [PATCH 1/4] locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING

valid_state() and print_usage_bug*() functions are not used beyond
irq locking correctness checks under
CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING.
Sadly, the "unused function" warning wouldn't fire because valid_state()
is inline, so the unused case has remained unseen until now.

So move them inside the appropriate
CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING section.

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
kernel/locking/lockdep.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 34cdcbedda49..9c5819ef4a28 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2784,6 +2784,12 @@ static void check_chain_key(struct task_struct *curr)
#endif
}

+static int mark_lock(struct task_struct *curr, struct held_lock *this,
+ enum lock_usage_bit new_bit);
+
+#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
+
+
static void
print_usage_bug_scenario(struct held_lock *lock)
{
@@ -2853,10 +2859,6 @@ valid_state(struct task_struct *curr, struct held_lock *this,
return 1;
}

-static int mark_lock(struct task_struct *curr, struct held_lock *this,
- enum lock_usage_bit new_bit);
-
-#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)

/*
* print irq inversion bug:
--
2.21.0

2019-04-02 16:06:17

by Frederic Weisbecker

Subject: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

check_prev_add_irq() tests all incompatible scenarios one after the
other while adding a lock (@next) to a dependency tree (@prev):

LOCK_USED_IN_HARDIRQ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_HARDIRQ_READ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_SOFTIRQ vs LOCK_ENABLED_SOFTIRQ
LOCK_USED_IN_SOFTIRQ_READ vs LOCK_ENABLED_SOFTIRQ

For each of these four scenarios, we must iterate at least the @prev
backward dependencies. Then, if a match is found on the relevant
LOCK_USED_* bit, we must also iterate the @next forward dependencies.

Therefore in the best case we iterate 4 times, in the worst case 8 times.

A different approach can let us divide the number of branch iterations
by 4:

1) Iterate through @prev backward dependencies and accumulate all the IRQ
uses in a single mask. In the best case where the current lock hasn't
been used in IRQ, we stop here.

2) Iterate through @next forward dependencies and try to find a lock
whose usage is exclusive to the accumulated usages gathered in the
previous step. If we find one (call it @lockA), we have found an
incompatible use; otherwise we stop here. Only bad locking scenarios
go further, so a sane verification stops here.

3) Iterate again through the @prev backward dependencies and find the
lock whose usage matches @lockA in terms of incompatibility. Call
that lock @lockB.

4) Report the incompatible usages of @lockA and @lockB.

If no incompatible use is found, the verification never goes beyond
step 2, which means at most two iterations.
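
As an aside, here is a minimal userspace sketch of the mask arithmetic
behind steps 1 and 2 (illustrative only: the bit layout mirrors lockdep's
encoding of enum lock_usage_bit, but the flag definitions and the driver
are this example's own, not kernel code):

#include <stdio.h>

/* Bit numbers are encoded as in lockdep: bit0 = read, bit1 = direction. */
#define LOCK_USAGE_READ_MASK	1
#define LOCK_USAGE_DIR_MASK	2

#define LOCKF_USED_IN_HARDIRQ		(1UL << 0)
#define LOCKF_USED_IN_HARDIRQ_READ	(1UL << 1)
#define LOCKF_ENABLED_HARDIRQ		(1UL << 2)
#define LOCKF_ENABLED_HARDIRQ_READ	(1UL << 3)
#define LOCKF_USED_IN_SOFTIRQ		(1UL << 4)
#define LOCKF_USED_IN_SOFTIRQ_READ	(1UL << 5)
#define LOCKF_ENABLED_SOFTIRQ		(1UL << 6)
#define LOCKF_ENABLED_SOFTIRQ_READ	(1UL << 7)

#define LOCKF_READ_BITS	(LOCKF_USED_IN_HARDIRQ_READ |	\
			 LOCKF_ENABLED_HARDIRQ_READ |	\
			 LOCKF_USED_IN_SOFTIRQ_READ |	\
			 LOCKF_ENABLED_SOFTIRQ_READ)

int main(void)
{
	/* Step 1: usages accumulated over the @prev backward graph. */
	unsigned long usage_mask = LOCKF_USED_IN_SOFTIRQ |
				   LOCKF_USED_IN_HARDIRQ_READ;
	unsigned long forward_mask;

	/*
	 * Step 2: shifting left by 2 turns each USED_IN bit into the
	 * matching ENABLED bit (only USED_IN bits were kept in step 1);
	 * folding the _READ bits down by 1 and masking them out strips
	 * the read variants.
	 */
	forward_mask = usage_mask << LOCK_USAGE_DIR_MASK;
	forward_mask |= (forward_mask & LOCKF_READ_BITS) >> LOCK_USAGE_READ_MASK;
	forward_mask &= ~LOCKF_READ_BITS;

	/* Prints 0x44: LOCKF_ENABLED_SOFTIRQ | LOCKF_ENABLED_HARDIRQ. */
	printf("forward_mask = %#lx\n", forward_mask);
	return 0;
}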

The following compares the execution measurements of the function
check_prev_add_irq():

                 Number of calls | Avg (ns) | Stdev (ns) | Total time (ns)
---------------------------------------------------------------------------
Mainline                    8452 |     2652 |      11962 |        22415143
This patch                  8452 |     1518 |       7090 |        12835602

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
kernel/locking/lockdep.c | 212 ++++++++++++++++++++---------
kernel/locking/lockdep_internals.h | 6 +
2 files changed, 151 insertions(+), 67 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5e149dd78298..80f33c700314 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1676,6 +1676,14 @@ check_redundant(struct lock_list *root, struct lock_class *target,
}

#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
+
+static inline int usage_accumulate(struct lock_list *entry, void *mask)
+{
+ *(unsigned long *)mask |= entry->class->usage_mask;
+
+ return 0;
+}
+
/*
* Forwards and backwards subgraph searching, for the purposes of
* proving that two subgraphs can be connected by a new dependency
@@ -1687,8 +1695,6 @@ static inline int usage_match(struct lock_list *entry, void *mask)
return entry->class->usage_mask & *(unsigned long *)mask;
}

-
-
/*
* Find a node in the forwards-direction dependency sub-graph starting
* at @root->class that matches @bit.
@@ -1922,39 +1928,6 @@ print_bad_irq_dependency(struct task_struct *curr,
return 0;
}

-static int
-check_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit_backwards,
- enum lock_usage_bit bit_forwards, const char *irqclass)
-{
- int ret;
- struct lock_list this, that;
- struct lock_list *uninitialized_var(target_entry);
- struct lock_list *uninitialized_var(target_entry1);
-
- this.parent = NULL;
-
- this.class = hlock_class(prev);
- ret = find_usage_backwards(&this, lock_flag(bit_backwards), &target_entry);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- that.parent = NULL;
- that.class = hlock_class(next);
- ret = find_usage_forwards(&that, lock_flag(bit_forwards), &target_entry1);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- return print_bad_irq_dependency(curr, &this, &that,
- target_entry, target_entry1,
- prev, next,
- bit_backwards, bit_forwards, irqclass);
-}
-
static const char *state_names[] = {
#define LOCKDEP_STATE(__STATE) \
__stringify(__STATE),
@@ -1988,45 +1961,151 @@ static int exclusive_bit(int new_bit)
return state | (dir ^ LOCK_USAGE_DIR_MASK);
}

+static unsigned long exclusive_dir_mask(unsigned long mask)
+{
+ unsigned long excl;
+
+ /* Invert dir */
+ excl = (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
+ excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
+
+ return excl;
+}
+
+static unsigned long exclusive_mask(unsigned long mask)
+{
+ unsigned long excl = exclusive_dir_mask(mask);
+
+ /* Strip read */
+ excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
+ excl &= ~LOCKF_IRQ_READ;
+
+ return excl;
+}
+
+
+/*
+ * Retrieve the _possible_ original mask to which @mask is
+ * exclusive. Ie: this is the opposite of exclusive_mask().
+ * Note that 2 possible original bits can match an exclusive
+ * bit: one has LOCK_USAGE_READ_MASK set, the other has it
+ * cleared. So both are returned for each exclusive bit.
+ */
+static unsigned long original_mask(unsigned long mask)
+{
+ unsigned long excl = exclusive_dir_mask(mask);
+
+ /* Include read in existing usages */
+ excl |= (excl & LOCKF_IRQ) << LOCK_USAGE_READ_MASK;
+
+ return excl;
+}
+
+/*
+ * Find the first pair of bit match between an original
+ * usage mask and an exclusive usage mask.
+ */
+static int find_exclusive_match(unsigned long mask,
+ unsigned long excl_mask,
+ enum lock_usage_bit *bit,
+ enum lock_usage_bit *excl_bit)
+{
+ int fs, nr = 0;
+
+ while ((fs = ffs(mask))) {
+ int excl;
+
+ nr += fs;
+ excl = exclusive_bit(nr - 1);
+ if (excl_mask & lock_flag(excl)) {
+ *bit = nr - 1;
+ *excl_bit = excl;
+ return 0;
+ }
+ mask >>= fs - 1;
+ /*
+ * Prevent from shifts of sizeof(long) which can
+ * give unpredictable results.
+ */
+ mask >>= 1;
+ }
+ return -1;
+}
+
+/*
+ * Prove that the new dependency does not connect a hardirq-safe(-read)
+ * lock with a hardirq-unsafe lock - to achieve this we search
+ * the backwards-subgraph starting at <prev>, and the
+ * forwards-subgraph starting at <next>:
+ */
static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit)
+ struct held_lock *next)
{
+ unsigned long usage_mask = 0, forward_mask, backward_mask;
+ enum lock_usage_bit forward_bit = 0, backward_bit = 0;
+ struct lock_list *uninitialized_var(target_entry1);
+ struct lock_list *uninitialized_var(target_entry);
+ struct lock_list this, that;
+ int ret;
+
/*
- * Prove that the new dependency does not connect a hardirq-safe
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 1: gather all hard/soft IRQs usages backward in an
+ * accumulated usage mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ this.parent = NULL;
+ this.class = hlock_class(prev);
+
+ ret = __bfs_backwards(&this, &usage_mask, usage_accumulate, NULL);
+ if (ret < 0)
+ return print_bfs_bug(ret);

- bit++; /* _READ */
+ usage_mask &= LOCKF_USED_IN_IRQ_ALL;
+ if (!usage_mask)
+ return 1;

/*
- * Prove that the new dependency does not connect a hardirq-safe-read
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 2: find exclusive uses forward that match the previous
+ * backward accumulated mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ forward_mask = exclusive_mask(usage_mask);

- return 1;
-}
+ that.parent = NULL;
+ that.class = hlock_class(next);

-static int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
-{
-#define LOCKDEP_STATE(__STATE) \
- if (!check_irq_usage(curr, prev, next, LOCK_USED_IN_##__STATE)) \
- return 0;
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
+ ret = find_usage_forwards(&that, forward_mask, &target_entry1);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (ret == 1)
+ return ret;
+
+ /*
+ * Step 3: we found a bad match! Now retrieve a lock from the backward
+ * list whose usage mask matches the exclusive usage mask from the
+ * lock found on the forward list.
+ */
+ backward_mask = original_mask(target_entry1->class->usage_mask);
+
+ ret = find_usage_backwards(&this, backward_mask, &target_entry);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (DEBUG_LOCKS_WARN_ON(ret == 1))
+ return 1;
+
+ /*
+ * Step 4: narrow down to a pair of incompatible usage bits
+ * and report it.
+ */
+ ret = find_exclusive_match(target_entry->class->usage_mask,
+ target_entry1->class->usage_mask,
+ &backward_bit, &forward_bit);
+ if (DEBUG_LOCKS_WARN_ON(ret == -1))
+ return 1;

- return 1;
+ return print_bad_irq_dependency(curr, &this, &that,
+ target_entry, target_entry1,
+ prev, next,
+ backward_bit, forward_bit,
+ state_name(backward_bit));
}

static void inc_chains(void)
@@ -2043,9 +2122,8 @@ static void inc_chains(void)

#else

-static inline int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
+static inline int check_irq_usage(struct task_struct *curr,
+ struct held_lock *prev, struct held_lock *next)
{
return 1;
}
@@ -2225,7 +2303,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
else if (unlikely(ret < 0))
return print_bfs_bug(ret);

- if (!check_prev_add_irq(curr, prev, next))
+ if (!check_irq_usage(curr, prev, next))
return 0;

/*
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index d4c197425f68..d849692f2da7 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -50,6 +50,12 @@ enum {
#define LOCKF_USED_IN_IRQ_READ \
(LOCKF_USED_IN_HARDIRQ_READ | LOCKF_USED_IN_SOFTIRQ_READ)

+#define LOCKF_ENABLED_IRQ_ALL (LOCKF_ENABLED_IRQ | LOCKF_ENABLED_IRQ_READ)
+#define LOCKF_USED_IN_IRQ_ALL (LOCKF_USED_IN_IRQ | LOCKF_USED_IN_IRQ_READ)
+
+#define LOCKF_IRQ (LOCKF_ENABLED_IRQ | LOCKF_USED_IN_IRQ)
+#define LOCKF_IRQ_READ (LOCKF_ENABLED_IRQ_READ | LOCKF_USED_IN_IRQ_READ)
+
/*
* CONFIG_LOCKDEP_SMALL is defined for sparc. Sparc requires .text,
* .data and .bss to fit in required 32MB limit for the kernel. With
--
2.21.0

2019-04-02 16:06:45

by Frederic Weisbecker

Subject: [PATCH 3/4] locking/lockdep: Use expanded masks on find_usage_*() functions

In order to optimize check_irq_usage() and factorize all the IRQ usage
validations, we'll need to be able to check multiple lock usage bits at
once. Prepare the low level usage mask check functions for that purpose.
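
For instance (an illustrative fragment, not part of this patch), once
these functions take a whole mask, a caller can test several usage bits
in a single BFS pass:

	/* Match either hard or soft IRQ usage in one forward search. */
	unsigned long usage_mask = lock_flag(LOCK_USED_IN_HARDIRQ) |
				   lock_flag(LOCK_USED_IN_SOFTIRQ);

	ret = find_usage_forwards(&root, usage_mask, &target_entry);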

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
kernel/locking/lockdep.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2288aa2fa4c6..5e149dd78298 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1682,9 +1682,9 @@ check_redundant(struct lock_list *root, struct lock_class *target,
* without creating any illegal irq-safe -> irq-unsafe lock dependency.
*/

-static inline int usage_match(struct lock_list *entry, void *bit)
+static inline int usage_match(struct lock_list *entry, void *mask)
{
- return entry->class->usage_mask & (1 << (enum lock_usage_bit)bit);
+ return entry->class->usage_mask & *(unsigned long *)mask;
}


@@ -1700,14 +1700,14 @@ static inline int usage_match(struct lock_list *entry, void *bit)
* Return <0 on error.
*/
static int
-find_usage_forwards(struct lock_list *root, enum lock_usage_bit bit,
+find_usage_forwards(struct lock_list *root, unsigned long usage_mask,
struct lock_list **target_entry)
{
int result;

debug_atomic_inc(nr_find_usage_forwards_checks);

- result = __bfs_forwards(root, (void *)bit, usage_match, target_entry);
+ result = __bfs_forwards(root, &usage_mask, usage_match, target_entry);

return result;
}
@@ -1723,14 +1723,14 @@ find_usage_forwards(struct lock_list *root, enum lock_usage_bit bit,
* Return <0 on error.
*/
static int
-find_usage_backwards(struct lock_list *root, enum lock_usage_bit bit,
+find_usage_backwards(struct lock_list *root, unsigned long usage_mask,
struct lock_list **target_entry)
{
int result;

debug_atomic_inc(nr_find_usage_backwards_checks);

- result = __bfs_backwards(root, (void *)bit, usage_match, target_entry);
+ result = __bfs_backwards(root, &usage_mask, usage_match, target_entry);

return result;
}
@@ -1935,7 +1935,7 @@ check_usage(struct task_struct *curr, struct held_lock *prev,
this.parent = NULL;

this.class = hlock_class(prev);
- ret = find_usage_backwards(&this, bit_backwards, &target_entry);
+ ret = find_usage_backwards(&this, lock_flag(bit_backwards), &target_entry);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)
@@ -1943,7 +1943,7 @@ check_usage(struct task_struct *curr, struct held_lock *prev,

that.parent = NULL;
that.class = hlock_class(next);
- ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
+ ret = find_usage_forwards(&that, lock_flag(bit_forwards), &target_entry1);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)
@@ -2941,7 +2941,7 @@ check_usage_forwards(struct task_struct *curr, struct held_lock *this,

root.parent = NULL;
root.class = hlock_class(this);
- ret = find_usage_forwards(&root, bit, &target_entry);
+ ret = find_usage_forwards(&root, lock_flag(bit), &target_entry);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)
@@ -2965,7 +2965,7 @@ check_usage_backwards(struct task_struct *curr, struct held_lock *this,

root.parent = NULL;
root.class = hlock_class(this);
- ret = find_usage_backwards(&root, bit, &target_entry);
+ ret = find_usage_backwards(&root, lock_flag(bit), &target_entry);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)
--
2.21.0

2019-04-02 16:06:54

by Frederic Weisbecker

Subject: [PATCH 2/4] locking/lockdep: Map remaining magic numbers to lock usage mask names

Clarify the code by mapping some more magic numbers that haven't been
named after their corresponding LOCK_USAGE_* symbols.
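
For reference, here is the encoding this relies on, with a worked example
(constants as defined in lockdep_internals.h at the time, where
LOCK_USAGE_READ_MASK == 1 and LOCK_USAGE_DIR_MASK == 2):

	/*
	 * bit0 of a usage bit number: read variant
	 * bit1: 0 used in irq, 1 irq enabled
	 * bit2-n: state
	 *
	 * e.g. LOCK_USED_IN_SOFTIRQ == 4:
	 *   4 + LOCK_USAGE_DIR_MASK  == 6 == LOCK_ENABLED_SOFTIRQ
	 *   4 >> LOCK_USAGE_DIR_MASK == 1 == index of "SOFTIRQ" in state_names[]
	 */

Note that LOCK_USAGE_DIR_MASK doubles as a shift count in the second case,
which only works because its value is 2 and there are four usage bits per
state.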

Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
---
kernel/locking/lockdep.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 9c5819ef4a28..2288aa2fa4c6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -516,11 +516,11 @@ static char get_usage_char(struct lock_class *class, enum lock_usage_bit bit)
{
char c = '.';

- if (class->usage_mask & lock_flag(bit + 2))
+ if (class->usage_mask & lock_flag(bit + LOCK_USAGE_DIR_MASK))
c = '+';
if (class->usage_mask & lock_flag(bit)) {
c = '-';
- if (class->usage_mask & lock_flag(bit + 2))
+ if (class->usage_mask & lock_flag(bit + LOCK_USAGE_DIR_MASK))
c = '?';
}

@@ -1971,7 +1971,10 @@ static const char *state_rnames[] = {

static inline const char *state_name(enum lock_usage_bit bit)
{
- return (bit & LOCK_USAGE_READ_MASK) ? state_rnames[bit >> 2] : state_names[bit >> 2];
+ if (bit & LOCK_USAGE_READ_MASK)
+ return state_rnames[bit >> LOCK_USAGE_DIR_MASK];
+ else
+ return state_names[bit >> LOCK_USAGE_DIR_MASK];
}

static int exclusive_bit(int new_bit)
@@ -3017,7 +3020,7 @@ static int (*state_verbose_f[])(struct lock_class *class) = {
static inline int state_verbose(enum lock_usage_bit bit,
struct lock_class *class)
{
- return state_verbose_f[bit >> 2](class);
+ return state_verbose_f[bit >> LOCK_USAGE_DIR_MASK](class);
}

typedef int (*check_usage_f)(struct task_struct *, struct held_lock *,
--
2.21.0

2019-04-09 13:04:48

by Peter Zijlstra

Subject: Re: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

On Tue, Apr 02, 2019 at 06:02:44PM +0200, Frederic Weisbecker wrote:
> @@ -1988,45 +1961,151 @@ static int exclusive_bit(int new_bit)
> return state | (dir ^ LOCK_USAGE_DIR_MASK);
> }
>
> +static unsigned long exclusive_dir_mask(unsigned long mask)

Would you mind terribly if I call that: invert_dir_mask() ?

> +{
> + unsigned long excl;
> +
> + /* Invert dir */
> + excl = (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
> + excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
> +
> + return excl;
> +}
> +
> +static unsigned long exclusive_mask(unsigned long mask)
> +{
> + unsigned long excl = exclusive_dir_mask(mask);
> +
> + /* Strip read */
> + excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
> + excl &= ~LOCKF_IRQ_READ;
> +
> + return excl;
> +}

And I might write a comment to go with those functions; they're too
clever by half. I'm sure I'll have forgotten how they work in a few
months time.

Very well done :-)

> +/*
> + * Find the first pair of bit match between an original
> + * usage mask and an exclusive usage mask.
> + */
> +static int find_exclusive_match(unsigned long mask,
> + unsigned long excl_mask,
> + enum lock_usage_bit *bit,
> + enum lock_usage_bit *excl_bit)
> +{
> + int fs, nr = 0;
> +
> + while ((fs = ffs(mask))) {
> + int excl;
> +
> + nr += fs;
> + excl = exclusive_bit(nr - 1);
> + if (excl_mask & lock_flag(excl)) {
> + *bit = nr - 1;
> + *excl_bit = excl;
> + return 0;
> + }
> + mask >>= fs - 1;
> + /*
> + * Prevent from shifts of sizeof(long) which can
> + * give unpredictable results.
> + */
> + mask >>= 1;
> + }
> + return -1;

Should we write that like:

for_each_set_bit(bit, &mask, LOCK_USED) {
int excl = exclusive_bit(bit);
if (excl_mask & lock_flag(excl)) {
*bitp = bit;
*excl_bitp = excl;
return 0;
}
}
return -1;

Or something along those lines?

> +}
> +
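
For reference, for_each_set_bit(bit, addr, size) iterates @bit over each
set bit in the bitmap at @addr below @size; LOCK_USED works as the bound
here because it is the first enum value past the IRQ usage bits. A
userspace equivalent of the iteration, just to show the visiting order:

	for (bit = 0; bit < LOCK_USED; bit++) {
		if (!(mask & (1UL << bit)))
			continue;
		/* visit @bit, lowest set bit first */
	}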

2019-04-10 02:30:56

by Frederic Weisbecker

Subject: Re: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

On Tue, Apr 09, 2019 at 03:03:52PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 02, 2019 at 06:02:44PM +0200, Frederic Weisbecker wrote:
> > @@ -1988,45 +1961,151 @@ static int exclusive_bit(int new_bit)
> > return state | (dir ^ LOCK_USAGE_DIR_MASK);
> > }
> >
> > +static unsigned long exclusive_dir_mask(unsigned long mask)
>
> Would you mind terribly if I call that: invert_dir_mask() ?

That's indeed much better.

>
> > +{
> > + unsigned long excl;
> > +
> > + /* Invert dir */
> > + excl = (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
> > + excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
> > +
> > + return excl;
> > +}
> > +
> > +static unsigned long exclusive_mask(unsigned long mask)
> > +{
> > + unsigned long excl = exclusive_dir_mask(mask);
> > +
> > + /* Strip read */
> > + excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
> > + excl &= ~LOCKF_IRQ_READ;
> > +
> > + return excl;
> > +}
>
> And I might write a comment to go with those functions; they're too
> clever by half. I'm sure I'll have forgotten how they work in a few
> months time.

It only takes a night for me to forget how my code works. Then I need
a whole day to recollect. But once I'm done, the next night starts.

So I'm not against comments, thanks :-)

> > +/*
> > + * Find the first pair of bit match between an original
> > + * usage mask and an exclusive usage mask.
> > + */
> > +static int find_exclusive_match(unsigned long mask,
> > + unsigned long excl_mask,
> > + enum lock_usage_bit *bit,
> > + enum lock_usage_bit *excl_bit)
> > +{
> > + int fs, nr = 0;
> > +
> > + while ((fs = ffs(mask))) {
> > + int excl;
> > +
> > + nr += fs;
> > + excl = exclusive_bit(nr - 1);
> > + if (excl_mask & lock_flag(excl)) {
> > + *bit = nr - 1;
> > + *excl_bit = excl;
> > + return 0;
> > + }
> > + mask >>= fs - 1;
> > + /*
> > + * Prevent from shifts of sizeof(long) which can
> > + * give unpredictable results.
> > + */
> > + mask >>= 1;
> > + }
> > + return -1;
>
> Should we write that like:
>
> for_each_set_bit(bit, &mask, LOCK_USED) {
> int excl = exclusive_bit(bit);
> if (excl_mask & lock_flag(excl)) {
> *bitp = bit;
> *excl_bitp = excl;
> return 0;
> }
> }
> return -1;
>
> Or something along those lines?

Ah yes indeed. Linus didn't like the idea of using bitmap functions for
lockdep usage bits in the softirq vector patchset, but here the case is
different as they are not used in the lockdep fastpath. That's only for
the bad locking scenario path, so it's all good I think.

Should I re-issue the set or you do the changes?

Thanks.

2019-04-11 10:47:34

by Peter Zijlstra

Subject: Re: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

On Wed, Apr 10, 2019 at 04:28:48AM +0200, Frederic Weisbecker wrote:
> Should I re-issue the set or you do the changes?

I've made it like so; does that work? In particular, do the comments
make sense? It is always hard to write these things down.

---

Subject: locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()
From: Frederic Weisbecker <[email protected]>
Date: Tue, 2 Apr 2019 18:02:44 +0200

check_prev_add_irq() tests all incompatible scenarios one after the
other while adding a lock (@next) to a dependency tree (@prev):

LOCK_USED_IN_HARDIRQ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_HARDIRQ_READ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_SOFTIRQ vs LOCK_ENABLED_SOFTIRQ
LOCK_USED_IN_SOFTIRQ_READ vs LOCK_ENABLED_SOFTIRQ

For each of these four scenarios, we must iterate at least the @prev
backward dependencies. Then, if a match is found on the relevant
LOCK_USED_* bit, we must also iterate the @next forward dependencies.

Therefore in the best case we iterate 4 times, in the worst case 8 times.

A different approach can let us divide the number of branch iterations
by 4:

1) Iterate through @prev backward dependencies and accumulate all the IRQ
uses in a single mask. In the best case where the current lock hasn't
been used in IRQ, we stop here.

2) Iterate through @next forward dependencies and try to find a lock
whose usage is exclusive to the accumulated usages gathered in the
previous step. If we find one (call it @lockA), we have found an
incompatible use; otherwise we stop here. Only bad locking scenarios
go further, so a sane verification stops here.

3) Iterate again through the @prev backward dependencies and find the
lock whose usage matches @lockA in terms of incompatibility. Call
that lock @lockB.

4) Report the incompatible usages of @lockA and @lockB.

If no incompatible use is found, the verification never goes beyond
step 2, which means at most two iterations.

The following compares the execution measurements of the function
check_prev_add_irq():

                 Number of calls | Avg (ns) | Stdev (ns) | Total time (ns)
---------------------------------------------------------------------------
Mainline                    8452 |     2652 |      11962 |        22415143
This patch                  8452 |     1518 |       7090 |        12835602

Cc: Ingo Molnar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/locking/lockdep.c | 226 ++++++++++++++++++++++++++-----------
kernel/locking/lockdep_internals.h | 6
2 files changed, 165 insertions(+), 67 deletions(-)

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1676,6 +1676,14 @@ check_redundant(struct lock_list *root,
}

#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
+
+static inline int usage_accumulate(struct lock_list *entry, void *mask)
+{
+ *(unsigned long *)mask |= entry->class->usage_mask;
+
+ return 0;
+}
+
/*
* Forwards and backwards subgraph searching, for the purposes of
* proving that two subgraphs can be connected by a new dependency
@@ -1687,8 +1695,6 @@ static inline int usage_match(struct loc
return entry->class->usage_mask & *(unsigned long *)mask;
}

-
-
/*
* Find a node in the forwards-direction dependency sub-graph starting
* at @root->class that matches @bit.
@@ -1922,39 +1928,6 @@ print_bad_irq_dependency(struct task_str
return 0;
}

-static int
-check_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit_backwards,
- enum lock_usage_bit bit_forwards, const char *irqclass)
-{
- int ret;
- struct lock_list this, that;
- struct lock_list *uninitialized_var(target_entry);
- struct lock_list *uninitialized_var(target_entry1);
-
- this.parent = NULL;
-
- this.class = hlock_class(prev);
- ret = find_usage_backwards(&this, lock_flag(bit_backwards), &target_entry);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- that.parent = NULL;
- that.class = hlock_class(next);
- ret = find_usage_forwards(&that, lock_flag(bit_forwards), &target_entry1);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- return print_bad_irq_dependency(curr, &this, &that,
- target_entry, target_entry1,
- prev, next,
- bit_backwards, bit_forwards, irqclass);
-}
-
static const char *state_names[] = {
#define LOCKDEP_STATE(__STATE) \
__stringify(__STATE),
@@ -1977,6 +1950,13 @@ static inline const char *state_name(enu
return state_names[bit >> LOCK_USAGE_DIR_MASK];
}

+/*
+ * The bit number is encoded like:
+ *
+ * bit0: 0 exclusive, 1 read lock
+ * bit1: 0 used in irq, 1 irq enabled
+ * bit2-n: state
+ */
static int exclusive_bit(int new_bit)
{
int state = new_bit & LOCK_USAGE_STATE_MASK;
@@ -1988,45 +1968,158 @@ static int exclusive_bit(int new_bit)
return state | (dir ^ LOCK_USAGE_DIR_MASK);
}

+/*
+ * Observe that when given a bitmask where each bitnr is encoded as above, a
+ * right shift of the mask transforms the individual bitnrs as -1.
+ *
+ * So for all bits where bitnr1 == 1, we can create the mask where bitnr1 == 0
+ * by subtracting 2, or shifting the mask right by 2.
+ *
+ * Similarly, bitnr1 == 0 becomes bitnr1 == 1 by adding 2, or shifting left 2.
+ *
+ * So split the mask (note that LOCKF_ENABLED_IRQ_ALL|LOCKF_USED_IN_IRQ_ALL is
+ * all bits set) and recompose with bitnr1 flipped.
+ */
+static unsigned long invert_dir_mask(unsigned long mask)
+{
+ unsigned long excl = 0;
+
+ /* Invert dir */
+ excl |= (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
+ excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
+
+ return excl;
+}
+
+/*
+ * As above, we clear bitnr0 with bitmask ops. First, for all bits with bitnr1
+ * set, add those with bitnr1 cleared. And then mask out all bitnr1.
+ */
+static unsigned long exclusive_mask(unsigned long mask)
+{
+ unsigned long excl = invert_dir_mask(mask);
+
+ /* Strip read */
+ excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
+ excl &= ~LOCKF_IRQ_READ;
+
+ return excl;
+}
+
+
+/*
+ * Retrieve the _possible_ original mask to which @mask is
+ * exclusive. Ie: this is the opposite of exclusive_mask().
+ * Note that 2 possible original bits can match an exclusive
+ * bit: one has LOCK_USAGE_READ_MASK set, the other has it
+ * cleared. So both are returned for each exclusive bit.
+ */
+static unsigned long original_mask(unsigned long mask)
+{
+ unsigned long excl = invert_dir_mask(mask);
+
+ /* Include read in existing usages */
+ excl |= (excl & LOCKF_IRQ) << LOCK_USAGE_READ_MASK;
+
+ return excl;
+}
+
+/*
+ * Find the first pair of bit match between an original
+ * usage mask and an exclusive usage mask.
+ */
+static int find_exclusive_match(unsigned long mask,
+ unsigned long excl_mask,
+ enum lock_usage_bit *bitp,
+ enum lock_usage_bit *excl_bitp)
+{
+ int bit, excl;
+
+ for_each_set_bit(bit, &mask, LOCK_USED) {
+ excl = exclusive_bit(bit);
+ if (excl_mask & lock_flag(excl)) {
+ *bitp = bit;
+ *excl_bitp = excl;
+ return 0;
+ }
+ }
+ return -1;
+}
+
+/*
+ * Prove that the new dependency does not connect a hardirq-safe(-read)
+ * lock with a hardirq-unsafe lock - to achieve this we search
+ * the backwards-subgraph starting at <prev>, and the
+ * forwards-subgraph starting at <next>:
+ */
static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit)
+ struct held_lock *next)
{
+ unsigned long usage_mask = 0, forward_mask, backward_mask;
+ enum lock_usage_bit forward_bit = 0, backward_bit = 0;
+ struct lock_list *uninitialized_var(target_entry1);
+ struct lock_list *uninitialized_var(target_entry);
+ struct lock_list this, that;
+ int ret;
+
/*
- * Prove that the new dependency does not connect a hardirq-safe
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 1: gather all hard/soft IRQs usages backward in an
+ * accumulated usage mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ this.parent = NULL;
+ this.class = hlock_class(prev);
+
+ ret = __bfs_backwards(&this, &usage_mask, usage_accumulate, NULL);
+ if (ret < 0)
+ return print_bfs_bug(ret);

- bit++; /* _READ */
+ usage_mask &= LOCKF_USED_IN_IRQ_ALL;
+ if (!usage_mask)
+ return 1;

/*
- * Prove that the new dependency does not connect a hardirq-safe-read
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 2: find exclusive uses forward that match the previous
+ * backward accumulated mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ forward_mask = exclusive_mask(usage_mask);

- return 1;
-}
+ that.parent = NULL;
+ that.class = hlock_class(next);

-static int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
-{
-#define LOCKDEP_STATE(__STATE) \
- if (!check_irq_usage(curr, prev, next, LOCK_USED_IN_##__STATE)) \
- return 0;
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
+ ret = find_usage_forwards(&that, forward_mask, &target_entry1);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (ret == 1)
+ return ret;

- return 1;
+ /*
+ * Step 3: we found a bad match! Now retrieve a lock from the backward
+ * list whose usage mask matches the exclusive usage mask from the
+ * lock found on the forward list.
+ */
+ backward_mask = original_mask(target_entry1->class->usage_mask);
+
+ ret = find_usage_backwards(&this, backward_mask, &target_entry);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (DEBUG_LOCKS_WARN_ON(ret == 1))
+ return 1;
+
+ /*
+ * Step 4: narrow down to a pair of incompatible usage bits
+ * and report it.
+ */
+ ret = find_exclusive_match(target_entry->class->usage_mask,
+ target_entry1->class->usage_mask,
+ &backward_bit, &forward_bit);
+ if (DEBUG_LOCKS_WARN_ON(ret == -1))
+ return 1;
+
+ return print_bad_irq_dependency(curr, &this, &that,
+ target_entry, target_entry1,
+ prev, next,
+ backward_bit, forward_bit,
+ state_name(backward_bit));
}

static void inc_chains(void)
@@ -2043,9 +2136,8 @@ static void inc_chains(void)

#else

-static inline int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
+static inline int check_irq_usage(struct task_struct *curr,
+ struct held_lock *prev, struct held_lock *next)
{
return 1;
}
@@ -2225,7 +2317,7 @@ check_prev_add(struct task_struct *curr,
else if (unlikely(ret < 0))
return print_bfs_bug(ret);

- if (!check_prev_add_irq(curr, prev, next))
+ if (!check_irq_usage(curr, prev, next))
return 0;

/*
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -66,6 +66,12 @@ static const unsigned long LOCKF_USED_IN
0;
#undef LOCKDEP_STATE

+#define LOCKF_ENABLED_IRQ_ALL (LOCKF_ENABLED_IRQ | LOCKF_ENABLED_IRQ_READ)
+#define LOCKF_USED_IN_IRQ_ALL (LOCKF_USED_IN_IRQ | LOCKF_USED_IN_IRQ_READ)
+
+#define LOCKF_IRQ (LOCKF_ENABLED_IRQ | LOCKF_USED_IN_IRQ)
+#define LOCKF_IRQ_READ (LOCKF_ENABLED_IRQ_READ | LOCKF_USED_IN_IRQ_READ)
+
/*
* CONFIG_LOCKDEP_SMALL is defined for sparc. Sparc requires .text,
* .data and .bss to fit in required 32MB limit for the kernel. With

2019-04-13 00:39:04

by Frederic Weisbecker

Subject: Re: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

On Thu, Apr 11, 2019 at 12:46:32PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 10, 2019 at 04:28:48AM +0200, Frederic Weisbecker wrote:
> > Should I re-issue the set or you do the changes?
>
> I've made it like so; does that work? In particular, do the comments
> make sense? It is always hard to write these things down.

It's especially hard because we describe both the bits of the usage mask
and the bits of the bit numbers of the usage mask. The resulting description
can only be naughty :-)

That said I can probably clarify that in lockdep_internals.h on further
patches.

A few comments though:

> +/*
> + * The bit number is encoded like:
> + *
> + * bit0: 0 exclusive, 1 read lock
> + * bit1: 0 used in irq, 1 irq enabled
> + * bit2-n: state
> + */
> static int exclusive_bit(int new_bit)
> {
> int state = new_bit & LOCK_USAGE_STATE_MASK;
> @@ -1988,45 +1968,158 @@ static int exclusive_bit(int new_bit)
> return state | (dir ^ LOCK_USAGE_DIR_MASK);
> }
>
> +/*
> + * Observe that when given a bitmask where each bitnr is encoded as above, a
> + * right shift of the mask transforms the individual bitnrs as -1.
> + *
> + * So for all bits where bitnr1 == 1, we can create the mask where bitnr1 == 0

So by bitnr1 you're referring to direction, right?

> + * by subtracting 2, or shifting the mask right by 2.

In which case we can perhaps reformulate:

So for all bits whose number has LOCK_ENABLED_* set (bitnr1 == 1), we can create the
mask with those bit numbers using LOCK_USED_IN_* (bitnr1 == 0) instead by
subtracting 2 from the bit number, or shifting the mask right by 2.

And same would go for below.

> + *
> + * Similarly, bitnr1 == 0 becomes bitnr1 == 1 by adding 2, or shifting left 2.
> + *
> + * So split the mask (note that LOCKF_ENABLED_IRQ_ALL|LOCKF_USED_IN_IRQ_ALL is
> + * all bits set) and recompose with bitnr1 flipped.
> + */
> +static unsigned long invert_dir_mask(unsigned long mask)
> +{
> + unsigned long excl = 0;
> +
> + /* Invert dir */
> + excl |= (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
> + excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
> +
> + return excl;
> +}
> +
> +/*
> + * As above, we clear bitnr0 with bitmask ops. First, for all bits with bitnr1
> + * set, add those with bitnr1 cleared. And then mask out all bitnr1.
> + */

Same here:

As above, we clear bitnr0 (LOCK_*_READ off) with bitmask ops. First, for all bits
with bitnr1 set (LOCK_ENABLED_*), add those with bitnr1 cleared (LOCK_USED_IN_*).
And then mask out all bitnr1.

> +static unsigned long exclusive_mask(unsigned long mask)
> +{
> + unsigned long excl = invert_dir_mask(mask);
> +
> + /* Strip read */
> + excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
> + excl &= ~LOCKF_IRQ_READ;
> +
> + return excl;
> +}

Not sure I'm making things clearer but at least that's some more pointers
to enum lock_usage_bit (defined on headers where I should add more layout
explanations, especially to make it clear we play with bit number bits :-s )
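
For instance, a worked example with the HARDIRQ bits only (these are mask
values, not bit numbers):

	invert_dir_mask(LOCKF_USED_IN_HARDIRQ_READ) == LOCKF_ENABLED_HARDIRQ_READ
	exclusive_mask(LOCKF_USED_IN_HARDIRQ_READ)  == LOCKF_ENABLED_HARDIRQ
	original_mask(LOCKF_ENABLED_HARDIRQ)        == LOCKF_USED_IN_HARDIRQ |
	                                               LOCKF_USED_IN_HARDIRQ_READ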

Thanks.

2019-04-13 06:41:04

by Yuyang Du

Subject: Re: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

On Wed, 3 Apr 2019 at 00:05, Frederic Weisbecker <[email protected]> wrote:
>
> check_prev_add_irq() tests all incompatible scenarios one after the
> other while adding a lock (@next) to a dependency tree (@prev):
>
> LOCK_USED_IN_HARDIRQ vs LOCK_ENABLED_HARDIRQ
> LOCK_USED_IN_HARDIRQ_READ vs LOCK_ENABLED_HARDIRQ
> LOCK_USED_IN_SOFTIRQ vs LOCK_ENABLED_SOFTIRQ
> LOCK_USED_IN_SOFTIRQ_READ vs LOCK_ENABLED_SOFTIRQ

May I ask why

LOCK_USED_IN_*IRQ vs. LOCK_ENABLED_*IRQ_READ

is not tested?

Thanks,
Yuyang

2019-04-16 11:22:22

by Peter Zijlstra

Subject: Re: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

On Sat, Apr 13, 2019 at 02:35:45AM +0200, Frederic Weisbecker wrote:
> On Thu, Apr 11, 2019 at 12:46:32PM +0200, Peter Zijlstra wrote:

> > +/*
> > + * Observe that when given a bitmask where each bitnr is encoded as above, a
> > + * right shift of the mask transforms the individual bitnrs as -1.
> > + *
> > + * So for all bits where bitnr1 == 1, we can create the mask where bitnr1 == 0
>
> So by bitnr1 you're referring to direction, right?

Yes, as per the comment on exclusive_bit().

> > + * by subtracting 2, or shifting the mask right by 2.
>
> In which case we can perhaps reformulate:
>
> So for all bits whose number have LOCK_ENABLED_* set (bitnr1 == 1), we can create the
> mask with those bit numbers using LOCK_USED_IN_* (bitnr1 == 0) instead by
> subtracting the bit number by 2, or shifting the mask right by 2.
>
> And same would go for below.

Sure.

> > + *
> > + * Similarly, bitnr1 == 0 becomes bitnr1 == 1 by adding 2, or shifting left 2.
> > + *
> > + * So split the mask (note that LOCKF_ENABLED_IRQ_ALL|LOCKF_USED_IN_IRQ_ALL is
> > + * all bits set) and recompose with bitnr1 flipped.
> > + */
> > +static unsigned long invert_dir_mask(unsigned long mask)
> > +{
> > + unsigned long excl = 0;
> > +
> > + /* Invert dir */
> > + excl |= (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
> > + excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
> > +
> > + return excl;
> > +}
> > +
> > +/*
> > + * As above, we clear bitnr0 with bitmask ops. First, for all bits with bitnr1
> > + * set, add those with bitnr1 cleared. And then mask out all bitnr1.
> > + */
>
> Same here:
>
> As above, we clear bitnr0 (LOCK_*_READ off) with bitmask ops. First, for all bits
> with bitnr1 set (LOCK_ENABLED_*), add those with bitnr1 cleared (LOCK_USED_IN_*).
> And then mask out all bitnr1.

Ha! you failed to spot my failure, all the above should be bitnr0 of
course :/

> > +static unsigned long exclusive_mask(unsigned long mask)
> > +{
> > + unsigned long excl = invert_dir_mask(mask);
> > +
> > + /* Strip read */
> > + excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
> > + excl &= ~LOCKF_IRQ_READ;
> > +
> > + return excl;
> > +}
>
> Not sure I'm making things clearer but at least that's some more pointers
> to enum lock_usage_bit (defined on headers where I should add more layout
> explanations, especially to make it clear we play with bit number bits :-s )

Find updated below.

---
Subject: locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()
From: Frederic Weisbecker <[email protected]>
Date: Tue, 2 Apr 2019 18:02:44 +0200

check_prev_add_irq() tests all incompatible scenarios one after the
other while adding a lock (@next) to a dependency tree (@prev):

LOCK_USED_IN_HARDIRQ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_HARDIRQ_READ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_SOFTIRQ vs LOCK_ENABLED_SOFTIRQ
LOCK_USED_IN_SOFTIRQ_READ vs LOCK_ENABLED_SOFTIRQ

For each of these four scenarios, we must iterate at least the @prev
backward dependencies. Then, if a match is found on the relevant
LOCK_USED_* bit, we must also iterate the @next forward dependencies.

Therefore in the best case we iterate 4 times, in the worst case 8 times.

A different approach can let us divide the number of branch iterations
by 4:

1) Iterate through @prev backward dependencies and accumulate all the IRQ
uses in a single mask. In the best case where the current lock hasn't
been used in IRQ, we stop here.

2) Iterate through @next forward dependencies and try to find a lock
whose usage is exclusive to the accumulated usages gathered in the
previous step. If we find one (call it @lockA), we have found an
incompatible use; otherwise we stop here. Only bad locking scenarios
go further, so a sane verification stops here.

3) Iterate again through the @prev backward dependencies and find the
lock whose usage matches @lockA in terms of incompatibility. Call
that lock @lockB.

4) Report the incompatible usages of @lockA and @lockB.

If no incompatible use is found, the verification never goes beyond
step 2, which means at most two iterations.

The following compares the execution measurements of the function
check_prev_add_irq():

                 Number of calls | Avg (ns) | Stdev (ns) | Total time (ns)
---------------------------------------------------------------------------
Mainline                    8452 |     2652 |      11962 |        22415143
This patch                  8452 |     1518 |       7090 |        12835602

Cc: Ingo Molnar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/locking/lockdep.c | 228 ++++++++++++++++++++++++++-----------
kernel/locking/lockdep_internals.h | 6
2 files changed, 167 insertions(+), 67 deletions(-)

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1676,6 +1676,14 @@ check_redundant(struct lock_list *root,
}

#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
+
+static inline int usage_accumulate(struct lock_list *entry, void *mask)
+{
+ *(unsigned long *)mask |= entry->class->usage_mask;
+
+ return 0;
+}
+
/*
* Forwards and backwards subgraph searching, for the purposes of
* proving that two subgraphs can be connected by a new dependency
@@ -1687,8 +1695,6 @@ static inline int usage_match(struct loc
return entry->class->usage_mask & *(unsigned long *)mask;
}

-
-
/*
* Find a node in the forwards-direction dependency sub-graph starting
* at @root->class that matches @bit.
@@ -1922,39 +1928,6 @@ print_bad_irq_dependency(struct task_str
return 0;
}

-static int
-check_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit_backwards,
- enum lock_usage_bit bit_forwards, const char *irqclass)
-{
- int ret;
- struct lock_list this, that;
- struct lock_list *uninitialized_var(target_entry);
- struct lock_list *uninitialized_var(target_entry1);
-
- this.parent = NULL;
-
- this.class = hlock_class(prev);
- ret = find_usage_backwards(&this, lock_flag(bit_backwards), &target_entry);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- that.parent = NULL;
- that.class = hlock_class(next);
- ret = find_usage_forwards(&that, lock_flag(bit_forwards), &target_entry1);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- return print_bad_irq_dependency(curr, &this, &that,
- target_entry, target_entry1,
- prev, next,
- bit_backwards, bit_forwards, irqclass);
-}
-
static const char *state_names[] = {
#define LOCKDEP_STATE(__STATE) \
__stringify(__STATE),
@@ -1977,6 +1950,13 @@ static inline const char *state_name(enu
return state_names[bit >> LOCK_USAGE_DIR_MASK];
}

+/*
+ * The bit number is encoded like:
+ *
+ * bit0: 0 exclusive, 1 read lock
+ * bit1: 0 used in irq, 1 irq enabled
+ * bit2-n: state
+ */
static int exclusive_bit(int new_bit)
{
int state = new_bit & LOCK_USAGE_STATE_MASK;
@@ -1988,45 +1968,160 @@ static int exclusive_bit(int new_bit)
return state | (dir ^ LOCK_USAGE_DIR_MASK);
}

+/*
+ * Observe that when given a bitmask where each bitnr is encoded as above, a
+ * right shift of the mask transforms the individual bitnrs as -1 and
+ * conversely, a left shift transforms into +1 for the individual bitnrs.
+ *
+ * So for all bits whose number has LOCK_ENABLED_* set (bitnr1 == 1), we can
+ * create the mask with those bit numbers using LOCK_USED_IN_* (bitnr1 == 0)
+ * instead by subtracting 2 from the bit number, or shifting the mask right by 2.
+ *
+ * Similarly, bitnr1 == 0 becomes bitnr1 == 1 by adding 2, or shifting left 2.
+ *
+ * So split the mask (note that LOCKF_ENABLED_IRQ_ALL|LOCKF_USED_IN_IRQ_ALL is
+ * all bits set) and recompose with bitnr1 flipped.
+ */
+static unsigned long invert_dir_mask(unsigned long mask)
+{
+ unsigned long excl = 0;
+
+ /* Invert dir */
+ excl |= (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
+ excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
+
+ return excl;
+}
+
+/*
+ * As above, we clear bitnr0 (LOCK_*_READ off) with bitmask ops. First, for all
+ * bits with bitnr0 set (LOCK_*_READ), add those with bitnr0 cleared (LOCK_*).
+ * And then mask out all bitnr0.
+ */
+static unsigned long exclusive_mask(unsigned long mask)
+{
+ unsigned long excl = invert_dir_mask(mask);
+
+ /* Strip read */
+ excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
+ excl &= ~LOCKF_IRQ_READ;
+
+ return excl;
+}
+
+/*
+ * Retrieve the _possible_ original mask to which @mask is
+ * exclusive. Ie: this is the opposite of exclusive_mask().
+ * Note that 2 possible original bits can match an exclusive
+ * bit: one has LOCK_USAGE_READ_MASK set, the other has it
+ * cleared. So both are returned for each exclusive bit.
+ */
+static unsigned long original_mask(unsigned long mask)
+{
+ unsigned long excl = invert_dir_mask(mask);
+
+ /* Include read in existing usages */
+ excl |= (excl & LOCKF_IRQ) << LOCK_USAGE_READ_MASK;
+
+ return excl;
+}
+
+/*
+ * Find the first pair of bit match between an original
+ * usage mask and an exclusive usage mask.
+ */
+static int find_exclusive_match(unsigned long mask,
+ unsigned long excl_mask,
+ enum lock_usage_bit *bitp,
+ enum lock_usage_bit *excl_bitp)
+{
+ int bit, excl;
+
+ for_each_set_bit(bit, &mask, LOCK_USED) {
+ excl = exclusive_bit(bit);
+ if (excl_mask & lock_flag(excl)) {
+ *bitp = bit;
+ *excl_bitp = excl;
+ return 0;
+ }
+ }
+ return -1;
+}
+
+/*
+ * Prove that the new dependency does not connect a hardirq-safe(-read)
+ * lock with a hardirq-unsafe lock - to achieve this we search
+ * the backwards-subgraph starting at <prev>, and the
+ * forwards-subgraph starting at <next>:
+ */
static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit)
+ struct held_lock *next)
{
+ unsigned long usage_mask = 0, forward_mask, backward_mask;
+ enum lock_usage_bit forward_bit = 0, backward_bit = 0;
+ struct lock_list *uninitialized_var(target_entry1);
+ struct lock_list *uninitialized_var(target_entry);
+ struct lock_list this, that;
+ int ret;
+
/*
- * Prove that the new dependency does not connect a hardirq-safe
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 1: gather all hard/soft IRQs usages backward in an
+ * accumulated usage mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ this.parent = NULL;
+ this.class = hlock_class(prev);
+
+ ret = __bfs_backwards(&this, &usage_mask, usage_accumulate, NULL);
+ if (ret < 0)
+ return print_bfs_bug(ret);

- bit++; /* _READ */
+ usage_mask &= LOCKF_USED_IN_IRQ_ALL;
+ if (!usage_mask)
+ return 1;

/*
- * Prove that the new dependency does not connect a hardirq-safe-read
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 2: find exclusive uses forward that match the previous
+ * backward accumulated mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ forward_mask = exclusive_mask(usage_mask);

- return 1;
-}
+ that.parent = NULL;
+ that.class = hlock_class(next);

-static int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
-{
-#define LOCKDEP_STATE(__STATE) \
- if (!check_irq_usage(curr, prev, next, LOCK_USED_IN_##__STATE)) \
- return 0;
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
+ ret = find_usage_forwards(&that, forward_mask, &target_entry1);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (ret == 1)
+ return ret;

- return 1;
+ /*
+ * Step 3: we found a bad match! Now retrieve a lock from the backward
+ * list whose usage mask matches the exclusive usage mask from the
+ * lock found on the forward list.
+ */
+ backward_mask = original_mask(target_entry1->class->usage_mask);
+
+ ret = find_usage_backwards(&this, backward_mask, &target_entry);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (DEBUG_LOCKS_WARN_ON(ret == 1))
+ return 1;
+
+ /*
+ * Step 4: narrow down to a pair of incompatible usage bits
+ * and report it.
+ */
+ ret = find_exclusive_match(target_entry->class->usage_mask,
+ target_entry1->class->usage_mask,
+ &backward_bit, &forward_bit);
+ if (DEBUG_LOCKS_WARN_ON(ret == -1))
+ return 1;
+
+ return print_bad_irq_dependency(curr, &this, &that,
+ target_entry, target_entry1,
+ prev, next,
+ backward_bit, forward_bit,
+ state_name(backward_bit));
}

static void inc_chains(void)
@@ -2043,9 +2138,8 @@ static void inc_chains(void)

#else

-static inline int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
+static inline int check_irq_usage(struct task_struct *curr,
+ struct held_lock *prev, struct held_lock *next)
{
return 1;
}
@@ -2225,7 +2319,7 @@ check_prev_add(struct task_struct *curr,
else if (unlikely(ret < 0))
return print_bfs_bug(ret);

- if (!check_prev_add_irq(curr, prev, next))
+ if (!check_irq_usage(curr, prev, next))
return 0;

/*
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -66,6 +66,12 @@ static const unsigned long LOCKF_USED_IN
0;
#undef LOCKDEP_STATE

+#define LOCKF_ENABLED_IRQ_ALL (LOCKF_ENABLED_IRQ | LOCKF_ENABLED_IRQ_READ)
+#define LOCKF_USED_IN_IRQ_ALL (LOCKF_USED_IN_IRQ | LOCKF_USED_IN_IRQ_READ)
+
+#define LOCKF_IRQ (LOCKF_ENABLED_IRQ | LOCKF_USED_IN_IRQ)
+#define LOCKF_IRQ_READ (LOCKF_ENABLED_IRQ_READ | LOCKF_USED_IN_IRQ_READ)
+
/*
* CONFIG_LOCKDEP_SMALL is defined for sparc. Sparc requires .text,
* .data and .bss to fit in required 32MB limit for the kernel. With

2019-04-16 15:23:18

by Frederic Weisbecker

Subject: Re: [PATCH 4/4] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

On Tue, Apr 16, 2019 at 01:20:09PM +0200, Peter Zijlstra wrote:
> On Sat, Apr 13, 2019 at 02:35:45AM +0200, Frederic Weisbecker wrote:
> > On Thu, Apr 11, 2019 at 12:46:32PM +0200, Peter Zijlstra wrote:
> > Same here:
> >
> > As above, we clear bitnr0 (LOCK_*_READ off) with bitmask ops. First, for all bits
> > with bitnr1 set (LOCK_ENABLED_*) , add those with bitnr1 cleared (LOCK_USED_IN_*).
> > And then mask out all bitnr1.
>
> Ha! you failed to spot my failure, all the above should be bitnr0 of
> course :/

Ooops, right.

> Find updated below.

Very good now, thanks a lot!

Subject: [tip:locking/core] locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING

Commit-ID: 0d2cc3b3453254f1c56f9456ba03e092ed4cfb72
Gitweb: https://git.kernel.org/tip/0d2cc3b3453254f1c56f9456ba03e092ed4cfb72
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Tue, 2 Apr 2019 18:02:41 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 18 Apr 2019 12:50:17 +0200

locking/lockdep: Move valid_state() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING

valid_state() and print_usage_bug*() functions are not used beyond
irq locking correctness checks under CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING.

Sadly, the "unused function" warning wouldn't fire because valid_state()
is inline, so the unused case has remained unseen until now.

So move them inside the appropriate CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
section.

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/locking/lockdep.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 34cdcbedda49..9c5819ef4a28 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2784,6 +2784,12 @@ static void check_chain_key(struct task_struct *curr)
#endif
}

+static int mark_lock(struct task_struct *curr, struct held_lock *this,
+ enum lock_usage_bit new_bit);
+
+#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
+
+
static void
print_usage_bug_scenario(struct held_lock *lock)
{
@@ -2853,10 +2859,6 @@ valid_state(struct task_struct *curr, struct held_lock *this,
return 1;
}

-static int mark_lock(struct task_struct *curr, struct held_lock *this,
- enum lock_usage_bit new_bit);
-
-#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)

/*
* print irq inversion bug:

Subject: [tip:locking/core] locking/lockdep: Use expanded masks on find_usage_*() functions

Commit-ID: 627f364d24c009b61c9199b2c75006e35c294675
Gitweb: https://git.kernel.org/tip/627f364d24c009b61c9199b2c75006e35c294675
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Tue, 2 Apr 2019 18:02:43 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 18 Apr 2019 12:50:17 +0200

locking/lockdep: Use expanded masks on find_usage_*() functions

In order to optimize check_irq_usage() and factorize all the IRQ usage
validations, we'll need to be able to check multiple lock usage bits at
once. Prepare the low level usage mask check functions for that purpose.

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/locking/lockdep.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2288aa2fa4c6..5e149dd78298 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1682,9 +1682,9 @@ check_redundant(struct lock_list *root, struct lock_class *target,
* without creating any illegal irq-safe -> irq-unsafe lock dependency.
*/

-static inline int usage_match(struct lock_list *entry, void *bit)
+static inline int usage_match(struct lock_list *entry, void *mask)
{
- return entry->class->usage_mask & (1 << (enum lock_usage_bit)bit);
+ return entry->class->usage_mask & *(unsigned long *)mask;
}


@@ -1700,14 +1700,14 @@ static inline int usage_match(struct lock_list *entry, void *bit)
* Return <0 on error.
*/
static int
-find_usage_forwards(struct lock_list *root, enum lock_usage_bit bit,
+find_usage_forwards(struct lock_list *root, unsigned long usage_mask,
struct lock_list **target_entry)
{
int result;

debug_atomic_inc(nr_find_usage_forwards_checks);

- result = __bfs_forwards(root, (void *)bit, usage_match, target_entry);
+ result = __bfs_forwards(root, &usage_mask, usage_match, target_entry);

return result;
}
@@ -1723,14 +1723,14 @@ find_usage_forwards(struct lock_list *root, enum lock_usage_bit bit,
* Return <0 on error.
*/
static int
-find_usage_backwards(struct lock_list *root, enum lock_usage_bit bit,
+find_usage_backwards(struct lock_list *root, unsigned long usage_mask,
struct lock_list **target_entry)
{
int result;

debug_atomic_inc(nr_find_usage_backwards_checks);

- result = __bfs_backwards(root, (void *)bit, usage_match, target_entry);
+ result = __bfs_backwards(root, &usage_mask, usage_match, target_entry);

return result;
}
@@ -1935,7 +1935,7 @@ check_usage(struct task_struct *curr, struct held_lock *prev,
this.parent = NULL;

this.class = hlock_class(prev);
- ret = find_usage_backwards(&this, bit_backwards, &target_entry);
+ ret = find_usage_backwards(&this, lock_flag(bit_backwards), &target_entry);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)
@@ -1943,7 +1943,7 @@ check_usage(struct task_struct *curr, struct held_lock *prev,

that.parent = NULL;
that.class = hlock_class(next);
- ret = find_usage_forwards(&that, bit_forwards, &target_entry1);
+ ret = find_usage_forwards(&that, lock_flag(bit_forwards), &target_entry1);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)
@@ -2941,7 +2941,7 @@ check_usage_forwards(struct task_struct *curr, struct held_lock *this,

root.parent = NULL;
root.class = hlock_class(this);
- ret = find_usage_forwards(&root, bit, &target_entry);
+ ret = find_usage_forwards(&root, lock_flag(bit), &target_entry);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)
@@ -2965,7 +2965,7 @@ check_usage_backwards(struct task_struct *curr, struct held_lock *this,

root.parent = NULL;
root.class = hlock_class(this);
- ret = find_usage_backwards(&root, bit, &target_entry);
+ ret = find_usage_backwards(&root, lock_flag(bit), &target_entry);
if (ret < 0)
return print_bfs_bug(ret);
if (ret == 1)

Subject: [tip:locking/core] locking/lockdep: Map remaining magic numbers to lock usage mask names

Commit-ID: c902a1e8d9c9b47cd8faa16892710247cdda9b02
Gitweb: https://git.kernel.org/tip/c902a1e8d9c9b47cd8faa16892710247cdda9b02
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Tue, 2 Apr 2019 18:02:42 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 18 Apr 2019 12:50:17 +0200

locking/lockdep: Map remaining magic numbers to lock usage mask names

Clarify the code by mapping some more magic numbers that hadn't yet
been named after their corresponding LOCK_USAGE_* symbols.
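
As a reading aid (reconstructing the enum layout these names rely on,
per the LOCKDEP_STATE() expansion; not code from this patch), bit0 is
the read flag and bit1 the direction flag, which lets LOCK_USAGE_DIR_MASK
(value 2) serve both as a mask and as the shift that strips the two
flag bits:

    /* enum lock_usage_bit layout:
     *   LOCK_USED_IN_HARDIRQ      = 0  (state 0, dir 0, read 0)
     *   LOCK_USED_IN_HARDIRQ_READ = 1  (state 0, dir 0, read 1)
     *   LOCK_ENABLED_HARDIRQ      = 2  (state 0, dir 1, read 0)
     *   LOCK_ENABLED_HARDIRQ_READ = 3  (state 0, dir 1, read 1)
     *   LOCK_USED_IN_SOFTIRQ      = 4  (state 1, dir 0, read 0)
     *   ...
     * so "bit >> LOCK_USAGE_DIR_MASK" recovers the state index and
     * "bit & LOCK_USAGE_READ_MASK" the read flag.
     */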

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/locking/lockdep.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 9c5819ef4a28..2288aa2fa4c6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -516,11 +516,11 @@ static char get_usage_char(struct lock_class *class, enum lock_usage_bit bit)
{
char c = '.';

- if (class->usage_mask & lock_flag(bit + 2))
+ if (class->usage_mask & lock_flag(bit + LOCK_USAGE_DIR_MASK))
c = '+';
if (class->usage_mask & lock_flag(bit)) {
c = '-';
- if (class->usage_mask & lock_flag(bit + 2))
+ if (class->usage_mask & lock_flag(bit + LOCK_USAGE_DIR_MASK))
c = '?';
}

@@ -1971,7 +1971,10 @@ static const char *state_rnames[] = {

static inline const char *state_name(enum lock_usage_bit bit)
{
- return (bit & LOCK_USAGE_READ_MASK) ? state_rnames[bit >> 2] : state_names[bit >> 2];
+ if (bit & LOCK_USAGE_READ_MASK)
+ return state_rnames[bit >> LOCK_USAGE_DIR_MASK];
+ else
+ return state_names[bit >> LOCK_USAGE_DIR_MASK];
}

static int exclusive_bit(int new_bit)
@@ -3017,7 +3020,7 @@ static int (*state_verbose_f[])(struct lock_class *class) = {
static inline int state_verbose(enum lock_usage_bit bit,
struct lock_class *class)
{
- return state_verbose_f[bit >> 2](class);
+ return state_verbose_f[bit >> LOCK_USAGE_DIR_MASK](class);
}

typedef int (*check_usage_f)(struct task_struct *, struct held_lock *,

Subject: [tip:locking/core] locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

Commit-ID: 948f83768a180ec8e85c4a8ff269d5e433d10815
Gitweb: https://git.kernel.org/tip/948f83768a180ec8e85c4a8ff269d5e433d10815
Author: Frederic Weisbecker <[email protected]>
AuthorDate: Tue, 2 Apr 2019 18:02:44 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 29 Apr 2019 08:29:20 +0200

locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()

check_prev_add_irq() tests all incompatible scenarios one after the
other while adding a lock (@next) to a dependency tree (@prev):

LOCK_USED_IN_HARDIRQ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_HARDIRQ_READ vs LOCK_ENABLED_HARDIRQ
LOCK_USED_IN_SOFTIRQ vs LOCK_ENABLED_SOFTIRQ
LOCK_USED_IN_SOFTIRQ_READ vs LOCK_ENABLED_SOFTIRQ

For each of these four scenarios, we must iterate the @prev backward
dependencies at least once. Then, if the backward walk finds the relevant
LOCK_USED_* bit, we must also iterate the @next forward dependencies.

Therefore in the best case we iterate 4 times (the backward walks alone),
and in the worst case 8 times (each backward walk also triggering a
forward walk).

A different approach can let us divide the number of branch iterations
by 4:

1) Iterate through @prev backward dependencies and accumulate all the IRQ
uses in a single mask. In the best case where the current lock hasn't
been used in IRQ, we stop here.

2) Iterate through @next forward dependencies and try to find a lock
whose usage is exclusive to the accumulated usages gathered in the
previous step. If we find one (call it @lockA), we have found an
incompatible use; otherwise we stop here. Only bad locking scenarios
go further, so a sane verification stops here.

3) Iterate again through the @prev backward dependencies and find the
lock whose usage matches @lockA in terms of incompatibility. Call that
lock @lockB.

4) Report the incompatible usages of @lockA and @lockB.

If no incompatible use is found, the verification never goes beyond
step 2, which means at most two iterations.
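
As a worked sketch of the mask algebra behind steps 1 and 2 (the helper
names come from the patch below; the scenario itself is hypothetical):

    /* Suppose step 1 accumulates, over @prev's backward dependencies:
     *
     *    usage_mask = LOCKF_USED_IN_SOFTIRQ_READ;
     *
     * Step 2 then searches @next's forward dependencies with:
     *
     *    forward_mask = exclusive_mask(usage_mask)
     *                 = LOCKF_ENABLED_SOFTIRQ;
     *
     * invert_dir_mask() flips USED_IN <-> ENABLED (bitnr1), then the
     * read bit is stripped: a softirq-read-safe @prev conflicts with
     * any softirq-unsafe @next, read or not.
     */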

The following compares the execution measurements of the function
check_prev_add_irq():

            | Number of calls | Avg (ns) | Stdev (ns) | Total time (ns)
------------+-----------------+----------+------------+----------------
 Mainline   |            8452 |     2652 |      11962 |        22415143
 This patch |            8452 |     1518 |       7090 |        12835602
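
For the same number of calls, that is roughly a 1.75x reduction in both
the average and the total execution time.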

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/locking/lockdep.c | 228 ++++++++++++++++++++++++++-----------
kernel/locking/lockdep_internals.h | 6 +
2 files changed, 167 insertions(+), 67 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5e149dd78298..25ecc6d3058b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1676,6 +1676,14 @@ check_redundant(struct lock_list *root, struct lock_class *target,
}

#if defined(CONFIG_TRACE_IRQFLAGS) && defined(CONFIG_PROVE_LOCKING)
+
+static inline int usage_accumulate(struct lock_list *entry, void *mask)
+{
+ *(unsigned long *)mask |= entry->class->usage_mask;
+
+ return 0;
+}
+
/*
* Forwards and backwards subgraph searching, for the purposes of
* proving that two subgraphs can be connected by a new dependency
@@ -1687,8 +1695,6 @@ static inline int usage_match(struct lock_list *entry, void *mask)
return entry->class->usage_mask & *(unsigned long *)mask;
}

-
-
/*
* Find a node in the forwards-direction dependency sub-graph starting
* at @root->class that matches @bit.
@@ -1922,39 +1928,6 @@ print_bad_irq_dependency(struct task_struct *curr,
return 0;
}

-static int
-check_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit_backwards,
- enum lock_usage_bit bit_forwards, const char *irqclass)
-{
- int ret;
- struct lock_list this, that;
- struct lock_list *uninitialized_var(target_entry);
- struct lock_list *uninitialized_var(target_entry1);
-
- this.parent = NULL;
-
- this.class = hlock_class(prev);
- ret = find_usage_backwards(&this, lock_flag(bit_backwards), &target_entry);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- that.parent = NULL;
- that.class = hlock_class(next);
- ret = find_usage_forwards(&that, lock_flag(bit_forwards), &target_entry1);
- if (ret < 0)
- return print_bfs_bug(ret);
- if (ret == 1)
- return ret;
-
- return print_bad_irq_dependency(curr, &this, &that,
- target_entry, target_entry1,
- prev, next,
- bit_backwards, bit_forwards, irqclass);
-}
-
static const char *state_names[] = {
#define LOCKDEP_STATE(__STATE) \
__stringify(__STATE),
@@ -1977,6 +1950,13 @@ static inline const char *state_name(enum lock_usage_bit bit)
return state_names[bit >> LOCK_USAGE_DIR_MASK];
}

+/*
+ * The bit number is encoded like:
+ *
+ * bit0: 0 exclusive, 1 read lock
+ * bit1: 0 used in irq, 1 irq enabled
+ * bit2-n: state
+ */
static int exclusive_bit(int new_bit)
{
int state = new_bit & LOCK_USAGE_STATE_MASK;
@@ -1988,45 +1968,160 @@ static int exclusive_bit(int new_bit)
return state | (dir ^ LOCK_USAGE_DIR_MASK);
}

+/*
+ * Observe that when given a bitmask where each bitnr is encoded as above, a
+ * right shift of the mask transforms the individual bitnrs as -1 and
+ * conversely, a left shift transforms into +1 for the individual bitnrs.
+ *
+ * So for all bits whose number have LOCK_ENABLED_* set (bitnr1 == 1), we can
+ * create the mask with those bit numbers using LOCK_USED_IN_* (bitnr1 == 0)
+ * instead by subtracting the bit number by 2, or shifting the mask right by 2.
+ *
+ * Similarly, bitnr1 == 0 becomes bitnr1 == 1 by adding 2, or shifting left 2.
+ *
+ * So split the mask (note that LOCKF_ENABLED_IRQ_ALL|LOCKF_USED_IN_IRQ_ALL is
+ * all bits set) and recompose with bitnr1 flipped.
+ */
+static unsigned long invert_dir_mask(unsigned long mask)
+{
+ unsigned long excl = 0;
+
+ /* Invert dir */
+ excl |= (mask & LOCKF_ENABLED_IRQ_ALL) >> LOCK_USAGE_DIR_MASK;
+ excl |= (mask & LOCKF_USED_IN_IRQ_ALL) << LOCK_USAGE_DIR_MASK;
+
+ return excl;
+}
+
+/*
+ * As above, we clear bitnr0 (LOCK_*_READ off) with bitmask ops. First, for all
+ * bits with bitnr0 set (LOCK_*_READ), add those with bitnr0 cleared (LOCK_*).
+ * And then mask out all bitnr0.
+ */
+static unsigned long exclusive_mask(unsigned long mask)
+{
+ unsigned long excl = invert_dir_mask(mask);
+
+ /* Strip read */
+ excl |= (excl & LOCKF_IRQ_READ) >> LOCK_USAGE_READ_MASK;
+ excl &= ~LOCKF_IRQ_READ;
+
+ return excl;
+}
+
+/*
+ * Retrieve the _possible_ original mask to which @mask is
+ * exclusive. Ie: this is the opposite of exclusive_mask().
+ * Note that 2 possible original bits can match an exclusive
+ * bit: one has LOCK_USAGE_READ_MASK set, the other has it
+ * cleared. So both are returned for each exclusive bit.
+ */
+static unsigned long original_mask(unsigned long mask)
+{
+ unsigned long excl = invert_dir_mask(mask);
+
+ /* Include read in existing usages */
+ excl |= (excl & LOCKF_IRQ) << LOCK_USAGE_READ_MASK;
+
+ return excl;
+}
+
+/*
+ * Find the first pair of bit match between an original
+ * usage mask and an exclusive usage mask.
+ */
+static int find_exclusive_match(unsigned long mask,
+ unsigned long excl_mask,
+ enum lock_usage_bit *bitp,
+ enum lock_usage_bit *excl_bitp)
+{
+ int bit, excl;
+
+ for_each_set_bit(bit, &mask, LOCK_USED) {
+ excl = exclusive_bit(bit);
+ if (excl_mask & lock_flag(excl)) {
+ *bitp = bit;
+ *excl_bitp = excl;
+ return 0;
+ }
+ }
+ return -1;
+}
+
+/*
+ * Prove that the new dependency does not connect a hardirq-safe(-read)
+ * lock with a hardirq-unsafe lock - to achieve this we search
+ * the backwards-subgraph starting at <prev>, and the
+ * forwards-subgraph starting at <next>:
+ */
static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next, enum lock_usage_bit bit)
+ struct held_lock *next)
{
+ unsigned long usage_mask = 0, forward_mask, backward_mask;
+ enum lock_usage_bit forward_bit = 0, backward_bit = 0;
+ struct lock_list *uninitialized_var(target_entry1);
+ struct lock_list *uninitialized_var(target_entry);
+ struct lock_list this, that;
+ int ret;
+
/*
- * Prove that the new dependency does not connect a hardirq-safe
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 1: gather all hard/soft IRQs usages backward in an
+ * accumulated usage mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ this.parent = NULL;
+ this.class = hlock_class(prev);

- bit++; /* _READ */
+ ret = __bfs_backwards(&this, &usage_mask, usage_accumulate, NULL);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+
+ usage_mask &= LOCKF_USED_IN_IRQ_ALL;
+ if (!usage_mask)
+ return 1;

/*
- * Prove that the new dependency does not connect a hardirq-safe-read
- * lock with a hardirq-unsafe lock - to achieve this we search
- * the backwards-subgraph starting at <prev>, and the
- * forwards-subgraph starting at <next>:
+ * Step 2: find exclusive uses forward that match the previous
+ * backward accumulated mask.
*/
- if (!check_usage(curr, prev, next, bit,
- exclusive_bit(bit), state_name(bit)))
- return 0;
+ forward_mask = exclusive_mask(usage_mask);

- return 1;
-}
+ that.parent = NULL;
+ that.class = hlock_class(next);

-static int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
-{
-#define LOCKDEP_STATE(__STATE) \
- if (!check_irq_usage(curr, prev, next, LOCK_USED_IN_##__STATE)) \
- return 0;
-#include "lockdep_states.h"
-#undef LOCKDEP_STATE
+ ret = find_usage_forwards(&that, forward_mask, &target_entry1);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (ret == 1)
+ return ret;

- return 1;
+ /*
+ * Step 3: we found a bad match! Now retrieve a lock from the backward
+ * list whose usage mask matches the exclusive usage mask from the
+ * lock found on the forward list.
+ */
+ backward_mask = original_mask(target_entry1->class->usage_mask);
+
+ ret = find_usage_backwards(&this, backward_mask, &target_entry);
+ if (ret < 0)
+ return print_bfs_bug(ret);
+ if (DEBUG_LOCKS_WARN_ON(ret == 1))
+ return 1;
+
+ /*
+ * Step 4: narrow down to a pair of incompatible usage bits
+ * and report it.
+ */
+ ret = find_exclusive_match(target_entry->class->usage_mask,
+ target_entry1->class->usage_mask,
+ &backward_bit, &forward_bit);
+ if (DEBUG_LOCKS_WARN_ON(ret == -1))
+ return 1;
+
+ return print_bad_irq_dependency(curr, &this, &that,
+ target_entry, target_entry1,
+ prev, next,
+ backward_bit, forward_bit,
+ state_name(backward_bit));
}

static void inc_chains(void)
@@ -2043,9 +2138,8 @@ static void inc_chains(void)

#else

-static inline int
-check_prev_add_irq(struct task_struct *curr, struct held_lock *prev,
- struct held_lock *next)
+static inline int check_irq_usage(struct task_struct *curr,
+ struct held_lock *prev, struct held_lock *next)
{
return 1;
}
@@ -2225,7 +2319,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
else if (unlikely(ret < 0))
return print_bfs_bug(ret);

- if (!check_prev_add_irq(curr, prev, next))
+ if (!check_irq_usage(curr, prev, next))
return 0;

/*
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 2b3ffd4117ad..150ec3f0c5b5 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -66,6 +66,12 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
0;
#undef LOCKDEP_STATE

+#define LOCKF_ENABLED_IRQ_ALL (LOCKF_ENABLED_IRQ | LOCKF_ENABLED_IRQ_READ)
+#define LOCKF_USED_IN_IRQ_ALL (LOCKF_USED_IN_IRQ | LOCKF_USED_IN_IRQ_READ)
+
+#define LOCKF_IRQ (LOCKF_ENABLED_IRQ | LOCKF_USED_IN_IRQ)
+#define LOCKF_IRQ_READ (LOCKF_ENABLED_IRQ_READ | LOCKF_USED_IN_IRQ_READ)
+
/*
* CONFIG_LOCKDEP_SMALL is defined for sparc. Sparc requires .text,
* .data and .bss to fit in required 32MB limit for the kernel. With