LinuxLists.cc - RFC: LockDoc - Detecting Locking Bugs in the Linux Kernel

2018-11-28 09:19:21

Subject: RFC: LockDoc - Detecting Locking Bugs in the Linux Kernel

Hi folks,

during the past months we've been developing LockDoc, a trace-based
approach of Lock Analysis in the Linux Kernel.
LockDoc uses execution traces of an instrumented Linux Kernel to
automatically deduce
locking rules for all members of arbitrary kernel data structures.
The traces are gathered running a manually selected fs-specific subset
of the Linux Test Project in a virtual machine.
These locking rules can be used to generate a comprehensive locking
documentation and to reveal potential bugs.

LockDoc generates rules for each tuple of (data structure, member, {r,w}).
It completely ignores any element of type atomic{,64,long}_t as well as
atomic_*() functions.
Accesses during initialization and destruction of objects are also ignored.
The output of LockDoc looks like this:
inode member: i_flags [w] (15 lock combinations)
hypotheses: 96
15.8% (88 out of 558 mem accesses): EMBOTHER(inode.i_rwsem[w]) ->
EMBSAME(inode.i_rwsem[w])
counterexample.sql.sh inode w:i_flags CEX SEQ
'EMBOTHER(inode.i_rwsem[w])' 'EMBSAME(inode.i_rwsem[w])'
15.8% (88 out of 558 mem accesses): EMBOTHER(inode.i_rwsem[w])
counterexample.sql.sh inode w:i_flags CEX SEQ
'EMBOTHER(inode.i_rwsem[w])'
! 99.8% (557 out of 558 mem accesses): EMBSAME(inode.i_rwsem[w])
! counterexample.sql.sh inode w:i_flags CEX SEQ
'EMBSAME(inode.i_rwsem[w])'
100% (558 out of 558 mem accesses): (no locks held)
(no counterexamples to be expected, this hypothesis has 100% support
in the observation set)

In this example LockDoc concludes that the lock
"EMBSAME(inode.i_rwsem[w])" is necessary for writing inode.i_flags.
EMBSAME stands for the lock embedded in the inode being accessed. In
this case it is the i_rwsem.
To be more precise, the write lock (--> "[w]") of i_rwsem is needed.
Based on this methodology, we can determine code locations that do not
adhere to the deduced locking rules.
The reports on rule-violating code include the stack trace and the
actual locks held.

We've now created a series of bug reports for the following data types:
struct inode (used by ext4), journal_t, and transaction_t.
We present the counterexamples for each tuple of (data structure,
member, {r,w}).
Depending on the complexity of the callgraph, the counterexamples are
either embedded in the callgraph or the callgraph is shown below them.
In the latter case, zooming can be enabled via a button in the heading.

We kindly ask you to have a look at our findings and send us some
comments back:
https://ess.cs.tu-dortmund.de/lockdoc-bugs/ml/

Our approach has already revealed one real bug and one suspicious
situation. Both have been confirmed by Jan.

Best regards,
Alex and Horst

--
Technische Universität Dortmund
Alexander Lochmann PGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone: +49.231.7556141
D-44227 Dortmund fax: +49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al

Attachments:

signature.asc (833.00 B)
OpenPGP digital signature

2018-11-29 07:37:50

by Andreas Dilger

[permalink] [raw]

Subject: Re: RFC: LockDoc - Detecting Locking Bugs in the Linux Kernel

On Nov 27, 2018, at 3:19 PM, Alexander Lochmann <[email protected]> wrote:
>
> Hi folks,
>
> during the past months we've been developing LockDoc, a trace-based
> approach of Lock Analysis in the Linux Kernel.
> LockDoc uses execution traces of an instrumented Linux Kernel to
> automatically deduce
> locking rules for all members of arbitrary kernel data structures.
> The traces are gathered running a manually selected fs-specific subset
> of the Linux Test Project in a virtual machine.
> These locking rules can be used to generate a comprehensive locking
> documentation and to reveal potential bugs.

This is quite interesting, and looks useful provided that there is a
workload that exercises the various codepaths. How long does such an
analysis take to run, and is there a plan to make this functionality
available to developers (e.g. either as a web portal, or to be able
to download the code and run it locally)?

> LockDoc generates rules for each tuple of (data structure, member, {r,w}).
> It completely ignores any element of type atomic{,64,long}_t as well as
> atomic_*() functions.
> Accesses during initialization and destruction of objects are also ignored.
> The output of LockDoc looks like this:
> inode member: i_flags [w] (15 lock combinations)
> hypotheses: 96
> 15.8% (88 out of 558 mem accesses): EMBOTHER(inode.i_rwsem[w]) ->
> EMBSAME(inode.i_rwsem[w])
> counterexample.sql.sh inode w:i_flags CEX SEQ
> 'EMBOTHER(inode.i_rwsem[w])' 'EMBSAME(inode.i_rwsem[w])'
> 15.8% (88 out of 558 mem accesses): EMBOTHER(inode.i_rwsem[w])
> counterexample.sql.sh inode w:i_flags CEX SEQ
> 'EMBOTHER(inode.i_rwsem[w])'
> ! 99.8% (557 out of 558 mem accesses): EMBSAME(inode.i_rwsem[w])
> ! counterexample.sql.sh inode w:i_flags CEX SEQ
> 'EMBSAME(inode.i_rwsem[w])'
> 100% (558 out of 558 mem accesses): (no locks held)
> (no counterexamples to be expected, this hypothesis has 100% support
> in the observation set)
>
> In this example LockDoc concludes that the lock
> "EMBSAME(inode.i_rwsem[w])" is necessary for writing inode.i_flags.
> EMBSAME stands for the lock embedded in the inode being accessed. In
> this case it is the i_rwsem.
> To be more precise, the write lock (--> "[w]") of i_rwsem is needed.
> Based on this methodology, we can determine code locations that do not
> adhere to the deduced locking rules.
> The reports on rule-violating code include the stack trace and the
> actual locks held.

Looking at the page, it isn't very clear where some of the callpaths go.
For example, in the "writing inode:ext4.i_nlink-__i_nlink" case, it
shows a callpath from vfs_symlink() calling drop_nlink(), but this is not
called directly from vfs_symlink(). It is actually going through
dir->i_op->symlink() (ext4_symlink() in our case), so it would be best
to show that dependency.

One minor bug in ext4_symlink() is that it should probably be calling
clear_nlink() to make it clear the link count should be zero, instead
of drop_nlink(), because we don't really want to trigger "remove_count"
and because the later part of this function is using set_nlink(inode, 1)
instead of inc_nlink() that decrements "remove_count" again. This does
not solve the reported warning, however.

In ext4_orphan_add(), called immediately after drop_nlink(), it checks:

WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
!inode_is_locked(inode));

so this is the case where i_state has I_NEW set. The question is whether
it is worthwhile to grab inode_lock() in this code, just to keep the lock
checker happy, or whether the code can be annotated to tell the checker
that I_NEW means i_rwsem is not needed.

Cheers, Andreas

>
> We've now created a series of bug reports for the following data types:
> struct inode (used by ext4), journal_t, and transaction_t.
> We present the counterexamples for each tuple of (data structure,
> member, {r,w}).
> Depending on the complexity of the callgraph, the counterexamples are
> either embedded in the callgraph or the callgraph is shown below them.
> In the latter case, zooming can be enabled via a button in the heading.
>
> We kindly ask you to have a look at our findings and send us some
> comments back:
> https://ess.cs.tu-dortmund.de/lockdoc-bugs/ml/
>
> Our approach has already revealed one real bug and one suspicious
> situation. Both have been confirmed by Jan.
>
> Best regards,
> Alex and Horst
>
> --
> Technische Universität Dortmund
> Alexander Lochmann PGP key: 0xBC3EF6FD
> Otto-Hahn-Str. 16 phone: +49.231.7556141
> D-44227 Dortmund fax: +49.231.7556116
> http://ess.cs.tu-dortmund.de/Staff/al
>

Cheers, Andreas

Attachments:

signature.asc (873.00 B)
Message signed with OpenPGP

2018-12-06 14:22:12

by Alexander Lochmann

[permalink] [raw]

Subject: Re: RFC: LockDoc - Detecting Locking Bugs in the Linux Kernel

Am 28.11.18 um 21:34 schrieb Andreas Dilger:
> On Nov 27, 2018, at 3:19 PM, Alexander Lochmann <[email protected]> wrote:
>>
>> Hi folks,
>>
>> during the past months we've been developing LockDoc, a trace-based
>> approach of Lock Analysis in the Linux Kernel.
>> LockDoc uses execution traces of an instrumented Linux Kernel to
>> automatically deduce
>> locking rules for all members of arbitrary kernel data structures.
>> The traces are gathered running a manually selected fs-specific subset
>> of the Linux Test Project in a virtual machine.
>> These locking rules can be used to generate a comprehensive locking
>> documentation and to reveal potential bugs.
>
>
> This is quite interesting, and looks useful provided that there is a
> workload that exercises the various codepaths. How long does such an
> analysis take to run, and is there a plan to make this functionality
> available to developers (e.g. either as a web portal, or to be able
> to download the code and run it locally)?
It takes ~34 min to gather the trace, and another 4,5h to process it.
Yes, we intend to release our project. However, we do not have a
schedule yet.
>
>> LockDoc generates rules for each tuple of (data structure, member, {r,w}).
>> It completely ignores any element of type atomic{,64,long}_t as well as
>> atomic_*() functions.
>> Accesses during initialization and destruction of objects are also ignored.
>> The output of LockDoc looks like this:
>> inode member: i_flags [w] (15 lock combinations)
>> hypotheses: 96
>> 15.8% (88 out of 558 mem accesses): EMBOTHER(inode.i_rwsem[w]) ->
>> EMBSAME(inode.i_rwsem[w])
>> counterexample.sql.sh inode w:i_flags CEX SEQ
>> 'EMBOTHER(inode.i_rwsem[w])' 'EMBSAME(inode.i_rwsem[w])'
>> 15.8% (88 out of 558 mem accesses): EMBOTHER(inode.i_rwsem[w])
>> counterexample.sql.sh inode w:i_flags CEX SEQ
>> 'EMBOTHER(inode.i_rwsem[w])'
>> ! 99.8% (557 out of 558 mem accesses): EMBSAME(inode.i_rwsem[w])
>> ! counterexample.sql.sh inode w:i_flags CEX SEQ
>> 'EMBSAME(inode.i_rwsem[w])'
>> 100% (558 out of 558 mem accesses): (no locks held)
>> (no counterexamples to be expected, this hypothesis has 100% support
>> in the observation set)
>>
>> In this example LockDoc concludes that the lock
>> "EMBSAME(inode.i_rwsem[w])" is necessary for writing inode.i_flags.
>> EMBSAME stands for the lock embedded in the inode being accessed. In
>> this case it is the i_rwsem.
>> To be more precise, the write lock (--> "[w]") of i_rwsem is needed.
>> Based on this methodology, we can determine code locations that do not
>> adhere to the deduced locking rules.
>> The reports on rule-violating code include the stack trace and the
>> actual locks held.
>
> Looking at the page, it isn't very clear where some of the callpaths go.
> For example, in the "writing inode:ext4.i_nlink-__i_nlink" case, it
> shows a callpath from vfs_symlink() calling drop_nlink(), but this is not
> called directly from vfs_symlink(). It is actually going through
> dir->i_op->symlink() (ext4_symlink() in our case), so it would be best
> to show that dependency.
Thanks for pointing us to a GCC bug. :) Since drop_nlink() is a leaf
function, GCC 8.2 does not generate the frame pointer code - even with
CONFIG_FRAME_POINTER set. Using GCC 7.3 "fixes" this.
(Disable -regparm=3 would also "fix" this issue with GCC 8.2. Since this
concerns the kernel abi, this is not an option...)
>
> One minor bug in ext4_symlink() is that it should probably be calling
> clear_nlink() to make it clear the link count should be zero, instead
> of drop_nlink(), because we don't really want to trigger "remove_count"
> and because the later part of this function is using set_nlink(inode, 1)
> instead of inc_nlink() that decrements "remove_count" again. This does
> not solve the reported warning, however.
>
> In ext4_orphan_add(), called immediately after drop_nlink(), it checks:
>
> WARN_ON_ONCE(!(inode->i_state & (I_NEW | I_FREEING)) &&
> !inode_is_locked(inode));
>
> so this is the case where i_state has I_NEW set. The question is whether
> it is worthwhile to grab inode_lock() in this code, just to keep the lock
> checker happy, or whether the code can be annotated to tell the checker
> that I_NEW means i_rwsem is not needed.
Who can help us here?
But it basically is a bug?

Cheers,
Alex
>
> Cheers, Andreas
>
>>
>> We've now created a series of bug reports for the following data types:
>> struct inode (used by ext4), journal_t, and transaction_t.
>> We present the counterexamples for each tuple of (data structure,
>> member, {r,w}).
>> Depending on the complexity of the callgraph, the counterexamples are
>> either embedded in the callgraph or the callgraph is shown below them.
>> In the latter case, zooming can be enabled via a button in the heading.
>>
>> We kindly ask you to have a look at our findings and send us some
>> comments back:
>> https://ess.cs.tu-dortmund.de/lockdoc-bugs/ml/
>>
>> Our approach has already revealed one real bug and one suspicious
>> situation. Both have been confirmed by Jan.
>>
>> Best regards,
>> Alex and Horst
>>
>> --
>> Technische Universität Dortmund
>> Alexander Lochmann PGP key: 0xBC3EF6FD
>> Otto-Hahn-Str. 16 phone: +49.231.7556141
>> D-44227 Dortmund fax: +49.231.7556116
>> http://ess.cs.tu-dortmund.de/Staff/al
>>
>
>
> Cheers, Andreas
>
>
>
>
>

--
Technische Universität Dortmund
Alexander Lochmann PGP key: 0xBC3EF6FD
Otto-Hahn-Str. 16 phone: +49.231.7556141
D-44227 Dortmund fax: +49.231.7556116
http://ess.cs.tu-dortmund.de/Staff/al

Attachments:

signature.asc (833.00 B)
OpenPGP digital signature