2009-07-01 17:06:42

by Roland Dreier

[permalink] [raw]
Subject: lockdep reported AB-BA problem in ecryptfs

I have an Ubuntu 9.10 system, using ecryptfs for my home directory,
running a 2.6.31-rc1+git kernel with lockdep enabled, and I got the
report below on login. This looks like a valid AB-BA issue; there are
two paths through the code, first:

ecryptfs_new_file_context() ->
ecryptfs_copy_mount_wide_sigs_to_inode_sigs() ->

mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);

-> ecryptfs_add_keysig() ->

mutex_lock(&crypt_stat->keysig_list_mutex);

so global_auth_tok_list_mutex is taken before keysig_list_mutex.

and second:

ecryptfs_generate_key_packet_set() ->

mutex_lock(&crypt_stat->keysig_list_mutex);

-> ecryptfs_find_global_auth_tok_for_sig() ->

mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);

so keysig_list_mutex is taken before global_auth_tok_list_mutex here.

I'm not sure if this could actually deadlock in practice (ie can we have
threads racing on these two paths?) but it seems risky.

I'm not an expert on ecryptfs locking bug the simplest fix I see for
this particular issue is to move taking keysig_list_mutex to before
global_auth_tok_list_mutex in ecryptfs_copy_mount_wide_sigs_to_inode_sigs(),
since ecryptfs_add_keysig() has only one caller and so we can rely on
that caller to take the needed lock. I can send a patch for this if
this sounds good.


here's the full report:

[ 24.787210] =======================================================
[ 24.787213] [ INFO: possible circular locking dependency detected ]
[ 24.787216] 2.6.31-2-generic #14~rbd2
[ 24.787217] -------------------------------------------------------
[ 24.787219] gdm/2640 is trying to acquire lock:
[ 24.787221] (&mount_crypt_stat->global_auth_tok_list_mutex){+.+.+.}, at: [<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[ 24.787231]
[ 24.787231] but task is already holding lock:
[ 24.787233] (&crypt_stat->keysig_list_mutex){+.+.+.}, at: [<ffffffff81217728>] ecryptfs_generate_key_packet_set+0x58/0x2b0
[ 24.787239]
[ 24.787239] which lock already depends on the new lock.
[ 24.787240]
[ 24.787241]
[ 24.787242] the existing dependency chain (in reverse order) is:
[ 24.787244]
[ 24.787244] -> #1 (&crypt_stat->keysig_list_mutex){+.+.+.}:
[ 24.787248] [<ffffffff8108c897>] check_prev_add+0x2a7/0x370
[ 24.787253] [<ffffffff8108cfc1>] validate_chain+0x661/0x750
[ 24.787256] [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
[ 24.787259] [<ffffffff8108d585>] lock_acquire+0xa5/0x150
[ 24.787263] [<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
[ 24.787267] [<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
[ 24.787270] [<ffffffff8121526a>] ecryptfs_add_keysig+0x5a/0xb0
[ 24.787273] [<ffffffff81213299>] ecryptfs_copy_mount_wide_sigs_to_inode_sigs+0x59/0xb0
[ 24.787276] [<ffffffff81214b06>] ecryptfs_new_file_context+0xa6/0x1a0
[ 24.787280] [<ffffffff8120e42a>] ecryptfs_initialize_file+0x4a/0x140
[ 24.787283] [<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
[ 24.787286] [<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
[ 24.787290] [<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
[ 24.787293] [<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
[ 24.787296] [<ffffffff8112d8d9>] do_sys_open+0x69/0x140
[ 24.787300] [<ffffffff8112d9f0>] sys_open+0x20/0x30
[ 24.787303] [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
[ 24.787308] [<ffffffffffffffff>] 0xffffffffffffffff
[ 24.787326]
[ 24.787326] -> #0 (&mount_crypt_stat->global_auth_tok_list_mutex){+.+.+.}:
[ 24.787330] [<ffffffff8108c675>] check_prev_add+0x85/0x370
[ 24.787333] [<ffffffff8108cfc1>] validate_chain+0x661/0x750
[ 24.787337] [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
[ 24.787340] [<ffffffff8108d585>] lock_acquire+0xa5/0x150
[ 24.787343] [<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
[ 24.787346] [<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
[ 24.787349] [<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[ 24.787352] [<ffffffff812177d5>] ecryptfs_generate_key_packet_set+0x105/0x2b0
[ 24.787356] [<ffffffff81212f49>] ecryptfs_write_headers_virt+0xc9/0x120
[ 24.787359] [<ffffffff8121306d>] ecryptfs_write_metadata+0xcd/0x200
[ 24.787362] [<ffffffff8120e44b>] ecryptfs_initialize_file+0x6b/0x140
[ 24.787365] [<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
[ 24.787368] [<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
[ 24.787371] [<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
[ 24.787374] [<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
[ 24.787377] [<ffffffff8112d8d9>] do_sys_open+0x69/0x140
[ 24.787380] [<ffffffff8112d9f0>] sys_open+0x20/0x30
[ 24.787383] [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
[ 24.787386] [<ffffffffffffffff>] 0xffffffffffffffff
[ 24.787390]
[ 24.787390] other info that might help us debug this:
[ 24.787391]
[ 24.787393] 2 locks held by gdm/2640:
[ 24.787394] #0: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8113cb8b>] do_filp_open+0x3cb/0xae0
[ 24.787401] #1: (&crypt_stat->keysig_list_mutex){+.+.+.}, at: [<ffffffff81217728>] ecryptfs_generate_key_packet_set+0x58/0x2b0
[ 24.787407]
[ 24.787407] stack backtrace:
[ 24.787410] Pid: 2640, comm: gdm Tainted: G C 2.6.31-2-generic #14~rbd2
[ 24.787412] Call Trace:
[ 24.787416] [<ffffffff8108b988>] print_circular_bug_tail+0xa8/0xf0
[ 24.787419] [<ffffffff8108c675>] check_prev_add+0x85/0x370
[ 24.787423] [<ffffffff81094912>] ? __module_text_address+0x12/0x60
[ 24.787426] [<ffffffff8108cfc1>] validate_chain+0x661/0x750
[ 24.787429] [<ffffffff81017275>] ? print_context_stack+0x85/0x140
[ 24.787433] [<ffffffff81089c68>] ? find_usage_backwards+0x38/0x160
[ 24.787436] [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
[ 24.787439] [<ffffffff8108d585>] lock_acquire+0xa5/0x150
[ 24.787442] [<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[ 24.787445] [<ffffffff8108b0b0>] ? check_usage_backwards+0x0/0xb0
[ 24.787448] [<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
[ 24.787451] [<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[ 24.787454] [<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[ 24.787458] [<ffffffff8108c02c>] ? mark_held_locks+0x6c/0xa0
[ 24.787461] [<ffffffff81125b0d>] ? kmem_cache_alloc+0xfd/0x1a0
[ 24.787464] [<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
[ 24.787467] [<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
[ 24.787470] [<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[ 24.787473] [<ffffffff812177d5>] ecryptfs_generate_key_packet_set+0x105/0x2b0
[ 24.787476] [<ffffffff81212f49>] ecryptfs_write_headers_virt+0xc9/0x120
[ 24.787479] [<ffffffff8121306d>] ecryptfs_write_metadata+0xcd/0x200
[ 24.787482] [<ffffffff81210240>] ? ecryptfs_init_persistent_file+0x60/0xe0
[ 24.787485] [<ffffffff8120e44b>] ecryptfs_initialize_file+0x6b/0x140
[ 24.787488] [<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
[ 24.787491] [<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
[ 24.787494] [<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
[ 24.787496] [<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
[ 24.787502] [<ffffffff8129a93e>] ? _raw_spin_unlock+0x5e/0xb0
[ 24.787505] [<ffffffff8155410b>] ? _spin_unlock+0x2b/0x40
[ 24.787508] [<ffffffff81139e9b>] ? getname+0x3b/0x240
[ 24.787511] [<ffffffff81148a5a>] ? alloc_fd+0xfa/0x140
[ 24.787515] [<ffffffff8112d8d9>] do_sys_open+0x69/0x140
[ 24.787517] [<ffffffff81553b8f>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 24.787520] [<ffffffff8112d9f0>] sys_open+0x20/0x30
[ 24.787523] [<ffffffff81013132>] system_call_fastpath+0x16/0x1b


2009-07-01 22:48:27

by Roland Dreier

[permalink] [raw]
Subject: [PATCH] eCryptfs: Fix lockdep-reported AB-BA mutex issue

Lockdep reports the following valid-looking possible AB-BA deadlock with
global_auth_tok_list_mutex and keysig_list_mutex:

ecryptfs_new_file_context() ->
ecryptfs_copy_mount_wide_sigs_to_inode_sigs() ->
mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);
-> ecryptfs_add_keysig() ->
mutex_lock(&crypt_stat->keysig_list_mutex);

vs

ecryptfs_generate_key_packet_set() ->
mutex_lock(&crypt_stat->keysig_list_mutex);
-> ecryptfs_find_global_auth_tok_for_sig() ->
mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);

ie the two mutexes are taken in opposite orders in the two different
code paths. I'm not sure if this is a real bug where two threads could
actually hit the two paths in parallel and deadlock, but it at least
makes lockdep impossible to use with ecryptfs since this report triggers
every time and disables future lockdep reporting.

Since ecryptfs_add_keysig() is called only from the single callsite in
ecryptfs_copy_mount_wide_sigs_to_inode_sigs(), the simplest fix seems to
be to move the lock of keysig_list_mutex back up outside of the where
global_auth_tok_list_mutex is taken. This patch does that, and fixes
the lockdep report on my system (and ecryptfs still works OK).

The full output of lockdep fixed by this patch is:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-2-generic #14~rbd2
-------------------------------------------------------
gdm/2640 is trying to acquire lock:
(&mount_crypt_stat->global_auth_tok_list_mutex){+.+.+.}, at: [<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90

but task is already holding lock:
(&crypt_stat->keysig_list_mutex){+.+.+.}, at: [<ffffffff81217728>] ecryptfs_generate_key_packet_set+0x58/0x2b0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&crypt_stat->keysig_list_mutex){+.+.+.}:
[<ffffffff8108c897>] check_prev_add+0x2a7/0x370
[<ffffffff8108cfc1>] validate_chain+0x661/0x750
[<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
[<ffffffff8108d585>] lock_acquire+0xa5/0x150
[<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
[<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
[<ffffffff8121526a>] ecryptfs_add_keysig+0x5a/0xb0
[<ffffffff81213299>] ecryptfs_copy_mount_wide_sigs_to_inode_sigs+0x59/0xb0
[<ffffffff81214b06>] ecryptfs_new_file_context+0xa6/0x1a0
[<ffffffff8120e42a>] ecryptfs_initialize_file+0x4a/0x140
[<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
[<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
[<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
[<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
[<ffffffff8112d8d9>] do_sys_open+0x69/0x140
[<ffffffff8112d9f0>] sys_open+0x20/0x30
[<ffffffff81013132>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

-> #0 (&mount_crypt_stat->global_auth_tok_list_mutex){+.+.+.}:
[<ffffffff8108c675>] check_prev_add+0x85/0x370
[<ffffffff8108cfc1>] validate_chain+0x661/0x750
[<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
[<ffffffff8108d585>] lock_acquire+0xa5/0x150
[<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
[<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
[<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[<ffffffff812177d5>] ecryptfs_generate_key_packet_set+0x105/0x2b0
[<ffffffff81212f49>] ecryptfs_write_headers_virt+0xc9/0x120
[<ffffffff8121306d>] ecryptfs_write_metadata+0xcd/0x200
[<ffffffff8120e44b>] ecryptfs_initialize_file+0x6b/0x140
[<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
[<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
[<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
[<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
[<ffffffff8112d8d9>] do_sys_open+0x69/0x140
[<ffffffff8112d9f0>] sys_open+0x20/0x30
[<ffffffff81013132>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

2 locks held by gdm/2640:
#0: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8113cb8b>] do_filp_open+0x3cb/0xae0
#1: (&crypt_stat->keysig_list_mutex){+.+.+.}, at: [<ffffffff81217728>] ecryptfs_generate_key_packet_set+0x58/0x2b0

stack backtrace:
Pid: 2640, comm: gdm Tainted: G C 2.6.31-2-generic #14~rbd2
Call Trace:
[<ffffffff8108b988>] print_circular_bug_tail+0xa8/0xf0
[<ffffffff8108c675>] check_prev_add+0x85/0x370
[<ffffffff81094912>] ? __module_text_address+0x12/0x60
[<ffffffff8108cfc1>] validate_chain+0x661/0x750
[<ffffffff81017275>] ? print_context_stack+0x85/0x140
[<ffffffff81089c68>] ? find_usage_backwards+0x38/0x160
[<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
[<ffffffff8108d585>] lock_acquire+0xa5/0x150
[<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[<ffffffff8108b0b0>] ? check_usage_backwards+0x0/0xb0
[<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
[<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[<ffffffff8108c02c>] ? mark_held_locks+0x6c/0xa0
[<ffffffff81125b0d>] ? kmem_cache_alloc+0xfd/0x1a0
[<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
[<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
[<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
[<ffffffff812177d5>] ecryptfs_generate_key_packet_set+0x105/0x2b0
[<ffffffff81212f49>] ecryptfs_write_headers_virt+0xc9/0x120
[<ffffffff8121306d>] ecryptfs_write_metadata+0xcd/0x200
[<ffffffff81210240>] ? ecryptfs_init_persistent_file+0x60/0xe0
[<ffffffff8120e44b>] ecryptfs_initialize_file+0x6b/0x140
[<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
[<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
[<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
[<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
[<ffffffff8129a93e>] ? _raw_spin_unlock+0x5e/0xb0
[<ffffffff8155410b>] ? _spin_unlock+0x2b/0x40
[<ffffffff81139e9b>] ? getname+0x3b/0x240
[<ffffffff81148a5a>] ? alloc_fd+0xfa/0x140
[<ffffffff8112d8d9>] do_sys_open+0x69/0x140
[<ffffffff81553b8f>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8112d9f0>] sys_open+0x20/0x30
[<ffffffff81013132>] system_call_fastpath+0x16/0x1b

Signed-off-by: Roland Dreier <[email protected]>
---
fs/ecryptfs/crypto.c | 8 +++++---
fs/ecryptfs/keystore.c | 11 ++++-------
2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index b91851f..1920a9a 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -925,7 +925,9 @@ static int ecryptfs_copy_mount_wide_sigs_to_inode_sigs(
struct ecryptfs_global_auth_tok *global_auth_tok;
int rc = 0;

+ mutex_lock(&crypt_stat->keysig_list_mutex);
mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);
+
list_for_each_entry(global_auth_tok,
&mount_crypt_stat->global_auth_tok_list,
mount_crypt_stat_list) {
@@ -934,13 +936,13 @@ static int ecryptfs_copy_mount_wide_sigs_to_inode_sigs(
rc = ecryptfs_add_keysig(crypt_stat, global_auth_tok->sig);
if (rc) {
printk(KERN_ERR "Error adding keysig; rc = [%d]\n", rc);
- mutex_unlock(
- &mount_crypt_stat->global_auth_tok_list_mutex);
goto out;
}
}
- mutex_unlock(&mount_crypt_stat->global_auth_tok_list_mutex);
+
out:
+ mutex_unlock(&mount_crypt_stat->global_auth_tok_list_mutex);
+ mutex_unlock(&crypt_stat->keysig_list_mutex);
return rc;
}

diff --git a/fs/ecryptfs/keystore.c b/fs/ecryptfs/keystore.c
index af737bb..e85ca8e 100644
--- a/fs/ecryptfs/keystore.c
+++ b/fs/ecryptfs/keystore.c
@@ -2353,21 +2353,18 @@ struct kmem_cache *ecryptfs_key_sig_cache;
int ecryptfs_add_keysig(struct ecryptfs_crypt_stat *crypt_stat, char *sig)
{
struct ecryptfs_key_sig *new_key_sig;
- int rc = 0;

new_key_sig = kmem_cache_alloc(ecryptfs_key_sig_cache, GFP_KERNEL);
if (!new_key_sig) {
- rc = -ENOMEM;
printk(KERN_ERR
"Error allocating from ecryptfs_key_sig_cache\n");
- goto out;
+ return -ENOMEM;
}
memcpy(new_key_sig->keysig, sig, ECRYPTFS_SIG_SIZE_HEX);
- mutex_lock(&crypt_stat->keysig_list_mutex);
+ /* Caller must hold keysig_list_mutex */
list_add(&new_key_sig->crypt_stat_list, &crypt_stat->keysig_list);
- mutex_unlock(&crypt_stat->keysig_list_mutex);
-out:
- return rc;
+
+ return 0;
}

struct kmem_cache *ecryptfs_global_auth_tok_cache;

2009-07-01 23:08:21

by Roland Dreier

[permalink] [raw]
Subject: Another lockdep issue reported with ecryptfs

Same setup as before: ecryptfs home directory on Ubuntu 9.10, with the
lockdep fix for ecryptfs I just posted (so lockdep is not being
instantly disabled on login); I got the following on a system with an
idle firefox-3.5 window:

=============================================
[ INFO: possible recursive locking detected ]
2.6.31-2-generic #14~rbd3
---------------------------------------------
firefox-3.5/4162 is trying to acquire lock:
(&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0

but task is already holding lock:
(&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0

other info that might help us debug this:
3 locks held by firefox-3.5/4162:
#0: (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
#1: (&sb->s_type->i_mutex_key#11/1){+.+.+.}, at: [<ffffffff81139d5a>] lock_rename+0x6a/0xf0
#2: (&sb->s_type->i_mutex_key#11/2){+.+.+.}, at: [<ffffffff81139d6f>] lock_rename+0x7f/0xf0

stack backtrace:
Pid: 4162, comm: firefox-3.5 Tainted: G C 2.6.31-2-generic #14~rbd3
Call Trace:
[<ffffffff8108ae74>] print_deadlock_bug+0xf4/0x100
[<ffffffff8108ce26>] validate_chain+0x4c6/0x750
[<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
[<ffffffff8108d585>] lock_acquire+0xa5/0x150
[<ffffffff81139d31>] ? lock_rename+0x41/0xf0
[<ffffffff815526ad>] __mutex_lock_common+0x4d/0x3d0
[<ffffffff81139d31>] ? lock_rename+0x41/0xf0
[<ffffffff81139d31>] ? lock_rename+0x41/0xf0
[<ffffffff8120eaf9>] ? ecryptfs_rename+0x99/0x170
[<ffffffff81552b36>] mutex_lock_nested+0x46/0x60
[<ffffffff81139d31>] lock_rename+0x41/0xf0
[<ffffffff8120eb2a>] ecryptfs_rename+0xca/0x170
[<ffffffff81139a9e>] vfs_rename_dir+0x13e/0x160
[<ffffffff8113ac7e>] vfs_rename+0xee/0x290
[<ffffffff8113c212>] ? __lookup_hash+0x102/0x160
[<ffffffff8113d512>] sys_renameat+0x252/0x280
[<ffffffff81133eb4>] ? cp_new_stat+0xe4/0x100
[<ffffffff8101316a>] ? sysret_check+0x2e/0x69
[<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
[<ffffffff8113d55b>] sys_rename+0x1b/0x20
[<ffffffff81013132>] system_call_fastpath+0x16/0x1b

haven't tried to debug this yet.

2009-07-02 00:38:40

by Roland Dreier

[permalink] [raw]
Subject: Re: Another lockdep issue reported with ecryptfs


> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.31-2-generic #14~rbd3
> ---------------------------------------------
> firefox-3.5/4162 is trying to acquire lock:
> (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
>
> but task is already holding lock:
> (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
>
> other info that might help us debug this:
> 3 locks held by firefox-3.5/4162:
> #0: (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
> #1: (&sb->s_type->i_mutex_key#11/1){+.+.+.}, at: [<ffffffff81139d5a>] lock_rename+0x6a/0xf0
> #2: (&sb->s_type->i_mutex_key#11/2){+.+.+.}, at: [<ffffffff81139d6f>] lock_rename+0x7f/0xf0
>
> stack backtrace:
> Pid: 4162, comm: firefox-3.5 Tainted: G C 2.6.31-2-generic #14~rbd3
> Call Trace:
> [<ffffffff8108ae74>] print_deadlock_bug+0xf4/0x100
> [<ffffffff8108ce26>] validate_chain+0x4c6/0x750
> [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
> [<ffffffff8108d585>] lock_acquire+0xa5/0x150
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff815526ad>] __mutex_lock_common+0x4d/0x3d0
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff8120eaf9>] ? ecryptfs_rename+0x99/0x170
> [<ffffffff81552b36>] mutex_lock_nested+0x46/0x60
> [<ffffffff81139d31>] lock_rename+0x41/0xf0
> [<ffffffff8120eb2a>] ecryptfs_rename+0xca/0x170
> [<ffffffff81139a9e>] vfs_rename_dir+0x13e/0x160
> [<ffffffff8113ac7e>] vfs_rename+0xee/0x290
> [<ffffffff8113c212>] ? __lookup_hash+0x102/0x160
> [<ffffffff8113d512>] sys_renameat+0x252/0x280
> [<ffffffff81133eb4>] ? cp_new_stat+0xe4/0x100
> [<ffffffff8101316a>] ? sysret_check+0x2e/0x69
> [<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
> [<ffffffff8113d55b>] sys_rename+0x1b/0x20
> [<ffffffff81013132>] system_call_fastpath+0x16/0x1b

The trace above is totally reproducible by doing a cross-directory
rename on an ecryptfs directory.

The issue seems to be that sys_renameat() does lock_rename() then calls
into the filesystem; if the filesystem is ecryptfs, then
ecryptfs_rename() again does lock_rename() on the lower filesystem, and
lockdep can't tell that the two s_vfs_rename_mutexes are different. It
seems an annotation like the following is sufficient to fix this (it
does get rid of the lockdep trace in my simple tests); however I would
like to make sure I'm not misunderstanding the locking, hence the CC
list...

Thanks,
Roland


fs/super.c | 1 +
include/linux/fs.h | 1 +
2 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index 2761d3e..0a45b5a 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -98,6 +98,7 @@ static struct super_block *alloc_super(struct file_system_type *type)
s->s_count = S_BIAS;
atomic_set(&s->s_active, 1);
mutex_init(&s->s_vfs_rename_mutex);
+ lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
mutex_init(&s->s_dquot.dqio_mutex);
mutex_init(&s->s_dquot.dqonoff_mutex);
init_rwsem(&s->s_dquot.dqptr_sem);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0872372..feaf9e0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1750,6 +1750,7 @@ struct file_system_type {

struct lock_class_key s_lock_key;
struct lock_class_key s_umount_key;
+ struct lock_class_key s_vfs_rename_key;

struct lock_class_key i_lock_key;
struct lock_class_key i_mutex_key;

2009-07-02 00:41:17

by Roland Dreier

[permalink] [raw]
Subject: Yet another lockdep issue reported with ecryptfs

And yet another lockdep report from ecryptfs, related to lower_file_mutex:

=================================
[ INFO: inconsistent lock state ]
2.6.31-2-generic #14~rbd3
---------------------------------
inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
kswapd0/323 [HC0[0]:SC0[0]:HE1:SE1] takes:
(&inode_info->lower_file_mutex){+.+.?.}, at: [<ffffffff81210d34>] ecryptfs_destroy_inode+0x34/0x100
{RECLAIM_FS-ON-W} state was registered at:
[<ffffffff8108c02c>] mark_held_locks+0x6c/0xa0
[<ffffffff8108c10f>] lockdep_trace_alloc+0xaf/0xe0
[<ffffffff81125a51>] kmem_cache_alloc+0x41/0x1a0
[<ffffffff8113117a>] get_empty_filp+0x7a/0x1a0
[<ffffffff8112dd46>] dentry_open+0x36/0xc0
[<ffffffff8121a36c>] ecryptfs_privileged_open+0x5c/0x2e0
[<ffffffff81210283>] ecryptfs_init_persistent_file+0xa3/0xe0
[<ffffffff8120e838>] ecryptfs_lookup_and_interpose_lower+0x278/0x380
[<ffffffff8120f97a>] ecryptfs_lookup+0x12a/0x250
[<ffffffff8113930a>] real_lookup+0xea/0x160
[<ffffffff8113afc8>] do_lookup+0xb8/0xf0
[<ffffffff8113b518>] __link_path_walk+0x518/0x870
[<ffffffff8113bd9c>] path_walk+0x5c/0xc0
[<ffffffff8113be5b>] do_path_lookup+0x5b/0xa0
[<ffffffff8113bfe7>] user_path_at+0x57/0xa0
[<ffffffff811340dc>] vfs_fstatat+0x3c/0x80
[<ffffffff8113424b>] vfs_stat+0x1b/0x20
[<ffffffff81134274>] sys_newstat+0x24/0x50
[<ffffffff81013132>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff
irq event stamp: 7811
hardirqs last enabled at (7811): [<ffffffff810c037f>] call_rcu+0x5f/0x90
hardirqs last disabled at (7810): [<ffffffff810c0353>] call_rcu+0x33/0x90
softirqs last enabled at (3764): [<ffffffff810631da>] __do_softirq+0x14a/0x220
softirqs last disabled at (3751): [<ffffffff8101440c>] call_softirq+0x1c/0x30

other info that might help us debug this:
2 locks held by kswapd0/323:
#0: (shrinker_rwsem){++++..}, at: [<ffffffff810f67ed>] shrink_slab+0x3d/0x190
#1: (&type->s_umount_key#35){.+.+..}, at: [<ffffffff811429a1>] prune_dcache+0xd1/0x1b0

stack backtrace:
Pid: 323, comm: kswapd0 Tainted: G C 2.6.31-2-generic #14~rbd3
Call Trace:
[<ffffffff8108ad6c>] print_usage_bug+0x18c/0x1a0
[<ffffffff8108aff0>] ? check_usage_forwards+0x0/0xc0
[<ffffffff8108bac2>] mark_lock_irq+0xf2/0x280
[<ffffffff8108bd87>] mark_lock+0x137/0x1d0
[<ffffffff81164710>] ? fsnotify_clear_marks_by_inode+0x30/0xf0
[<ffffffff8108bee6>] mark_irqflags+0xc6/0x1a0
[<ffffffff8108d337>] __lock_acquire+0x287/0x430
[<ffffffff8108d585>] lock_acquire+0xa5/0x150
[<ffffffff81210d34>] ? ecryptfs_destroy_inode+0x34/0x100
[<ffffffff8108d2e7>] ? __lock_acquire+0x237/0x430
[<ffffffff815526ad>] __mutex_lock_common+0x4d/0x3d0
[<ffffffff81210d34>] ? ecryptfs_destroy_inode+0x34/0x100
[<ffffffff81164710>] ? fsnotify_clear_marks_by_inode+0x30/0xf0
[<ffffffff81210d34>] ? ecryptfs_destroy_inode+0x34/0x100
[<ffffffff8129a91e>] ? _raw_spin_unlock+0x5e/0xb0
[<ffffffff81552b36>] mutex_lock_nested+0x46/0x60
[<ffffffff81210d34>] ecryptfs_destroy_inode+0x34/0x100
[<ffffffff81145d27>] destroy_inode+0x87/0xd0
[<ffffffff81146b4c>] generic_delete_inode+0x12c/0x1a0
[<ffffffff81145832>] iput+0x62/0x70
[<ffffffff811423c8>] dentry_iput+0x98/0x110
[<ffffffff81142550>] d_kill+0x50/0x80
[<ffffffff81142623>] prune_one_dentry+0xa3/0xc0
[<ffffffff811428b1>] __shrink_dcache_sb+0x271/0x290
[<ffffffff811429d9>] prune_dcache+0x109/0x1b0
[<ffffffff81142abf>] shrink_dcache_memory+0x3f/0x50
[<ffffffff810f68dd>] shrink_slab+0x12d/0x190
[<ffffffff810f9377>] balance_pgdat+0x4d7/0x640
[<ffffffff8104c4c0>] ? finish_task_switch+0x40/0x150
[<ffffffff810f63c0>] ? isolate_pages_global+0x0/0x60
[<ffffffff810f95f7>] kswapd+0x117/0x170
[<ffffffff810777a0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff810f94e0>] ? kswapd+0x0/0x170
[<ffffffff810773be>] kthread+0x9e/0xb0
[<ffffffff8101430a>] child_rip+0xa/0x20
[<ffffffff81013c90>] ? restore_args+0x0/0x30
[<ffffffff81077320>] ? kthread+0x0/0xb0
[<ffffffff81014300>] ? child_rip+0x0/0x20

2009-07-02 03:58:23

by Roland Dreier

[permalink] [raw]
Subject: Re: Yet another lockdep issue reported with ecryptfs


> And yet another lockdep report from ecryptfs, related to lower_file_mutex:
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.31-2-generic #14~rbd3
> ---------------------------------
> inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> kswapd0/323 [HC0[0]:SC0[0]:HE1:SE1] takes:
> (&inode_info->lower_file_mutex){+.+.?.}, at: [<ffffffff81210d34>] ecryptfs_destroy_inode+0x34/0x100
> {RECLAIM_FS-ON-W} state was registered at:
> [<ffffffff8108c02c>] mark_held_locks+0x6c/0xa0
> [<ffffffff8108c10f>] lockdep_trace_alloc+0xaf/0xe0
> [<ffffffff81125a51>] kmem_cache_alloc+0x41/0x1a0
> [<ffffffff8113117a>] get_empty_filp+0x7a/0x1a0
> [<ffffffff8112dd46>] dentry_open+0x36/0xc0
> [<ffffffff8121a36c>] ecryptfs_privileged_open+0x5c/0x2e0
> [<ffffffff81210283>] ecryptfs_init_persistent_file+0xa3/0xe0
> [<ffffffff8120e838>] ecryptfs_lookup_and_interpose_lower+0x278/0x380
> [<ffffffff8120f97a>] ecryptfs_lookup+0x12a/0x250
> [<ffffffff8113930a>] real_lookup+0xea/0x160
> [<ffffffff8113afc8>] do_lookup+0xb8/0xf0
> [<ffffffff8113b518>] __link_path_walk+0x518/0x870
> [<ffffffff8113bd9c>] path_walk+0x5c/0xc0
> [<ffffffff8113be5b>] do_path_lookup+0x5b/0xa0
> [<ffffffff8113bfe7>] user_path_at+0x57/0xa0
> [<ffffffff811340dc>] vfs_fstatat+0x3c/0x80
> [<ffffffff8113424b>] vfs_stat+0x1b/0x20
> [<ffffffff81134274>] sys_newstat+0x24/0x50
> [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> irq event stamp: 7811
> hardirqs last enabled at (7811): [<ffffffff810c037f>] call_rcu+0x5f/0x90
> hardirqs last disabled at (7810): [<ffffffff810c0353>] call_rcu+0x33/0x90
> softirqs last enabled at (3764): [<ffffffff810631da>] __do_softirq+0x14a/0x220
> softirqs last disabled at (3751): [<ffffffff8101440c>] call_softirq+0x1c/0x30
>
> other info that might help us debug this:
> 2 locks held by kswapd0/323:
> #0: (shrinker_rwsem){++++..}, at: [<ffffffff810f67ed>] shrink_slab+0x3d/0x190
> #1: (&type->s_umount_key#35){.+.+..}, at: [<ffffffff811429a1>] prune_dcache+0xd1/0x1b0
>
> stack backtrace:
> Pid: 323, comm: kswapd0 Tainted: G C 2.6.31-2-generic #14~rbd3
> Call Trace:
> [<ffffffff8108ad6c>] print_usage_bug+0x18c/0x1a0
> [<ffffffff8108aff0>] ? check_usage_forwards+0x0/0xc0
> [<ffffffff8108bac2>] mark_lock_irq+0xf2/0x280
> [<ffffffff8108bd87>] mark_lock+0x137/0x1d0
> [<ffffffff81164710>] ? fsnotify_clear_marks_by_inode+0x30/0xf0
> [<ffffffff8108bee6>] mark_irqflags+0xc6/0x1a0
> [<ffffffff8108d337>] __lock_acquire+0x287/0x430
> [<ffffffff8108d585>] lock_acquire+0xa5/0x150
> [<ffffffff81210d34>] ? ecryptfs_destroy_inode+0x34/0x100
> [<ffffffff8108d2e7>] ? __lock_acquire+0x237/0x430
> [<ffffffff815526ad>] __mutex_lock_common+0x4d/0x3d0
> [<ffffffff81210d34>] ? ecryptfs_destroy_inode+0x34/0x100
> [<ffffffff81164710>] ? fsnotify_clear_marks_by_inode+0x30/0xf0
> [<ffffffff81210d34>] ? ecryptfs_destroy_inode+0x34/0x100
> [<ffffffff8129a91e>] ? _raw_spin_unlock+0x5e/0xb0
> [<ffffffff81552b36>] mutex_lock_nested+0x46/0x60
> [<ffffffff81210d34>] ecryptfs_destroy_inode+0x34/0x100
> [<ffffffff81145d27>] destroy_inode+0x87/0xd0
> [<ffffffff81146b4c>] generic_delete_inode+0x12c/0x1a0
> [<ffffffff81145832>] iput+0x62/0x70
> [<ffffffff811423c8>] dentry_iput+0x98/0x110
> [<ffffffff81142550>] d_kill+0x50/0x80
> [<ffffffff81142623>] prune_one_dentry+0xa3/0xc0
> [<ffffffff811428b1>] __shrink_dcache_sb+0x271/0x290
> [<ffffffff811429d9>] prune_dcache+0x109/0x1b0
> [<ffffffff81142abf>] shrink_dcache_memory+0x3f/0x50
> [<ffffffff810f68dd>] shrink_slab+0x12d/0x190
> [<ffffffff810f9377>] balance_pgdat+0x4d7/0x640
> [<ffffffff8104c4c0>] ? finish_task_switch+0x40/0x150
> [<ffffffff810f63c0>] ? isolate_pages_global+0x0/0x60
> [<ffffffff810f95f7>] kswapd+0x117/0x170
> [<ffffffff810777a0>] ? autoremove_wake_function+0x0/0x40
> [<ffffffff810f94e0>] ? kswapd+0x0/0x170
> [<ffffffff810773be>] kthread+0x9e/0xb0
> [<ffffffff8101430a>] child_rip+0xa/0x20
> [<ffffffff81013c90>] ? restore_args+0x0/0x30
> [<ffffffff81077320>] ? kthread+0x0/0xb0
> [<ffffffff81014300>] ? child_rip+0x0/0x20

The issue here seems to be that in normal use, ecryptfs's
lower_file_mutex is held across GFP_KERNEL allocations, etc. So it is
not safe to take it in reclaim context (or else we may end up in a
recursive deadlock under memory pressure). However
ecryptfs_destroy_inode() *does* take the inode's lower_file_mutex, which
causes this warning.

However as far as I can tell, there is no reason to take the mutex while
destroying the inode; if there were any chance that any other context
would be blocked out by holding the lower_file_mutex in the destroy
inode function, then that would be a use-after-free bug, since right
after dropping the mutex, ecryptfs_destroy_inode() does

kmem_cache_free(ecryptfs_inode_info_cache, inode_info);

Similar reasoning applies to ecryptfs_destroy_crypt_stat() taking
keysig_list_mutex, since there right after the lock is dropped, we do

memset(crypt_stat, 0, sizeof(struct ecryptfs_crypt_stat));

which obviously would cause problems if another thread tried to
mutex_lock() a member of crypt_stat.

If this makes sense I will resend the patch with a proper changelog and
signed-off-by line.

fs/ecryptfs/crypto.c | 2 --
fs/ecryptfs/super.c | 2 --
2 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index b91851f..4610fd6 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -245,13 +245,11 @@ void ecryptfs_destroy_crypt_stat(struct ecryptfs_crypt_stat *crypt_stat)
crypto_free_blkcipher(crypt_stat->tfm);
if (crypt_stat->hash_tfm)
crypto_free_hash(crypt_stat->hash_tfm);
- mutex_lock(&crypt_stat->keysig_list_mutex);
list_for_each_entry_safe(key_sig, key_sig_tmp,
&crypt_stat->keysig_list, crypt_stat_list) {
list_del(&key_sig->crypt_stat_list);
kmem_cache_free(ecryptfs_key_sig_cache, key_sig);
}
- mutex_unlock(&crypt_stat->keysig_list_mutex);
memset(crypt_stat, 0, sizeof(struct ecryptfs_crypt_stat));
}

diff --git a/fs/ecryptfs/super.c b/fs/ecryptfs/super.c
index 12d6496..b15a43a 100644
--- a/fs/ecryptfs/super.c
+++ b/fs/ecryptfs/super.c
@@ -77,7 +77,6 @@ static void ecryptfs_destroy_inode(struct inode *inode)
struct ecryptfs_inode_info *inode_info;

inode_info = ecryptfs_inode_to_private(inode);
- mutex_lock(&inode_info->lower_file_mutex);
if (inode_info->lower_file) {
struct dentry *lower_dentry =
inode_info->lower_file->f_dentry;
@@ -89,7 +88,6 @@ static void ecryptfs_destroy_inode(struct inode *inode)
d_drop(lower_dentry);
}
}
- mutex_unlock(&inode_info->lower_file_mutex);
ecryptfs_destroy_crypt_stat(&inode_info->crypt_stat);
kmem_cache_free(ecryptfs_inode_info_cache, inode_info);
}

2009-07-02 19:21:18

by Tyler Hicks

[permalink] [raw]
Subject: Re: [PATCH] eCryptfs: Fix lockdep-reported AB-BA mutex issue

On 07/01/2009 05:48 PM, Roland Dreier wrote:
> Lockdep reports the following valid-looking possible AB-BA deadlock with
> global_auth_tok_list_mutex and keysig_list_mutex:
>
> ecryptfs_new_file_context() ->
> ecryptfs_copy_mount_wide_sigs_to_inode_sigs() ->
> mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);
> -> ecryptfs_add_keysig() ->
> mutex_lock(&crypt_stat->keysig_list_mutex);
>
> vs
>
> ecryptfs_generate_key_packet_set() ->
> mutex_lock(&crypt_stat->keysig_list_mutex);
> -> ecryptfs_find_global_auth_tok_for_sig() ->
> mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);
>
> ie the two mutexes are taken in opposite orders in the two different
> code paths. I'm not sure if this is a real bug where two threads could
> actually hit the two paths in parallel and deadlock, but it at least
> makes lockdep impossible to use with ecryptfs since this report triggers
> every time and disables future lockdep reporting.

After looking at the code paths from a slightly higher level, I don't
think deadlock would ever occur from this issue. But, there's no need
in keeping it the way it is, especially with it disabling lockdep reporting.

>
> Since ecryptfs_add_keysig() is called only from the single callsite in
> ecryptfs_copy_mount_wide_sigs_to_inode_sigs(), the simplest fix seems to
> be to move the lock of keysig_list_mutex back up outside of the where
> global_auth_tok_list_mutex is taken. This patch does that, and fixes
> the lockdep report on my system (and ecryptfs still works OK).

Looks like a reasonable approach.

>
> The full output of lockdep fixed by this patch is:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.31-2-generic #14~rbd2
> -------------------------------------------------------
> gdm/2640 is trying to acquire lock:
> (&mount_crypt_stat->global_auth_tok_list_mutex){+.+.+.}, at: [<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
>
> but task is already holding lock:
> (&crypt_stat->keysig_list_mutex){+.+.+.}, at: [<ffffffff81217728>] ecryptfs_generate_key_packet_set+0x58/0x2b0
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (&crypt_stat->keysig_list_mutex){+.+.+.}:
> [<ffffffff8108c897>] check_prev_add+0x2a7/0x370
> [<ffffffff8108cfc1>] validate_chain+0x661/0x750
> [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
> [<ffffffff8108d585>] lock_acquire+0xa5/0x150
> [<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
> [<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
> [<ffffffff8121526a>] ecryptfs_add_keysig+0x5a/0xb0
> [<ffffffff81213299>] ecryptfs_copy_mount_wide_sigs_to_inode_sigs+0x59/0xb0
> [<ffffffff81214b06>] ecryptfs_new_file_context+0xa6/0x1a0
> [<ffffffff8120e42a>] ecryptfs_initialize_file+0x4a/0x140
> [<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
> [<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
> [<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
> [<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
> [<ffffffff8112d8d9>] do_sys_open+0x69/0x140
> [<ffffffff8112d9f0>] sys_open+0x20/0x30
> [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> -> #0 (&mount_crypt_stat->global_auth_tok_list_mutex){+.+.+.}:
> [<ffffffff8108c675>] check_prev_add+0x85/0x370
> [<ffffffff8108cfc1>] validate_chain+0x661/0x750
> [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
> [<ffffffff8108d585>] lock_acquire+0xa5/0x150
> [<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
> [<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
> [<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
> [<ffffffff812177d5>] ecryptfs_generate_key_packet_set+0x105/0x2b0
> [<ffffffff81212f49>] ecryptfs_write_headers_virt+0xc9/0x120
> [<ffffffff8121306d>] ecryptfs_write_metadata+0xcd/0x200
> [<ffffffff8120e44b>] ecryptfs_initialize_file+0x6b/0x140
> [<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
> [<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
> [<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
> [<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
> [<ffffffff8112d8d9>] do_sys_open+0x69/0x140
> [<ffffffff8112d9f0>] sys_open+0x20/0x30
> [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> other info that might help us debug this:
>
> 2 locks held by gdm/2640:
> #0: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [<ffffffff8113cb8b>] do_filp_open+0x3cb/0xae0
> #1: (&crypt_stat->keysig_list_mutex){+.+.+.}, at: [<ffffffff81217728>] ecryptfs_generate_key_packet_set+0x58/0x2b0
>
> stack backtrace:
> Pid: 2640, comm: gdm Tainted: G C 2.6.31-2-generic #14~rbd2
> Call Trace:
> [<ffffffff8108b988>] print_circular_bug_tail+0xa8/0xf0
> [<ffffffff8108c675>] check_prev_add+0x85/0x370
> [<ffffffff81094912>] ? __module_text_address+0x12/0x60
> [<ffffffff8108cfc1>] validate_chain+0x661/0x750
> [<ffffffff81017275>] ? print_context_stack+0x85/0x140
> [<ffffffff81089c68>] ? find_usage_backwards+0x38/0x160
> [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
> [<ffffffff8108d585>] lock_acquire+0xa5/0x150
> [<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
> [<ffffffff8108b0b0>] ? check_usage_backwards+0x0/0xb0
> [<ffffffff815526cd>] __mutex_lock_common+0x4d/0x3d0
> [<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
> [<ffffffff8121591e>] ? ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
> [<ffffffff8108c02c>] ? mark_held_locks+0x6c/0xa0
> [<ffffffff81125b0d>] ? kmem_cache_alloc+0xfd/0x1a0
> [<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
> [<ffffffff81552b56>] mutex_lock_nested+0x46/0x60
> [<ffffffff8121591e>] ecryptfs_find_global_auth_tok_for_sig+0x2e/0x90
> [<ffffffff812177d5>] ecryptfs_generate_key_packet_set+0x105/0x2b0
> [<ffffffff81212f49>] ecryptfs_write_headers_virt+0xc9/0x120
> [<ffffffff8121306d>] ecryptfs_write_metadata+0xcd/0x200
> [<ffffffff81210240>] ? ecryptfs_init_persistent_file+0x60/0xe0
> [<ffffffff8120e44b>] ecryptfs_initialize_file+0x6b/0x140
> [<ffffffff8120e54d>] ecryptfs_create+0x2d/0x60
> [<ffffffff8113a7d4>] vfs_create+0xb4/0xe0
> [<ffffffff8113a8c4>] __open_namei_create+0xc4/0x110
> [<ffffffff8113d1c1>] do_filp_open+0xa01/0xae0
> [<ffffffff8129a93e>] ? _raw_spin_unlock+0x5e/0xb0
> [<ffffffff8155410b>] ? _spin_unlock+0x2b/0x40
> [<ffffffff81139e9b>] ? getname+0x3b/0x240
> [<ffffffff81148a5a>] ? alloc_fd+0xfa/0x140
> [<ffffffff8112d8d9>] do_sys_open+0x69/0x140
> [<ffffffff81553b8f>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [<ffffffff8112d9f0>] sys_open+0x20/0x30
> [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
>
> Signed-off-by: Roland Dreier <[email protected]>
> ---
> fs/ecryptfs/crypto.c | 8 +++++---
> fs/ecryptfs/keystore.c | 11 ++++-------
> 2 files changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
> index b91851f..1920a9a 100644
> --- a/fs/ecryptfs/crypto.c
> +++ b/fs/ecryptfs/crypto.c
> @@ -925,7 +925,9 @@ static int ecryptfs_copy_mount_wide_sigs_to_inode_sigs(
> struct ecryptfs_global_auth_tok *global_auth_tok;
> int rc = 0;
>
> + mutex_lock(&crypt_stat->keysig_list_mutex);
> mutex_lock(&mount_crypt_stat->global_auth_tok_list_mutex);
> +
> list_for_each_entry(global_auth_tok,
> &mount_crypt_stat->global_auth_tok_list,
> mount_crypt_stat_list) {
> @@ -934,13 +936,13 @@ static int ecryptfs_copy_mount_wide_sigs_to_inode_sigs(
> rc = ecryptfs_add_keysig(crypt_stat, global_auth_tok->sig);
> if (rc) {
> printk(KERN_ERR "Error adding keysig; rc = [%d]\n", rc);
> - mutex_unlock(
> - &mount_crypt_stat->global_auth_tok_list_mutex);
> goto out;
> }
> }
> - mutex_unlock(&mount_crypt_stat->global_auth_tok_list_mutex);
> +
> out:
> + mutex_unlock(&mount_crypt_stat->global_auth_tok_list_mutex);
> + mutex_unlock(&crypt_stat->keysig_list_mutex);
> return rc;
> }
>
> diff --git a/fs/ecryptfs/keystore.c b/fs/ecryptfs/keystore.c
> index af737bb..e85ca8e 100644
> --- a/fs/ecryptfs/keystore.c
> +++ b/fs/ecryptfs/keystore.c
> @@ -2353,21 +2353,18 @@ struct kmem_cache *ecryptfs_key_sig_cache;
> int ecryptfs_add_keysig(struct ecryptfs_crypt_stat *crypt_stat, char *sig)
> {
> struct ecryptfs_key_sig *new_key_sig;
> - int rc = 0;
>
> new_key_sig = kmem_cache_alloc(ecryptfs_key_sig_cache, GFP_KERNEL);
> if (!new_key_sig) {
> - rc = -ENOMEM;
> printk(KERN_ERR
> "Error allocating from ecryptfs_key_sig_cache\n");
> - goto out;
> + return -ENOMEM;
> }
> memcpy(new_key_sig->keysig, sig, ECRYPTFS_SIG_SIZE_HEX);
> - mutex_lock(&crypt_stat->keysig_list_mutex);
> + /* Caller must hold keysig_list_mutex */
> list_add(&new_key_sig->crypt_stat_list, &crypt_stat->keysig_list);
> - mutex_unlock(&crypt_stat->keysig_list_mutex);
> -out:
> - return rc;
> +
> + return 0;
> }
>
> struct kmem_cache *ecryptfs_global_auth_tok_cache;

This patch looks good. I really appreciate you tracking down and fixing
this problem. It will take me a little bit longer before I can get to
the other 2 patches.

Tyler

2009-07-04 22:04:34

by Roland Dreier

[permalink] [raw]
Subject: Re: [PATCH] eCryptfs: Fix lockdep-reported AB-BA mutex issue


> This patch looks good. I really appreciate you tracking down and fixing
> this problem. It will take me a little bit longer before I can get to
> the other 2 patches.

Thanks, so I'll assume that this patch has been queued up and will go
upstream at some point?

I'll resend the patch for removing the locking from
ecryptfs_destroy_crypt_stat() and ecryptfs_destroy_inode() with a proper
changelog and s-o-b. Again, that patch is (almost by definition) not
fixing a real issue, since the lock problems that lockdep is warning
about must never occur for the same reason that the patch is valid,
namely that no other context will ever try to lock an object that is in
the process of being freed.

However, as you mentioned, it definitely is worth cleaning up lockdep
false positives because lockdep disables itself as soon as it prints one
report, and so fixing up ecryptfs makes lockdep much more useful for
debugging other stuff while still being able to use ecryptfs (as I like
to do on my laptop, since I'd just as soon not hand over all my files in
case my laptop is lost somehow).

I'll resend the s_vfs_rename_mutex annotation patch via the generic vfs
tree, since the change is really not in ecryptfs code at all (although
it is triggered by having a stacked filesystem).

Thanks,
Roland