2008-06-26 13:59:49

by Cyrill Gorcunov

[permalink] [raw]
Subject: eCryptFS possible circular locking

Hi Michael,

there is potential circularo locking in eCryptFS. Actually, I think Ingo
boots catched it. (Original report from Ingo was about a month ago).
Here is a log:

---

> [Ingo Molnar - Fri, May 09, 2008 at 11:44:23AM +0200]
> |
> | * Ingo Molnar <[email protected]> wrote:
> |
> | > x86.git testing found the following "possible circular locking"
> | > lockdep warning, generated by the ecryptfs code:
> |
> | the correct URL for the config is:
> |
> | http://redhat.com/~mingo/misc/config-Wed_May__7_16_17_31_CEST_2008.bad
> |
> | this is reproducible, it happened a second time since i originally
> | reported it.
> |
> | Ingo
>
> Hi Ingo, I've just sent patch, could you test it, please?
>
> http://lkml.org/lkml/2008/5/10/20

that patch is now upstream - but see below there's still a circular
dependency.

Ingo

[ 21.530026]
[ 21.530026] =======================================================
[ 21.530026] [ INFO: possible circular locking dependency detected ]
[ 21.530026] 2.6.26-rc2-sched-devel.git #1205
[ 21.530026] -------------------------------------------------------
[ 21.530026] multipath.stati/1379 is trying to acquire lock:
[ 21.530026] (&ecryptfs_daemon_hash_mux){--..}, at: [<c020cca2>] ecryptfs_miscdev_open+0x22/0x140
[ 21.530026]
[ 21.530026] but task is already holding lock:
[ 21.530026] (misc_mtx){--..}, at: [<c02f4b92>] misc_open+0x22/0x120
[ 21.530026]
[ 21.530026] which lock already depends on the new lock.
[ 21.530026]
[ 21.530026]
[ 21.530026] the existing dependency chain (in reverse order) is:
[ 21.530026]
[ 21.530026] -> #1 (misc_mtx){--..}:
[ 21.530026] [<c01438aa>] __lock_acquire+0xc3a/0x1100
[ 21.530026] [<c0143de6>] lock_acquire+0x76/0xa0
[ 21.530026] [<c05d032a>] mutex_lock_nested+0x7a/0x280
[ 21.530026] [<c02f4cb5>] misc_register+0x25/0x140
[ 21.530026] [<c020cb5c>] ecryptfs_init_ecryptfs_miscdev+0x2c/0x60
[ 21.530026] [<c020c5fe>] ecryptfs_init_messaging+0x1ee/0x280
[ 21.530026] [<c0837662>] ecryptfs_init+0xc2/0x1b0
[ 21.530026] [<c0824852>] kernel_init+0x132/0x250
[ 21.530026] [<c010365f>] kernel_thread_helper+0x7/0x10
[ 21.530026] [<ffffffff>] 0xffffffff
[ 21.530026]
[ 21.530026] -> #0 (&ecryptfs_daemon_hash_mux){--..}:
[ 21.530026] [<c01436cd>] __lock_acquire+0xa5d/0x1100
[ 21.530026] [<c0143de6>] lock_acquire+0x76/0xa0
[ 21.530026] [<c05d032a>] mutex_lock_nested+0x7a/0x280
[ 21.530026] [<c020cca2>] ecryptfs_miscdev_open+0x22/0x140
[ 21.530026] [<c02f4bf2>] misc_open+0x82/0x120
[ 21.530026] [<c0187c26>] chrdev_open+0x86/0x150
[ 21.530026] [<c0183289>] __dentry_open+0xa9/0x260
[ 21.530026] [<c0183487>] nameidata_to_filp+0x47/0x60
[ 21.530026] [<c018fda4>] do_filp_open+0x184/0x7a0
[ 21.530026] [<c01830aa>] do_sys_open+0x4a/0xb0
[ 21.530026] [<c018317e>] sys_open+0x2e/0x40
[ 21.530026] [<c010330e>] syscall_call+0x7/0xb
[ 21.530026] [<ffffffff>] 0xffffffff
[ 21.530026]
[ 21.530026] other info that might help us debug this:
[ 21.530026]
[ 21.530026] 1 lock held by multipath.stati/1379:
[ 21.530026] #0: (misc_mtx){--..}, at: [<c02f4b92>] misc_open+0x22/0x120
[ 21.530026]

---

Some analisys shows the following interesting things -

Call chain (if being compiled into the kernel)
----------------------------------------------

ecryptfs_init
ecryptfs_init_messaging
ecryptfs_init_ecryptfs_miscdev
mutex_lock(&ecryptfs_daemon_hash_mux);
misc_register(&ecryptfs_miscdev);
mutex_lock(&misc_mtx);
mutex_unlock(&misc_mtx);
1 --->
mutex_unlock(&ecryptfs_daemon_hash_mux);

As only we have misc device registered it is valid to open it.
And that is happened by another proccess

chrdev_open
misc_open
mutex_lock(&misc_mtx);
ecryptfs_miscdev_open
mutex_lock(&ecryptfs_daemon_hash_mux);


If this happens at point (1) /ie chrdev_open called at this point/
we are getting to circular warning as on top. I'm not sure how to handle this
in more elegant way but I think we should defer all calls to misc
device until eCryptFS has been initialized. Could you take a look,
please?

- Cyrill -


2008-06-26 16:00:34

by Michael Halcrow

[permalink] [raw]
Subject: [PATCH] eCryptfs: Remove unnecessary mux from ecryptfs_init_ecryptfs_miscdev()

On Thu, Jun 26, 2008 at 05:59:04PM +0400, Cyrill Gorcunov wrote:
> Some analisys shows the following interesting things -
>
> Call chain (if being compiled into the kernel)
> ----------------------------------------------
>
> ecryptfs_init
> ecryptfs_init_messaging
> ecryptfs_init_ecryptfs_miscdev
> mutex_lock(&ecryptfs_daemon_hash_mux);
> misc_register(&ecryptfs_miscdev);
> mutex_lock(&misc_mtx);
> mutex_unlock(&misc_mtx);
> 1 --->
> mutex_unlock(&ecryptfs_daemon_hash_mux);
>
> As only we have misc device registered it is valid to open it.
> And that is happened by another proccess
>
> chrdev_open
> misc_open
> mutex_lock(&misc_mtx);
> ecryptfs_miscdev_open
> mutex_lock(&ecryptfs_daemon_hash_mux);
>
>
> If this happens at point (1) /ie chrdev_open called at this point/
> we are getting to circular warning as on top.

It appears to be only a warning and that deadlock cannot occur due to
the fact that ecryptfs_miscdev_open() cannot be called until after
mutex_unlock(&misc_mtx) occurs while under
ecryptfs_init_ecryptfs_miscdev(). However, on second glance, given the
call sequence detail you provided here, I think I am being
over-zealous in my hash table locking with ecryptfs_daemon_hash_mux in
ecryptfs_init_ecryptfs_miscdev(). The misc_mtx should provide all the
protection required to keep the daemon hash table sane during miscdev
registration.

---

The misc_mtx should provide all the protection required to keep the
daemon hash table sane during miscdev registration. Since this mutex
is causing gratuitous lockdep warnings, this patch removes it.

Signed-off-by: Michael Halcrow <[email protected]>
---
fs/ecryptfs/miscdev.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/ecryptfs/miscdev.c b/fs/ecryptfs/miscdev.c
index 6560da1..a9926a4 100644
--- a/fs/ecryptfs/miscdev.c
+++ b/fs/ecryptfs/miscdev.c
@@ -577,13 +577,11 @@ int ecryptfs_init_ecryptfs_miscdev(void)
int rc;

atomic_set(&ecryptfs_num_miscdev_opens, 0);
- mutex_lock(&ecryptfs_daemon_hash_mux);
rc = misc_register(&ecryptfs_miscdev);
if (rc)
printk(KERN_ERR "%s: Failed to register miscellaneous device "
"for communications with userspace daemons; rc = [%d]\n",
__func__, rc);
- mutex_unlock(&ecryptfs_daemon_hash_mux);
return rc;
}

--
1.5.3.7

2008-06-26 16:05:43

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [PATCH] eCryptfs: Remove unnecessary mux from ecryptfs_init_ecryptfs_miscdev()

[Michael Halcrow - Thu, Jun 26, 2008 at 11:03:20AM -0500]
| On Thu, Jun 26, 2008 at 05:59:04PM +0400, Cyrill Gorcunov wrote:
| > Some analisys shows the following interesting things -
| >
| > Call chain (if being compiled into the kernel)
| > ----------------------------------------------
| >
| > ecryptfs_init
| > ecryptfs_init_messaging
| > ecryptfs_init_ecryptfs_miscdev
| > mutex_lock(&ecryptfs_daemon_hash_mux);
| > misc_register(&ecryptfs_miscdev);
| > mutex_lock(&misc_mtx);
| > mutex_unlock(&misc_mtx);
| > 1 --->
| > mutex_unlock(&ecryptfs_daemon_hash_mux);
| >
| > As only we have misc device registered it is valid to open it.
| > And that is happened by another proccess
| >
| > chrdev_open
| > misc_open
| > mutex_lock(&misc_mtx);
| > ecryptfs_miscdev_open
| > mutex_lock(&ecryptfs_daemon_hash_mux);
| >
| >
| > If this happens at point (1) /ie chrdev_open called at this point/
| > we are getting to circular warning as on top.
|
| It appears to be only a warning and that deadlock cannot occur due to
| the fact that ecryptfs_miscdev_open() cannot be called until after
| mutex_unlock(&misc_mtx) occurs while under
| ecryptfs_init_ecryptfs_miscdev(). However, on second glance, given the
| call sequence detail you provided here, I think I am being
| over-zealous in my hash table locking with ecryptfs_daemon_hash_mux in
| ecryptfs_init_ecryptfs_miscdev(). The misc_mtx should provide all the
| protection required to keep the daemon hash table sane during miscdev
| registration.
|
| ---
|
| The misc_mtx should provide all the protection required to keep the
| daemon hash table sane during miscdev registration. Since this mutex
| is causing gratuitous lockdep warnings, this patch removes it.
|
| Signed-off-by: Michael Halcrow <[email protected]>
| ---
| fs/ecryptfs/miscdev.c | 2 --
| 1 files changed, 0 insertions(+), 2 deletions(-)
|
| diff --git a/fs/ecryptfs/miscdev.c b/fs/ecryptfs/miscdev.c
| index 6560da1..a9926a4 100644
| --- a/fs/ecryptfs/miscdev.c
| +++ b/fs/ecryptfs/miscdev.c
| @@ -577,13 +577,11 @@ int ecryptfs_init_ecryptfs_miscdev(void)
| int rc;
|
| atomic_set(&ecryptfs_num_miscdev_opens, 0);
| - mutex_lock(&ecryptfs_daemon_hash_mux);
| rc = misc_register(&ecryptfs_miscdev);
| if (rc)
| printk(KERN_ERR "%s: Failed to register miscellaneous device "
| "for communications with userspace daemons; rc = [%d]\n",
| __func__, rc);
| - mutex_unlock(&ecryptfs_daemon_hash_mux);
| return rc;
| }
|
| --
| 1.5.3.7
|

Yes, it was only warning, but it was a bit annoying you know :)
Thanks a lot, Michael!

- Cyrill -