From: Nicolai Stange <nicstange@gmail.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: Nicolai Stange <nicstange@gmail.com>,
        linux-kernel <linux-kernel@vger.kernel.org>,
        "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>,
        gregkh <gregkh@linuxfoundation.org>
Subject: Re: deadlock in synchronize_srcu() in debugfs?
References: <1490280886.2766.4.camel@sipsolutions.net>
        <87o9ws6m4s.fsf@gmail.com> <1490614617.3393.4.camel@sipsolutions.net>
        <87lgrnmd8b.fsf@gmail.com> <1490860547.32041.1.camel@sipsolutions.net>
        <87inmroy9w.fsf@d147-183.mpimet.mpg.de>
        <1490872283.32041.4.camel@sipsolutions.net>
Date: Fri, 31 Mar 2017 11:03:29 +0200
In-Reply-To: <1490872283.32041.4.camel@sipsolutions.net> (Johannes Berg's
        message of "Thu, 30 Mar 2017 13:11:23 +0200")
Message-ID: <871stdyg0u.fsf@gmail.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Transfer-Encoding: 8bit
Content-Length: 4302
Lines: 111

On Thu, Mar 30 2017, Johannes Berg wrote:

> On Thu, 2017-03-30 at 12:27 +0200, Nicolai Stange wrote:
>> So, please correct me if I'm wrong, there are two problems with
>> indefinitely blocking debugfs files' fops:
>> 
>> 1. The one which actually hung your system:
>>    An indefinitely blocking debugfs_remove() while holding a lock.
>>    Other tasks attempting to grab that same lock get stuck as well.
>> 
>> 2. The other one you've found, namely that the locking granularity is
>>    too coarse: a debugfs_remove() would get blocked by unrelated
>> files'
>>    pending fops.
>
> No, this isn't really an accurate description of the two problems.
>
>> AFAICS, the first one can't get resolved by simply refining the
>> blocking granularity: a debugfs_remove() on the indefinitely blocking
>> file would still block as well.
>
> Correct.
>
> The first problem - the one I ran into - is the following:
>
> 1)
> A given debugfs file's .read() was waiting for some event to happen
> (being a blocking file), and I was trying to debugfs_remove() some
> completely unrelated file, this got stuck.

I got it now. I was missing the "completely unrelated file" part.
(Admittedly, a related file would have made no sense at all -- the
remover would have been responsible to cancel any indefinite blocking in
there, as you said).

> Due to me holding a lock while doing this debugfs_remove(), other tasks
> *also* got stuck, but that's just a sub-problem - having the
> debugfs_remove() of an unrelated file get stuck would already have been
> a problem - the fact that other tasks also got stuck was just an
> additional wrinkle.
>
> Mind - this is a livelock of sorts - if the debugfs file will ever make
> progress, the system can recover.
>


> 2)
> There's a complete deadlock situation if this happens:
>
> CPU1					CPU2
>
> debugfs_file_read(file="foo")		mutex_lock(&M);
> srcu_read_lock(&debugfs_srcu);		debugfs_remove(file="bar")
> mutex_lock(&M);				synchronize_srcu(&debugfs_srcu)
>
> This is intrinsically unrecoverable.

Let's address this in a second step.


> This is the core of the problem really - that you're tying completely
> unrelated processes together.
>
> Therefore, to continue using SRCU in this way means that you have to
> disallow blocking debugfs files. There may not be many of those, but
> any single one of them would be a problem.
>
> If we stop using SRCU this way we can discuss how we can fix it - but
> anything more coarse grained than per-file (which really makes SRCU
> unsuitable) would still have the same problem one way or another. And
> we haven't even addressed the deadlock situation (2 above) either.
>
>> When I did this, per-file reader/writer locks actuallt came to my
>> mind first. The problem here is that debugfs_use_file_start() must
>> grab the lock first and check whether the file has been deleted in
>> the meanwhile. But as it stands, there's nothing that would guarantee
>> the existence of the lock at the time it's to be taken.
>
> That seems like a strange argument to me - something has to exist for a
> process to be able to look up the file, and currently the proxy also
> has to exist?

No, the proxies are created at file _open_ time and installed at the
struct file.

Rationale: there are potentially many debugfs files with only few of them
opened at a time and a proxy, i.e. a struct file_operations, is quite
large.


> So when a file is created you can allocate the proxy for it, and if you
> can look up the proxy object - perhaps even using plain RCU - then you
> also have the lock? IOW, instead of storing just the real_fops in
> d_fsdata, you can store a small object that holds a lock and the
> real_fops. You can always access that object, and lock it, but the
> real_fops inside it might eventually end up NULL which you handle
> through proxying. No?

As said, there isn't always a proxy object around.

Of course, attaching some sort of lock on a per-file basis should be
doable. I just refrained from doing it so far (and resorted to SRCU
instead) because I wasn't aware of those indefinite blockers and wanted
to avoid the additional complexity (namely avoiding use-after-frees on
that lock).

I'll work out a solution this weekend and send some RFC patches then.

Thanks for your clarifications!

Nicolai