2019-11-12 07:52:17

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH 1/2] dm-snapshot: fix crash with the realtime kernel



On Mon, 11 Nov 2019, Mike Snitzer wrote:

> On Mon, Nov 11 2019 at 11:37am -0500,
> Nikos Tsironis <[email protected]> wrote:
>
> > On 11/11/19 3:59 PM, Mikulas Patocka wrote:
> > > Snapshot doesn't work with realtime kernels since the commit f79ae415b64c.
> > > hlist_bl is implemented as a raw spinlock and the code takes two non-raw
> > > spinlocks while holding hlist_bl (non-raw spinlocks are blocking mutexes
> > > in the realtime kernel, so they couldn't be taken inside a raw spinlock).
> > >
> > > This patch fixes the problem by using non-raw spinlock
> > > exception_table_lock instead of the hlist_bl lock.
> > >
> > > Signed-off-by: Mikulas Patocka <[email protected]>
> > > Fixes: f79ae415b64c ("dm snapshot: Make exception tables scalable")
> > >
> >
> > Hi Mikulas,
> >
> > I wasn't aware that hlist_bl is implemented as a raw spinlock in the
> > real time kernel. I would expect it to be a standard non-raw spinlock,
> > so everything works as expected. But, after digging further in the real
> > time tree, I found commit ad7675b15fd87f1 ("list_bl: Make list head
> > locking RT safe") which suggests that such a conversion would break
> > other parts of the kernel.
>
> Right, the proper fix is to update list_bl to work on realtime (which I
> assume the referenced commit does). I do not want to take this
> dm-snapshot specific workaround that open-codes what should be done
> within hlist_{bl_lock,unlock}, etc.

If we change list_bl to use non-raw spinlock, it fails in dentry lookup
code. The dentry code takes a seqlock (which is implemented as preempt
disable in the realtime kernel) and then takes a list_bl lock.

This is wrong from the real-time perspective (the chain in the hash could
be arbitrarily long, so using non-raw spinlock could cause unbounded
wait), however we can't do anything with it.

I think that fixing dm-snapshot is way easier than fixing the dentry code.
If you have an idea how to fix the dentry code, tell us.

> I'm not yet sure which realtime mailing list and/or maintainers should
> be cc'd to further the inclussion of commit ad7675b15fd87f1 -- Nikos do
> you?
>
> Thanks,
> Mike

Mikulas


2019-11-12 11:47:56

by Nikos Tsironis

[permalink] [raw]
Subject: Re: [PATCH 1/2] dm-snapshot: fix crash with the realtime kernel

On 11/12/19 9:50 AM, Mikulas Patocka wrote:
>
>
> On Mon, 11 Nov 2019, Mike Snitzer wrote:
>
>> On Mon, Nov 11 2019 at 11:37am -0500,
>> Nikos Tsironis <[email protected]> wrote:
>>
>>> On 11/11/19 3:59 PM, Mikulas Patocka wrote:
>>>> Snapshot doesn't work with realtime kernels since the commit f79ae415b64c.
>>>> hlist_bl is implemented as a raw spinlock and the code takes two non-raw
>>>> spinlocks while holding hlist_bl (non-raw spinlocks are blocking mutexes
>>>> in the realtime kernel, so they couldn't be taken inside a raw spinlock).
>>>>
>>>> This patch fixes the problem by using non-raw spinlock
>>>> exception_table_lock instead of the hlist_bl lock.
>>>>
>>>> Signed-off-by: Mikulas Patocka <[email protected]>
>>>> Fixes: f79ae415b64c ("dm snapshot: Make exception tables scalable")
>>>>
>>>
>>> Hi Mikulas,
>>>
>>> I wasn't aware that hlist_bl is implemented as a raw spinlock in the
>>> real time kernel. I would expect it to be a standard non-raw spinlock,
>>> so everything works as expected. But, after digging further in the real
>>> time tree, I found commit ad7675b15fd87f1 ("list_bl: Make list head
>>> locking RT safe") which suggests that such a conversion would break
>>> other parts of the kernel.
>>
>> Right, the proper fix is to update list_bl to work on realtime (which I
>> assume the referenced commit does). I do not want to take this
>> dm-snapshot specific workaround that open-codes what should be done
>> within hlist_{bl_lock,unlock}, etc.
>
> If we change list_bl to use non-raw spinlock, it fails in dentry lookup
> code. The dentry code takes a seqlock (which is implemented as preempt
> disable in the realtime kernel) and then takes a list_bl lock.
>
> This is wrong from the real-time perspective (the chain in the hash could
> be arbitrarily long, so using non-raw spinlock could cause unbounded
> wait), however we can't do anything with it.
>
> I think that fixing dm-snapshot is way easier than fixing the dentry code.
> If you have an idea how to fix the dentry code, tell us.
>

I too think that it would be better to fix list_bl. dm-snapshot isn't
really broken. One should be able to acquire a spinlock while holding
another spinlock.

Moreover, apart from dm-snapshot, anyone ever using list_bl is at risk
of breaking the realtime kernel, if he or she is not aware of that
particular limitation of list_bl's implementation in the realtime tree.

But, I agree that it's a lot easier "fixing" dm-snapshot than fixing the
dentry code.

>> I'm not yet sure which realtime mailing list and/or maintainers should
>> be cc'd to further the inclussion of commit ad7675b15fd87f1 -- Nikos do
>> you?

No, unfortunately, I don't know for sure either. [1] and [2] suggest
that the relevant mailing lists are LKML and linux-rt-users and the
maintainers are Sebastian Siewior, Thomas Gleixner and Steven Rostedt.

I believe they are already Cc'd in the other thread regarding Mikulas'
"realtime: avoid BUG when the list is not locked" patch (for some reason
the thread doesn't properly appear in dm-devel archives and also my
mails to dm-devel have being failing since yesterday - Could there be an
issue with the mailing list?), so maybe we should Cc them in this thread
too.

Nikos

[1] https://wiki.linuxfoundation.org/realtime/communication/mailinglists
[2] https://wiki.linuxfoundation.org/realtime/communication/send_rt_patches

>>
>> Thanks,
>> Mike
>
> Mikulas
>

2019-11-13 06:03:59

by Crystal Wood

[permalink] [raw]
Subject: Re: [PATCH 1/2] dm-snapshot: fix crash with the realtime kernel

On Tue, 2019-11-12 at 13:45 +0200, Nikos Tsironis wrote:
> On 11/12/19 9:50 AM, Mikulas Patocka wrote:
> >
> > On Mon, 11 Nov 2019, Mike Snitzer wrote:
> >
> > > On Mon, Nov 11 2019 at 11:37am -0500,
> > > Nikos Tsironis <[email protected]> wrote:
> > >
> > > > On 11/11/19 3:59 PM, Mikulas Patocka wrote:
> > > > > Snapshot doesn't work with realtime kernels since the commit
> > > > > f79ae415b64c.
> > > > > hlist_bl is implemented as a raw spinlock and the code takes two
> > > > > non-raw
> > > > > spinlocks while holding hlist_bl (non-raw spinlocks are blocking
> > > > > mutexes
> > > > > in the realtime kernel, so they couldn't be taken inside a raw
> > > > > spinlock).
> > > > >
> > > > > This patch fixes the problem by using non-raw spinlock
> > > > > exception_table_lock instead of the hlist_bl lock.
> > > > >
> > > > > Signed-off-by: Mikulas Patocka <[email protected]>
> > > > > Fixes: f79ae415b64c ("dm snapshot: Make exception tables
> > > > > scalable")
> > > > >
> > > >
> > > > Hi Mikulas,
> > > >
> > > > I wasn't aware that hlist_bl is implemented as a raw spinlock in the
> > > > real time kernel. I would expect it to be a standard non-raw
> > > > spinlock,
> > > > so everything works as expected. But, after digging further in the
> > > > real
> > > > time tree, I found commit ad7675b15fd87f1 ("list_bl: Make list head
> > > > locking RT safe") which suggests that such a conversion would break
> > > > other parts of the kernel.
> > >
> > > Right, the proper fix is to update list_bl to work on realtime (which
> > > I
> > > assume the referenced commit does). I do not want to take this
> > > dm-snapshot specific workaround that open-codes what should be done
> > > within hlist_{bl_lock,unlock}, etc.
> >
> > If we change list_bl to use non-raw spinlock, it fails in dentry lookup
> > code. The dentry code takes a seqlock (which is implemented as preempt
> > disable in the realtime kernel) and then takes a list_bl lock.
> >
> > This is wrong from the real-time perspective (the chain in the hash
> > could
> > be arbitrarily long, so using non-raw spinlock could cause unbounded
> > wait), however we can't do anything with it.
> >
> > I think that fixing dm-snapshot is way easier than fixing the dentry
> > code.
> > If you have an idea how to fix the dentry code, tell us.
>
> I too think that it would be better to fix list_bl. dm-snapshot isn't
> really broken. One should be able to acquire a spinlock while holding
> another spinlock.

That's not universally true -- even in the absence of RT there are nesting
considerations. But it would probably be good if raw locks weren't hidden
inside other locking primitives without making it clear (ideally in the
function names) that it's a raw lock.

> Moreover, apart from dm-snapshot, anyone ever using list_bl is at risk
> of breaking the realtime kernel, if he or she is not aware of that
> particular limitation of list_bl's implementation in the realtime tree.

In particular the non-rcu variant seems inherently bad unless you protect
traversal with some other lock (in which case why use bl at all?). Maybe
fully separate the rcu version of list_bl and keep using raw locks there
(with the name clearly indicating so), with the side benefit that
accidentally mixing rcu and non-rcu operations on the same list would become
a build error, and convert the non-rcu list_bl to use non-raw locks on RT.

BTW, I'm wondering what makes bit spinlocks worse than raw spinlocks on
RT... commit ad7675b15fd87f19 says there's no lockdep visibility, but that
seems orthogonal to RT, and could be addressed by adding a dep_map on debug
builds the same way the raw lock is currently added. The other bit spinlock
conversion commits that I could find are replacing them with non-raw locks.

-Scott