2011-05-24 19:53:40

by Fabio Erculiani

[permalink] [raw]
Subject: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

register_framebuffer() acquires registration_lock, then calls
do_register_framebuffer() which calls fb_notifier_call_chain()
which calls fb_open() which calls get_fb_info() which tries
(at last) to acquire registration_lock again.
This is a workaround.

Signed-off-by: Fabio Erculiani <[email protected]>
---
drivers/video/fbmem.c | 13 ++++++++++++-
1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
index 5aac00e..b9831a0 100644
--- a/drivers/video/fbmem.c
+++ b/drivers/video/fbmem.c
@@ -1628,6 +1628,11 @@ static int do_register_framebuffer(struct fb_info *fb_info)
event.info = fb_info;
if (!lock_fb_info(fb_info))
return -ENODEV;
+
+ /* FIXME: unlock registration mutex before registration
+ * notification is sent, in order to avoid deadlock.
+ */
+ mutex_unlock(&registration_lock);
fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, &event);
unlock_fb_info(fb_info);
return 0;
@@ -1692,7 +1697,13 @@ register_framebuffer(struct fb_info *fb_info)

mutex_lock(&registration_lock);
ret = do_register_framebuffer(fb_info);
- mutex_unlock(&registration_lock);
+ if (ret != 0)
+ /*
+ * FIXME: mutex is unlocked only if ret == 0.
+ * This is the second part of the workaround
+ * that prevents deadlocking.
+ */
+ mutex_unlock(&registration_lock);

return ret;
}
--
1.7.5.rc1


2011-05-24 20:46:05

by Bruno Prémont

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Tue, 24 May 2011 Fabio Erculiani <[email protected]> wrote:
> register_framebuffer() acquires registration_lock, then calls
> do_register_framebuffer() which calls fb_notifier_call_chain()
> which calls fb_open() which calls get_fb_info() which tries
> (at last) to acquire registration_lock again.
> This is a workaround.

What calls fb_open() instead of just calling fb_info->fbops->fb_open
when it is set?
In other words what kernel handler of fb_notifier chain does file-based
open of the framebuffer or is the event pushed out synchronously to
userspace?

> Signed-off-by: Fabio Erculiani <[email protected]>
> ---
> drivers/video/fbmem.c | 13 ++++++++++++-
> 1 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
> index 5aac00e..b9831a0 100644
> --- a/drivers/video/fbmem.c
> +++ b/drivers/video/fbmem.c
> @@ -1628,6 +1628,11 @@ static int do_register_framebuffer(struct fb_info *fb_info)
> event.info = fb_info;
> if (!lock_fb_info(fb_info))
> return -ENODEV;
> +
> + /* FIXME: unlock registration mutex before registration
> + * notification is sent, in order to avoid deadlock.
> + */
> + mutex_unlock(&registration_lock);
> fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, &event);
> unlock_fb_info(fb_info);
> return 0;
> @@ -1692,7 +1697,13 @@ register_framebuffer(struct fb_info *fb_info)
>
> mutex_lock(&registration_lock);
> ret = do_register_framebuffer(fb_info);
> - mutex_unlock(&registration_lock);
> + if (ret != 0)
> + /*
> + * FIXME: mutex is unlocked only if ret == 0.
> + * This is the second part of the workaround
> + * that prevents deadlocking.
> + */
> + mutex_unlock(&registration_lock);
>
> return ret;
> }

2011-05-25 16:19:37

by Bruno Prémont

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Tue, 24 May 2011 Bruno Prémont <[email protected]> wrote:
> On Tue, 24 May 2011 Fabio Erculiani <[email protected]> wrote:
> > register_framebuffer() acquires registration_lock, then calls
> > do_register_framebuffer() which calls fb_notifier_call_chain()
> > which calls fb_open() which calls get_fb_info() which tries
> > (at last) to acquire registration_lock again.
> > This is a workaround.
>
> What calls fb_open() instead of just calling fb_info->fbops->fb_open
> when it is set?
> In other words what kernel handler of fb_notifier chain does file-based
> open of the framebuffer or is the event pushed out synchronously to
> userspace?

Note that fb_info->lock is also taken on both sides:
- per definition around all fb_notifier_call_chain() calls
- as well in fb_open() to protect fb_info->fb_ops->fb_open call.

Bruno


> > Signed-off-by: Fabio Erculiani <[email protected]>
> > ---
> > drivers/video/fbmem.c | 13 ++++++++++++-
> > 1 files changed, 12 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
> > index 5aac00e..b9831a0 100644
> > --- a/drivers/video/fbmem.c
> > +++ b/drivers/video/fbmem.c
> > @@ -1628,6 +1628,11 @@ static int do_register_framebuffer(struct fb_info *fb_info)
> > event.info = fb_info;
> > if (!lock_fb_info(fb_info))
> > return -ENODEV;
> > +
> > + /* FIXME: unlock registration mutex before registration
> > + * notification is sent, in order to avoid deadlock.
> > + */
> > + mutex_unlock(&registration_lock);
> > fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, &event);
> > unlock_fb_info(fb_info);
> > return 0;
> > @@ -1692,7 +1697,13 @@ register_framebuffer(struct fb_info *fb_info)
> >
> > mutex_lock(&registration_lock);
> > ret = do_register_framebuffer(fb_info);
> > - mutex_unlock(&registration_lock);
> > + if (ret != 0)
> > + /*
> > + * FIXME: mutex is unlocked only if ret == 0.
> > + * This is the second part of the workaround
> > + * that prevents deadlocking.
> > + */
> > + mutex_unlock(&registration_lock);
> >
> > return ret;
> > }

2011-05-25 18:17:10

by Fabio Erculiani

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
It is causing deadlock during boot, so I would consider it quite critical.
Users using any fb driver will get into troubles.
The workaround is to boot with vga=normal.

Cheers,
--
Fabio Erculiani

2011-05-25 18:41:43

by Anca Emanuel

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, May 25, 2011 at 9:17 PM, Fabio Erculiani <[email protected]> wrote:
> I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
> It is causing deadlock during boot, so I would consider it quite critical.
> Users using any fb driver will get into troubles.
> The workaround is to boot with vga=normal.

Did you test 2.6.39 ? The fix from Linus is not ok ?

2011-05-25 18:47:09

by Bruno Prémont

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
> I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
> It is causing deadlock during boot, so I would consider it quite critical.
> Users using any fb driver will get into troubles.
> The workaround is to boot with vga=normal.

What is your system doing during boot? I've never seen it here but maybe
my boot sequence is too simple.

Could you tell if it deadlocks before init gets started or afterwards,
which fb drivers (and extra kernel patches if any) are in use.

If you have the complete backtrace of the deadlocked processes it would
help getting a better idea of what is affected and how (and why just the
framebuffer's lock is not causing trouble with earlier kernel versions).

Bruno


> Cheers,

2011-05-25 18:48:27

by Fabio Erculiani

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, May 25, 2011 at 8:41 PM, Anca Emanuel <[email protected]> wrote:
>
> Did you test 2.6.39 ? The fix from Linus is not ok ?
>

Sorry, maybe I missed it?

--
Fabio Erculiani

2011-05-25 18:51:23

by Bruno Prémont

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
> On Wed, May 25, 2011 at 8:41 PM, Anca Emanuel <[email protected]> wrote:
> >
> > Did you test 2.6.39 ? The fix from Linus is not ok ?
> >
>
> Sorry, maybe I missed it?

Your patch was against 2.6.39 or shortly before as Linus's code
did introduce the do_ variants of register/unregister/conflict functions.

Bruno

2011-05-25 18:52:20

by Fabio Erculiani

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, May 25, 2011 at 8:46 PM, Bruno Prémont
<[email protected]> wrote:
> On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
>> I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
>> It is causing deadlock during boot, so I would consider it quite critical.
>> Users using any fb driver will get into troubles.
>> The workaround is to boot with vga=normal.
>
> What is your system doing during boot? I've never seen it here but maybe
> my boot sequence is too simple.

I'm using vesafb and vga=791. It is quite simple to reproduce.
Also see: http://bugs.gentoo.org/show_bug.cgi?id=368109

>
> Could you tell if it deadlocks before init gets started or afterwards,
> which fb drivers (and extra kernel patches if any) are in use.

Exactly when register_framebuffer() is called, in my case, early in
the boot phase, before init.

>
> If you have the complete backtrace of the deadlocked processes it would
> help getting a better idea of what is affected and how (and why just the
> framebuffer's lock is not causing trouble with earlier kernel versions).

Because that code got a HUGE rewrite in the latest cycle, where
registration_lock has been introduce.
Just make a diff between 2.6.38 and 2.6.39. It will be easy to see
that SO MANY lines have changed ;-)
Anyway, since I'm out of office these days, I won't be able to send
you the traceback this week, but since so many people have run into
it, I guess it's fairly simple to reproduce.

>
> Bruno
>
>
>> Cheers,
>



--
Fabio Erculiani

2011-05-25 18:53:25

by Fabio Erculiani

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, May 25, 2011 at 8:51 PM, Bruno Prémont
<[email protected]> wrote:
>
> Your patch was against 2.6.39 or shortly before as Linus's code
> did introduce the do_ variants of register/unregister/conflict functions.
>
> Bruno
>

I made that patch against the 2.6.39 tag.
Git ain't lying ;-)

--
Fabio Erculiani

2011-05-25 18:57:25

by Bruno Prémont

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
> On Wed, May 25, 2011 at 8:46 PM, Bruno Prémont wrote:
> > On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
> >> I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
> >> It is causing deadlock during boot, so I would consider it quite critical.
> >> Users using any fb driver will get into troubles.
> >> The workaround is to boot with vga=normal.
> >
> > What is your system doing during boot? I've never seen it here but maybe
> > my boot sequence is too simple.
>
> I'm using vesafb and vga=791. It is quite simple to reproduce.
> Also see: http://bugs.gentoo.org/show_bug.cgi?id=368109

Looks like gentoo kernel, might be splash is related to the hang

> > Could you tell if it deadlocks before init gets started or afterwards,
> > which fb drivers (and extra kernel patches if any) are in use.
>
> Exactly when register_framebuffer() is called, in my case, early in
> the boot phase, before init.
>
> >
> > If you have the complete backtrace of the deadlocked processes it would
> > help getting a better idea of what is affected and how (and why just the
> > framebuffer's lock is not causing trouble with earlier kernel versions).
>
> Because that code got a HUGE rewrite in the latest cycle, where
> registration_lock has been introduce.
> Just make a diff between 2.6.38 and 2.6.39. It will be easy to see
> that SO MANY lines have changed ;-)
> Anyway, since I'm out of office these days, I won't be able to send
> you the traceback this week, but since so many people have run into
> it, I guess it's fairly simple to reproduce.

Rewrite was not so huge, and I'm not affected here (running with KMS
compiled in though and vanilla kernel).
Probably the cause is related to one of the extra patches provided by
Gentoo and subscribing to framebuffer registration notifications.

Will have a look.

Bruno

2011-05-25 19:04:47

by Fabio Erculiani

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, May 25, 2011 at 8:57 PM, Bruno Prémont
<[email protected]> wrote:
> On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
>> On Wed, May 25, 2011 at 8:46 PM, Bruno Prémont wrote:
>> > On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
>> >> I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
>> >> It is causing deadlock during boot, so I would consider it quite critical.
>> >> Users using any fb driver will get into troubles.
>> >> The workaround is to boot with vga=normal.
>> >
>> > What is your system doing during boot? I've never seen it here but maybe
>> > my boot sequence is too simple.
>>
>> I'm using vesafb and vga=791. It is quite simple to reproduce.
>> Also see: http://bugs.gentoo.org/show_bug.cgi?id=368109
>
> Looks like gentoo kernel, might be splash is related to the hang

Then, if you say so, it must be the fbsplash patch for sure, I keep
forgetting of that :-/

>
>> > Could you tell if it deadlocks before init gets started or afterwards,
>> > which fb drivers (and extra kernel patches if any) are in use.
>>
>> Exactly when register_framebuffer() is called, in my case, early in
>> the boot phase, before init.
>>
>> >
>> > If you have the complete backtrace of the deadlocked processes it would
>> > help getting a better idea of what is affected and how (and why just the
>> > framebuffer's lock is not causing trouble with earlier kernel versions).
>>
>> Because that code got a HUGE rewrite in the latest cycle, where
>> registration_lock has been introduce.
>> Just make a diff between 2.6.38 and 2.6.39. It will be easy to see
>> that SO MANY lines have changed ;-)
>> Anyway, since I'm out of office these days, I won't be able to send
>> you the traceback this week, but since so many people have run into
>> it, I guess it's fairly simple to reproduce.
>
> Rewrite was not so huge, and I'm not affected here (running with KMS
> compiled in though and vanilla kernel).
> Probably the cause is related to one of the extra patches provided by
> Gentoo and subscribing to framebuffer registration notifications.
>
> Will have a look.

Thanks a lot, let me know if I can help out next week.

>
> Bruno
>



--
Fabio Erculiani

2011-05-25 19:13:11

by Bruno Prémont

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
> On Wed, May 25, 2011 at 8:57 PM, Bruno Prémont wrote:
> > On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
> >> On Wed, May 25, 2011 at 8:46 PM, Bruno Prémont wrote:
> >> > On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
> >> >> I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
> >> >> It is causing deadlock during boot, so I would consider it quite critical.
> >> >> Users using any fb driver will get into troubles.
> >> >> The workaround is to boot with vga=normal.
> >> >
> >> > What is your system doing during boot? I've never seen it here but maybe
> >> > my boot sequence is too simple.
> >>
> >> I'm using vesafb and vga=791. It is quite simple to reproduce.
> >> Also see: http://bugs.gentoo.org/show_bug.cgi?id=368109
> >
> > Looks like gentoo kernel, might be splash is related to the hang
>
> Then, if you say so, it must be the fbsplash patch for sure, I keep
> forgetting of that :-/

I've had a look at the bug report which points at fbcon_decore
patch.

Looking into that patch confirms my impression:
fbcon_decor calls a userspace helper at the time fbcon takes over
console and that userspace helper then tries to open fb device
with the aim of calling some IOCTLs.

Probably changing fbcon_decor to just call the userspace helper in
non-blocking mode (or having userspace helper "fork and detach") would
avoid the deadlock as well.
Though fbcon_decor seems to rely on helper's return code...

What is the matching piece on userspace side so I can look at it as well?

Bruno

2011-05-25 19:16:33

by Fabio Erculiani

[permalink] [raw]
Subject: Re: [PATCH] fbmem: fix race condition between register_framebuffer() and fb_open()

On Wed, May 25, 2011 at 9:12 PM, Bruno Prémont
<[email protected]> wrote:
> On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
>> On Wed, May 25, 2011 at 8:57 PM, Bruno Prémont wrote:
>> > On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
>> >> On Wed, May 25, 2011 at 8:46 PM, Bruno Prémont wrote:
>> >> > On Wed, 25 May 2011 Fabio Erculiani <[email protected]> wrote:
>> >> >> I'm not a fbdev expert. So I leave the real fix to real men ( ;-) ).
>> >> >> It is causing deadlock during boot, so I would consider it quite critical.
>> >> >> Users using any fb driver will get into troubles.
>> >> >> The workaround is to boot with vga=normal.
>> >> >
>> >> > What is your system doing during boot? I've never seen it here but maybe
>> >> > my boot sequence is too simple.
>> >>
>> >> I'm using vesafb and vga=791. It is quite simple to reproduce.
>> >> Also see: http://bugs.gentoo.org/show_bug.cgi?id=368109
>> >
>> > Looks like gentoo kernel, might be splash is related to the hang
>>
>> Then, if you say so, it must be the fbsplash patch for sure, I keep
>> forgetting of that :-/
>
> I've had a look at the bug report which points at fbcon_decore
> patch.
>
> Looking into that patch confirms my impression:
>  fbcon_decor calls a userspace helper at the time fbcon takes over
>  console and that userspace helper then tries to open fb device
>  with the aim of calling some IOCTLs.
>
> Probably changing fbcon_decor to just call the userspace helper in
> non-blocking mode (or having userspace helper "fork and detach") would
> avoid the deadlock as well.
> Though fbcon_decor seems to rely on helper's return code...
>
> What is the matching piece on userspace side so I can look at it as well?

I guess you're talking about this:
http://fbsplash.berlios.de/wiki/doku.php
the package is named splashutils, and contains userspace helpers.

>
> Bruno
>



--
Fabio Erculiani