2007-02-09 06:38:15

by S.Çağlar Onur

Subject: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

Hi;

With 2.6.20, resuming from disk sometimes fails to return to vt7, where X runs.
Everything else seems to work, and switching to vt1 and back to vt7
solves the problem. However, dmesg shows BUG() output like [1] whenever this
problem occurs.

I'm using the 20070207 snapshot of suspend; s2disk is used to suspend to disk.
The machine is a Sony Vaio FS-215B, which is in the whitelist and works well with
2.6.16/17/18 (all have the fbsplash and vesafb-tng patches [from Gentoo]
applied).

[1] http://cekirdek.pardus.org.tr/~caglar/dmesg.2.6.20

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2007-02-15 10:40:51

by Pavel Machek

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

Hi!
> With 2.6.20, resuming from disk sometimes cannot returns on vt7 where X runs
> but everything seems working, so just changing to vt1 and returning to vt7
> solves the problem. But dmesg shows some BUG() output like [1] whenever this
> problem occurs
>
> I'm using 20070207 snapshot of suspend, s2disk is used to suspend2disk,
> machine is a Sony Vaio FS-215B which is in the whitelist and works well with
> 2.6.16/17/18 (all have fbsplash and vesafb-tng patches [from gentoo]
> applied).
>
> [1] http://cekirdek.pardus.org.tr/~caglar/dmesg.2.6.20

Contact fbcon people... They are calling do_blank_screen+0x4e/0x218
from fbcon_event_notify+0x8f1/0xa1e ... probably without taking the
necessary locks.

Aha, it seems their blanking code triggers _while_ resuming, which is
not exactly nice.

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2007-02-15 10:57:38

by Andrew Morton

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

On Thu, 15 Feb 2007 11:40:32 +0100 Pavel Machek <[email protected]> wrote:

> Contact fbcon people...

There aren't any, basically. Since Tony disappeared James has been helping out
but doesn't have a lot of time. So we're pretty much on our own with problems in
this area.

2007-02-15 11:00:29

by S.Çağlar Onur

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

On Thu, 15 Feb 2007, Andrew Morton wrote:
> On Thu, 15 Feb 2007 11:40:32 +0100 Pavel Machek <[email protected]> wrote:
> > Contact fbcon people...
>
> There aren't any, basically. Since Tony disappeared James has been helping
> out but doesn't have a lot of time. So we're pretty much on our own with
> problems in this area.

I have already sent the same mail to the
linux-fbdev-devel mailing list at sf.net, with hope :)

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2007-02-21 23:41:24

by Antonino A. Daplas

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

On Thu, 2007-02-15 at 13:00 +0200, S.Çağlar Onur wrote:
> On Thu, 15 Feb 2007, Andrew Morton wrote:
> > On Thu, 15 Feb 2007 11:40:32 +0100 Pavel Machek <[email protected]> wrote:
> > > Contact fbcon people...
> >
> > There aren't any, basically. Since Tony disappeared James has been helping
> > out but doesn't have a lot of time. So we're pretty much on our own with
> > problems in this area.
>
> I already sent same mail to
> linux-fbdev-devel mailing lists at sf.net with hope :)
>
> Cheers

Interesting... It does look like this was triggered by calling
do_blank_screen() without taking the console semaphore, but
console_callback() should have taken that.

Second point is that vesafb does not have any blanking functionality,
thus it should not trigger fbcon_event_notify(). My guess is you are
using an out-of-tree vesafb?

BUG: at drivers/char/vt.c:3332 do_blank_screen()
[<c02881f7>] do_blank_screen+0x4e/0x218
[<c02977fa>] fbcon_event_notify+0x8f1/0xa1e
[<c027b2b0>] extract_buf+0xac/0xe1
[<c0102002>] __switch_to+0xeb/0x15d
[<c034bbe5>] __sched_text_start+0x865/0x929
[<c029a50d>] bit_cursor+0x4c8/0x50b
[<c034bd67>] wait_for_completion+0x79/0xaf
[<c011cbe2>] default_wake_function+0x0/0xc
[<c034ece2>] notifier_call_chain+0x19/0x32
[<c012d1c6>] blocking_notifier_call_chain+0x23/0x33
[<c028db54>] fb_blank+0x4a/0x53
[<c0299028>] fbcon_blank+0xf4/0x1e3
[<c0294f33>] fbcon_cursor+0x21c/0x250
[<c029a045>] bit_cursor+0x0/0x50b
[<c0129e90>] lock_timer_base+0x15/0x2f
[<c0129eee>] try_to_del_timer_sync+0x44/0x4a
[<c0298f34>] fbcon_blank+0x0/0x1e3
[<c028835a>] do_blank_screen+0x1b1/0x218
[<c028abdd>] console_callback+0xaf/0xbf
[<c012fa59>] run_workqueue+0x85/0x135
[<c028ab2e>] console_callback+0x0/0xbf
[<c01302f0>] worker_thread+0x10a/0x136
[<c011cbe2>] default_wake_function+0x0/0xc
[<c01301e6>] worker_thread+0x0/0x136
[<c0132982>] kthread+0xb2/0xdc
[<c01328d0>] kthread+0x0/0xdc
[<c0103a8f>] kernel_thread_helper+0x7/0x10

As for the last trace, it looks like a valid bug to me.
complete_change_console() should be called with the console sem
taken. I'll look into this.

BUG: at drivers/char/vt.c:3486 set_palette()
[<c0287538>] set_palette+0x41/0x59
[<c028886f>] redraw_screen+0x110/0x17e
[<c0282394>] complete_change_console+0x2a/0xba
[<c028ab73>] console_callback+0x45/0xbf
[<c012fa59>] run_workqueue+0x85/0x135
[<c028ab2e>] console_callback+0x0/0xbf
[<c01302f0>] worker_thread+0x10a/0x136
[<c011cbe2>] default_wake_function+0x0/0xc
[<c01301e6>] worker_thread+0x0/0x136
[<c0132982>] kthread+0xb2/0xdc
[<c01328d0>] kthread+0x0/0xdc
[<c0103a8f>] kernel_thread_helper+0x7/0x10

Tony
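
The locking rule discussed in this message can be illustrated with a small user-space toy model. This is only a sketch in Python, not kernel code: the class `ToyConsole` and its `warnings` list are hypothetical, and the method names merely mirror the kernel functions named in the thread (`acquire_console_sem()`, `do_blank_screen()`, `console_callback()`). It shows why a call made through console_callback() stays quiet, while a call that reaches do_blank_screen() by some other path without the console semaphore produces the kind of "BUG: at ..." report seen in the traces.

```python
import threading


class ToyConsole:
    """Toy model of the console semaphore check (hypothetical, user space only)."""

    def __init__(self):
        self._sem = threading.Lock()   # stands in for the console semaphore
        self._locked = False           # what an "is the console locked?" test would see
        self.warnings = []             # collected instead of printing to dmesg

    def acquire_console_sem(self):
        self._sem.acquire()
        self._locked = True

    def release_console_sem(self):
        self._locked = False
        self._sem.release()

    def _warn_console_unlocked(self, where):
        # Complain when a console function runs without the lock held.
        if not self._locked:
            self.warnings.append(f"BUG: console unlocked in {where}()")

    def do_blank_screen(self):
        self._warn_console_unlocked("do_blank_screen")
        # ... blanking work would happen here ...

    def console_callback(self):
        # The expected path: take the semaphore, then do console work.
        self.acquire_console_sem()
        try:
            self.do_blank_screen()
        finally:
            self.release_console_sem()
```

In this model, `ToyConsole().console_callback()` records no warning, whereas calling `do_blank_screen()` directly records one. The model says nothing about which path in the real kernel reached do_blank_screen() unlocked; that is the open question in the thread.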


2007-02-22 12:53:04

by S.Çağlar Onur

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

Hi;

On Thu, 22 Feb 2007, Antonino A. Daplas wrote:
> On Thu, 2007-02-15 at 13:00 +0200, S.Çağlar Onur wrote:
> > On Thu, 15 Feb 2007, Andrew Morton wrote:
> > > On Thu, 15 Feb 2007 11:40:32 +0100 Pavel Machek <[email protected]> wrote:
> > > > Contact fbcon people...
> > >
> > > There aren't any, basically. Since Tony disappeared James has been
> > > helping out but doesn't have a lot of time. So we're pretty much on
> > > our own with problems in this area.
> >
> > I already sent same mail to
> > linux-fbdev-devel mailing lists at sf.net with hope :)
> >
> > Cheers
>
> Interesting... It does look like this was triggered by calling
> do_blank_screen() without taking the console semaphore, but
> console_callback() should have taken that.
>
> Second point is that vesafb does not have any blanking functionality,
> thus it should not trigger fbcon_event_notify(). My guess is you are
> using an out-of-tree vesafb?

As I wrote, we have been using vesafb-tng [http://dev.gentoo.org/~spock/projects/]
for a long time, but this hits me only with 2.6.20 and only after
suspend2disk, so I'm adding Michał Januszewski to CC in case this is vesafb-tng
related :).

> BUG: at drivers/char/vt.c:3332 do_blank_screen()
> [<c02881f7>] do_blank_screen+0x4e/0x218
> [<c02977fa>] fbcon_event_notify+0x8f1/0xa1e
> [<c027b2b0>] extract_buf+0xac/0xe1
> [<c0102002>] __switch_to+0xeb/0x15d
> [<c034bbe5>] __sched_text_start+0x865/0x929
> [<c029a50d>] bit_cursor+0x4c8/0x50b
> [<c034bd67>] wait_for_completion+0x79/0xaf
> [<c011cbe2>] default_wake_function+0x0/0xc
> [<c034ece2>] notifier_call_chain+0x19/0x32
> [<c012d1c6>] blocking_notifier_call_chain+0x23/0x33
> [<c028db54>] fb_blank+0x4a/0x53
> [<c0299028>] fbcon_blank+0xf4/0x1e3
> [<c0294f33>] fbcon_cursor+0x21c/0x250
> [<c029a045>] bit_cursor+0x0/0x50b
> [<c0129e90>] lock_timer_base+0x15/0x2f
> [<c0129eee>] try_to_del_timer_sync+0x44/0x4a
> [<c0298f34>] fbcon_blank+0x0/0x1e3
> [<c028835a>] do_blank_screen+0x1b1/0x218
> [<c028abdd>] console_callback+0xaf/0xbf
> [<c012fa59>] run_workqueue+0x85/0x135
> [<c028ab2e>] console_callback+0x0/0xbf
> [<c01302f0>] worker_thread+0x10a/0x136
> [<c011cbe2>] default_wake_function+0x0/0xc
> [<c01301e6>] worker_thread+0x0/0x136
> [<c0132982>] kthread+0xb2/0xdc
> [<c01328d0>] kthread+0x0/0xdc
> [<c0103a8f>] kernel_thread_helper+0x7/0x10
>
> As for the last tracing, it looks to be valid bug to me.
> complete_change_console() should be called with the console sem
> taken. I'll look into this.

If testing is needed, please just ask :)

> BUG: at drivers/char/vt.c:3486 set_palette()
> [<c0287538>] set_palette+0x41/0x59
> [<c028886f>] redraw_screen+0x110/0x17e
> [<c0282394>] complete_change_console+0x2a/0xba
> [<c028ab73>] console_callback+0x45/0xbf
> [<c012fa59>] run_workqueue+0x85/0x135
> [<c028ab2e>] console_callback+0x0/0xbf
> [<c01302f0>] worker_thread+0x10a/0x136
> [<c011cbe2>] default_wake_function+0x0/0xc
> [<c01301e6>] worker_thread+0x0/0x136
> [<c0132982>] kthread+0xb2/0xdc
> [<c01328d0>] kthread+0x0/0xdc
> [<c0103a8f>] kernel_thread_helper+0x7/0x10
>
> Tony

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2007-02-22 13:38:28

by Antonino A. Daplas

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

On Thu, 2007-02-22 at 14:53 +0200, S.Çağlar Onur wrote:
> Hi;
>
> On Thu, 22 Feb 2007, Antonino A. Daplas wrote:
> > On Thu, 2007-02-15 at 13:00 +0200, S.Çağlar Onur wrote:
> > > On Thu, 15 Feb 2007, Andrew Morton wrote:
> > > > On Thu, 15 Feb 2007 11:40:32 +0100 Pavel Machek <[email protected]> wrote:
> > > > > Contact fbcon people...
> > > >
> > > > There aren't any, basically. Since Tony disappeared James has been
> > > > helping out but doesn't have a lot of time. So we're pretty much on
> > > > our own with problems in this area.
> > >
> > > I already sent same mail to
> > > linux-fbdev-devel mailing lists at sf.net with hope :)
> > >
> > > Cheers
> >
> > Interesting... It does look like this was triggered by calling
> > do_blank_screen() without taking the console semaphore, but
> > console_callback() should have taken that.
> >
> > Second point is that vesafb does not have any blanking functionality,
> > thus it should not trigger fbcon_event_notify(). My guess is you are
> > using an out-of-tree vesafb?
>
> As i wrote we are using vesafb-tng [http://dev.gentoo.org/~spock/projects/]
> for a long time but this hits me only with 2.6.20 and only after
> suspend2disk, so im adding Michał Januszewski to CC if this is vesafb-tng
> related :).
>

Ah, and you have fb_splash too. That's why the tracing was not what I
expected it to be.

Try using video=vesafb:noblank to disable hardware blanking and find out
if you can still reproduce the oops.

> > BUG: at drivers/char/vt.c:3332 do_blank_screen()
> > [<c02881f7>] do_blank_screen+0x4e/0x218
> > [<c02977fa>] fbcon_event_notify+0x8f1/0xa1e
> > [<c027b2b0>] extract_buf+0xac/0xe1
> > [<c0102002>] __switch_to+0xeb/0x15d
> > [<c034bbe5>] __sched_text_start+0x865/0x929
> > [<c029a50d>] bit_cursor+0x4c8/0x50b
> > [<c034bd67>] wait_for_completion+0x79/0xaf
> > [<c011cbe2>] default_wake_function+0x0/0xc
> > [<c034ece2>] notifier_call_chain+0x19/0x32
> > [<c012d1c6>] blocking_notifier_call_chain+0x23/0x33
> > [<c028db54>] fb_blank+0x4a/0x53
> > [<c0299028>] fbcon_blank+0xf4/0x1e3
> > [<c0294f33>] fbcon_cursor+0x21c/0x250
> > [<c029a045>] bit_cursor+0x0/0x50b
> > [<c0129e90>] lock_timer_base+0x15/0x2f
> > [<c0129eee>] try_to_del_timer_sync+0x44/0x4a
> > [<c0298f34>] fbcon_blank+0x0/0x1e3
> > [<c028835a>] do_blank_screen+0x1b1/0x218
> > [<c028abdd>] console_callback+0xaf/0xbf
> > [<c012fa59>] run_workqueue+0x85/0x135
> > [<c028ab2e>] console_callback+0x0/0xbf
> > [<c01302f0>] worker_thread+0x10a/0x136
> > [<c011cbe2>] default_wake_function+0x0/0xc
> > [<c01301e6>] worker_thread+0x0/0x136
> > [<c0132982>] kthread+0xb2/0xdc
> > [<c01328d0>] kthread+0x0/0xdc
> > [<c0103a8f>] kernel_thread_helper+0x7/0x10
> >
> > As for the last tracing, it looks to be valid bug to me.
> > complete_change_console() should be called with the console sem
> > taken. I'll look into this.
>
> If testing needed just ask please :)

After grepping for change_console, all callers of change_console and
complete_change_console are acquiring the console semaphore, so I really
don't know what's going on here...

Since you are using a non-vanilla kernel, can you just do
a grep for change_console in the kernel source and see if you can find a
caller that missed doing an acquire_console_sem()?

>
> > BUG: at drivers/char/vt.c:3486 set_palette()
> > [<c0287538>] set_palette+0x41/0x59
> > [<c028886f>] redraw_screen+0x110/0x17e
> > [<c0282394>] complete_change_console+0x2a/0xba
> > [<c028ab73>] console_callback+0x45/0xbf
> > [<c012fa59>] run_workqueue+0x85/0x135
> > [<c028ab2e>] console_callback+0x0/0xbf
> > [<c01302f0>] worker_thread+0x10a/0x136
> > [<c011cbe2>] default_wake_function+0x0/0xc
> > [<c01301e6>] worker_thread+0x0/0x136
> > [<c0132982>] kthread+0xb2/0xdc
> > [<c01328d0>] kthread+0x0/0xdc
> > [<c0103a8f>] kernel_thread_helper+0x7/0x10
> >
> > Tony
>
> Cheers

Tony
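
The grep requested above can be taken one step further with a small script. The following is a rough sketch, not a tool used by anyone in this thread: the function name `find_unlocked_callers` and the `window` parameter are made up for illustration, and the check is a crude textual heuristic. It lists indented call sites of change_console() or complete_change_console() in `.c` files that have no acquire_console_sem() call in the preceding lines. A hit is only a candidate for manual review, since the semaphore may be taken by a caller further up the call chain.

```python
import os
import re

# Matches change_console( and complete_change_console(
CALL_RE = re.compile(r"\b(?:complete_)?change_console\s*\(")
LOCK_RE = re.compile(r"\bacquire_console_sem\s*\(")


def find_unlocked_callers(root, window=40):
    """Return (path, line_no, text) for call sites with no
    acquire_console_sem() in the preceding `window` lines.

    Heuristic only: prototypes and definitions (which start in column 0)
    and comment lines are skipped; locks held by a caller are not seen.
    """
    suspects = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith(".c"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                lines = fh.read().splitlines()
            for i, line in enumerate(lines):
                # Keep only indented lines that contain a call.
                if not CALL_RE.search(line) or not line[:1].isspace():
                    continue
                # Skip mentions inside comments.
                if line.lstrip().startswith(("*", "/*", "//")):
                    continue
                before = lines[max(0, i - window):i]
                if not any(LOCK_RE.search(prev) for prev in before):
                    suspects.append((path, i + 1, line.strip()))
    return suspects
```

A possible invocation, assuming the kernel tree is unpacked in a directory named `linux-2.6.20` (a hypothetical path), would be `find_unlocked_callers("linux-2.6.20/drivers/char")`, followed by reading each reported function by hand.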

2007-02-22 13:52:05

by S.Çağlar Onur

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

Hi;

On Thu, 22 Feb 2007, Antonino A. Daplas wrote:
> Ah, and you have fb_splash too. That's why the tracing was not what I
> expected it to be.
>
> Try using video=vesafb:noblank to disable hardware blanking and find out
> if you can still reproduce the oops.

I'll try that, and I will also try to reproduce it with a vanilla kernel.

> After grepping for change_console, all callers of change_console and
> complete_change_console are acquiring the console semaphore, so I really
> don't know what's going on here...
>
> Since you are using a non-vanilla kernel, can you just do
> a grep change_console of the kernel source and see if you can find a
> caller that missed doing an acquire_console_sem().

The fbsplash and vesafb-tng patches contain no change_console calls, and the
patched kernel seems to be the same as the vanilla one:

[caglar@zangetsu][~/svk/playground/caglar/kernel/kernel/files/gentoo]> grep change_console fbsplash-0.9.2-r5.patch
[caglar@zangetsu][~/svk/playground/caglar/kernel/kernel/files/gentoo]> grep change_console vesafb-tng-1.0-rc2.patch
[caglar@zangetsu][~/svk/playground/caglar/kernel/kernel/files/gentoo]>

zangetsu linux-2.6.20 # grep change_console * -r
drivers/char/vt_ioctl.c:static void complete_change_console(struct vc_data *vc);
drivers/char/vt_ioctl.c: complete_change_console(vc_cons[newvt].d);
drivers/char/vt_ioctl.c:static void complete_change_console(struct vc_data *vc)
drivers/char/vt_ioctl.c: * clean up (similar to logic employed in change_console())
drivers/char/vt_ioctl.c:void change_console(struct vc_data *new_vc)
drivers/char/vt_ioctl.c: complete_change_console(new_vc);
drivers/char/vt.c: change_console(vc_cons[want_console].d);
include/linux/vt_kern.h:void change_console(struct vc_data *new_vc);

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2007-02-28 12:09:45

by S.Çağlar Onur

Subject: Re: [BUG] at drivers/char/vt.c:3332 do_blank_screen() on resume

On Thu, 22 Feb 2007, S.Çağlar Onur wrote:
> On Thu, 22 Feb 2007, Antonino A. Daplas wrote:
> > Ah, and you have fb_splash too. That's why the tracing was not what I
> > expected it to be.
> >
> > Try using video=vesafb:noblank to disable hardware blanking and find out
> > if you can still reproduce the oops.
>
> I'll try and also will try to reproduce with vanilla one.

Sorry for the long delay; here are some more test results:

* Using video=vesafb:noblank _sometimes_ causes hard freezes on resume, and
_sometimes_ X can't start properly (it enters a weird "switch to vt1 - wait a
few seconds - switch to vt7" loop, and after ~10 minutes X starts magically!).
Whenever this happens the system becomes really unresponsive, and dmesg
and Xorg's logs show nothing strange :(. I will try an older kernel with
noblank to see whether it is related or not.

* If the system resumes normally with noblank (it sometimes does :)), dmesg
shows no errors at all, X starts normally and the system works well.

* I cannot reproduce that BUG with the _vanilla kernel_, but please note that I
cannot easily reproduce it with the patched one either (it occurs only about
once in ~20 suspend2disk/resume cycles).

I'll keep testing and trying to reproduce it; if I find anything else
I'll knock on your door again :)

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!

