2012-10-25 19:38:17

by Sasha Levin

[permalink] [raw]
Subject: tty, vt: lockdep warnings

Hi all,

While fuzzing with trinity inside a KVM tools (lkvm) guest running latest -next kernel,
I've stumbled on the following spew:

[ 603.081796] ======================================================
[ 603.081797] [ INFO: possible circular locking dependency detected ]
[ 603.081800] 3.7.0-rc2-next-20121025-sasha-00001-g673f98e-dirty #77 Tainted: G W
[ 603.081801] -------------------------------------------------------
[ 603.081802] kworker/0:1/902 is trying to acquire lock:
[ 603.081815] ((fb_notifier_list).rwsem){.+.+.+}, at: [<ffffffff8114136f>] __blocking_notifier_call_chain+0x7f/0xc0
[ 603.081815]
[ 603.081815] but task is already holding lock:
[ 603.081822] (console_lock){+.+.+.}, at: [<ffffffff81bc833e>] console_callback+0xe/0x130
[ 603.081823]
[ 603.081823] which lock already depends on the new lock.
[ 603.081823]
[ 603.081824]
[ 603.081824] the existing dependency chain (in reverse order) is:
[ 603.081827]
[ 603.081827] -> #1 (console_lock){+.+.+.}:
[ 603.081832] [<ffffffff8118569a>] lock_acquire+0x1aa/0x240
[ 603.081837] [<ffffffff8110b6e8>] console_lock+0x68/0x70
[ 603.081841] [<ffffffff81bc786b>] register_con_driver+0x1b/0x110
[ 603.081844] [<ffffffff81bc7a99>] take_over_console+0x29/0x60
[ 603.081848] [<ffffffff81a6ee33>] fbcon_takeover+0x63/0xc0
[ 603.081851] [<ffffffff81a724af>] fbcon_event_notify+0x33f/0x730
[ 603.081854] [<ffffffff81140e8e>] notifier_call_chain+0xee/0x130
[ 603.081857] [<ffffffff81141388>] __blocking_notifier_call_chain+0x98/0xc0
[ 603.081860] [<ffffffff811413c1>] blocking_notifier_call_chain+0x11/0x20
[ 603.081863] [<ffffffff81a635b6>] fb_notifier_call_chain+0x16/0x20
[ 603.081866] [<ffffffff81a6638d>] register_framebuffer+0x24d/0x290
[ 603.081871] [<ffffffff83961605>] vga16fb_probe+0x1c0/0x227
[ 603.081876] [<ffffffff81e63192>] platform_drv_probe+0x12/0x20
[ 603.081879] [<ffffffff81e61bb5>] driver_probe_device+0x155/0x340
[ 603.081881] [<ffffffff81e61e7e>] __device_attach+0x2e/0x50
[ 603.081884] [<ffffffff81e5fff6>] bus_for_each_drv+0x56/0xa0
[ 603.081887] [<ffffffff81e617a8>] device_attach+0x88/0xc0
[ 603.081889] [<ffffffff81e60236>] bus_probe_device+0x36/0xd0
[ 603.081892] [<ffffffff81e5e0bf>] device_add+0x4df/0x750
[ 603.081895] [<ffffffff81e63aa8>] platform_device_add+0x1e8/0x280
[ 603.081900] [<ffffffff85b11144>] vga16fb_init+0x8d/0xbb
[ 603.081905] [<ffffffff85acecb2>] do_one_initcall+0x7a/0x135
[ 603.081908] [<ffffffff8392eb19>] kernel_init+0x299/0x470
[ 603.081912] [<ffffffff83aedebc>] ret_from_fork+0x7c/0xb0
[ 603.081915]
[ 603.081915] -> #0 ((fb_notifier_list).rwsem){.+.+.+}:
[ 603.081918] [<ffffffff811828df>] __lock_acquire+0x14df/0x1ca0
[ 603.081921] [<ffffffff8118569a>] lock_acquire+0x1aa/0x240
[ 603.081924] [<ffffffff83aea187>] down_read+0x47/0x90
[ 603.081927] [<ffffffff8114136f>] __blocking_notifier_call_chain+0x7f/0xc0
[ 603.081930] [<ffffffff811413c1>] blocking_notifier_call_chain+0x11/0x20
[ 603.081932] [<ffffffff81a635b6>] fb_notifier_call_chain+0x16/0x20
[ 603.081934] [<ffffffff81a65306>] fb_blank+0x36/0xa0
[ 603.081938] [<ffffffff81a729ee>] fbcon_blank+0x14e/0x2d0
[ 603.081941] [<ffffffff81bc7c93>] do_blank_screen+0x1b3/0x2b0
[ 603.081943] [<ffffffff81bc83f3>] console_callback+0xc3/0x130
[ 603.081946] [<ffffffff8112d5a9>] process_one_work+0x3b9/0x770
[ 603.081949] [<ffffffff8112df2a>] worker_thread+0x2ba/0x3f0
[ 603.081951] [<ffffffff81138c33>] kthread+0xe3/0xf0
[ 603.081954] [<ffffffff83aedebc>] ret_from_fork+0x7c/0xb0
[ 603.081955]
[ 603.081955] other info that might help us debug this:
[ 603.081955]
[ 603.081956] Possible unsafe locking scenario:
[ 603.081956]
[ 603.081956] CPU0 CPU1
[ 603.081957] ---- ----
[ 603.081959] lock(console_lock);
[ 603.081961] lock((fb_notifier_list).rwsem);
[ 603.081962] lock(console_lock);
[ 603.081964] lock((fb_notifier_list).rwsem);
[ 603.081964]
[ 603.081964] *** DEADLOCK ***
[ 603.081964]
[ 603.081966] 3 locks held by kworker/0:1/902:
[ 603.081971] #0: (events){.+.+.+}, at: [<ffffffff8112d458>] process_one_work+0x268/0x770
[ 603.081976] #1: (console_work){+.+...}, at: [<ffffffff8112d458>] process_one_work+0x268/0x770
[ 603.081981] #2: (console_lock){+.+.+.}, at: [<ffffffff81bc833e>] console_callback+0xe/0x130
[ 603.081981]
[ 603.081981] stack backtrace:
[ 603.081984] Pid: 902, comm: kworker/0:1 Tainted: G W 3.7.0-rc2-next-20121025-sasha-00001-g673f98e-dirty #77
[ 603.081985] Call Trace:
[ 603.081990] [<ffffffff83a90609>] print_circular_bug+0x1fb/0x20c
[ 603.081994] [<ffffffff811828df>] __lock_acquire+0x14df/0x1ca0
[ 603.081997] [<ffffffff8117cfda>] ? __bfs+0x16a/0x220
[ 603.082000] [<ffffffff8118569a>] lock_acquire+0x1aa/0x240
[ 603.082003] [<ffffffff8114136f>] ? __blocking_notifier_call_chain+0x7f/0xc0
[ 603.082005] [<ffffffff83aea187>] down_read+0x47/0x90
[ 603.082009] [<ffffffff8114136f>] ? __blocking_notifier_call_chain+0x7f/0xc0
[ 603.082011] [<ffffffff8114136f>] __blocking_notifier_call_chain+0x7f/0xc0
[ 603.082014] [<ffffffff811413c1>] blocking_notifier_call_chain+0x11/0x20
[ 603.082017] [<ffffffff81a635b6>] fb_notifier_call_chain+0x16/0x20
[ 603.082019] [<ffffffff81a65306>] fb_blank+0x36/0xa0
[ 603.082022] [<ffffffff81a729ee>] fbcon_blank+0x14e/0x2d0
[ 603.082025] [<ffffffff83aec67d>] ? _raw_spin_unlock_irqrestore+0x5d/0xb0
[ 603.082028] [<ffffffff811802cd>] ? trace_hardirqs_on+0xd/0x10
[ 603.082031] [<ffffffff83aec6a4>] ? _raw_spin_unlock_irqrestore+0x84/0xb0
[ 603.082035] [<ffffffff8111e444>] ? try_to_del_timer_sync+0x64/0x70
[ 603.082037] [<ffffffff8111e548>] ? del_timer_sync+0xf8/0x130
[ 603.082039] [<ffffffff8111e488>] ? del_timer_sync+0x38/0x130
[ 603.082042] [<ffffffff81bc7c93>] do_blank_screen+0x1b3/0x2b0
[ 603.082045] [<ffffffff81bc83f3>] console_callback+0xc3/0x130
[ 603.082048] [<ffffffff8112d5a9>] process_one_work+0x3b9/0x770
[ 603.082050] [<ffffffff8112d458>] ? process_one_work+0x268/0x770
[ 603.082053] [<ffffffff8112ae60>] ? work_fixup_activate+0x90/0x90
[ 603.082056] [<ffffffff81bc8330>] ? poke_blanked_console+0x100/0x100
[ 603.082059] [<ffffffff8112df2a>] worker_thread+0x2ba/0x3f0
[ 603.082061] [<ffffffff8112dc70>] ? rescuer_thread+0x2d0/0x2d0
[ 603.082063] [<ffffffff81138c33>] kthread+0xe3/0xf0
[ 603.082066] [<ffffffff8117d7be>] ? put_lock_stats.isra.16+0xe/0x40
[ 603.082069] [<ffffffff81138b50>] ? insert_kthread_work+0x90/0x90
[ 603.082072] [<ffffffff83aedebc>] ret_from_fork+0x7c/0xb0
[ 603.082074] [<ffffffff81138b50>] ? insert_kthread_work+0x90/0x90


2012-10-26 13:33:06

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Thu, 25 Oct 2012 15:37:43 -0400
Sasha Levin <[email protected]> wrote:

> Hi all,
>
> While fuzzing with trinity inside a KVM tools (lkvm) guest running latest -next kernel,
> I've stumbled on the following spew:

Looks real enough but its not a tty/vt layer spew. This is all coming out
of the core framebuffer code which doesn't seem to be able to decide what
the locking rules at the invocation of fb_notifier_call_chain are.

It might need some console layer tweaking to provide 'register console
and I already hold the locks' or similar but that notifier needs some
kind of sanity applying as well.

Cc'ing the fbdev folks

Alan

2012-11-05 17:27:07

by Sasha Levin

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

Ping? Should I bisect it?

On Fri, Oct 26, 2012 at 9:37 AM, Alan Cox <[email protected]> wrote:
> On Thu, 25 Oct 2012 15:37:43 -0400
> Sasha Levin <[email protected]> wrote:
>
>> Hi all,
>>
>> While fuzzing with trinity inside a KVM tools (lkvm) guest running latest -next kernel,
>> I've stumbled on the following spew:
>
> Looks real enough but its not a tty/vt layer spew. This is all coming out
> of the core framebuffer code which doesn't seem to be able to decide what
> the locking rules at the invocation of fb_notifier_call_chain are.
>
> It might need some console layer tweaking to provide 'register console
> and I already hold the locks' or similar but that notifier needs some
> kind of sanity applying as well.
>
> Cc'ing the fbdev folks
>
> Alan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2012-11-05 17:54:40

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Mon, 5 Nov 2012 12:26:43 -0500
Sasha Levin <[email protected]> wrote:

> Ping? Should I bisect it?
>
> On Fri, Oct 26, 2012 at 9:37 AM, Alan Cox <[email protected]> wrote:
> > On Thu, 25 Oct 2012 15:37:43 -0400
> > Sasha Levin <[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> While fuzzing with trinity inside a KVM tools (lkvm) guest running latest -next kernel,
> >> I've stumbled on the following spew:
> >
> > Looks real enough but its not a tty/vt layer spew. This is all coming out
> > of the core framebuffer code which doesn't seem to be able to decide what
> > the locking rules at the invocation of fb_notifier_call_chain are.
> >
> > It might need some console layer tweaking to provide 'register console
> > and I already hold the locks' or similar but that notifier needs some
> > kind of sanity applying as well.
> >
> > Cc'ing the fbdev folks

I've cc'd the framebuffer folks. I can see why its occurring but I have
no idea how they intend to fix it and I've not seen any replies.

Sorry but I've got enough other things on my plate right now without
trying to deal with the locking brain damage that the fbdev layer is.

As far as I can tell the actual bug proper is years old.

Alan

2012-11-05 18:01:09

by Sasha Levin

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On 11/05/2012 12:59 PM, Alan Cox wrote:
> On Mon, 5 Nov 2012 12:26:43 -0500
> Sasha Levin <[email protected]> wrote:
>
>> Ping? Should I bisect it?
>>
>> On Fri, Oct 26, 2012 at 9:37 AM, Alan Cox <[email protected]> wrote:
>>> On Thu, 25 Oct 2012 15:37:43 -0400
>>> Sasha Levin <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> While fuzzing with trinity inside a KVM tools (lkvm) guest running latest -next kernel,
>>>> I've stumbled on the following spew:
>>>
>>> Looks real enough but its not a tty/vt layer spew. This is all coming out
>>> of the core framebuffer code which doesn't seem to be able to decide what
>>> the locking rules at the invocation of fb_notifier_call_chain are.
>>>
>>> It might need some console layer tweaking to provide 'register console
>>> and I already hold the locks' or similar but that notifier needs some
>>> kind of sanity applying as well.
>>>
>>> Cc'ing the fbdev folks
>
> I've cc'd the framebuffer folks. I can see why its occurring but I have
> no idea how they intend to fix it and I've not seen any replies.
>
> Sorry but I've got enough other things on my plate right now without
> trying to deal with the locking brain damage that the fbdev layer is.
>
> As far as I can tell the actual bug proper is years old.
>
> Alan
>

Ow, I figured it's something new since I've only now started seeing it in fuzz
tests, and it reproduces pretty much every time.


Thanks,
Sasha

2012-11-05 19:18:05

by Hugh Dickins

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Mon, 5 Nov 2012, Sasha Levin wrote:
> On 11/05/2012 12:59 PM, Alan Cox wrote:
> > On Mon, 5 Nov 2012 12:26:43 -0500
> > Sasha Levin <[email protected]> wrote:
> >
> >> Ping? Should I bisect it?
> >>
> >> On Fri, Oct 26, 2012 at 9:37 AM, Alan Cox <[email protected]> wrote:
> >>> On Thu, 25 Oct 2012 15:37:43 -0400
> >>> Sasha Levin <[email protected]> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> While fuzzing with trinity inside a KVM tools (lkvm) guest running latest -next kernel,
> >>>> I've stumbled on the following spew:
> >>>
> >>> Looks real enough but its not a tty/vt layer spew. This is all coming out
> >>> of the core framebuffer code which doesn't seem to be able to decide what
> >>> the locking rules at the invocation of fb_notifier_call_chain are.
> >>>
> >>> It might need some console layer tweaking to provide 'register console
> >>> and I already hold the locks' or similar but that notifier needs some
> >>> kind of sanity applying as well.
> >>>
> >>> Cc'ing the fbdev folks
> >
> > I've cc'd the framebuffer folks. I can see why its occurring but I have
> > no idea how they intend to fix it and I've not seen any replies.
> >
> > Sorry but I've got enough other things on my plate right now without
> > trying to deal with the locking brain damage that the fbdev layer is.
> >
> > As far as I can tell the actual bug proper is years old.
> >
> > Alan
> >
>
> Ow, I figured it's something new since I've only now started seeing it in fuzz
> tests, and it reproduces pretty much every time.

The fbdev potential for deadlock may be years old, but the warning
(and consequent disabling of lockdep from that point on - making it
useless to everybody else in need of it) is new, and comes from the
commit below in linux-next.

I revert it in my own testing: if there is no quick fix to the
fbdev issue on the way, Daniel, please revert it from your tree.

Thanks,
Hugh

commit daee779718a319ff9f83e1ba3339334ac650bb22
Author: Daniel Vetter <[email protected]>
Date: Sat Sep 22 19:52:11 2012 +0200

console: implement lockdep support for console_lock

Dave Airlie recently discovered a locking bug in the fbcon layer,
where a timer_del_sync (for the blinking cursor) deadlocks with the
timer itself, since both (want to) hold the console_lock:

https://lkml.org/lkml/2012/8/21/36

Unfortunately the console_lock isn't a plain mutex and hence has no
lockdep support. Which resulted in a few days wasted of tracking down
this bug (complicated by the fact that printk doesn't show anything
when the console is locked) instead of noticing the bug much earlier
with the lockdep splat.

Hence I've figured I need to fix that for the next deadlock involving
console_lock - and with kms/drm growing ever more complex locking
that'll eventually happen.

Now the console_lock has rather funky semantics, so after a quick irc
discussion with Thomas Gleixner and Dave Airlie I've quickly ditched
the original idead of switching to a real mutex (since it won't work)
and instead opted to annotate the console_lock with lockdep
information manually.

There are a few special cases:
- The console_lock state is protected by the console_sem, and usually
grabbed/dropped at _lock/_unlock time. But the suspend/resume code
drops the semaphore without dropping the console_lock (see
suspend_console/resume_console). But since the same thread that did
the suspend will do the resume, we don't need to fix up anything.

- In the printk code there's a special trylock, only used to kick off
the logbuffer printk'ing in console_unlock. But all that happens
while lockdep is disable (since printk does a few other evil
tricks). So no issue there, either.

- The console_lock can also be acquired form irq context (but only
with a trylock). lockdep already handles that.

This all leaves us with annotating the normal console_lock, _unlock
and _trylock functions.

And yes, it works - simply unloading a drm kms driver resulted in
lockdep complaining about the deadlock in fbcon_deinit:

======================================================
[ INFO: possible circular locking dependency detected ]
3.6.0-rc2+ #552 Not tainted
-------------------------------------------------------
kms-reload/3577 is trying to acquire lock:
((&info->queue)){+.+...}, at: [<ffffffff81058c70>] wait_on_work+0x0/0xa7

but task is already holding lock:
(console_lock){+.+.+.}, at: [<ffffffff81264686>] bind_con_driver+0x38/0x263

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (console_lock){+.+.+.}:
[<ffffffff81087440>] lock_acquire+0x95/0x105
[<ffffffff81040190>] console_lock+0x59/0x5b
[<ffffffff81209cb6>] fb_flashcursor+0x2e/0x12c
[<ffffffff81057c3e>] process_one_work+0x1d9/0x3b4
[<ffffffff810584a2>] worker_thread+0x1a7/0x24b
[<ffffffff8105ca29>] kthread+0x7f/0x87
[<ffffffff813b1204>] kernel_thread_helper+0x4/0x10

-> #0 ((&info->queue)){+.+...}:
[<ffffffff81086cb3>] __lock_acquire+0x999/0xcf6
[<ffffffff81087440>] lock_acquire+0x95/0x105
[<ffffffff81058cab>] wait_on_work+0x3b/0xa7
[<ffffffff81058dd6>] __cancel_work_timer+0xbf/0x102
[<ffffffff81058e33>] cancel_work_sync+0xb/0xd
[<ffffffff8120a3b3>] fbcon_deinit+0x11c/0x1dc
[<ffffffff81264793>] bind_con_driver+0x145/0x263
[<ffffffff81264a45>] unbind_con_driver+0x14f/0x195
[<ffffffff8126540c>] store_bind+0x1ad/0x1c1
[<ffffffff8127cbb7>] dev_attr_store+0x13/0x1f
[<ffffffff8116d884>] sysfs_write_file+0xe9/0x121
[<ffffffff811145b2>] vfs_write+0x9b/0xfd
[<ffffffff811147b7>] sys_write+0x3e/0x6b
[<ffffffff813b0039>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(console_lock);
lock((&info->queue));
lock(console_lock);
lock((&info->queue));

*** DEADLOCK ***

v2: Mark the lockdep_map static, noticed by Jani Nikula.

Cc: Dave Airlie <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Alan Cox <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

diff --git a/kernel/printk.c b/kernel/printk.c
index 2d607f4..ee79f14 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -87,6 +87,12 @@ static DEFINE_SEMAPHORE(console_sem);
struct console *console_drivers;
EXPORT_SYMBOL_GPL(console_drivers);

+#ifdef CONFIG_LOCKDEP
+static struct lockdep_map console_lock_dep_map = {
+ .name = "console_lock"
+};
+#endif
+
/*
* This is used for debugging the mess that is the VT code by
* keeping track if we have the console semaphore held. It's
@@ -1914,6 +1920,7 @@ void console_lock(void)
return;
console_locked = 1;
console_may_schedule = 1;
+ mutex_acquire(&console_lock_dep_map, 0, 0, _RET_IP_);
}
EXPORT_SYMBOL(console_lock);

@@ -1935,6 +1942,7 @@ int console_trylock(void)
}
console_locked = 1;
console_may_schedule = 0;
+ mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_);
return 1;
}
EXPORT_SYMBOL(console_trylock);
@@ -2095,6 +2103,7 @@ skip:
local_irq_restore(flags);
}
console_locked = 0;
+ mutex_release(&console_lock_dep_map, 1, _RET_IP_);

/* Release the exclusive_console once it is used */
if (unlikely(exclusive_console))

2012-11-05 20:10:26

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

> The fbdev potential for deadlock may be years old, but the warning
> (and consequent disabling of lockdep from that point on - making it
> useless to everybody else in need of it) is new, and comes from the
> commit below in linux-next.
>
> I revert it in my own testing: if there is no quick fix to the
> fbdev issue on the way, Daniel, please revert it from your tree.

If you revert it you swap it for a different deadlock - and one that
happens more often I would expect. Not very useful.

I'm hoping the framebuffer maintainer will bother to respond to this
because that's the only way it can be sorted out.

Alan

2012-11-05 20:34:57

by Hugh Dickins

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Mon, 5 Nov 2012, Alan Cox wrote:
> > The fbdev potential for deadlock may be years old, but the warning
> > (and consequent disabling of lockdep from that point on - making it
> > useless to everybody else in need of it) is new, and comes from the
> > commit below in linux-next.
> >
> > I revert it in my own testing: if there is no quick fix to the
> > fbdev issue on the way, Daniel, please revert it from your tree.
>
> If you revert it you swap it for a different deadlock - and one that
> happens more often I would expect. Not very useful.

But a deadlock we have lived with for years. Without reverting,
we're prevented from discovering all the new deadlocks we're adding.

>
> I'm hoping the framebuffer maintainer will bother to respond to this
> because that's the only way it can be sorted out.

That would be ideal - thanks.

Hugh

2012-11-06 16:06:13

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Mon, 5 Nov 2012 12:34:44 -0800 (PST)
Hugh Dickins <[email protected]> wrote:

> On Mon, 5 Nov 2012, Alan Cox wrote:
> > > The fbdev potential for deadlock may be years old, but the warning
> > > (and consequent disabling of lockdep from that point on - making it
> > > useless to everybody else in need of it) is new, and comes from the
> > > commit below in linux-next.
> > >
> > > I revert it in my own testing: if there is no quick fix to the
> > > fbdev issue on the way, Daniel, please revert it from your tree.
> >
> > If you revert it you swap it for a different deadlock - and one that
> > happens more often I would expect. Not very useful.
>
> But a deadlock we have lived with for years. Without reverting,
> we're prevented from discovering all the new deadlocks we're adding.

We lived with it locking boxes up on users but not knowing why. The root
cause is loading two different framebuffers with one taking over from
another - that should be an obscure corner case and once the fuzz testing
can avoid.

> That would be ideal - thanks.


I had a semi-informed poke at this and came up with a possible patch (not very tested)


commit f4fa6c739ecc367dbb98f5be1ff626d9b2750878
Author: Alan Cox <[email protected]>
Date: Tue Nov 6 15:33:18 2012 +0000

fb: Rework locking to fix lock ordering on takeover

Adjust the console layer to allow a take over call where the caller already
holds the locks. Make the fb layer lock in order.

This s partly a band aid, the fb layer is terminally confused about the
locking rules it uses for its notifiers it seems.

Signed-off-by: Alan Cox <[email protected]>

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index f87d7e8..ea57f27 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2984,7 +2984,7 @@ int __init vty_init(const struct file_operations *console_fops)

static struct class *vtconsole_class;

-static int bind_con_driver(const struct consw *csw, int first, int last,
+static int do_bind_con_driver(const struct consw *csw, int first, int last,
int deflt)
{
struct module *owner = csw->owner;
@@ -2995,7 +2995,7 @@ static int bind_con_driver(const struct consw *csw, int first, int last,
if (!try_module_get(owner))
return -ENODEV;

- console_lock();
+ WARN_CONSOLE_UNLOCKED();

/* check if driver is registered */
for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
@@ -3080,11 +3080,22 @@ static int bind_con_driver(const struct consw *csw, int first, int last,

retval = 0;
err:
- console_unlock();
module_put(owner);
return retval;
};

+
+static int bind_con_driver(const struct consw *csw, int first, int last,
+ int deflt)
+{
+ int ret;
+
+ console_unlock();
+ ret = do_bind_con_driver(csw, first, last, deflt);
+ console_unlock();
+ return ret;
+}
+
#ifdef CONFIG_VT_HW_CONSOLE_BINDING
static int con_is_graphics(const struct consw *csw, int first, int last)
{
@@ -3196,9 +3207,9 @@ int unbind_con_driver(const struct consw *csw, int first, int last, int deflt)
if (!con_is_bound(csw))
con_driver->flag &= ~CON_DRIVER_FLAG_INIT;

- console_unlock();
/* ignore return value, binding should not fail */
- bind_con_driver(defcsw, first, last, deflt);
+ do_bind_con_driver(defcsw, first, last, deflt);
+ console_unlock();
err:
module_put(owner);
return retval;
@@ -3489,28 +3500,18 @@ int con_debug_leave(void)
}
EXPORT_SYMBOL_GPL(con_debug_leave);

-/**
- * register_con_driver - register console driver to console layer
- * @csw: console driver
- * @first: the first console to take over, minimum value is 0
- * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
- *
- * DESCRIPTION: This function registers a console driver which can later
- * bind to a range of consoles specified by @first and @last. It will
- * also initialize the console driver by calling con_startup().
- */
-int register_con_driver(const struct consw *csw, int first, int last)
+static int do_register_con_driver(const struct consw *csw, int first, int last)
{
struct module *owner = csw->owner;
struct con_driver *con_driver;
const char *desc;
int i, retval = 0;

+ WARN_CONSOLE_UNLOCKED();
+
if (!try_module_get(owner))
return -ENODEV;

- console_lock();
-
for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
con_driver = &registered_con_driver[i];

@@ -3563,10 +3564,29 @@ int register_con_driver(const struct consw *csw, int first, int last)
}

err:
- console_unlock();
module_put(owner);
return retval;
}
+
+/**
+ * register_con_driver - register console driver to console layer
+ * @csw: console driver
+ * @first: the first console to take over, minimum value is 0
+ * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
+ *
+ * DESCRIPTION: This function registers a console driver which can later
+ * bind to a range of consoles specified by @first and @last. It will
+ * also initialize the console driver by calling con_startup().
+ */
+int register_con_driver(const struct consw *csw, int first, int last)
+{
+ int retval;
+
+ console_lock();
+ retval = do_register_con_driver(csw, first, last);
+ console_unlock();
+ return retval;
+}
EXPORT_SYMBOL(register_con_driver);

/**
@@ -3622,6 +3642,29 @@ EXPORT_SYMBOL(unregister_con_driver);
*
* take_over_console is basically a register followed by unbind
*/
+int do_take_over_console(const struct consw *csw, int first, int last, int deflt)
+{
+ int err;
+
+ err = do_register_con_driver(csw, first, last);
+ /* if we get an busy error we still want to bind the console driver
+ * and return success, as we may have unbound the console driver
+ ?* but not unregistered it.
+ */
+ if (err == -EBUSY)
+ err = 0;
+ if (!err)
+ do_bind_con_driver(csw, first, last, deflt);
+
+ return err;
+}
+/*
+ * If we support more console drivers, this function is used
+ * when a driver wants to take over some existing consoles
+ * and become default driver for newly opened ones.
+ *
+ * take_over_console is basically a register followed by unbind
+ */
int take_over_console(const struct consw *csw, int first, int last, int deflt)
{
int err;
diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index fdefa8f..c75f8ce 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -529,6 +529,34 @@ static int search_for_mapped_con(void)
return retval;
}

+static int do_fbcon_takeover(int show_logo)
+{
+ int err, i;
+
+ if (!num_registered_fb)
+ return -ENODEV;
+
+ if (!show_logo)
+ logo_shown = FBCON_LOGO_DONTSHOW;
+
+ for (i = first_fb_vc; i <= last_fb_vc; i++)
+ con2fb_map[i] = info_idx;
+
+ err = do_take_over_console(&fb_con, first_fb_vc, last_fb_vc,
+ fbcon_is_default);
+
+ if (err) {
+ for (i = first_fb_vc; i <= last_fb_vc; i++) {
+ con2fb_map[i] = -1;
+ }
+ info_idx = -1;
+ } else {
+ fbcon_has_console_bind = 1;
+ }
+
+ return err;
+}
+
static int fbcon_takeover(int show_logo)
{
int err, i;
@@ -3115,7 +3143,7 @@ static int fbcon_fb_registered(struct fb_info *info)
}

if (info_idx != -1)
- ret = fbcon_takeover(1);
+ ret = do_fbcon_takeover(1);
} else {
for (i = first_fb_vc; i <= last_fb_vc; i++) {
if (con2fb_map_boot[i] == idx)
diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
index 3ff0105..588bdab 100644
--- a/drivers/video/fbmem.c
+++ b/drivers/video/fbmem.c
@@ -1650,7 +1650,9 @@ static int do_register_framebuffer(struct fb_info *fb_info)
event.info = fb_info;
if (!lock_fb_info(fb_info))
return -ENODEV;
+ console_lock();
fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, &event);
+ console_unlock();
unlock_fb_info(fb_info);
return 0;
}
diff --git a/include/linux/console.h b/include/linux/console.h
index dedb082..4ef4307 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -78,6 +78,7 @@ int con_is_bound(const struct consw *csw);
int register_con_driver(const struct consw *csw, int first, int last);
int unregister_con_driver(const struct consw *csw);
int take_over_console(const struct consw *sw, int first, int last, int deflt);
+int do_take_over_console(const struct consw *sw, int first, int last, int deflt);
void give_up_console(const struct consw *sw);
#ifdef CONFIG_HW_CONSOLE
int con_debug_enter(struct vc_data *vc);

2012-11-06 16:42:49

by Dave Jones

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Tue, Nov 06, 2012 at 04:11:00PM +0000, Alan Cox wrote:

> > But a deadlock we have lived with for years. Without reverting,
> > we're prevented from discovering all the new deadlocks we're adding.
>
> We lived with it locking boxes up on users but not knowing why.

Circa 3.5 we got a lot more reports of this happening too for some reason.
Turns out that races are awfully resistant to bisecting too.

> The root
> cause is loading two different framebuffers with one taking over from
> another - that should be an obscure corner case and once the fuzz testing
> can avoid.
>
> I had a semi-informed poke at this and came up with a possible patch (not very tested)

If this fixes the real problems we've been seeing, I'll dance a jig.

Dave

2012-11-06 17:33:52

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

> > The root
> > cause is loading two different framebuffers with one taking over from
> > another - that should be an obscure corner case and once the fuzz testing
> > can avoid.
> >
> > I had a semi-informed poke at this and came up with a possible patch (not very tested)
>
> If this fixes the real problems we've been seeing, I'll dance a jig.

Youtube...

At this point my bigger concern is that it'll just make something else
warn instead. The underlying problem is that fbcon layer implements a
single threaded notifier whose locking semantics are at best random. It's
not calld with a specific set of locks each time. Possibly it sohuld be
two notifiers (one for fb stuff, one for console layer stuff) but the
entire layer is horrible. I live in home the KMS guys will rip out the
useful bits and build a straight kms fb layer with refcounting and the
like 8)

Testing certainly needed and if it's still blowing up then hopefully
further traces will help fix up the other cases we don't know about.

Alan

2012-11-07 04:29:47

by Hugh Dickins

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Tue, 6 Nov 2012, Alan Cox wrote:
> On Mon, 5 Nov 2012 12:34:44 -0800 (PST)
> Hugh Dickins <[email protected]> wrote:
> > On Mon, 5 Nov 2012, Alan Cox wrote:
> > > > The fbdev potential for deadlock may be years old, but the warning
> > > > (and consequent disabling of lockdep from that point on - making it
> > > > useless to everybody else in need of it) is new, and comes from the
> > > > commit below in linux-next.
> > > >
> > > > I revert it in my own testing: if there is no quick fix to the
> > > > fbdev issue on the way, Daniel, please revert it from your tree.
> > >
> > > If you revert it you swap it for a different deadlock - and one that
> > > happens more often I would expect. Not very useful.
> >
> > But a deadlock we have lived with for years. Without reverting,
> > we're prevented from discovering all the new deadlocks we're adding.
>
> We lived with it locking boxes up on users but not knowing why. The root
> cause is loading two different framebuffers with one taking over from
> another - that should be an obscure corner case and once the fuzz testing
> can avoid.

I'm bemused, but at least I now understand why we disagreed on this.

You thought it was a lockdep splat I got in the course of fuzz testing,
or doing some other obscure test: no, I thought I got it in booting up
the laptop, so it was in the way of doing useful testing thereafter.

I'd swear that I saw it two or three times, on each boot of 3.7.0-rc3-mm1;
then lost patience and deleted all the console_lock_dep_map lines from
kernel/printk.c, after which no problem.

But /var/log/messages calls me a liar, shows only one instance, and that
10 minutes after booting: that splat appended below in case it tells you
anything new; but I've no idea what triggered iti. (The "W" taint comes
from my using a "numa=fake=2" boot option, which surprised smpboot.c to
find smt-siblings on different nodes: not related to the console, I hope).

>
> > That would be ideal - thanks.
>
>
> I had a semi-informed poke at this and came up with a possible patch (not very tested)

Many thanks for your effort.

>
> commit f4fa6c739ecc367dbb98f5be1ff626d9b2750878
> Author: Alan Cox <[email protected]>
> Date: Tue Nov 6 15:33:18 2012 +0000
>
> fb: Rework locking to fix lock ordering on takeover
>
> Adjust the console layer to allow a take over call where the caller already
> holds the locks. Make the fb layer lock in order.
>
> This s partly a band aid, the fb layer is terminally confused about the
> locking rules it uses for its notifiers it seems.
>
> Signed-off-by: Alan Cox <[email protected]>

So I went to test this, but first tried to reproduce the orginal lockdep
splat that had irritated me so, and was utterly unsuccessful. So although
I am now running happily with your patch applied, no ill effects observed,
this gives no confidence because I cannot reproduce the condition anyway.

Sorry to be so unhelpful, original splat without your patch below.

Ah, now I actually scan through it, I see references to blank screen:
I'll try taking off your patch and seeing if it came up at screen
blanking time, then put on your patch back on and try again.
I'll report back in an hour or two.

Hugh

======================================================
[ INFO: possible circular locking dependency detected ]
3.7.0-rc3-mm1 #2 Tainted: G W
-------------------------------------------------------
kworker/0:1/30 is trying to acquire lock:
ACPI: Invalid Power Resource to register!
((fb_notifier_list).rwsem){.+.+.+}, at: [<ffffffff81080ed0>] __blocking_notifier_call_chain+0x6b/0xa2

but task is already holding lock:
(console_lock){+.+.+.}, at: [<ffffffff812877df>] console_callback+0xc/0xf7

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (console_lock){+.+.+.}:
[<ffffffff810a517a>] __lock_acquire+0x7fc/0x8bb
[<ffffffff810a561c>] lock_acquire+0x57/0x6d
[<ffffffff8106074d>] console_lock+0x67/0x69
[<ffffffff8128599a>] register_con_driver+0x36/0x128
[<ffffffff81285e60>] take_over_console+0x21/0x2b7
[<ffffffff81237014>] fbcon_takeover+0x56/0x98
[<ffffffff8123a7dd>] fbcon_event_notify+0x3bb/0x6ee
[<ffffffff81080c32>] notifier_call_chain+0xa7/0xd4
[<ffffffff81080ee6>] __blocking_notifier_call_chain+0x81/0xa2
[<ffffffff81080f16>] blocking_notifier_call_chain+0xf/0x11
[<ffffffff8122f7aa>] fb_notifier_call_chain+0x16/0x18
[<ffffffff81231523>] register_framebuffer+0x20c/0x270
[<ffffffff81297357>] drm_fb_helper_single_fb_probe+0x1ce/0x270
[<ffffffff812975c3>] drm_fb_helper_initial_config+0x1ca/0x1e1
[<ffffffff812e80bd>] intel_fbdev_init+0x76/0x89
[<ffffffff812b430a>] i915_driver_load+0xb20/0xcf7
[<ffffffff812a346e>] drm_get_pci_dev+0x162/0x25b
[<ffffffff8150f1b8>] i915_pci_probe+0x60/0x69
[<ffffffff812269cf>] local_pci_probe+0x12/0x16
[<ffffffff812274a1>] pci_device_probe+0xbe/0xeb
[<ffffffff812f783d>] driver_probe_device+0x91/0x19e
[<ffffffff812f79a7>] __driver_attach+0x5d/0x80
[<ffffffff812f5fac>] bus_for_each_dev+0x52/0x84
[<ffffffff812f74ed>] driver_attach+0x19/0x1b
[<ffffffff812f701f>] bus_add_driver+0xe7/0x20c
[<ffffffff812f7f1b>] driver_register+0x8e/0x114
[<ffffffff81227590>] __pci_register_driver+0x5a/0x5f
[<ffffffff812a35e7>] drm_pci_init+0x80/0xe5
[<ffffffff818c1229>] i915_init+0x66/0x68
[<ffffffff81000231>] do_one_initcall+0x7a/0x131
[<ffffffff81508129>] kernel_init+0x106/0x26d
[<ffffffff81529cac>] ret_from_fork+0x7c/0xb0

-> #0 ((fb_notifier_list).rwsem){.+.+.+}:
[<ffffffff810a3e14>] validate_chain.isra.21+0x7b0/0xd45
[<ffffffff810a517a>] __lock_acquire+0x7fc/0x8bb
[<ffffffff810a561c>] lock_acquire+0x57/0x6d
[<ffffffff81526df5>] down_read+0x42/0x57
[<ffffffff81080ed0>] __blocking_notifier_call_chain+0x6b/0xa2
[<ffffffff81080f16>] blocking_notifier_call_chain+0xf/0x11
[<ffffffff8122f7aa>] fb_notifier_call_chain+0x16/0x18
[<ffffffff8122fe47>] fb_blank+0x36/0x85
[<ffffffff81237ceb>] fbcon_blank+0x129/0x269
[<ffffffff8128589b>] do_blank_screen+0x18a/0x253
[<ffffffff8128789f>] console_callback+0xcc/0xf7
[<ffffffff81076a5e>] process_one_work+0x20e/0x3a2
[<ffffffff81076e0a>] worker_thread+0x1ee/0x2cb
[<ffffffff8107b90a>] kthread+0xd0/0xd8
[<ffffffff81529cac>] ret_from_fork+0x7c/0xb0

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(console_lock);
lock((fb_notifier_list).rwsem);
lock(console_lock);
lock((fb_notifier_list).rwsem);

*** DEADLOCK ***

3 locks held by kworker/0:1/30:
#0: (events){.+.+.+}, at: [<ffffffff810769f6>] process_one_work+0x1a6/0x3a2
#1: (console_work){+.+.+.}, at: [<ffffffff810769f6>] process_one_work+0x1a6/0x3a2
#2: (console_lock){+.+.+.}, at: [<ffffffff812877df>] console_callback+0xc/0xf7

stack backtrace:
Pid: 30, comm: kworker/0:1 Tainted: G W 3.7.0-rc3-mm1 #2
Call Trace:
[<ffffffff8151e443>] print_circular_bug+0x28d/0x29e
[<ffffffff810a3e14>] validate_chain.isra.21+0x7b0/0xd45
[<ffffffff8107928b>] ? __kernel_text_address+0x22/0x41
[<ffffffff810a517a>] __lock_acquire+0x7fc/0x8bb
[<ffffffff810a561c>] lock_acquire+0x57/0x6d
[<ffffffff81080ed0>] ? __blocking_notifier_call_chain+0x6b/0xa2
[<ffffffff81526df5>] down_read+0x42/0x57
[<ffffffff81080ed0>] ? __blocking_notifier_call_chain+0x6b/0xa2
[<ffffffff81080ed0>] __blocking_notifier_call_chain+0x6b/0xa2
[<ffffffff81080f16>] blocking_notifier_call_chain+0xf/0x11
[<ffffffff8122f7aa>] fb_notifier_call_chain+0x16/0x18
[<ffffffff8122fe47>] fb_blank+0x36/0x85
[<ffffffff81237ceb>] fbcon_blank+0x129/0x269
[<ffffffff8152910b>] ? _raw_spin_unlock_irqrestore+0x3a/0x64
[<ffffffff810a5dbf>] ? trace_hardirqs_on_caller+0x114/0x170
[<ffffffff810a5e28>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff81529117>] ? _raw_spin_unlock_irqrestore+0x46/0x64
[<ffffffff8106bc00>] ? try_to_del_timer_sync+0x48/0x54
[<ffffffff8106bc77>] ? del_timer_sync+0x6b/0xb4
[<ffffffff8106bc9a>] ? del_timer_sync+0x8e/0xb4
[<ffffffff8106bc0c>] ? try_to_del_timer_sync+0x54/0x54
[<ffffffff8128589b>] do_blank_screen+0x18a/0x253
[<ffffffff8128789f>] console_callback+0xcc/0xf7
[<ffffffff810769f6>] ? process_one_work+0x1a6/0x3a2
[<ffffffff81076a5e>] process_one_work+0x20e/0x3a2
[<ffffffff810769f6>] ? process_one_work+0x1a6/0x3a2
[<ffffffff812877d3>] ? poke_blanked_console+0xc9/0xc9
[<ffffffff81076e0a>] worker_thread+0x1ee/0x2cb
[<ffffffff81076c1c>] ? process_scheduled_works+0x2a/0x2a
[<ffffffff8107b90a>] kthread+0xd0/0xd8
[<ffffffff8152915d>] ? _raw_spin_unlock_irq+0x28/0x50
[<ffffffff8107b83a>] ? __init_kthread_worker+0x55/0x55
[<ffffffff81529cac>] ret_from_fork+0x7c/0xb0
[<ffffffff8107b83a>] ? __init_kthread_worker+0x55/0x55

2012-11-07 06:26:49

by Hugh Dickins

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Tue, 6 Nov 2012, Hugh Dickins wrote:
>
> Ah, now I actually scan through it, I see references to blank screen:
> I'll try taking off your patch and seeing if it came up at screen
> blanking time, then put on your patch back on and try again.
> I'll report back in an hour or two.

Yes, that was it. When the console screen blanked on 3.7.0-rc3-mm1,
it generated that lockdep splat, visible once I unblanked. But once I
applied your patch to the kernel, lockdep kept quiet across blank/unblank.

Thanks!
Hugh

2012-11-07 13:47:36

by Sasha Levin

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On 11/06/2012 12:38 PM, Alan Cox wrote:
>> > The root
>> > cause is loading two different framebuffers with one taking over from
>> > another - that should be an obscure corner case and once the fuzz testing
>> > can avoid.
>> >
>> > I had a semi-informed poke at this and came up with a possible patch (not very tested)
>>
>> If this fixes the real problems we've been seeing, I'll dance a jig.
>
> Youtube...

+1

> At this point my bigger concern is that it'll just make something else
> warn instead. The underlying problem is that fbcon layer implements a
> single threaded notifier whose locking semantics are at best random. It's
> not calld with a specific set of locks each time. Possibly it sohuld be
> two notifiers (one for fb stuff, one for console layer stuff) but the
> entire layer is horrible. I live in home the KMS guys will rip out the
> useful bits and build a straight kms fb layer with refcounting and the
> like 8)
>
> Testing certainly needed and if it's still blowing up then hopefully
> further traces will help fix up the other cases we don't know about.

So the good news are that the original lockdep splat I've reported is gone.

The semi-bad news are that there's a new one. It happens less frequently
but I assume it's not a new splat either, but was well hidden behind the
other splat.

[ 1885.997312] ======================================================
[ 1885.997312] [ INFO: possible circular locking dependency detected ]
[ 1885.997316] 3.7.0-rc4-next-20121106-sasha-00008-g353b62f #117 Tainted: G W
[ 1885.997316] -------------------------------------------------------
[ 1885.997319] trinity-child26/7820 is trying to acquire lock:
[ 1885.997330] (&fb_info->lock){+.+.+.}, at: [<ffffffff81a665d1>] lock_fb_info+0x21/0x50
[ 1885.997331]
[ 1885.997331] but task is already holding lock:
[ 1885.997336] (console_lock){+.+.+.}, at: [<ffffffff81a6c469>] store_modes+0x59/0x100
[ 1885.997337]
[ 1885.997337] which lock already depends on the new lock.
[ 1885.997337]
[ 1885.997338]
[ 1885.997338] the existing dependency chain (in reverse order) is:
[ 1885.997341]
[ 1885.997341] -> #1 (console_lock){+.+.+.}:
[ 1885.997347] [<ffffffff8118536a>] lock_acquire+0x1aa/0x240
[ 1885.997351] [<ffffffff8110b618>] console_lock+0x68/0x70
[ 1885.997354] [<ffffffff81a689b2>] register_framebuffer+0x242/0x2a0
[ 1885.997359] [<ffffffff8390e4bc>] vga16fb_probe+0x1c0/0x227
[ 1885.997364] [<ffffffff81e6b472>] platform_drv_probe+0x12/0x20
[ 1885.997369] [<ffffffff81e69e95>] driver_probe_device+0x155/0x340
[ 1885.997372] [<ffffffff81e6a15e>] __device_attach+0x2e/0x50
[ 1885.997375] [<ffffffff81e682d6>] bus_for_each_drv+0x56/0xa0
[ 1885.997379] [<ffffffff81e69a88>] device_attach+0x88/0xc0
[ 1885.997382] [<ffffffff81e68516>] bus_probe_device+0x36/0xd0
[ 1885.997385] [<ffffffff81e6639f>] device_add+0x4df/0x750
[ 1885.997388] [<ffffffff81e6bda8>] platform_device_add+0x1e8/0x280
[ 1885.997393] [<ffffffff85b0f35a>] vga16fb_init+0x8d/0xbb
[ 1885.997399] [<ffffffff85acccb2>] do_one_initcall+0x7a/0x135
[ 1885.997402] [<ffffffff838db8d9>] kernel_init+0x299/0x470
[ 1885.997406] [<ffffffff83a98fbc>] ret_from_fork+0x7c/0xb0
[ 1885.997409]
[ 1885.997409] -> #0 (&fb_info->lock){+.+.+.}:
[ 1885.997413] [<ffffffff811825af>] __lock_acquire+0x14df/0x1ca0
[ 1885.997416] [<ffffffff8118536a>] lock_acquire+0x1aa/0x240
[ 1885.997421] [<ffffffff83a944d9>] __mutex_lock_common+0x59/0x5a0
[ 1885.997425] [<ffffffff83a94a5f>] mutex_lock_nested+0x3f/0x50
[ 1885.997427] [<ffffffff81a665d1>] lock_fb_info+0x21/0x50
[ 1885.997430] [<ffffffff81a68b95>] fb_new_modelist+0xf5/0x140
[ 1885.997433] [<ffffffff81a6c4ac>] store_modes+0x9c/0x100
[ 1885.997436] [<ffffffff81e65013>] dev_attr_store+0x13/0x20
[ 1885.997440] [<ffffffff812f820a>] sysfs_write_file+0xfa/0x150
[ 1885.997444] [<ffffffff8127a220>] vfs_write+0xb0/0x180
[ 1885.997447] [<ffffffff8127a3e0>] sys_write+0x50/0xa0
[ 1885.997450] [<ffffffff83a99298>] tracesys+0xe1/0xe6
[ 1885.997451]
[ 1885.997451] other info that might help us debug this:
[ 1885.997451]
[ 1885.997452] Possible unsafe locking scenario:
[ 1885.997452]
[ 1885.997453] CPU0 CPU1
[ 1885.997454] ---- ----
[ 1885.997456] lock(console_lock);
[ 1885.997458] lock(&fb_info->lock);
[ 1885.997460] lock(console_lock);
[ 1885.997462] lock(&fb_info->lock);
[ 1885.997463]
[ 1885.997463] *** DEADLOCK ***
[ 1885.997463]
[ 1885.997464] 3 locks held by trinity-child26/7820:
[ 1885.997470] #0: (&buffer->mutex){+.+.+.}, at: [<ffffffff812f8153>] sysfs_write_file+0x43/0x150
[ 1885.997475] #1: (s_active#388){.+.+.+}, at: [<ffffffff812f81f2>] sysfs_write_file+0xe2/0x150
[ 1885.997481] #2: (console_lock){+.+.+.}, at: [<ffffffff81a6c469>] store_modes+0x59/0x100
[ 1885.997481]
[ 1885.997481] stack backtrace:
[ 1885.997484] Pid: 7820, comm: trinity-child26 Tainted: G W 3.7.0-rc4-next-20121106-sasha-00008-g353b62f #117
[ 1885.997485] Call Trace:
[ 1885.997492] [<ffffffff83a3c736>] print_circular_bug+0x1fb/0x20c
[ 1885.997496] [<ffffffff811825af>] __lock_acquire+0x14df/0x1ca0
[ 1885.997499] [<ffffffff81a01dc3>] ? debug_smp_processor_id+0x23/0x120
[ 1885.997504] [<ffffffff810a3da6>] ? kvm_clock_read+0x46/0x80
[ 1885.997508] [<ffffffff81152fa5>] ? sched_clock_local+0x25/0xa0
[ 1885.997511] [<ffffffff8118536a>] lock_acquire+0x1aa/0x240
[ 1885.997514] [<ffffffff81a665d1>] ? lock_fb_info+0x21/0x50
[ 1885.997517] [<ffffffff83a944d9>] __mutex_lock_common+0x59/0x5a0
[ 1885.997520] [<ffffffff81a665d1>] ? lock_fb_info+0x21/0x50
[ 1885.997523] [<ffffffff81180d3a>] ? __lock_is_held+0x5a/0x80
[ 1885.997526] [<ffffffff81a665d1>] ? lock_fb_info+0x21/0x50
[ 1885.997531] [<ffffffff8125d4dd>] ? kfree+0x20d/0x330
[ 1885.997534] [<ffffffff83a94a5f>] mutex_lock_nested+0x3f/0x50
[ 1885.997537] [<ffffffff81a665d1>] lock_fb_info+0x21/0x50
[ 1885.997539] [<ffffffff81a68b95>] fb_new_modelist+0xf5/0x140
[ 1885.997543] [<ffffffff81a6c4ac>] store_modes+0x9c/0x100
[ 1885.997546] [<ffffffff81e65013>] dev_attr_store+0x13/0x20
[ 1885.997553] [<ffffffff812f820a>] sysfs_write_file+0xfa/0x150
[ 1885.997556] [<ffffffff8127a220>] vfs_write+0xb0/0x180
[ 1885.997558] [<ffffffff8127a3e0>] sys_write+0x50/0xa0
[ 1885.997561] [<ffffffff83a99298>] tracesys+0xe1/0xe6
[ 1885.990746] irq event stamp: 35487
[ 1885.990746] hardirqs last enabled at (35487): [<ffffffff83a977fb>] _raw_spin_unlock_irq+0x2b/0x80
[ 1885.990746] hardirqs last disabled at (35486): [<ffffffff83a975da>] _raw_spin_lock_irq+0x2a/0x90
[ 1885.990746] softirqs last enabled at (34856): [<ffffffff811139a2>] __do_softirq+0x372/0x440
[ 1885.990746] softirqs last disabled at (34831): [<ffffffff83a9a4fc>] call_softirq+0x1c/0x30

2012-11-07 15:57:41

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Wed, 07 Nov 2012 08:47:08 -0500
Sasha Levin <[email protected]> wrote:

> On 11/06/2012 12:38 PM, Alan Cox wrote:
> >> > The root
> >> > cause is loading two different framebuffers with one taking over from
> >> > another - that should be an obscure corner case and once the fuzz testing
> >> > can avoid.
> >> >
> >> > I had a semi-informed poke at this and came up with a possible patch (not very tested)
> >>
> >> If this fixes the real problems we've been seeing, I'll dance a jig.
> >
> > Youtube...
>
> +1
>
> > At this point my bigger concern is that it'll just make something else
> > warn instead. The underlying problem is that fbcon layer implements a
> > single threaded notifier whose locking semantics are at best random. It's
> > not calld with a specific set of locks each time. Possibly it sohuld be
> > two notifiers (one for fb stuff, one for console layer stuff) but the
> > entire layer is horrible. I live in home the KMS guys will rip out the
> > useful bits and build a straight kms fb layer with refcounting and the
> > like 8)
> >
> > Testing certainly needed and if it's still blowing up then hopefully
> > further traces will help fix up the other cases we don't know about.
>
> So the good news are that the original lockdep splat I've reported is gone.
>
> The semi-bad news are that there's a new one. It happens less frequently
> but I assume it's not a new splat either, but was well hidden behind the
> other splat.

Doesn't look too bad (fingers crossed)

try

commit 8f965b0816d98ed535fdfccc3e50d84818e9c43b
Author: Alan Cox <[email protected]>
Date: Wed Nov 7 15:48:48 2012 +0000

fb: Rework locking to fix lock ordering on takeover

Adjust the console layer to allow a take over call where the caller already
holds the locks. Make the fb layer lock in order.

This s partly a band aid, the fb layer is terminally confused about the
locking rules it uses for its notifiers it seems.

Signed-off-by: Alan Cox <[email protected]>

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index f87d7e8..ea57f27 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2984,7 +2984,7 @@ int __init vty_init(const struct file_operations *console_fops)

static struct class *vtconsole_class;

-static int bind_con_driver(const struct consw *csw, int first, int last,
+static int do_bind_con_driver(const struct consw *csw, int first, int last,
int deflt)
{
struct module *owner = csw->owner;
@@ -2995,7 +2995,7 @@ static int bind_con_driver(const struct consw *csw, int first, int last,
if (!try_module_get(owner))
return -ENODEV;

- console_lock();
+ WARN_CONSOLE_UNLOCKED();

/* check if driver is registered */
for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
@@ -3080,11 +3080,22 @@ static int bind_con_driver(const struct consw *csw, int first, int last,

retval = 0;
err:
- console_unlock();
module_put(owner);
return retval;
};

+
+static int bind_con_driver(const struct consw *csw, int first, int last,
+ int deflt)
+{
+ int ret;
+
+ console_unlock();
+ ret = do_bind_con_driver(csw, first, last, deflt);
+ console_unlock();
+ return ret;
+}
+
#ifdef CONFIG_VT_HW_CONSOLE_BINDING
static int con_is_graphics(const struct consw *csw, int first, int last)
{
@@ -3196,9 +3207,9 @@ int unbind_con_driver(const struct consw *csw, int first, int last, int deflt)
if (!con_is_bound(csw))
con_driver->flag &= ~CON_DRIVER_FLAG_INIT;

- console_unlock();
/* ignore return value, binding should not fail */
- bind_con_driver(defcsw, first, last, deflt);
+ do_bind_con_driver(defcsw, first, last, deflt);
+ console_unlock();
err:
module_put(owner);
return retval;
@@ -3489,28 +3500,18 @@ int con_debug_leave(void)
}
EXPORT_SYMBOL_GPL(con_debug_leave);

-/**
- * register_con_driver - register console driver to console layer
- * @csw: console driver
- * @first: the first console to take over, minimum value is 0
- * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
- *
- * DESCRIPTION: This function registers a console driver which can later
- * bind to a range of consoles specified by @first and @last. It will
- * also initialize the console driver by calling con_startup().
- */
-int register_con_driver(const struct consw *csw, int first, int last)
+static int do_register_con_driver(const struct consw *csw, int first, int last)
{
struct module *owner = csw->owner;
struct con_driver *con_driver;
const char *desc;
int i, retval = 0;

+ WARN_CONSOLE_UNLOCKED();
+
if (!try_module_get(owner))
return -ENODEV;

- console_lock();
-
for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
con_driver = &registered_con_driver[i];

@@ -3563,10 +3564,29 @@ int register_con_driver(const struct consw *csw, int first, int last)
}

err:
- console_unlock();
module_put(owner);
return retval;
}
+
+/**
+ * register_con_driver - register console driver to console layer
+ * @csw: console driver
+ * @first: the first console to take over, minimum value is 0
+ * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
+ *
+ * DESCRIPTION: This function registers a console driver which can later
+ * bind to a range of consoles specified by @first and @last. It will
+ * also initialize the console driver by calling con_startup().
+ */
+int register_con_driver(const struct consw *csw, int first, int last)
+{
+ int retval;
+
+ console_lock();
+ retval = do_register_con_driver(csw, first, last);
+ console_unlock();
+ return retval;
+}
EXPORT_SYMBOL(register_con_driver);

/**
@@ -3622,6 +3642,29 @@ EXPORT_SYMBOL(unregister_con_driver);
*
* take_over_console is basically a register followed by unbind
*/
+int do_take_over_console(const struct consw *csw, int first, int last, int deflt)
+{
+ int err;
+
+ err = do_register_con_driver(csw, first, last);
+ /* if we get an busy error we still want to bind the console driver
+ * and return success, as we may have unbound the console driver
+ ?* but not unregistered it.
+ */
+ if (err == -EBUSY)
+ err = 0;
+ if (!err)
+ do_bind_con_driver(csw, first, last, deflt);
+
+ return err;
+}
+/*
+ * If we support more console drivers, this function is used
+ * when a driver wants to take over some existing consoles
+ * and become default driver for newly opened ones.
+ *
+ * take_over_console is basically a register followed by unbind
+ */
int take_over_console(const struct consw *csw, int first, int last, int deflt)
{
int err;
diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index fdefa8f..c75f8ce 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -529,6 +529,34 @@ static int search_for_mapped_con(void)
return retval;
}

+static int do_fbcon_takeover(int show_logo)
+{
+ int err, i;
+
+ if (!num_registered_fb)
+ return -ENODEV;
+
+ if (!show_logo)
+ logo_shown = FBCON_LOGO_DONTSHOW;
+
+ for (i = first_fb_vc; i <= last_fb_vc; i++)
+ con2fb_map[i] = info_idx;
+
+ err = do_take_over_console(&fb_con, first_fb_vc, last_fb_vc,
+ fbcon_is_default);
+
+ if (err) {
+ for (i = first_fb_vc; i <= last_fb_vc; i++) {
+ con2fb_map[i] = -1;
+ }
+ info_idx = -1;
+ } else {
+ fbcon_has_console_bind = 1;
+ }
+
+ return err;
+}
+
static int fbcon_takeover(int show_logo)
{
int err, i;
@@ -3115,7 +3143,7 @@ static int fbcon_fb_registered(struct fb_info *info)
}

if (info_idx != -1)
- ret = fbcon_takeover(1);
+ ret = do_fbcon_takeover(1);
} else {
for (i = first_fb_vc; i <= last_fb_vc; i++) {
if (con2fb_map_boot[i] == idx)
diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
index 3ff0105..564ebe9 100644
--- a/drivers/video/fbmem.c
+++ b/drivers/video/fbmem.c
@@ -1650,7 +1650,9 @@ static int do_register_framebuffer(struct fb_info *fb_info)
event.info = fb_info;
if (!lock_fb_info(fb_info))
return -ENODEV;
+ console_lock();
fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, &event);
+ console_unlock();
unlock_fb_info(fb_info);
return 0;
}
@@ -1853,11 +1855,8 @@ int fb_new_modelist(struct fb_info *info)
err = 1;

if (!list_empty(&info->modelist)) {
- if (!lock_fb_info(info))
- return -ENODEV;
event.info = info;
err = fb_notifier_call_chain(FB_EVENT_NEW_MODELIST, &event);
- unlock_fb_info(info);
}

return err;
diff --git a/drivers/video/fbsysfs.c b/drivers/video/fbsysfs.c
index a55e366..ef476b0 100644
--- a/drivers/video/fbsysfs.c
+++ b/drivers/video/fbsysfs.c
@@ -177,6 +177,8 @@ static ssize_t store_modes(struct device *device,
if (i * sizeof(struct fb_videomode) != count)
return -EINVAL;

+ if (!lock_fb_info(fb_info))
+ return -ENODEV;
console_lock();
list_splice(&fb_info->modelist, &old_list);
fb_videomode_to_modelist((const struct fb_videomode *)buf, i,
@@ -188,6 +190,7 @@ static ssize_t store_modes(struct device *device,
fb_destroy_modelist(&old_list);

console_unlock();
+ unlock_fb_info(fb_info);

return 0;
}
diff --git a/include/linux/console.h b/include/linux/console.h
index dedb082..4ef4307 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -78,6 +78,7 @@ int con_is_bound(const struct consw *csw);
int register_con_driver(const struct consw *csw, int first, int last);
int unregister_con_driver(const struct consw *csw);
int take_over_console(const struct consw *sw, int first, int last, int deflt);
+int do_take_over_console(const struct consw *sw, int first, int last, int deflt);
void give_up_console(const struct consw *sw);
#ifdef CONFIG_HW_CONSOLE
int con_debug_enter(struct vc_data *vc);

2012-11-07 16:16:24

by Bjørn Mork

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

Alan Cox <[email protected]> writes:

> +
> +static int bind_con_driver(const struct consw *csw, int first, int last,
> + int deflt)
> +{
> + int ret;
> +
> + console_unlock();

console_lock() maybe...

> + ret = do_bind_con_driver(csw, first, last, deflt);
> + console_unlock();
> + return ret;
> +}
> +


Bjørn

2012-11-07 16:56:55

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings

On Wed, 07 Nov 2012 17:15:45 +0100
Bj?rn Mork <[email protected]> wrote:

> Alan Cox <[email protected]> writes:
>
> > +
> > +static int bind_con_driver(const struct consw *csw, int first, int last,
> > + int deflt)
> > +{
> > + int ret;
> > +
> > + console_unlock();
>
> console_lock() maybe...

Thanks I'll fix that path in v3

2012-11-08 14:29:47

by Alan

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings (Patch v3)

commit f35b3fbf24c4e4debb6a7a864b09854ccc2a22e7
Author: Alan Cox <[email protected]>
Date: Wed Nov 7 16:35:08 2012 +0000

fb: Rework locking to fix lock ordering on takeover

Adjust the console layer to allow a take over call where the caller already
holds the locks. Make the fb layer lock in order.

This s partly a band aid, the fb layer is terminally confused about the
locking rules it uses for its notifiers it seems.

Signed-off-by: Alan Cox <[email protected]>

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index f87d7e8..77bf205 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -2984,7 +2984,7 @@ int __init vty_init(const struct file_operations *console_fops)

static struct class *vtconsole_class;

-static int bind_con_driver(const struct consw *csw, int first, int last,
+static int do_bind_con_driver(const struct consw *csw, int first, int last,
int deflt)
{
struct module *owner = csw->owner;
@@ -2995,7 +2995,7 @@ static int bind_con_driver(const struct consw *csw, int first, int last,
if (!try_module_get(owner))
return -ENODEV;

- console_lock();
+ WARN_CONSOLE_UNLOCKED();

/* check if driver is registered */
for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
@@ -3080,11 +3080,22 @@ static int bind_con_driver(const struct consw *csw, int first, int last,

retval = 0;
err:
- console_unlock();
module_put(owner);
return retval;
};

+
+static int bind_con_driver(const struct consw *csw, int first, int last,
+ int deflt)
+{
+ int ret;
+
+ console_lock();
+ ret = do_bind_con_driver(csw, first, last, deflt);
+ console_unlock();
+ return ret;
+}
+
#ifdef CONFIG_VT_HW_CONSOLE_BINDING
static int con_is_graphics(const struct consw *csw, int first, int last)
{
@@ -3196,9 +3207,9 @@ int unbind_con_driver(const struct consw *csw, int first, int last, int deflt)
if (!con_is_bound(csw))
con_driver->flag &= ~CON_DRIVER_FLAG_INIT;

- console_unlock();
/* ignore return value, binding should not fail */
- bind_con_driver(defcsw, first, last, deflt);
+ do_bind_con_driver(defcsw, first, last, deflt);
+ console_unlock();
err:
module_put(owner);
return retval;
@@ -3489,28 +3500,18 @@ int con_debug_leave(void)
}
EXPORT_SYMBOL_GPL(con_debug_leave);

-/**
- * register_con_driver - register console driver to console layer
- * @csw: console driver
- * @first: the first console to take over, minimum value is 0
- * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
- *
- * DESCRIPTION: This function registers a console driver which can later
- * bind to a range of consoles specified by @first and @last. It will
- * also initialize the console driver by calling con_startup().
- */
-int register_con_driver(const struct consw *csw, int first, int last)
+static int do_register_con_driver(const struct consw *csw, int first, int last)
{
struct module *owner = csw->owner;
struct con_driver *con_driver;
const char *desc;
int i, retval = 0;

+ WARN_CONSOLE_UNLOCKED();
+
if (!try_module_get(owner))
return -ENODEV;

- console_lock();
-
for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
con_driver = &registered_con_driver[i];

@@ -3563,10 +3564,29 @@ int register_con_driver(const struct consw *csw, int first, int last)
}

err:
- console_unlock();
module_put(owner);
return retval;
}
+
+/**
+ * register_con_driver - register console driver to console layer
+ * @csw: console driver
+ * @first: the first console to take over, minimum value is 0
+ * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
+ *
+ * DESCRIPTION: This function registers a console driver which can later
+ * bind to a range of consoles specified by @first and @last. It will
+ * also initialize the console driver by calling con_startup().
+ */
+int register_con_driver(const struct consw *csw, int first, int last)
+{
+ int retval;
+
+ console_lock();
+ retval = do_register_con_driver(csw, first, last);
+ console_unlock();
+ return retval;
+}
EXPORT_SYMBOL(register_con_driver);

/**
@@ -3622,6 +3642,29 @@ EXPORT_SYMBOL(unregister_con_driver);
*
* take_over_console is basically a register followed by unbind
*/
+int do_take_over_console(const struct consw *csw, int first, int last, int deflt)
+{
+ int err;
+
+ err = do_register_con_driver(csw, first, last);
+ /* if we get an busy error we still want to bind the console driver
+ * and return success, as we may have unbound the console driver
+ ?* but not unregistered it.
+ */
+ if (err == -EBUSY)
+ err = 0;
+ if (!err)
+ do_bind_con_driver(csw, first, last, deflt);
+
+ return err;
+}
+/*
+ * If we support more console drivers, this function is used
+ * when a driver wants to take over some existing consoles
+ * and become default driver for newly opened ones.
+ *
+ * take_over_console is basically a register followed by unbind
+ */
int take_over_console(const struct consw *csw, int first, int last, int deflt)
{
int err;
diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index fdefa8f..c75f8ce 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -529,6 +529,34 @@ static int search_for_mapped_con(void)
return retval;
}

+static int do_fbcon_takeover(int show_logo)
+{
+ int err, i;
+
+ if (!num_registered_fb)
+ return -ENODEV;
+
+ if (!show_logo)
+ logo_shown = FBCON_LOGO_DONTSHOW;
+
+ for (i = first_fb_vc; i <= last_fb_vc; i++)
+ con2fb_map[i] = info_idx;
+
+ err = do_take_over_console(&fb_con, first_fb_vc, last_fb_vc,
+ fbcon_is_default);
+
+ if (err) {
+ for (i = first_fb_vc; i <= last_fb_vc; i++) {
+ con2fb_map[i] = -1;
+ }
+ info_idx = -1;
+ } else {
+ fbcon_has_console_bind = 1;
+ }
+
+ return err;
+}
+
static int fbcon_takeover(int show_logo)
{
int err, i;
@@ -3115,7 +3143,7 @@ static int fbcon_fb_registered(struct fb_info *info)
}

if (info_idx != -1)
- ret = fbcon_takeover(1);
+ ret = do_fbcon_takeover(1);
} else {
for (i = first_fb_vc; i <= last_fb_vc; i++) {
if (con2fb_map_boot[i] == idx)
diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
index 3ff0105..564ebe9 100644
--- a/drivers/video/fbmem.c
+++ b/drivers/video/fbmem.c
@@ -1650,7 +1650,9 @@ static int do_register_framebuffer(struct fb_info *fb_info)
event.info = fb_info;
if (!lock_fb_info(fb_info))
return -ENODEV;
+ console_lock();
fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, &event);
+ console_unlock();
unlock_fb_info(fb_info);
return 0;
}
@@ -1853,11 +1855,8 @@ int fb_new_modelist(struct fb_info *info)
err = 1;

if (!list_empty(&info->modelist)) {
- if (!lock_fb_info(info))
- return -ENODEV;
event.info = info;
err = fb_notifier_call_chain(FB_EVENT_NEW_MODELIST, &event);
- unlock_fb_info(info);
}

return err;
diff --git a/drivers/video/fbsysfs.c b/drivers/video/fbsysfs.c
index a55e366..ef476b0 100644
--- a/drivers/video/fbsysfs.c
+++ b/drivers/video/fbsysfs.c
@@ -177,6 +177,8 @@ static ssize_t store_modes(struct device *device,
if (i * sizeof(struct fb_videomode) != count)
return -EINVAL;

+ if (!lock_fb_info(fb_info))
+ return -ENODEV;
console_lock();
list_splice(&fb_info->modelist, &old_list);
fb_videomode_to_modelist((const struct fb_videomode *)buf, i,
@@ -188,6 +190,7 @@ static ssize_t store_modes(struct device *device,
fb_destroy_modelist(&old_list);

console_unlock();
+ unlock_fb_info(fb_info);

return 0;
}
diff --git a/include/linux/console.h b/include/linux/console.h
index dedb082..4ef4307 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -78,6 +78,7 @@ int con_is_bound(const struct consw *csw);
int register_con_driver(const struct consw *csw, int first, int last);
int unregister_con_driver(const struct consw *csw);
int take_over_console(const struct consw *sw, int first, int last, int deflt);
+int do_take_over_console(const struct consw *sw, int first, int last, int deflt);
void give_up_console(const struct consw *sw);
#ifdef CONFIG_HW_CONSOLE
int con_debug_enter(struct vc_data *vc);

2012-11-09 19:34:28

by Hugh Dickins

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings (Patch v3)

On Thu, 8 Nov 2012, Alan Cox wrote:
> commit f35b3fbf24c4e4debb6a7a864b09854ccc2a22e7
> Author: Alan Cox <[email protected]>
> Date: Wed Nov 7 16:35:08 2012 +0000
>
> fb: Rework locking to fix lock ordering on takeover
>
> Adjust the console layer to allow a take over call where the caller already
> holds the locks. Make the fb layer lock in order.
>
> This s partly a band aid, the fb layer is terminally confused about the
> locking rules it uses for its notifiers it seems.
>
> Signed-off-by: Alan Cox <[email protected]>

This version works for me too - thanks.
Hugh

>
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index f87d7e8..77bf205 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -2984,7 +2984,7 @@ int __init vty_init(const struct file_operations *console_fops)
>
> static struct class *vtconsole_class;
>
> -static int bind_con_driver(const struct consw *csw, int first, int last,
> +static int do_bind_con_driver(const struct consw *csw, int first, int last,
> int deflt)
> {
> struct module *owner = csw->owner;
> @@ -2995,7 +2995,7 @@ static int bind_con_driver(const struct consw *csw, int first, int last,
> if (!try_module_get(owner))
> return -ENODEV;
>
> - console_lock();
> + WARN_CONSOLE_UNLOCKED();
>
> /* check if driver is registered */
> for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
> @@ -3080,11 +3080,22 @@ static int bind_con_driver(const struct consw *csw, int first, int last,
>
> retval = 0;
> err:
> - console_unlock();
> module_put(owner);
> return retval;
> };
>
> +
> +static int bind_con_driver(const struct consw *csw, int first, int last,
> + int deflt)
> +{
> + int ret;
> +
> + console_lock();
> + ret = do_bind_con_driver(csw, first, last, deflt);
> + console_unlock();
> + return ret;
> +}
> +
> #ifdef CONFIG_VT_HW_CONSOLE_BINDING
> static int con_is_graphics(const struct consw *csw, int first, int last)
> {
> @@ -3196,9 +3207,9 @@ int unbind_con_driver(const struct consw *csw, int first, int last, int deflt)
> if (!con_is_bound(csw))
> con_driver->flag &= ~CON_DRIVER_FLAG_INIT;
>
> - console_unlock();
> /* ignore return value, binding should not fail */
> - bind_con_driver(defcsw, first, last, deflt);
> + do_bind_con_driver(defcsw, first, last, deflt);
> + console_unlock();
> err:
> module_put(owner);
> return retval;
> @@ -3489,28 +3500,18 @@ int con_debug_leave(void)
> }
> EXPORT_SYMBOL_GPL(con_debug_leave);
>
> -/**
> - * register_con_driver - register console driver to console layer
> - * @csw: console driver
> - * @first: the first console to take over, minimum value is 0
> - * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
> - *
> - * DESCRIPTION: This function registers a console driver which can later
> - * bind to a range of consoles specified by @first and @last. It will
> - * also initialize the console driver by calling con_startup().
> - */
> -int register_con_driver(const struct consw *csw, int first, int last)
> +static int do_register_con_driver(const struct consw *csw, int first, int last)
> {
> struct module *owner = csw->owner;
> struct con_driver *con_driver;
> const char *desc;
> int i, retval = 0;
>
> + WARN_CONSOLE_UNLOCKED();
> +
> if (!try_module_get(owner))
> return -ENODEV;
>
> - console_lock();
> -
> for (i = 0; i < MAX_NR_CON_DRIVER; i++) {
> con_driver = &registered_con_driver[i];
>
> @@ -3563,10 +3564,29 @@ int register_con_driver(const struct consw *csw, int first, int last)
> }
>
> err:
> - console_unlock();
> module_put(owner);
> return retval;
> }
> +
> +/**
> + * register_con_driver - register console driver to console layer
> + * @csw: console driver
> + * @first: the first console to take over, minimum value is 0
> + * @last: the last console to take over, maximum value is MAX_NR_CONSOLES -1
> + *
> + * DESCRIPTION: This function registers a console driver which can later
> + * bind to a range of consoles specified by @first and @last. It will
> + * also initialize the console driver by calling con_startup().
> + */
> +int register_con_driver(const struct consw *csw, int first, int last)
> +{
> + int retval;
> +
> + console_lock();
> + retval = do_register_con_driver(csw, first, last);
> + console_unlock();
> + return retval;
> +}
> EXPORT_SYMBOL(register_con_driver);
>
> /**
> @@ -3622,6 +3642,29 @@ EXPORT_SYMBOL(unregister_con_driver);
> *
> * take_over_console is basically a register followed by unbind
> */
> +int do_take_over_console(const struct consw *csw, int first, int last, int deflt)
> +{
> + int err;
> +
> + err = do_register_con_driver(csw, first, last);
> + /* if we get an busy error we still want to bind the console driver
> + * and return success, as we may have unbound the console driver
> + ?* but not unregistered it.
> + */
> + if (err == -EBUSY)
> + err = 0;
> + if (!err)
> + do_bind_con_driver(csw, first, last, deflt);
> +
> + return err;
> +}
> +/*
> + * If we support more console drivers, this function is used
> + * when a driver wants to take over some existing consoles
> + * and become default driver for newly opened ones.
> + *
> + * take_over_console is basically a register followed by unbind
> + */
> int take_over_console(const struct consw *csw, int first, int last, int deflt)
> {
> int err;
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index fdefa8f..c75f8ce 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -529,6 +529,34 @@ static int search_for_mapped_con(void)
> return retval;
> }
>
> +static int do_fbcon_takeover(int show_logo)
> +{
> + int err, i;
> +
> + if (!num_registered_fb)
> + return -ENODEV;
> +
> + if (!show_logo)
> + logo_shown = FBCON_LOGO_DONTSHOW;
> +
> + for (i = first_fb_vc; i <= last_fb_vc; i++)
> + con2fb_map[i] = info_idx;
> +
> + err = do_take_over_console(&fb_con, first_fb_vc, last_fb_vc,
> + fbcon_is_default);
> +
> + if (err) {
> + for (i = first_fb_vc; i <= last_fb_vc; i++) {
> + con2fb_map[i] = -1;
> + }
> + info_idx = -1;
> + } else {
> + fbcon_has_console_bind = 1;
> + }
> +
> + return err;
> +}
> +
> static int fbcon_takeover(int show_logo)
> {
> int err, i;
> @@ -3115,7 +3143,7 @@ static int fbcon_fb_registered(struct fb_info *info)
> }
>
> if (info_idx != -1)
> - ret = fbcon_takeover(1);
> + ret = do_fbcon_takeover(1);
> } else {
> for (i = first_fb_vc; i <= last_fb_vc; i++) {
> if (con2fb_map_boot[i] == idx)
> diff --git a/drivers/video/fbmem.c b/drivers/video/fbmem.c
> index 3ff0105..564ebe9 100644
> --- a/drivers/video/fbmem.c
> +++ b/drivers/video/fbmem.c
> @@ -1650,7 +1650,9 @@ static int do_register_framebuffer(struct fb_info *fb_info)
> event.info = fb_info;
> if (!lock_fb_info(fb_info))
> return -ENODEV;
> + console_lock();
> fb_notifier_call_chain(FB_EVENT_FB_REGISTERED, &event);
> + console_unlock();
> unlock_fb_info(fb_info);
> return 0;
> }
> @@ -1853,11 +1855,8 @@ int fb_new_modelist(struct fb_info *info)
> err = 1;
>
> if (!list_empty(&info->modelist)) {
> - if (!lock_fb_info(info))
> - return -ENODEV;
> event.info = info;
> err = fb_notifier_call_chain(FB_EVENT_NEW_MODELIST, &event);
> - unlock_fb_info(info);
> }
>
> return err;
> diff --git a/drivers/video/fbsysfs.c b/drivers/video/fbsysfs.c
> index a55e366..ef476b0 100644
> --- a/drivers/video/fbsysfs.c
> +++ b/drivers/video/fbsysfs.c
> @@ -177,6 +177,8 @@ static ssize_t store_modes(struct device *device,
> if (i * sizeof(struct fb_videomode) != count)
> return -EINVAL;
>
> + if (!lock_fb_info(fb_info))
> + return -ENODEV;
> console_lock();
> list_splice(&fb_info->modelist, &old_list);
> fb_videomode_to_modelist((const struct fb_videomode *)buf, i,
> @@ -188,6 +190,7 @@ static ssize_t store_modes(struct device *device,
> fb_destroy_modelist(&old_list);
>
> console_unlock();
> + unlock_fb_info(fb_info);
>
> return 0;
> }
> diff --git a/include/linux/console.h b/include/linux/console.h
> index dedb082..4ef4307 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -78,6 +78,7 @@ int con_is_bound(const struct consw *csw);
> int register_con_driver(const struct consw *csw, int first, int last);
> int unregister_con_driver(const struct consw *csw);
> int take_over_console(const struct consw *sw, int first, int last, int deflt);
> +int do_take_over_console(const struct consw *sw, int first, int last, int deflt);
> void give_up_console(const struct consw *sw);
> #ifdef CONFIG_HW_CONSOLE
> int con_debug_enter(struct vc_data *vc);
>

2012-11-09 19:43:15

by Sasha Levin

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings (Patch v3)

On 11/09/2012 02:34 PM, Hugh Dickins wrote:
> On Thu, 8 Nov 2012, Alan Cox wrote:
>> commit f35b3fbf24c4e4debb6a7a864b09854ccc2a22e7
>> Author: Alan Cox <[email protected]>
>> Date: Wed Nov 7 16:35:08 2012 +0000
>>
>> fb: Rework locking to fix lock ordering on takeover
>>
>> Adjust the console layer to allow a take over call where the caller already
>> holds the locks. Make the fb layer lock in order.
>>
>> This s partly a band aid, the fb layer is terminally confused about the
>> locking rules it uses for its notifiers it seems.
>>
>> Signed-off-by: Alan Cox <[email protected]>
>
> This version works for me too - thanks.
> Hugh

I was planning to test it last night, but new code in mm/ failed horribly and
was BUG()ing all over the place, so I didn't get any significant testing
of this patch done.

Will update...


Thanks,
Sasha

2012-11-13 16:24:53

by Sasha Levin

[permalink] [raw]
Subject: Re: tty, vt: lockdep warnings (Patch v3)

On Fri, Nov 9, 2012 at 2:42 PM, Sasha Levin <[email protected]> wrote:
> On 11/09/2012 02:34 PM, Hugh Dickins wrote:
>> On Thu, 8 Nov 2012, Alan Cox wrote:
>>> commit f35b3fbf24c4e4debb6a7a864b09854ccc2a22e7
>>> Author: Alan Cox <[email protected]>
>>> Date: Wed Nov 7 16:35:08 2012 +0000
>>>
>>> fb: Rework locking to fix lock ordering on takeover
>>>
>>> Adjust the console layer to allow a take over call where the caller already
>>> holds the locks. Make the fb layer lock in order.
>>>
>>> This s partly a band aid, the fb layer is terminally confused about the
>>> locking rules it uses for its notifiers it seems.
>>>
>>> Signed-off-by: Alan Cox <[email protected]>
>>
>> This version works for me too - thanks.
>> Hugh
>
> I was planning to test it last night, but new code in mm/ failed horribly and
> was BUG()ing all over the place, so I didn't get any significant testing
> of this patch done.
>
> Will update...

Nothing complains anywhere, looks great.


Thanks,
Sasha