2010-08-08 20:18:10

by walt

[permalink] [raw]
Subject: [BISECTED] Today's Linus.git hangs during boot: can't find the floppy controller

This commit produces the error:

commit 6ee0578b4daaea01c96b172c6aacca43fd9807a6
Author: Suresh Siddha <[email protected]>
Date: Fri Jul 30 14:57:37 2010 -0700

workqueue: mark init_workqueues() as early_initcall()

Mark init_workqueues() as early_initcall() and thus it will be initialized
before smp bringup. init_workqueues() registers for the hotcpu notifier
and thus it should cope with the processors that are brought online after
the workqueues are initialized.

x86 smp bringup code uses workqueues and uses a workaround for the
cold boot process (as the workqueues are initialized post smp_init()).
Marking init_workqueues() as early_initcall() will pave the way for
cleaning up this code.

Signed-off-by: Suresh Siddha <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Andrew Morton <[email protected]>


Just after the error message about the floppy controller not found, the
machine hangs for two minutes and then this message:

task swapper:1 blocked for greater than 120 seconds, followed by a stack
trace, and again every two minutes AFAICT.

I'm not including all the gory messages and stack traces because I'm
hoping you'll know what the problem is without them. (Fingers crossed.)

BTW, I see this problem only on my dual-core machine, not the older single
single processor machine, as I would expect from reading the commit message.
(Both machine have properly functioning floppy drives.)

I'll be happy to supply more details if needed.

Thanks


2010-08-09 06:37:51

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: [BISECTED] Today's Linus.git hangs during boot: can't find the floppy controller

On Sun, Aug 08, 2010 at 01:17:53PM -0700, walt wrote:
> This commit produces the error:
>
> commit 6ee0578b4daaea01c96b172c6aacca43fd9807a6
> Author: Suresh Siddha <[email protected]>
> Date: Fri Jul 30 14:57:37 2010 -0700
>
> workqueue: mark init_workqueues() as early_initcall()
>
> Mark init_workqueues() as early_initcall() and thus it will be initialized
> before smp bringup. init_workqueues() registers for the hotcpu notifier
> and thus it should cope with the processors that are brought online after
> the workqueues are initialized.
>
> x86 smp bringup code uses workqueues and uses a workaround for the
> cold boot process (as the workqueues are initialized post smp_init()).
> Marking init_workqueues() as early_initcall() will pave the way for
> cleaning up this code.
>
> Signed-off-by: Suresh Siddha <[email protected]>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Oleg Nesterov <[email protected]>
> Cc: Andrew Morton <[email protected]>
>
>
> Just after the error message about the floppy controller not found, the
> machine hangs for two minutes and then this message:
>
> task swapper:1 blocked for greater than 120 seconds, followed by a stack
> trace, and again every two minutes AFAICT.
>
> I'm not including all the gory messages and stack traces because I'm
> hoping you'll know what the problem is without them. (Fingers crossed.)
>
> BTW, I see this problem only on my dual-core machine, not the older single
> single processor machine, as I would expect from reading the commit message.
> (Both machine have properly functioning floppy drives.)

(Added Tejun to CC)

I see a similar problem here. The kernel will boot, but the system will
not initialize (no X11).
After reverting the commit, the system starts normally and the only
workqueue problem left is drm related:

% dmesg | grep ERROR
[drm:drm_kms_helper_poll_enable] *ERROR* delayed enqueue failed 1
[drm:output_poll_execute] *ERROR* delayed enqueue failed 1
[drm:output_poll_execute] *ERROR* delayed enqueue failed 1
[drm:output_poll_execute] *ERROR* delayed enqueue failed 1
[drm:output_poll_execute] *ERROR* delayed enqueue failed 1
[drm:output_poll_execute] *ERROR* delayed enqueue failed 1
[drm:output_poll_execute] *ERROR* delayed enqueue failed 1
[drm:output_poll_execute] *ERROR* delayed enqueue failed 1
...


--
?A man who doesn't know he is in prison can never escape.?
William S. Burroughs

2010-08-09 08:29:08

by Heiko Carstens

[permalink] [raw]
Subject: Re: [BISECTED] Today's Linus.git hangs during boot: can't find the floppy controller

On Mon, Aug 09, 2010 at 08:37:42AM +0200, Markus Trippelsdorf wrote:
> On Sun, Aug 08, 2010 at 01:17:53PM -0700, walt wrote:
> > This commit produces the error:
> >
> > commit 6ee0578b4daaea01c96b172c6aacca43fd9807a6
> > Author: Suresh Siddha <[email protected]>
> > Date: Fri Jul 30 14:57:37 2010 -0700
> >
> > workqueue: mark init_workqueues() as early_initcall()
> >
> > Mark init_workqueues() as early_initcall() and thus it will be initialized
> > before smp bringup. init_workqueues() registers for the hotcpu notifier
> > and thus it should cope with the processors that are brought online after
> > the workqueues are initialized.
> >
> > x86 smp bringup code uses workqueues and uses a workaround for the
> > cold boot process (as the workqueues are initialized post smp_init()).
> > Marking init_workqueues() as early_initcall() will pave the way for
> > cleaning up this code.
> >
> > Signed-off-by: Suresh Siddha <[email protected]>
> > Signed-off-by: Tejun Heo <[email protected]>
> > Cc: Oleg Nesterov <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> >
> >
> > Just after the error message about the floppy controller not found, the
> > machine hangs for two minutes and then this message:
> >
> > task swapper:1 blocked for greater than 120 seconds, followed by a stack
> > trace, and again every two minutes AFAICT.
> >
> > I'm not including all the gory messages and stack traces because I'm
> > hoping you'll know what the problem is without them. (Fingers crossed.)
> >
> > BTW, I see this problem only on my dual-core machine, not the older single
> > single processor machine, as I would expect from reading the commit message.
> > (Both machine have properly functioning floppy drives.)
>
> (Added Tejun to CC)
>
> I see a similar problem here. The kernel will boot, but the system will
> not initialize (no X11).
> After reverting the commit, the system starts normally and the only
> workqueue problem left is drm related:

I would guess that both of you have CONFIG_HOTPLUG_CPU disabled and enabling
that config option will make the problem disappear.
>From the problem description and the patch it looks like it got screwed up
the same way I did two years ago.
See commits a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635 and
4403b406d4369a275d483ece6ddee0088cc0d592.

2010-08-09 08:33:11

by Heiko Carstens

[permalink] [raw]
Subject: Re: [BISECTED] Today's Linus.git hangs during boot: can't find the floppy controller

On Mon, Aug 09, 2010 at 10:30:53AM +0200, Heiko Carstens wrote:
> On Mon, Aug 09, 2010 at 08:37:42AM +0200, Markus Trippelsdorf wrote:
> > On Sun, Aug 08, 2010 at 01:17:53PM -0700, walt wrote:
> > > This commit produces the error:
> > >
> > > commit 6ee0578b4daaea01c96b172c6aacca43fd9807a6
> > > Author: Suresh Siddha <[email protected]>
> > > Date: Fri Jul 30 14:57:37 2010 -0700
> > >
> > > workqueue: mark init_workqueues() as early_initcall()
> > >
> > > Mark init_workqueues() as early_initcall() and thus it will be initialized
> > > before smp bringup. init_workqueues() registers for the hotcpu notifier
> > > and thus it should cope with the processors that are brought online after
> > > the workqueues are initialized.
> > >
> > > x86 smp bringup code uses workqueues and uses a workaround for the
> > > cold boot process (as the workqueues are initialized post smp_init()).
> > > Marking init_workqueues() as early_initcall() will pave the way for
> > > cleaning up this code.
> > >
> > > Signed-off-by: Suresh Siddha <[email protected]>
> > > Signed-off-by: Tejun Heo <[email protected]>
> > > Cc: Oleg Nesterov <[email protected]>
> > > Cc: Andrew Morton <[email protected]>
> > >
> > >
> > > Just after the error message about the floppy controller not found, the
> > > machine hangs for two minutes and then this message:
> > >
> > > task swapper:1 blocked for greater than 120 seconds, followed by a stack
> > > trace, and again every two minutes AFAICT.
> > >
> > > I'm not including all the gory messages and stack traces because I'm
> > > hoping you'll know what the problem is without them. (Fingers crossed.)
> > >
> > > BTW, I see this problem only on my dual-core machine, not the older single
> > > single processor machine, as I would expect from reading the commit message.
> > > (Both machine have properly functioning floppy drives.)
> >
> > (Added Tejun to CC)
> >
> > I see a similar problem here. The kernel will boot, but the system will
> > not initialize (no X11).
> > After reverting the commit, the system starts normally and the only
> > workqueue problem left is drm related:
>
> I would guess that both of you have CONFIG_HOTPLUG_CPU disabled and enabling
> that config option will make the problem disappear.
> From the problem description and the patch it looks like it got screwed up
> the same way I did two years ago.
> See commits a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635 and
> 4403b406d4369a275d483ece6ddee0088cc0d592.

Add Suresh to cc list.

2010-08-09 09:05:40

by Tejun Heo

[permalink] [raw]
Subject: Re: [BISECTED] Today's Linus.git hangs during boot: can't find the floppy controller

Hello,

On 08/09/2010 10:34 AM, Heiko Carstens wrote:
>>> I see a similar problem here. The kernel will boot, but the system will
>>> not initialize (no X11).
>>> After reverting the commit, the system starts normally and the only
>>> workqueue problem left is drm related:
>>
>> I would guess that both of you have CONFIG_HOTPLUG_CPU disabled and enabling
>> that config option will make the problem disappear.
>> From the problem description and the patch it looks like it got screwed up
>> the same way I did two years ago.
>> See commits a802dd0eb5fc97a50cf1abb1f788a8f6cc5db635 and
>> 4403b406d4369a275d483ece6ddee0088cc0d592.

Heh, that's almost hilarious. I'll see if this can be fixed
differently this time.

Thanks.

--
tejun

2010-08-09 09:37:21

by Tejun Heo

[permalink] [raw]
Subject: [PATCH wq#for-linus] workqueue: workqueue_cpu_callback() should be cpu_notifier not hotcpu_notifier

Commit 6ee0578b (workqueue: mark init_workqueues as early_initcall)
made workqueue SMP initialization depend on workqueue_cpu_callback(),
which however was registered as hotcpu_notifier() and didn't get
called if CONFIG_HOTPLUG_CPU is not set. This made gcwqs on non-boot
CPUs not create their initial workers leading to boot failures. Fix
it by making it a cpu_notifier.

Signed-off-by: Tejun Heo <[email protected]>
Reported-and-bisected-by: walt <[email protected]>
---
So, something like this. Can you please verify the fix?

Thanks.

kernel/workqueue.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index da6c482..2994a0e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -3527,7 +3527,7 @@ static int __init init_workqueues(void)
unsigned int cpu;
int i;

- hotcpu_notifier(workqueue_cpu_callback, CPU_PRI_WORKQUEUE);
+ cpu_notifier(workqueue_cpu_callback, CPU_PRI_WORKQUEUE);

/* initialize gcwqs */
for_each_gcwq_cpu(cpu) {

2010-08-09 09:46:34

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: [PATCH wq#for-linus] workqueue: workqueue_cpu_callback() should be cpu_notifier not hotcpu_notifier

On Mon, Aug 09, 2010 at 11:36:20AM +0200, Tejun Heo wrote:
> Commit 6ee0578b (workqueue: mark init_workqueues as early_initcall)
> made workqueue SMP initialization depend on workqueue_cpu_callback(),
> which however was registered as hotcpu_notifier() and didn't get
> called if CONFIG_HOTPLUG_CPU is not set. This made gcwqs on non-boot
> CPUs not create their initial workers leading to boot failures. Fix
> it by making it a cpu_notifier.
>
> So, something like this. Can you please verify the fix?

This fixes the boot problem here. Thanks.
(The drm delayed enqueue problem, which I mentioned earlier still
persists.)

--
?A man who doesn't know he is in prison can never escape.?
William S. Burroughs

2010-08-09 09:50:00

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH wq#for-linus] workqueue: workqueue_cpu_callback() should be cpu_notifier not hotcpu_notifier

Hello,

On 08/09/2010 11:46 AM, Markus Trippelsdorf wrote:
> This fixes the boot problem here. Thanks.

Great. May I add your Tested-by?

> (The drm delayed enqueue problem, which I mentioned earlier still
> persists.)

Yeah, I'm looking into it now but it looks like the error message is
simply spurious. queue_delayed_work() returns 1 if the work was
actually queued and 0 if it was already pending and thus the function
call was no-op. output_poll_execute() is incorrectly interpreting 1
return as error. I'll look through the history and try to find out
whether/how wq changes affected the behavior, but the fix is most
likely simply killing the message.

Thanks.

--
tejun

2010-08-09 09:56:30

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: [PATCH wq#for-linus] workqueue: workqueue_cpu_callback() should be cpu_notifier not hotcpu_notifier

On Mon, Aug 09, 2010 at 11:49:00AM +0200, Tejun Heo wrote:
> Hello,
>
> On 08/09/2010 11:46 AM, Markus Trippelsdorf wrote:
> > This fixes the boot problem here. Thanks.
>
> Great. May I add your Tested-by?

Sure.
Tested-by: Markus Trippelsdorf <[email protected]>

> > (The drm delayed enqueue problem, which I mentioned earlier still
> > persists.)
>
> Yeah, I'm looking into it now but it looks like the error message is
> simply spurious. queue_delayed_work() returns 1 if the work was
> actually queued and 0 if it was already pending and thus the function
> call was no-op. output_poll_execute() is incorrectly interpreting 1
> return as error. I'll look through the history and try to find out
> whether/how wq changes affected the behavior, but the fix is most
> likely simply killing the message.

Yes, this looks like a cosmetic issue and I observe no other graphics
problems at all.
--
?A man who doesn't know he is in prison can never escape.?
William S. Burroughs

2010-08-09 10:01:49

by Tejun Heo

[permalink] [raw]
Subject: [PATCH wq#for-linus] drm: fix a fallout from slow-work -> wq conversion

Commit 991ea75c (drm: use workqueue instead of slow-work), which made
drm to use wq instead of slow-work, didn't account for the return
value difference between delayed_slow_work_enqueue() and
queue_delayed_work(). The former returns 0 on success and -errno on
failures while the latter never fails and only uses the return value
to indicate whether the work was already pending or not.

This misconversion triggered spurious error messages. Remove the now
unnecessary return value check and error message.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Markus Trippelsdorf <[email protected]>
Cc: David Airlie <[email protected]>
Cc: [email protected]
---
Markus, it's almost trivial but it would be great if you can test this
one too.

David, may I route this wq#for-linus?

Thanks.

drivers/gpu/drm/drm_crtc_helper.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 4598130..211ed7e 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -839,7 +839,6 @@ static void output_poll_execute(struct work_struct *work)
struct drm_connector *connector;
enum drm_connector_status old_status, status;
bool repoll = false, changed = false;
- int ret;

mutex_lock(&dev->mode_config.mutex);
list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
@@ -874,11 +873,8 @@ static void output_poll_execute(struct work_struct *work)
dev->mode_config.funcs->output_poll_changed(dev);
}

- if (repoll) {
- ret = queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
- if (ret)
- DRM_ERROR("delayed enqueue failed %d\n", ret);
- }
+ if (repoll)
+ queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
}

void drm_kms_helper_poll_disable(struct drm_device *dev)

2010-08-09 10:14:11

by Markus Trippelsdorf

[permalink] [raw]
Subject: Re: [PATCH wq#for-linus] drm: fix a fallout from slow-work -> wq conversion

On Mon, Aug 09, 2010 at 12:00:49PM +0200, Tejun Heo wrote:
> Commit 991ea75c (drm: use workqueue instead of slow-work), which made
> drm to use wq instead of slow-work, didn't account for the return
> value difference between delayed_slow_work_enqueue() and
> queue_delayed_work(). The former returns 0 on success and -errno on
> failures while the latter never fails and only uses the return value
> to indicate whether the work was already pending or not.
>
> This misconversion triggered spurious error messages. Remove the now
> unnecessary return value check and error message.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-by: Markus Trippelsdorf <[email protected]>
> Cc: David Airlie <[email protected]>
> Cc: [email protected]
> ---
> Markus, it's almost trivial but it would be great if you can test this
> one too.

Looks good, but drm_kms_helper_poll_disable needs the same treatment.


diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 4598130..b9e4dbf 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -839,7 +839,6 @@ static void output_poll_execute(struct work_struct *work)
struct drm_connector *connector;
enum drm_connector_status old_status, status;
bool repoll = false, changed = false;
- int ret;

mutex_lock(&dev->mode_config.mutex);
list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
@@ -874,11 +873,8 @@ static void output_poll_execute(struct work_struct *work)
dev->mode_config.funcs->output_poll_changed(dev);
}

- if (repoll) {
- ret = queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
- if (ret)
- DRM_ERROR("delayed enqueue failed %d\n", ret);
- }
+ if (repoll)
+ queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
}

void drm_kms_helper_poll_disable(struct drm_device *dev)
@@ -893,18 +889,14 @@ void drm_kms_helper_poll_enable(struct drm_device *dev)
{
bool poll = false;
struct drm_connector *connector;
- int ret;

list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
if (connector->polled)
poll = true;
}

- if (poll) {
- ret = queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
- if (ret)
- DRM_ERROR("delayed enqueue failed %d\n", ret);
- }
+ if (poll)
+ queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
}
EXPORT_SYMBOL(drm_kms_helper_poll_enable);

--
?A man who doesn't know he is in prison can never escape.?
William S. Burroughs

2010-08-09 10:21:12

by Tejun Heo

[permalink] [raw]
Subject: [PATCH] drm: fix fallouts from slow-work -> wq conversion

>From 9a919c46dfa48a9c1f465174609b90253eb8ffc1 Mon Sep 17 00:00:00 2001
From: Tejun Heo <[email protected]>
Date: Mon, 9 Aug 2010 12:01:27 +0200

Commit 991ea75c (drm: use workqueue instead of slow-work), which made
drm to use wq instead of slow-work, didn't account for the return
value difference between delayed_slow_work_enqueue() and
queue_delayed_work(). The former returns 0 on success and -errno on
failures while the latter never fails and only uses the return value
to indicate whether the work was already pending or not.

This misconversion triggered spurious error messages. Remove the now
unnecessary return value check and error message.

Markus: caught another incorrect conversion in drm_kms_helper_poll_enable()

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Markus Trippelsdorf <[email protected]>
Tested-by: Markus Trippelsdorf <[email protected]>
Cc: David Airlie <[email protected]>
Cc: [email protected]
---
Oops, you're right. So, this should do it.

Thank you.

drivers/gpu/drm/drm_crtc_helper.c | 16 ++++------------
1 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 4598130..b9e4dbf 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -839,7 +839,6 @@ static void output_poll_execute(struct work_struct *work)
struct drm_connector *connector;
enum drm_connector_status old_status, status;
bool repoll = false, changed = false;
- int ret;

mutex_lock(&dev->mode_config.mutex);
list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
@@ -874,11 +873,8 @@ static void output_poll_execute(struct work_struct *work)
dev->mode_config.funcs->output_poll_changed(dev);
}

- if (repoll) {
- ret = queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
- if (ret)
- DRM_ERROR("delayed enqueue failed %d\n", ret);
- }
+ if (repoll)
+ queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
}

void drm_kms_helper_poll_disable(struct drm_device *dev)
@@ -893,18 +889,14 @@ void drm_kms_helper_poll_enable(struct drm_device *dev)
{
bool poll = false;
struct drm_connector *connector;
- int ret;

list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
if (connector->polled)
poll = true;
}

- if (poll) {
- ret = queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
- if (ret)
- DRM_ERROR("delayed enqueue failed %d\n", ret);
- }
+ if (poll)
+ queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
}
EXPORT_SYMBOL(drm_kms_helper_poll_enable);

--
1.7.1

2010-08-09 14:00:36

by walt

[permalink] [raw]
Subject: Re: [PATCH] drm: fix fallouts from slow-work -> wq conversion

On 08/09/2010 03:20 AM, Tejun Heo wrote:
> From 9a919c46dfa48a9c1f465174609b90253eb8ffc1 Mon Sep 17 00:00:00 2001
> From: Tejun Heo<[email protected]>
> Date: Mon, 9 Aug 2010 12:01:27 +0200
>
> Commit 991ea75c (drm: use workqueue instead of slow-work), which made
> drm to use wq instead of slow-work, didn't account for the return
> value difference between delayed_slow_work_enqueue() and
> queue_delayed_work(). The former returns 0 on success and -errno on
> failures while the latter never fails and only uses the return value
> to indicate whether the work was already pending or not.
>
> This misconversion triggered spurious error messages. Remove the now
> unnecessary return value check and error message.


> diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
> index 4598130..b9e4dbf 100644
> --- a/drivers/gpu/drm/drm_crtc_helper.c
> +++ b/drivers/gpu/drm/drm_crtc_helper.c
> @@ -839,7 +839,6 @@ static void output_poll_execute(struct work_struct *work)
> struct drm_connector *connector;
> enum drm_connector_status old_status, status;
> bool repoll = false, changed = false;
> - int ret;
>
> mutex_lock(&dev->mode_config.mutex);
> list_for_each_entry(connector,&dev->mode_config.connector_list, head) {
> @@ -874,11 +873,8 @@ static void output_poll_execute(struct work_struct *work)
> dev->mode_config.funcs->output_poll_changed(dev);
> }
>
> - if (repoll) {
> - ret = queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
> - if (ret)
> - DRM_ERROR("delayed enqueue failed %d\n", ret);
> - }
> + if (repoll)
> + queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
> }
>
> void drm_kms_helper_poll_disable(struct drm_device *dev)
> @@ -893,18 +889,14 @@ void drm_kms_helper_poll_enable(struct drm_device *dev)
> {
> bool poll = false;
> struct drm_connector *connector;
> - int ret;
>
> list_for_each_entry(connector,&dev->mode_config.connector_list, head) {
> if (connector->polled)
> poll = true;
> }
>
> - if (poll) {
> - ret = queue_delayed_work(system_nrt_wq,&dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
> - if (ret)
> - DRM_ERROR("delayed enqueue failed %d\n", ret);
> - }
> + if (poll)
> + queue_delayed_work(system_nrt_wq,&dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
> }
> EXPORT_SYMBOL(drm_kms_helper_poll_enable);


I was getting the same spurious error messages, and this patches fixes it, thanks.

2010-08-09 14:14:19

by walt

[permalink] [raw]
Subject: Re: [PATCH wq#for-linus] workqueue: workqueue_cpu_callback() should be cpu_notifier not hotcpu_notifier

On 08/09/2010 02:36 AM, Tejun Heo wrote:
> Commit 6ee0578b (workqueue: mark init_workqueues as early_initcall)
> made workqueue SMP initialization depend on workqueue_cpu_callback(),
> which however was registered as hotcpu_notifier() and didn't get
> called if CONFIG_HOTPLUG_CPU is not set. This made gcwqs on non-boot
> CPUs not create their initial workers leading to boot failures. Fix
> it by making it a cpu_notifier.
> ---
> So, something like this. Can you please verify the fix?

>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index da6c482..2994a0e 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3527,7 +3527,7 @@ static int __init init_workqueues(void)
> unsigned int cpu;
> int i;
>
> - hotcpu_notifier(workqueue_cpu_callback, CPU_PRI_WORKQUEUE);
> + cpu_notifier(workqueue_cpu_callback, CPU_PRI_WORKQUEUE);

This fixes the hang during boot for me too, thanks.

2010-08-09 17:08:37

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH wq#for-linus] workqueue: workqueue_cpu_callback() should be cpu_notifier not hotcpu_notifier

On Mon, 2010-08-09 at 02:36 -0700, Tejun Heo wrote:
> Commit 6ee0578b (workqueue: mark init_workqueues as early_initcall)
> made workqueue SMP initialization depend on workqueue_cpu_callback(),
> which however was registered as hotcpu_notifier() and didn't get
> called if CONFIG_HOTPLUG_CPU is not set. This made gcwqs on non-boot
> CPUs not create their initial workers leading to boot failures. Fix
> it by making it a cpu_notifier.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-and-bisected-by: walt <[email protected]>
> ---
> So, something like this. Can you please verify the fix?
>
> Thanks.
>
> kernel/workqueue.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index da6c482..2994a0e 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -3527,7 +3527,7 @@ static int __init init_workqueues(void)
> unsigned int cpu;
> int i;
>
> - hotcpu_notifier(workqueue_cpu_callback, CPU_PRI_WORKQUEUE);
> + cpu_notifier(workqueue_cpu_callback, CPU_PRI_WORKQUEUE);
>
> /* initialize gcwqs */
> for_each_gcwq_cpu(cpu) {

Acked-by: Suresh Siddha <[email protected]>

2018-06-19 20:33:33

by Dave Airlie

[permalink] [raw]
Subject: Re: [PATCH] drm: fix fallouts from slow-work -> wq conversion

On 9 August 2010 at 20:20, Tejun Heo <[email protected]> wrote:
> >From 9a919c46dfa48a9c1f465174609b90253eb8ffc1 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <[email protected]>
> Date: Mon, 9 Aug 2010 12:01:27 +0200
>
> Commit 991ea75c (drm: use workqueue instead of slow-work), which made
> drm to use wq instead of slow-work, didn't account for the return
> value difference between delayed_slow_work_enqueue() and
> queue_delayed_work(). The former returns 0 on success and -errno on
> failures while the latter never fails and only uses the return value
> to indicate whether the work was already pending or not.
>
> This misconversion triggered spurious error messages. Remove the now
> unnecessary return value check and error message.
>
> Markus: caught another incorrect conversion in drm_kms_helper_poll_enable()

Acked-by: David Airlie <[email protected]>

For queuing via your tree.

Thanks,
Dave.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Reported-by: Markus Trippelsdorf <[email protected]>
> Tested-by: Markus Trippelsdorf <[email protected]>
> Cc: David Airlie <[email protected]>
> Cc: [email protected]
> ---
> Oops, you're right. So, this should do it.
>
> Thank you.
>
> drivers/gpu/drm/drm_crtc_helper.c | 16 ++++------------
> 1 files changed, 4 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
> index 4598130..b9e4dbf 100644
> --- a/drivers/gpu/drm/drm_crtc_helper.c
> +++ b/drivers/gpu/drm/drm_crtc_helper.c
> @@ -839,7 +839,6 @@ static void output_poll_execute(struct work_struct *work)
> struct drm_connector *connector;
> enum drm_connector_status old_status, status;
> bool repoll = false, changed = false;
> - int ret;
>
> mutex_lock(&dev->mode_config.mutex);
> list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
> @@ -874,11 +873,8 @@ static void output_poll_execute(struct work_struct *work)
> dev->mode_config.funcs->output_poll_changed(dev);
> }
>
> - if (repoll) {
> - ret = queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
> - if (ret)
> - DRM_ERROR("delayed enqueue failed %d\n", ret);
> - }
> + if (repoll)
> + queue_delayed_work(system_nrt_wq, delayed_work, DRM_OUTPUT_POLL_PERIOD);
> }
>
> void drm_kms_helper_poll_disable(struct drm_device *dev)
> @@ -893,18 +889,14 @@ void drm_kms_helper_poll_enable(struct drm_device *dev)
> {
> bool poll = false;
> struct drm_connector *connector;
> - int ret;
>
> list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
> if (connector->polled)
> poll = true;
> }
>
> - if (poll) {
> - ret = queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
> - if (ret)
> - DRM_ERROR("delayed enqueue failed %d\n", ret);
> - }
> + if (poll)
> + queue_delayed_work(system_nrt_wq, &dev->mode_config.output_poll_work, DRM_OUTPUT_POLL_PERIOD);
> }
> EXPORT_SYMBOL(drm_kms_helper_poll_enable);
>
> --
> 1.7.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
> _______________________________________________
> dri-devel mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/dri-devel