2009-11-26 08:00:57

by Stephen Rothwell

[permalink] [raw]
Subject: linux-next: workqueues tree build failure

Hi Tejun,

Today's linux-next build (x86_64 allmodconfig) failed like this:

sound/soc/codecs/tlv320dac33.c: In function 'dac33_i2c_probe':
sound/soc/codecs/tlv320dac33.c:1121: error: implicit declaration of function 'create_rt_workqueue'
sound/soc/codecs/tlv320dac33.c:1121: warning: assignment makes pointer from integer without a cast

Caused by commit 1b2d88bf4c63ed3a8c9033c358905d3463aa8bc2 ("stop_machine:
reimplement without using workqueue") (which removed create_rt_workqueue
()) interacting with commit c8bf93f0fe8c5a509a29e30f3bac823fa0f6d96e
("ASoC: Codec driver for Texas Instruments tlv320dac33 codec") from the
sound tree.

I have no idea how to fix this, so I have used the workqueues tree from
next-20091125 until someone can suggest a solution.
--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/


Attachments:
(No filename) (867.00 B)
(No filename) (198.00 B)
Download all attachments

2009-11-26 08:04:20

by Tejun Heo

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

11/26/2009 05:00 PM, Stephen Rothwell wrote:
> Hi Tejun,
>
> Today's linux-next build (x86_64 allmodconfig) failed like this:
>
> sound/soc/codecs/tlv320dac33.c: In function 'dac33_i2c_probe':
> sound/soc/codecs/tlv320dac33.c:1121: error: implicit declaration of function 'create_rt_workqueue'
> sound/soc/codecs/tlv320dac33.c:1121: warning: assignment makes pointer from integer without a cast
>
> Caused by commit 1b2d88bf4c63ed3a8c9033c358905d3463aa8bc2 ("stop_machine:
> reimplement without using workqueue") (which removed create_rt_workqueue
> ()) interacting with commit c8bf93f0fe8c5a509a29e30f3bac823fa0f6d96e
> ("ASoC: Codec driver for Texas Instruments tlv320dac33 codec") from the
> sound tree.
>
> I have no idea how to fix this, so I have used the workqueues tree from
> next-20091125 until someone can suggest a solution.

Takashi, RT workqueue is going away. Do you really need it?

Thanks.

--
tejun

2009-11-26 08:18:31

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

Hello,

On Thursday 26 November 2009 10:04:07 ext Tejun Heo wrote:
> 11/26/2009 05:00 PM, Stephen Rothwell wrote:
> > Hi Tejun,
> >
> > Today's linux-next build (x86_64 allmodconfig) failed like this:
> >
> > sound/soc/codecs/tlv320dac33.c: In function 'dac33_i2c_probe':
> > sound/soc/codecs/tlv320dac33.c:1121: error: implicit declaration of
> > function 'create_rt_workqueue' sound/soc/codecs/tlv320dac33.c:1121:
> > warning: assignment makes pointer from integer without a cast
> >
> > Caused by commit 1b2d88bf4c63ed3a8c9033c358905d3463aa8bc2 ("stop_machine:
> > reimplement without using workqueue") (which removed create_rt_workqueue
> > ()) interacting with commit c8bf93f0fe8c5a509a29e30f3bac823fa0f6d96e
> > ("ASoC: Codec driver for Texas Instruments tlv320dac33 codec") from the
> > sound tree.
> >
> > I have no idea how to fix this, so I have used the workqueues tree from
> > next-20091125 until someone can suggest a solution.
>
> Takashi, RT workqueue is going away. Do you really need it?

What can be used instead of RT workqueue?
The tlv320dac33 needs RT workqueue because I need to send the I2C command with
minimum delay to the codec. If this can not be done (the workqueue is delayed),
and the codec does not receive the command in time, it will literally die.

What are the options to replace the RT workqueue?

>
> Thanks.
>

--
P?ter

2009-11-26 09:12:43

by Tejun Heo

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

Hello,

11/26/2009 05:16 PM, Peter Ujfalusi wrote:
>> Takashi, RT workqueue is going away. Do you really need it?
>
> What can be used instead of RT workqueue?
> The tlv320dac33 needs RT workqueue because I need to send the I2C
> command with minimum delay to the codec. If this can not be done
> (the workqueue is delayed), and the codec does not receive the
> command in time, it will literally die. What are the options to
> replace the RT workqueue?

The problem with RT workqueue is that RT and queue don't really mix
well. To act in real time, it requires all the resource pre-allocated
and dedicated to it making queueing or pooling meaningless. The
original workqueue code created dedicated pool of threads for each
workqueue so it could be used for RT but new implementation uses
shared worker pool, so it can't be used as an interface to dedicated
threads.

I haven't read the code but,

* If you need to respond fast, wouldn't you be doing that from IRQ
handler or softirq? Do you need task context?

* Or is it that it's not triggered by IRQ but once the transfer
started it can't be interrupted? But in this case preempt_disable()
or local_irq_disable() should suffice.

Thanks.

--
tejun

2009-11-26 09:24:28

by Takashi Iwai

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

At Thu, 26 Nov 2009 18:12:26 +0900,
Tejun Heo wrote:
>
> Hello,
>
> 11/26/2009 05:16 PM, Peter Ujfalusi wrote:
> >> Takashi, RT workqueue is going away. Do you really need it?
> >
> > What can be used instead of RT workqueue?
> > The tlv320dac33 needs RT workqueue because I need to send the I2C
> > command with minimum delay to the codec. If this can not be done
> > (the workqueue is delayed), and the codec does not receive the
> > command in time, it will literally die. What are the options to
> > replace the RT workqueue?
>
> The problem with RT workqueue is that RT and queue don't really mix
> well. To act in real time, it requires all the resource pre-allocated
> and dedicated to it making queueing or pooling meaningless. The
> original workqueue code created dedicated pool of threads for each
> workqueue so it could be used for RT but new implementation uses
> shared worker pool, so it can't be used as an interface to dedicated
> threads.
>
> I haven't read the code but,
>
> * If you need to respond fast, wouldn't you be doing that from IRQ
> handler or softirq? Do you need task context?
>
> * Or is it that it's not triggered by IRQ but once the transfer
> started it can't be interrupted? But in this case preempt_disable()
> or local_irq_disable() should suffice.

The relevant code uses the workqueue as a sort of BH, just triggers
from the hard irq handler. If any, we may use a threaded handler or
so...


thanks,

Takashi

2009-11-26 09:32:28

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Thursday 26 November 2009 11:12:26 ext Tejun Heo wrote:
> Hello,
>
> 11/26/2009 05:16 PM, Peter Ujfalusi wrote:
> >> Takashi, RT workqueue is going away. Do you really need it?
> >
> > What can be used instead of RT workqueue?
> > The tlv320dac33 needs RT workqueue because I need to send the I2C
> > command with minimum delay to the codec. If this can not be done
> > (the workqueue is delayed), and the codec does not receive the
> > command in time, it will literally die. What are the options to
> > replace the RT workqueue?
>
> The problem with RT workqueue is that RT and queue don't really mix
> well. To act in real time, it requires all the resource pre-allocated
> and dedicated to it making queueing or pooling meaningless. The
> original workqueue code created dedicated pool of threads for each
> workqueue so it could be used for RT but new implementation uses
> shared worker pool, so it can't be used as an interface to dedicated
> threads.
>
> I haven't read the code but,
>
> * If you need to respond fast, wouldn't you be doing that from IRQ
> handler or softirq? Do you need task context?

I2C communication can not be done in interrupt context.
I'll take a look at the threaded IRQ, but AFAIK it is just a wrapper to use the
global workqueue (I'm not sure, I have not actually checked it).
For a quick fix, I can convert the tlv320dac33 driver to use this, and revisit
later, if it does not fulfill the timing requirements for the HW.

>
> * Or is it that it's not triggered by IRQ but once the transfer
> started it can't be interrupted? But in this case preempt_disable()
> or local_irq_disable() should suffice.

As Takashi already commented, it is used as a BH.

>
> Thanks.
>

--
P?ter

2009-11-26 10:18:56

by Mark Brown

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Thu, Nov 26, 2009 at 11:31:15AM +0200, Peter Ujfalusi wrote:

> For a quick fix, I can convert the tlv320dac33 driver to use this, and revisit
> later, if it does not fulfill the timing requirements for the HW.

If there are problems then creating a dedicated thread you can wake
would probably do the job.

2009-11-26 11:45:35

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Thursday 26 November 2009 12:18:59 ext Mark Brown wrote:
> On Thu, Nov 26, 2009 at 11:31:15AM +0200, Peter Ujfalusi wrote:
> > For a quick fix, I can convert the tlv320dac33 driver to use this, and
> > revisit later, if it does not fulfill the timing requirements for the HW.
>
> If there are problems then creating a dedicated thread you can wake
> would probably do the job.

For now I'm going to change the create_rt_workqueue to
create_singlethread_workqueue, that is available at the wq.git of kernel.org.

If it is not sufficient, than I might use dedicated thread, which I woke up, but
I would like to keep that as a backup plan, since it needs more work ;)

I'll send the patch in a few minutes.

--
P?ter

2009-11-26 12:43:04

by Andy Walls

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Thu, 2009-11-26 at 18:12 +0900, Tejun Heo wrote:
> Hello,
>
> 11/26/2009 05:16 PM, Peter Ujfalusi wrote:
> >> Takashi, RT workqueue is going away. Do you really need it?
> >
> > What can be used instead of RT workqueue?
> > The tlv320dac33 needs RT workqueue because I need to send the I2C
> > command with minimum delay to the codec. If this can not be done
> > (the workqueue is delayed), and the codec does not receive the
> > command in time, it will literally die. What are the options to
> > replace the RT workqueue?
>
> The problem with RT workqueue is that RT and queue don't really mix
> well. To act in real time, it requires all the resource pre-allocated
> and dedicated to it making queueing or pooling meaningless. The
> original workqueue code created dedicated pool of threads for each
> workqueue so it could be used for RT but new implementation uses
> shared worker pool, so it can't be used as an interface to dedicated
> threads.
>
> I haven't read the code but,
>
> * If you need to respond fast, wouldn't you be doing that from IRQ
> handler or softirq? Do you need task context?

Hi Tejun,

I'm not sure doing things like I2C transactions in the in the top half
of the IRQ handler is generally viable. On shared IRQ lines, wouldn't
this hold off the interrupt for another device for too long?

For example, I already ran across the case of an error path in the ahci
disk controller driver interrupt handler holding off interrupts from the
cx18 driver longer than the CX23418 firmware would tolerate on a shared
interrupt line.

Workhandlers for deferring work are a nice way to avoid such bad system
level interactions.

Regards,
Andy


> * Or is it that it's not triggered by IRQ but once the transfer
> started it can't be interrupted? But in this case preempt_disable()
> or local_irq_disable() should suffice.
>
> Thanks.

2009-11-26 12:51:34

by Andy Walls

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Thu, 2009-11-26 at 13:44 +0200, Peter Ujfalusi wrote:
> On Thursday 26 November 2009 12:18:59 ext Mark Brown wrote:
> > On Thu, Nov 26, 2009 at 11:31:15AM +0200, Peter Ujfalusi wrote:
> > > For a quick fix, I can convert the tlv320dac33 driver to use this, and
> > > revisit later, if it does not fulfill the timing requirements for the HW.
> >
> > If there are problems then creating a dedicated thread you can wake
> > would probably do the job.
>
> For now I'm going to change the create_rt_workqueue to
> create_singlethread_workqueue, that is available at the wq.git of kernel.org.
>
> If it is not sufficient, than I might use dedicated thread, which I woke up, but
> I would like to keep that as a backup plan, since it needs more work ;)

Peter,

I would suspect using a single-threaded workqueue is better than a
wake_up() of another thread. IIRC, after queuing work, the workqueue's
single thread may run almost immediately on the same processor. With
waking up sleeping threads, I've run into scheduler delays around 10 ms
on a dual core x86_64 desktop system.

Regards,
Andy

> I'll send the patch in a few minutes.

2009-11-26 12:56:56

by Mark Brown

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Thu, Nov 26, 2009 at 07:40:25AM -0500, Andy Walls wrote:

> I'm not sure doing things like I2C transactions in the in the top half
> of the IRQ handler is generally viable. On shared IRQ lines, wouldn't
> this hold off the interrupt for another device for too long?

You're going to need to do I2C I/O to acknowledge and deassert the
interrupt on most of these devices so if someone's shared the IRQ line
with something that's too latency sensitive the hardware is broken
anyway.

> Workhandlers for deferring work are a nice way to avoid such bad system
> level interactions.

In order to cope with the fact that the IRQ can't be deasserted without
talking to the device it will almost always be masked while waiting for
the deferred work.

2009-11-26 13:25:05

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Thursday 26 November 2009 14:49:56 ext Andy Walls wrote:
>
> Peter,
>
> I would suspect using a single-threaded workqueue is better than a
> wake_up() of another thread. IIRC, after queuing work, the workqueue's
> single thread may run almost immediately on the same processor. With
> waking up sleeping threads, I've run into scheduler delays around 10 ms
> on a dual core x86_64 desktop system.

Hello Andy,

I have sent the patch which changes from rt to singlethread, I hope it fixes the
breakage in linux-next.
In short testing, when there is virtually no load on the system I can not see
any difference, which might change later.
But I'll keep in mind that probably I'm not going to better off with the waking
up of a sleeping thread.

>
> Regards,
> Andy

Thanks,
P?ter

2009-11-27 02:03:09

by Tejun Heo

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

Hello,

11/26/2009 09:40 PM, Andy Walls wrote:
>> * If you need to respond fast, wouldn't you be doing that from IRQ
>> handler or softirq? Do you need task context?
>
> I'm not sure doing things like I2C transactions in the in the top half
> of the IRQ handler is generally viable. On shared IRQ lines, wouldn't
> this hold off the interrupt for another device for too long?
>
> For example, I already ran across the case of an error path in the ahci
> disk controller driver interrupt handler holding off interrupts from the
> cx18 driver longer than the CX23418 firmware would tolerate on a shared
> interrupt line.

Sounds like it should be using bottom half tasklet not workqueue.
Tasklet is exactly designed to handle situations like this. Is there
any reason tasklet can't be used?

Thanks.

--
tejun

2009-11-27 08:37:32

by Takashi Iwai

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

At Fri, 27 Nov 2009 11:02:25 +0900,
Tejun Heo wrote:
>
> Hello,
>
> 11/26/2009 09:40 PM, Andy Walls wrote:
> >> * If you need to respond fast, wouldn't you be doing that from IRQ
> >> handler or softirq? Do you need task context?
> >
> > I'm not sure doing things like I2C transactions in the in the top half
> > of the IRQ handler is generally viable. On shared IRQ lines, wouldn't
> > this hold off the interrupt for another device for too long?
> >
> > For example, I already ran across the case of an error path in the ahci
> > disk controller driver interrupt handler holding off interrupts from the
> > cx18 driver longer than the CX23418 firmware would tolerate on a shared
> > interrupt line.
>
> Sounds like it should be using bottom half tasklet not workqueue.
> Tasklet is exactly designed to handle situations like this. Is there
> any reason tasklet can't be used?

Right now the h/w accessing code is using mutex. I'm not sure whether
the deeper part might sleep, though...


Takashi

2009-11-27 08:42:43

by Tejun Heo

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

Hello,

11/27/2009 05:37 PM, Takashi Iwai wrote:
>> Sounds like it should be using bottom half tasklet not workqueue.
>> Tasklet is exactly designed to handle situations like this. Is there
>> any reason tasklet can't be used?
>
> Right now the h/w accessing code is using mutex. I'm not sure whether
> the deeper part might sleep, though...

Ah... I see. Using mutex from a handler where response time is
critical is strange tho. Anyways, I don't really think singlethread
will satisfy the timing requirement under loaded conditions. IMHO,
update locking and using tasklets would be the best.

Thanks.

--
tejun

2009-11-27 10:10:23

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

Hello,

On Friday 27 November 2009 10:42:27 ext Tejun Heo wrote:
> Hello,
>
> 11/27/2009 05:37 PM, Takashi Iwai wrote:
> >> Sounds like it should be using bottom half tasklet not workqueue.
> >> Tasklet is exactly designed to handle situations like this. Is there
> >> any reason tasklet can't be used?
> >
> > Right now the h/w accessing code is using mutex. I'm not sure whether
> > the deeper part might sleep, though...
>
> Ah... I see. Using mutex from a handler where response time is
> critical is strange tho. Anyways, I don't really think singlethread
> will satisfy the timing requirement under loaded conditions. IMHO,
> update locking and using tasklets would be the best.

Hmmm, yes, I'm aware that it might be not sufficient to use the singlethread wq,
but that was the fastest fix that I could think of, and it was needed for the
linux-next.
In case of the tlv320dac33 driver, I can also play with the codec as well to
adjust the required latency, but if I hit the wall, than for sure I will look
for another solution for the problem.
So far this is working for my development needs, but time will tell if I need to
make bigger modifications.

For the record: on OMAP platforms the I2C implementation is sleeping during the
transfer (internally using interrupt based transfer), which might be not that
wise in the tasklet. Since the I2C bus is kind of a slow bus, this could take
several ms in some cases.

Thank you,
P?ter

2009-11-27 11:38:06

by Mark Brown

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Fri, Nov 27, 2009 at 12:09:18PM +0200, Peter Ujfalusi wrote:

> For the record: on OMAP platforms the I2C implementation is sleeping during the
> transfer (internally using interrupt based transfer), which might be not that
> wise in the tasklet. Since the I2C bus is kind of a slow bus, this could take
> several ms in some cases.

There's also the possibility of contention with other users of the I2C
bus which doesn't help matters here.

2009-11-27 13:53:07

by Andy Walls

[permalink] [raw]
Subject: Re: linux-next: workqueues tree build failure

On Fri, 2009-11-27 at 11:02 +0900, Tejun Heo wrote:
> Hello,
>
> 11/26/2009 09:40 PM, Andy Walls wrote:
> >> * If you need to respond fast, wouldn't you be doing that from IRQ
> >> handler or softirq? Do you need task context?
> >
> > I'm not sure doing things like I2C transactions in the in the top half
> > of the IRQ handler is generally viable.

I'll back off on this statement a little as Mark Brown pointed out. Two
16 bit exchanges on a 100 kHz I2C bus is, as a lower bound, 32/100000 =
320 microseconds (ignoring stop and start bits, turn-around times, I2C
master register access times, etc.). 320 usec isn't bad.

But in general having a slow top half can have bad effects on other
drivers sharing an IRQ line.

> On shared IRQ lines, wouldn't
> > this hold off the interrupt for another device for too long?
> >
> > For example, I already ran across the case of an error path in the ahci
> > disk controller driver interrupt handler holding off interrupts from the
> > cx18 driver longer than the CX23418 firmware would tolerate on a shared
> > interrupt line.
>
> Sounds like it should be using bottom half tasklet not workqueue.
> Tasklet is exactly designed to handle situations like this. Is there
> any reason tasklet can't be used?

Well if we're talking about the cx18 driver:

1. I recall reading a while ago that tasklets were on the way out.
Here's a recent reference:
http://lwn.net/Articles/2.6-kernel-api/
and an older one:
http://lwn.net/Articles/239633/

2. CX23418 IRQs that need deferred work can come in in very close
succession. This is due to *one* mailbox of information from firmware
to the cx18 driver and buffer notifications for multiple streams coming
in on that mailbox and the firmware doesn't have a long timeout waiting
for the mailbox to be handled by the cx18 driver. I needed to make sure
sequential events get queued up properly and none of them dropped.
Queueing up multiple defered work items with a tasklet seemed
problematic compared to a workqueue.

(3. Once upon a time, the cx18 buffer queues where protected by mutexes,
which can sleep. Sleeping wasn't allowed in a tasklet but was in
workhandlers. Those queue locks are now spinlocks though.)


But in the case I mentioned above for the cx18 and ahci drivers with
devices that share an interrupt - tasklet vs. workqueue doesn't matter.

There is a variable, but often short, time interval from CX23418 raising
the IRQn to provide notification of a completed buffer DMA, until the
CX23418 gives up waiting for cx18 driver ack and begins to overwrite
that notification with a new one.

If the ahci driver is servicing an interrupt for the shared IRQn and
leaving it masked for a very long time, like it does in an error
handling path, it is the ahci driver's interrupt service routine that is
consuming the cx18 driver's timeline, not the workqueue delays.

The fix for that is a system level one: don't have a data acquisition
type peripherial like a CX23418 based TV capture card, share an IRQ with
some other peripherial whose linux driver might have a sometimes slow
top-half. Unfortunately, a normal home end user doesn't know to do that
until he asks for help.


Thanks.

Regards,
Andy


> Thanks.