2004-04-27 18:12:48

by Kenneth Johansson

[permalink] [raw]
Subject: [BUG] DVD writing in 2.6.6-rc2

I have a problem when using growisofs version 5.19.

The problem is that in the very end when gowisofs tries to flush the
cache. When stracing the process I can see it sits in a call to poll
that never returns.

I noticed that if I start growisofs and later attach to it with "strace
-p" I can make it continue with killing the strace process. just to see
it hang in the next poll. But re attaching then killing the strace again
a few times and growisofs finally dose a normal exit.

This happens every time.

The only unusual thing with my setup is that I only have the DVD burner
on the IDE controller no disks so I guess it's some type of missed
interrupt.

I looked at the drivers/ide/pci/via82cxxx.c but that one did not do much
apart from setting a few configurations so I guess it's some generic
code that dose the real work.

----------VIA BusMastering IDE Configuration----------------
Driver Version: 3.38
South Bridge: VIA vt8233a
Revision: ISA 0x0 IDE 0x6
Highest DMA rate: UDMA133
BM-DMA base: 0x9800
PCI clock: 33.3MHz
Master Read Cycle IRDY: 0ws
Master Write Cycle IRDY: 0ws
BM IDE Status Register Read Retry: yes
Max DRDY Pulse Width: No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush: yes yes
End Sector FIFO flush: no no
Prefetch Buffer: no no
Post Write Buffer: no no
Enabled: yes yes
Simplex only: no no
Cable Type: 80w 80w
-------------------drive0----drive1----drive2----drive3-----
Transfer Mode: PIO PIO UDMA UDMA
Address Setup: 120ns 120ns 120ns 120ns
Cmd Active: 360ns 360ns 90ns 90ns
Cmd Recovery: 210ns 210ns 30ns 30ns
Data Active: 330ns 330ns 90ns 90ns
Data Recovery: 270ns 270ns 30ns 30ns
Cycle Time: 600ns 600ns 60ns 60ns
Transfer Rate: 3.3MB/s 3.3MB/s 33.3MB/s 33.3MB/s





2004-04-27 18:36:16

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Tue, Apr 27 2004, Kenneth Johansson wrote:
> I have a problem when using growisofs version 5.19.
>
> The problem is that in the very end when gowisofs tries to flush the
> cache. When stracing the process I can see it sits in a call to poll
> that never returns.
>
> I noticed that if I start growisofs and later attach to it with "strace
> -p" I can make it continue with killing the strace process. just to see
> it hang in the next poll. But re attaching then killing the strace again
> a few times and growisofs finally dose a normal exit.
>
> This happens every time.

I noted the same thing yesterday with cdrdao, so yours is not an
isolated incident. I'll debug it tomorrow.

--
Jens Axboe

2004-04-27 22:16:33

by Moritz Muehlenhoff

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

Jens Axboe wrote:
>> I have a problem when using growisofs version 5.19.
>>
>> The problem is that in the very end when gowisofs tries to flush the
>> cache. When stracing the process I can see it sits in a call to poll
>> that never returns.
>
> I noted the same thing yesterday with cdrdao, so yours is not an
> isolated incident. I'll debug it tomorrow.

I can confirm this bug for kernel 2.6.3-rc4 as well, so it's not too recent.
I found the following on my console after quitting k3b ungracefully, maybe
it's useful for narrowing down the problem:

Message from syslogd@gandalf at Tue Apr 27 22:45:53 2004 ...
gandalf kernel: MCE: The hardware reports a non fatal, correctable incident occu
rred on CPU 0.

Message from syslogd@gandalf at Tue Apr 27 22:45:53 2004 ...
gandalf kernel: Bank 2: 940040000000017a

gandalf:/root# Mutex destroy failure: Device or resource busy
gandalf:/root# ICE default IO error handler doing an exit(), pid = 17846, errno = 0
ICE default IO error handler doing an exit(), pid = 17791, errno = 2

2004-04-27 22:24:18

by Panagiotis Papadakos

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

I think this is a growisofs 5.19 problem.
You should download:
http://fy.chalmers.se/~appro/linux/DVD+RW/tools/dvd+rw-tools-5.19-1.4.9.7.tar.gz

Panagiotis Papadakos

On Tue, 27 Apr 2004, Kenneth Johansson wrote:

> I have a problem when using growisofs version 5.19.
>
> The problem is that in the very end when gowisofs tries to flush the
> cache. When stracing the process I can see it sits in a call to poll
> that never returns.
>
> I noticed that if I start growisofs and later attach to it with "strace
> -p" I can make it continue with killing the strace process. just to see
> it hang in the next poll. But re attaching then killing the strace again
> a few times and growisofs finally dose a normal exit.
>
> This happens every time.
>
> The only unusual thing with my setup is that I only have the DVD burner
> on the IDE controller no disks so I guess it's some type of missed
> interrupt.
>
> I looked at the drivers/ide/pci/via82cxxx.c but that one did not do much
> apart from setting a few configurations so I guess it's some generic
> code that dose the real work.
>
> ----------VIA BusMastering IDE Configuration----------------
> Driver Version: 3.38
> South Bridge: VIA vt8233a
> Revision: ISA 0x0 IDE 0x6
> Highest DMA rate: UDMA133
> BM-DMA base: 0x9800
> PCI clock: 33.3MHz
> Master Read Cycle IRDY: 0ws
> Master Write Cycle IRDY: 0ws
> BM IDE Status Register Read Retry: yes
> Max DRDY Pulse Width: No limit
> -----------------------Primary IDE-------Secondary IDE------
> Read DMA FIFO flush: yes yes
> End Sector FIFO flush: no no
> Prefetch Buffer: no no
> Post Write Buffer: no no
> Enabled: yes yes
> Simplex only: no no
> Cable Type: 80w 80w
> -------------------drive0----drive1----drive2----drive3-----
> Transfer Mode: PIO PIO UDMA UDMA
> Address Setup: 120ns 120ns 120ns 120ns
> Cmd Active: 360ns 360ns 90ns 90ns
> Cmd Recovery: 210ns 210ns 30ns 30ns
> Data Active: 330ns 330ns 90ns 90ns
> Data Recovery: 270ns 270ns 30ns 30ns
> Cycle Time: 600ns 600ns 60ns 60ns
> Transfer Rate: 3.3MB/s 3.3MB/s 33.3MB/s 33.3MB/s
>
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2004-04-28 11:31:26

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Wed, Apr 28 2004, Moritz Muehlenhoff wrote:
> Jens Axboe wrote:
> >> I have a problem when using growisofs version 5.19.
> >>
> >> The problem is that in the very end when gowisofs tries to flush the
> >> cache. When stracing the process I can see it sits in a call to poll
> >> that never returns.
> >
> > I noted the same thing yesterday with cdrdao, so yours is not an
> > isolated incident. I'll debug it tomorrow.
>
> I can confirm this bug for kernel 2.6.3-rc4 as well, so it's not too recent.
> I found the following on my console after quitting k3b ungracefully, maybe

This is really strange, I haven't been able to locate a kernel problem.
Looking at command traces, cdrdao issues two FLUSH_CACHE commands when
ending the writes. The last commands are:

0x2a (last real write)
0x35 (first flush cache) (takes about 10 seconds to complete)
0x00 (test unit ready)
0x51 (read disc info, length 4)
0x35 (2nd flush cache) -> never completes

That last sync cache never generates an interrupt, so cdrdao gets stuck
forever waiting on it. I cannot even reproduce this with a test case

--
Jens Axboe

2004-04-28 20:12:23

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Wed, Apr 28 2004, Kenneth Johansson wrote:
> On Wed, 2004-04-28 at 13:30, Jens Axboe wrote:
> > On Wed, Apr 28 2004, Moritz Muehlenhoff wrote:
> > > Jens Axboe wrote:
> > > >> I have a problem when using growisofs version 5.19.
> > > >>
> > > >> The problem is that in the very end when gowisofs tries to flush the
> > > >> cache. When stracing the process I can see it sits in a call to poll
> > > >> that never returns.
> > > >
> > > > I noted the same thing yesterday with cdrdao, so yours is not an
> > > > isolated incident. I'll debug it tomorrow.
> > >
> > > I can confirm this bug for kernel 2.6.3-rc4 as well, so it's not too recent.
> > > I found the following on my console after quitting k3b ungracefully, maybe
> >
> > This is really strange, I haven't been able to locate a kernel problem.
> > Looking at command traces, cdrdao issues two FLUSH_CACHE commands when
> > ending the writes. The last commands are:
> >
> > 0x2a (last real write)
> > 0x35 (first flush cache) (takes about 10 seconds to complete)
> > 0x00 (test unit ready)
> > 0x51 (read disc info, length 4)
> > 0x35 (2nd flush cache) -> never completes
> >
> > That last sync cache never generates an interrupt, so cdrdao gets stuck
> > forever waiting on it. I cannot even reproduce this with a test case
>
> My problem was in growisofs itself. They use poll to wait instead of
> using nanosleep and totally confusing me. It did not help that strace do
> not print out the arguments until the syscall return so I never saw the
> input to poll.

So it just helped me unearth a different problem :-). There certainly is
a bug here, it looks like it's hardware though (see the above
description). ide-cd just needs to have it's SYNC_CACHE retries
limited, then the kernel should at least recover.

--
Jens Axboe

2004-04-28 20:04:09

by Moritz Muehlenhoff

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

Jens Axboe wrote:
> This is really strange, I haven't been able to locate a kernel problem.

In the case of k3b it turned out to be a bug in growisofs from the
dvd+rw-tools package; with the hotfix included in Debian/sid every-
things works as expected.
It seems very strange that both triggered a similar bug like this,
though.

2004-04-28 19:56:31

by Kenneth Johansson

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Wed, 2004-04-28 at 13:30, Jens Axboe wrote:
> On Wed, Apr 28 2004, Moritz Muehlenhoff wrote:
> > Jens Axboe wrote:
> > >> I have a problem when using growisofs version 5.19.
> > >>
> > >> The problem is that in the very end when gowisofs tries to flush the
> > >> cache. When stracing the process I can see it sits in a call to poll
> > >> that never returns.
> > >
> > > I noted the same thing yesterday with cdrdao, so yours is not an
> > > isolated incident. I'll debug it tomorrow.
> >
> > I can confirm this bug for kernel 2.6.3-rc4 as well, so it's not too recent.
> > I found the following on my console after quitting k3b ungracefully, maybe
>
> This is really strange, I haven't been able to locate a kernel problem.
> Looking at command traces, cdrdao issues two FLUSH_CACHE commands when
> ending the writes. The last commands are:
>
> 0x2a (last real write)
> 0x35 (first flush cache) (takes about 10 seconds to complete)
> 0x00 (test unit ready)
> 0x51 (read disc info, length 4)
> 0x35 (2nd flush cache) -> never completes
>
> That last sync cache never generates an interrupt, so cdrdao gets stuck
> forever waiting on it. I cannot even reproduce this with a test case

My problem was in growisofs itself. They use poll to wait instead of
using nanosleep and totally confusing me. It did not help that strace do
not print out the arguments until the syscall return so I never saw the
input to poll.



2004-04-28 22:13:44

by Chris Wedgwood

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Wed, Apr 28, 2004 at 10:09:54PM +0200, Jens Axboe wrote:

> So it just helped me unearth a different problem :-). There
> certainly is a bug here, it looks like it's hardware though (see the
> above description). ide-cd just needs to have it's SYNC_CACHE
> retries limited, then the kernel should at least recover.

I see cache flushes also go mute and hang forwever. I think ideally
we need to hack the IDE layer to have sane timeouts or something
perhaps?


--cw

2004-04-29 06:20:08

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Wed, Apr 28 2004, Chris Wedgwood wrote:
> On Wed, Apr 28, 2004 at 10:09:54PM +0200, Jens Axboe wrote:
>
> > So it just helped me unearth a different problem :-). There
> > certainly is a bug here, it looks like it's hardware though (see the
> > above description). ide-cd just needs to have it's SYNC_CACHE
> > retries limited, then the kernel should at least recover.
>
> I see cache flushes also go mute and hang forwever. I think ideally
> we need to hack the IDE layer to have sane timeouts or something
> perhaps?

Yeah, with 'retries' I mean waiting retries. The command does expire and
ide-cd notices it just doesn't put an upper bound on how long it waits.
A minute or so should suffice, unless the caller specifies otherwise.

--
Jens Axboe

2004-04-29 06:24:45

by Chris Wedgwood

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Thu, Apr 29, 2004 at 08:19:58AM +0200, Jens Axboe wrote:

> Yeah, with 'retries' I mean waiting retries. The command does expire
> and ide-cd notices it just doesn't put an upper bound on how long it
> waits. A minute or so should suffice, unless the caller specifies
> otherwise.

Can we poke the IDE drive is it's hung or reset it in these cases? I
wonder more in the general case where I've had to reboot because the
burner went nuts for some reason...

--cw

2004-04-29 06:30:40

by Jens Axboe

[permalink] [raw]
Subject: Re: [BUG] DVD writing in 2.6.6-rc2

On Wed, Apr 28 2004, Chris Wedgwood wrote:
> On Thu, Apr 29, 2004 at 08:19:58AM +0200, Jens Axboe wrote:
>
> > Yeah, with 'retries' I mean waiting retries. The command does expire
> > and ide-cd notices it just doesn't put an upper bound on how long it
> > waits. A minute or so should suffice, unless the caller specifies
> > otherwise.
>
> Can we poke the IDE drive is it's hung or reset it in these cases? I
> wonder more in the general case where I've had to reboot because the
> burner went nuts for some reason...

Drive needs to be reset. Something like this should be all that's
needed.

===== drivers/ide/ide-cd.c 1.81 vs edited =====
--- 1.81/drivers/ide/ide-cd.c Mon Apr 26 22:56:10 2004
+++ edited/drivers/ide/ide-cd.c Thu Apr 29 08:29:07 2004
@@ -844,7 +844,6 @@
case GPCMD_FORMAT_UNIT:
case GPCMD_RESERVE_RZONE_TRACK:
case GPCMD_CLOSE_TRACK:
- case GPCMD_FLUSH_CACHE:
wait = ATAPI_WAIT_PC;
break;
default:

--
Jens Axboe