2003-02-26 16:38:44

by rain.wang

Subject: system hang on HDIO_DRIVE_RESET! help!

Hi,
I did an HDIO_DRIVE_RESET ioctl, but the system hung without any response
and only printed some messages from the kernel (v2.4.20):

hda: DMA disabled
hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
bug: kernel timer added twice at c01ce102

would you please help me with it?

Regards
rain.w
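
For readers who want to see what the call in question looks like from user space, here is a minimal sketch. It is not taken from the poster's program: the function name drive_reset is hypothetical, the device path is whatever the caller passes (the thread uses /dev/hda), and the three-int argument buffer is an assumption about what the driver expects (the driver may ignore it entirely). The call normally needs root, and on an affected kernel it can hang the machine, so treat it as illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>   /* HDIO_DRIVE_RESET */

/* Ask the IDE driver to reset the drive behind `path` (e.g. "/dev/hda").
 * Returns 0 on success, or -1 with errno describing the failure.
 * Hypothetical helper, not code from the thread. */
int drive_reset(const char *path)
{
    assert(path != NULL);

    /* O_NONBLOCK lets the open succeed even if the drive is not ready. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    int args[3] = {0, 0, 0};   /* assumption: contents unused by the driver */
    int rc = ioctl(fd, HDIO_DRIVE_RESET, args);

    int saved = errno;         /* close() must not clobber the ioctl error */
    close(fd);
    errno = saved;
    return rc == 0 ? 0 : -1;
}
```

Calling drive_reset("/dev/hda") as root would correspond roughly to what 'hdparm -w /dev/hda' does later in this thread.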





2003-02-26 18:32:11

by Alan

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

On Wed, 2003-02-26 at 16:45, rain.wang wrote:
> Hi,
> I did an HDIO_DRIVE_RESET ioctl, but the system hung without any response
> and only printed some messages from the kernel (v2.4.20):
>
> hda: DMA disabled
> hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
> bug: kernel timer added twice at c01ce102
>
> would you please help me with it?

Does this still occur on 2.4.21-pre? It should be fixed now.

2003-02-28 04:58:31

by rain.wang

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

Alan Cox wrote:

> On Wed, 2003-02-26 at 16:45, rain.wang wrote:
> > Hi,
> > I did an HDIO_DRIVE_RESET ioctl, but the system hung without any response
> > and only printed some messages from the kernel (v2.4.20):
> >
> > hda: DMA disabled
> > hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
> > bug: kernel timer added twice at c01ce102
> >
> > would you please help me with it?
>
> Does this still occur on 2.4.21-pre? It should be fixed now.

I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still exists,
with just the same messages as in 2.4.20.

rain.w


2003-02-28 12:22:28

by Alan

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

On Fri, 2003-02-28 at 05:04, rain.wang wrote:
> > Does this still occur on 2.4.21-pre? It should be fixed now.
>
> I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still exists,
> with just the same messages as in 2.4.20.

What controller are you using? I'll look into it a bit further.

2003-02-28 13:23:51

by rain.wang

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

Alan Cox wrote:

> On Fri, 2003-02-28 at 05:04, rain.wang wrote:
> > > Does this still occur on 2.4.21-pre? It should be fixed now.
> >
> > I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still exists,
> > with just the same messages as in 2.4.20.
>
> What controller are you using? I'll look into it a bit further.

An Intel 82801AA host controller. I found that when I disabled DMA before doing
the drive reset, the system did not hang most of the time. So it does not seem
tightly related to the host chip, does it?

rain.w
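
The workaround described above (turn DMA off first, then reset) might look like the following from user space. This is a sketch under assumptions: the function name reset_with_dma_off is hypothetical, HDIO_SET_DMA with a value of 0 is assumed to be the equivalent of 'hdparm -d0', and the three-int buffer passed to HDIO_DRIVE_RESET is assumed to be unused by the driver. As the poster notes, this only avoided the hang most of the time, so it is not a fix.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>   /* HDIO_SET_DMA, HDIO_DRIVE_RESET */

/* Disable DMA for the drive behind `path`, then request a reset.
 * Returns 0 if both steps succeed, -1 (errno set) otherwise.
 * Hypothetical helper, not code from the thread. */
int reset_with_dma_off(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* The DMA flag is passed by value, not through a pointer. */
    int rc = ioctl(fd, HDIO_SET_DMA, 0UL);
    if (rc == 0) {
        int args[3] = {0, 0, 0};   /* assumption: unused by the driver */
        rc = ioctl(fd, HDIO_DRIVE_RESET, args);
    }

    int saved = errno;             /* keep the ioctl error across close() */
    close(fd);
    errno = saved;
    return rc == 0 ? 0 : -1;
}
```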


2003-03-04 13:16:12

by rain.wang

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

"rain.wang" wrote:

> Alan Cox wrote:
>
> > On Wed, 2003-02-26 at 16:45, rain.wang wrote:
> > > Hi,
> > > I did an HDIO_DRIVE_RESET ioctl, but the system hung without any response
> > > and only printed some messages from the kernel (v2.4.20):
> > >
> > > hda: DMA disabled
> > > hda: ide_set_handler: handler not null; old=c01ce300, new=c01d4400
> > > bug: kernel timer added twice at c01ce102
> > >
> > > would you please help me with it?
> >
> > Does this still occur on 2.4.21-pre? It should be fixed now.
>
> I tested 'hdparm -w /dev/hda' under 2.4.21-pre4, but the problem still exists,
> with just the same messages as in 2.4.20.
>
> rain.w

Hi Alan,
I tested 'hdparm -w /dev/hda' under 2.4.21-pre5-ac1 and the system crashed
with a kernel oops message:
kernel BUG at ide-iops:1046!
...

can this be resolved?

rain.w

2003-03-04 14:12:35

by Alan

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

On Tue, 2003-03-04 at 13:22, rain.wang wrote:
> I tested 'hdparm -w /dev/hda' under 2.4.21-pre5-ac1 and the system crashed
> with a kernel oops message:
> kernel BUG at ide-iops:1046!
> ...
>
> can this be resolved?

Once I understand what all the problems are, yes. The BUG() is good, it
confirms that what we are both seeing is the same thing - the reset is
managing to issue two commands to the controller at the same time.
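
To make the "two commands at the same time" diagnosis concrete, here is a deliberately simplified user-space model. It is not kernel code: struct channel, set_handler and complete are hypothetical names, and the real driver's locking and timers are left out. The idea it illustrates is the one behind the "handler not null" message earlier in the thread: a channel has a single completion-handler slot, so arming a second command while the first is still outstanding is a bug that the driver can detect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef void (*handler_fn)(void);

/* One IDE channel, reduced to the single completion-handler slot. */
struct channel {
    handler_fn handler;   /* NULL while no command is outstanding */
};

/* Arm a completion handler for a new command.
 * Returns 0 on success, -1 if a command is already in flight. */
int set_handler(struct channel *ch, handler_fn new_handler)
{
    assert(ch != NULL && new_handler != NULL);

    if (ch->handler != NULL) {
        /* Second command issued before the first one completed. */
        fprintf(stderr, "set_handler: handler not null\n");
        return -1;
    }
    ch->handler = new_handler;
    return 0;
}

/* Finish the outstanding command: clear the slot, then run the handler. */
void complete(struct channel *ch)
{
    assert(ch != NULL);

    handler_fn h = ch->handler;
    ch->handler = NULL;
    if (h != NULL)
        h();
}

/* Placeholder handlers for an ordinary I/O request and for a reset. */
void on_io_done(void) {}
void on_reset_done(void) {}
```

In this model, a reset that calls set_handler while an I/O handler is still armed is rejected; once complete() has run, the reset can be armed normally.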

2003-03-07 05:57:31

by rain.wang

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

Alan Cox wrote:

> On Tue, 2003-03-04 at 13:22, rain.wang wrote:
> > I tested 'hdparm -w /dev/hda' under 2.4.21-pre5-ac1 and the system crashed
> > with a kernel oops message:
> > kernel BUG at ide-iops:1046!
> > ...
> >
> > can this be resolved?
>
> Once I understand what all the problems are, yes. The BUG() is good, it
> confirms that what we are both seeing is the same thing - the reset is
> managing to issue two commands to the controller at the same time.

Hi,
Thank you, Alan. I tested the pre5-ac2 patch and it all seems OK.

rain.w

2003-03-07 11:42:43

by Alan

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

On Fri, 2003-03-07 at 06:04, rain.wang wrote:
> > Once I understand what all the problems are, yes. The BUG() is good, it
> > confirms that what we are both seeing is the same thing - the reset is
> > managing to issue two commands to the controller at the same time.
>
> Hi,
> Thank you, Alan. I tested the pre5-ac2 patch and it all seems OK.

Thanks for the confirmation that it is fixed.

2003-03-14 08:21:20

by rain.wang

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

Alan Cox wrote:

> On Fri, 2003-03-07 at 06:04, rain.wang wrote:
> > > Once I understand what all the problems are, yes. The BUG() is good, it
> > > confirms that what we are both seeing is the same thing - the reset is
> > > managing to issue two commands to the controller at the same time.
> >
> > Hi,
> > Thank you, Alan. I tested the pre5-ac2 patch and it all seems OK.
>
> Thanks for the confirmation that it is fixed.

Hi Alan,
With the 2.4.21-pre5-ac2 and -ac3 patches there is still a problem with reset.
When I run 'hdparm -w /dev/hda' by hand, one invocation after another, all
seems OK. But when I write a shell script that runs 'hdparm -w' several times
in a loop, the system always crashes on the second iteration and leaves an
oops message:
kernel BUG at ide.c:1700!
...
So, are there any bugs still lurking there?

rain.w
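
The poster's shell script is not shown. As a rough C equivalent of "run the reset several times in a loop", the sketch below reopens the device and issues HDIO_DRIVE_RESET on each pass, the way separate 'hdparm -w' runs would. The function name reset_loop is hypothetical, and the three-int argument buffer is an assumption about what the driver expects. On the kernels discussed here this is exactly the pattern that crashed the machine, so it should only be run on a test system.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>   /* HDIO_DRIVE_RESET */

/* Issue `count` drive resets back to back on `path`.
 * Returns the number of resets that succeeded; stops early if the
 * device cannot be opened. Hypothetical helper, not from the thread. */
int reset_loop(const char *path, int count)
{
    assert(path != NULL);

    int ok = 0;
    for (int i = 0; i < count; i++) {
        /* Reopen each time, as a fresh 'hdparm -w' process would. */
        int fd = open(path, O_RDONLY | O_NONBLOCK);
        if (fd < 0)
            break;

        int args[3] = {0, 0, 0};   /* assumption: unused by the driver */
        if (ioctl(fd, HDIO_DRIVE_RESET, args) == 0)
            ok++;

        close(fd);
    }
    return ok;
}
```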


2003-03-14 09:04:21

by Andre Hedrick

Subject: Re: system hang on HDIO_DRIVE_RESET! help!


Rain,

The only way to deal with this is to treat the operations as failed and
punch them back out to block for clean up. Now we have failed the command.
However, I think I need to set a default block hook during the reset
process for the drive, channel, hba ... depending on the magnitude of the
wrecking ball generated. I need to offline Alan for this core dump.

The hang is in the clean ups after the reset.

I suspect the driver/hba is in DMA and drive is not.

Cheers,

Andre Hedrick
LAD Storage Consulting Group
------------------------------------
Pokemon (n), A Jamaican proctologist
------------------------------------

On Fri, 14 Mar 2003, rain.wang wrote:

> Alan Cox wrote:
>
> > On Fri, 2003-03-07 at 06:04, rain.wang wrote:
> > > > Once I understand what all the problems are, yes. The BUG() is good, it
> > > > confirms that what we are both seeing is the same thing - the reset is
> > > > managing to issue two commands to the controller at the same time.
> > >
> > > Hi,
> > > Thank you, Alan. I tested the pre5-ac2 patch and it all seems OK.
> >
> > Thanks for the confirmation that it is fixed.
>
> Hi Alan,
> With the 2.4.21-pre5-ac2 and -ac3 patches there is still a problem with reset.
> When I run 'hdparm -w /dev/hda' by hand, one invocation after another, all
> seems OK. But when I write a shell script that runs 'hdparm -w' several times
> in a loop, the system always crashes on the second iteration and leaves an
> oops message:
> kernel BUG at ide.c:1700!
> ...
> So, are there any bugs still lurking there?
>
> rain.w


2003-03-14 13:02:38

by Alan

Subject: Re: system hang on HDIO_DRIVE_RESET! help!

On Fri, 2003-03-14 at 09:13, Andre Hedrick wrote:
> Rain,
>
> The only way to deal with this is to treat the operations as failed and
> punch them back out to block for clean up. Now we have failed the command.
> However, I think I need to set a default block hook during the reset
> process for the drive, channel, hba ... depending on the magnitude of the
> wrecking ball generated. I need to offline Alan for this core dump.

I fixed one set of races with resets, and it doesn't surprise me that there is
another right now.