2002-09-14 18:23:15

by Alex Davis

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19

>Putting the drive in stand-by mode has the side effect of flushing
>the cache.
Maxtor's tech support says this is NOT true.

>So before poweroff, send the FLUSH CACHE command,
>then send the standby command, hope that one of them works ..
Problem is we're currently flushing the cache AFTER we do
standby...

>I put put-the-drive-in-standby-mode stuff in halt.c of sysvinit
>after several reports of fs corruption at poweroff and it seems
>to have fixed the problems for the people who reported them.
That code is only executed if the '-h' option is passed to halt:
Some distros (namely Slackware 7.x) pass the '-p' option instead
(look in /etc/rc.d/rc.0).


Ok how about this: I'm current testing some patches against
ide.c and friends. Why don't I just add ( and document ) a
define called NO_STANDBY_ON_SHUTDOWN which would live in
ide.c. By default it would not be defined. Then I just wrap
the standby code in an '#ifndef NO_STANDBY_ON_SHUTDOWN..#endif'
block.




__________________________________________________
Do You Yahoo!?
Yahoo! Finance - Get real-time stock quotes
http://finance.yahoo.com


2002-09-14 19:26:07

by Alan

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19

On Sat, 2002-09-14 at 19:28, Alex Davis wrote:
> >Putting the drive in stand-by mode has the side effect of flushing
> >the cache.
> Maxtor's tech support says this is NOT true.

Hint 1. Other people make disks too
Hint 2. The guys who did the code include a member of the standards
committee.


> Ok how about this: I'm current testing some patches against
> ide.c and friends. Why don't I just add ( and document ) a
> define called NO_STANDBY_ON_SHUTDOWN which would live in
> ide.c. By default it would not be defined. Then I just wrap
> the standby code in an '#ifndef NO_STANDBY_ON_SHUTDOWN..#endif'
> block.

Unless Andre agrees the change is required in the new IDE they won't be
going in.

2002-09-14 19:47:34

by Alex Davis

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19

> Hint 1. Other people make disks too.
I'm glad you and I realize that. It seems that others might not. So
far, in this thread, only one person using one brand of disk (IBM)
has found something in writing about the cache issue. Let's see,
that leaves Maxtor/Quantum, Seagate, Fujitsu, .....

> Hint 2. The guys who did the code include a member of the standards
> committee.
And your point is...?? Does this somehow preclude them being wrong??

>Unless Andre agrees the change is required in the new IDE they won't be
>going in.
Fair enough.



__________________________________________________
Do You Yahoo!?
Yahoo! Finance - Get real-time stock quotes
http://finance.yahoo.com

2002-09-14 20:55:33

by Andre Hedrick

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19


Hi Alex,

We (T13 Standards) only recently required (shall) all non-packet device to
support flush cache. No where does it state that a device supporting PM
for a standby (shall), the key word here is "shall", issue a flush-cache.

Does the early IBM laptop drive ring a bell?
IBM set an erratium about flush cache and spindown/power down, where it is
the "host driver" is responsible for the data.

So regardless of what you want, and what hardware you have, there is
hardware which doesn't do it properly.

So if yours gets it correct, great, dance for joy.

I will not break support for older hardware, on a whim.
You said you can make a patch, please do so and apply it to your tree.

One thing you will figure out is I am absolutely retentive to the SPEC,
and careful to not break older version of the standard, even retired
versions.

Now, if you want the option, submit the patch for review. For two or
three days there has been no patch to test.

To be absolutely honest, I really do not like to give options in the
kernel-config build which can cause backwards compatablity problems.
The only what I would consider it is to perform a revision check on
major/minor against the devices present and should any device violate,
forcablely OOPS the kernel into a deadlock crash before filesystems can be
mounted.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

On 14 Sep 2002, Alan Cox wrote:

> On Sat, 2002-09-14 at 19:28, Alex Davis wrote:
> > >Putting the drive in stand-by mode has the side effect of flushing
> > >the cache.
> > Maxtor's tech support says this is NOT true.
>
> Hint 1. Other people make disks too
> Hint 2. The guys who did the code include a member of the standards
> committee.
>
>
> > Ok how about this: I'm current testing some patches against
> > ide.c and friends. Why don't I just add ( and document ) a
> > define called NO_STANDBY_ON_SHUTDOWN which would live in
> > ide.c. By default it would not be defined. Then I just wrap
> > the standby code in an '#ifndef NO_STANDBY_ON_SHUTDOWN..#endif'
> > block.
>
> Unless Andre agrees the change is required in the new IDE they won't be
> going in.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2002-09-14 21:12:19

by Andre Hedrick

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19

On Sat, 14 Sep 2002, Alex Davis wrote:

> > Hint 1. Other people make disks too.
> I'm glad you and I realize that. It seems that others might not. So
> far, in this thread, only one person using one brand of disk (IBM)
> has found something in writing about the cache issue. Let's see,
> that leaves Maxtor/Quantum, Seagate, Fujitsu, .....

I can list a bunch, but I know about them under heavy NDA's with anvils
looming over head.

Like what happens if a drive issues a self flush cache and receives an
error so the next issue from user/kernel space will cause the device to
internally deadlock. Yeah this is a firmware bug, imho. Yet when it was
to be addressed by the commitee, "NONE" of the drive vendors reported back
their behavior, iirc. Thus the proposal was dropped.

> > Hint 2. The guys who did the code include a member of the standards
> > committee.
> And your point is...?? Does this somehow preclude them being wrong??

You should come in the room sometime and watch.
It is not so much being wrong, it is all in the language.

There are things in the standard, which make Bill Clinton look squeaky
clean. You think Clinton's "is" was bad.

Try this one, "READ_VERIFY"

You issue a write to platter, then a read_verify to have the device do an
internal comparison. Usually a bastardized benchmark pile of dung.
Some drive vendors in the past would pull the data out of dirty disk
buffer cache, and not off the platters. Translation it never made it to
platter, and you never know if it did. When caught by their competitors,
the language change to "shall have been read off the platter some time in
the past". Yet you just issued a write to platter, so that means that
data can not have been read in the past but must be in the future.

Future/Past the pull it out of cache.

Want more to make your guts turn?

Andre Hedrick
LAD Storage Consulting Group

2002-09-14 21:37:04

by Alex Davis

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19

--- Andre Hedrick <[email protected]> wrote:
>
> Hi Alex,
>
> We (T13 Standards) only recently required (shall) all non-packet device to
> support flush cache. No where does it state that a device supporting PM
> for a standby (shall), the key word here is "shall", issue a flush-cache.
I am assuming that a hard drive is a non-packet device. Let me make sure I'm
interpreting this correctly: older ( and some current ) drives may flush cache
on standby/sleep; current and future drives may not. In addition, older drives
may not support the flush cache command.

> I will not break support for older hardware, on a whim.
Not my intention.

> You said you can make a patch, please do so and apply it to your tree.
> Now, if you want the option, submit the patch for review. For two or
> three days there has been no patch to test.
Still testing locally. I also want to fix the code so that the flush is
done before the standby.

>
> To be absolutely honest, I really do not like to give options in the
> kernel-config build which can cause backwards compatablity problems.
This wouldn't be a config option. You would have to modify ide.c by
hand to disable standby.
>
> Andre Hedrick
> LAD Storage Consulting Group
>


__________________________________________________
Do You Yahoo!?
Yahoo! Finance - Get real-time stock quotes
http://finance.yahoo.com

2002-09-15 04:13:44

by Andre Hedrick

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19

On Sat, 14 Sep 2002, Alex Davis wrote:

> --- Andre Hedrick <[email protected]> wrote:
> >
> > Hi Alex,
> >
> > We (T13 Standards) only recently required (shall) all non-packet device to
> > support flush cache. No where does it state that a device supporting PM
> > for a standby (shall), the key word here is "shall", issue a flush-cache.
> I am assuming that a hard drive is a non-packet device. Let me make sure I'm

Today, yes ... In the past no.
There are a handfull of these strange beasts which still exists.

> interpreting this correctly: older ( and some current ) drives may flush cache
> on standby/sleep; current and future drives may not. In addition, older drives

It means there are not rules (rules is a loose term) for how to do this in
the standard.

> may not support the flush cache command.

There is supported v/s enabled, and these can be optional.
Optional == (Mandatory Optional) because nobody wants to not have the
feature ready or they will miss the sale.

> > I will not break support for older hardware, on a whim.
> Not my intention.

Cool.

> > You said you can make a patch, please do so and apply it to your tree.
> > Now, if you want the option, submit the patch for review. For two or
> > three days there has been no patch to test.
> Still testing locally. I also want to fix the code so that the flush is
> done before the standby.

Wait, how did the order go south?

> >
> > To be absolutely honest, I really do not like to give options in the
> > kernel-config build which can cause backwards compatablity problems.
> This wouldn't be a config option. You would have to modify ide.c by
> hand to disable standby.

So a manual config option? That is more reasonable and not doable by
accident.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

2002-09-15 07:35:25

by Alex Davis

[permalink] [raw]
Subject: Re: Possible bug and question about ide_notify_reboot in 2.4.19


--- Andre Hedrick <[email protected]> wrote:
> > > You said you can make a patch, please do so and apply it to your tree.
> > > Now, if you want the option, submit the patch for review. For two or
> > > three days there has been no patch to test.
> > Still testing locally. I also want to fix the code so that the flush is
> > done before the standby.
>
> Wait, how did the order go south?
>
I don't know how, but I know it happened somewhere between 2.4.18 and 2.4.19.
This is a snippet of code from the ide_notify_reboot() function in ide.c in
2.4.19 vanilla:
4024 for (unit = 0; unit < MAX_DRIVES; ++unit) {
4025 drive = &hwif->drives[unit];
4026 if (!drive->present)
4027 continue;
4028
4029 /* set the drive to standby */
4030 printk("%s ", drive->name);
4031 if (event != SYS_RESTART)
4032 if (drive->driver != NULL && DRIVER(drive)->standby(drive))
4033 continue;
4034
4035 if (drive->driver != NULL && DRIVER(drive)->cleanup(drive))
4036 continue;
4037 }

Notice that we are calling standby(), then cleanup(): standby() puts the disk
to sleep, but cleanup flushes the cache.

Here is the code for standby() from ide_disk.c in 2.4.19 vanilla:
1235 static int do_idedisk_standby (ide_drive_t *drive) // put disk to sleep.
1236 {
1237 struct hd_drive_task_hdr taskfile;
1238 struct hd_drive_hob_hdr hobfile;
1239 memset(&taskfile, 0, sizeof(struct hd_drive_task_hdr));
1240 memset(&hobfile, 0, sizeof(struct hd_drive_hob_hdr));
1241 taskfile.command = WIN_STANDBYNOW1;
1242 return ide_wait_taskfile(drive, &taskfile, &hobfile, NULL);
1243 }

Now here's the cleanup() function:
1420 static int idedisk_cleanup (ide_drive_t *drive)
1421 {
1422 if ((drive->id->cfs_enable_2 & 0x3000) && drive->wcache)
1423 if (do_idedisk_flushcache(drive))
1424 printk (KERN_INFO "%s: Write Cache FAILED Flushing!\n",
1425 drive->name);
1426 return ide_unregister_subdriver(drive);
1427 }

--------------------------------------------------------------------------

Looking at this snippet of the the first snippet
4031 if (event != SYS_RESTART)
4032 if (drive->driver != NULL && DRIVER(drive)->standby(drive))
4033 continue;
4034
4035 if (drive->driver != NULL && DRIVER(drive)->cleanup(drive))
4036 continue;
I also see another potential bug (actually, the original bug I posted about):
standby() returns 0 on success, and non-zero on failure. By this logic, if
standby() fails then cleanup() won't get called.




__________________________________________________
Do You Yahoo!?
Yahoo! Finance - Get real-time stock quotes
http://finance.yahoo.com