LinuxLists.cc - [RFC] Disk shock protection (revisited)

2008-02-26 00:33:54

Subject: [RFC] Disk shock protection (revisited)

Hi all,

at the moment I'm having another go at trying to make the disk shock
protection patch fit for upstream submission. However, there are still
some fundamental issues I'd like to discuss in order to make sure that
I'm heading in the right direction.

The general idea: A daemon running in user space monitors input data
from an accelerometer. When the daemon detects a critical condition,
i.e., a sudden acceleration (for instance, laptop slides off the desk),
it signals the kernel so the hard disk may be put into a (more) safe
state. To this end, the kernel has to issue an idle immediate command
with unload feature and stop the block layer queue afterwards. Once the
daemon tells us that the imminent danger is over, the most important
task for the kernel is to restart the block layer queue. See below for
more details.

This project is (and I personally am) mainly concerned with laptops
equipped with an accelerometer and an (S)ATA hard drive that supports
the unload feature of the idle immediate command. Jens Axboe, however,
suggested right from the beginning that there might be more general
applications for the block layer queue freezing part of the story. The
question is now to what extent are the requirements for a disk shock
protection facility (specific to ATA devices) and a general block layer
queue freezing facility compatible and in what way should they be
exposed to user space.

Probably, the major problem is that I don't really know what kind of
applications (apart from shock protection) I should be thinking of that
might want to have a queue freezing facility at hand. Still, one thing
seems obvious to me: For disk shock protection, time is of the essence,
whereas in the more general case of simple block layer queue freezing,
the situation is different as far as lower levels are concerned. In
particular, I'm inclined to believe that in the context of such a
general application it would be desirable to be able to freeze the queue
of an ATA device *without* issuing an idle immediate command first.
Obviously, the interface exposed to userspace would have to provide for
these diverging needs.

The disk-protect patch in it's current form [1] got stuck somewhere
between trying to provide a general queue freezing facility and
accommodating the needs of a disk shock protection setup. The sysfs
attributes required to request immediate disk parking from user space
are exported under /sys/block/. This is very convenient from the user's
point of view because the hierarchy is intuitive and you can easily find
the subdirectory associated to your hard disk. Conceptually, though, it
doesn't feel right. That is, for simple queue freezing, it is perfectly
alright, of course, but I don't see why and, indeed, how an ATA specific
feature like immediate disk parking could be controlled from the block
layer in a straight forward way. Besides, Jens, quite understandably,
objects to the introduction of yet another queue hook which is the
current way of telling ATA and non-ATA devices apart. Instead, he
suggests to implement generic block layer notification requests like
REQ_LB_OP_FREEZE of type REQ_TYPE_LINUX_BLOCK and let low level drivers
act upon it as they see fit. But then we would still need a way to
configure the way libata / ide actuaaly does respond to those block
layer messages. As explained above, the user might want to choose
whether or idle immediate is to be issued or simple queue freezing is
enough for his / her purposes. Besides, some drives that actually
support the unload feature of the idle immediate command don't report
that capability in the IDENTIFY data, so userspace needs a way to tell
the driver that the feature is available after all.

So, roughly my questions are these:

1. Who is to be in charge for the shock protection application? Should
userspace speak to libata / ide directly (through sysfs) and the low
level drivers will notify block layer to stop the queeue or should
userspace always talk to the block layer, regardless whether we want
to park an ATA disk or just freeze the queue? In the latter case we'd
still need the option to configure the exact behaviour for ATA
devices.
2. Depending on the answer to the previous question, by what mechanism
should block layer and lld interact? Special requests, queue hooks or
something in some way similar to power management functions (once
suggested by James Bottomley)?
3. What is the preferred way to pass device specific configuration
options to libata (preferrably at runtime, i.e., after module
loading)?

Please let me know if you need any further information. Also, I will
certainly have more questions once I try to my hand at any of your
suggestions.

Thanks in advance,

Elias

[1] http://article.gmane.org/gmane.linux.drivers.hdaps.devel/1094

2008-02-26 00:02:39

by Jeff Garzik

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Elias Oltmanns wrote:
> The general idea: A daemon running in user space monitors input data
> from an accelerometer. When the daemon detects a critical condition,
> i.e., a sudden acceleration (for instance, laptop slides off the desk),
> it signals the kernel so the hard disk may be put into a (more) safe
> state. To this end, the kernel has to issue an idle immediate command
> with unload feature and stop the block layer queue afterwards. Once the
> daemon tells us that the imminent danger is over, the most important
> task for the kernel is to restart the block layer queue. See below for
> more details.

Speaking specifically to that problem, it seems to me that you either
want an mlock'd daemon, or just simply to keep everything in the kernel,
for this specific solution.

You don't want, for example, to swap out other apps, swap in the daemon,
in order to handle a sudden acceleration.

Jeff

2008-02-26 00:31:19

by Elias Oltmanns

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Jeff Garzik <[email protected]> wrote:
> Elias Oltmanns wrote:
>> The general idea: A daemon running in user space monitors input data
>> from an accelerometer. When the daemon detects a critical condition,
>> i.e., a sudden acceleration (for instance, laptop slides off the desk),
>> it signals the kernel so the hard disk may be put into a (more) safe
>> state. To this end, the kernel has to issue an idle immediate command
>> with unload feature and stop the block layer queue afterwards. Once the
>> daemon tells us that the imminent danger is over, the most important
>> task for the kernel is to restart the block layer queue. See below for
>> more details.
>
> Speaking specifically to that problem, it seems to me that you either
> want an mlock'd daemon, or just simply to keep everything in the
> kernel, for this specific solution.

Yes, the daemon is mlock'd.

>
> You don't want, for example, to swap out other apps, swap in the
> daemon, in order to handle a sudden acceleration.

Quite. But with mlock this particular problem can be handled in user
space just fine. The only reason I can see right now for putting this
logic into the kernel as well is to keep the functionality around even
after task freeze during suspend / resume. On the other hand, I don't
know whether this is really worth the effort even though the time when
the suspend operation is in progress can arguably be one of the most
accident-prone moments (think of users packing their things in a hurry).

Regards,

Elias

2008-02-26 01:33:27

by Henrique de Moraes Holschuh

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

On Tue, 26 Feb 2008, Elias Oltmanns wrote:
> > You don't want, for example, to swap out other apps, swap in the
> > daemon, in order to handle a sudden acceleration.
>
> Quite. But with mlock this particular problem can be handled in user
> space just fine. The only reason I can see right now for putting this

And, as long as there is a way to also invoke it from within the kernel, we
can call it from inside the kernel as well when there is a reason to make
that function available. Full flexibility is easily attainable here and
nothing we should bother about too much.

> logic into the kernel as well is to keep the functionality around even

Some hardware (Apple's?) has the entire APS logic in firmware (and AFAIK
*also* export the accelerometer data for other uses). On those boxes, if
you want to trust the firmware, you just ignore the accelerometer and hook
to an interrupt. When you get the interrupt, you freeze the queue and
unload heads. Adding that to work in-kernel would be trivial.

Adding a suspend-time-only emergency HDAPS in-kernel monitor thread would
also be doable, if we wanted to duplicate that for ThinkPads (I don't really
think it is needed, but...). As long as queue freezing and the required
unload immediate procedure can be called in at *any* time before the disk
device, and its buses and controller are suspended, it would do to implement
whatever we feel it is needed.

--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh

2008-02-26 12:50:39

by Alan

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

> The general idea: A daemon running in user space monitors input data
> from an accelerometer. When the daemon detects a critical condition,

That sounds like a non starter. What if the box is busy, what if the
daemon or something you touch needs memory and causes paging ?

Given the accelerometer data should be very simple doesn't it actually
make sense in this specific case to put the logic (not thresholds) in
kernel space.

> state. To this end, the kernel has to issue an idle immediate command
> with unload feature and stop the block layer queue afterwards. Once the

Yep. Pity the worst case completion time for an IDE I/O is 60 seconds or
so.

> 1. Who is to be in charge for the shock protection application? Should
> userspace speak to libata / ide directly (through sysfs) and the low

I think it has to be kernel side for speed, and because you will need to
issue idle immediate while a command sequence is active which is
*extremely* hairy as you have to recover from the mess and restart the
relevant I/O. Plus you may need controller specific knowledge on issuing
it (and changes to libata).

> 2. Depending on the answer to the previous question, by what mechanism
> should block layer and lld interact? Special requests, queue hooks or
> something in some way similar to power management functions (once
> suggested by James Bottomley)?

Idle immediate seem to simply fit the queue model, it happens in
*parallel* to I/O events and is special in all sorts of ways.

> 3. What is the preferred way to pass device specific configuration
> options to libata (preferrably at runtime, i.e., after module
> loading)?

sysfs

2008-02-26 20:48:20

by Willy Tarreau

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Hi Elias,

On Tue, Feb 26, 2008 at 12:56:31AM +0100, Elias Oltmanns wrote:

[ very interesting project ]

> Probably, the major problem is that I don't really know what kind of
> applications (apart from shock protection) I should be thinking of that
> might want to have a queue freezing facility at hand.

In terms of applications, depending on the sensitivity of the accelerometer,
we can imagine that a laptop would immediately force unmount crypted
filesystems if it believes it's being stolen, for instance. It's just a
random idea that comes to my mind, in the hope it may help you imagine
some crazy usages. But generally you should not fool your mind with too
many hypothetical cases, ideas will come once you provide a smart interface
and this interface will evolve with future needs.

Concerning your daemon, I think that every millisecond counts when a
laptop falls on the floor. So I think that running it in the kernel
should help you gain those precious milliseconds. I doubt your daemon
could trigger fast enough while X is starting or during some activities
which require a lot of CPU or uninterruptible I/O. If (I don't know)
the driver can be woken up by an interrupt from the controller, it
might react faster.

Good luck, and I sincerely wish you success on this project!
Willy

2008-02-28 08:26:59

by Elias Oltmanns

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Alan Cox <[email protected]> wrote:
>> The general idea: A daemon running in user space monitors input data
>> from an accelerometer. When the daemon detects a critical condition,
>
> That sounds like a non starter. What if the box is busy, what if the
> daemon or something you touch needs memory and causes paging ?

The daemon runs mlock'd anyway, so there won't be any need for paging
there. As for responsiveness under heavy load, I'm not quite sure I get
your meaning. On my system, at least, the only way I have managed to
decrease responsiveness noticeably is to cause a lot of I/O operations
on my disk. But even then it's not the overall responsiveness that gets
hurt but just any action that requires further I/O. Since the daemon
stays in memory all the time, it can go ahead and notify the kernel that
the disk heads should be unloaded. The kernel takes care to insert the
idle immediate command at the head of the queue. Am I missing something?

>
> Given the accelerometer data should be very simple doesn't it actually
> make sense in this specific case to put the logic (not thresholds) in
> kernel space.

The simplicity of the input data doesn't necessarily imply that the
evaluation logic is simple as well; but then the daemon is rather simple
in this case. Still, probably due to my lack of experience I don't quite
see what can be gained by putting it into kernel space which cannot be
achieved using the mlock feature or nice levels.

The important thing is this: There will be a dedicated code path for
disk head parking in the kernel. If the actual decision about when head
parking should take place is left to a daemon in user space, it is much
easier for the user to specify which devices should be protected and
which input data the decision should be based upon in case the system
happens to have access to more than one accelerometer. Right now, I don't
feel quite up to the job to write a dedicated kernel module that
replaces the daemon and is designed in a sufficiently generic way to
cope with all sorts of weird system configurations. Since I wouldn't
even know where to start, someone would have to point me in the right
direction first and probably have a lot of patience with me and my
questions in the process.

>
>> state. To this end, the kernel has to issue an idle immediate command
>> with unload feature and stop the block layer queue afterwards. Once the
>
> Yep. Pity the worst case completion time for an IDE I/O is 60 seconds or
> so.

Well, the low level driver would have to make sure that no requests are
accepted after the idle immediate command has been received. The block
layer queue is stopped later merely to stop the request_fn() to be
called for the time that lld won't accept any requests anyway. See
further comments below.

>
>> 1. Who is to be in charge for the shock protection application? Should
>> userspace speak to libata / ide directly (through sysfs) and the low
>
> I think it has to be kernel side for speed, and because you will need to
> issue idle immediate while a command sequence is active which is
> *extremely* hairy as you have to recover from the mess and restart the
> relevant I/O. Plus you may need controller specific knowledge on issuing
> it (and changes to libata).

As indicated above, I'd appreciate it if you could explain in a bit more
detail why it is not enough to let the kernel take care of just the
actual disk parking. It really is perfectly possible that I miss
something obvious here, so please bare with me.

Let me also make quite clear what exactly I intend to keep in kernel
space and what the daemon is supposed to be doing. When the daemon
decides that we had better stop all I/O to the disk, it writes an
integer to a sysfs attribute specifying the number of seconds it expects
the disk to be kept in the safe mode for. From there on everything is
going to be handled in kernel space, i.e., issuing idle immediate while
making sure that no other command gets issued to the hardware after that
and freezing the block layer queue eventually in order to stop
the request_fn() from being called needlessly. Once the specified time
is up or if the daemon writes 0 to that sysfs attrribute before that
time, it is kernel space code again that takes care that normal
operation is resumed.

>
>> 2. Depending on the answer to the previous question, by what mechanism
>> should block layer and lld interact? Special requests, queue hooks or
>> something in some way similar to power management functions (once
>> suggested by James Bottomley)?
>
> Idle immediate seem to simply fit the queue model, it happens in
> *parallel* to I/O events and is special in all sorts of ways.

Well, this is something we'll have to discuss too since I don't have the
SATA specs and haven't a clue as to how idle immediate behaves in an NCQ
enabled system. However, my question was about something more basic than
that, namely, what should be handled by the block layer and what by the
libata / ide subsystem and how they should interact with each other.
But never mind that now because I have had some ideas since and will
come up with a patch series once the other issues have been settled, so
we can have a more hands on discussion about this particular problem
then.

>
>> 3. What is the preferred way to pass device specific configuration
>> options to libata (preferrably at runtime, i.e., after module
>> loading)?
>
> sysfs

Yes, I thought as much. I just haven't quite worked out yet where or how
I am supposed to introduce libata specific sysfs attributes since this
seems to be left to the scsi midlayer so far.

Regards,

Elias

2008-02-28 10:12:19

by Elias Oltmanns

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Willy Tarreau <[email protected]> wrote:
> Hi Elias,

Hi Willy,

>
> On Tue, Feb 26, 2008 at 12:56:31AM +0100, Elias Oltmanns wrote:
>
> [ very interesting project ]
>
>> Probably, the major problem is that I don't really know what kind of
>> applications (apart from shock protection) I should be thinking of that
>> might want to have a queue freezing facility at hand.
>
> In terms of applications, depending on the sensitivity of the accelerometer,
> we can imagine that a laptop would immediately force unmount crypted
> filesystems if it believes it's being stolen, for instance. It's just a
> random idea that comes to my mind, in the hope it may help you imagine
> some crazy usages.

Well, this application would use the same input data (acceleromtere) but
it would certainly not require a generic queue freezing facility.

> But generally you should not fool your mind with too many hypothetical
> cases, ideas will come once you provide a smart interface and this
> interface will evolve with future needs.
>
> Concerning your daemon, I think that every millisecond counts when a
> laptop falls on the floor. So I think that running it in the kernel
> should help you gain those precious milliseconds.

The idle immediate command itself may need up to 300 milliseconds to
complete according to the ATA standard. This seems like a very long time
compared to CPU standards, i.e., the time usually needed to serve a
lightweight daemon.

> I doubt your daemon could trigger fast enough while X is starting or
> during some activities which require a lot of CPU or uninterruptible
> I/O.

On my system the daemon's response *feels* just fine even while
compiling a kernel; I haven't done any measurements or benchmarks
though. The thing I'm most concerned about is uninterruptible I/O but
I'm not quite sure whether and how this can be addressed in kernel
space.

Regards,

Elias

2008-02-28 11:25:30

by Alan

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

> > That sounds like a non starter. What if the box is busy, what if the
> > daemon or something you touch needs memory and causes paging ?
>
> The daemon runs mlock'd anyway, so there won't be any need for paging

mlock does not guarantee anything of that form. A syscall by an mlocked
process which causes a memory allocation can cause paging of another
process on the system.

> there. As for responsiveness under heavy load, I'm not quite sure I get
> your meaning. On my system, at least, the only way I have managed to
> decrease responsiveness noticeably is to cause a lot of I/O operations

It depends a lot on hardware but you can certainly get user space delays
in seconds as an extreme worst case.

> stays in memory all the time, it can go ahead and notify the kernel that
> the disk heads should be unloaded. The kernel takes care to insert the
> idle immediate command at the head of the queue. Am I missing something?

Yes - the fact we may well have bounced off the floor already.

> happens to have access to more than one accelerometer. Right now, I don't
> feel quite up to the job to write a dedicated kernel module that
> replaces the daemon and is designed in a sufficiently generic way to

Thats fine - nothing says a user space daemon isn't a good starting point.

> > Yep. Pity the worst case completion time for an IDE I/O is 60 seconds or
> > so.
>
> Well, the low level driver would have to make sure that no requests are
> accepted after the idle immediate command has been received. The block

No doesn't work like that. The command currently being processed on IDE
can take up to 60 seconds to complete. Idle immediate (on the devices it
works for - it hangs some) is very special in that it can be used in some
cases to interrupt a running command sequence. It requires a significant
amount of work in the driver layer to then clean up and requeue the
partial command and to know if it is possible to do so.

> and freezing the block layer queue eventually in order to stop
> the request_fn() from being called needlessly. Once the specified time
> is up or if the daemon writes 0 to that sysfs attrribute before that
> time, it is kernel space code again that takes care that normal
> operation is resumed.

I think we have three things here

1. A general queue freeze scheme from user space
2. A general implementation of a queue freeze that stops further
command issuing while the queue is blocked
3. The ability for devices to provide a function to be called
when a queue freeze is done (ie idle immediate and the like)

The fine details of how you abort an ATA command don't actually matter
for an initial implementation and can be written once the core stuff is
right.

> Well, this is something we'll have to discuss too since I don't have the
> SATA specs and haven't a clue as to how idle immediate behaves in an NCQ
> enabled system. However, my question was about something more basic than

I have the specs, and I don't understand it or even if it is valid to do
so. Some research (as in trying it and seeing) may be needed.

> Yes, I thought as much. I just haven't quite worked out yet where or how
> I am supposed to introduce libata specific sysfs attributes since this
> seems to be left to the scsi midlayer so far.

The scsi midlayer is the main manager of queues so that seems sane - and
if you've got the basic queue freeze logic right then one assumes it will
work for scsi too.

2008-02-28 17:00:34

by Greg Freemyer

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

On Thu, Feb 28, 2008 at 6:13 AM, Alan Cox <[email protected]> wrote:
> > > That sounds like a non starter. What if the box is busy, what if the
> > > daemon or something you touch needs memory and causes paging ?
> >
> > The daemon runs mlock'd anyway, so there won't be any need for paging
>
> mlock does not guarantee anything of that form. A syscall by an mlocked
> process which causes a memory allocation can cause paging of another
> process on the system.
>
>
> > there. As for responsiveness under heavy load, I'm not quite sure I get
> > your meaning. On my system, at least, the only way I have managed to
> > decrease responsiveness noticeably is to cause a lot of I/O operations
>
> It depends a lot on hardware but you can certainly get user space delays
> in seconds as an extreme worst case.

I don't know the details, but I believe the Linux-HA heartbeat daemons
take significant effort to eliminate unexpected delays. See
http://www.linux-ha.org/

Lars Marowsky-Bree of Novell is extremely involved in the project and
he at least occasionally posts on LKML. I've cc'ed him.

Greg
--
Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com

2008-03-07 18:21:30

by Elias Oltmanns

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Alan Cox <[email protected]> wrote:
[...]
>> there. As for responsiveness under heavy load, I'm not quite sure I get
>> your meaning. On my system, at least, the only way I have managed to
>> decrease responsiveness noticeably is to cause a lot of I/O operations
>
> It depends a lot on hardware but you can certainly get user space delays
> in seconds as an extreme worst case.
>
>> stays in memory all the time, it can go ahead and notify the kernel that
>> the disk heads should be unloaded. The kernel takes care to insert the
>> idle immediate command at the head of the queue. Am I missing something?
>
> Yes - the fact we may well have bounced off the floor already.

Well, with or without shock protection it can't get any worse by then,
can it? But in all those cases where the system manages to get the heads
of the platter in time, the owner may be greatful for this feature.

>
>> happens to have access to more than one accelerometer. Right now, I don't
>> feel quite up to the job to write a dedicated kernel module that
>> replaces the daemon and is designed in a sufficiently generic way to
>
> Thats fine - nothing says a user space daemon isn't a good starting point.

A starting point it is then.

>
>> > Yep. Pity the worst case completion time for an IDE I/O is 60 seconds or
>> > so.
>>
>> Well, the low level driver would have to make sure that no requests are
>> accepted after the idle immediate command has been received. The block
>
> No doesn't work like that. The command currently being processed on IDE
> can take up to 60 seconds to complete. Idle immediate (on the devices it
> works for - it hangs some) is very special in that it can be used in some
> cases to interrupt a running command sequence. It requires a significant
> amount of work in the driver layer to then clean up and requeue the
> partial command and to know if it is possible to do so.

This business of aborting commands is exactly what I haven't a clue
about. At first I thought I could do something similar to
ata_do_link_abort but I obviously want to avoid the need for a soft
reset before issuing idle immediate. How am I to go about it?

>
>> and freezing the block layer queue eventually in order to stop
>> the request_fn() from being called needlessly. Once the specified time
>> is up or if the daemon writes 0 to that sysfs attrribute before that
>> time, it is kernel space code again that takes care that normal
>> operation is resumed.
>
> I think we have three things here
>
> 1. A general queue freeze scheme from user space
> 2. A general implementation of a queue freeze that stops further
> command issuing while the queue is blocked
> 3. The ability for devices to provide a function to be called
> when a queue freeze is done (ie idle immediate and the like)
>
> The fine details of how you abort an ATA command don't actually matter
> for an initial implementation and can be written once the core stuff is
> right.
>
[...]
>> Yes, I thought as much. I just haven't quite worked out yet where or how
>> I am supposed to introduce libata specific sysfs attributes since this
>> seems to be left to the scsi midlayer so far.
>
> The scsi midlayer is the main manager of queues so that seems sane - and
> if you've got the basic queue freeze logic right then one assumes it will
> work for scsi too.

Basic queue freezing certainly will. But we'll need attributes specific
to ATA so the user can determine whether
1. idle immediate should be issued (if supported) on queue freeze
events;
2. idle immediate is supported on this particular device even though
dev->id doesn't say so;
3. idle immediate is malfunctioning and should be avoided even though
dev->id reports support for that feature;
4. (perhaps we should drop that): use standby immediate if idle
immediate isn't supported for some reason.

I'm going to send a first draft of a patch series in reply to this
email. It is a stripped down version intended to get the general idea
across. The first of these four patches will eventually need to be
modified to actually abort in flight commands and clear up the mess
afterwards. However, first and foremost I'd like to draw your attention
to the use of REQ_TYPE_LINUX_BLOCK requests as demonstrated in the other
three patches. The question is whether the underlying concept is right.
Although the question how to handle REQ_TYPE_LINUX_BLOCK requests in the
scsi subsystem has been raised on the linux-scsi ml, it has never been
answered really because this request type was deemed unsuitable for the
application in question. See, for instance, the thread starting at [1].
My patch approach has been partly inspired by the patch discussed there.
Before I raise this issue yet again, I'd like to know whether
REQ_TYPE_LINUX_BLOCK is the right choice for my application in your
opinion or whether another mechanism might be more suitable as James
suggested a while ago (see [2]).

Regards,

Elias

[1] http://permalink.gmane.org/gmane.linux.scsi/30049
[2] http://permalink.gmane.org/gmane.linux.scsi/37951

2008-03-07 22:56:33

by Alan

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

> This business of aborting commands is exactly what I haven't a clue
> about. At first I thought I could do something similar to
> ata_do_link_abort but I obviously want to avoid the need for a soft
> reset before issuing idle immediate. How am I to go about it?

See the ATA 7 specification, and then stare at libata, and then stress a
lot and realise it is going to be very hard.

Seriously - get the queue freezing stuff working first and then I'm sure
one of us libata folk who actually have the misfortune to read these
specs regularly will add in the details like that.

2008-03-13 14:53:36

by Elias Oltmanns

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Elias Oltmanns <[email protected]> wrote:
[...]
> I'm going to send a first draft of a patch series in reply to this
> email. It is a stripped down version intended to get the general idea
> across.

Have you had got round to looking at these patches yet?

> The first of these four patches will eventually need to be modified to
> actually abort in flight commands and clear up the mess afterwards.
> However, first and foremost I'd like to draw your attention to the use
> of REQ_TYPE_LINUX_BLOCK requests as demonstrated in the other three
> patches. The question is whether the underlying concept is right.
> Although the question how to handle REQ_TYPE_LINUX_BLOCK requests in
> the scsi subsystem has been raised on the linux-scsi ml, it has never
> been answered really because this request type was deemed unsuitable
> for the application in question. See, for instance, the thread
> starting at [1]. My patch approach has been partly inspired by the
> patch discussed there. Before I raise this issue yet again, I'd like
> to know whether REQ_TYPE_LINUX_BLOCK is the right choice for my
> application in your opinion or whether another mechanism might be more
> suitable as James suggested a while ago (see [2]).
>
> Regards,
>
> Elias
>
> [1] http://permalink.gmane.org/gmane.linux.scsi/30049
> [2] http://permalink.gmane.org/gmane.linux.scsi/37951

Sorry, I got these two the wrong way round. [1] should be [2] and vice
versa.

Regards,

Elias

2008-03-15 14:53:54

by Alan

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

On Thu, 13 Mar 2008 15:51:59 +0100
Elias Oltmanns <[email protected]> wrote:

> Elias Oltmanns <[email protected]> wrote:
> [...]
> > I'm going to send a first draft of a patch series in reply to this
> > email. It is a stripped down version intended to get the general idea
> > across.
>
> Have you had got round to looking at these patches yet?

I've spent almost all of the last two weeks at or travelling to/from
conferences so no.

Alan

2008-03-02 10:15:27

by Pavel Machek

[permalink] [raw]

Subject: Re: [RFC] Disk shock protection (revisited)

Hi!

> > > That sounds like a non starter. What if the box is busy, what if the
> > > daemon or something you touch needs memory and causes paging ?
> >
> > The daemon runs mlock'd anyway, so there won't be any need for paging
>
> mlock does not guarantee anything of that form. A syscall by an mlocked
> process which causes a memory allocation can cause paging of another
> process on the system.

Well... but you can be careful about the syscalls, right?

Anyway, active protection is 'best effort' anyway. There's not enough
time to park heads if you drop the machine without tilting it first...
and we have been running with no protection for years now...

> > stays in memory all the time, it can go ahead and notify the kernel that
> > the disk heads should be unloaded. The kernel takes care to insert the
> > idle immediate command at the head of the queue. Am I missing something?
>
> Yes - the fact we may well have bounced off the floor already.

Well, shit happens. Even notebook with parked harddrive is not
guaranteed to survive the fall.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html