Subject: Re: IDE Hotswap


On Tuesday 28 September 2004 20:49, Suresh Grandhi wrote:
> Hi,
> My hardware has IDE hotswap support. I would like to implement a hotswap
> driver on the 2.6.8 kernel.
> Right now I am able to register and unregister the IDE devices successfully
> if they are not part of any raid or LVM.
> If an I/O operation is in progress on the disk, the IDE unregister fails
> and generates a lot of I/O error messages.
> In this case, I want to implement a workaround.
> Instead of unregistering the IDE device, is there a way to re-register the
> IDE device with the already existing ports and irq when the disk is
> inserted?

No, and such a workaround won't work anyway, because a
re-register operation is nothing but unregister+register.

Hotswap is currently unsupported. Some TODO items:
- convert IDE driver to use dynamic objects instead of static ones
- add needed refcounting and locking
- convert IDE driver to new driver-model
- add full sysfs support
- add hotswap

Any help/support is appreciated.

Regards,
Bartlomiej


2004-09-29 00:48:20

by Alan

Subject: Re: IDE Hotswap

On Maw, 2004-09-28 at 22:38, Bartlomiej Zolnierkiewicz wrote:
> No and such workaround won't work anyway because
> re-register operation is nothing else but unregister+register.

If you grab the 2.6.8.1-ac patch you can do IDE controller hotplugging
and a few other things but not yet drive hotplugging. 2.4 can do drive
hotplug although you need a small -ac patch if you see wrong disk
geometry data.

For new controllers (ie SATA ones) use Jeff Garzik's serial ATA layer as
that is a lot cleaner and the SCSI layer already has a good basic
understanding of hotplug management.

> Any help/support is appreciated.

Except for the dynamic stuff I consider the problem solved. It's up to
you when and what you merge and I understand why you want to get stuff
like sysfs there.

For drive level hotplug it's actually a lot easier, and I guess that is
the case most users care about. The changes done for 2.6 to clean up stuff
like suspend mean the nasties in 2.4 for sequencing have gone away. No
refcounting is needed since the block and fs layers are doing it all for
you. The TTY layer, revoke(), and some other currently critical bonfires
come first before I can help with that, however.

Alan

Subject: Re: IDE Hotswap

On Tuesday 28 September 2004 23:45, Alan Cox wrote:
> On Maw, 2004-09-28 at 22:38, Bartlomiej Zolnierkiewicz wrote:
> > No and such workaround won't work anyway because
> > re-register operation is nothing else but unregister+register.
>
> If you grab the 2.6.8.1-ac patch you can do IDE controller hotplugging
> and a few other things but not yet drive hotplugging. 2.4 can do drive
> hotplug although you need a small -ac patch if you see wrong disk
> geometry data.
>
> For new controllers (ie SATA ones) use Jeff Garzik's serial ATA layer as
> that is a lot cleaner and the SCSI layer already has a good basic
> understanding of hotplug management.
>
> > Any help/support is appreciated.
>
> Except for the dynamic stuff I consider the problem solved. Its up to
> you when and what you merge and I understand why you want to get stuff
> like sysfs there.

Your patch is a nice start but it doesn't solve the main issues, not to
mention minor stuff like leaving /proc/ide/<chipset> around.

Merging it now is asking for problems.

> For drive level hotplug its actually a lot easier and I guess that is
> the case most users care about. The changes done for 2.6 clean up stuff

Drive level hotplug is actually much harder,
and it is _required_ for controller level hotplug, no? :)

> like suspend mean the nasties in 2.4 for sequencing have gone away. No
> refcounting needed since the block and fs layer are doing it all for

It helps but you still get a bunch of races. Refcounting is _really_ needed.

> you. TTY layer, revoke(), and some other current critical bonfires first
> before I can help with that however.

Fine.

Bartlomiej

2004-09-29 02:53:16

by Alan

Subject: Re: IDE Hotswap

On Mer, 2004-09-29 at 02:54, Bartlomiej Zolnierkiewicz wrote:
> Your patch is a nice start but it doesn't solve the main issues, not to
> mention minor stuff like leaving /proc/ide/<chipset> around.
> Merging it now is asking for problems.

Oh I agree. There is a patch but it isn't the final answer. There is a
small resource bug (harmless but a bug) in the 2.6.8.1-ac patch. I
hadn't noticed proc/ide/<chipset> leaking but I'll take a look when I
get time to sort that out.

> > For drive level hotplug its actually a lot easier and I guess that is
> > the case most users care about. The changes done for 2.6 clean up stuff
>
> drive level hotplug is actually much harder
> and it is _required_ for controller level hotplug, no? :)

Actually it's all a lot easier now than in 2.4 because the block layer is
designed to make it possible.

> > like suspend mean the nasties in 2.4 for sequencing have gone away. No
> > refcounting needed since the block and fs layer are doing it all for
>
> It helps but you still get bunch of races. Refcounting is _really_ needed.

Even in 2.4 ide drive hotplug was easy. The drive hotplug comes out
trivially because your controllers are fairly constant. As we all know
driver level hotplug is a bit trickier although the block layer has
really made this vastly easier in 2.6.

For drive level hotplug you don't actually need refcounting at all
providing you've got a couple of locking issues dealt with.

Firstly the drive never goes away as a high level object (in fact you
don't want it to as then you can't ioctl it to make it come back!). That
means the upper layers don't know anything about it.

At the IDE layer the 2.4 code simply enforced the rule that you must be
the only opener of the device in order to hot unplug it. That means we
know it's quiescent and not mounted. The only 2.4 race I know about is
suspend in parallel to hot unplug, and 2.6 has the mechanism to fix that
properly because suspend is a command state machine.
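
As a rough userspace illustration of that sole-opener rule (a sketch under assumptions, not the 2.4 code): `toy_drive`, `toy_open`, `toy_release` and `toy_unplug` are hypothetical names, and a pthread mutex stands in for whatever lock the kernel side would take. Unplug is refused unless the caller holds the only open, and the object itself stays around with `present` cleared so it can still be opened later.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a drive object; not the kernel's ide_drive_t. */
struct toy_drive {
    pthread_mutex_t lock; /* guards usage and present together */
    int usage;            /* number of current openers */
    int present;          /* 1 while a disk is attached */
};

#define TOY_DRIVE_INIT { PTHREAD_MUTEX_INITIALIZER, 0, 1 }

/* Opening always succeeds: the object never goes away, so an absent
 * drive can still be opened in order to ask for it to come back. */
static void toy_open(struct toy_drive *d)
{
    pthread_mutex_lock(&d->lock);
    d->usage++;
    pthread_mutex_unlock(&d->lock);
}

static void toy_release(struct toy_drive *d)
{
    pthread_mutex_lock(&d->lock);
    assert(d->usage > 0);
    d->usage--;
    pthread_mutex_unlock(&d->lock);
}

/* Called by a process that already has the device open itself. */
static int toy_unplug(struct toy_drive *d)
{
    int ret = 0;

    pthread_mutex_lock(&d->lock);
    if (d->usage != 1)
        ret = -EBUSY;   /* someone else has it open, or it is mounted */
    else if (!d->present)
        ret = -ENODEV;  /* nothing attached to unplug */
    else
        d->present = 0; /* object stays; only the media side goes away */
    pthread_mutex_unlock(&d->lock);
    return ret;
}
```

The check and the state change happen under one lock, which is the part the rule depends on; testing `usage` without holding the lock would reopen the window between the test and the unplug.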

Providing hot unplug is about making drive->present stuff vary and
flipping to ide_default drivers the world is happy. The moment you want
to make /dev/hda 'disappear' to the block layer it's fun, and on 2.4 it's
as good as impossible.

Alan

Subject: Re: IDE Hotswap

On Wednesday 29 September 2004 03:50, Alan Cox wrote:
> On Mer, 2004-09-29 at 02:54, Bartlomiej Zolnierkiewicz wrote:
> > Your patch is a nice start but it doesn't solve the main issues, not to
> > mention minor stuff like leaving /proc/ide/<chipset> around.
> > Merging it now is asking for problems.
>
> Oh I agree. There is a patch but it isn't the final answer. There is a
> small resource bug (harmless but a bug) in the 2.6.8.1-ac patch. I
> hadn't noticed proc/ide/<chipset> leaking but I'll take a look when I
> get time to sort that out.

/proc/ide/<chipset> is the smallest problem.

Don't waste your time on it, I've almost killed this bloat in my tree.

> > > like suspend mean the nasties in 2.4 for sequencing have gone away. No
> > > refcounting needed since the block and fs layer are doing it all for
> >
> > It helps but you still get bunch of races. Refcounting is _really_ needed.
>
> Even in 2.4 ide drive hotplug was easy. The drive hotplug comes out
> trivially because your controllers are fairly constant. As we all know
> driver level hotplug is a bit trickier although the block layer has
> really made this vastly easier in 2.6
>
> For drive level hotplug you don't actually need refcounting at all
> providing you've got a couple of locking issues dealt with.

These issues can't be solved without refcounting.

Feel free to prove me wrong, you can start with fixing the
->open vs unregister race (drive->usage involved). :)

> Firstly the drive never goes away as a high level object (in fact you
> don't want it to as then you can't ioctl it to make it come back!). That
> means the upper layers don't know anything about it.

ioctls on non-present devices are a layering VIOLATION

> At the IDE layer the 2.4 code simply enforced the rule that you must be
> the only opener of the device in order to hot unplug it. That means we

"enforced" - there are a couple of races, sorry but ROTFL

> know its quiescent and not mounted. The only 2.4 race I know about is

- double unlock (obvious mistake)
- ->open() vs unregister
- /proc races (the same you fixed in your 2.6 patch)
- ioctl races

I'm sure there is more.

> suspend in parallel to hot unplug, and 2.6 has the mechanism to fix that
> properly because suspend is a command state machine.
>
> Providing hot unplug is about making drive->present stuff vary and
> flipping to ide_default drivers the world is happy. The moment you want
> to make /dev/hda 'disappear' to the block layer its fun, and on 2.4 its
> as good as impossible.

The gendisk layer and block layer force you to make /dev/hda disappear.
Does sysfs ring any bells? Ask viro about static objects vs sysfs.
And yes, not only gendisk and block enforce this: Patrick added basic,
premature sysfs support to the IDE driver in the middle of the 2.5 series.

We can get back to the discussion when you get familiar with the issues involved.

Bartlomiej

2004-09-29 15:43:49

by Alan

Subject: Re: IDE Hotswap

On Mer, 2004-09-29 at 13:08, Bartlomiej Zolnierkiewicz wrote:
> > Even in 2.4 ide drive hotplug was easy. The drive hotplug comes out
> > trivially because your controllers are fairly constant. As we all know
> > driver level hotplug is a bit trickier although the block layer has
> > really made this vastly easier in 2.6
> >
> > For drive level hotplug you don't actually need refcounting at all
> > providing you've got a couple of locking issues dealt with.
>
> These issues can't be solved without refcounting.

So you keep saying, but you refcount objects that are going away, you
don't need to refcount objects that are staying put.

> Feel free to prove me wrong, you can start with fixing
> ->open vs unregister race (drive->usage involved). :)

Doesn't occur in the 2.4 situation or the 2.6 stuff with the locking in
the 2.6.8.1-ac patch.

> > Firstly the drive never goes away as a high level object (in fact you
> > don't want it to as then you can't ioctl it to make it come back!). That
> > means the upper layers don't know anything about it.
>
> ioctls on not present devices are layering VIOLATION

Oh dear then I guess most of Linux is misdesigned. You aren't thinking
about the semantics at all.

If /dev/hda is a CD-ROM drive I can issue commands to it with no CD
present. That's not a layering violation, and it's how the IDE code works.
So what's the difference between hotplugging a drive and removing
a CD? Both are removing the media but leaving the controller behind.

On that item I think you are talking out of your backside.

> > At the IDE layer the 2.4 code simply enforced the rule that you must be
> > the only opener of the device in order to hot unplug it. That means we
>
> "enforced" - there are a couple of races, sorry but ROTFL
>
> > know its quiescent and not mounted. The only 2.4 race I know about is
>
> - double unlock obvious mistake
Details ?
> - ->open() vs unregister
unregister is controller hotplug, not drive, and that's unfixable in 2.4
> - /proc races (the same you fixed in your 2.6 patch)
yeah that lot postdates the 2.4 work. hotplug drives is not the cause
however.
> - ioctl races
Details ?

> gendisk layer and block layer enforces you to make /dev/hda disappear.

No it does not. The block layer couldn't give a flying **** whether
/dev/hda disappears or not. SCSI devices that are offlined don't need to
disappear either; you just hand back commands with an error. You know -
like every CD-ROM does...
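
A minimal sketch of that "stay registered, fail the command" behaviour, in plain userspace C. `toy_disk`, `toy_read` and the choice of `-ENODEV` are assumptions made for illustration, not what the block or SCSI layer actually returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_SECTOR 512

/* Hypothetical device object that outlives the hardware behind it. */
struct toy_disk {
    int online;                          /* 0 once the drive is pulled */
    unsigned char data[4 * TOY_SECTOR];  /* pretend media */
};

/* Read one sector. The object is always reachable; an offlined device
 * simply answers every command with an error instead of vanishing. */
static int toy_read(const struct toy_disk *d, size_t sector,
                    unsigned char *buf)
{
    assert(d != NULL && buf != NULL);
    if (!d->online)
        return -ENODEV;  /* assumed error code for "nothing attached" */
    if (sector >= sizeof d->data / TOY_SECTOR)
        return -EINVAL;  /* request beyond the end of the media */
    memcpy(buf, d->data + sector * TOY_SECTOR, TOY_SECTOR);
    return 0;
}
```

Callers see the same node before and after the unplug; only the result of each request changes.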

> Does sysfs ring any bells? Ask viro about static objects vs sysfs.
> And yes not only gendisk and block enforces this, Patrick added basic,
> premature sysfs support to IDE driver in the middle of 2.5 series.
>
> We can get back to discussion when you get familiar with issues involved.

Ditto...

Alan

Subject: Re: IDE Hotswap

On Wednesday 29 September 2004 16:35, Alan Cox wrote:
> On Mer, 2004-09-29 at 13:08, Bartlomiej Zolnierkiewicz wrote:
> > > Even in 2.4 ide drive hotplug was easy. The drive hotplug comes out
> > > trivially because your controllers are fairly constant. As we all know
> > > driver level hotplug is a bit trickier although the block layer has
> > > really made this vastly easier in 2.6
> > >
> > > For drive level hotplug you don't actually need refcounting at all
> > > providing you've got a couple of locking issues dealt with.
> >
> > These issues can't be solved without refcounting.
>
> So you keep saying, but you refcount objects that are going away, you
> don't need to refcount objects that are staying put.
>
> > Feel free to prove me wrong, you can start with fixing
> > ->open vs unregister race (drive->usage involved). :)
>
> Doesn't occur in the 2.4 situation or the 2.6 stuff with the locking in
> the 2.6.8.1-ac patch.

I will verify this in a few days, I have some real work to do first.

> > > Firstly the drive never goes away as a high level object (in fact you
> > > don't want it to as then you can't ioctl it to make it come back!). That
> > > means the upper layers don't know anything about it.
> >
> > ioctls on not present devices are layering VIOLATION
>
> Oh dear then I guess most of Linux is misdesigned. You aren't thinking
> about the semantics at all.
>
> If /dev/hda is a CD-ROM drive I can issue commands to it with no CD
> present. Thats not a layering violation, and its how the IDE code works.
> So whats the difference between hotplugging a drive and removing
> a CD. Both are removing the media but leaving the controller behind.

Removing a CD leaves you with a drive between controller and media.

1st case (CD removed): ioctl -> IDE driver -> drive
(lack of refcounting worked around by ide-default)

2nd case (drive removed): ioctl -> IDE driver
(lack of refcounting)

> On that item I think you are talking out of your backside.
>
> > > At the IDE layer the 2.4 code simply enforced the rule that you must be
> > > the only opener of the device in order to hot unplug it. That means we
> >
> > "enforced" - there are a couple of races, sorry but ROTFL
> >
> > > know its quiescent and not mounted. The only 2.4 race I know about is
> >
> > - double unlock obvious mistake
> Details ?

2003/08/16 alan | /* Drive shutdown sequence done */
2003/08/16 alan | /* Prevent new opens ?? */
2003/08/16 alan | spin_unlock_irqrestore(&io_request_lock, flags);
2003/08/16 alan | /*
2003/08/16 alan | * Flush kernel side caches, and dump the /proc files
2003/08/16 alan | */
2003/08/16 alan | spin_unlock_irqrestore(&io_request_lock, flags);
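
To make the shape of that excerpt concrete, here is a small userspace sketch with a toy lock that remembers whether it is held, so the second release is reported instead of silently corrupting state. `toy_lock` and the `toy_shutdown_*` functions are hypothetical; which of the two unlocks belongs in the real code is not something the excerpt settles.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lock that tracks its own state; not a real spinlock. */
struct toy_lock {
    bool held;
};

static int toy_lock_acquire(struct toy_lock *l)
{
    assert(l != NULL);
    if (l->held)
        return -EDEADLK;
    l->held = true;
    return 0;
}

static int toy_lock_release(struct toy_lock *l)
{
    assert(l != NULL);
    if (!l->held)
        return -EPERM;  /* releasing a lock that is not held */
    l->held = false;
    return 0;
}

/* Same shape as the excerpt: one acquire, two releases. */
static int toy_shutdown_buggy(struct toy_lock *l)
{
    int err = toy_lock_acquire(l);
    if (err)
        return err;
    /* drive shutdown sequence done */
    err = toy_lock_release(l);
    if (err)
        return err;
    /* flush caches, remove /proc files */
    return toy_lock_release(l);  /* second release of the same lock */
}

/* Balanced variant: every acquire has exactly one release. */
static int toy_shutdown_fixed(struct toy_lock *l)
{
    int err = toy_lock_acquire(l);
    if (err)
        return err;
    /* drive shutdown sequence done */
    err = toy_lock_release(l);
    /* flush caches, remove /proc files (runs unlocked) */
    return err;
}
```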

> > - ->open() vs unregister
> unregister is hot plug controller not drive and thats unfixable in 2.4

Doesn't matter. The race is the same AFAICS: drive->usage
access is not protected by any lock.
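
For illustration only, one common way to close an open-vs-unregister window is a reference count that refuses new references once it has dropped to zero. The sketch below is userspace C11 with hypothetical names (`toy_obj`, `toy_get`, `toy_put`); it is not IDE code, and the `freed` flag merely stands in for actually freeing the object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_obj {
    atomic_int refs;  /* starts at 1: the registration's own reference */
    bool freed;       /* set by the last put; stands in for freeing */
};

static void toy_init(struct toy_obj *o)
{
    atomic_init(&o->refs, 1);
    o->freed = false;
}

/* open(): take a reference only if the object is still live. */
static bool toy_get(struct toy_obj *o)
{
    int old = atomic_load(&o->refs);

    while (old > 0) {
        /* On failure the CAS reloads 'old', and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
            return true;
    }
    return false;  /* count already hit zero: too late to open */
}

/* close(), and also unregister, which drops the initial reference. */
static void toy_put(struct toy_obj *o)
{
    int prev = atomic_fetch_sub(&o->refs, 1);

    assert(prev > 0);
    if (prev == 1)
        o->freed = true;  /* last reference gone */
}
```

With this shape an open that wins the race keeps the object alive until its own `toy_put`, and an open that loses it fails cleanly. In real code the object could not be touched at all after the final put, so the lookup that leads to `toy_get` would itself need protection.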

> > - /proc races (the same you fixed in your 2.6 patch)
> yeah that lot postdates the 2.4 work. hotplug drives is not the cause
> however.
> > - ioctl races
> Details ?

OK, the BKL protects us against e.g. concurrent HDIO_GETGEO
and hotplug ioctls. There is, however, no protection for controller
hotplug.

> > gendisk layer and block layer enforces you to make /dev/hda disappear.
>
> No it does not. The block layer couldn't give a flying **** whether
> /dev/hda disappears or not. SCSI devices that are offlined don't need to
> disappear either you just hand back commands with an error. You know -
> like every CD-ROM does...
>
> > Does sysfs ring any bells? Ask viro about static objects vs sysfs.
> > And yes not only gendisk and block enforces this, Patrick added basic,
> > premature sysfs support to IDE driver in the middle of 2.5 series.
> >
> > We can get back to discussion when you get familiar with issues involved.
>
> Ditto...
>
> Alan

2004-09-29 18:52:29

by Alan

Subject: Re: IDE Hotswap

On Mer, 2004-09-29 at 19:20, Bartlomiej Zolnierkiewicz wrote:
> > Doesn't occur in the 2.4 situation or the 2.6 stuff with the locking in
> > the 2.6.8.1-ac patch.
>
> I will verify this in a few days, I have some real work to do first.

2.6 is the important stuff sure.

> > > - double unlock obvious mistake
> > Details ?
>
> 2003/08/16 alan | /* Drive shutdown sequence done */
> 2003/08/16 alan | /* Prevent new opens ?? */
> 2003/08/16 alan | spin_unlock_irqrestore(&io_request_lock, flags);
> 2003/08/16 alan | /*
> 2003/08/16 alan | * Flush kernel side caches, and dump the /proc files
> 2003/08/16 alan | */
> 2003/08/16 alan | spin_unlock_irqrestore(&io_request_lock, flags);
>

Thanks. I'll go over this when I dig out the one little bit 2.4.2x needs
for hotplug to behave right with hard disks that I should get to
Marcelo.

> OK BKL protects us against i.e. concurrent HDIO_GETGEO
> and hotplug ioctl. There is however no protection for controller
> hotplug.

Agreed.