udev names are created when the kernel detects the corresponding device.
Unfortunately, for removable media the kernel rescans for partitions only
when I try to access the device. That means the kernel does not know the
partition table, so it sends no hotplug event and udev creates no device
nodes. But without device nodes I have no way to access the device in Unix :(
Specifically, I now have my Jaz drive and no (reasonable) way to access
partition 4, assuming device nodes are managed by udev.
devfs solved this problem by
- always exporting at least handle to the whole disk (sda as example)
- using something simple like dd if=/dev/sda count=1 on lookup for
non-existing partition (/dev/sda4) that would rescan partitions and create
device nodes for them.
static /dev simply has all nodes available and does not suffer from this
problem at all.
Unfortunately there are no lookup events in the case of udev, meaning that
at the moment the user must manually rescan partitions after inserting new
media. I do not see any way to solve this problem given the current
implementation. The closest is to blindly create nodes for all partitions as
soon as the block device is available.
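A minimal sketch of the "blindly create all nodes" idea, in Python for
brevity. It assumes the conventional sd numbering (16 minors per disk, minor
0 being the whole disk, so sda is 8:0 and sda4 is 8:4). The helper names are
hypothetical and this is not how udev itself is written. Creating nodes needs
root, so the mknod step is kept apart from the pure numbering step.

```python
import os
import stat

MINORS_PER_DISK = 16  # assumption: sd-style layout, minor 0 is the whole disk


def partition_nodes(disk, major, first_minor, minors=MINORS_PER_DISK):
    """Return (name, major, minor) for every possible partition of a disk."""
    return [(f"{disk}{n}", major, first_minor + n) for n in range(1, minors)]


def create_nodes(devdir, nodes, mode=0o660):
    """Create block device nodes; needs root, existing nodes are left alone."""
    for name, major, minor in nodes:
        path = os.path.join(devdir, name)
        if not os.path.exists(path):
            os.mknod(path, mode | stat.S_IFBLK, os.makedev(major, minor))
```

For example, partition_nodes("sda", 8, 0) lists sda1 through sda15, which
create_nodes("/dev", ...) would then turn into nodes.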
-andrey
On Thu, Jan 01, 2004 at 11:33:04PM +0300, Andrey Borzenkov wrote:
> udev names are created when kernel detects corr. device. Unfortunately for
> removable media kernel rescans for partitions only when I try to access
> device. Meaning - because kernel does not know partition table it did not
> send hotplug event so udev did not create device nodes. But without device
> nodes I have no way to access device in Unix :(
>
> specifically I have now my Jaz and I have no (reasonable) way to access
> partition 4 assuming device nodes are managed by udev.
>
> devfs solved this problem by
>
> - always exporting at least handle to the whole disk (sda as example)
Doesn't the kernel always create the main block device for this device?
If so, udev will catch that. If not, there's no way udev will work for
this kind of device, sorry. You could make a script that just creates
the device node in /tmp, runs dd on it, and then cleans it all up to
force partition scanning.
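Roughly, such a script could look like the Python sketch below. The function
names are hypothetical, mknod needs root, and the open will fail if the
temporary directory sits on a filesystem mounted nodev. It only shows the
mechanics; it does not answer when the script should run.

```python
import os
import stat
import tempfile


def read_first_sector(path, size=512):
    """Read one sector, like `dd if=PATH count=1`; returns the bytes read."""
    with open(path, "rb") as dev:
        return dev.read(size)


def force_partition_scan(major, minor):
    """Create a throwaway node for the whole disk, read from it, clean up.

    Opening and reading the whole-disk device is what prompts the kernel
    to revalidate the media and rescan the partition table.
    """
    tmpdir = tempfile.mkdtemp(prefix="rescan-")
    node = os.path.join(tmpdir, "disk")
    try:
        os.mknod(node, 0o600 | stat.S_IFBLK, os.makedev(major, minor))
        return read_first_sector(node)
    finally:
        if os.path.exists(node):
            os.unlink(node)
        os.rmdir(tmpdir)
```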
thanks,
greg k-h
On Saturday 03 January 2004 08:58, Greg KH wrote:
> On Thu, Jan 01, 2004 at 11:33:04PM +0300, Andrey Borzenkov wrote:
> > udev names are created when kernel detects corr. device. Unfortunately
> > for removable media kernel rescans for partitions only when I try to
> > access device. Meaning - because kernel does not know partition table it
> > did not send hotplug event so udev did not create device nodes. But
> > without device nodes I have no way to access device in Unix :(
> >
> > specifically I have now my Jaz and I have no (reasonable) way to access
> > partition 4 assuming device nodes are managed by udev.
> >
> > devfs solved this problem by
> >
> > - always exporting at least handle to the whole disk (sda as example)
>
> Doesn't the kernel always create the main block device for this device?
yes
> If so, udev will catch that.
yes. So what - how does it help? User needs /dev/sda4. User has /dev/sda only.
Any attempt to refer to /dev/sda4 simply returns "No such file or directory"
> If not, there's no way udev will work for
> this kind of device, sorry.
this worked seamlessly using static /dev. This worked seamlessly using devfs.
If it won't work with udev - it means regression. And believe me - it is
serious regression for end-users (I still remember similar problems we had
when transitioning to devfs and users' reaction to this).
> You could make a script that just creates
> the device node in /tmp, runs dd on it, and then cleans it all up to
> force partition scanning.
>
You miss the point. When should this script be run? There is no event when you
just insert Jaz disk; nor is there any way to trigger revalidation on access
to non-existing device like is the case without udev.
what I aim at - udev needs to provide some extension mechanism to allow
arbitrary scripts to be run. Such a script could then create all block nodes
(hmm ... how can a script know the number of possible nodes and their names?)
in hope somebody gets an idea ...
regards
-andrey
On Sat, Jan 03, Andries Brouwer wrote:
> On Sat, Jan 03, 2004 at 11:51:33AM +0300, Andrey Borzenkov wrote:
>
> > yes. So what - how does it help? User needs /dev/sda4. User has /dev/sda only.
> > Any attempt to refer to /dev/sda4 simply returns "No such file or directory"
>
> Things are far from perfect here, but "blockdev --rereadpt /dev/sda" helps.
Is there really no way to get a media change notification from ZIP or
JAZ drives?
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
On Sat, Jan 03, 2004 at 11:51:33AM +0300, Andrey Borzenkov wrote:
> yes. So what - how does it help? User needs /dev/sda4. User has /dev/sda only.
> Any attempt to refer to /dev/sda4 simply returns "No such file or directory"
Things are far from perfect here, but "blockdev --rereadpt /dev/sda" helps.
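For reference, blockdev --rereadpt boils down to the BLKRRPART ioctl on the
whole-disk node. A minimal Python sketch follows; the function name is
hypothetical, and it needs an existing node and usually root, so by itself it
does not remove the need for something to trigger it.

```python
import fcntl
import os

BLKRRPART = 0x125F  # _IO(0x12, 95) from <linux/fs.h>


def reread_partition_table(device):
    """Ask the kernel to re-read the partition table; return 0 or an errno."""
    fd = os.open(device, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKRRPART)
        return 0
    except OSError as err:
        return err.errno
    finally:
        os.close(fd)
```

A non-zero return is the errno from the ioctl, for example EIO when the
drive is empty.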
On Saturday 03 January 2004 15:42, Olaf Hering wrote:
> On Sat, Jan 03, Andries Brouwer wrote:
> > On Sat, Jan 03, 2004 at 11:51:33AM +0300, Andrey Borzenkov wrote:
> > > yes. So what - how does it help? User needs /dev/sda4. User has
> > > /dev/sda only. Any attempt to refer to /dev/sda4 simply returns "No
> > > such file or directory"
> >
> > Things are far from perfect here, but "blockdev --rereadpt /dev/sda"
> > helps.
>
sure. But that requires manual user intervention. And it has been working
without any manual user intervention before. That is why I called it
regression.
I am just trying to draw attention to a simple (but very nasty for users)
problem in udev. This quotation removed too much from my original post and
reduced the problem to a simple "how to reread partition table".
> Is there really no way to get a media change notification from ZIP or
> JAZ drives?
If anyone knows please tell me - I will put it into supermount ...
AFAIK in case of SCSI this is impossible simply by virtue of protocol - SCSI
device is not initiator. So you need something to poll device for status.
That is usually done on device open except in this case you can't open
because you do not yet have handle.
thank you
-andrey
On Sat, Jan 03 2004, Andrey Borzenkov wrote:
> > Is there really no way to get a media change notification from ZIP or
> > JAZ drives?
>
> If anyone knows please tell me - I will put it into supermount ...
>
> AFAIK in case of SCSI this is impossible simply by virtue of protocol - SCSI
> device is not initiator. So you need something to poll device for status.
> That is usually done on device open except in this case you can't open
> because you do not yet have handle.
You could queue a media notification request for long periods of time,
being completed by the drive when a media change happens. At least mmc
allows for this, doubt anyone has ever done it.
So yeah, poll...
--
Jens Axboe
On Sat, Jan 03, 2004 at 11:51:33AM +0300, Andrey Borzenkov wrote:
> > You could make a script that just creates
> > the device node in /tmp, runs dd on it, and then cleans it all up to
> > force partition scanning.
> >
>
> You miss the point. When should this script be run? There is no event when you
> just insert Jaz disk; nor is there any way to trigger revalidation on access
> to non-existing device like is the case without udev.
>
> what I aim at - udev needs to provide some extension mechanism to allow
> arbitrarily scripts to be run.
It does provide that mechanism. See the CALLOUT rule. It can run any
program or script when a new device is seen by the kernel.
thanks,
greg k-h
On Wed, Jan 07 2004, Olaf Hering wrote:
> On Sat, Jan 03, Jens Axboe wrote:
>
> > On Sat, Jan 03 2004, Andrey Borzenkov wrote:
> > > > Is there really no way to get a media change notification from ZIP or
> > > > JAZ drives?
> > >
> > > If anyone knows please tell me - I will put it into supermount ...
> > >
> > > AFAIK in case of SCSI this is impossible simply by virtue of protocol - SCSI
> > > device is not initiator. So you need something to poll device for status.
> > > That is usually done on device open except in this case you can't open
> > > because you do not yet have handle.
> >
> > You could queue a media notification request for long periods of time,
> > being completed by the drive when a media change happens. At least mmc
> > allows for this, doubt anyone has ever done it.
> >
> > So yeah, poll...
>
> Poll how? "kmediachangethread"? Or polling in userland? The latter would
> (probably) lead to endless IO errors. Not very good.
No need to put it in the kernel, user space fits the bill nicely. I don't
see how this would lead to IO errors?
> If I understand the Darwin sources correctly, a polling is used. But I
> havent looked hard how they do it.
It's the only way to do it.
--
Jens Axboe
On Sat, Jan 03, Jens Axboe wrote:
> On Sat, Jan 03 2004, Andrey Borzenkov wrote:
> > > Is there really no way to get a media change notification from ZIP or
> > > JAZ drives?
> >
> > If anyone knows please tell me - I will put it into supermount ...
> >
> > AFAIK in case of SCSI this is impossible simply by virtue of protocol - SCSI
> > device is not initiator. So you need something to poll device for status.
> > That is usually done on device open except in this case you can't open
> > because you do not yet have handle.
>
> You could queue a media notification request for long periods of time,
> being completed by the drive when a media change happens. At least mmc
> allows for this, doubt anyone has ever done it.
>
> So yeah, poll...
Poll how? "kmediachangethread"? Or polling in userland? The latter would
(probably) lead to endless IO errors. Not very good.
If I understand the Darwin sources correctly, polling is used. But I
haven't looked hard at how they do it.
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
On Wed, Jan 07 2004, Olaf Hering wrote:
> On Wed, Jan 07, Jens Axboe wrote:
>
> > No need to put it in the kernel, user space fits the bil nicely. I don't
> > see how this would lead to IO errors?
>
> Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI ZIP
> with 2.4 behaves like that:
>
>
> nectarine:~ # blockdev --rereadpt /dev/hdd
> /dev/hdd: Eingabe-/Ausgabefehler
> nectarine:~ # dmesg | tail
> nfs: server Hilbert2 OK
> nfs: server Hilbert3 not responding, still trying
> nfs: server Hilbert3 OK
> nfs: server Hilbert3 not responding, still trying
> nfs: server Hilbert3 OK
> nfs: server Hilbert2 not responding, still trying
> nfs: server Hilbert2 OK
> ide-floppy: hdd: I/O error, pc = 0, key = 2, asc = 3a, ascq = 0
> ide-floppy: hdd: I/O error, pc = 1b, key = 2, asc = 3a, ascq = 0
> hdd: No disk in drive
> nectarine:~ # cat /proc/ide/hdd/model
> IOMEGA ZIP 100 ATAPI
Two problems here. First, ide-floppy should not verbosely fail these
commands (2/3a/00 is 'medium not present'). Second, you are not using
the proper mechanism to detect media events.
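To make the 2/3a/00 reading concrete, here is a small Python sketch that
pulls the sense triple out of an ide-floppy style log line and checks for
"medium not present". The regular expression is an assumption based only on
the log format quoted in this thread, and the function names are
hypothetical.

```python
import re

NOT_READY = 0x02           # SCSI sense key
MEDIUM_NOT_PRESENT = 0x3A  # additional sense code

_SENSE = re.compile(r"key = ([0-9a-f]+), asc = ([0-9a-f]+), ascq = ([0-9a-f]+)")


def parse_sense(line):
    """Extract (key, asc, ascq) from an ide-floppy style log line, or None."""
    m = _SENSE.search(line)
    return tuple(int(g, 16) for g in m.groups()) if m else None


def is_no_medium(sense):
    """True when the sense data just says the drive is empty."""
    return (
        sense is not None
        and sense[0] == NOT_READY
        and sense[1] == MEDIUM_NOT_PRESENT
    )
```

A poller would treat such a result as the normal "empty drive" state rather
than as an error worth logging.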
> I have not checked 2.6, but I doubt it is smarter.
I doubt ide-floppy behaves any differently. But at least you can send
packet commands generically to any atapi/scsi device in 2.6, so the code
has a chance to work a bit more generically (without tons of ugly and
different ioctls for each device type).
--
Jens Axboe
On Wed, Jan 07, Jens Axboe wrote:
> No need to put it in the kernel, user space fits the bil nicely. I don't
> see how this would lead to IO errors?
Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI ZIP
with 2.4 behaves like that:
nectarine:~ # blockdev --rereadpt /dev/hdd
/dev/hdd: Input/output error
nectarine:~ # dmesg | tail
nfs: server Hilbert2 OK
nfs: server Hilbert3 not responding, still trying
nfs: server Hilbert3 OK
nfs: server Hilbert3 not responding, still trying
nfs: server Hilbert3 OK
nfs: server Hilbert2 not responding, still trying
nfs: server Hilbert2 OK
ide-floppy: hdd: I/O error, pc = 0, key = 2, asc = 3a, ascq = 0
ide-floppy: hdd: I/O error, pc = 1b, key = 2, asc = 3a, ascq = 0
hdd: No disk in drive
nectarine:~ # cat /proc/ide/hdd/model
IOMEGA ZIP 100 ATAPI
I have not checked 2.6, but I doubt it is smarter.
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
On Wed, Jan 07 2004, Olaf Hering wrote:
> On Wed, Jan 07, Jens Axboe wrote:
>
> > On Wed, Jan 07 2004, Olaf Hering wrote:
> > > On Wed, Jan 07, Jens Axboe wrote:
> > >
> > > > No need to put it in the kernel, user space fits the bil nicely. I don't
> > > > see how this would lead to IO errors?
> > >
> > > Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI ZIP
> ^^^
>
> "How"? We need a sane way to deal with removeable medias.
> Do you have example code that can be put into the udev distribution?
Depends. If the device supports event status notification, then that is
what should be used. If not, you have to hack some code around test unit
ready (checking the sense info on return, if failed). You'd most likely
want to do this manually, with SG_IO.
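The polling side of this can be sketched independently of how the drive is
actually queried. In the Python sketch below, probe is a caller-supplied
callable (hypothetical here) that would issue TEST UNIT READY through SG_IO
and return True when media is present; the loop only reports transitions.

```python
import time


def watch_media(probe, on_change, interval=1.0, rounds=None, sleep=time.sleep):
    """Poll probe() and call on_change(present) whenever the answer flips.

    rounds=None polls forever; a finite value is handy for testing.
    The first reading only establishes the baseline and reports nothing.
    """
    last = None
    done = 0
    while rounds is None or done < rounds:
        present = bool(probe())
        if last is not None and present != last:
            on_change(present)
        last = present
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
```

on_change is where a partition rescan (or its cleanup on removal) would be
hooked in.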
--
Jens Axboe
On Wed, Jan 07, Jens Axboe wrote:
> On Wed, Jan 07 2004, Olaf Hering wrote:
> > On Wed, Jan 07, Jens Axboe wrote:
> >
> > > No need to put it in the kernel, user space fits the bil nicely. I don't
> > > see how this would lead to IO errors?
> >
> > Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI ZIP
^^^
"How"? We need a sane way to deal with removable media.
Do you have example code that can be put into the udev distribution?
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
On Wednesday 07 January 2004 13:31, Jens Axboe wrote:
> On Wed, Jan 07 2004, Olaf Hering wrote:
> > On Wed, Jan 07, Jens Axboe wrote:
> > > On Wed, Jan 07 2004, Olaf Hering wrote:
> > > > On Wed, Jan 07, Jens Axboe wrote:
> > > > > No need to put it in the kernel, user space fits the bil nicely. I
> > > > > don't see how this would lead to IO errors?
> > > >
> > > > Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI
> > > > ZIP
> >
> > ^^^
> >
> > "How"? We need a sane way to deal with removeable medias.
> > Do you have example code that can be put into the udev distribution?
>
> Depends. If the device supports event status notification, then that is
> what should be used.
Would you please give some pointers to information about "event status
notification".
thank you
> If not, you have to hack some code around test unit
> ready (checking the sense info on return, if failed). You'd most likely
> want to do this manually, with SG_IO.
On Wed, Jan 07 2004, Andrey Borzenkov wrote:
> On Wednesday 07 January 2004 13:31, Jens Axboe wrote:
> > On Wed, Jan 07 2004, Olaf Hering wrote:
> > > On Wed, Jan 07, Jens Axboe wrote:
> > > > On Wed, Jan 07 2004, Olaf Hering wrote:
> > > > > On Wed, Jan 07, Jens Axboe wrote:
> > > > > > No need to put it in the kernel, user space fits the bil nicely. I
> > > > > > don't see how this would lead to IO errors?
> > > > >
> > > > > Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI
> > > > > ZIP
> > >
> > > ^^^
> > >
> > > "How"? We need a sane way to deal with removeable medias.
> > > Do you have example code that can be put into the udev distribution?
> >
> > Depends. If the device supports event status notification, then that is
> > what should be used.
>
> Would you please give some pointers to information about "event status
> notification".
Sure, I'm talking about GPCMD_GET_EVENT_STATUS_NOTIFICATION (see
cdrom.h), opcode 0x4a in the mt fuji or mmc docs. You can fetch here:
ftp://ftp.avc-pioneer.com/Mtfuji5/
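As a rough illustration, the command block for that request can be built as
below (Python sketch). The byte layout is my reading of the mmc documents and
should be checked against them; the helper name is hypothetical. Actually
sending it would go through SG_IO, which is not shown.

```python
GPCMD_GET_EVENT_STATUS_NOTIFICATION = 0x4A
MEDIA_EVENT_CLASS = 1 << 4  # notification class 4 (media)


def build_gesn_cdb(alloc_len=8, polled=True, classes=MEDIA_EVENT_CLASS):
    """Build the 10-byte CDB for a GET EVENT STATUS NOTIFICATION request."""
    if not 0 <= alloc_len <= 0xFFFF:
        raise ValueError("allocation length must fit in 16 bits")
    cdb = bytearray(10)
    cdb[0] = GPCMD_GET_EVENT_STATUS_NOTIFICATION
    cdb[1] = 1 if polled else 0       # bit 0: polled (immediate) operation
    cdb[4] = classes & 0xFF           # notification class request bitmask
    cdb[7] = (alloc_len >> 8) & 0xFF  # allocation length, big-endian
    cdb[8] = alloc_len & 0xFF
    return bytes(cdb)
```

With polled=False the request would be the queued, asynchronous variant
mentioned earlier in the thread.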
--
Jens Axboe
On Wednesday 07 January 2004 12:50, Jens Axboe wrote:
> > > So yeah, poll...
> >
> > Poll how? "kmediachangethread"? Or polling in userland? The latter would
> > (probably) lead to endless IO errors. Not very good.
>
> No need to put it in the kernel, user space fits the bil nicely.
unfortunately opening the device in userland effectively locks the tray,
making media change impossible, at least given the current ->open semantics.
even periodic access is quite annoying for users (the tray closes while the
user attempts to insert a CD)
we may agree that O_NDELAY does not affect the locked state; currently this is
not consistent across drivers (e.g. cdrom does not lock the tray while sd does)
On Wed, Jan 07 2004, Andrey Borzenkov wrote:
> On Wednesday 07 January 2004 12:50, Jens Axboe wrote:
> > > > So yeah, poll...
> > >
> > > Poll how? "kmediachangethread"? Or polling in userland? The latter would
> > > (probably) lead to endless IO errors. Not very good.
> >
> > No need to put it in the kernel, user space fits the bil nicely.
>
> unfortunately opening device in userland effectively locks tray making
> media change impossible. at least given current ->open semantic.
>
> even periodic access is quite annoying for users (tray closing while
> user attempts to insert CD)
cdrom layer handles this with O_NONBLOCK basically meaning a 'not for
data' open.
> we may agree that O_NDELAY does not affect locked state; currently
> this is not consistent across drivers (e.g. cdrom does not lock tray
> while sd does)
cdrom has no special O_NDELAY checks.
--
Jens Axboe
On Wed, Jan 07 2004, Andrey Borzenkov wrote:
> On Wednesday 07 January 2004 14:05, Jens Axboe wrote:
> > On Wed, Jan 07 2004, Andrey Borzenkov wrote:
> > > On Wednesday 07 January 2004 12:50, Jens Axboe wrote:
> > > > > > So yeah, poll...
> > > > >
> > > > > Poll how? "kmediachangethread"? Or polling in userland? The latter
> > > > > would (probably) lead to endless IO errors. Not very good.
> > > >
> > > > No need to put it in the kernel, user space fits the bil nicely.
> > >
> > > unfortunately opening device in userland effectively locks tray making
> > > media change impossible. at least given current ->open semantic.
> > >
> > > even periodic access is quite annoying for users (tray closing while
> > > user attempts to insert CD)
> >
> > cdrom layer handles this with O_NONBLOCK basically meaning a 'not for
> > data' open.
> >
> > > we may agree that O_NDELAY does not affect locked state; currently
> > > this is not consistent across drivers (e.g. cdrom does not lock tray
> > > while sd does)
> >
> > cdrom has no special O_NDELAY checks.
>
> ok I meant O_NONBLOCK, sorry. they are synonyms anyway
>
> {pts/0}% grep NONBLO *
> fcntl.h:#define O_NONBLOCK 04000
> fcntl.h:#define O_NDELAY O_NONBLOCK
Oh right, so same thing :-). cdrom layer has always documented it as
O_NONBLOCK.
--
Jens Axboe
On Wednesday 07 January 2004 14:05, Jens Axboe wrote:
> On Wed, Jan 07 2004, Andrey Borzenkov wrote:
> > On Wednesday 07 January 2004 12:50, Jens Axboe wrote:
> > > > > So yeah, poll...
> > > >
> > > > Poll how? "kmediachangethread"? Or polling in userland? The latter
> > > > would (probably) lead to endless IO errors. Not very good.
> > >
> > > No need to put it in the kernel, user space fits the bil nicely.
> >
> > unfortunately opening device in userland effectively locks tray making
> > media change impossible. at least given current ->open semantic.
> >
> > even periodic access is quite annoying for users (tray closing while
> > user attempts to insert CD)
>
> cdrom layer handles this with O_NONBLOCK basically meaning a 'not for
> data' open.
>
> > we may agree that O_NDELAY does not affect locked state; currently
> > this is not consistent across drivers (e.g. cdrom does not lock tray
> > while sd does)
>
> cdrom has no special O_NDELAY checks.
ok I meant O_NONBLOCK, sorry. they are synonyms anyway
{pts/0}% grep NONBLO *
fcntl.h:#define O_NONBLOCK 04000
fcntl.h:#define O_NDELAY O_NONBLOCK
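The same equivalence is visible from user space. A small Python sketch of a
"not for data" open follows; the function names are hypothetical, and whether
a given driver leaves the tray unlocked on such an open is exactly the
inconsistency under discussion.

```python
import os


def open_for_status(path):
    """Open a device read-only with O_NONBLOCK, i.e. not for data access."""
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def ndelay_is_nonblock():
    """True when O_NDELAY and O_NONBLOCK are the same flag (as on Linux)."""
    return os.O_NDELAY == os.O_NONBLOCK
```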
On Wed, Jan 07, 2004 at 11:25:15AM +0100, Olaf Hering wrote:
> On Wed, Jan 07, Jens Axboe wrote:
>
> > On Wed, Jan 07 2004, Olaf Hering wrote:
> > > On Wed, Jan 07, Jens Axboe wrote:
> > >
> > > > No need to put it in the kernel, user space fits the bil nicely. I don't
> > > > see how this would lead to IO errors?
> > >
> > > Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI ZIP
> ^^^
>
> "How"? We need a sane way to deal with removeable medias.
> Do you have example code that can be put into the udev distribution?
In udev? Why, that sounds like the job for some other tool/program,
that can sit around and poll devices. Not a job for udev (which is for
naming devices.)
thanks,
greg k-h
On Fri, 2 Jan 2004, Greg KH wrote:
>
> Doesn't the kernel always create the main block device for this device?
> If so, udev will catch that.
But udev should probably also create all the sub-nodes if it doesn't
already.
And it really has to create _all_ of them, exactly because there's no way
to know ahead-of-time which of them will be available.
Then, user space can just access "/dev/sda1" or whatever, and the act of
accessing it will force the re-scan.
Linus
On Wed, Jan 07, 2004 at 10:38:31AM -0800, Linus Torvalds wrote:
>
> On Fri, 2 Jan 2004, Greg KH wrote:
> >
> > Doesn't the kernel always create the main block device for this device?
> > If so, udev will catch that.
>
> But udev should probably also create all the sub-nodes if it doesn't
> already.
It doesn't, as I thought we could rely on the kernel partition support.
> And it really has to create _all_ of them, exactly because there's no way
> to know ahead-of-time which of them will be available.
>
> Then, user space can just access "/dev/sda1" or whatever, and the act of
> accessing it will force the re-scan.
Hm, that would work, but what about a user program that just polls on
the device, as the rest of this thread discusses? As removable devices
are not the "norm" it would seem a bit of overkill to create 16
partitions for every block device, if they need them or not.
thanks,
greg k-h
Greg KH wrote:
>On Wed, Jan 07, 2004 at 10:38:31AM -0800, Linus Torvalds wrote:
>
>
>>On Fri, 2 Jan 2004, Greg KH wrote:
>>
>>
>>>Doesn't the kernel always create the main block device for this device?
>>>If so, udev will catch that.
>>>
>>>
>>But udev should probably also create all the sub-nodes if it doesn't
>>already.
>>
>>
>
>It doesn't, as I thought we could rely on the kernel partition support.
>
>
>
>>And it really has to create _all_ of them, exactly because there's no way
>>to know ahead-of-time which of them will be available.
>>
>>Then, user space can just access "/dev/sda1" or whatever, and the act of
>>accessing it will force the re-scan.
>>
>>
>
>Hm, that would work, but what about a user program that just polls on
>the device, as the rest of this thread discusses? As removable devices
>are not the "norm" it would seem a bit of overkill to create 16
>partitions for every block device, if they need them or not.
>
>thanks,
>
>greg k-h
>
>
>
Accessing the partition would not cause the rescan (accessing the whole
disk does.) I think devfs does/did this rescan on access.
--Mika
On Wed, Jan 07, 2004 at 09:23:39PM +0200, Mika Penttilä wrote:
> Greg KH wrote:
> >On Wed, Jan 07, 2004 at 10:38:31AM -0800, Linus Torvalds wrote:
> >>And it really has to create _all_ of them, exactly because there's no way
> >>to know ahead-of-time which of them will be available.
> >>
> >>Then, user space can just access "/dev/sda1" or whatever, and the act of
> >>accessing it will force the re-scan.
> >>
> >>
> >
> >Hm, that would work, but what about a user program that just polls on
> >the device, as the rest of this thread discusses? As removable devices
> >are not the "norm" it would seem a bit of overkill to create 16
> >partitions for every block device, if they need them or not.
> >
> Accessing the partition would not cause the rescan (accessing the whole
> disk causes.) I think devfs does/did this rescan on access.
It would rescan on access of a partition or the main block device?
If accessing the partition doesn't work, then having udev create all
partitions wouldn't help anything :(
thanks,
greg k-h
On Wed, 7 Jan 2004, Greg KH wrote:
> > But udev should probably also create all the sub-nodes if it doesn't
> > already.
>
> It doesn't, as I thought we could rely on the kernel partition support.
Indeed, we _can_ rely on the kernel partition support, but the subnodes
are needed to get at those partitions.
Obviously, a "repartitioning hotplug event" can create the subnodes, but
that will fail exactly because it wouldn't allow the user to just access
the nodes.
> Hm, that would work, but what about a user program that just polls on
> the device, as the rest of this thread discusses?
I hate those "background CPU users". Have you looked at "ps" output after
something like kscd has run, and does a CD check every second? It's
_expensive_. It goes all the way down to the hardware, sends a request
to the device.
Doing it every five minutes wouldn't be an issue, but doing it every five
minutes would be absolutely _horrible_ from a user perspective standpoint.
If you insert a smartmedia card in your cardreader, you expect to be able
to access it pretty much immediately when you start typing. So a second or
two of delay is fine, but even just five or ten seconds are already bad.
So the choice is:
- probe every removable device once a second
- pre-populate the device nodes, and when the user presses the icon that
says "mount", it will just do so. Immediately. No delay at all.
> As removable devices
> are not the "norm" it would seem a bit of overkill to create 16
> partitions for every block device, if they need them or not.
It might be acceptable to create even just the first one, since things
like cameras etc only ever generate a single partition. But the
inconsistency would drive me mad. I'd just do all of them.
NOTE! We do have an alternative: if we were to just make block device
nodes support "readdir" and "lookup", you could just do
open("/dev/sda/1" ...)
and it magically works right. I've wanted to do this for a long time, but
every time I suggest allowing it, people scream.
Linus
Greg KH wrote:
>On Wed, Jan 07, 2004 at 09:23:39PM +0200, Mika Penttil? wrote:
>
>
>>Greg KH wrote:
>>
>>
>>>On Wed, Jan 07, 2004 at 10:38:31AM -0800, Linus Torvalds wrote:
>>>
>>>
>>>>And it really has to create _all_ of them, exactly because there's no way
>>>>to know ahead-of-time which of them will be available.
>>>>
>>>>Then, user space can just access "/dev/sda1" or whatever, and the act of
>>>>accessing it will force the re-scan.
>>>>
>>>>
>>>>
>>>>
>>>Hm, that would work, but what about a user program that just polls on
>>>the device, as the rest of this thread discusses? As removable devices
>>>are not the "norm" it would seem a bit of overkill to create 16
>>>partitions for every block device, if they need them or not.
>>>
>>>
>>>
>>Accessing the partition would not cause the rescan (accessing the whole
>>disk causes.) I think devfs does/did this rescan on access.
>>
>>
>
>It would rescan on access of a partition or the main block device?
>
>If accessing the partition doesn't work, than having udev create all
>partitions wouldn't help anything :(
>
>thanks,
>
>greg k-h
>
>
Right, rescan on access of main block device works, partition not, afaics.
--Mika
On Wed, Jan 07, 2004 at 11:31:55AM -0800, Linus Torvalds wrote:
> On Wed, 7 Jan 2004, Greg KH wrote:
> > > But udev should probably also create all the sub-nodes if it doesn't
> > > already.
> >
> > It doesn't, as I thought we could rely on the kernel partition support.
>
> Indeed, we _can_ rely on the kernel partition support, but the subnodes
> are needed to get at those partitions.
>
> Obviously, a "repartitioning hotplug event" can create the subnodes, but
> that will fail exactly because it wouldn't allow the user to just access
> the nodes.
It sounds like just having the device node around will not cause the
rescan if you access it. I don't have any such devices here to test
this out or not. If true, having udev create all nodes will not help
out much :(
> > Hm, that would work, but what about a user program that just polls on
> > the device, as the rest of this thread discusses?
>
> I hate those "background CPU users". Have you looked at "ps" output after
> something like kscd has run, and does a CD check every second? It's
> _expensive_. It goes all the way down to the hardware, sends a request
> to the device.
Oh I know, it's one of the first things I disable when setting up a box :)
> Doing it every five minutes wouldn't be an issue, but doing it every five
> minutes would be absolutely _horrible_ from a user perspective standpoint.
> If you insert a smartmedia card in your cardreader, you expect to be able
> to access it pretty much immediately when you start typing. So a second or
> two of delay is fine, but even just five or ten seconds are already bad.
>
> So the choice is:
> - probe every removable device once a second
> - pre-populate the device nodes, and when the user presses the icon that
> says "mount", it will just do so. Immediately. No delay at all.
Based on the previous info, I think we are stuck with probing :(
> NOTE! We do have an alternative: if we were to just make block device
> nodes support "readdir" and "lookup", you could just do
>
> open("/dev/sda/1" ...)
>
> and it magically works right. I've wanted to do this for a long time, but
> every time I suggest allowing it, people scream.
Hm, that would be nice. I don't remember seeing it being proposed
before, what are the main complaints people have with this?
thanks,
greg k-h
> > NOTE! We do have an alternative: if we were to just make block device
> > nodes support "readdir" and "lookup", you could just do
> >
> > open("/dev/sda/1" ...)
> >
> > and it magically works right. I've wanted to do this for a long time, but
> > every time I suggest allowing it, people scream.
Nooo!!! Resist the temptation!! Don't give in!
> Hm, that would be nice. I don't remember seeing it being proposed
> before, what are the main complaints people have with this?
Consider the long-range ramifications
if a device can also be a directory, just "magically".
I'm not going to automatically diss the idea (other than my
natural reaction above) but please consider beyond the immediate hack.
It reminds me of AIX from the days when it ran on PCs.
They had this neat trick of "hidden directories" (for a different
purpose). It looked like an executable, but really was a
directory containing multiple executables for various platforms.
(This version of AIX also ran on the mainframe, AIX/386 and AIX/370.)
There were/are better ways of solving the problem they were addressing.
-- R;
On Wed, Jan 07, Linus Torvalds wrote:
>
>
> On Fri, 2 Jan 2004, Greg KH wrote:
> >
> > Doesn't the kernel always create the main block device for this device?
> > If so, udev will catch that.
>
> But udev should probably also create all the sub-nodes if it doesn't
> already.
>
> And it really has to create _all_ of them, exactly because there's no way
> to know ahead-of-time which of them will be available.
>
> Then, user space can just access "/dev/sda1" or whatever, and the act of
> accessing it will force the re-scan.
How would that work? I mean, what will a tool that cares about a block
event do? It will run a fdisk/parted -l /udev/sda to figure out what partitions
are there (just to skip an extended partition sda5, as example) and
finds no media. That tool will never run again on sda, unless a new
block add event comes in. So some sort of polling is required for that
class of devices.
If we create sda1 - sda15, you mean a mount /udev/sda15 /mnt is supposed
to fail with -ENODEV instead of -ENOENT?
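The distinction between the two errors can be observed from user space. A
Python sketch of how a tool might tell "no node at all" from "node exists but
nothing behind it" is below; the function name and the returned labels are
hypothetical, and which errno a particular driver returns for an empty drive
is an assumption.

```python
import errno
import os


def probe_node(path):
    """Classify why a device node cannot be opened (or report success)."""
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    except OSError as err:
        if err.errno == errno.ENOENT:
            return "no node"  # nothing in the device directory by that name
        if err.errno in (errno.ENODEV, errno.ENXIO):
            return "node without device"
        return errno.errorcode.get(err.errno, "unknown error")
    os.close(fd)
    return "ok"
```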
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
On Wed, Jan 07, 2004 at 02:25:44PM -0600, Richard Troth wrote:
> Consider the long-range ramifications
> if a device can also be a directory, just "magically".
> I'm not going to automatically diss the idea (other than my
> natural reaction above) but please consider beyond the immediate hack.
What do you consider the "long-range" ramifications of this change?
Curious,
greg k-h
>> NOTE! We do have an alternative: if we were to just make block device
>> nodes support "readdir" and "lookup", you could just do
>>
>> open("/dev/sda/1" ...)
>>
>> and it magically works right. I've wanted to do this for a long time, but
>> every time I suggest allowing it, people scream.
>
> Hm, that would be nice. I don't remember seeing it being proposed
> before, what are the main complaints people have with this?
Couldn't the partitions go under "/dev/sdaX/{1,2,3}" and solve the same
problem without doing magic on the devices?
M.
On Wed, Jan 07, 2004 at 04:32:26PM -0800, Martin J. Bligh wrote:
> >> NOTE! We do have an alternative: if we were to just make block device
> >> nodes support "readdir" and "lookup", you could just do
> >>
> >> open("/dev/sda/1" ...)
> >>
> >> and it magically works right. I've wanted to do this for a long time, but
> >> every time I suggest allowing it, people scream.
> >
> > Hm, that would be nice. I don't remember seeing it being proposed
> > before, what are the main complaints people have with this?
>
> Couldn't the partitions go under "/dev/sdaX/{1,2,3}" and solve the same
> problem without doing magic on the devices?
No, that's not the point. As discussed on irc, I think you now
understand the issue (partitions not being present, media changed
without kernel knowing about it, etc.)
thanks,
greg k-h
On Wed, Jan 07, 2004 at 11:31:55AM -0800, Linus Torvalds wrote:
> NOTE! We do have an alternative: if we were to just make block device
> nodes support "readdir" and "lookup", you could just do
>
> open("/dev/sda/1" ...)
>
> and it magically works right. I've wanted to do this for a long time, but
> every time I suggest allowing it, people scream.
... and do so for a good reason.
Guys, could we please put the entire thing on hold for a week or so?
There's stuff around the block hotplug that would simplify a lot in
that area and I'd rather see partitioning code (and bdev code in general,
for that matter) not messed with until we do that right.
Some of that is already in -mm, some should go in shortly and it can
make a lot of kludges unnecessary.
--Greg KH <[email protected]> wrote (on Wednesday, January 07, 2004 16:41:24 -0800):
> On Wed, Jan 07, 2004 at 04:32:26PM -0800, Martin J. Bligh wrote:
>> >> NOTE! We do have an alternative: if we were to just make block device
>> >> nodes support "readdir" and "lookup", you could just do
>> >>
>> >> open("/dev/sda/1" ...)
>> >>
>> >> and it magically works right. I've wanted to do this for a long time, but
>> >> every time I suggest allowing it, people scream.
>> >
>> > Hm, that would be nice. I don't remember seeing it being proposed
>> > before, what are the main complaints people have with this?
>>
>> Couldn't the partitions go under "/dev/sdaX/{1,2,3}" and solve the same
>> problem without doing magic on the devices?
>
> No, that's not the point. As discussed on irc, I think you now
> understand the issue (partitions not being present, media changed
> without kernel knowing about it, etc.)
I think it is the point ... you don't need to scan the device until
someone does "readdir" / "lookup" as Linus says above. Just this way
there's no duplexing of the block devices, just a magic directory node
(which might avoid the screams Linus mentions).
And it gets rid of the polling crap.
M.
On Thu, Jan 08, 2004 at 12:42:50AM +0000, [email protected] wrote:
> On Wed, Jan 07, 2004 at 11:31:55AM -0800, Linus Torvalds wrote:
>
> > NOTE! We do have an alternative: if we were to just make block device
> > nodes support "readdir" and "lookup", you could just do
> >
> > open("/dev/sda/1" ...)
> >
> > and it magically works right. I've wanted to do this for a long time, but
> > every time I suggest allowing it, people scream.
>
> ... and do so for a good reason.
>
> Guys, could we please put the entire thing on hold for a week or so?
> There's stuff around the block hotplug that would simplify a lot in
> that area and I'd rather see partitioning code (and bdev code in general,
> for that matter) not messed with until we do that right.
Fine with me, the current situation works just fine for my devices right
now :)
greg k-h
On Wed, Jan 07, 2004 at 05:07:33PM -0800, Martin J. Bligh wrote:
> --Greg KH <[email protected]> wrote (on Wednesday, January 07, 2004 16:41:24 -0800):
>
> > On Wed, Jan 07, 2004 at 04:32:26PM -0800, Martin J. Bligh wrote:
> >> >> NOTE! We do have an alternative: if we were to just make block device
> >> >> nodes support "readdir" and "lookup", you could just do
> >> >>
> >> >> open("/dev/sda/1" ...)
> >> >>
> >> >> and it magically works right. I've wanted to do this for a long time, but
> >> >> every time I suggest allowing it, people scream.
> >> >
> >> > Hm, that would be nice. I don't remember seeing it being proposed
> >> > before, what are the main complaints people have with this?
> >>
> >> Couldn't the partitions go under "/dev/sdaX/{1,2,3}" and solve the same
> >> problem without doing magic on the devices?
> >
> > No, that's not the point. As discussed on irc, I think you now
> > understand the issue (partitions not being present, media changed
> > without kernel knowing about it, etc.)
>
> I think it is the point ... you don't need to scan the device until
> someone does "readdir" / "lookup" as Linus says above. Just this way
> there's no duplexing of the block devices, just a magic directory node
> (which might avoid the screams Linus mentions).
> And it gets rid of the polling crap.
The problem is the following:
- user plugs in their usb flash reader with no media in it
- the main block device is created, no partitions
- user plugs a flash stick/whatever into the reader
- kernel gets no notification of this event
- userspace gets no notification of this event
How can userspace know to open the main block device now? Require that
we put a big "Rescan media now" button on the desktop? That's one way,
but users are used to not having to do that.
If a daemon does the scanning every so often, the media can be
automatically mounted, and an application can pop up saying that it
found some pictures on the new device, do you want to open up your image
application?
That help?
greg k-h
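The periodic scan described above boils down to comparing successive snapshots of a device's partition list and reacting to the difference. A minimal Python sketch follows; `list_partitions`, `on_add` and `on_remove` are hypothetical callbacks invented for illustration, and the 2-second default interval is an assumption, not anything udev ships.

```python
import time
from typing import Callable, Iterable, Set, Tuple


def diff_partitions(old: Set[str], new: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) between two partition-name snapshots."""
    return new - old, old - new


def poll_media(list_partitions: Callable[[], Iterable[str]],
               on_add: Callable[[str], None],
               on_remove: Callable[[str], None],
               interval: float = 2.0,
               rounds: int = 1) -> Set[str]:
    """Take `rounds` snapshots and report every change between them.

    list_partitions is a hypothetical callback (not a udev interface) that
    returns the node names currently valid for one device.
    """
    known: Set[str] = set()
    for i in range(rounds):
        current = set(list_partitions())
        added, removed = diff_partitions(known, current)
        for name in sorted(added):
            on_add(name)
        for name in sorted(removed):
            on_remove(name)
        known = current
        if i + 1 < rounds:
            time.sleep(interval)   # the cost of polling: wake up every interval
    return known
```

The shorter the interval, the sooner a new card is noticed, but the more often the device is woken up, which is the trade-off argued about in this thread.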
On Wed, 7 Jan 2004, Greg KH wrote:
>
> If accessing the partition doesn't work, then having udev create all
> partitions wouldn't help anything :(
Accessing a partition should definitely work. I regularly just stick in a
memory card in a card reader and directly access sda1.
Linus
On Wed, 7 Jan 2004, Greg KH wrote:
>
> It sounds like just having the device node around will not cause the
> rescan if you access it.
Who has problems with it, and why?
When you open a partition, that will automatically open the "whole
device", and that should cause a re-scan if the device supports the "disk
change" flagging.
This definitely has worked for me. It used to be broken in really early
2.5.x, but it got fixed with the block device interface fixes.
Linus
> The problem is the following:
> - user plugs in their usb flash reader with no media in it
> - the main block device is created, no partitions
> - user plugs a flash stick/whatever into the reader
> - kernel gets no notification of this event
> - userspace gets no notification of this event
You can solve the partitions bit by rescanning when someone opens
the "/dev/hdaX" directory, was my only point. Which was just a little
twist on Linus's thing, only a bit simpler. And no, it doesn't work
unless that dir is a virtual thing that's "magic".
> How can userspace know to open the main block device now? Require that
> we put a big "Rescan media now" button on the desktop? That's one way,
> but users are used to having to not do that.
I don't think you should have to press something explicit for the udev
stuff to be correct. But you should have to start an application that does
pretty stuff with pictures, like you mention below.
> If a daemon does the scanning every so often, the media can be
As Linus pointed out ... "every so often" has to be horribly frequent, so
I don't see how that can work. How often are you proposing? once a second?
> automatically mounted, and an application can pop up saying that it
> found some pictures on the new device, do you want to open up your image
> application?
Ick. You mean like the Windows crap that autoruns stuff off your CD?
If you really, really want that, then yes you'd have to poll, but it
would be nice to solve the partitions bit without that, IMHO. I'd
see a lot more demand for the partitions being correctly populated
than magically triggered GUI stuff.
M.
On Wed, 7 Jan 2004, Olaf Hering wrote:
> >
> > Then, user space can just access "/dev/sda1" or whatever, and the act of
> > accessing it will force the re-scan.
>
> How would that work? I mean, what will a tool that cares about a block
> event do? It will run a fdisk/parted -l /udev/sda to figure out what partitions
> are there (just to skip an extended partition sda5, as example) and
> finds no media. That tool will never run again on sda, unless a new
> block add event comes in. So some sort of polling is required for that
> class of devices.
What is your problem?
I'll use a very common and simple case that I do myself: use any USB media
reader to read a camera card. It will be a FAT filesystem on the first
partition, so your fstab might look like this:
/dev/sda1 /mnt/smartmedia vfat noauto,user,ro 0 0
and then you just do "mount /mnt/smartmedia", and you're done.
This works. I do it all the time. You just stick in your card, and mount
it, and off it goes. No "fdisk" or "parted" _anywhere_.
Linus
On Thu, 8 Jan 2004, Andries Brouwer wrote:
>
> Now the plan (or at least my plan) has always been to remove all
> partition detection from the kernel. It can all be done from user space.
We had this discussion last year. It makes no sense to cripple the kernel
that way. Not gonna happen.
When I insert a card in my card reader, it had better "just work". WITHOUT
any strange "poll another device Y to make device X" work.
Linus
On Wed, Jan 07, 2004 at 10:38:31AM -0800, Linus Torvalds wrote:
> > Doesn't the kernel always create the main block device for this device?
> > If so, udev will catch that.
>
> But udev should probably also create all the sub-nodes if it doesn't
> already.
>
> And it really has to create _all_ of them, exactly because there's no way
> to know ahead-of-time which of them will be available.
>
> Then, user space can just access "/dev/sda1" or whatever, and the act of
> accessing it will force the re-scan.
1. That is a bad idea.
2. There is no problem.
Ad 1)
- All, how many is that? 16? 64? more?
Today IDE disks have 64 minors.
- I hate to think of the sysfs tree. Today I am unhappy already:
# ls /sys/block
. nbd102 nbd116 nbd15 nbd29 nbd42 nbd56 nbd7 nbd83 nbd97 ram6
.. nbd103 nbd117 nbd16 nbd3 nbd43 nbd57 nbd70 nbd84 nbd98 ram7
fd0 nbd104 nbd118 nbd17 nbd30 nbd44 nbd58 nbd71 nbd85 nbd99 ram8
hda nbd105 nbd119 nbd18 nbd31 nbd45 nbd59 nbd72 nbd86 ram0 ram9
hdb nbd106 nbd12 nbd19 nbd32 nbd46 nbd6 nbd73 nbd87 ram1 sda
hdd nbd107 nbd120 nbd2 nbd33 nbd47 nbd60 nbd74 nbd88 ram10 sdb
hde nbd108 nbd121 nbd20 nbd34 nbd48 nbd61 nbd75 nbd89 ram11 sdc
hdf nbd109 nbd122 nbd21 nbd35 nbd49 nbd62 nbd76 nbd9 ram12 sdd
hdg nbd11 nbd123 nbd22 nbd36 nbd5 nbd63 nbd77 nbd90 ram13 sde
md0 nbd110 nbd124 nbd23 nbd37 nbd50 nbd64 nbd78 nbd91 ram14 sr0
nbd0 nbd111 nbd125 nbd24 nbd38 nbd51 nbd65 nbd79 nbd92 ram15
nbd1 nbd112 nbd126 nbd25 nbd39 nbd52 nbd66 nbd8 nbd93 ram2
nbd10 nbd113 nbd127 nbd26 nbd4 nbd53 nbd67 nbd80 nbd94 ram3
nbd100 nbd114 nbd13 nbd27 nbd40 nbd54 nbd68 nbd81 nbd95 ram4
nbd101 nbd115 nbd14 nbd28 nbd41 nbd55 nbd69 nbd82 nbd96 ram5
Someone decided to create "all" nbd devices.
If a similar thing is done for all disks then soon this directory will have
more nodes than the current /dev.
Ad 2)
On the other hand, there is no need at all for udev or the kernel
to do anything special.
If hotplug and udev do their work, then the node for the whole device
is created.
Now the plan (or at least my plan) has always been to remove all
partition detection from the kernel. It can all be done from user space.
We have had the infrastructure for a long time, and it works.
So, it is trivial to create the utility parsept+mount that takes
a device, reads its partition table, tells the kernel about the
partitions found there, and mounts the desired partition.
It can be a mount option to do a parsept if the device does not
exist yet.
Andries
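As a rough illustration of the reading half of such a "parsept" utility, the sketch below parses the four primary entries of a classic DOS/MBR partition table from the first 512-byte sector. It is only a sketch under stated assumptions: the names `Partition` and `parse_mbr` are made up here, extended partitions and non-DOS table formats are ignored, and the step that tells the kernel about the result is left out.

```python
import struct
from typing import List, NamedTuple


class Partition(NamedTuple):
    number: int       # 1-based slot in the primary table
    type_id: int      # partition type byte, e.g. 0x06 for FAT16
    start_lba: int    # first sector of the partition
    sectors: int      # length in sectors


def parse_mbr(sector: bytes) -> List[Partition]:
    """Parse the four primary entries of a DOS/MBR partition table."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        return []     # no boot signature, so no table to read
    parts = []
    for slot in range(4):
        # Entries are 16 bytes each, starting at offset 446.
        entry = sector[446 + 16 * slot: 446 + 16 * (slot + 1)]
        type_id = entry[4]
        start_lba, sectors = struct.unpack_from("<II", entry, 8)
        if type_id != 0 and sectors != 0:   # type 0 marks an unused slot
            parts.append(Partition(slot + 1, type_id, start_lba, sectors))
    return parts
```

A caller would read the first sector of the whole-disk node (for example `/dev/sda`) and pass it in; on a ZIP disk one would expect a single entry in slot 4.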
On Thu, 8 Jan 2004, Andries Brouwer wrote:
>
> On Wed, Jan 07, 2004 at 06:19:42PM -0800, Linus Torvalds wrote:
>
> > When I insert a card in my card reader, it had better "just work". WITHOUT
> > any strange "poll another device Y to make device X" work.
>
> But it does just work.
>
> Already today.
Exactly. It works today, because:
- the device nodes are there.
Ergo: udev should create the device nodes
- the kernel autopartitions the device on any open (both main device and
the subpartitions) when it notices a changed media. No polling
required.
Ergo: the kernel should continue to do this.
We should _not_ be in the situation where either of these things aren't
true.
Linus
On Wed, Jan 07, 2004 at 06:19:42PM -0800, Linus Torvalds wrote:
> When I insert a card in my card reader, it had better "just work". WITHOUT
> any strange "poll another device Y to make device X" work.
But it does just work.
Already today.
On Wed, 07 Jan 2004 14:25:44 CST, Richard Troth said:
> Consider the long-range ramifications
> if a device can also be a directory, just "magically".
> I'm not going to automatically diss the idea (other than my
> natural reaction above) but please consider beyond the immediate hack.
If it's so ugly, why do Solaris and Irix not have a problem in making a
device graph that looks somewhat like that as well? The idea is *not*
so crazy as to discard out-of-hand. Perhaps what you need is:
/dev/hda/disk    whole-disk access
/dev/hda/0       partition 0
/dev/hda/1       partition 1
... etc
> It reminds me of AIX from the days when it ran on PCs.
> They had this neat trick of "hidden directories" (for a different
> purpose). It looked like an executable, but really was a
> directory containing multiple executables for various platforms.
> (This version of AIX also ran on the mainframe, AIX/386 and AIX/370.)
> There were/are better ways of solving the problem they were addressing.
You think that was ugly, you should have seen it around the F17 level when
there were still some bugs in there (you haven't lived till you see the ooglyness
of an 'rsh' between TCF cluster members of different flavors - it propagated
the environment variable that controlled that stuff.. :)
Incidentally, I'm told that HP/UX has a similar concept, at least in some
releases...
On Wed, Jan 07, 2004 at 06:56:30PM -0800, Linus Torvalds wrote:
> > > When I insert a card in my card reader, it had better "just work". WITHOUT
> > > any strange "poll another device Y to make device X" work.
> >
> > But it does just work.
> >
> > Already today.
>
> Exactly. It works today, because:
>
> - the device nodes are there.
>
> Ergo: udev should create the device nodes
>
> - the kernel autopartitions the device on any open (both main device and
> the subpartitions) when it notices a changed media. No polling
> required.
>
> Ergo: the kernel should continue to do this.
>
> We should _not_ be in the situation where either of these things aren't
> true.
Indeed.
I am even happy in a somewhat more general situation than you are.
If the kernel autopartitions (and makes recognition of new partitions
hotplug events so that udev can create the device nodes), all is well.
On the other hand, if the kernel does not autopartition
(for example because I had not selected CONFIG_MAC_PARTITION
and put a Mac disk in my ZIP drive) then all is still well
since user space can tell the kernel about the partitions
it sees on this Mac disk. Of course these partitions that
userspace tells the kernel about must also lead to hotplug
events.
Andries
On Thu, 8 Jan 2004, Andries Brouwer wrote:
>
> I am even happy in a somewhat more general situation than you are.
> If the kernel autopartitions (and makes recognition of new partitions
> hotplug events so that udev can create the device nodes), all is well.
Yes. We _could_ do that, by just making a "we noticed the disk change" be
a hotplug event. However, I'm loath to do that, because some devices
literally don't even have an easily read disk change signal, so what they
do is
- assume the disk _always_ changed on open
- do a quick IO to verify it
and I'd be nervous about that kind of thing resulting in hotplug being
called constantly if somebody rude just has an endless loop of
"open()/close()".
Linus
On Wednesday 07 January 2004 21:03, Linus Torvalds wrote:
>On Wed, 7 Jan 2004, Olaf Hering wrote:
>> > Then, user space can just access "/dev/sda1" or whatever, and
>> > the act of accessing it will force the re-scan.
>>
>> How would that work? I mean, what will a tool that cares about a
>> block event do? It will run a fdisk/parted -l /udev/sda to figure
>> out what partitions are there (just to skip an extended partition
>> sda5, as example) and finds no media. That tool will never run
>> again on sda, unless a new block add event comes in. So some sort
>> of polling is required for that class of devices.
>
>What is your problem?
>
>I'll use a very common and simple case that I do myself: use any USB
> media reader to read a camera card. It will be a FAT filesystem on
> the first partition, so your fstab might look like this:
>
> /dev/sda1 /mnt/smartmedia vfat noauto,user,ro 0 0
>
>and then you just do "mount /mnt/smartmedia", and you're done.
>
>This works. I do it all the time. You just stick in your card, and
> mount it, and off it goes. No "fdisk" or "parted" _anywhere_.
I do too, except the card is still in my camera when I do it. But, I
do have to ask, why the ro? I regularly do housekeeping in the
camera once I've downloaded the images I want. The only problem I've
had is related to deleting the first images all in a row. Apparently
fat thinks an empty sector is the end of the directory. So one must
delete on LIFO basis.
> Linus
--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.22% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.
On Wed, 7 Jan 2004, Gene Heskett wrote:
>
> I do too, except the card is still in my camera when I do it.
My camera just does USB-1, and with a gigabyte card that's just too
painful. Besides, I don't want to eat camera batteries. So I just pop it
over in something that is ten times faster.
> But, I do have to ask, why the ro?
I don't trust camera firmware to do a full VFAT implementation, so out of
principle I only let the camera write to it normally. That way I don't
need to worry about the limited braincapacity of my poor camera.
> Apparently fat thinks an empty sector is the end of the directory. So
> one must delete on LIFO basis.
Sounds like your camera gets easily confused too. Me, I just transfer the
whole thing, and then I let the camera do a "format".
I've seen cameras that have serious problems with old filesystems - when
they get fragmented enough, the camera says that there is 50% free space,
but can't actually write a single picture any more. Deleting pictures to
make space only helps a bit, then it's "full" again.
Which is why I just delete everything by letting the camera do the
formatting.
Some day cameras will run Linux too, and I'll trust them. In the meantime
I just don't expect them to do that well.
Linus
I feel the media revalidation issue can be solved by hacking the protocol
(SCSI or USB) layer and notifying user space of the change by a signal.
-----Original Message-----
From: [email protected]
[mailto:[email protected]]On Behalf Of Jens Axboe
Sent: Wednesday, January 07, 2004 4:01 PM
To: Olaf Hering
Cc: Andrey Borzenkov; Andries Brouwer; Greg KH;
[email protected]; [email protected]
Subject: Re: removable media revalidation - udev vs. devfs or static
/dev
On Wed, Jan 07 2004, Olaf Hering wrote:
> On Wed, Jan 07, Jens Axboe wrote:
>
> > On Wed, Jan 07 2004, Olaf Hering wrote:
> > > On Wed, Jan 07, Jens Axboe wrote:
> > >
> > > > No need to put it in the kernel, user space fits the bill nicely. I
> > > > don't see how this would lead to IO errors?
> > >
> > > Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI ZIP
> ^^^
>
> "How"? We need a sane way to deal with removeable medias.
> Do you have example code that can be put into the udev distribution?
--
Jens Axboe
On Wed, Jan 07, Linus Torvalds wrote:
> This works. I do it all the time. You just stick in your card, and mount
> it, and off it goes. No "fdisk" or "parted" _anywhere_.
This is the point. You do it, and I do it. We just know that device and
have it configured somehow.
Now what should a distro do for these 'unknown' devices? Add 15 fstab
entries and let KDE put 15 icons on the desktop (for that ZIP) and give
the user a choice?
Like you said in another mail:
If you insert a smartmedia card in your cardreader, you expect to be
able to access it pretty much immediately when you start typing.
No one knows in advance what media will be inserted. So we have to poll
if the hardware doesn't inform us.
We can make an assumption and add just one icon with a smart application
behind it. This app does the 'blockdev --rereadpt /dev/hdd' on request,
and not every 2 seconds.
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
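The on-request rescan mentioned above can be sketched in Python. `blockdev --rereadpt` issues the BLKRRPART ioctl on the whole-disk node; the helper name `reread_partition_table` is hypothetical, and the ioctl number is hard-coded here on the assumption of a Linux system.

```python
import fcntl
import os

# _IO(0x12, 95) from <linux/fs.h>: ask the kernel to re-read the partition table.
BLKRRPART = 0x125F


def reread_partition_table(device: str) -> bool:
    """Request a partition rescan of a whole-disk node such as /dev/hdd.

    Returns True on success and False if the open or the ioctl fails
    (no media, partitions in use, missing privileges, not a block device).
    """
    try:
        fd = os.open(device, os.O_RDONLY | os.O_NONBLOCK)
    except OSError:
        return False
    try:
        fcntl.ioctl(fd, BLKRRPART)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

An application behind a single desktop icon could call this once when the user clicks it, instead of polling every few seconds.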
On Thu, 08/01/2004 at 04:43, Linus Torvalds wrote:
> On Thu, 8 Jan 2004, Andries Brouwer wrote:
> >
> > I am even happy in a somewhat more general situation than you are.
> > If the kernel autopartitions (and makes recognition of new partitions
> > hotplug events so that udev can create the device nodes), all is well.
>
> Yes. We _could_ do that, by just making a "we noticed the disk change" be
> a hotplug event. However, I'm loath to do that, because some devices
> literally don't even have an easily read disk change signal, so what they
> do is
>
> - assume the disk _always_ changed on open
> - do a quick IO to verify it
>
> and I'd be nervous about that kind of thing resulting in hotplug being
> called constantly if somebody rude just has an endless loop of
> "open()/close()".
These devices are kind of broken anyway, aren't they? I see no safe
way of handling disk changes on them, except having a "I changed disk in
this drive" button on the desktop and relying on the user's good behavior.
Currently the kernel may have a wrong idea of what's in the drive
if it doesn't poll, and that may wreak havoc.
Xav
In ens.mailing-lists.linux-kernel, you wrote:
> Exactly. It works today, because:
>
> - the device nodes are there.
>
> Ergo: udev should create the device nodes
>
> - the kernel autopartitions the device on any open (both main device and
> the subpartitions) when it notices a changed media. No polling
> required.
>
> Ergo: the kernel should continue to do this.
Maybe I don't understand the issues, maybe I am going to say something
stupid, but couldn't the problem be solved by adding an extra option to
mount, which would say: if device is a main device and no filesystem on
it, then try to open all subpartitions in turn till a filesystem is
found.
This way, I could just put in fstab
/dev/sda /mnt/media auto mymagicoption 0 0
mount first accesses sda, forcing a partition read. /dev/sda1 is created
by udev and mount finds it during its scan.
Complexity is in userspace. No polling, no extra device nodes. Policy is
in userspace (what should mount do when it finds two partitions ?
Creates subdirectories ? Fails ? Mount the first one ? The largest one ?
Just add as many mount options as you wish !). Plus it solves an extra
problem: I have zip disks with no partition tables (everything in
/dev/sda), and zip disks with filesystem sometimes in sda1, sometimes in
sda4. Right now, writing a fstab is difficult.
What did I miss ?
Éric Brunet
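The search order proposed above (whole device first, then each partition in turn) might look like the following Python sketch. `find_mountable` and the `has_filesystem` probe are hypothetical names; a real probe would have to open each node and recognise a filesystem, and the 1 to 15 partition range is an assumption.

```python
from typing import Callable, Iterable, Optional


def find_mountable(disk: str,
                   has_filesystem: Callable[[str], bool],
                   partition_numbers: Iterable[int] = range(1, 16)) -> Optional[str]:
    """Return the first node that carries a filesystem, or None.

    The whole-disk node is tried first (unpartitioned media), then each
    partition node in ascending order. has_filesystem is a hypothetical
    probe supplied by the caller.
    """
    if has_filesystem(disk):
        return disk
    for n in partition_numbers:
        node = f"{disk}{n}"
        if has_filesystem(node):
            return node
    return None
```

"Mount the first one found" is just one of the policies listed above; picking the largest partition or failing on ambiguity would only change the loop body.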
>
> > NOTE! We do have an alternative: if we were to just make block device
> > nodes support "readdir" and "lookup", you could just do
> >
> > open("/dev/sda/1" ...)
> >
> > and it magically works right. I've wanted to do this for a long time, but
> > every time I suggest allowing it, people scream.
>
> Hm, that would be nice. I don't remember seeing it being proposed
> before, what are the main complaints people have with this?
>
this has been in Linux long enough and was called "devfs". Apparently
somebody decided this was evil and removed it. I too am interested in
what exactly was wrong with this design (not implementation)?
Unfortunately the problem is worse than just that.
The main reason to use udev is to have persistent names for devices.
Currently my USB may be sda1 and next time I stick it in may be sdb1;
so I'd like to call it /dev/usb0 and use it.
But in this case we do not even have the possibility of revalidating
media on access to /dev/sda/1, because not only do we not have
/dev/usb0 yet - we do not even know what it might point at.
Assuming - oh, horror - that we do use devfs, we have a LOOKUP event,
so we can call the naming agent for /dev/usb0 - and we can tell it that
usb0 refers to the SCSI device on the first port of my USB hub (you usually
plug it into the same slot, don't you?) It can find out that there
is already a block device for it and simply initiate a rescan of the
partitions, magically making sda/1 appear and linking usb0 to it.
Without some kind of LOOKUP event apparently the only possibility
is polling :(
regards
-andrey
On Thu, 2004-01-08 at 03:15, Greg KH wrote:
> The problem is the following:
> - user plugs in their usb flash reader with no media in it
> - the main block device is created, no partitions
> - user plugs a flash stick/whatever into the reader
> - kernel gets no notification of this event
> - userspace gets no notification of this event
>
> How can userspace know to open the main block device now? Require that
> we put a big "Rescan media now" button on the desktop? That's one way,
> but users are used to having to not do that.
>
Stupid question - can't the kernel do the rescan?
--
Martin Schlemmer
On Wed, 2004-01-07 at 22:43, Linus Torvalds wrote:
> Yes. We _could_ do that, by just making a "we noticed the disk change" be
> a hotplug event. However, I'm loath to do that, because some devices
> literally don't even have an easily read disk change signal, so what they
> do is
I like the idea of a hotplug event on media change (basically, a hotplug
event for partitions). And, in fact, I am loath not to do it.
The current direction with the kernel and udev is letting us move _away_
from polling. Projects such as HAL are helping to finally integrate
hardware management throughout the system. But HAL is going to be very
confused by some of the alternative solutions for partitions: requiring
that all of the partition device nodes preexist is going to really
complicate things, and I really do not want to have to poll on all of
them in order for HAL to have an idea of what partitions are valid.
But I hear you loud and clear about dumb devices that cannot detect
media change. They pose a problem.
I want a proper solution, too... ideas?
Robert Love
On Thu, Jan 08, 2004 at 07:28:47PM -0500, Robert Love wrote:
> On Wed, 2004-01-07 at 22:43, Linus Torvalds wrote:
>
> > Yes. We _could_ do that, by just making a "we noticed the disk change" be
> > a hotplug event. However, I'm loath to do that, because some devices
> > literally don't even have an easily read disk change signal, so what they
> > do is
>
> I like the idea of a hotplug event on media change (basically, a hotplug
> event for partitions). And, in fact, I am loath not to do it.
>
> The current direction with the kernel and udev is letting us move _away_
> from polling. Projects such as HAL are helping to finally integrate
> hardware management throughout the system. But HAL is going to be very
> confused by some of the alternative solutions for partitions: requiring
> that all of the partition device nodes preexist is going to really
> complicate things, and I really do not want to have to poll on all of
> them in order for HAL to have an idea of what partitions are valid.
>
> But I hear you loud and clear about dumb devices that cannot detect
> media change. They pose a problem.
If we have a hotplug media change event we can do:
if new device x appears:
x->device_nodes = all possible partition nodes
start_media_change_timer()
if media change on x:
x->device_nodes = current media partitions
clear_media_change_timer()
create_nodes(x)
if timeout: /* we didn't get media change so create all nodes */
create_nodes(x)
Then the dumb devices (which should be a small minority) just show up
with a harmless excess of partitions.
--
Matt Mackall : http://www.selenic.com : Linux development and consulting
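The pseudocode above can be modelled as a small event-driven state machine. This Python sketch is an assumption-laden illustration: `NodePlanner`, its method names, the 15-partition limit and the 5-second timeout are all invented here, and time is passed in explicitly instead of using real timers or hotplug calls.

```python
from typing import List, Optional


class NodePlanner:
    """Decide which partition nodes to create for one disk."""

    def __init__(self, disk: str, max_partitions: int = 15, timeout: float = 5.0):
        self.disk = disk
        self.timeout = timeout
        # Default plan: every possible partition node.
        self.nodes: List[str] = [f"{disk}{n}" for n in range(1, max_partitions + 1)]
        self.deadline: Optional[float] = None     # pending media-change timer
        self.created: Optional[List[str]] = None  # nodes handed to create_nodes()

    def device_added(self, now: float) -> None:
        """New device appeared: start the media-change timer."""
        self.deadline = now + self.timeout

    def media_changed(self, partitions: List[str]) -> None:
        """Media-change event arrived: create only the real partitions."""
        self.nodes = list(partitions)
        self.deadline = None
        self.created = self.nodes

    def tick(self, now: float) -> None:
        """Timer check: no media-change event in time, so create all nodes."""
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.created = self.nodes
```

A device that reports media changes ends up with exactly its partitions; one that never reports them falls back to the full set once the timeout passes.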
On Thu, 8 Jan 2004, Robert Love wrote:
>
> I like the idea of a hotplug event on media change (basically, a hotplug
> event for partitions). And, in fact, I am loath not to do it.
The thing is, you won't actually get a "media changed" event with almost
any normal hardware.
What you will get is "somebody tried to open the media, and we realized
that it has changed since the last open".
> The current direction with the kernel and udev is letting us move _away_
> from polling.
.. but this isn't actually the case for media change. You'd still not get
the media change when the change _occurs_. You'd only get it much later.
> But I hear you loud and clear about dumb devices that cannot detect
> media change. They pose a problem.
No, even the smart devices pose a problem - but at least for those the
test whether the media has changed is pretty simple.
The really _dumb_ devices just assume the media has changed every time the
device is opened.
Linus
On Wed, Jan 07, 2004 at 10:57:00AM -0800, Greg KH wrote:
> Hm, that would work, but what about a user program that just polls on
> the device, as the rest of this thread discusses? As removable devices
> are not the "norm" it would seem a bit of overkill to create 16
> partitions for every block device, if they need them or not.
Um, adding all 16 partitions for a block device that has 5
defined is opposite of the intention of udev, no? While I'd prefer the
partition code in-kernel provide hotplug events for each partition, if
it is instead scanned by udev, udev should indeed scan the partition
table. Remember, udev should be able to give the appropriate
system-defined names for the partition, not just 'sda1'.
Joel
--
"Gone to plant a weeping willow
On the bank's green edge it will roll, roll, roll.
Sing a lulaby beside the waters.
Lovers come and go, the river roll, roll, rolls."
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: [email protected]
Phone: (650) 506-8127
Joel Becker <[email protected]> writes:
> On Wed, Jan 07, 2004 at 10:57:00AM -0800, Greg KH wrote:
> > Hm, that would work, but what about a user program that just polls on
> > the device, as the rest of this thread discusses? As removable devices
> > are not the "norm" it would seem a bit of overkill to create 16
> > partitions for every block device, if they need them or not.
>
> Um, adding all 16 partitions for a block device that has 5
> defined is opposite of the intention of udev, no?
It shouldn't be _that_ bad.
removable media devices usually tell you that they are removable media
devices (scsi: inquiry data has a bit for that IIRC). If you pass
this up to hotplug it can use that to figure out whether it has a hard
disk (=> just create the existing partitions) or a removable device
(=> create a few more nodes to catch the usual removable media
layouts).
I also think you don't need *all* minors for removable media. I
haven't seen removable media with extended partitions so far. IIRC zip
floppies are using /dev/sda4 and most other ones either /dev/sda1 or
/dev/sda directly, so we likely can catch 99% with just three device
nodes.
Gerd
--
You have a new virus in /var/mail/kraxel
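The heuristic above can be written down in a few lines of Python. `nodes_to_create` is a hypothetical helper; it assumes the caller already knows whether the device reports itself as removable and which partitions the kernel has found.

```python
from typing import List


def nodes_to_create(disk: str, removable: bool, known_partitions: List[int]) -> List[str]:
    """Pick the device nodes to create for one disk.

    Fixed disks get the whole-disk node plus the partitions that exist.
    Removable devices additionally get slots 1 and 4, the usual layouts
    named above, so a later open can trigger the rescan.
    """
    numbers = set(known_partitions)
    if removable:
        numbers |= {1, 4}
    return [disk] + [f"{disk}{n}" for n in sorted(numbers)]
```

For an empty removable drive this yields the three nodes mentioned above: the whole disk, partition 1 and partition 4.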
> On Wed, Jan 07, 2004 at 10:57:00AM -0800, Greg KH wrote:
> > [...] it would seem a bit of overkill
> > to create 16 partitions for every block device, if they need them or not.
>
> Um, adding all 16 partitions for a block device that
> has 5 defined is opposite of the intention of udev, no?
> While I'd prefer the partition code in-kernel provide hotplug
> events for each partition, if it is instead scanned by udev,
> udev should indeed scan the partition table. Remember, udev
> should be able to give the appropriate system-defined names
> for the partition, not just 'sda1'.
>
> Joel
I think the current kernel does provide hotplug events for each partition. The
key problem is, as Linus said, that most hardware will not generate an event when
the media changes. So I just use a crude workaround (like "use a big button"):
I unplug the flash drive and plug it back in whenever I change the media
or insert media into an empty drive. Then udev can remove the old
node and create a new node for the new media.
Matt Mackall wrote :
> Then the dumb devices (which should be a small minority) just show up
> with a harmless excess of partitions.
You are underestimating the hardware manufacturers' ingenuity. Cheap stuff
(camera card readers...) will always be dumb. In fact people have not even
been discussing here how dumb it can get. I happen to own a dual CF/SM
reader (was cheaper than the single SM reader I needed at the time). It
doesn't appear to support media change notification. In fact the reader
chip seems too simple to process both card slots at the same time. The
manufacturer solved this problem by using a mechanical plastic flap that
prevents insertion of a second card when there is already one in the
reader. There is no notification to the OS about which slot is in use. Both
Windows and Linux treat it as a dual reader (even though there can only be
a single card inserted at any point in time), export two drives and do
continuous polling just to find out which slot is in use.
Since the SM card is slot two, every single time I plug in the reader I see
the CF part erroring out before the driver takes a look at the SM one.
Cheers,
--
Nicolas Mailhot
On Fri, Jan 09, 2004 at 10:49:03AM +0100, Gerd Knorr wrote:
> I also think you don't need *all* minors for removable media. I
> haven't seen removable media with extended partitions so far. IIRC zip
> floppies are using /dev/sda4 and most other ones either /dev/sda1 or
> /dev/sda directly, so we likely can catch 99% with just three device
> nodes.
Ahh, but that's magic, and we don't want magic. Today, you just
'magically' know that your camera card reader shows up at sda1. We
don't want that (or at least, I hope we don't). We want sysfs to
describe exactly what appeared (a block device with one partition), and
we want udev to give it the name our policy has asked it to (/dev/disk1
or /dev/camera1 or /dev/partition1 or whatever that policy is). In a
udev world, I don't want to have to intrinsically know that sda1 is where
some card reader devices appear. Naming should be a priori, not random.
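To make the "policy decides the name" idea concrete, here is a minimal sketch. It is not udev's rule syntax: the attribute keys (`kernel`, `model`, `partition`), the rule format and the function `choose_name` are all hypothetical, chosen only to show that the name can be derived from what the device reports rather than from where the kernel happened to enumerate it.

```python
def choose_name(device, policy, fallback="{kernel}"):
    """Return the node name for a device according to a naming policy.

    device: dict of attributes describing what appeared
            (hypothetical keys for this sketch).
    policy: ordered list of (match, template) pairs; the first rule whose
            match dict is a subset of the device attributes wins.
    """
    for match, template in policy:
        if all(device.get(key) == value for key, value in match.items()):
            return template.format(**device)
    # No rule matched: fall back to the kernel's own name.
    return fallback.format(**device)


# Example policy: card reader partitions become camera<N>, everything
# else keeps the kernel name.
EXAMPLE_POLICY = [
    ({"model": "CF/SM Reader"}, "camera{partition}"),
]
```

Under this example policy a device described as `{"kernel": "sda1", "model": "CF/SM Reader", "partition": 1}` would be named `camera1` regardless of whether the kernel called it sda1 or sdc1.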
Joel
--
"Glory is fleeting, but obscurity is forever."
- Napoleon Bonaparte
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: [email protected]
Phone: (650) 506-8127
On 01.12, Joel Becker wrote:
> On Fri, Jan 09, 2004 at 10:49:03AM +0100, Gerd Knorr wrote:
> > I also think you don't need *all* minors for removable media. I
> > haven't seen removable media with extended partitions so far. IIRC zip
> > floppies are using /dev/sda4 and most other ones either /dev/sda1 or
> > /dev/sda directly, so we likely can catch 99% with just three device
> > nodes.
>
> Ahh, but that's magic, and we don't want magic. Today, you just
> 'magically' know that your camera card reader shows up at sda1. We
Or that a standard MacOS hfs+ drive (or usb flash) does not show at sdc9
(as it does, in fact...).
--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandrake Linux release 10.0 (Cooker) for i586
Linux 2.6.1-jam2 (gcc 3.3.2 (Mandrake Linux 10.0 3.3.2-4mdk))
On Wed, Jan 07, Jens Axboe wrote:
> On Wed, Jan 07 2004, Olaf Hering wrote:
> > On Wed, Jan 07, Jens Axboe wrote:
> >
> > > No need to put it in the kernel, user space fits the bill nicely. I don't
> > > see how this would lead to IO errors?
> >
> > Ok, how should it be done on my SCSI and parallel port ZIP? An ATAPI ZIP
> > with 2.4 behaves like that:
> >
> >
> > nectarine:~ # blockdev --rereadpt /dev/hdd
> > /dev/hdd: Input/output error
> > nectarine:~ # dmesg | tail
> > nfs: server Hilbert2 OK
> > nfs: server Hilbert3 not responding, still trying
> > nfs: server Hilbert3 OK
> > nfs: server Hilbert3 not responding, still trying
> > nfs: server Hilbert3 OK
> > nfs: server Hilbert2 not responding, still trying
> > nfs: server Hilbert2 OK
> > ide-floppy: hdd: I/O error, pc = 0, key = 2, asc = 3a, ascq = 0
> > ide-floppy: hdd: I/O error, pc = 1b, key = 2, asc = 3a, ascq = 0
> > hdd: No disk in drive
> > nectarine:~ # cat /proc/ide/hdd/model
> > IOMEGA ZIP 100 ATAPI
>
> Two problems here. First, ide-floppy should not verbosely fail these
> commands (2/3a/00 is 'medium not present'). Second, you are not using
> the proper mechanism to detect media events.
2.6.2-mm2 still prints these warnings. Should the driver handle this
'special' case?
ide-floppy driver 0.99.newide
hdd: No disk in drive
hdd: 98304kB, 96/64/32 CHS, 4096 kBps, 512 sector size, 2941 rpm
ide-floppy: hdd: I/O error, pc = 0, key = 2, asc = 3a, ascq = 0
ide-floppy: hdd: I/O error, pc = 1b, key = 2, asc = 3a, ascq = 0
hdd: No disk in drive
ide-floppy: hdd: I/O error, pc = 0, key = 2, asc = 3a, ascq = 0
ide-floppy: hdd: I/O error, pc = 1b, key = 2, asc = 3a, ascq = 0
hdd: No disk in drive
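For readers decoding the log lines above, here is a small sketch that interprets the `pc`/`key`/`asc`/`ascq` fields. It assumes the driver prints these values in hexadecimal without a prefix, as the `1b` and `3a` in the output suggest. The tables contain only the codes that appear in this thread (sense key 2 is NOT READY, ASC/ASCQ 3a/00 is MEDIUM NOT PRESENT, opcodes 00 and 1b are TEST UNIT READY and START STOP UNIT); the function name `decode` is made up for the example.

```python
import re

SENSE_KEYS = {0x2: "NOT READY"}
ADDITIONAL_SENSE = {(0x3A, 0x00): "MEDIUM NOT PRESENT"}
PACKET_COMMANDS = {0x00: "TEST UNIT READY", 0x1B: "START STOP UNIT"}

# Matches the field list of an ide-floppy error line; values assumed hex.
LINE = re.compile(
    r"pc = (?P<pc>[0-9a-f]+), key = (?P<key>[0-9a-f]+), "
    r"asc = (?P<asc>[0-9a-f]+), ascq = (?P<ascq>[0-9a-f]+)")


def decode(line):
    """Decode one error line, or return None if it has no sense fields."""
    match = LINE.search(line)
    if match is None:
        return None
    pc, key, asc, ascq = (
        int(match.group(name), 16) for name in ("pc", "key", "asc", "ascq"))
    return {
        "command": PACKET_COMMANDS.get(pc, f"opcode 0x{pc:02x}"),
        "sense_key": SENSE_KEYS.get(key, f"sense key 0x{key:x}"),
        "meaning": ADDITIONAL_SENSE.get(
            (asc, ascq), f"asc/ascq 0x{asc:02x}/0x{ascq:02x}"),
        # An empty drive is expected for removable media, so a caller
        # could treat this case as informational rather than an error.
        "media_absent": (key, asc) == (0x2, 0x3A),
    }
```

Both error lines in the log decode to a "medium not present" answer, once for each of the two commands, which is consistent with the "hdd: No disk in drive" message that follows them.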
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG