2003-09-24 20:30:03

by Andries E. Brouwer

Subject: rfc: test whether a device has a partition table

As everyone knows it is a bad idea to let the kernel guess
whether there is a partition table on a given block device,
and if so, of what type.
Nevertheless this is what almost everybody does.

Until now the philosophy was: floppies do not have a partition table,
disks do have one, and for ZIP drives nobody knows. With USB we get
more types of block device that may or may not have a partition table
(and if they have none, usually there is a FAT filesystem with bootsector).
In such cases the kernel assumes a partition table, and creates a mess
if there was none. Some heuristics are needed.

Many checks are possible (for a DOS-type partition table: boot indicator
must be 0 or 0x80, partitions are not larger than the disk,
non-extended partitions are mutually disjoint; for a boot sector:
it starts with a jump, the number of bytes per sector is 512 or
at least a power of two, the number of sectors per cluster is 1
or at least a power of two, the number of reserved sectors is 1 or 32,
the number of FAT copies is 2, ...).
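Two of the partition-table checks listed above (entries fit on the disk, non-extended entries are disjoint) could be sketched as follows. This is only an illustration under assumptions: pt_plausible(), le32() and is_extended() are hypothetical names, the list of extended type bytes (0x05, 0x0f, 0x85) is assumed, and the caller is assumed to know the disk size in sectors.

```c
#include <stdint.h>

/* Read a little-endian 32-bit value from four bytes. */
static uint32_t le32(const unsigned char *b)
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Assumed set of "extended" type bytes; adjust as needed. */
static int is_extended(unsigned char systype)
{
	return systype == 0x05 || systype == 0x0f || systype == 0x85;
}

/*
 * Hypothetical helper, not kernel code.
 * p points at a 512-byte sector, disk_sects is the device size in sectors.
 * Return 1 if the four primary entries pass the checks, 0 otherwise.
 */
int pt_plausible(const unsigned char *p, uint64_t disk_sects)
{
	uint64_t start[4], end[4];
	int check[4];	/* entry takes part in the disjointness test */
	int i, j;

	for (i = 0; i < 4; i++) {
		const unsigned char *e = p + 446 + 16 * i;
		uint64_t len = le32(e + 12);	/* nr_sects */

		/* boot indicator must be 0 or 0x80 */
		if (e[0] != 0 && e[0] != 0x80)
			return 0;

		start[i] = le32(e + 8);		/* start_sect */
		end[i] = start[i] + len;	/* 64-bit sum of 32-bit values */

		/* a used entry must not extend past the end of the disk */
		if (len != 0 && end[i] > disk_sects)
			return 0;

		check[i] = (len != 0 && !is_extended(e[4]));
	}

	/* non-extended partitions must be mutually disjoint */
	for (i = 0; i < 4; i++)
		for (j = i + 1; j < 4; j++)
			if (check[i] && check[j] &&
			    start[i] < end[j] && start[j] < end[i])
				return 0;

	return 1;
}
```

A caller would pass the first sector and the device size, for example pt_plausible(sect, nsectors), and treat a 0 result as "probably not a partition table".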

I tried a minimal test, and the below is good enough for the
boot sectors and DOS-type partition tables that I have here.

So, question: are there people with DOS-type partition tables
or FAT fs bootsectors where the below gives the wrong answer?
I would be interested in a copy of the sector.

I expect to submit some sanity check to DOS-type partition table
parsing, and hope to recognize with high probability the presence
of a full disk FAT filesystem.

Andries

------------ sniffsect.c -----------------

/*
 * Given a block device, does it have a DOS-type partition table?
 * Or does it behave like a floppy and have a whole-disk filesystem?
 * Or is it something else?
 *
 * Return 1 for pt, -1 for boot sect, 0 for unknown.
 */

#include <stdio.h>
#include <stdlib.h>	/* exit */
#include <fcntl.h>	/* open */
#include <unistd.h>	/* read */

/* DOS-type partition */
struct partition {
	unsigned char bootable;		/* 0 or 0x80 */
	unsigned char begin_chs[3];
	unsigned char systype;
	unsigned char end_chs[3];
	unsigned char start_sect[4];
	unsigned char nr_sects[4];
};

int sniffsect(unsigned char *p) {
	struct partition *pt;
	int i, n;
	int maybept = 1;
	int maybebs = 1;

	/* Both DOS-type pt and boot sector have a 55 aa signature */
	if (p[510] != 0x55 || p[511] != 0xaa)
		return 0;

	/* A partition table has boot indicators 0 or 0x80 */
	for (i=0; i<4; i++) {
		pt = (struct partition *)(p + 446 + 16*i);
		if (pt->bootable != 0 && pt->bootable != 0x80)
			maybept = 0;
	}

	/* A boot sector has a power of two as #sectors/cluster */
	n = p[13];
	if (n == 0 || (n & (n-1)) != 0)
		maybebs = 0;

	/* A boot sector has a power of two as #bytes/sector */
	n = (p[12] << 8) + p[11];
	if (n == 0 || (n & (n-1)) != 0)
		maybebs = 0;

	/* If both tests pass, or both fail, the answer is 0 (unknown) */
	return maybept - maybebs;
}

int main(int argc, char **argv) {
	unsigned char sect[512];
	int fd, n;

	if (argc != 2) {
		fprintf(stderr, "Call: sniffsect file\n");
		exit(1);
	}

	fd = open(argv[1], O_RDONLY);
	if (fd == -1) {
		perror(argv[1]);
		fprintf(stderr, "Cannot open %s\n", argv[1]);
		exit(1);
	}

	n = read(fd, sect, sizeof(sect));
	if (n != sizeof(sect)) {
		if (n == -1)
			perror(argv[1]);
		fprintf(stderr, "Cannot read 512 bytes from %s\n", argv[1]);
		exit(1);
	}

	n = sniffsect(sect);
	printf((n == 1) ? "partition table\n" :
	       (n == -1) ? "boot sector\n" : "no idea\n");
	return 0;
}


2003-09-24 21:55:33

by Linus Torvalds

Subject: Re: rfc: test whether a device has a partition table

Andries Brouwer wrote:
>
> As everyone knows it is a bad idea to let the kernel guess
> whether there is a partition table on a given block device,
> and if so, of what type.

So you say, and so you've said for a long time, but claiming that "everybody
knows it" is clearly not true.

In particular, I think that a kernel that doesn't do partitioning is quite
fundamentally broken. I'm sure others will agree.

If you have unusual cases (and let's face it, they don't much happen - we
have traditionally had _very_ few problems with getting things partitioned)
then you should be able to override them from user space and have user space
be able to tell the kernel about special partitions.

And hey, surprise surprise, you can do exactly that.

Also, surprise surprise, pretty much nobody actually does it. Because the
defaults are so sane.

Repeat after me: make the defaults so sane that most people don't even
have to think about it.

In short, I think your first sentence (upon which the rest of the argument
depends) is just quite _fundamentally_ flawed.

Linus

2003-09-24 23:50:47

by Andries Brouwer

Subject: Re: rfc: test whether a device has a partition table

On Wed, Sep 24, 2003 at 02:54:21PM -0700, Linus Torvalds wrote:

> Repeat after me: make the defaults so sane that most people don't even
> have to think about it.
>
> In short, I think your first sentence (upon which the rest of the argument
> depends) is just quite _fundamentally_ flawed.

Ha, Linus - didn't you know I am always right?

But being right in theory - like you say, I have repeated these
things for many years - is not enough to submit a kernel patch.
The post of today was prompted by a mail about
certain USB devices:

> On closer examination it seems to be the partition table
> which is read ok (as one partition) on W2K and XP
> but Linux (both 2.4 and 2.6) gets really confused and
> thinks there are 4 malformed partitions.

and

> Linux probably needs to handle this situation more
> gracefully. A local police force bought a bunch of
> these devices for Linux based forensic work. They
> are a bit disappointed at the moment.

So, now not only theory but also practice is involved, and
we must do something.

My post implicitly suggested the minimal thing to do.
It will not be enough - heuristics are never enough -
but it probably helps in most cases.

I will wait a little for reactions, and hope to send you a patch later.

Andries

2003-09-25 00:05:10

by Al Viro

Subject: Re: rfc: test whether a device has a partition table

On Thu, Sep 25, 2003 at 01:50:41AM +0200, Andries Brouwer wrote:
> > On closer examination it seems to be the partition table
> > which is read ok (as one partition) on W2K and XP
> > but Linux (both 2.4 and 2.6) gets really confused and
> > thinks there are 4 malformed partitions.
>
> and
>
> > Linux probably needs to handle this situation more
> > gracefully. A local police force bought a bunch of
> > these devices for Linux based forensic work. They
> > are a bit disappointed at the moment.
>
> So, now not only theory but also practice is involved, and
> we must do something.

If there *is* a partition table with one entry and it gets misparsed - we
have a real bug that has to be dealt with and your heuristics won't help.
If there is no partition table at all and in fact they have a filesystem
on the entire disk - let them use *entire* *disk*. You can very well
read /dev/sd<letter>, mount it, whatever. Here I do have a SCSI disk
that is not partitioned at all. And guess what? It works with no extra
efforts needed:

Vendor: QUANTUM Model: ATLAS10K3_18_SCA Rev: 020W
Type: Direct-Access ANSI SCSI revision: 03
sym53c1010-33-0-<0,0>: tagged command queue depth set to 4
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
sym53c1010-33-0-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 31)
SCSI device sda: 35916548 512-byte hdwr sectors (18389 MB)
sda: unknown partition table

al@satch:~/kernel/2.5$ mount | grep sda
/dev/sda on /mnt/sda type ext2 (rw)

Note that usb-storage looks like a SCSI host for the rest of kernel, so that's
exactly the same situation - device that is expected to be partitioned but in
reality isn't.

So what precisely are you trying to fix?

2003-09-25 00:18:21

by Linus Torvalds

Subject: Re: rfc: test whether a device has a partition table


On Thu, 25 Sep 2003, Andries Brouwer wrote:
>
> Ha, Linus - didn't you know I am always right?

Yeah, sure. But ...

> But being right in theory - like you say, I have repeated these
> things for many years - is not enough to submit a kernel patch.
> The post of today was prompted by a mail about
> certain USB devices:
>
> > On closer examination it seems to be the partition table
> > which is read ok (as one partition) on W2K and XP
> > but Linux (both 2.4 and 2.6) gets really confused and
> > thinks there are 4 malformed partitions.

So? There's a bug, and we'll fix it.

The _worst_ thing that can happen is that you have four extra (totally
bogus) partitions, and you end up using the whole device.

That's my point about partitioning - not that it's necessarily perfect,
but even when it _isn't_ perfect, it's no worse than not partitioning at
all.

I know you don't want the kernel to partition at all. But I don't see your
point.

> > Linux probably needs to handle this situation more
> > gracefully. A local police force bought a bunch of
> > these devices for Linux based forensic work. They
> > are a bit disappointed at the moment.
>
> So, now not only theory but also practice is involved, and
> we must do something.

Why don't they just read the whole device, if that is what they want to
do?

So we have two cases:
a) we have a bug in the partitioning code, and don't parse the partition
table right:
- let's fix the bug
b) people don't want to read the partition info at all, as it's bogus
- use the whole-device node.

In neither case is your "the kernel shouldn't guess" argument the answer,
as far as I can see. And in both cases you _can_ fix it up in user mode if
you know how, so clearly the kernel was no worse off guessing.

Linus

2003-09-25 00:50:54

by Douglas Gilbert

Subject: Re: rfc: test whether a device has a partition table

Andries Brouwer wrote:
> As everyone knows it is a bad idea to let the kernel guess
> whether there is a partition table on a given block device,
> and if so, of what type.
> Nevertheless this is what almost everybody does.
>
> Until now the philosophy was: floppies do not have a partition table,
> disks do have one, and for ZIP drives nobody knows. With USB we get
> more types of block device that may or may not have a partition table
> (and if they have none, usually there is a FAT filesystem with bootsector).
> In such cases the kernel assumes a partition table, and creates a mess
> if there was none. Some heuristics are needed.
>
> Many checks are possible (for a DOS-type partition table: boot indicator
> must be 0 or 0x80, partitions are not larger than the disk,
> non-extended partitions are mutually disjoint; for a boot sector:
> it starts with a jump, the number of bytes per sector is 512 or
> at least a power of two, the number of sectors per cluster is 1
> or at least a power of two, the number of reserved sectors is 1 or 32,
> the number of FAT copies is 2, ...).
>
> I tried a minimal test, and the below is good enough for the
> boot sectors and DOS-type partition tables that I have here.
>
> So, question: are there people with DOS-type partition tables
> or FAT fs bootsectors where the below gives the wrong answer?
> I would be interested in a copy of the sector.
>
> I expect to submit some sanity check to DOS-type partition table
> parsing, and hope to recognize with high probability the presence
> of a full disk FAT filesystem.
>
> Andries
>
> ------------ sniffsect.c -----------------
>
> /*
> * Given a block device, does it have a DOS-type partition table?
> * Or does it behave like a floppy and have a whole-disk filesystem?
> * Or is it something else?
> *
> * Return 1 for pt, -1 for boot sect, 0 for unknown.
> */
>
> #include <stdio.h>
> #include <fcntl.h>
>
> /* DOS-type partition */
> struct partition {
> unsigned char bootable; /* 0 or 0x80 */
> unsigned char begin_chs[3];
> unsigned char systype;
> unsigned char end_chs[3];
> unsigned char start_sect[4];
> unsigned char nr_sects[4];
> };
>
> int sniffsect(unsigned char *p) {
> struct partition *pt;
> int i, n;
> int maybept = 1;
> int maybebs = 1;
>
> /* Both DOS-type pt and boot sector have a 55 aa signature */
> if (p[510] != 0x55 || p[511] != 0xaa)
> return 0;
>
> /* A partition table has boot indicators 0 or 0x80 */
> for (i=0; i<4; i++) {
> pt = (struct partition *)(p + 446 + 16*i);
> if (pt->bootable != 0 && pt->bootable != 0x80)
> maybept = 0;
> }
>
> /* A boot sector has a power of two as #sectors/cluster */
> n = p[13];
> if (n == 0 || (n & (n-1)) != 0)
> maybebs = 0;
>
> /* A boot sector has a power of two as #bytes/sector */
> n = (p[12] << 8) + p[11];
> if (n == 0 || (n & (n-1)) != 0)
> maybebs = 0;
>
> return maybept - maybebs;
> }
>
> int main(int argc, char **argv) {
> unsigned char sect[512];
> int fd, n;
>
> if (argc != 2) {
> fprintf(stderr, "Call: sniffsect file\n");
> exit(1);
> }
>
> fd = open(argv[1], O_RDONLY);
> if (fd == -1) {
> perror(argv[1]);
> fprintf(stderr, "Cannot open %s\n", argv[1]);
> exit(1);
> }
>
> n = read(fd, sect, sizeof(sect));
> if (n != sizeof(sect)) {
> if (n == -1)
> perror(argv[1]);
> fprintf(stderr, "Cannot read 512 bytes from %s\n", argv[1]);
> exit(1);
> }
>
> n = sniffsect(sect);
> printf((n == 1) ? "partition table\n" :
> (n == -1) ? "boot sector\n" : "no idea\n");
> return 0;
> }
>

I have a 500 MB USB key that confuses linux (both 2.4 and
2.6) since it has no partition table. It shows up on my laptop as:

$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: Prolific Model: USBFlashDisk Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02

I can mount it with:
$ mount /dev/sda /mnt/extra
$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda3             14108400   9458612   3933100  71% /
none                    192256         0    192256   0% /dev/shm
/dev/sda                511856    103328    408528  21% /mnt/extra

sniffsect correctly identifies the difference between the USB
"floppy" and my main disk:
$ ./sniffsect /dev/sda
boot sector
$ ./sniffsect /dev/hda
partition table

Doug Gilbert


2003-09-25 01:00:55

by Al Viro

Subject: Re: rfc: test whether a device has a partition table

On Thu, Sep 25, 2003 at 10:42:44AM +1000, Douglas Gilbert wrote:
> I have a 500 MB USB key that confuses linux (both 2.4 and
> 2.6) since it has no partition table. It shows up on my laptop as:

Confuses it in which sense?

> $ cat /proc/scsi/scsi
> Attached devices:
> Host: scsi0 Channel: 00 Id: 00 Lun: 00
> Vendor: Prolific Model: USBFlashDisk Rev: 1.00
> Type: Direct-Access ANSI SCSI revision: 02
>
> I can mount it with:
> $ mount /dev/sda /mnt/extra

So it works fine. What's the problem?

2003-09-25 01:28:46

by Douglas Gilbert

Subject: Re: rfc: test whether a device has a partition table


Disk /dev/sda: 524 MB, 524288000 bytes
17 heads, 59 sectors/track, 1020 cylinders
Units = cylinders of 1003 * 512 = 513536 bytes

Device Boot Start End Blocks Id System
/dev/sda1 ? 1914209 2457017 272218546+ 20 Unknown
Partition 1 has different physical/logical beginnings (non-Linux?):
phys=(356, 97, 46) logical=(1914208, 5, 40)
Partition 1 has different physical/logical endings:
phys=(357, 116, 40) logical=(2457016, 16, 59)
Partition 1 does not end on cylinder boundary.
/dev/sda2 ? 1326206 1863570 269488144 6b Unknown
Partition 2 has different physical/logical beginnings (non-Linux?):
phys=(288, 110, 57) logical=(1326205, 9, 57)
Partition 2 has different physical/logical endings:
phys=(269, 101, 57) logical=(1863569, 13, 16)
Partition 2 does not end on cylinder boundary.
/dev/sda3 ? 537378 1931558 699181456 53 OnTrack DM6 Aux3
Partition 3 has different physical/logical beginnings (non-Linux?):
phys=(345, 32, 19) logical=(537377, 4, 25)
Partition 3 has different physical/logical endings:
phys=(324, 77, 19) logical=(1931557, 10, 42)
Partition 3 does not end on cylinder boundary.
/dev/sda4 * 1390457 1390478 10668+ 49 Unknown
Partition 4 has different physical/logical beginnings (non-Linux?):
phys=(87, 1, 0) logical=(1390456, 5, 1)
Partition 4 has different physical/logical endings:
phys=(335, 78, 2) logical=(1390477, 9, 38)
Partition 4 does not end on cylinder boundary.

Partition table entries are not in disk order


Attachments:
prolific_sect0.img.gz (525.00 B)
fdisk.txt (1.50 kB)

2003-09-25 04:47:06

by Linus Torvalds

Subject: Re: rfc: test whether a device has a partition table


On Thu, 25 Sep 2003, Andries Brouwer wrote:
>
> My post implicitly suggested the minimal thing to do.
> It will not be enough - heuristics are never enough -
> but it probably helps in most cases.

I don't mind the 0x00/0x80 "boot flag" checks - those look fairly obvious
and look reasonably safe to add to the partitioning code.

There are other checks that can be done - verifying that the start/end
sector values are at all sensible. We do _some_ of that, but only for
partitions 3 and 4, for example. We could do more - like checking the
actual sector numbers (but I think some formatters leave them as zero).

Which actually makes me really nervous - it implies that we've probably
seen partitions 1&2 contain garbage there, and the problem is that if
you're too careful in checking, you will make a system unusable.

This is why it is so much nicer to be overly permissive rather than being
a stickler for having all the values right.

And your random byte checks for power-of-2 make no sense. What are they
based on?

Linus

2003-09-25 04:56:36

by Linus Torvalds

Subject: Re: rfc: test whether a device has a partition table


On Wed, 24 Sep 2003, Linus Torvalds wrote:
>
> And your random byte checks for power-of-2 make no sense. What are they
> based on?

Oh, I found the regular DOS bootsector layout.

The thing is, that's FAT-specific. The BIOS doesn't care, and the old
Linux boot-sector stuff never had that, for example. It has the 0xAA55
flag, and that makes it bootable.

I bet the same is true of other bootsectors too, that either didn't know
about the FAT version, or just needed the space for better things and knew
the BIOS didn't care. And some of them might easily have powers-of-two
values in those magic bytes.

Linus

2003-09-25 06:55:15

by Xavier Bestel

Subject: Re: rfc: test whether a device has a partition table

On Thu 25/09/2003 at 02:18, Linus Torvalds wrote:

> The _worst_ thing that can happen is that you have four extra (totally
> bogus) partitions, and you end up using the whole device.

That means that hotplug/automount will have to re-parse the partition
table itself to mount only the real partitions. So the job is done
twice, in-kernel and in userspace.

Have no doubts that *real* users (like the police force mentioned by
Andries) will let their system automount their USB disks, they'll never
figure out which devices look bogus (dev/sd what ?!?) and which one to
mount.

If the partition discovery and validity check is done in userspace, why
still do it in-kernel ?

Xav

2003-09-25 10:57:26

by Andries Brouwer

Subject: Re: rfc: test whether a device has a partition table

On Wed, Sep 24, 2003 at 05:18:15PM -0700, Linus Torvalds wrote:

> So? There's a bug, and we'll fix it.

Yes - that is what I was in the process of doing.
Things go wrong, we add a few heuristics and they'll
work right again, most of the time.

> I know you don't want the kernel to partition at all.
> But I don't see your point.

I did not want to start this particular discussion.

But now that you bring it up, let me say the usual things.
Probably there is no need to answer - there are no new
insights or new proposals here.

Letting mount or the kernel guess the type of the filesystem to mount
is bad. If the kernel or mount guesses wrong the result can be fs
corruption and kernel crash. So the right approach is to always
give a -t option to mount and a rootfstype= boot option to the kernel.

But most people don't, and survive. And I maintain mount and
over time a system of heuristics has been built into mount
to make it rather likely that a guess will be correct.

The partition situation is similar but a bit worse.
We have the second half: likely guesses,
but we lack the first half: correctness with certainty.

What probably will happen as a result of this episode is
that the likelihood of certain guesses is improved a bit.
But I wouldn't mind the option of having certainty
instead of probability. Userspace that tells the kernel,
instead of letting the kernel probe.

Andries

2003-09-25 11:42:24

by Andries Brouwer

Subject: Re: rfc: test whether a device has a partition table

On Wed, Sep 24, 2003 at 09:47:01PM -0700, Linus Torvalds wrote:

> On Thu, 25 Sep 2003, Andries Brouwer wrote:

> > My post implicitly suggested the minimal thing to do.
> > It will not be enough - heuristics are never enough -
> > but it probably helps in most cases.
>
> I don't mind the 0x00/0x80 "boot flag" checks - those look fairly obvious
> and look reasonably safe to add to the partitioning code.
>
> There are other checks that can be done - verifying that the start/end
> sector values are at all sensible. We do _some_ of that, but only for
> partitions 3 and 4, for example. We could do more - like checking the
> actual sector numbers (but I think some formatters leave them as zero).
>
> Which actually makes me really nervous - it implies that we've probably
> seen partitions 1&2 contain garbage there, and the problem is that if
> you're too careful in checking, you will make a system unusable.

No and yes.

Note that all checks that are there today are mine.
No, the missing check on 1&2 does not mean that there may be garbage there,
it was just the other way around. In a chain of logical partitions inside
an extended partition almost always only slots 1 and 2 are used, and
slots 3 and 4 are zeroed out. But it happens that slots 3 and 4 are used,
so we want to look at them. But sometimes slots 3 and 4 contain complete
garbage, so we trust them much less than slots 1 and 2, and accept them only
when everything really looks right.
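The idea of trusting slots 3 and 4 less can be illustrated roughly as below. This is not the kernel's actual code: the function name accept_logical_slot(), its parameters and the exact bounds test are assumptions chosen for illustration. Slots are numbered 1 to 4 as in the text, and all values are absolute sector numbers or sector counts.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decide whether to accept an entry found in a
 * link of the logical partition chain.
 *
 * slot       1..4, position of the entry in the table
 * start,size location of the candidate partition
 * ext_start, ext_size  location of the enclosing extended partition
 *
 * Returns 1 to accept, 0 to ignore.
 */
int accept_logical_slot(int slot, uint64_t start, uint64_t size,
			uint64_t ext_start, uint64_t ext_size)
{
	/* an empty slot is never a partition */
	if (size == 0)
		return 0;

	/* slots 1 and 2 are the normal case: accept without further tests */
	if (slot <= 2)
		return 1;

	/* slots 3 and 4: accept only if fully inside the extended partition */
	if (start < ext_start)
		return 0;
	if (size > ext_size)
		return 0;
	if (start - ext_start > ext_size - size)
		return 0;

	return 1;
}
```

For an extended partition covering sectors 100..149, a slot 3 entry at sector 110 of length 20 would be accepted, while one at sector 140 of length 20 would be ignored because it runs past the end.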

It is possible to add more checks, and each time there was reason to do so
we added the minimal check.

> And your random byte checks for power-of-2 make no sense. What are they
> based on?

First you say that they make no sense and then you ask why they make sense?
You might as well just ask.

I don't know whether you want a general or a technical answer.
Let us try the technical one. A FAT bootsector has in bytes 11-12
a little-endian short that gives the number of bytes per sector.
It is almost always 512, but also 1024, 2048, 4096 occur.
A FAT bootsector has in byte 13 the number of sectors per cluster.
It is usually 1, but also 2, 4, 8, 16, 32, 64, 128 occur.

Thus, it is a reasonable test to check these three bytes and
require two powers of two. If that fails, then we do not have
a FAT bootsector (of a type I have ever seen).
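As a small worked sketch of the decoding described above: the struct and function names here are hypothetical, and restricting the values to the ranges 512..4096 and 1..128 is an assumption based on the values listed, stricter than the plain power-of-two test in sniffsect.c.

```c
/* Hypothetical container for the two fields discussed above. */
struct fat_geometry {
	unsigned int bytes_per_sector;		/* bytes 11-12, little-endian */
	unsigned int sectors_per_cluster;	/* byte 13 */
};

static int is_pow2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Decode the three bytes from a 512-byte sector at p. */
struct fat_geometry fat_geometry_read(const unsigned char *p)
{
	struct fat_geometry g;

	g.bytes_per_sector = p[11] | (p[12] << 8);
	g.sectors_per_cluster = p[13];
	return g;
}

/*
 * Return 1 if both values are powers of two within the ranges
 * mentioned above (512..4096 and 1..128), 0 otherwise.
 */
int fat_geometry_usual(struct fat_geometry g)
{
	return is_pow2(g.bytes_per_sector) &&
	       g.bytes_per_sector >= 512 && g.bytes_per_sector <= 4096 &&
	       is_pow2(g.sectors_per_cluster) &&
	       g.sectors_per_cluster <= 128;
}
```

For example, the bytes 00 02 at offsets 11-12 decode to 0x0200 = 512 bytes per sector, and a byte 08 at offset 13 means 8 sectors per cluster, i.e. 4096-byte clusters.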

Andries

2003-09-25 12:14:47

by Andries Brouwer

Subject: Re: rfc: test whether a device has a partition table

On Thu, Sep 25, 2003 at 01:05:03AM +0100, Al Viro wrote:

> If there is no partition table at all and in fact they have a filesystem
> on the entire disk - let them use *entire* *disk*. You can very well
> read /dev/sd<letter>, mount it, whatever. Here I do have a SCSI disk
> that is not partitioned at all. And guess what? It works with no extra
> efforts needed:
>
> SCSI device sda: 35916548 512-byte hdwr sectors (18389 MB)
> sda: unknown partition table
>
> al@satch:~/kernel/2.5$ mount | grep sda
> /dev/sda on /mnt/sda type ext2 (rw)
>
> Note that usb-storage looks like a SCSI host for the rest of kernel, so that's
> exactly the same situation - device that is expected to be partitioned but in
> reality isn't.
>
> So what precisely are you trying to fix?

You forget two things.

First, if the kernel comes up with a bogus partition table, this
will confuse users (and user space) greatly. It is not harmless,
even though you would know how to survive.

Second, if the kernel reads random stuff from flash media, that may yield
I/O errors. Such media often do not have blocks at a fixed place, but
have at the start a table that says where on the media a given block lives.
Blocks that have never been written do not occur in the table, and attempts
to read them give an I/O error. (And our famous SCSI error handling may
want to retry a few times, reset the device and retry, reset the bus
and retry .. I have seen boot times of a quarter of an hour because
the kernel was busy retrying SmartMedia accesses.)
In short - we should not read random blocks from a disk on flash media.

Andries