2000-12-01 16:39:48

by Roberto Ragusa

[permalink] [raw]
Subject: kernel panic in SoftwareRAID autodetection

Please CC to me because I'm not a LKML subscriber.

Hi,

I found a real showstopper problem in the SoftwareRAID autodetect
code; 2.4.0-test10 and 2.4.0-test11 are affected (I didn't test
previous versions).

I'm using two IDE disk with some RAIDed partitions:

md5 : active raid0 hdc5[1] hda5[0] 2056064 blocks 32k chunks
md6 : active raid0 hdc6[1] hda6[0] 1027968 blocks 32k chunks
md7 : active raid0 hdc7[1] hda7[0] 634752 blocks 32k chunks
md8 : active raid0 hdc8[1] hda8[0] 1027968 blocks 32k chunks
md9 : active raid0 hdc9[1] hda9[0] 2237568 blocks 32k chunks
md10 : active raid0 hdc10[1] hda10[0] 60288 blocks 32k chunks
md11 : active linear hdc13[3] hdc11[1] hda13[2] hda11[0] 422912 blocks 32k rounding

that is 6 of them are RAID0 and 1 is RAIDlinear.

I compiled the 2.4.0-test11 kernel on a RH7.0 installation (using kgcc),
with the following options:

CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID5=m
# CONFIG_MD_BOOT is not set
CONFIG_AUTODETECT_RAID=y
# CONFIG_BLK_DEV_LVM is not set
# CONFIG_LVM_PROC_FS is not set

but when I try to boot the kernel panics during the RAID autodetect phase
(without autodetection it seems OK). Tried many times, not a random problem.
So I used "console=ttyS0" to grab the output and I found it is booting
without errors (!). Tried many times.
I had to hand copy the text from the screen to this mail.

This is what I see when everything is OK (serial console grab):

Linux version 2.4.0-test11 ([email protected]) (gcc version egcs-2.91.66 19990314
/Linux (egcs-1.1.2 release)) #1 Fri Dec 1 12:27:42 CET 2000
[...]
Partition check:
hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 hda8 hda9 hda10 hda11 hda12 hda13 hda
14 >
hdc: hdc1 hdc2 hdc3 hdc4 < hdc5 hdc6 hdc7 hdc8 hdc9 hdc10 hdc11 hdc12 hdc13 hdc
14 >
hdd: hdd1 hdd2 hdd3 hdd4
[...]
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
linear personality registered
raid0 personality registered
raid1 personality registered
md.c: sizeof(mdp_super_t) = 4096
autodetecting RAID arrays
(read) hda5's sb offset: 1028032 [events: 00000ace]
(read) hda6's sb offset: 513984 [events: 00000aca]
(read) hda7's sb offset: 317376 [events: 00000aca]
(read) hda8's sb offset: 513984 [events: 00000aca]
(read) hda9's sb offset: 1118784 [events: 00000096]
(read) hda10's sb offset: 30144 [events: 00000096]
(read) hda11's sb offset: 105728 [events: 000000a4]
(read) hda13's sb offset: 105728 [events: 000000a4]
(read) hdc5's sb offset: 1028032 [events: 00000ace]
(read) hdc6's sb offset: 513984 [events: 00000aca]
(read) hdc7's sb offset: 317376 [events: 00000aca]
(read) hdc8's sb offset: 513984 [events: 00000aca]
(read) hdc9's sb offset: 1118784 [events: 00000096]
(read) hdc10's sb offset: 30144 [events: 00000096]
(read) hdc11's sb offset: 105728 [events: 000000a4]
(read) hdc13's sb offset: 105728 [events: 000000a4]
autorun ...
considering hdc13 ...
adding hdc13 ...
adding hdc11 ...
adding hda13 ...
adding hda11 ...
created md11
bind<hda11,1>
md11: WARNING: hda13 appears to be on the same physical disk as hda11. True
protection against single-disk failure might be compromised.
bind<hda13,2>
bind<hdc11,3>
md11: WARNING: hdc13 appears to be on the same physical disk as hdc11. True
protection against single-disk failure might be compromised.
bind<hdc13,4>
running: <hdc13><hdc11><hda13><hda11>
now!
hdc13's event counter: 000000a4
hdc11's event counter: 000000a4
hda13's event counter: 000000a4
hda11's event counter: 000000a4
md11: max total readahead window set to 124k
md11: 1 data-disks, max readahead per data-disk: 124k
md: updating md11 RAID superblock on device
hdc13 [events: 000000a5](write) hdc13's sb offset: 105728
hdc11 [events: 000000a5](write) hdc11's sb offset: 105728
hda13 [events: 000000a5](write) hda13's sb offset: 105728
hda11 [events: 000000a5](write) hda11's sb offset: 105728
.
considering hdc10 ...
adding hdc10 ...
adding hda10 ...
created md10
bind<hda10,1>
bind<hdc10,2>
running: <hdc10><hda10>
now!
hdc10's event counter: 00000096
hda10's event counter: 00000096
md10: max total readahead window set to 496k
md10: 2 data-disks, max readahead per data-disk: 248k
raid0: looking at hda10
raid0: comparing hda10(30144) with hda10(30144)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at hdc10
raid0: comparing hdc10(30144) with hda10(30144)
raid0: EQUAL
raid0: FINAL 1 zones
zone 0
checking hda10 ... contained as device 0
(30144) is smallest!.
checking hdc10 ... contained as device 1
zone->nb_dev: 2, size: 60288
current zone offset: 30144
done.
raid0 : md_size is 60288 blocks.
raid0 : conf->smallest->size is 60288 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
md: updating md10 RAID superblock on device
hdc10 [events: 00000097](write) hdc10's sb offset: 30144
hda10 [events: 00000097](write) hda10's sb offset: 30144
.
considering hdc9 ...
[...]

When I don't use the serial console, only the first RAID is well
detected; I get something different: (hand copied)

md: updating md11 RAID superblock on device
hdc13 [events: 000000a7](write) hdc13's sb offset: 105728
hdc11 [events: 000000a7](write) hdc11's sb offset: 105728
hda13 [events: 000000a7](write) hda13's sb offset: 105728
hda11 [events: 000000a7](write) hda11's sb offset: 105728
.
considering hdc10 ...
Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c0127da2
*pde = 000000
Oops: 0002
CPU: 0
EIP: 0010:[<c0127da2>]
EFLAGS: 00010093
eax: 00000000 ebx: cbf4a1e0 ecx: cbff8f40 edx: 00000001
esi: cbffc638 edi: 00000286 ebp: 00000000 esp: c133df6c
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c133d000)
Stack: cbf1c000 00000001 cbf4b080 cbf4b000 00001000 000bf1c0 c01af03f cbf1c000
cbf27090 cbf27180 c133dfc4 cbf27180 c01b08b1 cbf23000 cbf23000 0000160d
00000010 c0105000 0008e000 00160001 cbf27480 c133dfc4 c133dfc4 c133dfc4
Call Trace: [<c01af03f>] [<c01b08b1>] [<c0105000>] [<c0107007>] [<c01089ff>]
Code: 89 18 8b 46 04 89 4e 04 89 31 89 41 04 89 08 eb 27 8b 43 08
kernel panic: Attempted to kill init!

Ksymoops says:

>>EIP; c0127da2 <kfree+7e/c0> <=====
Trace; c01af03f <sb_equal+7b/94>
Trace; c01b08b1 <autorun_devices+85/230>
Trace; c0105000 <empty_bad_page+0/1000>
Trace; c0107007 <init+7/150>
Trace; c01089ff <kernel_thread+23/30>
Code; c0127da2 <kfree+7e/c0> 00000000 <_EIP>:
Code; c0127da2 <kfree+7e/c0> 0: 89 18 mov %ebx,(%eax) <=====
Code; c0127da4 <kfree+80/c0> 2: 8b 46 04 mov 0x4(%esi),%eax
Code; c0127da7 <kfree+83/c0> 5: 89 4e 04 mov %ecx,0x4(%esi)
Code; c0127daa <kfree+86/c0> 8: 89 31 mov %esi,(%ecx)
Code; c0127dac <kfree+88/c0> a: 89 41 04 mov %eax,0x4(%ecx)
Code; c0127daf <kfree+8b/c0> d: 89 08 mov %ecx,(%eax)
Code; c0127db1 <kfree+8d/c0> f: eb 27 jmp 38 <_EIP+0x38> c0127dda <kfree+b6/c0>
Code; c0127db3 <kfree+8f/c0> 11: 8b 43 08 mov 0x8(%ebx),%eax

kernel panic: Attempted to kill init!

I hope this is enough to track down the problem.
Please CC to me because I'm not a LKML subscriber.

--
Roberto Ragusa robertoragusa at technologist.com


2000-12-01 20:18:58

by NeilBrown

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

On Friday December 1, [email protected] wrote:
> Please CC to me because I'm not a LKML subscriber.
>
> Hi,
>
> I found a real showstopper problem in the SoftwareRAID autodetect
> code; 2.4.0-test10 and 2.4.0-test11 are affected (I didn't test
> previous versions).

Fixed in 2.4.0-test12pre3.
If you are a raid users, it might be worth while subscribing to
[email protected]. Then you probably would have known
already....

NeilBrown

2000-12-04 22:06:22

by Roberto Ragusa

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

On 01-Dec-00, Neil Brown wrote:
> On Friday December 1, [email protected] wrote:
>> I found a real showstopper problem in the SoftwareRAID autodetect
>> code; 2.4.0-test10 and 2.4.0-test11 are affected (I didn't test
>> previous versions).
[detailed report]
>
> Fixed in 2.4.0-test12pre3.

I tried 2.4.0-test12pre3.
The problem is *not* fixed: kernel panic again.

Please CC to me because I'm not a LKML subscriber.

--
Roberto Ragusa robertoragusa at technologist.com

2000-12-05 22:31:16

by NeilBrown

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

On Monday December 4, [email protected] wrote:
> On 01-Dec-00, Neil Brown wrote:
> > On Friday December 1, [email protected] wrote:
> >> I found a real showstopper problem in the SoftwareRAID autodetect
> >> code; 2.4.0-test10 and 2.4.0-test11 are affected (I didn't test
> >> previous versions).
> [detailed report]
> >
> > Fixed in 2.4.0-test12pre3.
>
> I tried 2.4.0-test12pre3.
> The problem is *not* fixed: kernel panic again.

My apologies. There was a "oops in SoftwareRAID autodetect" in test10
and test11 that was fixed in test12pre3, and I just assumed that your's
was the same, and didn't look at it properly.

On looking again, your problem is definately different and quite
weird.

The fact that it works fine with "console=ttyS0" but doesn't without
is very suspicious. It suggests to me that there is some odd
interaction going on, and that the problem may well have nothing
directly to do with RAID.

I will try to reproduce it myself, but I suspect that there is at
least an even chance that I won't be able to. If small changes like
"console=ttyS0" make it go away then some other small difference in
my setup could equally change the result.

The code which was Oopsing was in kfree, and this seems to compile
very differently for SMP than for UP. I think that you were compiling
for UP - is that correct? Could you try compiling with SMP support
and see if that makes a difference?

...... however, I went back and poured over the code for a little
while, and I think I found something. linear.c may over-run a
kmalloced buffer, which could product exactly what you are getting.
The following patch isn't *correct*, but if it makes a difference for
you, then it means that we have found the problem.. please let me
know.


--- drivers/md/linear.c 2000/12/03 22:14:54 1.3
+++ drivers/md/linear.c 2000/12/05 21:57:42
@@ -98,7 +98,7 @@
table++;
}
}
- table->dev1 = NULL;
+/* table->dev1 = NULL; */

return 0;


>
> Please CC to me because I'm not a LKML subscriber.

Ofcourse. I think it is common courtesy to reply to the author, and
cc to the list if appropriate.
I would prefer it if, when replying to me, you send the reply
explicitly to me as well as to the list, as that way I get to see it
sooner (I only read mail to linux-kernel every few days, whereas mail
explicitly addressed to me get a much higher priority).

NeilBrown

>
> --
> Roberto Ragusa robertoragusa at technologist.com
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/

2000-12-06 00:46:14

by Roberto Ragusa

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

On 06-Dec-00, Neil Brown wrote:

> The following patch isn't *correct*, but if it makes a difference for
> you, then it means that we have found the problem.. please let me
> know.
[one line patch]

Yes, it makes a difference :-) . The boot doesn't fail anymore and
all the RAID partitions are correctly detected.

BTW, here is a little patch regarding a silly problem I found
about RAID partitions naming (/proc/partitions).
No more "md8" "md9" "md:" "md;" ... but "md8" "md9" "md10" "md11" ...
Well, this patch should work up to "md99".

--- fs/partitions/check.c Thu Nov 2 12:22:50 2000
+++ fs/partitions/check.c Wed Dec 6 00:34:46 2000
@@ -121,10 +121,17 @@
unit += 2;
case IDE0_MAJOR:
maj = "hd";
- break;
- case MD_MAJOR:
- unit -= 'a'-'0';
- break;
+ }
+ if (hd->major == MD_MAJOR) {
+ unit -= 'a';
+ if (unit<10) {
+ sprintf(buf, "%s%c", maj, '0' + unit);
+ return buf;
+ }
+ else {
+ sprintf(buf, "%s%c%c", maj, '0' + unit / 10, '0' + unit % 10);
+ return buf;
+ }
}
if (hd->major >= SCSI_DISK1_MAJOR && hd->major <= SCSI_DISK7_MAJOR) {
unit = unit + (hd->major - SCSI_DISK1_MAJOR + 1) * 16;


For 2.2.x kernels, the file to be patched is drivers/block/genhd.c.

>> Please CC to me because I'm not a LKML subscriber.
>
> Ofcourse. I think it is common courtesy to reply to the author, and
> cc to the list if appropriate.

OK.

--
Roberto Ragusa robertoragusa at technologist.com

2000-12-06 01:08:49

by Peter Samuelson

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection


[Roberto Ragusa]
> BTW, here is a little patch regarding a silly problem I found
> about RAID partitions naming (/proc/partitions).
> No more "md8" "md9" "md:" "md;" ... but "md8" "md9" "md10" "md11" ...
> Well, this patch should work up to "md99".

This stuff *really* should be split out into the drivers. Brian Kress
had a patch against test11 for this. Brian? You want to fold in this
fix?

> + if (unit<10) {
> + sprintf(buf, "%s%c", maj, '0' + unit);
> + return buf;
> + }
> + else {
> + sprintf(buf, "%s%c%c", maj, '0' + unit / 10, '0' + unit % 10);
> + return buf;
> + }

Errrm, that's ugly. (: Use %d, unless I'm missing something:

sprintf(buf, "%s%d", maj, unit);

Peter

2000-12-06 13:49:21

by Brian Kress

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

Peter Samuelson wrote:
>
> [Roberto Ragusa]
> > BTW, here is a little patch regarding a silly problem I found
> > about RAID partitions naming (/proc/partitions).
> > No more "md8" "md9" "md:" "md;" ... but "md8" "md9" "md10" "md11" ...
> > Well, this patch should work up to "md99".
>
> This stuff *really* should be split out into the drivers. Brian Kress
> had a patch against test11 for this. Brian? You want to fold in this
> fix?

Sure. I got resounding silence to posting the patch last time,
so I'm not sure if anyone actually wants this patch, but here it is
again with this fix (the non ugly version) folded in.


Brian


--------------------------- Patch
---------------------------------------

diff -u --recursive linux-2.4.0-test11/drivers/block/DAC960.c
linux-2.4.0-test11-ppfix/drivers/block/DAC960.c
--- linux-2.4.0-test11/drivers/block/DAC960.c Mon Nov 20 15:17:25 2000
+++ linux-2.4.0-test11-ppfix/drivers/block/DAC960.c Thu Nov 23 14:29:37
2000
@@ -1885,6 +1885,21 @@
}


+char *DAC960_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int ctlr = hd->major - DAC960_MAJOR;
+ int disk = minor >> hd->minor_shift;
+ int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (part)
+ sprintf(buf, "%s/c%dd%dp%d", hd->major_name, ctlr, disk, part);
+ else
+ sprintf(buf, "%s/c%dd%d", hd->major_name, ctlr, disk);
+ return buf;
+}
+
+
/*
DAC960_RegisterBlockDevice registers the Block Device structures
associated with Controller.
@@ -1945,6 +1960,7 @@
Controller->GenericDiskInfo.nr_real = Controller->LogicalDriveCount;
Controller->GenericDiskInfo.next = NULL;
Controller->GenericDiskInfo.fops = &DAC960_BlockDeviceOperations;
+ Controller->GenericDiskInfo.hd_name = DAC960_disk_name;
/*
Install the Generic Disk Information structure at the end of the
list.
*/
diff -u --recursive linux-2.4.0-test11/drivers/block/cciss.c
linux-2.4.0-test11-ppfix/drivers/block/cciss.c
--- linux-2.4.0-test11/drivers/block/cciss.c Mon Nov 20 15:17:25 2000
+++ linux-2.4.0-test11-ppfix/drivers/block/cciss.c Thu Nov 23 14:22:55
2000
@@ -1749,6 +1749,20 @@
kfree(size_buff);
}

+char *cciss_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int ctlr = hd->major - COMPAQ_CISS_MAJOR;
+ int disk = minor >> hd->minor_shift;
+ int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (part)
+ sprintf(buf, "%s/c%dd%dp%d", hd->major_name, ctlr, disk, part);
+ else
+ sprintf(buf, "%s/c%dd%d", hd->major_name, ctlr, disk);
+ return buf;
+}
+
/*
* This is it. Find all the controllers and register them. I really
hate
* stealing all these major device numbers.
@@ -1851,6 +1865,7 @@
hba[i]->gendisk.part = hba[i]->hd;
hba[i]->gendisk.sizes = hba[i]->sizes;
hba[i]->gendisk.nr_real = hba[i]->num_luns;
+ hba[i]->gendisk.hd_name = cciss_disk_name;

/* Get on the disk list */
hba[i]->gendisk.next = gendisk_head;
diff -u --recursive linux-2.4.0-test11/drivers/block/cpqarray.c
linux-2.4.0-test11-ppfix/drivers/block/cpqarray.c
--- linux-2.4.0-test11/drivers/block/cpqarray.c Mon Nov 20 15:20:29 2000
+++ linux-2.4.0-test11-ppfix/drivers/block/cpqarray.c Thu Nov 23
14:14:03 2000
@@ -362,6 +362,20 @@
}
#endif /* MODULE */

+char *ida_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int ctlr = hd->major - COMPAQ_SMART2_MAJOR;
+ int disk = minor >> hd->minor_shift;
+ int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (part)
+ sprintf(buf, "%s/c%dd%dp%d", hd->major_name, ctlr, disk, part);
+ else
+ sprintf(buf, "%s/c%dd%d", hd->major_name, ctlr, disk);
+ return buf;
+}
+
/*
* This is it. Find all the controllers and register them. I really
hate
* stealing all these major device numbers.
@@ -512,6 +526,7 @@
ida_gendisk[i].part = ida + (i*256);
ida_gendisk[i].sizes = ida_sizes + (i*256);
ida_gendisk[i].nr_real = 0;
+ ida_gendisk[i].hd_name = ida_disk_name;

/* Get on the disk list */
ida_gendisk[i].next = gendisk_head;
diff -u --recursive linux-2.4.0-test11/drivers/ide/ide-probe.c
linux-2.4.0-test11-ppfix/drivers/ide/ide-probe.c
--- linux-2.4.0-test11/drivers/ide/ide-probe.c Thu Aug 3 19:29:49 2000
+++ linux-2.4.0-test11-ppfix/drivers/ide/ide-probe.c Thu Nov 23 12:32:16
2000
@@ -726,6 +726,40 @@
return 0;
}

+char* ide_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int unit = (minor >> hd->minor_shift) + 'a';
+ unsigned int part = minor & ((1 << hd->minor_shift) - 1);
+
+ switch (hd->major) {
+ case IDE9_MAJOR:
+ unit += 2;
+ case IDE8_MAJOR:
+ unit += 2;
+ case IDE7_MAJOR:
+ unit += 2;
+ case IDE6_MAJOR:
+ unit += 2;
+ case IDE5_MAJOR:
+ unit += 2;
+ case IDE4_MAJOR:
+ unit += 2;
+ case IDE3_MAJOR:
+ unit += 2;
+ case IDE2_MAJOR:
+ unit += 2;
+ case IDE1_MAJOR:
+ unit += 2;
+ case IDE0_MAJOR:
+ }
+ if (part)
+ sprintf(buf, "hd%c%d", unit, part);
+ else
+ sprintf(buf, "hd%c", unit);
+ return buf;
+}
+
/*
* init_gendisk() (as opposed to ide_geninit) is called for each major
device,
* after probing for drives, to allocate partition tables and other
data
@@ -781,6 +815,7 @@
gd->fops = ide_fops; /* file operations */
gd->de_arr = kmalloc (sizeof *gd->de_arr * units, GFP_KERNEL);
gd->flags = kmalloc (sizeof *gd->flags * units, GFP_KERNEL);
+ gd->hd_name = ide_disk_name;
if (gd->de_arr)
memset (gd->de_arr, 0, sizeof *gd->de_arr * units);
if (gd->flags)
diff -u --recursive linux-2.4.0-test11/drivers/md/lvm.c
linux-2.4.0-test11-ppfix/drivers/md/lvm.c
--- linux-2.4.0-test11/drivers/md/lvm.c Mon Nov 20 15:20:30 2000
+++ linux-2.4.0-test11-ppfix/drivers/md/lvm.c Thu Nov 23 07:47:20 2000
@@ -213,9 +213,7 @@
&lvm_proc_get_info;
#endif

-#ifdef LVM_HD_NAME
-void lvm_hd_name(char *, int);
-#endif
+char *lvm_hd_name(struct gendisk *, int, char *);
/* End external function prototypes */


@@ -230,9 +228,6 @@
int lvm_snapshot_alloc(lv_t *);
void lvm_snapshot_release(lv_t *);

-#ifdef LVM_HD_NAME
-extern void (*lvm_hd_name_ptr) (char *, int);
-#endif
static int lvm_map(struct buffer_head *, int);
static int lvm_do_lock_lvm(void);
static int lvm_do_le_remap(vg_t *, void *);
@@ -397,11 +392,6 @@
lvm_gendisk.next = NULL;
}

-#ifdef LVM_HD_NAME
- /* reference from drivers/block/genhd.c */
- lvm_hd_name_ptr = lvm_hd_name;
-#endif
-
blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST);
blk_queue_make_request(BLK_DEFAULT_QUEUE(MAJOR_NR),
lvm_make_request_fn);
blk_queue_pluggable(BLK_DEFAULT_QUEUE(MAJOR_NR),
lvm_plug_device_noop);
@@ -460,11 +450,6 @@
remove_proc_entry(LVM_NAME, &proc_root);
#endif

-#ifdef LVM_HD_NAME
- /* reference from linux/drivers/block/genhd.c */
- lvm_hd_name_ptr = NULL;
-#endif
-
printk(KERN_INFO "%s -- Module successfully deactivated\n", lvm_name);

return;
@@ -1448,24 +1433,22 @@
* internal support functions
*/

-#ifdef LVM_HD_NAME
/*
* generate "hard disk" name
*/
-void lvm_hd_name(char *buf, int minor)
+char *lvm_hd_name(struct gendisk *hd, int minor, char *buf)
{
int len = 0;
lv_t *lv_ptr;

if (vg[VG_BLK(minor)] == NULL ||
(lv_ptr = vg[VG_BLK(minor)]->lv[LV_BLK(minor)]) == NULL)
- return;
+ return buf;
len = strlen(lv_ptr->lv_name) - 5;
memcpy(buf, &lv_ptr->lv_name[5], len);
buf[len] = 0;
- return;
+ return buf;
}
-#endif


/*
@@ -2515,6 +2498,8 @@

blksize_size[MAJOR_NR] = lvm_blocksizes;
blk_size[MAJOR_NR] = lvm_size;
+
+ lvm_gendisk.hd_name=lvm_hd_name;

return;
} /* lvm_gen_init() */
diff -u --recursive linux-2.4.0-test11/drivers/md/md.c
linux-2.4.0-test11-ppfix/drivers/md/md.c
--- linux-2.4.0-test11/drivers/md/md.c Mon Nov 20 15:20:30 2000
+++ linux-2.4.0-test11-ppfix/drivers/md/md.c Thu Nov 23 07:56:06 2000
@@ -113,6 +113,8 @@
extern struct block_device_operations md_fops;
static devfs_handle_t devfs_handle;

+char *md_disk_name(struct gendisk*, int, char *);
+
static struct gendisk md_gendisk=
{
major: MD_MAJOR,
@@ -125,6 +127,7 @@
real_devices: NULL,
next: NULL,
fops: &md_fops,
+ hd_name: md_disk_name,
};

/*
@@ -215,6 +218,14 @@
MOD_INC_USE_COUNT;

return mddev;
+}
+
+char * md_disk_name(struct gendisk *hd, int minor, char* buf)
+
+{
+ int unit = (minor >> hd->minor_shift);
+ sprintf(buf, "%s%d", hd->major_name, unit);
+ return buf;
}

struct gendisk * find_gendisk (kdev_t dev)
diff -u --recursive linux-2.4.0-test11/drivers/scsi/sd.c
linux-2.4.0-test11-ppfix/drivers/scsi/sd.c
--- linux-2.4.0-test11/drivers/scsi/sd.c Mon Nov 20 15:17:18 2000
+++ linux-2.4.0-test11-ppfix/drivers/scsi/sd.c Thu Nov 23 09:53:01 2000
@@ -1013,6 +1013,30 @@
return i;
}

+char *scsi_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int unit = (minor >> hd->minor_shift) + 'a';
+ unsigned int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (hd->major >= SCSI_DISK1_MAJOR && hd->major <= SCSI_DISK7_MAJOR) {
+ unit = unit + (hd->major - SCSI_DISK1_MAJOR + 1) * 16;
+ if (unit > 'z') {
+ unit -= 'z' + 1;
+ sprintf(buf, "sd%c%c", 'a' + unit / 26, 'a' + unit % 26);
+ if (part)
+ sprintf(buf + 4, "%d", part);
+ return buf;
+ }
+ }
+
+ if (part)
+ sprintf(buf, "sd%c%d", unit, part);
+ else
+ sprintf(buf, "sd%c", unit);
+ return buf;
+}
+
/*
* The sd_init() function looks at all SCSI drives present, determines
* their size, and reads partition table entries for them.
@@ -1109,6 +1133,7 @@
sd_gendisks[i].next = sd_gendisks + i + 1;
sd_gendisks[i].real_devices =
(void *) (rscsi_disks + i * SCSI_DISKS_PER_MAJOR);
+ sd_gendisks[i].hd_name=scsi_disk_name;
}

LAST_SD_GENDISK.next = NULL;
diff -u --recursive linux-2.4.0-test11/fs/partitions/check.c
linux-2.4.0-test11-ppfix/fs/partitions/check.c
--- linux-2.4.0-test11/fs/partitions/check.c Mon Nov 20 15:17:27 2000
+++ linux-2.4.0-test11-ppfix/fs/partitions/check.c Thu Nov 23 14:30:45
2000
@@ -83,11 +83,10 @@
*/
char *disk_name (struct gendisk *hd, int minor, char *buf)
{
- unsigned int part;
const char *maj = hd->major_name;
int unit = (minor >> hd->minor_shift) + 'a';
+ unsigned int part = minor & ((1 << hd->minor_shift) - 1);

- part = minor & ((1 << hd->minor_shift) - 1);
if (hd->part[minor].de) {
int pos;

@@ -95,77 +94,8 @@
if (pos >= 0)
return buf + pos;
}
- /*
- * IDE devices use multiple major numbers, but the drives
- * are named as: {hda,hdb}, {hdc,hdd}, {hde,hdf}, {hdg,hdh}..
- * This requires special handling here.
- */
- switch (hd->major) {
- case IDE9_MAJOR:
- unit += 2;
- case IDE8_MAJOR:
- unit += 2;
- case IDE7_MAJOR:
- unit += 2;
- case IDE6_MAJOR:
- unit += 2;
- case IDE5_MAJOR:
- unit += 2;
- case IDE4_MAJOR:
- unit += 2;
- case IDE3_MAJOR:
- unit += 2;
- case IDE2_MAJOR:
- unit += 2;
- case IDE1_MAJOR:
- unit += 2;
- case IDE0_MAJOR:
- maj = "hd";
- break;
- case MD_MAJOR:
- unit -= 'a'-'0';
- break;
- }
- if (hd->major >= SCSI_DISK1_MAJOR && hd->major <= SCSI_DISK7_MAJOR) {
- unit = unit + (hd->major - SCSI_DISK1_MAJOR + 1) * 16;
- if (unit > 'z') {
- unit -= 'z' + 1;
- sprintf(buf, "sd%c%c", 'a' + unit / 26, 'a' + unit % 26);
- if (part)
- sprintf(buf + 4, "%d", part);
- return buf;
- }
- }
- if (hd->major >= COMPAQ_SMART2_MAJOR && hd->major <=
COMPAQ_SMART2_MAJOR+7) {
- int ctlr = hd->major - COMPAQ_SMART2_MAJOR;
- int disk = minor >> hd->minor_shift;
- int part = minor & (( 1 << hd->minor_shift) - 1);
- if (part == 0)
- sprintf(buf, "%s/c%dd%d", maj, ctlr, disk);
- else
- sprintf(buf, "%s/c%dd%dp%d", maj, ctlr, disk, part);
- return buf;
- }
- if (hd->major >= COMPAQ_CISS_MAJOR && hd->major <=
COMPAQ_CISS_MAJOR+7) {
- int ctlr = hd->major - COMPAQ_CISS_MAJOR;
- int disk = minor >> hd->minor_shift;
- int part = minor & (( 1 << hd->minor_shift) - 1);
- if (part == 0)
- sprintf(buf, "%s/c%dd%d", maj, ctlr, disk);
- else
- sprintf(buf, "%s/c%dd%dp%d", maj, ctlr, disk,
part);
- return buf;
- }
- if (hd->major >= DAC960_MAJOR && hd->major <= DAC960_MAJOR+7) {
- int ctlr = hd->major - DAC960_MAJOR;
- int disk = minor >> hd->minor_shift;
- int part = minor & (( 1 << hd->minor_shift) - 1);
- if (part == 0)
- sprintf(buf, "%s/c%dd%d", maj, ctlr, disk);
- else
- sprintf(buf, "%s/c%dd%dp%d", maj, ctlr, disk, part);
- return buf;
- }
+ if (hd->hd_name) return hd->hd_name(hd, minor, buf);
+
if (part)
sprintf(buf, "%s%c%d", maj, unit, part);
else
diff -u --recursive linux-2.4.0-test11/include/linux/genhd.h
linux-2.4.0-test11-ppfix/include/linux/genhd.h
--- linux-2.4.0-test11/include/linux/genhd.h Mon Nov 20 15:31:01 2000
+++ linux-2.4.0-test11-ppfix/include/linux/genhd.h Thu Nov 23 07:44:35
2000
@@ -72,6 +72,8 @@

devfs_handle_t *de_arr; /* one per physical disc */
char *flags; /* one per physical disc */
+
+ char *(*hd_name) (struct gendisk *, int, char *);
};
#endif /* __KERNEL__ */

2000-12-06 17:14:06

by Peter Samuelson

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection


[Brian Kress <[email protected]>]
> I got resounding silence to posting the patch last time, so I'm not
> sure if anyone actually wants this patch,

Well, I like it, but admittedly it's mostly in the "cleanup" category
(though it does fix the LVM name issue) so at this point in 2.4 I guess
Linus has more important stuff to worry about.

The best thing about your patch is that by putting the logic back in
the individual drivers, it makes check.c not depend on your module
configuration (so you can compile a disk module, either inside or
outside the kernel tree, without worrying about editing or recompiling
check.c).

> +char *DAC960_disk_name(struct gendisk *hd, int minor, char *buf)
> +char *cciss_disk_name(struct gendisk *hd, int minor, char *buf)
> +char *ida_disk_name(struct gendisk *hd, int minor, char *buf)
> +char* ide_disk_name(struct gendisk *hd, int minor, char *buf)
> +char *lvm_hd_name(struct gendisk *, int, char *);
> +char *lvm_hd_name(struct gendisk *hd, int minor, char *buf)
> +char *md_disk_name(struct gendisk*, int, char *);
> +char * md_disk_name(struct gendisk *hd, int minor, char* buf)
> +char *scsi_disk_name(struct gendisk *hd, int minor, char *buf)

These should all be 'static char *'.

Peter

2000-12-06 19:38:24

by Brian Kress

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

Peter Samuelson wrote:
>
> [Brian Kress <[email protected]>]
> > I got resounding silence to posting the patch last time, so I'm not
> > sure if anyone actually wants this patch,
>
> Well, I like it, but admittedly it's mostly in the "cleanup" category
> (though it does fix the LVM name issue) so at this point in 2.4 I guess
> Linus has more important stuff to worry about.
>

Probably. Maybe I can get it in when 2.5 kicks off.

>
> The best thing about your patch is that by putting the logic back in
> the individual drivers, it makes check.c not depend on your module
> configuration (so you can compile a disk module, either inside or
> outside the kernel tree, without worrying about editing or recompiling
> check.c).
>

Yep, that's the point. I hate having generic code have to
know things about specific drivers.

>
> > +char *DAC960_disk_name(struct gendisk *hd, int minor, char *buf)
> > +char *cciss_disk_name(struct gendisk *hd, int minor, char *buf)
> > +char *ida_disk_name(struct gendisk *hd, int minor, char *buf)
> > +char* ide_disk_name(struct gendisk *hd, int minor, char *buf)
> > +char *lvm_hd_name(struct gendisk *, int, char *);
> > +char *lvm_hd_name(struct gendisk *hd, int minor, char *buf)
> > +char *md_disk_name(struct gendisk*, int, char *);
> > +char * md_disk_name(struct gendisk *hd, int minor, char* buf)
> > +char *scsi_disk_name(struct gendisk *hd, int minor, char *buf)
>
> These should all be 'static char *'.
>

Yes, you're right. I've updated that and (hopefully) fixed
the problem (as Toon van der Pas pointed out) with my mailer mangling
the patch. Here's the updated patch:


Brian

--------------------------------- Snip ----------------------------

diff -u --recursive linux-2.4.0-test11/drivers/block/DAC960.c linux-2.4.0-test11-ppfix/drivers/block/DAC960.c
--- linux-2.4.0-test11/drivers/block/DAC960.c Mon Nov 20 15:17:25 2000
+++ linux-2.4.0-test11-ppfix/drivers/block/DAC960.c Thu Nov 23 14:29:37 2000
@@ -1885,6 +1885,21 @@
}


+static char *DAC960_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int ctlr = hd->major - DAC960_MAJOR;
+ int disk = minor >> hd->minor_shift;
+ int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (part)
+ sprintf(buf, "%s/c%dd%dp%d", hd->major_name, ctlr, disk, part);
+ else
+ sprintf(buf, "%s/c%dd%d", hd->major_name, ctlr, disk);
+ return buf;
+}
+
+
/*
DAC960_RegisterBlockDevice registers the Block Device structures
associated with Controller.
@@ -1945,6 +1960,7 @@
Controller->GenericDiskInfo.nr_real = Controller->LogicalDriveCount;
Controller->GenericDiskInfo.next = NULL;
Controller->GenericDiskInfo.fops = &DAC960_BlockDeviceOperations;
+ Controller->GenericDiskInfo.hd_name = DAC960_disk_name;
/*
Install the Generic Disk Information structure at the end of the list.
*/
diff -u --recursive linux-2.4.0-test11/drivers/block/cciss.c linux-2.4.0-test11-ppfix/drivers/block/cciss.c
--- linux-2.4.0-test11/drivers/block/cciss.c Mon Nov 20 15:17:25 2000
+++ linux-2.4.0-test11-ppfix/drivers/block/cciss.c Thu Nov 23 14:22:55 2000
@@ -1749,6 +1749,20 @@
kfree(size_buff);
}

+static char *cciss_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int ctlr = hd->major - COMPAQ_CISS_MAJOR;
+ int disk = minor >> hd->minor_shift;
+ int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (part)
+ sprintf(buf, "%s/c%dd%dp%d", hd->major_name, ctlr, disk, part);
+ else
+ sprintf(buf, "%s/c%dd%d", hd->major_name, ctlr, disk);
+ return buf;
+}
+
/*
* This is it. Find all the controllers and register them. I really hate
* stealing all these major device numbers.
@@ -1851,6 +1865,7 @@
hba[i]->gendisk.part = hba[i]->hd;
hba[i]->gendisk.sizes = hba[i]->sizes;
hba[i]->gendisk.nr_real = hba[i]->num_luns;
+ hba[i]->gendisk.hd_name = cciss_disk_name;

/* Get on the disk list */
hba[i]->gendisk.next = gendisk_head;
diff -u --recursive linux-2.4.0-test11/drivers/block/cpqarray.c linux-2.4.0-test11-ppfix/drivers/block/cpqarray.c
--- linux-2.4.0-test11/drivers/block/cpqarray.c Mon Nov 20 15:20:29 2000
+++ linux-2.4.0-test11-ppfix/drivers/block/cpqarray.c Thu Nov 23 14:14:03 2000
@@ -362,6 +362,20 @@
}
#endif /* MODULE */

+static char *ida_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int ctlr = hd->major - COMPAQ_SMART2_MAJOR;
+ int disk = minor >> hd->minor_shift;
+ int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (part)
+ sprintf(buf, "%s/c%dd%dp%d", hd->major_name, ctlr, disk, part);
+ else
+ sprintf(buf, "%s/c%dd%d", hd->major_name, ctlr, disk);
+ return buf;
+}
+
/*
* This is it. Find all the controllers and register them. I really hate
* stealing all these major device numbers.
@@ -512,6 +526,7 @@
ida_gendisk[i].part = ida + (i*256);
ida_gendisk[i].sizes = ida_sizes + (i*256);
ida_gendisk[i].nr_real = 0;
+ ida_gendisk[i].hd_name = ida_disk_name;

/* Get on the disk list */
ida_gendisk[i].next = gendisk_head;
diff -u --recursive linux-2.4.0-test11/drivers/ide/ide-probe.c linux-2.4.0-test11-ppfix/drivers/ide/ide-probe.c
--- linux-2.4.0-test11/drivers/ide/ide-probe.c Thu Aug 3 19:29:49 2000
+++ linux-2.4.0-test11-ppfix/drivers/ide/ide-probe.c Thu Nov 23 12:32:16 2000
@@ -726,6 +726,40 @@
return 0;
}

+static char* ide_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int unit = (minor >> hd->minor_shift) + 'a';
+ unsigned int part = minor & ((1 << hd->minor_shift) - 1);
+
+ switch (hd->major) {
+ case IDE9_MAJOR:
+ unit += 2;
+ case IDE8_MAJOR:
+ unit += 2;
+ case IDE7_MAJOR:
+ unit += 2;
+ case IDE6_MAJOR:
+ unit += 2;
+ case IDE5_MAJOR:
+ unit += 2;
+ case IDE4_MAJOR:
+ unit += 2;
+ case IDE3_MAJOR:
+ unit += 2;
+ case IDE2_MAJOR:
+ unit += 2;
+ case IDE1_MAJOR:
+ unit += 2;
+ case IDE0_MAJOR:
+ }
+ if (part)
+ sprintf(buf, "hd%c%d", unit, part);
+ else
+ sprintf(buf, "hd%c", unit);
+ return buf;
+}
+
/*
* init_gendisk() (as opposed to ide_geninit) is called for each major device,
* after probing for drives, to allocate partition tables and other data
@@ -781,6 +815,7 @@
gd->fops = ide_fops; /* file operations */
gd->de_arr = kmalloc (sizeof *gd->de_arr * units, GFP_KERNEL);
gd->flags = kmalloc (sizeof *gd->flags * units, GFP_KERNEL);
+ gd->hd_name = ide_disk_name;
if (gd->de_arr)
memset (gd->de_arr, 0, sizeof *gd->de_arr * units);
if (gd->flags)
diff -u --recursive linux-2.4.0-test11/drivers/md/lvm.c linux-2.4.0-test11-ppfix/drivers/md/lvm.c
--- linux-2.4.0-test11/drivers/md/lvm.c Mon Nov 20 15:20:30 2000
+++ linux-2.4.0-test11-ppfix/drivers/md/lvm.c Thu Nov 23 07:47:20 2000
@@ -213,9 +213,7 @@
&lvm_proc_get_info;
#endif

-#ifdef LVM_HD_NAME
-void lvm_hd_name(char *, int);
-#endif
+char *lvm_hd_name(struct gendisk *, int, char *);
/* End external function prototypes */


@@ -230,9 +228,6 @@
int lvm_snapshot_alloc(lv_t *);
void lvm_snapshot_release(lv_t *);

-#ifdef LVM_HD_NAME
-extern void (*lvm_hd_name_ptr) (char *, int);
-#endif
static int lvm_map(struct buffer_head *, int);
static int lvm_do_lock_lvm(void);
static int lvm_do_le_remap(vg_t *, void *);
@@ -397,11 +392,6 @@
lvm_gendisk.next = NULL;
}

-#ifdef LVM_HD_NAME
- /* reference from drivers/block/genhd.c */
- lvm_hd_name_ptr = lvm_hd_name;
-#endif
-
blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST);
blk_queue_make_request(BLK_DEFAULT_QUEUE(MAJOR_NR), lvm_make_request_fn);
blk_queue_pluggable(BLK_DEFAULT_QUEUE(MAJOR_NR), lvm_plug_device_noop);
@@ -460,11 +450,6 @@
remove_proc_entry(LVM_NAME, &proc_root);
#endif

-#ifdef LVM_HD_NAME
- /* reference from linux/drivers/block/genhd.c */
- lvm_hd_name_ptr = NULL;
-#endif
-
printk(KERN_INFO "%s -- Module successfully deactivated\n", lvm_name);

return;
@@ -1448,24 +1433,22 @@
* internal support functions
*/

-#ifdef LVM_HD_NAME
/*
* generate "hard disk" name
*/
-void lvm_hd_name(char *buf, int minor)
+char *lvm_hd_name(struct gendisk *hd, int minor, char *buf)
{
int len = 0;
lv_t *lv_ptr;

if (vg[VG_BLK(minor)] == NULL ||
(lv_ptr = vg[VG_BLK(minor)]->lv[LV_BLK(minor)]) == NULL)
- return;
+ return buf;
len = strlen(lv_ptr->lv_name) - 5;
memcpy(buf, &lv_ptr->lv_name[5], len);
buf[len] = 0;
- return;
+ return buf;
}
-#endif


/*
@@ -2515,6 +2498,8 @@

blksize_size[MAJOR_NR] = lvm_blocksizes;
blk_size[MAJOR_NR] = lvm_size;
+
+ lvm_gendisk.hd_name=lvm_hd_name;

return;
} /* lvm_gen_init() */
diff -u --recursive linux-2.4.0-test11/drivers/md/md.c linux-2.4.0-test11-ppfix/drivers/md/md.c
--- linux-2.4.0-test11/drivers/md/md.c Mon Nov 20 15:20:30 2000
+++ linux-2.4.0-test11-ppfix/drivers/md/md.c Thu Nov 23 07:56:06 2000
@@ -113,6 +113,8 @@
extern struct block_device_operations md_fops;
static devfs_handle_t devfs_handle;

+static char *md_disk_name(struct gendisk*, int, char *);
+
static struct gendisk md_gendisk=
{
major: MD_MAJOR,
@@ -125,6 +127,7 @@
real_devices: NULL,
next: NULL,
fops: &md_fops,
+ hd_name: md_disk_name,
};

/*
@@ -215,6 +218,14 @@
MOD_INC_USE_COUNT;

return mddev;
+}
+
+static char * md_disk_name(struct gendisk *hd, int minor, char* buf)
+
+{
+ int unit = (minor >> hd->minor_shift);
+ sprintf(buf, "%s%d", hd->major_name, unit);
+ return buf;
}

struct gendisk * find_gendisk (kdev_t dev)
diff -u --recursive linux-2.4.0-test11/drivers/scsi/sd.c linux-2.4.0-test11-ppfix/drivers/scsi/sd.c
--- linux-2.4.0-test11/drivers/scsi/sd.c Mon Nov 20 15:17:18 2000
+++ linux-2.4.0-test11-ppfix/drivers/scsi/sd.c Thu Nov 23 09:53:01 2000
@@ -1013,6 +1013,30 @@
return i;
}

+static char *scsi_disk_name(struct gendisk *hd, int minor, char *buf)
+
+{
+ int unit = (minor >> hd->minor_shift) + 'a';
+ unsigned int part = minor & ((1 << hd->minor_shift) - 1);
+
+ if (hd->major >= SCSI_DISK1_MAJOR && hd->major <= SCSI_DISK7_MAJOR) {
+ unit = unit + (hd->major - SCSI_DISK1_MAJOR + 1) * 16;
+ if (unit > 'z') {
+ unit -= 'z' + 1;
+ sprintf(buf, "sd%c%c", 'a' + unit / 26, 'a' + unit % 26);
+ if (part)
+ sprintf(buf + 4, "%d", part);
+ return buf;
+ }
+ }
+
+ if (part)
+ sprintf(buf, "sd%c%d", unit, part);
+ else
+ sprintf(buf, "sd%c", unit);
+ return buf;
+}
+
/*
* The sd_init() function looks at all SCSI drives present, determines
* their size, and reads partition table entries for them.
@@ -1109,6 +1133,7 @@
sd_gendisks[i].next = sd_gendisks + i + 1;
sd_gendisks[i].real_devices =
(void *) (rscsi_disks + i * SCSI_DISKS_PER_MAJOR);
+ sd_gendisks[i].hd_name=scsi_disk_name;
}

LAST_SD_GENDISK.next = NULL;
diff -u --recursive linux-2.4.0-test11/fs/partitions/check.c linux-2.4.0-test11-ppfix/fs/partitions/check.c
--- linux-2.4.0-test11/fs/partitions/check.c Mon Nov 20 15:17:27 2000
+++ linux-2.4.0-test11-ppfix/fs/partitions/check.c Thu Nov 23 14:30:45 2000
@@ -83,11 +83,10 @@
*/
char *disk_name (struct gendisk *hd, int minor, char *buf)
{
- unsigned int part;
const char *maj = hd->major_name;
int unit = (minor >> hd->minor_shift) + 'a';
+ unsigned int part = minor & ((1 << hd->minor_shift) - 1);

- part = minor & ((1 << hd->minor_shift) - 1);
if (hd->part[minor].de) {
int pos;

@@ -95,77 +94,8 @@
if (pos >= 0)
return buf + pos;
}
- /*
- * IDE devices use multiple major numbers, but the drives
- * are named as: {hda,hdb}, {hdc,hdd}, {hde,hdf}, {hdg,hdh}..
- * This requires special handling here.
- */
- switch (hd->major) {
- case IDE9_MAJOR:
- unit += 2;
- case IDE8_MAJOR:
- unit += 2;
- case IDE7_MAJOR:
- unit += 2;
- case IDE6_MAJOR:
- unit += 2;
- case IDE5_MAJOR:
- unit += 2;
- case IDE4_MAJOR:
- unit += 2;
- case IDE3_MAJOR:
- unit += 2;
- case IDE2_MAJOR:
- unit += 2;
- case IDE1_MAJOR:
- unit += 2;
- case IDE0_MAJOR:
- maj = "hd";
- break;
- case MD_MAJOR:
- unit -= 'a'-'0';
- break;
- }
- if (hd->major >= SCSI_DISK1_MAJOR && hd->major <= SCSI_DISK7_MAJOR) {
- unit = unit + (hd->major - SCSI_DISK1_MAJOR + 1) * 16;
- if (unit > 'z') {
- unit -= 'z' + 1;
- sprintf(buf, "sd%c%c", 'a' + unit / 26, 'a' + unit % 26);
- if (part)
- sprintf(buf + 4, "%d", part);
- return buf;
- }
- }
- if (hd->major >= COMPAQ_SMART2_MAJOR && hd->major <= COMPAQ_SMART2_MAJOR+7) {
- int ctlr = hd->major - COMPAQ_SMART2_MAJOR;
- int disk = minor >> hd->minor_shift;
- int part = minor & (( 1 << hd->minor_shift) - 1);
- if (part == 0)
- sprintf(buf, "%s/c%dd%d", maj, ctlr, disk);
- else
- sprintf(buf, "%s/c%dd%dp%d", maj, ctlr, disk, part);
- return buf;
- }
- if (hd->major >= COMPAQ_CISS_MAJOR && hd->major <= COMPAQ_CISS_MAJOR+7) {
- int ctlr = hd->major - COMPAQ_CISS_MAJOR;
- int disk = minor >> hd->minor_shift;
- int part = minor & (( 1 << hd->minor_shift) - 1);
- if (part == 0)
- sprintf(buf, "%s/c%dd%d", maj, ctlr, disk);
- else
- sprintf(buf, "%s/c%dd%dp%d", maj, ctlr, disk, part);
- return buf;
- }
- if (hd->major >= DAC960_MAJOR && hd->major <= DAC960_MAJOR+7) {
- int ctlr = hd->major - DAC960_MAJOR;
- int disk = minor >> hd->minor_shift;
- int part = minor & (( 1 << hd->minor_shift) - 1);
- if (part == 0)
- sprintf(buf, "%s/c%dd%d", maj, ctlr, disk);
- else
- sprintf(buf, "%s/c%dd%dp%d", maj, ctlr, disk, part);
- return buf;
- }
+ if (hd->hd_name) return hd->hd_name(hd, minor, buf);
+
if (part)
sprintf(buf, "%s%c%d", maj, unit, part);
else
diff -u --recursive linux-2.4.0-test11/include/linux/genhd.h linux-2.4.0-test11-ppfix/include/linux/genhd.h
--- linux-2.4.0-test11/include/linux/genhd.h Mon Nov 20 15:31:01 2000
+++ linux-2.4.0-test11-ppfix/include/linux/genhd.h Thu Nov 23 07:44:35 2000
@@ -72,6 +72,8 @@

devfs_handle_t *de_arr; /* one per physical disc */
char *flags; /* one per physical disc */
+
+ char *(*hd_name) (struct gendisk *, int, char *);
};
#endif /* __KERNEL__ */

2000-12-06 21:27:23

by NeilBrown

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

On Wednesday December 6, [email protected] wrote:
> Peter Samuelson wrote:
> >
> > [Roberto Ragusa]
> > > BTW, here is a little patch regarding a silly problem I found
> > > about RAID partitions naming (/proc/partitions).
> > > No more "md8" "md9" "md:" "md;" ... but "md8" "md9" "md10" "md11" ...
> > > Well, this patch should work up to "md99".
> >
> > This stuff *really* should be split out into the drivers. Brian Kress
> > had a patch against test11 for this. Brian? You want to fold in this
> > fix?
>
> Sure. I got resounding silence to posting the patch last time,
> so I'm not sure if anyone actually wants this patch, but here it is
> again with this fix (the non ugly version) folded in.
>

Have you ever noticed that bad patches get a lot more response than
good patches? I think people like having something useful to say and
a simple "I like it" often doesn't seem worth it, though in reality is
very valuable.

Mayge the thing to do is include a few obvious but not very
significant errors like:
- initialise a static variable to 0
- use /** to introduce a comment that isn't in the right format for
the auto-documentation stuff
- uses lines wider than 80 characters
- use unnecessary extra parentheses
- use non-standard indenting

that way people will respond because they think they have something to
say, and will hopefully comment further... :-)

But I see that you have done that .... a simple compiler error (I
think).

> diff -u --recursive linux-2.4.0-test11/fs/partitions/check.c
> linux-2.4.0-test11-ppfix/fs/partitions/check.c
> --- linux-2.4.0-test11/fs/partitions/check.c Mon Nov 20 15:17:27 2000
> +++ linux-2.4.0-test11-ppfix/fs/partitions/check.c Thu Nov 23 14:30:45
> 2000
> @@ -83,11 +83,10 @@
> */
> char *disk_name (struct gendisk *hd, int minor, char *buf)
> {
> - unsigned int part;
> const char *maj = hd->major_name;
> int unit = (minor >> hd->minor_shift) + 'a';
> + unsigned int part = minor & ((1 << hd->minor_shift) - 1);
>
> - part = minor & ((1 << hd->minor_shift) - 1);

here we have lost the "part" automatic variable in disk_name but ....

(stuff deleted)
> - else
> - sprintf(buf, "%s/c%dd%dp%d", maj, ctlr, disk, part);
> - return buf;
> - }
> + if (hd->hd_name) return hd->hd_name(hd, minor, buf);
> +
> if (part)
> sprintf(buf, "%s%c%d", maj, unit, part);

here we use it. Does this compile?

But apart from this, I like it, and I think that it would be good if
it went into 2.4.

Maybe do one more iteration (or convince me that the above isn't a
bug), and ask for comment. I promise to test and respond.
Then send it to Linus, and explain what it does, and mention that it
fixes a real issue in lvm (I assume it does as someone said so. I
haven't actually looked into that) and leave it to him.

NeilBrown

2000-12-06 22:04:45

by Brian Kress

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

Neil Brown wrote:
>
> On Wednesday December 6, [email protected] wrote:
> > Peter Samuelson wrote:
> > >
> > > [Roberto Ragusa]
> > > > BTW, here is a little patch regarding a silly problem I found
> > > > about RAID partitions naming (/proc/partitions).
> > > > No more "md8" "md9" "md:" "md;" ... but "md8" "md9" "md10" "md11" ...
> > > > Well, this patch should work up to "md99".
> > >
> > > This stuff *really* should be split out into the drivers. Brian Kress
> > > had a patch against test11 for this. Brian? You want to fold in this
> > > fix?
> >
> > Sure. I got resounding silence to posting the patch last time,
> > so I'm not sure if anyone actually wants this patch, but here it is
> > again with this fix (the non ugly version) folded in.
> >
>
> Have you ever noticed that bad patches get a lot more response than
> good patches? I think people like having something useful to say and
> a simple "I like it" often doesn't seem worth it, though in reality is
> very valuable.
>
> Mayge the thing to do is include a few obvious but not very
> significant errors like:
> - initialise a static variable to 0
> - use /** to introduce a comment that isn't in the right format for
> the auto-documentation stuff
> - uses lines wider than 80 characters
> - use unnecessary extra parentheses
> - use non-standard indenting
>
> that way people will respond because they think they have something to
> say, and will hopefully comment further... :-)
>

Perhaps this should be in the Linus-HOWTO. :)

>
> But I see that you have done that .... a simple compiler error (I
> think).
>
> > diff -u --recursive linux-2.4.0-test11/fs/partitions/check.c
> > linux-2.4.0-test11-ppfix/fs/partitions/check.c
> > --- linux-2.4.0-test11/fs/partitions/check.c Mon Nov 20 15:17:27 2000
> > +++ linux-2.4.0-test11-ppfix/fs/partitions/check.c Thu Nov 23 14:30:45
> > 2000
> > @@ -83,11 +83,10 @@
> > */
> > char *disk_name (struct gendisk *hd, int minor, char *buf)
> > {
> > - unsigned int part;
> > const char *maj = hd->major_name;
> > int unit = (minor >> hd->minor_shift) + 'a';
> > + unsigned int part = minor & ((1 << hd->minor_shift) - 1);
> >
> > - part = minor & ((1 << hd->minor_shift) - 1);
>
> here we have lost the "part" automatic variable in disk_name but ....
>

I don't think so. Look again. I deleted the declaration and
the assignment and replaced it with a declare and assign all at once.
The other pieces were like that and that one being inconsisent
offended me. :) This does compile, and seems to work.

>
> But apart from this, I like it, and I think that it would be good if
> it went into 2.4.
>
> Maybe do one more iteration (or convince me that the above isn't a
> bug), and ask for comment. I promise to test and respond.
> Then send it to Linus, and explain what it does, and mention that it
> fixes a real issue in lvm (I assume it does as someone said so. I
> haven't actually looked into that) and leave it to him.
>

If you want to test it, get the version I posted later
in this thread. It has some fixes like declaring some functions
static and my mailer not eating it. If you'd like that version
mailed to you, let me know.
Let me know how it works.


Brian Kress
[email protected]

2000-12-07 02:27:42

by NeilBrown

[permalink] [raw]
Subject: Re: kernel panic in SoftwareRAID autodetection

On Wednesday December 6, [email protected] wrote:
> Neil Brown wrote:
> >
> > here we have lost the "part" automatic variable in disk_name but ....
> >
>
> I don't think so. Look again.

Gulp... :-(

Yes, your patch is indeed fine. I heartily recommend it (for whatever
that is worth).

Your mailer on the other hand.... could use some work.
The latest patch that you mailed out still had only spaces, no tabs.
Nevertheless, I managed to apply it and it seems to work fine, as well
as looking all right.

NeilBrown