2005-02-16 16:24:19

by Luben Tuikov

Subject: [PATCH] fix units/partition count in sd.c (2.4.x)

Hi,

This patch fixes the nr_real count in sd.c, which genhd.c also uses
to print out the partitions/units. The problem is that on detach
nr_real is decremented and the genhd's nr_sects is cleared, but the
entry itself stays in place and is still counted when the partitions
are displayed. So when nr_real is decremented _and_ a zeroed
partition/unit is counted, one or more entries at the tail of the
list are not displayed.

The solution is to not decrement nr_real on detach (effectively
never decrementing it) and, so that it doesn't grow without bound,
to throttle it on attach, incrementing it only if it would otherwise
be smaller than nr_dev.
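
As a rough user-space sketch of the throttling described above (not the
kernel code itself): the struct and the model_attach()/model_detach()
helpers below are hypothetical names, and the model assumes a single
major so that nr_dev and nr_real can be compared directly.

```c
#include <assert.h>

/* Hypothetical stand-in for the two counters the patch touches. */
struct sd_counters {
    int nr_dev;   /* disks currently attached (sd_template.nr_dev) */
    int nr_real;  /* units genhd.c will scan (SD_GENDISK(i).nr_real) */
};

/* Attach: same test as in the patch. nr_real grows only once nr_dev
 * has caught up with it, so it behaves like a high-water mark. */
static void model_attach(struct sd_counters *c)
{
    if (c->nr_dev++ >= c->nr_real)
        c->nr_real++;
}

/* Detach: nr_dev drops, nr_real is deliberately left alone. */
static void model_detach(struct sd_counters *c)
{
    assert(c->nr_dev > 0);
    c->nr_dev--;
}
```

With four attaches, one detach and one re-attach, nr_real stays at 4 in
this model; it only moves to 5 when a fifth disk is attached.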

This was observed on a RH kernel and on the current BK kernel.
Tested and fixed on 2.4.30-pre1 (BK). This patch is against 2.4.30-pre1.

To reproduce: assume 4 scsi disks sda, sdb, sdc, sdd.
#echo "scsi remove-single-device <sdb-HCTL>" > /proc/scsi/scsi
#cat /proc/partitions
<<sdb _and_ sdd are not listed>>
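
The reproduction above can be modeled in user space. This is only a
sketch under assumptions: struct model_gendisk and the model_*()
helpers are made-up names, a single major with a minor shift of 4 is
assumed, and the scan loop imitates the nr_real-bounded loop in genhd.c
rather than copying it.

```c
#include <assert.h>
#include <string.h>

#define MINOR_SHIFT 4    /* 16 minors per disk (assumed, as for sd) */
#define MAX_UNITS   16   /* stands in for SCSI_DISKS_PER_MAJOR */

/* Hypothetical, simplified gendisk: nr_sects == 0 means "not shown". */
struct model_gendisk {
    int nr_real;
    unsigned long nr_sects[MAX_UNITS << MINOR_SHIFT];
};

static void model_attach(struct model_gendisk *gd, int unit,
                         unsigned long sects)
{
    assert(unit >= 0 && unit < MAX_UNITS);
    gd->nr_sects[unit << MINOR_SHIFT] = sects;
    gd->nr_real++;
}

/* Pre-patch detach: clear the unit's entries AND decrement nr_real. */
static void model_detach_old(struct model_gendisk *gd, int unit)
{
    assert(unit >= 0 && unit < MAX_UNITS);
    memset(&gd->nr_sects[unit << MINOR_SHIFT], 0,
           (1u << MINOR_SHIFT) * sizeof(gd->nr_sects[0]));
    gd->nr_real--;
}

/* Returns 1 if the whole-disk entry of `unit` is reached by the scan
 * and has a non-zero size, i.e. it would be printed. */
static int model_listed(const struct model_gendisk *gd, int unit)
{
    int n;

    for (n = 0; n < (gd->nr_real << MINOR_SHIFT); n++)
        if (n == (unit << MINOR_SHIFT) && gd->nr_sects[n])
            return 1;
    return 0;
}
```

In this model, attaching units 0..3 (sda..sdd) and then detaching unit 1
(sdb) leaves nr_real at 3, so the scan stops before unit 3's slot and
sdd is no longer reported even though its entry is intact.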

Signed-off-by: Luben Tuikov <[email protected]>

===== sd.c 1.31 vs edited =====
--- 1.31/drivers/scsi/sd.c 2003-06-25 19:34:08 -04:00
+++ edited/sd.c 2005-02-14 17:09:43 -05:00
@@ -1332,8 +1332,8 @@

 	rscsi_disks[i].device = SDp;
 	rscsi_disks[i].has_part_table = 0;
-	sd_template.nr_dev++;
-	SD_GENDISK(i).nr_real++;
+	if (sd_template.nr_dev++ >= SD_GENDISK(i).nr_real)
+		SD_GENDISK(i).nr_real++;
 	devnum = i % SCSI_DISKS_PER_MAJOR;
 	SD_GENDISK(i).de_arr[devnum] = SDp->de;
 	if (SDp->removable)
@@ -1424,9 +1424,12 @@

 	for (dpnt = rscsi_disks, i = 0; i < sd_template.dev_max; i++, dpnt++)
 		if (dpnt->device == SDp) {
+			char nbuff[6];

 			/* If we are disconnecting a disk driver, sync and invalidate
 			 * everything */
+			sd_devname(i, nbuff);
+
 			sdgd = &SD_GENDISK(i);
 			max_p = sd_gendisk.max_p;
 			start = i << sd_gendisk.minor_shift;
@@ -1447,7 +1450,12 @@
 			SDp->attached--;
 			sd_template.dev_noticed--;
 			sd_template.nr_dev--;
-			SD_GENDISK(i).nr_real--;
+
+			printk("Detached scsi %sdisk %s at scsi%d, "
+			       "channel %d, id %d, lun %d\n",
+			       SDp->removable ? "removable " : "",
+			       nbuff, SDp->host->host_no, SDp->channel,
+			       SDp->id, SDp->lun);
 			return;
 		}
 	return;

Luben
P.S. Patch attached as well, for formatting.


Attachments:
sd.patch (1.25 kB)

2005-02-16 22:43:34

by soohoon.lee

Subject: Re: [PATCH] fix units/partition count in sd.c (2.4.x)


Coincidentally I've found the same problem but fixed it differently.
Because nr_real is not the real # of devices but the max # of devices of a major #,
it doesn't need to be changed on disk add/remove.

1223c1223
< sd_gendisks[i].nr_real = 0;
---
> sd_gendisks[i].nr_real = SCSI_DISKS_PER_MAJOR;
1336d1335
< SD_GENDISK(i).nr_real++;
1450d1448
< SD_GENDISK(i).nr_real--;
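
A small user-space sketch of this approach, under the same kind of
assumptions as any toy model: struct fixed_gendisk and the fixed_*()
helpers are hypothetical names, one major with a minor shift of 4 is
assumed, and nr_real is set once at init and never touched afterwards.

```c
#include <assert.h>
#include <string.h>

#define MINOR_SHIFT     4
#define DISKS_PER_MAJOR 16   /* stands in for SCSI_DISKS_PER_MAJOR */

/* Hypothetical, simplified gendisk with a constant nr_real. */
struct fixed_gendisk {
    int nr_real;
    unsigned long nr_sects[DISKS_PER_MAJOR << MINOR_SHIFT];
};

static void fixed_init(struct fixed_gendisk *gd)
{
    memset(gd, 0, sizeof(*gd));
    gd->nr_real = DISKS_PER_MAJOR;   /* fixed for the major's lifetime */
}

/* Attach and detach only change the size entries, never nr_real. */
static void fixed_attach(struct fixed_gendisk *gd, int unit,
                         unsigned long sects)
{
    assert(unit >= 0 && unit < DISKS_PER_MAJOR);
    gd->nr_sects[unit << MINOR_SHIFT] = sects;
}

static void fixed_detach(struct fixed_gendisk *gd, int unit)
{
    assert(unit >= 0 && unit < DISKS_PER_MAJOR);
    memset(&gd->nr_sects[unit << MINOR_SHIFT], 0,
           (1u << MINOR_SHIFT) * sizeof(gd->nr_sects[0]));
}

/* Counts the entries a nr_real-bounded scan would print. */
static int fixed_count_listed(const struct fixed_gendisk *gd)
{
    int n, shown = 0;

    for (n = 0; n < (gd->nr_real << MINOR_SHIFT); n++)
        if (gd->nr_sects[n])
            shown++;
    return shown;
}
```

Here removing one of four disks leaves the other three visible, at the
price of always scanning all 256 slots of the major.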


2.6 has a slightly different structure, but it does something like this:

sd.c:sd_probe()
gd->minors = 16;

Soohoon.


Attachments:
diff.patch (930.00 B)

2005-02-26 18:08:24

by Marcelo Tosatti

Subject: Re: [PATCH] fix units/partition count in sd.c (2.4.x)

On Wed, Feb 16, 2005 at 11:23:53AM -0500, Luben Tuikov wrote:
> Hi,
>
> This patch fixes the nr_real count in sd.c, which is also used
> in genhd.c to print out the partitions/units. The problem is that
> nr_real is decremented on detach, the genhd's nr_sects is
> cleared but the entry is still there and is being counted
> for when displaying the partitions. Thus when nr_real
> is decremented _and_ a 0-ed partition/unit is counted,
> we get to not display 1 or more entries of the tail of
> the list.
>
> The solution is to not decrement nr_real on detach, effectively
> never decrementing it, and so that it doesn't grow without a bound,
> to throttle it on attach, incrementing it only if it would be
> smaller than nr_dev.
>
> This was observed on a RH kernel and on the current BK kernel.
> Tested and fixed on 2.4.30-pre1 (BK). This patch is against 2.4.30-pre1.
>
> To reproduce: assume 4 scsi disks sda, sdb, sdc, sdd.
> #echo "scsi remove-single-device <sdb-HCTL>" > /proc/scsi/scsi
> #cat /proc/partitions
> <<sdb _and_ sdd are not listed>>

Luben,

On James Bottomley's advice I have applied Soo Lee's fix, which looks cleaner.

Also, as James noticed, this will increase the overhead of /proc/partitions, which might be
a problem on higher-end systems with many devices.

Testing of it on such systems would be highly appreciated.


# 05/02/26 [email protected] 1.1558
# [PATCH] Fix units/partition count in sd.c
#
# Symptom:
# When a scsi disk is removed, the other scsi disk with the biggest minor #
# disappears from /proc/partitions at the same time.
#
# Cause and fix:
# sd.c decreases nr_real on disk removal, but because nr_real is not the
# real # of devices but the max # of devices of a major #,
# it doesn't need to be changed on disk add/remove.
#
# 2.6 has a slightly different structure, but it does something like this:
#
# sd.c:sd_probe()
# gd->minors = 16;
# --------------------------------------------
#
diff -Nru a/drivers/scsi/sd.c b/drivers/scsi/sd.c
--- a/drivers/scsi/sd.c Sat Feb 26 10:46:42 2005
+++ b/drivers/scsi/sd.c Sat Feb 26 10:46:42 2005
@@ -1220,7 +1220,7 @@
 			goto cleanup_gendisks_part;
 		memset(sd_gendisks[i].part, 0, (SCSI_DISKS_PER_MAJOR << 4) * sizeof(struct hd_struct));
 		sd_gendisks[i].sizes = sd_sizes + (i * SCSI_DISKS_PER_MAJOR << 4);
-		sd_gendisks[i].nr_real = 0;
+		sd_gendisks[i].nr_real = SCSI_DISKS_PER_MAJOR;
 		sd_gendisks[i].real_devices =
 		    (void *) (rscsi_disks + i * SCSI_DISKS_PER_MAJOR);
 	}
@@ -1333,7 +1333,6 @@
 	rscsi_disks[i].device = SDp;
 	rscsi_disks[i].has_part_table = 0;
 	sd_template.nr_dev++;
-	SD_GENDISK(i).nr_real++;
 	devnum = i % SCSI_DISKS_PER_MAJOR;
 	SD_GENDISK(i).de_arr[devnum] = SDp->de;
 	if (SDp->removable)
@@ -1447,7 +1446,6 @@
 			SDp->attached--;
 			sd_template.dev_noticed--;
 			sd_template.nr_dev--;
-			SD_GENDISK(i).nr_real--;
 			return;
 		}
 	return;



2005-02-26 19:23:18

by Soo Lee

Subject: Re: [PATCH] fix units/partition count in sd.c (2.4.x)


Thanks for your attention.
I think it only matters when there are many scsi controllers but few disks, like one disk per controller.

Even in such a case, the main user of nr_real does this:
genhd.c:part_show()
	/* show the full disk and all non-0 size partitions of it */
	for (n = 0; n < (gp->nr_real << gp->minor_shift); n++) {
		if (gp->part[n].nr_sects) {

It's a very tight loop when there's no real device.
If the "(gp->nr_real << gp->minor_shift)" part is taken out of the loop
it'll be much more efficient.
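
A minimal sketch of what taking the bound out of the loop could look
like, written as a user-space model and not as a patch to genhd.c: the
struct names and count_shown() are hypothetical, and only the nr_real,
minor_shift and part[].nr_sects fields from the snippet above are
modeled. Whether the compiler already hoists the expression on its own
depends on what the loop body calls, so any gain may be small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down versions of the structures involved. */
struct part_model {
    unsigned long nr_sects;
};

struct gendisk_model {
    int nr_real;
    int minor_shift;
    const struct part_model *part;
};

/* Counts the entries that would be shown, evaluating the loop bound
 * once before the loop instead of on every iteration. */
static int count_shown(const struct gendisk_model *gp)
{
    const int limit = gp->nr_real << gp->minor_shift;
    int n, shown = 0;

    assert(gp->part != NULL);
    for (n = 0; n < limit; n++)
        if (gp->part[n].nr_sects)
            shown++;
    return shown;
}
```

With nr_real = 16 and minor_shift = 4 the loop still visits 256 slots
per major; hoisting the bound only removes the repeated shift and the
two loads from the loop condition.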

And a much simpler proof of "it's safe" is that
we are already living with monsters.
./drivers/md/md.c: nr_real: MAX_MD_DEVS, == 256
./drivers/md/lvm.c: .nr_real = MAX_LV, == 256

So using md or lvm simply adds the overhead of 16 scsi controllers.
So either we are safe, or we were already not safe.

Soohoon Lee.
