2004-10-13 21:26:16

by Mike Miller

[permalink] [raw]
Subject: cciss update [2/2] fixes for Steeleye Lifekeeper

This patch addresses a problem with clustering software and also some of our utilies. We had to modify the modify the open specifically for clustering. This is the same as was done for 2.4. We also have to register the reserved volumes with the OS so when the backup server breaks the reservations he can call BLKRRPART to set the volume to the correct size.
We also have to register a controller that may not have any logical volumes configured. This is for the online utils as well as clustering.

Patch applies to 2.6.9-rc4. Please apply in order.
Please consider this for inclusion.

Thanks,
mikem
-------------------------------------------------------------------------------

diff -urNp lx269-rc4-p001/drivers/block/cciss.c lx269-rc4/drivers/block/cciss.c
--- lx269-rc4-p001/drivers/block/cciss.c 2004-10-13 13:53:49.545284000 -0500
+++ lx269-rc4/drivers/block/cciss.c 2004-10-13 15:26:47.312335048 -0500
@@ -438,13 +438,22 @@ static int cciss_open(struct inode *inod

/*
* Root is allowed to open raw volume zero even if it's not configured
- * so array config can still work. I don't think I really like this,
+ * so array config can still work. Root is also allowed to open any
+ * volume that has a LUN ID, so it can issue IOCTL to reread the
+ * disk information. I don't think I really like this
* but I'm already using way to many device nodes to claim another one
* for "raw controller".
*/
if (drv->nr_blocks == 0) {
- if (iminor(inode) != 0)
+ if (iminor(inode) != 0) { /* not node 0? */
+ /* if not node 0 make sure it is a partition = 0 */
+ if (iminor(inode) & 0x0f) {
return -ENXIO;
+ /* if it is, make sure we have a LUN ID */
+ } else if (drv->LunID == 0) {
+ return -ENXIO;
+ }
+ }
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
}
@@ -1095,13 +1104,6 @@ cleanup1:

}

-static int cciss_revalidate(struct gendisk *disk)
-{
- drive_info_struct *drv = disk->private_data;
- set_capacity(disk, drv->nr_blocks);
- return 0;
-}
-
/*
* revalidate_allvol is for online array config utilities. After a
* utility reconfigures the drives in the array, it can use this function
@@ -1153,7 +1155,9 @@ static int revalidate_allvol(ctlr_info_t
for (i = 0; i < NWD; i++) {
struct gendisk *disk = host->gendisk[i];
drive_info_struct *drv = &(host->drv[i]);
- if (!drv->nr_blocks)
+ /* we must register the controller even if no disks exist */
+ /* this is for the online array utilities */
+ if (!drv->heads && i)
continue;
blk_queue_hardsect_size(host->queue, drv->block_size);
set_capacity(disk, drv->nr_blocks);
@@ -1485,13 +1489,7 @@ static void cciss_geometry_inquiry(int c
}
}
} else { /* Get geometry failed */
- printk(KERN_WARNING "cciss: reading geometry failed, "
- "continuing with default geometry\n");
- drv->block_size = block_size;
- drv->nr_blocks = total_size;
- drv->heads = 255;
- drv->sectors = 32; // Sectors per track
- drv->cylinders = total_size / 255 / 32;
+ printk(KERN_WARNING "cciss: reading geometry failed\n");
}
printk(KERN_INFO " heads= %d, sectors= %d, cylinders= %d\n\n",
drv->heads, drv->sectors, drv->cylinders);
@@ -1520,6 +1518,7 @@ cciss_read_capacity(int ctlr, int logvol
*total_size, *block_size);
return;
}
+
static int register_new_disk(ctlr_info_t *h)
{
struct gendisk *disk;
@@ -1663,7 +1662,9 @@ static int register_new_disk(ctlr_info_t
/* setup partitions per disk */
disk = h->gendisk[logvol];
set_capacity(disk, h->drv[logvol].nr_blocks);
- add_disk(disk);
+ /* if it's the controller it's already added */
+ if(logvol)
+ add_disk(disk);
freeret:
kfree(ld_buff);
kfree(size_buff);
@@ -1675,6 +1676,53 @@ free_err:
logvol = -1;
goto freeret;
}
+
+static int cciss_revalidate(struct gendisk *disk)
+{
+ ctlr_info_t *h = get_host(disk);
+ drive_info_struct *drv = get_drv(disk);
+ int logvol;
+ int FOUND=0;
+ unsigned int block_size;
+ unsigned int total_size;
+ ReadCapdata_struct *size_buff = NULL;
+ InquiryData_struct *inq_buff = NULL;
+
+ for(logvol=0; logvol < CISS_MAX_LUN; logvol++)
+ {
+ if(h->drv[logvol].LunID == drv->LunID) {
+ FOUND=1;
+ break;
+ }
+ }
+
+ if (!FOUND) return 1;
+
+ size_buff = kmalloc(sizeof( ReadCapdata_struct), GFP_KERNEL);
+ if (size_buff == NULL)
+ {
+ printk(KERN_WARNING "cciss: out of memory\n");
+ return 1;
+ }
+ inq_buff = kmalloc(sizeof( InquiryData_struct), GFP_KERNEL);
+ if (inq_buff == NULL)
+ {
+ printk(KERN_WARNING "cciss: out of memory\n");
+ kfree(size_buff);
+ return 1;
+ }
+
+ cciss_read_capacity(h->ctlr, logvol, size_buff, 1, &total_size, &block_size);
+ cciss_geometry_inquiry(h->ctlr, logvol, 1, total_size, block_size, inq_buff, drv);
+
+ blk_queue_hardsect_size(h->queue, drv->block_size);
+ set_capacity(disk, drv->nr_blocks);
+
+ kfree(size_buff);
+ kfree(inq_buff);
+ return 0;
+}
+
/*
* Wait polling for a command to complete.
* The memory mapped FIFO is polled for the completion.
@@ -2762,7 +2810,9 @@ static int __devinit cciss_init_one(stru
disk->fops = &cciss_fops;
disk->queue = hba[i]->queue;
disk->private_data = drv;
- if( !(drv->nr_blocks))
+ /* we must register the controller even if no disks exist */
+ /* this is for the online array utilities */
+ if(!drv->heads && j)
continue;
blk_queue_hardsect_size(hba[i]->queue, drv->block_size);
set_capacity(disk, drv->nr_blocks);


2004-10-14 08:39:33

by Christoph Hellwig

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Wed, Oct 13, 2004 at 04:22:53PM -0500, [email protected] wrote:
> This patch addresses a problem with clustering software and also some of our utilies. We had to modify the modify the open specifically for clustering. This is the same as was done for 2.4. We also have to register the reserved volumes with the OS so when the backup server breaks the reservations he can call BLKRRPART to set the volume to the correct size.
> We also have to register a controller that may not have any logical volumes configured. This is for the online utils as well as clustering.
>
> Patch applies to 2.6.9-rc4. Please apply in order.
> Please consider this for inclusion.

No, this is bogus. Never call add_disk on a volume that hasn't been
configured and full set up. If you need to talk to our driver without
online volumes add a character device.

2004-10-14 13:48:56

by Mike Miller (OS Dev)

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Thu, Oct 14, 2004 at 09:39:00AM +0100, Christoph Hellwig wrote:
> On Wed, Oct 13, 2004 at 04:22:53PM -0500, [email protected] wrote:
> > This patch addresses a problem with clustering software and also some of our utilies. We had to modify the modify the open specifically for clustering. This is the same as was done for 2.4. We also have to register the reserved volumes with the OS so when the backup server breaks the reservations he can call BLKRRPART to set the volume to the correct size.
> > We also have to register a controller that may not have any logical volumes configured. This is for the online utils as well as clustering.
> >
> > Patch applies to 2.6.9-rc4. Please apply in order.
> > Please consider this for inclusion.
>
> No, this is bogus. Never call add_disk on a volume that hasn't been
> configured and full set up. If you need to talk to our driver without
> online volumes add a character device.
>
This is the way we've done it since the 2.2 kernel. We have legacy applications
in the field that expect to open c?d0 as the controller. Creating a character
device would require all these apps to change.

mikem

2004-10-14 14:46:11

by James Bottomley

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Thu, 2004-10-14 at 03:39, Christoph Hellwig wrote:
> No, this is bogus. Never call add_disk on a volume that hasn't been
> configured and full set up. If you need to talk to our driver without
> online volumes add a character device.

I don't think so ... it's a volume you know exists but you can't get
access to (in a shared storage configuration). In SCSI we have two
examples of this now:

1. The case where the device responds to inquiry but errors read
capacity (usually because of a reservation conflict). Here we configure
a totally bogus 1GB volume with 512 byte sectors.

2. The case where the device has an illegal sector size. Here we
configure a zero size volume.

In both cases we expose the device in sd but make it effectively
inaccessible because we want to be able to send ioctls to it but not
allow ordinary read/writes. (Although I'd really like not to configure
a bogus size in case 1.). I don't see how the cciss case is any
different from this.

James


2004-10-14 18:55:13

by Christoph Hellwig

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Thu, Oct 14, 2004 at 09:37:32AM -0500, James Bottomley wrote:
> I don't think so ... it's a volume you know exists but you can't get
> access to (in a shared storage configuration). In SCSI we have two
> examples of this now:

Such a volume has been configured and set up, and although it's still
ugly I'd say it's okay. But the patch also adds one gendisk per controller
even if no volume is set up.

2004-10-15 15:12:54

by James Bottomley

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Thu, 2004-10-14 at 13:39, Christoph Hellwig wrote:
> Such a volume has been configured and set up, and although it's still
> ugly I'd say it's okay. But the patch also adds one gendisk per controller
> even if no volume is set up.

That's this bit of code:

@@ -2762,7 +2810,9 @@ static int __devinit cciss_init_one(stru
disk->fops = &cciss_fops;
disk->queue = hba[i]->queue;
disk->private_data = drv;
- if( !(drv->nr_blocks))
+ /* we must register the controller even if no disks
exist */
+ /* this is for the online array utilities */
+ if(!drv->heads && j)
continue;
blk_queue_hardsect_size(hba[i]->queue, drv->block_size);
set_capacity(disk, drv->nr_blocks);

Mike, is there a way we can only allocate a gendisk when we know there's
actually a device there (if owned by another controller currently)?

James


2004-10-18 16:38:19

by Mike Miller (OS Dev)

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Fri, Oct 15, 2004 at 10:05:09AM -0500, James Bottomley wrote:
> On Thu, 2004-10-14 at 13:39, Christoph Hellwig wrote:
> > Such a volume has been configured and set up, and although it's still
> > ugly I'd say it's okay. But the patch also adds one gendisk per controller
> > even if no volume is set up.
>
> That's this bit of code:
>
> @@ -2762,7 +2810,9 @@ static int __devinit cciss_init_one(stru
> disk->fops = &cciss_fops;
> disk->queue = hba[i]->queue;
> disk->private_data = drv;
> - if( !(drv->nr_blocks))
> + /* we must register the controller even if no disks
> exist */
> + /* this is for the online array utilities */
> + if(!drv->heads && j)
> continue;
> blk_queue_hardsect_size(hba[i]->queue, drv->block_size);
> set_capacity(disk, drv->nr_blocks);
>
> Mike, is there a way we can only allocate a gendisk when we know there's
> actually a device there (if owned by another controller currently)?
>
> James

This patch only registers the controller if no logical drives are configured. It will not result in all possible logical drives being added. I added printk's to the driver to show me what I'm registering.
What I see is the controller registers every time, and only drives that are phsically configured are registered. That is true for reserved drives, also.

Thanks,
mikem

2004-10-18 19:51:28

by James Bottomley

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Mon, 2004-10-18 at 11:35, mikem wrote:
> This patch only registers the controller if no logical drives are configured. It will not result in all possible logical drives being added. I added printk's to the driver to show me what I'm registering.
> What I see is the controller registers every time, and only drives that are phsically configured are registered. That is true for reserved drives, also.

It also looks like this device is always the one used when the array
comes on line, so it's only a shadow for as long as the actual array has
none of it's storage configured. OK.

The code also seems to imply that we use a single block queue for all of
the array devices ... isn't that a bit inefficient?

James




2004-10-20 20:24:54

by Mike Miller (OS Dev)

[permalink] [raw]
Subject: Re: cciss update [2/2] fixes for Steeleye Lifekeeper

On Mon, Oct 18, 2004 at 02:45:24PM -0500, James Bottomley wrote:
> On Mon, 2004-10-18 at 11:35, mikem wrote:
> > This patch only registers the controller if no logical drives are configured. It will not result in all possible logical drives being added. I added printk's to the driver to show me what I'm registering.
> > What I see is the controller registers every time, and only drives that are phsically configured are registered. That is true for reserved drives, also.
>
> It also looks like this device is always the one used when the array
> comes on line, so it's only a shadow for as long as the actual array has
> none of it's storage configured. OK.
>
> The code also seems to imply that we use a single block queue for all of
> the array devices ... isn't that a bit inefficient?
>
> James
Yes, it's not a perfect solution, but I have yet to complete my fairness algorithm for the per drive queues.

mikem
>
>
>
>