2002-10-18 15:47:02

by Cameron, Steve

[permalink] [raw]
Subject: [PATCH] 2.5.43 cciss rescan disk ioctl

Adds CCISS_RESCANDISK ioctl. Applies to 2.5.43 after the
previous 9 patches I sent.
-- steve

This is meant to be used in a configuration like
Steeleye's Lifekeeper. Two hosts connect to the storage, one
reserves disks. The 2nd will not be able to read the partition
information because of the reservations. In the event the 1st
system fails, the 2nd can detect this, (via special hardware +
software typically) and then take over the storage and rescan
the disks via this ioctl.

diff -urN linux-2.5.43-i/drivers/block/cciss.c linux-2.5.43-j/drivers/block/cciss.c
--- linux-2.5.43-i/drivers/block/cciss.c Thu Oct 17 13:20:30 2002
+++ linux-2.5.43-j/drivers/block/cciss.c Thu Oct 17 13:28:23 2002
@@ -46,12 +46,12 @@
#include <linux/completion.h>

#define CCISS_DRIVER_VERSION(maj,min,submin) ((maj<<16)|(min<<8)|(submin))
-#define DRIVER_NAME "Compaq CISS Driver (v 2.5.0)"
-#define DRIVER_VERSION CCISS_DRIVER_VERSION(2,5,0)
+#define DRIVER_NAME "Compaq CISS Driver (v 2.5.1)"
+#define DRIVER_VERSION CCISS_DRIVER_VERSION(2,5,1)

/* Embedded module documentation macros - see modules.h */
MODULE_AUTHOR("Charles M. White III - Compaq Computer Corporation");
-MODULE_DESCRIPTION("Driver for Compaq Smart Array Controller 5xxx v. 2.5.0");
+MODULE_DESCRIPTION("Driver for Compaq Smart Array Controller 5xxx v. 2.5.1");
MODULE_LICENSE("GPL");

#include "cciss_cmd.h"
@@ -108,6 +108,7 @@
static int cciss_revalidate(kdev_t dev);
static int deregister_disk(int ctlr, int logvol);
static int register_new_disk(int cltr);
+static int cciss_rescan_disk(int cltr, int logvol);

static void cciss_getgeometry(int cntl_num);

@@ -351,13 +352,22 @@
return -ENXIO;
/*
* Root is allowed to open raw volume zero even if its not configured
- * so array config can still work. I don't think I really like this,
+ * so array config can still work. Root is also allowed to open any
+ * volume that has a LUN ID, so it can issue IOCTL to reread the
+ * disk information. I don't think I really like this,
* but I'm already using way to many device nodes to claim another one
* for "raw controller".
*/
if (hba[ctlr]->drv[dsk].nr_blocks == 0) {
- if (minor(inode->i_rdev) != 0)
- return -ENXIO;
+ if (minor(inode->i_rdev) != 0) {
+ /* if not node 0 make sure it is a partition = 0 */
+ if (minor(inode->i_rdev) & 0x0f)
+ return -ENXIO;
+ /* if it is, make sure we have a LUN ID */
+ if (hba[ctlr]->drv[minor(inode->i_rdev)
+ >> NWD_SHIFT].LunID == 0)
+ return -ENXIO;
+ }
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
}
@@ -574,6 +584,9 @@
case CCISS_REVALIDVOLS:
return( revalidate_allvol(inode->i_rdev));

+ case CCISS_RESCANDISK:
+ return(cciss_rescan_disk(ctlr, dsk));
+
case CCISS_GETLUNINFO: {
LogvolInfo_struct luninfo;
struct gendisk *disk = hba[ctlr]->gendisk[dsk];
@@ -1253,6 +1266,52 @@
kfree(size_buff);
kfree(inq_buff);
return (logvol);
+}
+static int cciss_rescan_disk(int ctlr, int logvol)
+{
+ /* this is meant to be used in a configuration like
+ Steeleye's Lifekeeper. Two hosts connect to the storage, one
+ reserves disks. The 2nd will not be able to read the partition
+ information because of the reservations. In the event the 1st
+ system fails, the 2nd can detect this, (via special hardware +
+ software typically) and then take over the storage and rescan
+ the disks via this ioctl. */
+
+ struct gendisk *disk = hba[ctlr]->gendisk[logvol];
+ ReadCapdata_struct *size_buff;
+ InquiryData_struct *inq_buff;
+ unsigned int block_size;
+ unsigned int total_size;
+ kdev_t kdev;
+ struct block_device *bdev;
+
+ if (!capable(CAP_SYS_RAWIO))
+ return -EPERM;
+ if (hba[ctlr]->drv[logvol].nr_blocks != 0) {
+ /* disk is possibly on line, return just a warning */
+ return 1;
+ }
+ size_buff = kmalloc(sizeof( ReadCapdata_struct), GFP_KERNEL);
+ if (size_buff == NULL) {
+ printk(KERN_ERR "cciss: out of memory\n");
+ return -1;
+ }
+ inq_buff = kmalloc(sizeof( InquiryData_struct), GFP_KERNEL);
+ if (inq_buff == NULL) {
+ printk(KERN_ERR "cciss: out of memory\n");
+ kfree(size_buff);
+ return -1;
+ }
+ cciss_read_capacity(ctlr, logvol, size_buff, 1, &total_size,
+ &block_size);
+ cciss_geometry_inquiry(ctlr, logvol, 1, total_size, block_size,
+ inq_buff, &hba[ctlr]->drv[logvol]);
+ kdev = mk_kdev(MAJOR_NR + ctlr, disk->first_minor);
+ bdev = bdget(kdev_t_to_nr(kdev));
+ rescan_partitions(disk, bdev);
+ kfree(size_buff);
+ kfree(inq_buff);
+ return(0);
}
/*
* Wait polling for a command to complete.
diff -urN linux-2.5.43-i/include/linux/cciss_ioctl.h linux-2.5.43-j/include/linux/cciss_ioctl.h
--- linux-2.5.43-i/include/linux/cciss_ioctl.h Thu Oct 17 13:21:16 2002
+++ linux-2.5.43-j/include/linux/cciss_ioctl.h Thu Oct 17 13:29:05 2002
@@ -195,6 +195,7 @@
#define CCISS_REGNEWDISK _IOW(CCISS_IOC_MAGIC, 13, int)

#define CCISS_REGNEWD _IO(CCISS_IOC_MAGIC, 14)
+#define CCISS_RESCANDISK _IO(CCISS_IOC_MAGIC, 16)
#define CCISS_GETLUNINFO _IOR(CCISS_IOC_MAGIC, 17, LogvolInfo_struct)

#endif


2002-10-18 15:59:34

by Alexander Viro

[permalink] [raw]
Subject: Re: [PATCH] 2.5.43 cciss rescan disk ioctl



On Fri, 18 Oct 2002, Stephen Cameron wrote:

> Adds CCISS_RESCANDISK ioctl. Applies to 2.5.43 after the
> previous 9 patches I sent.
> -- steve
>
> This is meant to be used in a configuration like
> Steeleye's Lifekeeper. Two hosts connect to the storage, one
> reserves disks. The 2nd will not be able to read the partition
> information because of the reservations. In the event the 1st
> system fails, the 2nd can detect this, (via special hardware +
> software typically) and then take over the storage and rescan
> the disks via this ioctl.


> + if (minor(inode->i_rdev) != 0) {
> + /* if not node 0 make sure it is a partition = 0 */
> + if (minor(inode->i_rdev) & 0x0f)
> + return -ENXIO;

That's bogus. We call ->open() only for entire disk.

> + kdev = mk_kdev(MAJOR_NR + ctlr, disk->first_minor);
> + bdev = bdget(kdev_t_to_nr(kdev));
> + rescan_partitions(disk, bdev);

... and that is (a) obvious leak and (b) broken unless disk is already
open - it will try to do IO on bdev.

More interesting issue, though, is whether we need to bother with
complications coming from that interface anymore. Notice that
comment re "too many device nodes" doesn't hold these days - that
sort of stuff looks like a candidate for "write command into node
on driverfs" - especially when we are talking about configuring
the device, shutting disks down/starting them up, etc. Linus, Pat?
Unless I'm mistaken that's precisely the sort of work that is
supposed to live in driverfs...

2002-10-18 16:37:33

by Cameron, Steve

[permalink] [raw]
Subject: RE: [PATCH] 2.5.43 cciss rescan disk ioctl

Al Viro wrote:

> On Fri, 18 Oct 2002, Stephen Cameron wrote:
>
[...]
>
> > + if (minor(inode->i_rdev) != 0) {
> > + /* if not node 0 make sure it is a
> partition = 0 */
> > + if (minor(inode->i_rdev) & 0x0f)
> > + return -ENXIO;
>
> That's bogus. We call ->open() only for entire disk.

Ok.

>
> > + kdev = mk_kdev(MAJOR_NR + ctlr, disk->first_minor);
> > + bdev = bdget(kdev_t_to_nr(kdev));
> > + rescan_partitions(disk, bdev);
>
> ... and that is (a) obvious leak and

Ok. I thought it might be, but I wasn't sure.
(I was able to imagine a way it might work
and not be a leak, as bdget appears to search through a list and only
alloc if it doesn't find it already there.) But I don't have
anything like a complete understanding of bdget(), to state the
obvious.

< (b) broken unless disk is already open - it will try to do IO on bdev.

Hmm. It worked when I tried it (apart from the leak).
Maybe the disk was already open. The situation was that the
disk is there, (will respond to inquiry) but is reserved by
another host so read capacity will fail. Maybe I was just
(un)lucky. Anyway, I think you're saying there are situations
that I didn't try that don't work. Ok, forget this patch.

> More interesting issue, though, is whether we need to bother with
> complications coming from that interface anymore. Notice that
> comment re "too many device nodes" doesn't hold these days - that
> sort of stuff looks like a candidate for "write command into node
> on driverfs" - especially when we are talking about configuring
> the device, shutting disks down/starting them up, etc. Linus, Pat?
> Unless I'm mistaken that's precisely the sort of work that is
> supposed to live in driverfs...
>

I haven't looked into driverfs too much yet. (too much time
spent keeping up with the latest redhat/suse kernel, etc.)

Is there anything written up about it to help out driver
writers? I'm not much looking forward to telling our ACU
and CIM guys they need to change all their apps to use
driverfs interfaces. Also, I'm hoping driverfs is better
than /proc, in the sense that /proc tends to be read-only,
or write-only, unlike ioctl() where you can put in a complex
input and get back a corresponding output. With the likes of
/proc, you can write something in there, but how do you get
back a corresponding output? Read it? Some other thread
can race you for it? (And if you're talking about shell
scripts using it, then processes could race.)

An ioctl-ish interface is superior for some applications.
(not that the one that this patch
lamely attempts to implement is necessarily such an application,
but the array config utility definitely is.) Hopefully
driverfs provides that kind of interface.

-- steve