2007-02-16 19:37:19

by Alan Stern

[permalink] [raw]
Subject: [PATCH] Block layer: separate out queue-oriented ioctls

From: James Bottomley <[email protected]>

This patch (as854) separates out the two queue-oriented ioctls from
the rest of the block-layer ioctls. The idea is that they should
apply to any driver using a request_queue, even if the driver doesn't
implement a block-device interface. The prototypical example is the
sg driver, to which the patch adds the new interface.

This will make it possible for cdrecord and related programs to
retrieve reliably the max_sectors value, regardless of whether the
user points it to an sr or an sg device. In particular, this will
resolve Bugzilla entry #7026.

Signed-off-by: Alan Stern <[email protected]>

---

Jens:

James said that he feels you should be be the person to accept this
patch, since it affects the block layer. Please merge it and send it
on up the hierarchy.

Alan Stern



diff --git a/block/ioctl.c b/block/ioctl.c
index 58aab63..8444d0c 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -135,6 +135,31 @@ static int put_u64(unsigned long arg, u6
return put_user(val, (u64 __user *)arg);
}

+static int blk_queue_locked_ioctl(struct request_queue *queue,
+ unsigned cmd, unsigned long arg)
+{
+ switch (cmd) {
+ case BLKSSZGET: /* get block device hardware sector size */
+ return put_int(arg, queue_hardsect_size(queue));
+ case BLKSECTGET:
+ return put_ushort(arg, queue->max_sectors);
+ }
+ return -ENOIOCTLCMD;
+}
+
+int blk_queue_ioctl(struct request_queue *queue, unsigned cmd,
+ unsigned long arg)
+{
+ int ret;
+
+ lock_kernel();
+ ret = blk_queue_locked_ioctl(queue, cmd, arg);
+ unlock_kernel();
+
+ return ret;
+}
+EXPORT_SYMBOL(blk_queue_ioctl);
+
static int blkdev_locked_ioctl(struct file *file, struct block_device *bdev,
unsigned cmd, unsigned long arg)
{
@@ -154,10 +179,6 @@ static int blkdev_locked_ioctl(struct fi
return put_int(arg, bdev_read_only(bdev) != 0);
case BLKBSZGET: /* get the logical block size (cf. BLKSSZGET) */
return put_int(arg, block_size(bdev));
- case BLKSSZGET: /* get block device hardware sector size */
- return put_int(arg, bdev_hardsect_size(bdev));
- case BLKSECTGET:
- return put_ushort(arg, bdev_get_queue(bdev)->max_sectors);
case BLKRASET:
case BLKFRASET:
if(!capable(CAP_SYS_ADMIN))
@@ -278,6 +299,8 @@ int blkdev_ioctl(struct inode *inode, st

lock_kernel();
ret = blkdev_locked_ioctl(file, bdev, cmd, arg);
+ if (ret == -ENOIOCTLCMD)
+ ret = blk_queue_locked_ioctl(bdev_get_queue(bdev), cmd, arg);
unlock_kernel();
if (ret != -ENOIOCTLCMD)
return ret;
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 81e3bc7..d97244b 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -786,6 +786,11 @@ sg_ioctl(struct inode *inode, struct fil
sdp->disk->disk_name, (int) cmd_in));
read_only = (O_RDWR != (filp->f_flags & O_ACCMODE));

+ /* block ioctls first */
+ result = blk_queue_ioctl(sdp->device->request_queue, cmd_in, arg);
+ if (result != -ENOIOCTLCMD)
+ return result;
+
switch (cmd_in) {
case SG_IO:
{
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e1c7286..550b04a 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -754,6 +754,8 @@ extern void blk_queue_prep_rq(request_qu
extern void blk_queue_merge_bvec(request_queue_t *, merge_bvec_fn *);
extern void blk_queue_dma_alignment(request_queue_t *, int);
extern void blk_queue_softirq_done(request_queue_t *, softirq_done_fn *);
+extern int blk_queue_ioctl(struct request_queue *queue, unsigned cmd,
+ unsigned long arg);
extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
extern int blk_queue_ordered(request_queue_t *, unsigned, prepare_flush_fn *);
extern void blk_queue_issue_flush_fn(request_queue_t *, issue_flush_fn *);


2007-02-17 06:28:40

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

On Fri, Feb 16 2007, Alan Stern wrote:
> From: James Bottomley <[email protected]>
>
> This patch (as854) separates out the two queue-oriented ioctls from
> the rest of the block-layer ioctls. The idea is that they should
> apply to any driver using a request_queue, even if the driver doesn't
> implement a block-device interface. The prototypical example is the
> sg driver, to which the patch adds the new interface.
>
> This will make it possible for cdrecord and related programs to
> retrieve reliably the max_sectors value, regardless of whether the
> user points it to an sr or an sg device. In particular, this will
> resolve Bugzilla entry #7026.

The block bits are fine with me, the sg calling point is a bit of a sore
thumb (a char driver calling into block layer ioctls) though. So the
block layer bits are certainly ok with me, if Doug acks the sg bit I'll
merge everything up.

(patch left below)

> Signed-off-by: Alan Stern <[email protected]>
>
> ---
>
> Jens:
>
> James said that he feels you should be be the person to accept this
> patch, since it affects the block layer. Please merge it and send it
> on up the hierarchy.
>
> Alan Stern
>
>
>
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 58aab63..8444d0c 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -135,6 +135,31 @@ static int put_u64(unsigned long arg, u6
> return put_user(val, (u64 __user *)arg);
> }
>
> +static int blk_queue_locked_ioctl(struct request_queue *queue,
> + unsigned cmd, unsigned long arg)
> +{
> + switch (cmd) {
> + case BLKSSZGET: /* get block device hardware sector size */
> + return put_int(arg, queue_hardsect_size(queue));
> + case BLKSECTGET:
> + return put_ushort(arg, queue->max_sectors);
> + }
> + return -ENOIOCTLCMD;
> +}
> +
> +int blk_queue_ioctl(struct request_queue *queue, unsigned cmd,
> + unsigned long arg)
> +{
> + int ret;
> +
> + lock_kernel();
> + ret = blk_queue_locked_ioctl(queue, cmd, arg);
> + unlock_kernel();
> +
> + return ret;
> +}
> +EXPORT_SYMBOL(blk_queue_ioctl);
> +
> static int blkdev_locked_ioctl(struct file *file, struct block_device *bdev,
> unsigned cmd, unsigned long arg)
> {
> @@ -154,10 +179,6 @@ static int blkdev_locked_ioctl(struct fi
> return put_int(arg, bdev_read_only(bdev) != 0);
> case BLKBSZGET: /* get the logical block size (cf. BLKSSZGET) */
> return put_int(arg, block_size(bdev));
> - case BLKSSZGET: /* get block device hardware sector size */
> - return put_int(arg, bdev_hardsect_size(bdev));
> - case BLKSECTGET:
> - return put_ushort(arg, bdev_get_queue(bdev)->max_sectors);
> case BLKRASET:
> case BLKFRASET:
> if(!capable(CAP_SYS_ADMIN))
> @@ -278,6 +299,8 @@ int blkdev_ioctl(struct inode *inode, st
>
> lock_kernel();
> ret = blkdev_locked_ioctl(file, bdev, cmd, arg);
> + if (ret == -ENOIOCTLCMD)
> + ret = blk_queue_locked_ioctl(bdev_get_queue(bdev), cmd, arg);
> unlock_kernel();
> if (ret != -ENOIOCTLCMD)
> return ret;
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 81e3bc7..d97244b 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -786,6 +786,11 @@ sg_ioctl(struct inode *inode, struct fil
> sdp->disk->disk_name, (int) cmd_in));
> read_only = (O_RDWR != (filp->f_flags & O_ACCMODE));
>
> + /* block ioctls first */
> + result = blk_queue_ioctl(sdp->device->request_queue, cmd_in, arg);
> + if (result != -ENOIOCTLCMD)
> + return result;
> +
> switch (cmd_in) {
> case SG_IO:
> {
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index e1c7286..550b04a 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -754,6 +754,8 @@ extern void blk_queue_prep_rq(request_qu
> extern void blk_queue_merge_bvec(request_queue_t *, merge_bvec_fn *);
> extern void blk_queue_dma_alignment(request_queue_t *, int);
> extern void blk_queue_softirq_done(request_queue_t *, softirq_done_fn *);
> +extern int blk_queue_ioctl(struct request_queue *queue, unsigned cmd,
> + unsigned long arg);
> extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
> extern int blk_queue_ordered(request_queue_t *, unsigned, prepare_flush_fn *);
> extern void blk_queue_issue_flush_fn(request_queue_t *, issue_flush_fn *);
>

--
Jens Axboe

2007-02-17 21:35:19

by Joerg.Schilling

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

Jens Axboe <[email protected]> wrote:

> > This will make it possible for cdrecord and related programs to
> > retrieve reliably the max_sectors value, regardless of whether the
> > user points it to an sr or an sg device. In particular, this will
> > resolve Bugzilla entry #7026.
>
> The block bits are fine with me, the sg calling point is a bit of a sore
> thumb (a char driver calling into block layer ioctls) though. So the
> block layer bits are certainly ok with me, if Doug acks the sg bit I'll
> merge everything up.

If you care about this kind of order, you would first need to eliminate the
ability of doing low level things like SG_IO from block drivers as this is
similar low level stuff that rather belongs to low level drivers.

Any driver that allows to use low level interfaces to send RAW SCSI commands
also needs access to other lowlevel stuff like the ability to know the
max. DMA size.



J?rg

--
EMail:[email protected] (home) J?rg Schilling D-13353 Berlin
[email protected] (uni)
[email protected] (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily

2007-02-18 03:43:15

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

Jens Axboe wrote:
> On Fri, Feb 16 2007, Alan Stern wrote:
>> From: James Bottomley <[email protected]>
>>
>> This patch (as854) separates out the two queue-oriented ioctls from
>> the rest of the block-layer ioctls. The idea is that they should
>> apply to any driver using a request_queue, even if the driver doesn't
>> implement a block-device interface. The prototypical example is the
>> sg driver, to which the patch adds the new interface.
>>
>> This will make it possible for cdrecord and related programs to
>> retrieve reliably the max_sectors value, regardless of whether the
>> user points it to an sr or an sg device. In particular, this will
>> resolve Bugzilla entry #7026.
>
> The block bits are fine with me, the sg calling point is a bit of a sore
> thumb (a char driver calling into block layer ioctls) though. So the
> block layer bits are certainly ok with me, if Doug acks the sg bit I'll
> merge everything up.
>
> (patch left below)

Does this need to be in the sg driver?

What is the hardware sector size of a SES or OSD device?

As for the max_sector variable, wouldn't it be better
to generate a new ioctl that yielded the limit in bytes?
Making a driver variable that implicitly assumes sectors
are 512 bytes in length more visible to the user space
seems like a step in the wrong direction.

Alternatively the SG_GET_RESERVED_SIZE ioctl could be
modified to yield no more than max_sectors*512 .

Doug Gilbert


P.S. See comment below

>> Signed-off-by: Alan Stern <[email protected]>
>>
>> ---
>>
>> Jens:
>>
>> James said that he feels you should be be the person to accept this
>> patch, since it affects the block layer. Please merge it and send it
>> on up the hierarchy.
>>
>> Alan Stern
>>
>>
>>
>> diff --git a/block/ioctl.c b/block/ioctl.c
>> index 58aab63..8444d0c 100644
>> --- a/block/ioctl.c
>> +++ b/block/ioctl.c
>> @@ -135,6 +135,31 @@ static int put_u64(unsigned long arg, u6
>> return put_user(val, (u64 __user *)arg);
>> }
>>
>> +static int blk_queue_locked_ioctl(struct request_queue *queue,
>> + unsigned cmd, unsigned long arg)
>> +{
>> + switch (cmd) {
>> + case BLKSSZGET: /* get block device hardware sector size */
>> + return put_int(arg, queue_hardsect_size(queue));
>> + case BLKSECTGET:
>> + return put_ushort(arg, queue->max_sectors);
>> + }
>> + return -ENOIOCTLCMD;
>> +}
>> +
>> +int blk_queue_ioctl(struct request_queue *queue, unsigned cmd,
>> + unsigned long arg)
>> +{
>> + int ret;
>> +
>> + lock_kernel();
>> + ret = blk_queue_locked_ioctl(queue, cmd, arg);
>> + unlock_kernel();
>> +
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(blk_queue_ioctl);
>> +
>> static int blkdev_locked_ioctl(struct file *file, struct block_device *bdev,
>> unsigned cmd, unsigned long arg)
>> {
>> @@ -154,10 +179,6 @@ static int blkdev_locked_ioctl(struct fi
>> return put_int(arg, bdev_read_only(bdev) != 0);
>> case BLKBSZGET: /* get the logical block size (cf. BLKSSZGET) */
>> return put_int(arg, block_size(bdev));
>> - case BLKSSZGET: /* get block device hardware sector size */
>> - return put_int(arg, bdev_hardsect_size(bdev));
>> - case BLKSECTGET:
>> - return put_ushort(arg, bdev_get_queue(bdev)->max_sectors);
>> case BLKRASET:
>> case BLKFRASET:
>> if(!capable(CAP_SYS_ADMIN))
>> @@ -278,6 +299,8 @@ int blkdev_ioctl(struct inode *inode, st
>>
>> lock_kernel();
>> ret = blkdev_locked_ioctl(file, bdev, cmd, arg);
>> + if (ret == -ENOIOCTLCMD)
>> + ret = blk_queue_locked_ioctl(bdev_get_queue(bdev), cmd, arg);
>> unlock_kernel();
>> if (ret != -ENOIOCTLCMD)
>> return ret;
>> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
>> index 81e3bc7..d97244b 100644
>> --- a/drivers/scsi/sg.c
>> +++ b/drivers/scsi/sg.c
>> @@ -786,6 +786,11 @@ sg_ioctl(struct inode *inode, struct fil
>> sdp->disk->disk_name, (int) cmd_in));
>> read_only = (O_RDWR != (filp->f_flags & O_ACCMODE));
>>
>> + /* block ioctls first */

Why first??
That would allow the block layer to overtake any currently
defined and working sg ioctl.
Surely, if it was put in, it should be in the default case.

>> + result = blk_queue_ioctl(sdp->device->request_queue, cmd_in, arg);
>> + if (result != -ENOIOCTLCMD)
>> + return result;
>> +
>> switch (cmd_in) {
>> case SG_IO:
>> {
>> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> index e1c7286..550b04a 100644
>> --- a/include/linux/blkdev.h
>> +++ b/include/linux/blkdev.h
>> @@ -754,6 +754,8 @@ extern void blk_queue_prep_rq(request_qu
>> extern void blk_queue_merge_bvec(request_queue_t *, merge_bvec_fn *);
>> extern void blk_queue_dma_alignment(request_queue_t *, int);
>> extern void blk_queue_softirq_done(request_queue_t *, softirq_done_fn *);
>> +extern int blk_queue_ioctl(struct request_queue *queue, unsigned cmd,
>> + unsigned long arg);
>> extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
>> extern int blk_queue_ordered(request_queue_t *, unsigned, prepare_flush_fn *);
>> extern void blk_queue_issue_flush_fn(request_queue_t *, issue_flush_fn *);
>>
>

2007-02-18 12:40:26

by Joerg.Schilling

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

Douglas Gilbert <[email protected]> wrote:

> Jens Axboe wrote:
> > On Fri, Feb 16 2007, Alan Stern wrote:
> >> From: James Bottomley <[email protected]>
> >>
> >> This patch (as854) separates out the two queue-oriented ioctls from
> >> the rest of the block-layer ioctls. The idea is that they should
> >> apply to any driver using a request_queue, even if the driver doesn't
> >> implement a block-device interface. The prototypical example is the
> >> sg driver, to which the patch adds the new interface.
> >>
> >> This will make it possible for cdrecord and related programs to
> >> retrieve reliably the max_sectors value, regardless of whether the
> >> user points it to an sr or an sg device. In particular, this will
> >> resolve Bugzilla entry #7026.
> >
> > The block bits are fine with me, the sg calling point is a bit of a sore
> > thumb (a char driver calling into block layer ioctls) though. So the
> > block layer bits are certainly ok with me, if Doug acks the sg bit I'll
> > merge everything up.
> >
> > (patch left below)
>
> Does this need to be in the sg driver?
>
> What is the hardware sector size of a SES or OSD device?
>
> As for the max_sector variable, wouldn't it be better
> to generate a new ioctl that yielded the limit in bytes?
> Making a driver variable that implicitly assumes sectors
> are 512 bytes in length more visible to the user space
> seems like a step in the wrong direction.

This is what I did propose. I know of no SCSI device made since 1986 that
has a "hardware sector size". This is really a DMA size limit in bytes
and if you return the number in an unrelated multiple of a fraction, you
will not be able to use the optmium max transfer size.

> Alternatively the SG_GET_RESERVED_SIZE ioctl could be
> modified to yield no more than max_sectors*512 .

This is what I did propose 3 months ago and already 2 years ago.

J?rg

--
EMail:[email protected] (home) J?rg Schilling D-13353 Berlin
[email protected] (uni)
[email protected] (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily

2007-02-18 16:44:30

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

On Sat, 17 Feb 2007, Douglas Gilbert wrote:

> Jens Axboe wrote:
> > On Fri, Feb 16 2007, Alan Stern wrote:
> >> From: James Bottomley <[email protected]>
> >>
> >> This patch (as854) separates out the two queue-oriented ioctls from
> >> the rest of the block-layer ioctls. The idea is that they should
> >> apply to any driver using a request_queue, even if the driver doesn't
> >> implement a block-device interface. The prototypical example is the
> >> sg driver, to which the patch adds the new interface.
> >>
> >> This will make it possible for cdrecord and related programs to
> >> retrieve reliably the max_sectors value, regardless of whether the
> >> user points it to an sr or an sg device. In particular, this will
> >> resolve Bugzilla entry #7026.
> >
> > The block bits are fine with me, the sg calling point is a bit of a sore
> > thumb (a char driver calling into block layer ioctls) though. So the
> > block layer bits are certainly ok with me, if Doug acks the sg bit I'll
> > merge everything up.
> >
> > (patch left below)
>
> Does this need to be in the sg driver?

Something like it does. Otherwise cdrecord (or any other sg user) has no
way to retrieve the max_sectors value for sg devices.

> What is the hardware sector size of a SES or OSD device?

I have no idea, and I don't see why it is relevant. (I don't even know
what "SES" and "OSD" refer to.)

> As for the max_sector variable, wouldn't it be better
> to generate a new ioctl that yielded the limit in bytes?

Maybe. I wouldn't mind doing it that way. But we would have to leave in
the old ioctl, and we probably would still want it to be usable for sg
devices. Not to mention that it would be silly to have two ioctls which
always return exactly the same values except for a factor of 512.

> Making a driver variable that implicitly assumes sectors
> are 512 bytes in length more visible to the user space
> seems like a step in the wrong direction.
>
> Alternatively the SG_GET_RESERVED_SIZE ioctl could be
> modified to yield no more than max_sectors*512 .

There should be one single ioctl which can be applied uniformly to all
CD-type devices (in fact, to all devices using a request_queue) to learn
max_sectors. This rules out using SG_GET_RESERVED_SIZE.

Furthermore, if you changed SG_GET_RESERVED_SIZE in this way you would
only increase the confusion. The reserved size isn't directly related to
the maximum allowed DMA length, and there's no point pretending it is.
What if it turns out that the reserved size is smaller than max_sectors?
Then you'd force user programs to do I/O in chunks that were smaller than
necessary.

> >> + /* block ioctls first */
>
> Why first??

I don't know -- James will have to answer. I don't see that it makes any
real difference; the new code can easily be moved to the end.

> That would allow the block layer to overtake any currently
> defined and working sg ioctl.
> Surely, if it was put in, it should be in the default case.

Would the patch be okay with you if I rework it to put the new ioctls
under the default case in sg?

Alan Stern

2007-02-18 18:29:59

by Joerg.Schilling

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

Alan Stern <[email protected]> wrote:

> > Alternatively the SG_GET_RESERVED_SIZE ioctl could be
> > modified to yield no more than max_sectors*512 .
>
> There should be one single ioctl which can be applied uniformly to all
> CD-type devices (in fact, to all devices using a request_queue) to learn
> max_sectors. This rules out using SG_GET_RESERVED_SIZE.

This has nothing to do with CD-type devices!
It is related to SCSI tansport.

> Furthermore, if you changed SG_GET_RESERVED_SIZE in this way you would
> only increase the confusion. The reserved size isn't directly related to
> the maximum allowed DMA length, and there's no point pretending it is.
> What if it turns out that the reserved size is smaller than max_sectors?
> Then you'd force user programs to do I/O in chunks that were smaller than
> necessary.

It would not increase confusion but reduce confusion because all
programs would later behave correctly without the need to change them.

J?rg

--
EMail:[email protected] (home) J?rg Schilling D-13353 Berlin
[email protected] (uni)
[email protected] (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily

2007-02-19 16:06:27

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

On Sun, 18 Feb 2007, Joerg Schilling wrote:

> Alan Stern <[email protected]> wrote:
>
> > > Alternatively the SG_GET_RESERVED_SIZE ioctl could be
> > > modified to yield no more than max_sectors*512 .
> >
> > There should be one single ioctl which can be applied uniformly to all
> > CD-type devices (in fact, to all devices using a request_queue) to learn
> > max_sectors. This rules out using SG_GET_RESERVED_SIZE.
>
> This has nothing to do with CD-type devices!
> It is related to SCSI tansport.

Actually, it isn't even related to SCSI transport; it is related to the
request_queue interface. Even for devices which don't use a SCSI
transport, the request_queue code limits transfer lengths to max_sectors.

> > Furthermore, if you changed SG_GET_RESERVED_SIZE in this way you would
> > only increase the confusion. The reserved size isn't directly related to
> > the maximum allowed DMA length, and there's no point pretending it is.
> > What if it turns out that the reserved size is smaller than max_sectors?
> > Then you'd force user programs to do I/O in chunks that were smaller than
> > necessary.

I take back that last sentence. If the reserved size was smaller than
max_sectors then nothing would be changed.

> It would not increase confusion but reduce confusion because all
> programs would later behave correctly without the need to change them.

Well, if Doug wants to reduce the value returned by SG_GET_RESERVED_SIZE,
it's okay with me. An advantage of doing this is that older versions of
cdrecord would then work correctly.

However you don't seem to realize that people can use programs like
cdrecord with devices whose drivers don't support SG_GET_RESERVED_SIZE --
because that ioctl works only with sg. Programs would have to try
SG_GET_RESERVED_SIZE and if it faied, then try BLKSECTGET.

Remember also, the "reserved size" is _not_ the maximum allowed size of a
DMA transfer. Rather, it is the size of an internal buffer maintained by
sg. It's legal to do an I/O transfer larger than the "reserved size", but
it is not legal to do an I/O transfer larger than max_sectors.

Alan Stern

2007-02-19 16:11:07

by Joerg.Schilling

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

Alan Stern <[email protected]> wrote:

> Well, if Doug wants to reduce the value returned by SG_GET_RESERVED_SIZE,
> it's okay with me. An advantage of doing this is that older versions of
> cdrecord would then work correctly.
>
> However you don't seem to realize that people can use programs like
> cdrecord with devices whose drivers don't support SG_GET_RESERVED_SIZE --
> because that ioctl works only with sg. Programs would have to try
> SG_GET_RESERVED_SIZE and if it faied, then try BLKSECTGET.

Is there any reason not to have one single ioctl for one basic feature?

> Remember also, the "reserved size" is _not_ the maximum allowed size of a
> DMA transfer. Rather, it is the size of an internal buffer maintained by
> sg. It's legal to do an I/O transfer larger than the "reserved size", but
> it is not legal to do an I/O transfer larger than max_sectors.

At the time the call SG_GET_RESERVED_SIZE has been discussed/defined, we did
originally agree that the max value should be limited to what the HW allows
as DMA size. This is why I did originally files a bug against
SG_GET_RESERVED_SIZE.

J?rg

--
EMail:[email protected] (home) J?rg Schilling D-13353 Berlin
[email protected] (uni)
[email protected] (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily

2007-02-19 17:06:36

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

On Mon, 19 Feb 2007, Joerg Schilling wrote:

> Alan Stern <[email protected]> wrote:
>
> > Well, if Doug wants to reduce the value returned by SG_GET_RESERVED_SIZE,
> > it's okay with me. An advantage of doing this is that older versions of
> > cdrecord would then work correctly.
> >
> > However you don't seem to realize that people can use programs like
> > cdrecord with devices whose drivers don't support SG_GET_RESERVED_SIZE --
> > because that ioctl works only with sg. Programs would have to try
> > SG_GET_RESERVED_SIZE and if it faied, then try BLKSECTGET.
>
> Is there any reason not to have one single ioctl for one basic feature?

Indeed there is not. That's what I wrote in an earlier email:

"There should be one single ioctl which can be applied uniformly to all
CD-type devices (in fact, to all devices using a request_queue) to learn
max_sectors. This rules out using SG_GET_RESERVED_SIZE."

> > Remember also, the "reserved size" is _not_ the maximum allowed size of a
> > DMA transfer. Rather, it is the size of an internal buffer maintained by
> > sg. It's legal to do an I/O transfer larger than the "reserved size", but
> > it is not legal to do an I/O transfer larger than max_sectors.
>
> At the time the call SG_GET_RESERVED_SIZE has been discussed/defined, we did
> originally agree that the max value should be limited to what the HW allows
> as DMA size. This is why I did originally files a bug against
> SG_GET_RESERVED_SIZE.

How do you feel about the patch below, either in addition to or instead of
the previous patch?

Alan Stern



Index: usb-2.6/drivers/scsi/sg.c
===================================================================
--- usb-2.6.orig/drivers/scsi/sg.c
+++ usb-2.6/drivers/scsi/sg.c
@@ -917,6 +917,8 @@ sg_ioctl(struct inode *inode, struct fil
return result;
if (val < 0)
return -EINVAL;
+ if (val > sdp->device->request_queue->max_sectors * 512)
+ return -EOVERFLOW;
if (val != sfp->reserve.bufflen) {
if (sg_res_in_use(sfp) || sfp->mmap_called)
return -EBUSY;
@@ -925,7 +927,8 @@ sg_ioctl(struct inode *inode, struct fil
}
return 0;
case SG_GET_RESERVED_SIZE:
- val = (int) sfp->reserve.bufflen;
+ val = min_t(int, sfp->reserve.bufflen,
+ sdp->device->request_queue->max_sectors * 512);
return put_user(val, ip);
case SG_SET_COMMAND_Q:
result = get_user(val, ip);

2007-02-19 22:25:19

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

Alan Stern wrote:
> On Mon, 19 Feb 2007, Joerg Schilling wrote:
>
>> Alan Stern <[email protected]> wrote:
>>
>>> Well, if Doug wants to reduce the value returned by SG_GET_RESERVED_SIZE,
>>> it's okay with me. An advantage of doing this is that older versions of
>>> cdrecord would then work correctly.
>>>
>>> However you don't seem to realize that people can use programs like
>>> cdrecord with devices whose drivers don't support SG_GET_RESERVED_SIZE --
>>> because that ioctl works only with sg. Programs would have to try
>>> SG_GET_RESERVED_SIZE and if it faied, then try BLKSECTGET.
>> Is there any reason not to have one single ioctl for one basic feature?
>
> Indeed there is not. That's what I wrote in an earlier email:
>
> "There should be one single ioctl which can be applied uniformly to all
> CD-type devices (in fact, to all devices using a request_queue) to learn
> max_sectors. This rules out using SG_GET_RESERVED_SIZE."
>
>>> Remember also, the "reserved size" is _not_ the maximum allowed size of a
>>> DMA transfer. Rather, it is the size of an internal buffer maintained by
>>> sg. It's legal to do an I/O transfer larger than the "reserved size", but
>>> it is not legal to do an I/O transfer larger than max_sectors.
>> At the time the call SG_GET_RESERVED_SIZE has been discussed/defined, we did
>> originally agree that the max value should be limited to what the HW allows
>> as DMA size. This is why I did originally files a bug against
>> SG_GET_RESERVED_SIZE.
>
> How do you feel about the patch below, either in addition to or instead of
> the previous patch?

Alan,
The SG_GET_RESERVED_SIZE ioctl is also defined in
the block layer, see block/scsi_ioctl.c .
I suspect it is just a kludge to fool cdrecord that it
is talking to a sg device. [One of many kludges in the
block SG_IO ioctl implementation to that end.]
So perhaps the block layer versions of SG_SET_RESERVED_SIZE
and SG_GET_RESERVED_SIZE need to be similarly capped.
Actually I think that I would default SG_GET_RESERVED_SIZE to
request_queue->max_sectors * 512 in the block layer
implementation (as there is no "reserve buffer" associated
with a block device).


<aside>
The idea of a reserved buffer may live on in bsg as experience
with sg has shown that it is the fastest way to do (mmap-ed) IO.
Having one reserved buffer per file descriptor means not
having to create and tear down a scatter gather list
per IO. [Having a pool of such lists would be even better.]
Until optical storage needs 10 times its current datarates
then cdrecord will not need this mechanism.
</aside>

Doug Gilbert


> Index: usb-2.6/drivers/scsi/sg.c
> ===================================================================
> --- usb-2.6.orig/drivers/scsi/sg.c
> +++ usb-2.6/drivers/scsi/sg.c
> @@ -917,6 +917,8 @@ sg_ioctl(struct inode *inode, struct fil
> return result;
> if (val < 0)
> return -EINVAL;
> + if (val > sdp->device->request_queue->max_sectors * 512)
> + return -EOVERFLOW;
> if (val != sfp->reserve.bufflen) {
> if (sg_res_in_use(sfp) || sfp->mmap_called)
> return -EBUSY;
> @@ -925,7 +927,8 @@ sg_ioctl(struct inode *inode, struct fil
> }
> return 0;
> case SG_GET_RESERVED_SIZE:
> - val = (int) sfp->reserve.bufflen;
> + val = min_t(int, sfp->reserve.bufflen,
> + sdp->device->request_queue->max_sectors * 512);
> return put_user(val, ip);
> case SG_SET_COMMAND_Q:
> result = get_user(val, ip);
>

2007-02-20 03:48:11

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

On Mon, 19 Feb 2007, Douglas Gilbert wrote:

> Alan,
> The SG_GET_RESERVED_SIZE ioctl is also defined in
> the block layer, see block/scsi_ioctl.c .

Ah, I didn't know that. (Or more likely, I used to know and have since
forgotten.) Thanks for pointing it out.

> I suspect it is just a kludge to fool cdrecord that it
> is talking to a sg device. [One of many kludges in the
> block SG_IO ioctl implementation to that end.]
> So perhaps the block layer versions of SG_SET_RESERVED_SIZE
> and SG_GET_RESERVED_SIZE need to be similarly capped.

Yes. In fact one of them already is, but the other should be too.

> Actually I think that I would default SG_GET_RESERVED_SIZE to
> request_queue->max_sectors * 512 in the block layer
> implementation (as there is no "reserve buffer" associated
> with a block device).

Okay.

Come to think of it, the reserved_size value used when a new sg device is
created should also be capped at max_sectors * 512. Agreed? I can't see
any reason for ever having a larger buffer -- it would be impossible to
make use of the extra space.

Alan Stern

2007-02-20 04:48:28

by Douglas Gilbert

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

Alan Stern wrote:
> On Mon, 19 Feb 2007, Douglas Gilbert wrote:
>
>> Alan,
>> The SG_GET_RESERVED_SIZE ioctl is also defined in
>> the block layer, see block/scsi_ioctl.c .
>
> Ah, I didn't know that. (Or more likely, I used to know and have since
> forgotten.) Thanks for pointing it out.
>
>> I suspect it is just a kludge to fool cdrecord that it
>> is talking to a sg device. [One of many kludges in the
>> block SG_IO ioctl implementation to that end.]
>> So perhaps the block layer versions of SG_SET_RESERVED_SIZE
>> and SG_GET_RESERVED_SIZE need to be similarly capped.
>
> Yes. In fact one of them already is, but the other should be too.
>
>> Actually I think that I would default SG_GET_RESERVED_SIZE to
>> request_queue->max_sectors * 512 in the block layer
>> implementation (as there is no "reserve buffer" associated
>> with a block device).
>
> Okay.
>
> Come to think of it, the reserved_size value used when a new sg device is
> created should also be capped at max_sectors * 512. Agreed? I can't see
> any reason for ever having a larger buffer -- it would be impossible to
> make use of the extra space.

Alan,
That depends whether or not max_sectors can be changed
(via sysfs) subsequent to a sg device being created.
And I think it can.

# ls -l /sys/block/sdc/queue/
total 0
drwxr-xr-x 2 root root 0 Feb 19 18:29 iosched
-r--r--r-- 1 root root 4096 Feb 19 23:41 max_hw_sectors_kb
-rw-r--r-- 1 root root 4096 Feb 19 23:41 max_sectors_kb
-rw-r--r-- 1 root root 4096 Feb 19 23:41 nr_requests
-rw-r--r-- 1 root root 4096 Feb 19 23:41 read_ahead_kb
-rw-r--r-- 1 root root 4096 Feb 19 23:41 scheduler


# cat max_hw_sectors_kb > max_sectors_kb

... is the real maximum if the LLD that set max_hw_sectors_kb
is to be believed (actually it is often a finger in
the wind).

Doug Gilbert


2007-02-20 15:55:47

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Block layer: separate out queue-oriented ioctls

On Mon, 19 Feb 2007, Douglas Gilbert wrote:

> > Come to think of it, the reserved_size value used when a new sg device is
> > created should also be capped at max_sectors * 512. Agreed? I can't see
> > any reason for ever having a larger buffer -- it would be impossible to
> > make use of the extra space.
>
> Alan,
> That depends whether or not max_sectors can be changed
> (via sysfs) subsequent to a sg device being created.
> And I think it can.
>
> # ls -l /sys/block/sdc/queue/
> total 0
> drwxr-xr-x 2 root root 0 Feb 19 18:29 iosched
> -r--r--r-- 1 root root 4096 Feb 19 23:41 max_hw_sectors_kb
> -rw-r--r-- 1 root root 4096 Feb 19 23:41 max_sectors_kb
> -rw-r--r-- 1 root root 4096 Feb 19 23:41 nr_requests
> -rw-r--r-- 1 root root 4096 Feb 19 23:41 read_ahead_kb
> -rw-r--r-- 1 root root 4096 Feb 19 23:41 scheduler

Yes, it definitely can be changed.

> # cat max_hw_sectors_kb > max_sectors_kb
>
> ... is the real maximum if the LLD that set max_hw_sectors_kb
> is to be believed (actually it is often a finger in
> the wind).

That's why my patch computes the minimum value every time the
GET_RESERVED_SIZE ioctl runs -- in case max_sectors has changed.

If the user decides to increase max_sectors, then the reserved_size can be
increased immediately afterward. This shouldn't cause any problems.

I will submit a revised patch shortly, on the linux-scsi list.

Alan Stern