2006-02-28 18:01:10

by Mark Rustad

Subject: sg regression in 2.6.16-rc5

We have encountered some kind of sg regression with kernel 2.6.16-rc5
relative to 2.6.15. We have a small program that demonstrates the
failure. On 2.6.15 it produces the output:

Alloced dataptr 0 -> 0xb7d07008
IOS: 0
ios 100

indicating that it did 100 operations successfully. On 2.6.16-rc5, it
produces the output:

Alloced dataptr 0 -> 0xa7d10008
SG_IO ioctl error 12 Cannot allocate memory
ios 0

indicating that it did 0 operations successfully. This program is
attempting to do 1MB reads on a SCSI device. We get the failure both
on an aic79xx parallel SCSI and on aic94xx SAS. With both types of
devices, it works fine on the 2.6.15 kernel. We have also seen this
problem on the 2.6.16-rc4 kernel. In all cases we were running on an
Intel Xeon-based system.

Below is the source for the program that was used to demonstrate this
problem:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sched.h>
#include <error.h>
#include <errno.h>
#include <string.h>
#include <scsi/sg.h>
#include <scsi/scsi.h>
#include <sys/ioctl.h>
#include <sys/epoll.h>
#include <ctype.h>


/* Dump the sense data of a failed command, 16 bytes per line. */
void dispsns(unsigned char cdb0, unsigned char *sense_buffer, int len)
{
    int i;

    printf("sense: cdb0 : %02hhX", cdb0);
    for (i = 0; i < len; i++) {
        if (!(i % 16)) {
            printf("\n%02hhX", sense_buffer[i]);
            continue;
        }
        if (!(i % 4))
            printf(" ");
        printf("%02hhX", sense_buffer[i]);
    }
    printf(" KEY: %02hhX ", sense_buffer[2] & 0x0f);
    printf("ASC: %02hhX ", sense_buffer[12]);
    printf("ASCQ: %02hhX\n", sense_buffer[13]);
}

/*
 * Sends the given io to the sg layer. If there is only 1 SGL element,
 * direct IO on that single buffer will be attempted. Otherwise the SGL
 * is sent. Returns the number of bytes transferred, or -1 on error.
 */
int do_scsi_io(int sg_fd, unsigned char *cdb, int cdblen, sg_iovec_t *iovec,
               int dir, int datalen, int sglcount)
{
    unsigned char sense_buffer[32];
    sg_io_hdr_t io_hdr;

    memset(&io_hdr, 0, sizeof(sg_io_hdr_t));
    memset(sense_buffer, 0, sizeof(sense_buffer));

    io_hdr.interface_id = 'S';
    io_hdr.cmd_len = cdblen;
    io_hdr.mx_sb_len = sizeof(sense_buffer);
    io_hdr.dxfer_direction = dir;
    io_hdr.dxfer_len = datalen;
    if (sglcount > 1) {
        /* dxferp points at an array of sg_iovec_t */
        io_hdr.dxferp = iovec;
        io_hdr.iovec_count = sglcount;
    } else {
        /* single buffer: pass it directly and ask for direct IO */
        io_hdr.flags |= SG_FLAG_DIRECT_IO;
        io_hdr.dxferp = iovec[0].iov_base;
        io_hdr.iovec_count = 0;
    }
    io_hdr.cmdp = cdb;
    io_hdr.sbp = sense_buffer;
    io_hdr.timeout = 10000; /* 10000 millisecs == 10 seconds */

    if (ioctl(sg_fd, SG_IO, &io_hdr) < 0) {
        printf("SG_IO ioctl error %d %s\n", errno, strerror(errno));
        return -1;
    }
    if ((io_hdr.info & SG_INFO_OK_MASK) == SG_INFO_OK) {
        return datalen - io_hdr.resid;
    } else {
        dispsns(cdb[0], sense_buffer, io_hdr.sb_len_wr);
        return -1;
    }
}

unsigned int IOLEN = 0x100000;  /* 1 MB per command */
unsigned int SGLCOUNT = 1;
unsigned int LOOPCOUNT = 100;

int main(int argc, char *argv[])
{
    char devpath[80];
    int i;
    int handle;
    int ret;
    unsigned char readCmdBlk[10] = {0x28, 0, 0, 0, 0x00, 0, 0, 0, 0, 0};
    unsigned int blkcount;
    sg_iovec_t iovectable[SGLCOUNT];
    void *dataptr[SGLCOUNT];
    void *dataptr2[SGLCOUNT];

    if (argc < 2) {
        printf("Error: no input parms\n");
        printf(" Usage: iotest /dev/sg<n>\n");
        return -1;
    }

    for (i = 0; i < SGLCOUNT; i++) {
        dataptr[i] = malloc((IOLEN / SGLCOUNT) + 0x2000);
        dataptr2[i] = dataptr[i];
        printf("Alloced dataptr %d -> %p \n", i, dataptr[i]);
    }

    for (i = 0; i < SGLCOUNT; i++)
        if (dataptr[i] == NULL) {
            printf("Unable to alloc memory \n");
            return -1;
        }
    for (i = 0; i < SGLCOUNT; i++) {
        iovectable[i].iov_base = dataptr2[i];
        iovectable[i].iov_len = IOLEN / SGLCOUNT;
    }
    /* bounded copy so a long argument cannot overflow devpath */
    snprintf(devpath, sizeof(devpath), "%s", argv[1]);

    handle = open(devpath, O_RDWR);
    if (handle == -1) {
        printf(" Open of %s failed \n", devpath);
        for (i = 0; i < SGLCOUNT; i++)
            free(dataptr[i]);
        return -1;
    }
    blkcount = IOLEN / 0x200;  /* 512-byte blocks */
    readCmdBlk[2] = 0; /*lba*/
    readCmdBlk[3] = 0; /*lba*/
    readCmdBlk[4] = 0; /*lba*/
    readCmdBlk[5] = 0; /*lba*/
    readCmdBlk[7] = (blkcount & 0xff00) >> 8; /*len*/
    readCmdBlk[8] = blkcount & 0xff; /*len*/

    for (i = 0; i < LOOPCOUNT; i++) {
        ret = do_scsi_io(handle, readCmdBlk, 10, iovectable,
                         SG_DXFER_FROM_DEV, IOLEN, SGLCOUNT);
        if (ret == -1)
            break;
        if ((i & 0xFF) == 0)
            printf(" IOS: %d \n", i);
    }
    printf("ios %d \n", i);
    for (i = 0; i < SGLCOUNT; i++)
        free(dataptr[i]);

    return 0;
}

--
Mark Rustad, [email protected]
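
The test program above fills in the READ(10) transfer length by hand from IOLEN / 0x200. As a rough sketch only, the same step can be written as a helper, together with a count of how many commands a transfer would need if a kernel capped the size of one command. The names build_read10 and commands_needed are hypothetical and not part of the posted program, and 512-byte blocks are assumed as in the original.

```c
#include <stdint.h>
#include <string.h>

/*
 * Fill a 10-byte READ(10) CDB (opcode 0x28) for nblocks blocks at lba.
 * Hypothetical helper. Returns 0 on success, or -1 if nblocks does not
 * fit in the 16-bit transfer length field of READ(10).
 */
int build_read10(unsigned char cdb[10], uint32_t lba, uint32_t nblocks)
{
    if (nblocks > 0xFFFF)
        return -1;
    memset(cdb, 0, 10);
    cdb[0] = 0x28;                   /* READ(10) */
    cdb[2] = (lba >> 24) & 0xff;     /* lba, most significant byte first */
    cdb[3] = (lba >> 16) & 0xff;
    cdb[4] = (lba >> 8) & 0xff;
    cdb[5] = lba & 0xff;
    cdb[7] = (nblocks >> 8) & 0xff;  /* transfer length in blocks */
    cdb[8] = nblocks & 0xff;
    return 0;
}

/*
 * Number of commands needed to move total_blocks when one command may
 * carry at most max_blocks. Returns 0 if max_blocks is 0.
 */
uint32_t commands_needed(uint32_t total_blocks, uint32_t max_blocks)
{
    if (max_blocks == 0)
        return 0;
    return total_blocks / max_blocks + (total_blocks % max_blocks != 0);
}
```

For the 1 MB read in the report, nblocks is 0x800 (2048 blocks of 512 bytes). If one command were limited to 1024 blocks, commands_needed(2048, 1024) says the same data would take two commands.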


2006-02-28 19:54:17

by Douglas Gilbert

Subject: Re: sg regression in 2.6.16-rc5

Mark Rustad wrote:
> We have encountered some kind of sg regression with kernel 2.6.16-rc5
> relative to 2.6.15. We have a small program that demonstrates the
> failure. On 2.6.15 it produces the output:
>
> Alloced dataptr 0 -> 0xb7d07008
> IOS: 0
> ios 100
>
> indicating that it did 100 operations successfully. On 2.6.16-rc5, it
> produces the output:
>
> Alloced dataptr 0 -> 0xa7d10008
> SG_IO ioctl error 12 Cannot allocate memory
> ios 0
>
> indicating that it did 0 operations successfully. This program is
> attempting to do 1MB reads on a SCSI device.

Mark,
You can stop right there with the 1 MB reads. Welcome
to the new, blander sg driver which now shares many
size shortcomings with the block subsystem.

In lk 2.6.15 the sg driver (and the st driver) did its
own scatter gather list allocations. The sg driver
used 32 KB segments (8 times the normal page size)
in each scatter gather element. The maximum number
of scatter gather elements depends on the LLD but
can be no more than 256. That meant the sg driver
allowed a maximum single IO size of 8 MB. There was
also a define in sg.h (SG_SCATTER_SZ and it is still
there) that allowed the 32KB per segment to be increased
allowing larger single command transfers (than 8 MB).
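
The 8 MB figure is just the per-element size multiplied by the number of elements. A minimal sketch of that arithmetic follows; sg_max_transfer is a hypothetical name used only for illustration, and the values passed in are the ones quoted above (32 KB per element, at most 256 elements).

```c
#include <stddef.h>

/*
 * Largest single transfer when every scatter gather element carries
 * seg_bytes bytes and the list holds at most max_elems elements.
 * Hypothetical helper, plain arithmetic only.
 */
size_t sg_max_transfer(size_t seg_bytes, size_t max_elems)
{
    return seg_bytes * max_elems;
}
```

With 32 KB elements and 256 of them, sg_max_transfer(32 * 1024, 256) is 8 MB; doubling the per-element size to 64 KB would double the result to 16 MB.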

> We get the failure both on
> an aic79xx parallel SCSI and on aic94xx SAS. With both types of
> devices, it works fine on the 2.6.15 kernel. We have also seen this
> problem on the 2.6.16-rc4 kernel. In all cases we were running on an
> Intel Xeon-based system.

Well this is broken by design. If you and others
talk to the management it may be reversed or a
better solution may be found.

Here is an example of the sg driver in lk 2.6.16-rc5.
The number of bytes each SCSI READ command is trying
to fetch is 'bs * bpt'. Note that it works for 256 KB
per SCSI READ but fails for anything bigger:

# modprobe scsi_debug
# modprobe sg
# sg_dd if=/dev/sg0 of=. bs=512 bpt=512
16384+0 records in
16384+0 records out
# sg_dd if=/dev/sg0 of=. bs=512 bpt=513
sg_read failed, try reducing bpt, at or after lba=0 [0x0]
Some error occurred, remaining block count=16384
0+0 records in
0+0 records out
# sg_dd if=/dev/sg0 of=. bs=512 bpt=1024
sg_read failed, try reducing bpt, at or after lba=0 [0x0]
Some error occurred, remaining block count=16384
0+0 records in
0+0 records out


Doug Gilbert

2006-02-28 20:35:22

by Kai Mäkisara (Kolumbus)

Subject: Re: sg regression in 2.6.16-rc5

On Tue, 28 Feb 2006, Douglas Gilbert wrote:

> Mark Rustad wrote:
> > We have encountered some kind of sg regression with kernel 2.6.16-rc5
> > relative to 2.6.15. We have a small program that demonstrates the
> > failure. On 2.6.15 it produces the output:
> >
> > Alloced dataptr 0 -> 0xb7d07008
> > IOS: 0
> > ios 100
> >
> > indicating that it did 100 operations successfully. On 2.6.16-rc5, it
> > produces the output:
> >
> > Alloced dataptr 0 -> 0xa7d10008
> > SG_IO ioctl error 12 Cannot allocate memory
> > ios 0
> >
> > indicating that it did 0 operations successfully. This program is
> > attempting to do 1MB reads on a SCSI device.
>
> Mark,
> You can stop right there with the 1 MB reads. Welcome
> to the new, blander sg driver which now shares many
> size shortcomings with the block subsystem.
>
> In lk 2.6.15 the sg driver (and the st driver) did its
> own scatter gather list allocations. The sg driver
> used 32 KB segments (8 times the normal page size)
> in each scatter gather element. The maximum number
> of scatter gather elements depends on the LLD but
> can be no more than 256. That meant the sg driver
> allowed a maximum single IO size of 8 MB. There was
> also a define in sg.h (SG_SCATTER_SZ and it is still
> there) that allowed the 32KB per segment to be increased
> allowing larger single command transfers (than 8 MB).
>
This is still possible but it needs some changes to most SCSI HBA drivers.
The big requests are split into bios supporting 256 pages. For 4 kB pages,
this limits i/o to 1 MB. The scsi_execute_async() path used by st and sg
can chain bios and this enables large requests at the ULD level. At the lower
level, the request consists of pages and now we hit the s/g list maximum
length _unless_ the HBA driver enables clustering. In this case the
adjacent pages are coalesced and the large requests fit into the HBA s/g
limits. Well, now we hit another limit: the max_sectors default for SCSI
drivers is 1024 and this limits requests to 512 kB _unless_ the HBA driver
increases max_sectors.
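
As a rough sketch of how the last two limits combine: a request can be no larger than what the s/g list can describe, and no larger than max_sectors allows. The helper name request_cap is hypothetical, 512-byte sectors are assumed, and the per-bio page limit is left out because bios can be chained as described above.

```c
#include <stddef.h>

/*
 * Upper bound on one request, taken as the smaller of
 *   - the s/g list capacity: sg_elems * bytes_per_elem
 *   - the max_sectors capacity: max_sectors * 512 bytes
 * Hypothetical helper. bytes_per_elem is one page without clustering
 * and may be larger when adjacent pages are coalesced.
 */
size_t request_cap(size_t sg_elems, size_t bytes_per_elem, size_t max_sectors)
{
    size_t sg_cap = sg_elems * bytes_per_elem;
    size_t sector_cap = max_sectors * 512;

    return sg_cap < sector_cap ? sg_cap : sector_cap;
}
```

The element counts and sizes here are illustrative assumptions, not values read from any driver. With clustering giving, say, 32 KB elements in a 255-entry list and the default max_sectors of 1024, request_cap(255, 32768, 1024) comes out at 512 kB, so max_sectors is the limit that bites. With max_sectors raised to 0xFFFF but only 4 kB per element, request_cap(255, 4096, 0xFFFF) is 1020 kB, so the s/g list becomes the limit.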

The aic79xx driver enables clustering but does not increase max_sectors.
This makes the maximum request size 512 kB. If it is possible to set

.max_sectors = 0xFFFF,

in linux/drivers/scsi/aic7xxx/aic79xx_osm.c without breaking the driver,
this should enable requests up to 8 MB - 256 B. (I don't have the hardware
to test this.)

Several SCSI HBA drivers currently have similar problems.

> > We get the failure both on
> > an aic79xx parallel SCSI and on aic94xx SAS. With both types of
> > devices, it works fine on the 2.6.15 kernel. We have also seen this
> > problem on the 2.6.16-rc4 kernel. In all cases we were running on an
> > Intel Xeon-based system.
>
> Well this is broken by design. If you and others
> talk to the management it may be reversed or a
> better solution may be found.
>
> Here is an example of the sg driver in lk 2.6.15-rc5.
> The number of bytes each SCSI READ command is trying
> to fetch is 'bs * bpt'. Note that it works for 256 KB
> per SCSI READ but fails for anything bigger:
>
> # modprobe scsi_debug
> # modprobe sg
> # sg_dd if=/dev/sg0 of=. bs=512 bpt=512
> 16384+0 records in
> 16384+0 records out
> # sg_dd if=/dev/sg0 of=. bs=512 bpt=513
> sg_read failed, try reducing bpt, at or after lba=0 [0x0]
> Some error occurred, remaining block count=16384
> 0+0 records in
> 0+0 records out
> # sg_dd if=/dev/sg0 of=. bs=512 bpt=1024
> sg_read failed, try reducing bpt, at or after lba=0 [0x0]
> Some error occurred, remaining block count=16384
> 0+0 records in
> 0+0 records out
>
I tested this with my SCSI disk and the sym53c8xx_2 driver with the patch
I sent to linux-scsi recently (enables clustering and sets max_sectors to
0xFFFF). The 1 MB transfers seem to work:

kai:/data # sg_dd if=/dev/sg0 of=pup bs=512 bpt=2048 count=10240
10240+0 records in
10240+0 records out

--
Kai

2006-03-01 02:06:30

by Douglas Gilbert

Subject: Re: sg regression in 2.6.16-rc5

Kai Makisara wrote:
> On Tue, 28 Feb 2006, Douglas Gilbert wrote:
>
>
>>Mark Rustad wrote:
>>
>>>We have encountered some kind of sg regression with kernel 2.6.16-rc5
>>>relative to 2.6.15. We have a small program that demonstrates the
>>>failure. On 2.6.15 it produces the output:
>>>
>>>Alloced dataptr 0 -> 0xb7d07008
>>>IOS: 0
>>>ios 100
>>>
>>>indicating that it did 100 operations successfully. On 2.6.16-rc5, it
>>>produces the output:
>>>
>>>Alloced dataptr 0 -> 0xa7d10008
>>>SG_IO ioctl error 12 Cannot allocate memory
>>>ios 0
>>>
>>>indicating that it did 0 operations successfully. This program is
>>>attempting to do 1MB reads on a SCSI device.
>>
>>Mark,
>>You can stop right there with the 1 MB reads. Welcome
>>to the new, blander sg driver which now shares many
>>size shortcomings with the block subsystem.
>>
>>In lk 2.6.15 the sg driver (and the st driver) did its
>>own scatter gather list allocations. The sg driver
>>used 32 KB segments (8 times the normal page size)
>>in each scatter gather element. The maximum number
>>of scatter gather elements depends on the LLD but
>>can be no more than 256. That meant the sg driver
>>allowed a maximum single IO size of 8 MB. There was
>>also a define in sg.h (SG_SCATTER_SZ and it is still
>>there) that allowed the 32KB per segment to be increased
>>allowing larger single command transfers (than 8 MB).
>>
>
> This is still possible but it needs some changes to most SCSI HBA drivers.
> The big requests are split into bios supporting 256 pages. For 4 kB pages,
> this limits i/o to 1 MB. The scsi_execute_async() path used by st and sg
> can chain bios and this enables large requests at the ULD level. At the lower
> level, the request consists of pages and now we hit the s/g list maximum
> length _unless_ the HBA driver enables clustering. In this case the
> adjacent pages are coalesced and the large requests fit into the HBA s/g
> limits. Well, now we hit another limit: the max_sectors default for SCSI
> drivers is 1024 and this limits requests to 512 kB _unless_ the HBA driver
> increases max_sectors.
>
> The aic79xx driver enables clustering but does not increase max_sectors.
> This makes the maximum request size 512 kB. If it is possible to set
>
> .max_sectors = 0xFFFF,
>
> in linux/drivers/scsi/aic7xxx/aic79xx_osm.c without breaking the driver,
> this should enable requests up to 8 MB - 256 B. (I don't have the hardware
> to test this.)
>
> Several SCSI HBA drivers currently have similar problems.

Kai,
I applied the above changes to my scsi_debug (plus
extended .sg_tablesize to SG_ALL (it was 64)) and
my single command READs topped out at 4 MB exactly
(bs=512 bpt=8192). When I tried bpt=8193 the
SG_IO ioctl (via a sg device) yielded ENOMEM which
is much more informative than EIO.

That is an improvement.

Doug Gilbert

2006-03-01 02:11:58

by Mark Rustad

Subject: Re: sg regression in 2.6.16-rc5

On Feb 28, 2006, at 2:38 PM, Kai Makisara wrote:

> On Tue, 28 Feb 2006, Douglas Gilbert wrote:
>
>> Mark,
>> You can stop right there with the 1 MB reads. Welcome
>> to the new, blander sg driver which now shares many
>> size shortcomings with the block subsystem.
>>
>> In lk 2.6.15 the sg driver (and the st driver) did its
>> own scatter gather list allocations. The sg driver
>> used 32 KB segments (8 times the normal page size)
>> in each scatter gather element. The maximum number
>> of scatter gather elements depends on the LLD but
>> can be no more than 256. That meant the sg driver
>> allowed a maximum single IO size of 8 MB. There was
>> also a define in sg.h (SG_SCATTER_SZ and it is still
>> there) that allowed the 32KB per segment to be increased
>> allowing larger single command transfers (than 8 MB).
>
> This is still possible but it needs some changes to most SCSI HBA
> drivers.
> The big requests are split into bios supporting 256 pages. For 4 kB
> pages,
> this limits i/o to 1 MB. The scsi_execute_async() path used by st
> and sg
> can chain bios and this enables large requests at the ULD level. At
> the lower
> level, the request consists of pages and now we hit the s/g list
> maximum
> length _unless_ the HBA driver enables clustering. In this case the
> adjacent pages are coalesced and the large requests fit into the
> HBA s/g
> limits. Well, now we hit another limit: the max_sectors default for
> SCSI
> drivers is 1024 and this limits requests to 512 kB _unless_ the HBA
> driver
> increases max_sectors.
>
> The aic79xx driver enables clustering but does not increase
> max_sectors.
> This makes the maximum request size 512 kB. If it is possible to set
>
> .max_sectors = 0xFFFF,
>
> in linux/drivers/scsi/aic7xxx/aic79xx_osm.c without breaking the
> driver,
> this should enable requests up to 8 MB - 256 B. (I don't have the
> hardware
> to test this.)

Indeed, this seems to work fine, at least with the hardware we have.
Gotta love those one-line patches.

> Several SCSI HBA drivers currently have similar problems.

Yes. Now that I know about this, it is no problem. I'm not allergic
to patches.

Thanks to both of you for your responses. I would submit a patch for
this, except that I can't be sure it won't cause problems for
configurations and targets that I can't test.

--
Mark Rustad, [email protected]

2006-03-01 08:38:30

by Matthias Andree

Subject: Re: sg regression in 2.6.16-rc5

On Tue, 28 Feb 2006, Douglas Gilbert wrote:

> You can stop right there with the 1 MB reads. Welcome
> to the new, blander sg driver which now shares many
> size shortcomings with the block subsystem.

What is the reason to break user-space applications like this?

--
Matthias Andree

2006-03-01 18:28:47

by Linus Torvalds

Subject: Re: sg regression in 2.6.16-rc5



On Wed, 1 Mar 2006, Matthias Andree wrote:
>
> On Tue, 28 Feb 2006, Douglas Gilbert wrote:
>
> > You can stop right there with the 1 MB reads. Welcome
> > to the new, blander sg driver which now shares many
> > size shortcomings with the block subsystem.
>
> What is the reason to break user-space applications like this?

Did you read the whole thread? It was a low-level SCSI driver issue, where
nothing broke user space, but the command was just fed to the drive
differently, which then hit a limit in the driver.

Linus

2006-03-01 18:31:55

by Mark Lord

Subject: Re: sg regression in 2.6.16-rc5

Linus Torvalds wrote:
>
> On Wed, 1 Mar 2006, Matthias Andree wrote:
>> On Tue, 28 Feb 2006, Douglas Gilbert wrote:
>>
>>> You can stop right there with the 1 MB reads. Welcome
>>> to the new, blander sg driver which now shares many
>>> size shortcomings with the block subsystem.
>> What is the reason to break user-space applications like this?
>
> Did you read the whole thread? It was a low-level SCSI driver issue, where
> nothing broke user space, but the command was just fed to the drive
> differently, which then hit a limit in the driver.

Will this break major applications like CD/DVD rippers,
DVD players, etc., which read LARGE blocks at a time?

If not, then good!

2006-03-01 18:42:52

by Linus Torvalds

Subject: Re: sg regression in 2.6.16-rc5



On Wed, 1 Mar 2006, Mark Lord wrote:

> Linus Torvalds wrote:
> >
> > On Wed, 1 Mar 2006, Matthias Andree wrote:
> > > On Tue, 28 Feb 2006, Douglas Gilbert wrote:
> > >
> > > > You can stop right there with the 1 MB reads. Welcome
> > > > to the new, blander sg driver which now shares many
> > > > size shortcomings with the block subsystem.
> > > What is the reason to break user-space applications like this?
> >
> > Did you read the whole thread? It was a low-level SCSI driver issue, where
> > nothing broke user space, but the command was just fed to the drive
> > differently, which then hit a limit in the driver.
>
> Will this break major applications like CD/DVD rippers,
> DVD players, etc., which read LARGE blocks at a time?
>
> If not, then good!

I wouldn't expect it to. Most people use ATA for that, and it tends to
have lower limits than most SCSI HBA's (well, at least the old PATA), so
the change - if any - should at most change some of the sg.c limits to be
no less than what SG_IO has had on ATA forever.

Not that I expect people to have a SCSI CD/DVD drive anyway in this day
and age, so the sg.c changes probably won't show up at all.

The problem that was reported was apparently for a rather special use.

Linus

2006-03-01 18:50:54

by Matthew Wilcox

Subject: Re: sg regression in 2.6.16-rc5

On Wed, Mar 01, 2006 at 10:42:12AM -0800, Linus Torvalds wrote:
> I wouldn't expect it to. Most people use ATA for that, and it tends to
> have lower limits than most SCSI HBA's (well, at least the old PATA), so
> the change - if any - should at most change some of the sg.c limits to be
> no less than what SG_IO has had on ATA forever.
>
> Not that I expect people to have a SCSI CD/DVD drive anyway in this day
> and age, so the sg.c changes probably won't show up at all.

My wife's last two laptops have both had 'SCSI' CD/DVD -- firewire on
the Vaio and SATA on the Lifebook. Neither time have distros been
prepared to deal with such things ;-(

http://www.leog.net/fujp_forum/topic.asp?TOPIC_ID=9038 shows it's not
just my distro of choice that has problems with SATA ATAPI.
Unfortunately, the one-line change to enable that by default was too
late for 2.6.16, according to jgarzik. I just hope all the distros
patch it.

2006-03-01 19:34:28

by Douglas Gilbert

Subject: Re: sg regression in 2.6.16-rc5

Linus Torvalds wrote:
>
> On Wed, 1 Mar 2006, Matthias Andree wrote:
>
>>On Tue, 28 Feb 2006, Douglas Gilbert wrote:
>>
>>
>>>You can stop right there with the 1 MB reads. Welcome
>>>to the new, blander sg driver which now shares many
>>>size shortcomings with the block subsystem.
>>
>>What is the reason to break user-space applications like this?
>
>
> Did you read the whole thread? It was a low-level SCSI driver issue, where
> nothing broke user space, but the command was just fed to the drive
> differently, which then hit a limit in the driver.

Linus,
That is an optimistic take. The maximum data carrying
capacity of a single SCSI command via the SG_IO ioctl
depends on the maximum data carrying capacity of a
scatter gather list. Assuming all scatter gather list
elements carry the same amount of data then the
maximum capacity is:
'max_bytes_per_element * max_num_elements'

Only the latter figure is a "low-level SCSI driver issue"
whose maximum seems to be SG_ALL (255). It is the former
figure that has changed. The sg driver in lk 2.6.15 used
__get_free_pages() with the order set to get 32 KB, whereas
the generic routine used now gets a single page (usually
4 KB). Kai Makisara proposed changes in the SCSI LLD
template that made things better in my experiments with
scsi_debug.

However today James Bottomley confirmed that relying on
coalescing pages that may be adjacent is not deterministic:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114122991606658&w=2

That leaves a worst case scatter gather list data capacity
of (4 * 255) KB if the SCSI LLD (or SATA) uses SG_ALL. That
is still just under the 1 MB bar that started this thread.
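
The same point can be put the other way round by counting how many elements a transfer needs. A small sketch follows, with hypothetical helper names, assuming every element carries the same number of bytes and the buffer starts on an element boundary (an unaligned buffer could need one more element).

```c
#include <stddef.h>

/*
 * Number of scatter gather elements needed for xfer_bytes when each
 * element carries bytes_per_elem bytes (rounded up).
 * Returns 0 if bytes_per_elem is 0. Hypothetical helper.
 */
size_t sg_elems_needed(size_t xfer_bytes, size_t bytes_per_elem)
{
    if (bytes_per_elem == 0)
        return 0;
    return xfer_bytes / bytes_per_elem + (xfer_bytes % bytes_per_elem != 0);
}

/* Returns 1 if the transfer fits in a list of table_size elements, else 0. */
int fits_in_table(size_t xfer_bytes, size_t bytes_per_elem, size_t table_size)
{
    if (bytes_per_elem == 0)
        return 0;
    return sg_elems_needed(xfer_bytes, bytes_per_elem) <= table_size;
}
```

A 1 MB transfer in 4 KB elements needs 256 of them, one more than a 255-entry table holds, while the same transfer in 32 KB elements needs only 32.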

So I guess we might find out how many people do big,
single SCSI command data transfers when lk 2.6.16 comes
out.

Doug Gilbert

2006-03-01 20:46:29

by Mike Christie

Subject: Re: sg regression in 2.6.16-rc5

On Wed, 2006-03-01 at 14:33 -0500, Douglas Gilbert wrote:
> Linus Torvalds wrote:
> >
> > On Wed, 1 Mar 2006, Matthias Andree wrote:
> >
> >>On Tue, 28 Feb 2006, Douglas Gilbert wrote:
> >>
> >>
> >>>You can stop right there with the 1 MB reads. Welcome
> >>>to the new, blander sg driver which now shares many
> >>>size shortcomings with the block subsystem.
> >>
> >>What is the reason to break user-space applications like this?
> >
> >
> > Did you read the whole thread? It was a low-level SCSI driver issue, where
> > nothing broke user space, but the command was just fed to the drive
> > differently, which then hit a limit in the driver.
>
> Linus,
> That is an optimistic take. The maximum data carrying
> capacity of a single SCSI command via the SG_IO ioctl
> depends on the maximum data carrying capacity of a
> scatter gather list. Assuming all scatter gather list
> elements carry the same amount of data then the
> maximum capacity is:
> 'max_bytes_per_element * max_num_elements'
>
> Only the latter figure is a "low-level SCSI driver issue"
> whose maximum seems to be SG_ALL (255). It is the former
> figure that has changed. The sg driver in lk 2.6.15 used
> __get_free_pages() with the order set to get 32 KB, whereas
> the generic routine used now gets a single page (usually
> 4 KB).

The current sg driver should use alloc_pages() with an order that should
get 32 KB. If the order being passed to alloc_pages() in sg.c is only
getting one page by default, that is a bug.

The generic routines now being used can turn that 32KB segment into
multiple 4KB ones if the LLD does not support clustering.
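
For reference, the "order" here is the power-of-two number of pages in one allocation. The following is a user-space sketch of that calculation, similar in spirit to the kernel's get_order() but not kernel code; order_for_size is a hypothetical name and the page size is passed in rather than taken from the system.

```c
#include <limits.h>

/*
 * Smallest order such that (page_size << order) >= size.
 * User-space illustration only. Returns 0 if page_size is 0, and stops
 * growing before the shift could overflow.
 */
unsigned int order_for_size(unsigned long size, unsigned long page_size)
{
    unsigned int order = 0;
    unsigned long span = page_size;

    if (page_size == 0)
        return 0;
    while (span < size && span <= ULONG_MAX / 2) {
        span <<= 1;
        order++;
    }
    return order;
}
```

With 4 KB pages, a 32 KB segment is order 3 (8 pages), and a single page is order 0.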


> Kai Makisara proposed changes in the SCSI LLD
> template that made things better in my experiments with
> scsi_debug.
>
> However today James Bottomley confirmed that relying on
> coalescing pages that may be adjacent is not deterministic:
> http://marc.theaimsgroup.com/?l=linux-scsi&m=114122991606658&w=2
>
> That leaves a worst case scatter gather list data capacity
> of (4 * 255) KB if the SCSI LLD (or SATA) uses SG_ALL. That
> is still just under the 1 MB bar that started this thread.

Actually, we will hit the SCSI_MAX_PHYS_SEGMENTS first. It is 128 by
default so (4 * 128) KB. Here is a patch, only compile tested, to
increase SCSI_MAX_PHYS_SEGMENTS to 256.

--- linux-2.6.16-rc4/include/scsi/scsi.h.orig 2006-03-01 14:32:18.000000000 -0600
+++ linux-2.6.16-rc4/include/scsi/scsi.h 2006-03-01 14:33:32.000000000 -0600
@@ -14,7 +14,7 @@
* The maximum sg list length SCSI can cope with
* (currently must be a power of 2 between 32 and 256)
*/
-#define SCSI_MAX_PHYS_SEGMENTS MAX_PHYS_SEGMENTS
+#define SCSI_MAX_PHYS_SEGMENTS 256


/*


2006-03-01 21:03:34

by Kai Mäkisara (Kolumbus)

Subject: Re: sg regression in 2.6.16-rc5

On Wed, 1 Mar 2006, Douglas Gilbert wrote:

> Linus Torvalds wrote:
> >
> > On Wed, 1 Mar 2006, Matthias Andree wrote:
> >
> >>On Tue, 28 Feb 2006, Douglas Gilbert wrote:
> >>
> >>
> >>>You can stop right there with the 1 MB reads. Welcome
> >>>to the new, blander sg driver which now shares many
> >>>size shortcomings with the block subsystem.
> >>
> >>What is the reason to break user-space applications like this?
> >
> >
> > Did you read the whole thread? It was a low-level SCSI driver issue, where
> > nothing broke user space, but the command was just fed to the drive
> > differently, which then hit a limit in the driver.
>
> Linus,
> That is an optimistic take. The maximum data carrying
> capacity of a single SCSI command via the SG_IO ioctl
> depends on the maximum data carrying capacity of a
> scatter gather list. Assuming all scatter gather list
> elements carry the same amount of data then the
> maximum capacity is:
> 'max_bytes_per_element * max_num_elements'
>
> Only the latter figure is a "low-level SCSI driver issue"
> whose maximum seems to be SG_ALL (255). It is the former
> figure that has changed. The sg driver in lk 2.6.15 used
> __get_free_pages() with the order set to get 32 KB, whereas
> the generic routine used now gets a single page (usually
> 4 KB). Kai Makisara proposed changes in the SCSI LLD
> template that made things better in my experiments with
> scsi_debug.
>
> However today James Bottomley confirmed that relying on
> coalescing pages that may be adjacent is not deterministic:
> http://marc.theaimsgroup.com/?l=linux-scsi&m=114122991606658&w=2
>
If this is James's final opinion, then the changes to st for 2.6.16
to use scsi_execute_async _must be reverted_ before final 2.6.16. They
cause a user-visible regression.

I don't think relying on coalescing in this case is non-deterministic: we
know that enough pages are adjacent to coalesce. This is because the pages
have been split into bios from larger kernel space buffers. (Conceptually
I don't like this splitting and re-merging but it works provided the HBA
parameters are good.)

I am a little frustrated with this whole thing. Several people have talked
about switching st to use the block layer. Mike Christie finally did the
work and the details were discussed on linux-scsi. I thought that
everybody agreed on the details and I tested that the code works for st.
Now it seems that there was no agreement!

> That leaves a worst case scatter gather list data capacity
> of (4 * 255) KB if the SCSI LLD (or SATA) uses SG_ALL. That
> is still just under the 1 MB bar that started this thread.
>
This is not acceptable.

> So I guess we might find out how many people do big,
> single SCSI command data transfers when lk 2.6.16 comes
> out.
>
Learn the hard way ;-( I know that there are applications that read and
write large tape blocks but I think they will hit the problems much later.
These are production systems that are probably not updated frequently.
When they find this out, they will probably have to move away from Linux.

I have talked about st but I think the same arguments apply mostly to sg.

--
Kai

2006-03-01 22:31:10

by James Bottomley

Subject: Re: sg regression in 2.6.16-rc5

On Wed, 2006-03-01 at 14:42 -0600, Mike Christie wrote:
> The current sg driver should use alloc_pages() with an order that should
> get 32 KB. If the order being passed to alloc_pages() in sg.c is only
> getting one page by default, that is a bug.

> The generic routines now being used can turn that 32KB segment into
> multiple 4KB ones if the LLD does not support clustering.

To be honest, the original behaviour was a bug. A device that doesn't
enable clustering is telling us it can't take anything other than
PAGE_SIZE chunks ... trying to give it more is likely to end in tears.

However ... I'm not sure we actually have any devices that anyone can
identify which truly can't enable clustering (a lot which have it
disabled, I suspect, are that way historically because their writers
didn't trust the clustering algorithm).

So ... I think we can go ahead and cautiously enable clustering (as a
separate patch like Jens suggested).

James


2006-03-01 22:57:19

by Mike Christie

Subject: Re: sg regression in 2.6.16-rc5

James Bottomley wrote:
> On Wed, 2006-03-01 at 14:42 -0600, Mike Christie wrote:
>
>>The current sg driver should use alloc_pages() with an order that should
>>get 32 KB. If the order being passed to alloc_pages() in sg.c is only
>>getting one page by default, that is a bug.
>
>
>>The generic routines now being used can turn that 32KB segment into
>>multiple 4KB ones if the LLD does not support clustering.
>
>
> To be honest, the original behaviour was a bug. A device that doesn't
> enable clustering is telling us it can't take anything other than
> PAGE_SIZE chunks ... trying to give it more is likely to end in tears.

Yeah, we hit this with iscsi_tcp. iscsi_tcp does not support clustering,
not due to a HW limit, but because that is just how it was implemented.
When we get clustered segments we end up with data corruption or an oops
depending on the operation. I think the workaround was to set the
default segment for sg and st to a page or just use the block layer sg_io.

>
> However ... I'm not sure we actually have any devices that anyone can
> identify which truly can't enable clustering (a lot which have it
> disabled, I suspect, are that way historically because their writers
> didn't trust the clustering algorithm).
>

ok, I can implement clustering for iscsi_tcp. For now it does not much
matter since we never supported large sg or st commands.

2006-03-02 19:52:17

by Douglas Gilbert

Subject: Re: sg regression in 2.6.16-rc5

Kai Makisara wrote:
> On Wed, 1 Mar 2006, Douglas Gilbert wrote:
>
>
>>Linus Torvalds wrote:
>>
>>>On Wed, 1 Mar 2006, Matthias Andree wrote:
>>>
>>>
>>>>On Tue, 28 Feb 2006, Douglas Gilbert wrote:
>>>>
>>>>
>>>>
>>>>>You can stop right there with the 1 MB reads. Welcome
>>>>>to the new, blander sg driver which now shares many
>>>>>size shortcomings with the block subsystem.
>>>>
>>>>What is the reason to break user-space applications like this?
>>>
>>>
>>>Did you read the whole thread? It was a low-level SCSI driver issue, where
>>>nothing broke user space, but the command was just fed to the drive
>>>differently, which then hit a limit in the driver.
>>
>>Linus,
>>That is an optimistic take. The maximum data carrying
>>capacity of a single SCSI command via the SG_IO ioctl
>>depends on the maximum data carrying capacity of a
>>scatter gather list. Assuming all scatter gather list
>>elements carry the same amount of data then the
>>maximum capacity is:
>>'max_bytes_per_element * max_num_elements'
>>
>>Only the latter figure is a "low-level SCSI driver issue"
>>whose maximum seems to be SG_ALL (255). It is the former
>>figure that has changed. The sg driver in lk 2.6.15 used
>>__get_free_pages() with the order set to get 32 KB, whereas
>>the generic routine used now gets a single page (usually
>>4 KB). Kai Makisara proposed changes in the SCSI LLD
>>template that made things better in my experiments with
>>scsi_debug.
>>
>>However today James Bottomley confirmed that relying on
>>coalescing pages that may be adjacent is not deterministic:
>>http://marc.theaimsgroup.com/?l=linux-scsi&m=114122991606658&w=2
>>
>
> If this is James's final opinion, then the changes to st for 2.6.16
> to use scsi_execute_async _must be reverted_ before final 2.6.16. They
> cause a user-visible regression.
>
> I don't think relying on coalescing in this case is non-deterministic: we
> know that enough pages are adjacent to coalesce. This is because the pages
> have been split into bios from larger kernel space buffers. (Conceptually
> I don't like this splitting and re-merging but it works provided the HBA
> parameters are good.)

As more information has come to light, the worst case
"big transfer" of a single SCSI command through sg (and
st I suspect) is 512 KB **. With full coalescing that figure
goes up to 4 MB **. I am also aware that some users
increase SG_SCATTER_SZ in the sg driver to get larger
"big transfer"s than sg's current limit of (8MB - 32KB) **.
That facility has now gone (i.e. upping SG_SCATTER_SZ will
have no effect) with no replacement mechanism.
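
The three figures above are consistent with the product
'max_bytes_per_element * max_num_elements' quoted earlier. A rough sketch
of that arithmetic in C follows. It assumes a 4 KB page, the 32 KB element
size that sg used in lk 2.6.15, a 128 entry scatter gather table for the
first two figures and SG_ALL (255) for the third. The function and
constant names are made up for illustration and are not kernel code.

```c
#include <stddef.h>

/* Assumed values taken from the discussion above. */
enum {
    PAGE_BYTES     = 4 * 1024,   /* one page per element, no coalescing */
    OLD_SG_ELEMENT = 32 * 1024,  /* element size sg used in lk 2.6.15 */
    TABLESIZE_128  = 128,        /* ".sg_tablesize" assumed by the ** note */
    TABLESIZE_ALL  = 255         /* SG_ALL */
};

/* Illustrative helper: upper bound on the data one SCSI command can
 * carry when every scatter gather element holds bytes_per_element. */
size_t sg_max_transfer(size_t bytes_per_element, size_t num_elements)
{
    return bytes_per_element * num_elements;
}
```

With these assumptions, sg_max_transfer(PAGE_BYTES, TABLESIZE_128) gives
512 KB, sg_max_transfer(OLD_SG_ELEMENT, TABLESIZE_128) gives 4 MB, and
sg_max_transfer(OLD_SG_ELEMENT, TABLESIZE_ALL) gives 8 MB - 32 KB.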

So I'll add my vote to "revert this change before lk 2.6.16"
with a view to applying it after some solution to the "big
transfer" problem is found.

In 8 years of maintaining the sg driver I cannot remember
anybody contacting me regarding the way the sg driver ignored
the DISABLE_CLUSTERING flag; that was until Mike Christie raised
it yesterday with regard to iscsi_tcp ***. Gerard's post from
2000 implied clustering was not a problem with U160 (SPI-3)
so perhaps it was for SPI-2 (1998) or SPI (1995) or SCSI-2 (1993).
If so those are pretty old symbios controllers. Why would any
storage manufacturer make a DMA element that was restricted
to Intel's i386 page size per transfer?

I suspect there are a small number of high power users
that need "big transfers" and they will get a surprise
in lk 2.6.16 if things stay as they are.

> I am a little frustrated with this whole thing. Several people have talked
> about switching st to use the block layer. Mike Christie finally did the
> work and the details were discussed on linux-scsi. I thought that
> everybody agreed on the details and I tested that the code works for st.
> Now it seems that there was no agreement!
>
>
>>That leaves a worst case scatter gather list data capacity
>>of (4 * 255) KB if the SCSI LLD (or SATA) uses SG_ALL. That
>>is still just under the 1 MB bar that started this thread.
>>
>
> This is not acceptable.
>
>
>>So I guess we might find out how many people do big,
>>single SCSI command data, transfers when lk 2.6.16 comes
>>out.
>>
>
> Learn the hard way ;-( I know that there are applications that read and
> write large tape blocks but I think they will hit the problems much later.
> These are production systems that are probably not updated frequently.
> When they find out this, they probably have to move away from Linux.
>
> I have talked about st but I think the same arguments apply mostly to sg.

I believe the sg and st drivers share these problems as would
osst and ch if they tried to send a lot of data in one command.
Firmware loading comes to mind.


** these figures assume ".sg_tablesize" in the corresponding low level
driver is at least 128. If not, scale those figures down proportionally.

*** I believe the iscsi_tcp driver is emulating DMA and Mike
Christie said yesterday that code could be added to that driver
to support "clustering" (i.e. scatter gather element sizes larger
than page size).

Doug Gilbert

2006-03-02 21:25:36

by Linus Torvalds

Subject: Re: sg regression in 2.6.16-rc5



On Thu, 2 Mar 2006, Douglas Gilbert wrote:
>
> As more information has come to light, the worst case
> "big transfer" of a single SCSI command through sg (and
> st I suspect) is 512 KB **. With full coalescing that figure
> goes up to 4 MB **. I am also aware that some users
> increase SG_SCATTER_SZ in the sg driver to get larger
> "big transfer"s than sg's current limit of (8MB - 32KB) **.
> That facility has now gone (i.e. upping SG_SCATTER_SZ will
> have no effect) with no replacement mechanism.
>
> So I'll add my vote to "revert this change before lk 2.6.16"
> with a view to applying it after some solution to the "big
> transfer" problem is found.

Considering that the old code was apparently known-broken due to not
honoring the use_clustering flag, I would say that the more likely thing
is that very few people use sg in the first place, and we should wait and
see what the reaction is to actually fixing a real bug.

Doing more than page-sized transfers can be hard/impossible in virtualized
environments, for example.

In contrast, upping the limits should be fairly easy, I assume. Same goes
for if some driver disables clustering even though it shouldn't. No?

Linus

2006-03-02 23:03:55

by Falkinder, David Malcolm

Subject: RE: sg regression in 2.6.16-rc5

Linus,

I contacted Doug off-list, and he asked me to express my concerns here.

Whilst a Linux advocate, I work cross platform, and have but a shallow
knowledge of the kernel, so apologies in advance for any technical
inaccuracies, or misunderstandings ...

Essentially what I conveyed to Doug was :

I guess, I'm not fully aware of the implications of what is being
discussed as there appears to essentially be two implementations of the
SG_IO IOCtl - namely the one in the sg driver, and the one in the block
layer.

One of the key drivers for us using Linux is the ability to do a 16 MB
contiguous single transfer.
i.e. WRITE(6) with 0xFF 0xFF 0xFF as the transfer length. Often we use
patterns like (2^n)-1, 2^n, (2^n)+1, to thoroughly test the SCSI bus, so
ALL transfer sizes are needed.
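
To make those sizes concrete, here is a minimal sketch of how such a
command and the (2^n)-1, 2^n, (2^n)+1 pattern might be built. It assumes
the sequential-access (tape) form of WRITE(6), where bytes 2 to 4 of the
CDB hold a 24-bit transfer length, and variable-length block mode so that
the length counts bytes. The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define WRITE_6_OPCODE   0x0A
#define MAX_24BIT_LENGTH 0xFFFFFFu   /* 16 MB - 1 byte */

/* Fill a 6-byte CDB for a tape-style WRITE(6) in variable-length block
 * mode (FIXED bit clear). Returns 0 on success, -1 if the length does
 * not fit in the 24-bit transfer length field. */
int build_write6(uint8_t cdb[6], uint32_t length)
{
    if (length > MAX_24BIT_LENGTH)
        return -1;
    memset(cdb, 0, 6);
    cdb[0] = WRITE_6_OPCODE;
    cdb[2] = (uint8_t)(length >> 16);   /* most significant byte */
    cdb[3] = (uint8_t)(length >> 8);
    cdb[4] = (uint8_t)length;           /* least significant byte */
    return 0;
}

/* Store 2^n - 1, 2^n and 2^n + 1 in sizes[0..2]; caller keeps n < 31. */
void pattern_sizes(unsigned n, uint32_t sizes[3])
{
    uint32_t p = (uint32_t)1 << n;
    sizes[0] = p - 1;
    sizes[1] = p;
    sizes[2] = p + 1;
}
```

Under these assumptions build_write6(cdb, 0xFFFFFF) produces the
0xFF 0xFF 0xFF transfer length mentioned above, and pattern_sizes(20, ...)
yields the three sizes around 1 MB.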

Certainly a 1 MB limit would be useless, as would 4 MB.

To achieve our goal of 16 MB all we've had to do to date is recompile the
kernel having set SG_SCATTER_SZ to (64 * 4096).
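
A quick way to see why that setting is enough is to count the scatter
gather elements a transfer needs. The sketch below assumes each element
carries at most SG_SCATTER_SZ bytes and that the low level driver allows
up to SG_ALL (255) elements, as discussed earlier in the thread. The
function name is made up for this example.

```c
#include <stddef.h>

/* Number of scatter gather elements needed to carry total_bytes when
 * each element holds at most element_bytes (rounds up). */
size_t sg_elements_needed(size_t total_bytes, size_t element_bytes)
{
    return (total_bytes + element_bytes - 1) / element_bytes;
}
```

With element_bytes set to 64 * 4096 (256 KB), a 16 MB transfer needs 64
elements, which fits under 255. With the 32 KB element size mentioned
earlier in the thread it would need 512 elements, which does not.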

Whilst it would be great to just use a vanilla kernel, this is a
relatively trivial patch to meet our needs. I'd hate to think at any
point anything would be done to move away from this. Certainly we'd have
to either find another proprietary solution, or freeze our Linux
implementation indefinitely. Neither a particularly attractive solution.

-------

I (obviously) support your wish to fix broken code. In my technical
naivety in this area, I obviously can't comment on the ramifications of
a fix/non fix situation other than pertaining directly to the large
transfer situation. However, it's obvious we (and I'm sure others) are
at the moment exploiting this "defect". I feel I am hearing a
lot of discussion regarding the fix, so it's obviously contentious, and
it's agreed it will effectively reduce large transfer functionality of
the kernel; what I am not hearing is a timeline for restoring that
functionality. Personally I'd be happy to "miss out" on a couple of
kernel releases, if I was confident functionality would be restored.
What does worry me is the potential for this fix to be applied, and the
functionality I need not be restored. For example the SG_IO IoCtl in the
block layer was obviously a laudable project, yet to date does not
provide all the features offered by the SG driver [ that I need at least
].

Can I request therefore, that unless the fix can be extended to retain
the large transfer functionality, or a suitable timeline for its
restoration be resolved, that the patch not be applied.

Many thanks,

Best Wishes,

|\
|/ave





2006-03-03 18:28:05

by Steve Byan

Subject: Re: sg regression in 2.6.16-rc5


On Mar 1, 2006, at 1:42 PM, Linus Torvalds wrote:

>
>
> On Wed, 1 Mar 2006, Mark Lord wrote:
>
>> Linus Torvalds wrote:
>>>
>>> On Wed, 1 Mar 2006, Matthias Andree wrote:
>>>> On Tue, 28 Feb 2006, Douglas Gilbert wrote:
>>>>
>>>>> You can stop right there with the 1 MB reads. Welcome
>>>>> to the new, blander sg driver which now shares many
>>>>> size shortcomings with the block subsystem.
>>>> What is the reason to break user-space applications like this?
>>>
>>> Did you read the whole thread? It was a low-level SCSI driver issue,
>>> where nothing broke user space, but the command was just fed to the
>>> drive differently, which then hit a limit in the driver.
>>
>> Will this break major applications like CD/DVD rippers,
>> DVD players, etc.. which read LARGE blocks at a time?
>>
>> If not, then good!
>
> I wouldn't expect it to. Most people use ATA for that, and it tends to
> have lower limits than most SCSI HBA's (well, at least the old PATA), so
> the change - if any - should at most change some of the sg.c limits to be
> no less than what SG_IO has had on ATA forever.
>
> Not that I expect people to have a SCSI CD/DVD drive anyway in this day
> and age, so the sg.c changes probably won't show up at all.

CD-ROM support is a frequently-requested feature on the iSCSI
Enterprise Target (iet) email list. It won't be long before iSCSI CD
and DVD devices start showing up, although the underlying hardware
will be ATAPI or else missing entirely (i.e. ISO image file).

Regards,
-Steve
--
Steve Byan <[email protected]>
Software Architect
Egenera, Inc.
165 Forest Street
Marlboro, MA 01752
(508) 858-3125


2006-03-03 18:55:18

by Linus Torvalds

Subject: Re: sg regression in 2.6.16-rc5



On Fri, 3 Mar 2006, Steve Byan wrote:
>
> On Mar 1, 2006, at 1:42 PM, Linus Torvalds wrote:
> >
> > I wouldn't expect it to. Most people use ATA for that, and it tends to
> > have lower limits than most SCSI HBA's (well, at least the old PATA), so
> > the change - if any - should at most change some of the sg.c limits to be
> > no less than what SG_IO has had on ATA forever.
> >
> > Not that I expect people to have a SCSI CD/DVD drive anyway in this day
> > and age, so the sg.c changes probably won't show up at all.
>
> CD-ROM support is a frequently-requested feature on the iSCSI Enterprise
> Target (iet) email list. It won't be long before iSCSI CD and DVD devices
> start showing up, although the underlying hardware will be ATAPI or else
> missing entirely (i.e. ISO image file).

Yes, but the point that the ATA limits tend to be on the low side still
stands.

For example, I think the IDE driver defaults to a maximum transfer of 256
sectors, and the same number of max scatter-gather entries. Some
controllers will actually lower that, due to silly hw problems.

The point being that it has worked fine for IDE, and if a SCSI controller
has noticeably lower limits than that, there's something really strange
going on, like a real bug.

Linus

2006-03-03 19:14:09

by Steve Byan

Subject: Re: sg regression in 2.6.16-rc5


On Mar 3, 2006, at 1:55 PM, Linus Torvalds wrote:

>
>
> On Fri, 3 Mar 2006, Steve Byan wrote:
>>
>> On Mar 1, 2006, at 1:42 PM, Linus Torvalds wrote:
>>>
>>> I wouldn't expect it to. Most people use ATA for that, and it tends to
>>> have lower limits than most SCSI HBA's (well, at least the old PATA),
>>> so the change - if any - should at most change some of the sg.c limits
>>> to be no less than what SG_IO has had on ATA forever.
>>>
>>> Not that I expect people to have a SCSI CD/DVD drive anyway in this
>>> day and age, so the sg.c changes probably won't show up at all.
>>
>> CD-ROM support is a frequently-requested feature on the iSCSI Enterprise
>> Target (iet) email list. It won't be long before iSCSI CD and DVD devices
>> start showing up, although the underlying hardware will be ATAPI or else
>> missing entirely (i.e. ISO image file).
>
> Yes, but the point that the ATA limits tend to be on the low side still
> stands.
>
> For example, I think the IDE driver defaults to a maximum transfer of 256
> sectors, and the same number of max scatter-gather entries. Some
> controllers will actually lower that, due to silly hw problems.
>
> The point being that it has worked fine for IDE, and if a SCSI controller
> has noticeably lower limits than that, there's something really strange
> going on, like a real bug.

Yes, you are correct. I wasn't intending to contest your main point.
I only intended to point out that ignoring bugs because no-one uses
SCSI DVDs will soon lead to much grief.

Regards,
-Steve
--
Steve Byan <[email protected]>
Software Architect
Egenera, Inc.
165 Forest Street
Marlboro, MA 01752
(508) 858-3125


2006-03-03 19:42:55

by Jeff Garzik

Subject: Re: sg regression in 2.6.16-rc5

Linus Torvalds wrote:
> For example, I think the IDE driver defaults to a maximum transfer of 256
> sectors, and the same number of max scatter-gather entries. Some
> controllers will actually lower that, due to silly hw problems.

Yep. Just to be specific:

256 max sectors IDE driver, 200 max sectors libata (due to driver not
hardware).

256 max s/g entries hardware limit, but due to an IOMMU merging worst
case libata (IDE driver too?) winds up with a 128 entry practical limit.

Newer SATA controllers eliminate the s/g entry limit and DMA boundary
limits, but it's still 256 max-sectors for ATAPI (64k for LBA48 ATA).
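
For comparison with the 1 MB reads that started this thread, the sector
limits above can be converted to bytes. This is only a sketch: it assumes
the limits are counted in 512-byte sectors, and the names are
hypothetical.

```c
#include <stddef.h>

#define SECTOR_BYTES 512u   /* assumed size of one sector in the limits */

/* Convert a max-sectors limit to bytes. */
size_t sectors_to_bytes(size_t sectors)
{
    return sectors * SECTOR_BYTES;
}

/* Return 1 if a request of request_bytes fits under max_sectors. */
int fits_limit(size_t request_bytes, size_t max_sectors)
{
    return request_bytes <= sectors_to_bytes(max_sectors);
}
```

Under that assumption 256 sectors is 128 KB and 200 sectors is 100 KB, so
a single 1 MB request would not fit either limit.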

Jeff


2006-03-03 20:10:04

by Linus Torvalds

Subject: Re: sg regression in 2.6.16-rc5



On Fri, 3 Mar 2006, Jeff Garzik wrote:
>
> 256 max sectors IDE driver, 200 max sectors libata (due to driver not
> hardware).

When I said "lower due to broken hw" I was more thinking about things like
the SiIimage driver, which actually limits the rqsize to 15 sectors due to
some strange hw interactions with Seagate SATA devices.

(It will then raise it back up to 128 if it's not a Seagate SATA drive. I
forget what the exact issue was. Some strange corruption in some limited
case, and not allowing big requests worked around it. There's some
strange IDE quirks out there...).

Linus

2006-03-03 20:30:14

by Jeff Garzik

Subject: Re: sg regression in 2.6.16-rc5

Linus Torvalds wrote:
>
> On Fri, 3 Mar 2006, Jeff Garzik wrote:
>
>>256 max sectors IDE driver, 200 max sectors libata (due to driver not
>>hardware).
>
>
> When I said "lower due to broken hw" I was more thinking about things like
> the SiIimage driver, which actually limits the rqsize to 15 sectors due to
> some strange hw interactions with Seagate SATA devices.

Yep. There's trouble if the last FIS (sata packet, max 8K) is exactly 7.5K.


> (It will then raise it back up to 128 if it's not a Seagate SATA drive. I
> forget what the exact issue was. Some strange corruption in some limited
> case, and not allowing big requests worked around it. There's some
> strange IDE quirks out there...).

Technically it's

if (sectors % 15 == 1)
explode

:) Yes, IDE is a weird weird world...

Jeff