2001-04-08 23:31:52

by Alex Q Chen

Subject: Zero Copy IO

I am trying to find a way to pin down user space memory from the kernel, so
that these user space buffers can be used for direct IO transfer, otherwise
known as "zero copy IO". Searching through the Internet and reading comments
on various newsgroups, it would appear that most developers, including Linus
himself, don't believe in the benefit of "zero copy IO". Most of the
discussion, however, was based on network card drivers. For certain other
drivers, such as the SCSI tape driver, which need to handle a great deal of
data transfer, it would still seem more advantageous to enable zero copy IO
than to copy_from_user() and copy_to_user() all the data. Other OSes such as
AIX and OS/2 have kernel functions that can be used to accomplish such a
task. Has any ground work been done in Linux 2.4 to enable "zero copy IO"?

Thanks in advance for any suggestions or comments

Sincerely,
Alex Chen

IBM SSD Device Driver Development
Office: 9000 S. Rita Rd 9032/2262
Email: [email protected]
Phone: (external) 520-799-5212 (Tie Line) (321)-5212


2001-04-09 00:13:30

by Andi Kleen

Subject: Re: Zero Copy IO

On Sun, Apr 08, 2001 at 04:31:27PM -0700, Alex Q Chen wrote:
> I am trying to find a way to pin down user space memory from the kernel,
> so that these user space buffers can be used for direct IO transfer,
> otherwise known as "zero copy IO". Searching through the Internet and
> reading comments on various newsgroups, it would appear that most
> developers, including Linus himself, don't believe in the benefit of
> "zero copy IO". Most of the discussion, however, was based on network
> card drivers. For certain other drivers, such as the SCSI tape driver,
> which need to handle a great deal of data transfer, it would still seem
> more advantageous to enable zero copy IO than to copy_from_user() and
> copy_to_user() all the data. Other OSes such as AIX and OS/2 have kernel
> functions that can be used to accomplish such a task. Has any ground
> work been done in Linux 2.4 to enable "zero copy IO"?

Yes, e.g. the raw I/O device does it using kiovecs. See
drivers/char/raw.c, fs/iobuf.c, et al. The 2.4 kernel with the zerocopy
networking patches also has a different implementation.
The raw.c implementation is not very efficient at the moment, though,
mostly because of limitations in the block device layer (but that
should be no problem for a direct tape driver).
This work is also in the 2.2 kernels of most distributions.
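
For reference, the basic kiovec usage pattern looks roughly like the
sketch below (the function name, the length handling and the READ/WRITE
choice are only illustrative; see raw.c for the real thing):

#include <linux/iobuf.h>        /* 2.4 kiobuf API */
#include <linux/fs.h>           /* READ / WRITE */

/*
 * Sketch: pin down a user-space buffer so a driver can transfer
 * directly to/from it, then release it again.  uaddr/len would come
 * from the read()/write()/ioctl() arguments.
 */
static int pin_and_do_io(unsigned long uaddr, size_t len, int rw)
{
        struct kiobuf *iobuf;
        int err;

        err = alloc_kiovec(1, &iobuf);          /* allocate one kiobuf */
        if (err)
                return err;

        /* Fault in and lock the user pages; fills iobuf->maplist[]. */
        err = map_user_kiobuf(rw, iobuf, uaddr, len);
        if (err)
                goto out_free;

        /*
         * iobuf->maplist[0..nr_pages-1] now holds the pinned pages;
         * a block driver would hand them to brw_kiovec() or build its
         * own scatter-gather list from them.
         */

        unmap_kiobuf(iobuf);                    /* unpin the user pages */
out_free:
        free_kiovec(1, &iobuf);
        return err;
}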


-Andi

2001-04-09 00:55:24

by Douglas Gilbert

Subject: Re: Zero Copy IO

"Alex Q Chen" <[email protected]> wrote:

> I am trying to find a way to pin down user space
> memory from the kernel, so that these user space
> buffers can be used for direct IO transfer,
> otherwise known as "zero copy IO". Searching
> through the Internet and reading comments on
> various newsgroups, it would appear that most
> developers, including Linus himself, don't believe
> in the benefit of "zero copy IO". Most of the
> discussion, however, was based on network card
> drivers. For certain other drivers, such as the
> SCSI tape driver, which need to handle a great
> deal of data transfer, it would still seem more
> advantageous to enable zero copy IO than to
> copy_from_user() and copy_to_user() all the data.
> Other OSes such as AIX and OS/2 have kernel
> functions that can be used to accomplish such a
> task. Has any ground work been done in Linux 2.4
> to enable "zero copy IO"?

Alex,
The kiobufs mechanism in the 2.4 series is the appropriate
tool for avoiding copy_from_user() and copy_to_user().
The definitive driver is in drivers/char/raw.c which
does synchronous IO to block devices such as disks
(but is probably not appropriate for tapes).

The SCSI generic (sg) driver supports direct IO. The driver
in lk 2.4.3 has the direct IO code commented out while
a version that I'm currently testing (sg 3.1.18 at
http://www.torque.net/sg) has its direct IO code activated. I have
a web page comparing throughput times and CPU utilizations
at http://www.torque.net/sg/rbuf_tbl.html . My testing
indicates that the kiobufs mechanism is now working
quite well. For various reasons I still think that it
is best to default to indirect IO and let speed hungry
users enable dio (which is done in sg via procfs). Even
when the user selects direct IO it should be possible to
fall back to indirect IO. [Sg does this when a SCSI
adapter can't support direct IO (e.g. an ISA adapter).]

Since the SCSI tape (st) driver is structurally similar
to sg, it should be possible to add direct IO support
to st.

One thing to note is that when you let the user provide
the buffer for direct IO (e.g. with malloc) then on
the i386 it won't be contiguous from a bus address POV.
This means large scatter gather lists (typically with
each element 4 KB on i386) which can be time consuming
to load on some SCSI adapters. One way around this would
be for a driver to provide "malloc/free" like ioctls.
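
To make that concrete, here is a rough sketch of how a driver could
walk a pinned kiobuf and derive one scatter-gather segment per page;
the sg_seg structure and max_segs limit are purely illustrative, since
the real scatterlist layout depends on the kernel version and the HBA:

#include <linux/iobuf.h>
#include <linux/mm.h>                   /* PAGE_SIZE */

/* Illustrative segment descriptor; a real driver would fill in its
 * HBA-specific scatter-gather entries instead. */
struct sg_seg {
        struct page *page;
        unsigned int offset;            /* offset within the page */
        unsigned int length;            /* bytes in this segment  */
};

/*
 * A malloc()ed user buffer is virtually contiguous but physically
 * scattered, so a kiobuf mapping of it typically yields one segment
 * per 4 KB page on i386 -- which is why the lists get long.
 */
static int kiobuf_to_segments(struct kiobuf *iobuf,
                              struct sg_seg *segs, int max_segs)
{
        unsigned int remaining = iobuf->length;
        unsigned int offset = iobuf->offset;    /* offset into first page */
        int i;

        for (i = 0; i < iobuf->nr_pages && remaining; i++) {
                unsigned int chunk = PAGE_SIZE - offset;

                if (chunk > remaining)
                        chunk = remaining;
                if (i >= max_segs)
                        return -1;              /* too long for this HBA */

                segs[i].page   = iobuf->maplist[i];
                segs[i].offset = offset;
                segs[i].length = chunk;

                remaining -= chunk;
                offset = 0;             /* only the first page is offset */
        }
        return i;                       /* number of segments built */
}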

Doug Gilbert

2001-04-09 01:53:55

by Ryan Mack

Subject: [QUESTIONS] Transition from pcmcia-cs to 2.4 built-in PCMCIA

I have a 3c595 CardBus card and a Lucent Orinoco card

under pcmcia-cs I used the following drivers:

pcmcia_core, ds, and (I think) cs for low level stuff,
i82365 for the controller,
cb_enabler and 3c595_cb for the 3c595,
and wvlan_cs for the Orinoco

under the kernel pcmcia support, I use:

pcmcia_core and ds for low level stuff,
yenta_socket for the controller,
3c59x (same driver as PCI) for the 3c595,
and hermes and orinoco_cs for the Orinoco

Now, after a while, I figured out which new drivers I should start using,
but a few things about the change still confuse me...

First, why have I stopped needing cs and cb_enabler?

Second, why is yenta_socket only compiled if I enable CardBus support in
the kernel? I'm running an Orinoco card on another machine, and since I
don't think it's CardBus (am I wrong?), I didn't enable CB in the kernel.
The i82365 driver is the only one compiled, but it seems to work fine on
that machine. Should I enable CardBus support and use yenta_socket
instead?

Third, on the first machine with both cards, neither card works if I use
i82365 instead of yenta_socket. Why? The Orinoco gets Tx timeouts on
every packet, and inserting the 3c595 causes the controller (socket) to
time out waiting for reset, and it doesn't recognize the 3c595.

Despite the confusion of changing systems, I must say that the orinoco
driver works much better than wvlan_cs for me, as the two Orinoco cards in
IBSS Ad-Hoc mode would get intermittent Tx timeouts with the wvlan_cs
driver. It's also nice not to have to rebuild pcmcia-cs when I upgrade my
kernel anymore. Keep up the good work!

Ryan

2001-04-09 09:04:30

by David Woodhouse

Subject: Re: [QUESTIONS] Transition from pcmcia-cs to 2.4 built-in PCMCIA


[email protected] said:
> First, why have I stopped needing cs and cb_enabler?

cs is built into pcmcia_core.o; cb_enabler should still be there, though.
It may be that you only need cb_enabler for the old-style CardBus client
drivers - I'm not sure.

> Second, why is yenta_socket only compiled if I enable CardBus support
> in the kernel? I'm running an Orinoco card on another machine, and
> since I don't think it's CardBus (am I wrong?), I didn't enable CB in
> the kernel. The i82365 driver is the only one compiled, but it seems
> to work fine on that machine. Should I enable CardBus support and use
> yenta_socket instead?

yenta_socket is the driver for CardBus i82365-compatible sockets.
i82365 no longer drives CardBus sockets, only PCMCIA.

> Third, on the first machine with both cards, neither card works if I
> use i82365 instead of yenta_socket, why? The Orinoco gets Tx timeouts
> on every packet, and inserting the 3c595 causes the controller
> (socket) to time out waiting for reset and it doesn't recognize the
> 3c595.

The PCMCIA card ought to work. The i82365 driver has probably screwed up the
IRQ routing - it no longer knows about some of the differences between CardBus
and PCMCIA bridges. What exactly is the bridge in this machine?

--
dwmw2


2001-04-09 12:32:30

by Jeremy Jackson

Subject: Re: Zero Copy IO

Douglas Gilbert wrote:

> "Alex Q Chen" <[email protected]> wrote:
>
> > I am trying to find a way to pin down user space
> > memory from the kernel, so that these user space
> > buffers can be used for direct IO transfer,
> > otherwise known as "zero copy IO". Searching
> > through the Internet and reading comments on
> > various newsgroups, it would appear that most
> > developers, including Linus himself, don't believe
> > in the benefit of "zero copy IO". Most of the
> > discussion, however, was based on network card
> > drivers. For certain other drivers, such as the
> > SCSI tape driver, which need to handle a great
> > deal of data transfer, it would still seem more
> > advantageous to enable zero copy IO than to
> > copy_from_user() and copy_to_user() all the data.
> > Other OSes such as AIX and OS/2 have kernel
> > functions that can be used to accomplish such a
> > task. Has any ground work been done in Linux 2.4
> > to enable "zero copy IO"?
>
> Alex,
> The kiobufs mechanism in the 2.4 series is the appropriate
> tool for avoiding copy_from_user() and copy_to_user().
> The definitive driver is in drivers/char/raw.c which
> does synchronous IO to block devices such as disks
> (but is probably not appropriate for tapes).
>
> The SCSI generic (sg) driver supports direct IO. The driver
> in lk 2.4.3 has the direct IO code commented out while
> a version that I'm currently testing (sg 3.1.18 at
> http://www.torque.net/sg) has its direct IO code activated. I have
> a web page comparing throughput times and CPU utilizations
> at http://www.torque.net/sg/rbuf_tbl.html . My testing
> indicates that the kiobufs mechanism is now working
> quite well. For various reasons I still think that it
> is best to default to indirect IO and let speed hungry
> users enable dio (which is done in sg via procfs). Even
> when the user selects direct IO it should be possible to
> fall back to indirect IO. [Sg does this when a SCSI
> adapter can't support direct IO (e.g. an ISA adapter).]
>
> Since the SCSI tape (st) driver is structurally similar
> to sg, it should be possible to add direct IO support
> to st.
>
> One thing to note is that when you let the user provide
> the buffer for direct IO (e.g. with malloc) then on
> the i386 it won't be contiguous from a bus address POV.
> This means large scatter gather lists (typically with
> each element 4 KB on i386) which can be time consuming
> to load on some SCSI adapters. One way around this would
> be for a driver to provide "malloc/free" like ioctls.

I'm not very knowledgeable here, but doesn't the sound driver
use mmap() to do this? Either way, the AGP GART is
basically a paged MMU that allows non-contiguous physical memory
to be made to look contiguous from the AGP side *and* from PCI
(on most chipsets?). Perhaps this would be helpful.

Large contiguous physical allocations seem to be difficult at the
moment, but would be nice, as they would allow use of the larger
MMU pages that many CPUs support. Someone mentioned that reverse
page table support would be required first...
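
For what it's worth, the sound-driver style of avoiding the copy looks
roughly like the sketch below: the driver owns the DMA buffer and lets
the application mmap() it. This assumes the 2.4 remap_page_range()
interface and a page-aligned, physically contiguous driver buffer, and
leaves out details such as marking the pages reserved:

#include <linux/fs.h>
#include <linux/mm.h>
#include <asm/io.h>                     /* virt_to_phys() */

/* Driver-owned DMA buffer, allocated elsewhere (e.g. with
 * __get_free_pages()). */
extern void *drv_dma_buf;
extern unsigned long drv_dma_buf_size;

/*
 * mmap() handler: map the driver's own buffer into the caller's
 * address space so data never needs copy_{to,from}_user() at all.
 */
static int drv_mmap(struct file *file, struct vm_area_struct *vma)
{
        unsigned long size = vma->vm_end - vma->vm_start;

        if (size > drv_dma_buf_size)
                return -EINVAL;

        /* 2.4 signature: remap_page_range(from, phys, size, prot) */
        if (remap_page_range(vma->vm_start,
                             virt_to_phys(drv_dma_buf),
                             size, vma->vm_page_prot))
                return -EAGAIN;

        return 0;
}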


2001-04-09 19:21:59

by Alan

Subject: Re: Zero Copy IO

> more advantageous to enable zero copy IO than to copy_from_user() and
> copy_to_user() all the data. Other OSes such as AIX and OS/2 have kernel
> functions that can be used to accomplish such a task. Has any ground work
> been done in Linux 2.4 to enable "zero copy IO"?

kiovecs support this. Note that the current kiovec has problems when it comes
to certain kinds of latency-critical use, and the 2.5 kernel meeting hashed out
some big changes here. But for 2.4, the kiovecs are there.