Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757470AbYAJVya (ORCPT ); Thu, 10 Jan 2008 16:54:30 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753978AbYAJVyV (ORCPT ); Thu, 10 Jan 2008 16:54:21 -0500 Received: from accolon.hansenpartnership.com ([76.243.235.52]:49618 "EHLO accolon.hansenpartnership.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753574AbYAJVyU (ORCPT ); Thu, 10 Jan 2008 16:54:20 -0500 Subject: Re: [PATCH] bsg : Add support for io vectors in bsg From: James Bottomley To: Pete Wyckoff Cc: FUJITA Tomonori , tomof@acm.org, deepakrc@gmail.com, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org In-Reply-To: <20080110214605.GD1928@osc.edu> References: <200801050501.m0551LFB030667@mbox.iij4u.or.jp> <20080108220918.GA9484@osc.edu> <20080109091118N.fujita.tomonori@lab.ntt.co.jp> <20080110204325.GC1928@osc.edu> <1199998531.3141.96.camel@localhost.localdomain> <20080110214605.GD1928@osc.edu> Content-Type: text/plain Date: Thu, 10 Jan 2008 15:54:14 -0600 Message-Id: <1200002054.3141.103.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.12.2 (2.12.2-2.fc8) Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3995 Lines: 85 On Thu, 2008-01-10 at 16:46 -0500, Pete Wyckoff wrote: > James.Bottomley@HansenPartnership.com wrote on Thu, 10 Jan 2008 14:55 -0600: > > On Thu, 2008-01-10 at 15:43 -0500, Pete Wyckoff wrote: > > > fujita.tomonori@lab.ntt.co.jp wrote on Wed, 09 Jan 2008 09:11 +0900: > > > > On Tue, 8 Jan 2008 17:09:18 -0500 > > > > Pete Wyckoff wrote: > > > > > I took another look at the compat approach, to see if it is feasible > > > > > to keep the compat handling somewhere else, without the use of #ifdef > > > > > CONFIG_COMPAT and size-comparison code inside bsg.c. I don't see how. > > > > > The use of iovec is within a write operation on a char device. It's > > > > > not amenable to a compat_sys_ or a .compat_ioctl approach. > > > > > > > > > > I'm partial to #1 because the use of architecture-independent fields > > > > > matches the rest of struct sg_io_v4. But if you don't want to have > > > > > another iovec type in the kernel, could we do #2 but just return > > > > > -EINVAL if the need for compat is detected? I.e. change > > > > > dout_iovec_count to dout_iovec_length and do the math? > > > > > > > > If you are ok with removing the write/read interface and just have > > > > ioctl, we could can handle comapt stuff like others do. But I think > > > > that you (OSD people) really want to keep the write/read > > > > interface. Sorry, I think that there is no workaround to support iovec > > > > in bsg. > > > > > > I don't care about read/write in particular. But we do need some > > > way to launch asynchronous SCSI commands, and currently read/write > > > are the only way to do that in bsg. The reason is to keep multiple > > > spindles busy at the same time. > > > > Won't multi-threading the ioctl calls achieve the same effect? Or do > > you trip over BKL there? > > There's no BKL on (new) ioctls anymore, at least. A thread per > device would be feasible perhaps. But if you want any sort of > pipelining out of the device, esp. in the remote iSCSI case, you > need to have a good number of commands outstanding to each device. > So a thread per command per device. Typical iSCSI queue depth of > 128 times 16 devices for a small setup is a lot of threads. I was actually thinking of a thread per outstanding command. > The pthread/pipe latency overhead is not insignificant for fast > storage networks too. > > > > How about these new ioctls instead of read/write: > > > > > > SG_IO_SUBMIT - start a new blk_execute_rq_nowait() > > > SG_IO_TEST - complete and return a previous req > > > SG_IO_WAIT - wait for a req to finish, interruptibly > > > > > > Then old write users will instead do ioctl SUBMIT. Read users will > > > do TEST for non-blocking fd, or WAIT for blocking. And SG_IO could > > > be implemented as SUBMIT + WAIT. > > > > > > Then we can do compat_ioctl and convert up iovecs out-of-line before > > > calling the normal functions. > > > > > > Let me know if you want a patch for this. > > > > Really, the thought of re-inventing yet another async I/O interface > > isn't very appealing. > > I'm fine with read/write, except Tomo is against handling iovecs > because of the compat complexity with struct iovec being different > on 32- vs 64-bit. There is a standard way to do "compat" ioctl that > hides this handling in a different file (not bsg.c), which is the > only reason I'm even considering these ioctls. I don't care about > compat setups per se. > > Is there another async I/O mechanism? Userspace builds the CDBs, > just needs some way to drop them in SCSI ML. BSG is almost perfect > for this, but doesn't do iovec, leading to lots of memcpy. No, it's just that async interfaces in Linux have a long and fairly unhappy history. James -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/