Date: Thu, 20 Dec 2018 08:26:56 +0100
From: Christoph Hellwig
To: Douglas Gilbert
Cc: Christoph Hellwig, Boaz Harrosh, axboe@kernel.dk,
    martin.petersen@oracle.com, Johannes Thumshirn, Benjamin Block,
    linux-scsi@vger.kernel.org, linux-block@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: remove exofs, the T10 OSD code and block/scsi bidi support V3
Message-ID: <20181220072656.GA10011@lst.de>
References: <20181111133211.13926-1-hch@lst.de>
    <4f4b6aff-6726-c500-e3e4-f8b73d641851@electrozaur.com>
    <20181219144347.GB23410@lst.de>
    <0e8b8d45-cfeb-ba9d-c92f-953cabede1ee@interlog.com>
In-Reply-To: <0e8b8d45-cfeb-ba9d-c92f-953cabede1ee@interlog.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 19, 2018 at 09:01:53PM -0500, Douglas Gilbert wrote:
>> 1) reduce the size of every kernel with block layer
>> support, and
>>    even more for every kernel with scsi support
>
> By proposing the removal of bidi support from the block layer, it isn't
> just the SCSI subsystem that will be impacted. Those NVMe documents
> that you referred me to earlier in the year, in the command tables
> in 1.3c and earlier you have noticed the 2 bit direction field and
> what 11b means? Even if there aren't any bidi NVMe commands *** yet,
> the fact that NVMe's 64 byte command format has provision for 4
> (not 2) independent data transfers (data + meta, for each direction).
> Surely NVMe will sooner or later take advantage of those ... a
> command like READ GATHERED comes to mind.

NVMe on the other hand does not have support for separate read and write
buffers as in the current SCSI bidi support, as it encodes the data
transfers in the SQE.  So IFF NVMe does bidi commands it would have to
use a single buffer for data in/out, which can be easily done in the
block layer without the current bidi support that chains two struct
request instances for data in and data out.

>> 2) reduce the size of the critical struct request structure by
>>    128 bits, thus reducing the memory used by every blk-mq driver
>>    significantly, never mind the cache effects
>
> Hmm, one pointer (that is null in the non-bidi case) should be enough,
> that's 64 or 32 bits.

Due to the way we use request chaining we need two fields at the
moment: ->special and ->next_rq.  If we'd refactor the whole thing for
the basically non-existent user we could indeed probably get it down to
a single pointer.

> While on the subject of bidi, the order of transfers: is the data-out
> (to the target) always before the data-in or is it the target device
> that decides (depending on the semantics of the command) who is first?

The way I read SAM, data needs to be transferred to the device for
processing first, then the processing occurs, and then it is transferred
out, so the order seems fixed.
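
To make the size argument above concrete, here is a rough sketch using
Python's ctypes.  It is not the kernel's struct request layout: the three
structure classes, the "payload" placeholder field, the "bidi" field and
the saved_bits() helper are all hypothetical names made up for
illustration.  Only the ->special and ->next_rq field names come from the
discussion.  It simply counts how many bits two pointer-sized fields cost
compared with one or with none.

```python
import ctypes


class RequestWithChaining(ctypes.Structure):
    # Hypothetical stand-in: "payload" represents everything else in the
    # structure; special and next_rq are the two fields named above.
    _fields_ = [
        ("payload", ctypes.c_void_p),
        ("special", ctypes.c_void_p),
        ("next_rq", ctypes.c_void_p),
    ]


class RequestSinglePointer(ctypes.Structure):
    # Hypothetical refactoring that keeps one pointer for the bidi case.
    _fields_ = [
        ("payload", ctypes.c_void_p),
        ("bidi", ctypes.c_void_p),
    ]


class RequestNoBidi(ctypes.Structure):
    # Hypothetical layout with bidi support removed entirely.
    _fields_ = [("payload", ctypes.c_void_p)]


def saved_bits(before, after):
    """Return how many bits smaller `after` is than `before`."""
    return 8 * (ctypes.sizeof(before) - ctypes.sizeof(after))


if __name__ == "__main__":
    # The result depends on the pointer width of the machine running this.
    print("removing both fields saves",
          saved_bits(RequestWithChaining, RequestNoBidi), "bits")
    print("refactoring to one pointer saves",
          saved_bits(RequestWithChaining, RequestSinglePointer), "bits")
```

With 64-bit pointers the first figure corresponds to the 128 bits quoted
above, and the second to the single pointer Doug suggests.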
>
> Doug Gilbert
>
> *** there could already be vendor specific bidi NVMe commands out
> there (ditto for SCSI)

For NVMe they'd need to transfer data in and out in the same buffer to
sort of work, and even then only if we don't happen to be bounce
buffering using swiotlb, or using a network transport.  Similarly for
SCSI, only iSCSI at the moment supports bidi CDBs, so we could have
applications using vendor specific bidi commands on iSCSI, which is
exactly what we're trying to find out, but it is a very niche use case.
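
The difference between the two buffer models discussed in this thread can
be sketched as a toy model in Python.  Nothing here is block layer or
driver code: bidi_two_buffers(), bidi_single_buffer() and the `process`
callback are hypothetical names, and the requirement that the result has
the same length as the input is an assumption made to keep the
single-buffer case simple.  Both functions follow the fixed order
described above: data out to the device, processing, then data back in.

```python
def bidi_two_buffers(process, data_out):
    """Separate buffers: data_out is left untouched and the result is
    returned in a distinct data_in buffer."""
    data_in = bytearray(process(bytes(data_out)))
    return data_out, data_in


def bidi_single_buffer(process, buf):
    """One shared buffer: the device consumes buf first, then the result
    overwrites it in place."""
    result = process(bytes(buf))
    if len(result) != len(buf):
        # Assumption of this sketch: in-place reuse needs equal sizes.
        raise ValueError("result must fit the shared buffer exactly")
    buf[:] = result
    return buf


def invert(data):
    """Toy stand-in for whatever the device does with the data."""
    return bytes(b ^ 0xFF for b in data)


if __name__ == "__main__":
    out_buf = bytearray(b"\x00\x0f")
    kept, received = bidi_two_buffers(invert, out_buf)
    print(kept.hex(), received.hex())

    shared = bytearray(b"\x00\x0f")
    bidi_single_buffer(invert, shared)
    print(shared.hex())
```

In the single-buffer variant the original outgoing data is gone once the
command completes, which is the trade-off of not carrying a second
buffer around.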