Subject: Re: Recent removal of bsg read/write support
From: Douglas Gilbert
Reply-To: dgilbert@interlog.com
To: Dror Levin, torvalds@linux-foundation.org
Cc: richard.weinberger@gmail.com, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, hch@infradead.org, axboe@kernel.dk
Date: Tue, 4 Sep 2018 06:10:30 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018-09-03 10:34 AM, Dror Levin wrote:
> On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds wrote:
>>
>> On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger wrote:
>>>
>>> CC'ing relevant people. Otherwise your mail might get lost.
>>
>> Indeed.
>
> Sorry for that.
>
>>> On Sun, Sep 2, 2018 at 1:37 PM Dror Levin wrote:
>>>>
>>>> We have an internal tool that uses the bsg read/write interface to
>>>> issue SCSI commands as part of a test suite for a storage device.
>>>>
>>>> After recently reading on LWN that this interface is to be removed we
>>>> tried porting our code to use sg instead. However, that raises new
>>>> issues - mainly getting ENOMEM over iSCSI for unknown reasons.
>>
>> Is there any chance that you can make more data available?
>
> Sure, I can try.
>
> We use writev() to send up to SG_MAX_QUEUE tasks at a time. Occasionally not
> all tasks are written at which point we wait for tasks to return before
> sending more, but then writev() fails with ENOMEM and we see this in the syslog:
>
> Sep 1 20:58:14 gdc-qa-io-017 kernel: sd 441:0:0:5: [sg73]
> sg_common_write: start_req err=-12
>
> Failing tasks are reads of 128KiB.
>
>> I'd rather fix the sg interface (which while also broken garbage, we
>> can't get rid of) than re-surrect the bsg interface.
>>
>> That said, the removed bsg code looks a hell of a lot prettier than
>> the nasty sg interface code does, although it also lacks absolutely
>> _any_ kind of security checking.
>
> For us the bsg interface also has several advantages over sg:
> 1. The device name is its HCTL which is nicer than an arbitrary integer.
> 2. write() supports writing more than one sg_io_v4 struct so we don't have
> to resort to writev().
> 3. Queue size is the device's queue depth and not SG_MAX_QUEUE which is 16.
>
>>>> Because of this we would like to continue using the bsg interface,
>>>> even if some changes are required to meet security concerns.
>>
>> I wonder if we could at least try to unify the bsg/sg code - possibly
>> by making sg use the prettier bsg code (but definitely have to add all
>> the security measures).
>>
>> And dammit, the SCSI people need to get their heads out of their
>> arses. This whole "stream random commands over read/write" needs to go
>> the f*ck away.
>>
>> Could we perhaps extend the SG_IO interface to have an async mode?
>> Instead of "read/write", have "SG_IOSUBMIT" and "SG_IORECEIVE" and
>> have the SG_IO ioctl just be a shorthand of "both".
>
> Just my two cents - having an interface other than read/write won't allow
> users to treat this fd as a regular file with epoll() and read(). This is
> a major bonus for this interface - an sg/bsg device can be used just like
> a socket or pipe in any reactor (we use boost asio for example).

The advantage of having two ioctls is that they can both pass (meta-)data
bidirectionally. That is hard to do with standard read() and write() calls.
The command tag is the piece of meta-data that goes against the flow:
returned from SG_IOSUBMIT, optionally given to SG_IORECEIVE (which might
have a 'cancel command' flag).

The sg v1, v2 and v3 interfaces could keep their write()/read() interfaces
for backward compatibility (to Linux 1.0.0, March 1994 for sg v1). New,
clean submit and receive paths could be added to the sg driver for the v3
and v4 twin ioctl interface. Previously the sg v4 interface was only
supported by the bsg driver. One advantage of sg v4 over v3 is support
for bidi commands.

Not sure if epoll/poll works with an ioctl; if not, we could add a "dummy"
read() call that notionally returned SCSI status. The SG_IORECEIVE ioctl
would still be needed to "clean up" the command, and optionally transfer
the data-in buffer.
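
To make the tag flow concrete, here is a minimal user-space model in C. It is only a sketch under assumptions: SG_IOSUBMIT and SG_IORECEIVE are just the names proposed in this thread, and everything prefixed model_ or MODEL_ is hypothetical and invented for illustration. It does not touch the sg driver. The tag is assumed to be a plain slot index, the queue limit mirrors the SG_MAX_QUEUE value of 16 quoted above, and the 'cancel command' flag is not modelled.

```c
#include <assert.h>

/* Hypothetical user-space model of the twin-ioctl idea; NOT kernel code. */
#define MODEL_MAX_QUEUE 16  /* assumption: mirrors SG_MAX_QUEUE (16) */

enum model_state { MODEL_FREE = 0, MODEL_INFLIGHT, MODEL_DONE };

struct model_cmd {
    enum model_state state;
    int scsi_status;        /* valid once state == MODEL_DONE */
};

struct model_dev {
    struct model_cmd slots[MODEL_MAX_QUEUE];
};

/* Stand-in for "SG_IOSUBMIT": queue a command and hand back its tag.
 * Returns a tag >= 0, or -1 when every slot is already in use. */
int model_submit(struct model_dev *dev)
{
    for (int tag = 0; tag < MODEL_MAX_QUEUE; tag++) {
        if (dev->slots[tag].state == MODEL_FREE) {
            dev->slots[tag].state = MODEL_INFLIGHT;
            dev->slots[tag].scsi_status = 0;
            return tag;
        }
    }
    return -1;
}

/* Stand-in for the device finishing a command with a given SCSI status. */
void model_complete(struct model_dev *dev, int tag, int scsi_status)
{
    assert(tag >= 0 && tag < MODEL_MAX_QUEUE);
    assert(dev->slots[tag].state == MODEL_INFLIGHT);
    dev->slots[tag].state = MODEL_DONE;
    dev->slots[tag].scsi_status = scsi_status;
}

/* Stand-in for "SG_IORECEIVE": reap a completed command and free its slot.
 * tag >= 0 asks for that specific command; tag == -1 accepts any completed one.
 * Returns the reaped tag and stores its status, or -1 if nothing is ready. */
int model_receive(struct model_dev *dev, int tag, int *scsi_status)
{
    if (tag >= MODEL_MAX_QUEUE)
        return -1;

    int first = (tag >= 0) ? tag : 0;
    int last = (tag >= 0) ? tag : MODEL_MAX_QUEUE - 1;

    for (int t = first; t <= last; t++) {
        if (dev->slots[t].state == MODEL_DONE) {
            *scsi_status = dev->slots[t].scsi_status;
            dev->slots[t].state = MODEL_FREE;
            return t;
        }
    }
    return -1;
}
```

In this model a caller starts from a zero-initialised struct model_dev, keeps the tag returned by model_submit(), and later passes either that tag or -1 to model_receive(). A receive for a command that is still in flight returns -1 rather than blocking, which is the point where a poll()-style readiness notification would fit.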

Tony Battersby has also requested twin ioctls saying that it is extremely
tedious ploughing through logs full of SG_IO calls and that clearly
separating submits from receives would make things somewhat better.

Doug Gilbert