Date: Thu, 6 Oct 2016 03:31:55 -0700
From: Christoph Hellwig
To: Wouter Verhelst
Cc: Alex Bligh, "nbd-general@lists.sourceforge.net", Jens Axboe,
	Josef Bacik, "linux-kernel@vger.kernel.org", Christoph Hellwig,
	"linux-block@vger.kernel.org", Kernel Team
Subject: Re: [Nbd] [PATCH][V3] nbd: add multi-connection support
Message-ID: <20161006103155.GA20279@infradead.org>
References: <2B49072B-6F83-4CD2-863B-5AB21E1F7816@fb.com>
	<20161003072049.GA16847@infradead.org>
	<20161003075149.u3ppcnk2j55fci6h@grep.be>
	<20161003075701.GA29457@infradead.org>
	<97C12880-A095-4F7B-B828-1837E65F7721@alex.org.uk>
	<20161003210714.ukgojallutalpjun@grep.be>
	<2AEFCBE9-E2C9-400E-9FF8-91901D7CE442@alex.org.uk>
	<20161006090415.xme3mgcjtkdx2j5f@grep.be>
In-Reply-To: <20161006090415.xme3mgcjtkdx2j5f@grep.be>

On Thu, Oct 06, 2016 at 11:04:15AM +0200, Wouter Verhelst wrote:
> In the current situation, a client could opportunistically send a number
> of write requests immediately followed by a flush and hope for the best.
> However, in that case there is no guarantee that for the write requests
> that the client actually cares about to have hit the disk, a reply
> arrives on the client side before the flush reply arrives.
> If that doesn't happen, that would then mean the client would have to
> issue another flush request, probably at a performance hit.

There is also no guarantee that the server would receive them in order.
Note that people looked into schemes like this multiple times using a SCSI
feature called ordered tags which should provide this sort of ordering,
but no one managed to make it work reliably.

> As I understand Christoph's explanations, currently the Linux kernel
> *doesn't* issue flush requests unless and until the necessary writes
> have already completed (i.e., the reply has been received and processed
> on the client side). Given that, given the issue in the previous
> paragraph, and given the uncertainty introduced with multiple
> connections, I think it is reasonable to say that a client should just
> not assume a flush touches anything except for the writes for which it
> has already received a reply by the time the flush request is sent out.

Exactly.  That's the wording in other protocol specifications, and the
semantics Linux (and Windows) rely on.

> Christoph: just to double-check: would such semantics be incompatible
> with the semantics that the Linux kernel expects of block devices? If
> so, we'll have to review. Otherwise, I think we should go with that.

No, they match the cache flush semantics in every other storage protocol
known to me, and they match the expectations of both the Linux kernel
and any other OS or consumer I know about perfectly.
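
To make the agreed rule concrete, here is a rough sketch of the
client-side bookkeeping it implies. This is a toy model only, not code
from the nbd driver or from any patch in this thread; the FlushTracker
class and all of its method names are hypothetical. It assumes a client
that may only treat a write as durable once a flush has completed that
was sent after that write's reply arrived.

```python
class FlushTracker:
    """Toy model (hypothetical names): a flush covers only the writes
    whose replies had already arrived when the flush was sent."""

    def __init__(self):
        self.in_flight = set()   # writes sent, no reply yet
        self.completed = set()   # writes acknowledged, not yet known durable
        self.durable = set()     # writes covered by a completed flush
        self._flushes = {}       # flush id -> writes acknowledged at send time

    def send_write(self, write_id):
        self.in_flight.add(write_id)

    def write_reply(self, write_id):
        self.in_flight.remove(write_id)
        self.completed.add(write_id)

    def send_flush(self, flush_id):
        # Snapshot at send time: writes still in flight are NOT covered,
        # even if their replies arrive before the flush reply does.
        self._flushes[flush_id] = set(self.completed)

    def flush_reply(self, flush_id):
        covered = self._flushes.pop(flush_id)
        self.durable |= covered
        self.completed -= covered
        return covered


t = FlushTracker()
t.send_write("w1")
t.send_write("w2")
t.write_reply("w1")
t.send_flush("f1")        # w2 is still in flight here
t.write_reply("w2")       # acknowledged before the flush reply, still not covered
covered = t.flush_reply("f1")
print(sorted(covered))    # ['w1']
```

In this model w2 needs a second flush before the client can rely on it,
which is the performance cost mentioned in the quoted text above. The
snapshot is taken per flush request rather than per connection, so the
same bookkeeping would apply regardless of which connection carries the
flush.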