From: Jeff Moyer
To: Dan Williams
Cc: Pankaj Gupta, linux-nvdimm, Linux Kernel Mailing List, Linux ACPI,
	Vishal L Verma, Dave Jiang, "Weiny, Ira", "Rafael J. Wysocki",
	Len Brown, Vivek Goyal, Keith Busch
Subject: Re: [PATCH] virtio pmem: fix async flush ordering
Date: Fri, 22 Nov 2019 11:25:11 -0500
In-Reply-To: (Dan Williams's message of "Fri, 22 Nov 2019 08:13:05 -0800")
References: <20191120092831.6198-1-pagupta@redhat.com>
List-ID: linux-kernel@vger.kernel.org

Dan Williams writes:

> On Fri, Nov 22, 2019 at 8:09 AM Jeff Moyer wrote:
>>
>> Dan Williams writes:
>>
>> > On Wed, Nov 20, 2019 at 9:26 AM Jeff Moyer wrote:
>> >>
>> >> Pankaj Gupta writes:
>> >>
>> >> > Remove logic to create child bio in the async flush function which
>> >> > causes child bio to get executed after parent bio 'pmem_make_request'
>> >> > completes. This resulted in wrong ordering of REQ_PREFLUSH with the
>> >> > data write request.
>> >> >
>> >> > Instead we are performing flush from the parent bio to maintain the
>> >> > correct order. Also, returning from function 'pmem_make_request' if
>> >> > REQ_PREFLUSH returns an error.
>> >> >
>> >> > Reported-by: Jeff Moyer
>> >> > Signed-off-by: Pankaj Gupta
>> >>
>> >> There's a slight change in behavior for the error path in the
>> >> virtio_pmem driver.  Previously, all errors from virtio_pmem_flush were
>> >> converted to -EIO.  Now, they are reported as-is.  I think this is
>> >> actually an improvement.
>> >>
>> >> I'll also note that the current behavior can result in data corruption,
>> >> so this should be tagged for stable.
>> >
>> > I added that and was about to push this out, but what about the fact
>> > that now the guest will synchronously wait for flushing to occur. The
>> > goal of the child bio was to allow that to be an I/O wait with
>> > overlapping I/O, or at least not blocking the submission thread. Does
>> > the block layer synchronously wait for PREFLUSH requests?
>>
>> You *have* to wait for the preflush to complete before issuing the data
>> write.  See the "Explicit cache flushes" section in
>> Documentation/block/writeback_cache_control.rst.
>
> I'm not debating the ordering, or that the current implementation is
> obviously broken. I'm questioning whether the bio tagged with PREFLUSH
> is a barrier for future I/Os. My reading is that it is only a gate for
> past writes, and it can be queued. I.e. along the lines of
> md_flush_request().

Sorry, I misunderstood your question.  For a write bio with REQ_PREFLUSH
set, the PREFLUSH has to be done before the data attached to the bio is
written.  That preflush is not an I/O barrier.  In other words, for
unrelated I/O (any other bio in the system), it does not impart any
specific ordering requirements.  Upper layers are expected to wait for
any related I/O completions before issuing a flush request.

So yes, you can queue the bio to a worker thread and return to the
caller.  In fact, this is what I had originally suggested to Pankaj.

Cheers,
Jeff