Date: Thu, 26 Apr 2018 13:13:44 -0400 (EDT)
From: Pankaj Gupta
To: Dan Williams
Cc: Stefan Hajnoczi, Linux Kernel Mailing List, KVM list, Qemu Developers,
    linux-nvdimm, Linux MM, Jan Kara, Rik van Riel, Haozhong Zhang,
    Nitesh Narayan Lal, Kevin Wolf, Paolo Bonzini, Ross Zwisler,
    David Hildenbrand, Xiaoguangrong Eric, Christoph Hellwig,
    Marcel Apfelbaum, "Michael S. Tsirkin", niteshnarayanlal@hotmail.com,
    Igor Mammedov, lcapitulino@redhat.com
Message-ID: <1302242642.23016855.1524762824836.JavaMail.zimbra@redhat.com>
References: <20180425112415.12327-1-pagupta@redhat.com>
    <20180425112415.12327-3-pagupta@redhat.com>
    <20180426131517.GB30991@stefanha-x1.localdomain>
    <58645254.23011245.1524760853269.JavaMail.zimbra@redhat.com>
Subject: Re: [RFC v2 2/2] pmem: device flush over VIRTIO

> >> On Wed, Apr 25, 2018 at 04:54:14PM +0530, Pankaj Gupta wrote:
> >> > This patch adds functionality to perform a
> >> > flush from guest to host over VIRTIO
> >> > when the 'ND_REGION_VIRTIO' flag is set on the
> >> > nd_region. The flag is set by the 'virtio-pmem'
> >> > driver.
> >> >
> >> > Signed-off-by: Pankaj Gupta
> >> > ---
> >> >  drivers/nvdimm/region_devs.c | 7 +++++++
> >> >  1 file changed, 7 insertions(+)
> >> >
> >> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> >> > index a612be6..6c6454e 100644
> >> > --- a/drivers/nvdimm/region_devs.c
> >> > +++ b/drivers/nvdimm/region_devs.c
> >> > @@ -20,6 +20,7 @@
> >> >  #include
> >> >  #include "nd-core.h"
> >> >  #include "nd.h"
> >> > +#include
> >> >
> >> >  /*
> >> >   * For readq() and writeq() on 32-bit builds, the hi-lo, lo-hi order is
> >> > @@ -1074,6 +1075,12 @@ void nvdimm_flush(struct nd_region *nd_region)
> >> >  	struct nd_region_data *ndrd = dev_get_drvdata(&nd_region->dev);
> >> >  	int i, idx;
> >> >
> >> > +	/* call PV device flush */
> >> > +	if (test_bit(ND_REGION_VIRTIO, &nd_region->flags)) {
> >> > +		virtio_pmem_flush(&nd_region->dev);
> >> > +		return;
> >> > +	}
> >>
> >> How does libnvdimm know when the flush has completed?
> >>
> >> Callers expect the flush to be finished when nvdimm_flush() returns, but
> >> the virtio driver has only queued the request, it hasn't waited for
> >> completion!
> >
> > I tried to implement what nvdimm does right now. It just writes to the
> > flush hint address to make sure data persists.
>
> nvdimm_flush() is currently expected to be synchronous. Currently it
> is sfence(); write to special address; sfence(). By the time the
> second sfence returns the data is flushed. So you would need to make
> this virtio flush interface synchronous as well, but it appears
> problematic to stop the guest for unbounded amounts of time. Instead,
> you need to rework nvdimm_flush() and the pmem driver to make these
> flush requests asynchronous and add the plumbing for completion
> callbacks via bio_endio().

o.k.

> > I just did not want to block guest write requests till the host-side
> > fsync completes.
>
> You must complete the flush before bio_endio(), otherwise you're
> violating the expectations of the guest filesystem/block-layer.

sure!

> > be worse for operations on different guest files, because all these
> > operations would ultimately happen on the same file at the host.
> >
> > I think with the current approach we can achieve an asynchronous
> > queuing mechanism, at the cost of not being 100% sure when the fsync
> > would complete, but it is assured it will happen. Also, it's an
> > entire block flush.
>
> No, again, that's broken. We need to add the plumbing for
> communicating the fsync() completion relative to the WRITE_{FLUSH,FUA}
> bio in the guest.

Sure. Thanks Dan & Stefan for the explanation and review.

Best regards,
Pankaj