Subject: Re: [Qemu-devel] [RFC] qemu: Add virtio pmem device
To: Pankaj Gupta
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, qemu-devel@nongnu.org,
 linux-nvdimm@ml01.01.org, kwolf@redhat.com, haozhong zhang, jack@suse.cz,
 xiaoguangrong eric, riel@surriel.com, niteshnarayanlal@hotmail.com,
 mst@redhat.com, ross zwisler, hch@infradead.org, stefanha@redhat.com,
 imammedo@redhat.com, marcel@redhat.com, pbonzini@redhat.com,
 dan j williams, nilal@redhat.com
References: <20180405104834.10457-1-pagupta@redhat.com> <20180405104834.10457-4-pagupta@redhat.com>
 <416823501.16310251.1522930166070.JavaMail.zimbra@redhat.com>
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <782ac7e2-b8d9-139d-6182-cb4e2d082458@redhat.com>
Date: Thu, 5 Apr 2018 14:19:25 +0200
In-Reply-To: <416823501.16310251.1522930166070.JavaMail.zimbra@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

>>
>> So right now you're just using some memdev for testing.
>
> yes.
>
>>
>> I assume that the memory region we will provide to the guest will be a
>> simple memory mapped raw file. Dirty tracking (using the kvm slot) will
>> be used to detect which blocks actually changed and have to be flushed
>> to disk.
>
> Not really, we will perform fsync on raw file. As this file is created
> on regular storage and not nvdimm, so host page cache radix tree would have
> the dirty pages information which will be used for fsync.

Ah right. That makes things a lot easier!

>
>>
>> Will this raw file already have the "disk information header" (no idea
>> how that stuff is called) encoded? Are there any plans/possible ways to
>>
>> a) automatically create the headers? (if that's even possible)
>
> Its raw. Right now we are just supporting raw format.
>
> As this is direct mapping of memory into guest address space, I don't
> think we can have an abstraction of headers for block specific features.
> Or may be we can get opinion of others(Qemu block people) it is at all possible?
>
>> b) support anything but raw files?
>>
>> Please note that under x86, a KVM memory slot still has a (in my
>> opinion) fairly big overhead depending on the size of the slot (rmap,
>> page_track). We might have to optimize that.
>
> I have not tried/observed this. Right now I just used single memory slot and cold add
> few MB's of memory in Qemu. Can you please provide more details on this?
>

You can have a look at kvm_arch_create_memslot() in arch/x86/kvm/x86.c.

"npages" is used to allocate certain arrays (rmap for shadow page
tables). Also kvm_page_track_create_memslot() allocates data for page_track.

Having a big disk involves a lot of memory overhead due to the big kvm
memory slot. This is already the case for NVDIMMs as of now.

Other architectures (e.g. s390x) don't have this "problem". They don't
allocate any such data depending on the size of a memory slot.

This is certainly something to work on in the future.

--
Thanks,

David / dhildenb
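
To put a rough number on the memory-slot overhead described above, here is a
back-of-the-envelope sketch. It is not taken from the kernel sources; the
element sizes (8 bytes per rmap entry, 2 bytes per page_track counter), the
three page-size levels, and the helper name memslot_overhead_bytes() are
assumptions for illustration and may not match a given kernel version. It also
assumes a slot aligned to the largest page size and ignores smaller
per-large-page bookkeeping.

```python
# Rough estimate of per-memslot metadata on x86 KVM.
# ASSUMPTIONS (not stated in the mail, may differ between kernel versions):
#   - one 8-byte rmap entry per page at each of three page-size levels
#     (4 KiB, 2 MiB, 1 GiB)
#   - one 2-byte page_track counter per 4 KiB page
#   - the slot starts on a 1 GiB boundary

PAGE_SHIFT = 12             # 4 KiB base pages
LEVEL_SHIFTS = (0, 9, 18)   # extra shift for the 4 KiB, 2 MiB, 1 GiB levels
RMAP_ENTRY_BYTES = 8        # assumed size of one rmap entry
PAGE_TRACK_ENTRY_BYTES = 2  # assumed size of one page_track counter


def memslot_overhead_bytes(slot_bytes):
    """Return the estimated metadata size for a slot of slot_bytes bytes."""
    npages = slot_bytes >> PAGE_SHIFT
    # Ceiling division: number of (large) pages needed at each level.
    rmap = sum(-(-npages // (1 << shift)) * RMAP_ENTRY_BYTES
               for shift in LEVEL_SHIFTS)
    page_track = npages * PAGE_TRACK_ENTRY_BYTES
    return rmap + page_track


if __name__ == "__main__":
    for size_gib in (1, 64, 1024):
        overhead = memslot_overhead_bytes(size_gib << 30)
        print(f"{size_gib:5d} GiB slot -> {overhead / (1 << 20):8.1f} MiB metadata")
```

Under these assumptions the cost is dominated by the per-4-KiB-page arrays,
roughly 10 bytes per page, so a 1 TiB slot would need about 2.5 GiB of
metadata on the host.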