From: Alexander Duyck
Date: Mon, 29 Jul 2019 09:58:04 -0700
Subject: Re: [PATCH v2 QEMU] virtio-balloon: Provide a interface for "bubble hinting"
To: "Michael S. Tsirkin", wei.w.wang@intel.com
Cc: Nitesh Narayan Lal, Alexander Duyck, kvm list, David Hildenbrand,
    Dave Hansen, LKML, linux-mm, Andrew Morton, Yang Zhang,
    pagupta@redhat.com, Rik van Riel, Konrad Rzeszutek Wilk,
    lcapitulino@redhat.com, Andrea Arcangeli, Paolo Bonzini,
    dan.j.williams@intel.com
References: <20190724165158.6685.87228.stgit@localhost.localdomain>
    <20190724171050.7888.62199.stgit@localhost.localdomain>
    <20190724150224-mutt-send-email-mst@kernel.org>
    <6218af96d7d55935f2cf607d47680edc9b90816e.camel@linux.intel.com>
    <20190724164023-mutt-send-email-mst@kernel.org>
In-Reply-To: <20190724164023-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 24, 2019 at 1:42 PM Michael S. Tsirkin wrote:
>
> On Wed, Jul 24, 2019 at 04:29:27PM -0400, Nitesh Narayan Lal wrote:
> >
> > On 7/24/19 4:18 PM, Alexander Duyck wrote:
> > > On Wed, 2019-07-24 at 15:02 -0400, Michael S. Tsirkin wrote:
> > >> On Wed, Jul 24, 2019 at 10:12:10AM -0700, Alexander Duyck wrote:
> > >>> From: Alexander Duyck
> > >>>
> > >>> Add support for what I am referring to as "bubble hinting".
> > >>> Basically the idea is to function very similar to how the balloon works
> > >>> in that we basically end up madvising the page as not being used. However
> > >>> we don't really need to bother with any deflate type logic since the page
> > >>> will be faulted back into the guest when it is read or written to.
> > >>>
> > >>> This is meant to be a simplification of the existing balloon interface
> > >>> to use for providing hints to what memory needs to be freed. I am assuming
> > >>> this is safe to do as the deflate logic does not actually appear to do very
> > >>> much other than tracking what subpages have been released and which ones
> > >>> haven't.
> > >>>
> > >>> Signed-off-by: Alexander Duyck
> > >>> ---
> > >>>  hw/virtio/virtio-balloon.c                      |   40 +++++++++++++++++++++++
> > >>>  include/hw/virtio/virtio-balloon.h              |    2 +
> > >>>  include/standard-headers/linux/virtio_balloon.h |    1 +
> > >>>  3 files changed, 42 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > >>> index 2112874055fb..70c0004c0f88 100644
> > >>> --- a/hw/virtio/virtio-balloon.c
> > >>> +++ b/hw/virtio/virtio-balloon.c
> > >>> @@ -328,6 +328,39 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
> > >>>      balloon_stats_change_timer(s, 0);
> > >>>  }
> > >>>
> > >>> +static void virtio_bubble_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> > >>> +{
> > >>> +    VirtQueueElement *elem;
> > >>> +
> > >>> +    while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) {
> > >>> +        unsigned int i;
> > >>> +
> > >>> +        for (i = 0; i < elem->in_num; i++) {
> > >>> +            void *addr = elem->in_sg[i].iov_base;
> > >>> +            size_t size = elem->in_sg[i].iov_len;
> > >>> +            ram_addr_t ram_offset;
> > >>> +            size_t rb_page_size;
> > >>> +            RAMBlock *rb;
> > >>> +
> > >>> +            if (qemu_balloon_is_inhibited())
> > >>> +                continue;
> > >>> +
> > >>> +            rb = qemu_ram_block_from_host(addr, false, &ram_offset);
> > >>> +            rb_page_size =
> > >>> +                qemu_ram_pagesize(rb);
> > >>> +
> > >>> +            /* For now we will simply ignore unaligned memory regions */
> > >>> +            if ((ram_offset | size) & (rb_page_size - 1))
> > >>> +                continue;
> > >>> +
> > >>> +            ram_block_discard_range(rb, ram_offset, size);
> > >> I suspect this needs to do like the migration type of
> > >> hinting and get disabled if page poisoning is in effect.
> > >> Right?
> > > Shouldn't something like that end up getting handled via
> > > qemu_balloon_is_inhibited, or did I miss something there? I assumed cases
> > > like that would end up setting qemu_balloon_is_inhibited to true, if that
> > > isn't the case then I could add some additional conditions. I would do it
> > > in about the same spot as the qemu_balloon_is_inhibited check.
> > I don't think qemu_balloon_is_inhibited() will take care of the page poisoning
> > situations.
> > If I am not wrong we may have to look to extend VIRTIO_BALLOON_F_PAGE_POISON
> > support as per Michael's suggestion.
> >
> BTW upstream qemu seems to ignore VIRTIO_BALLOON_F_PAGE_POISON ATM.
> Which is probably a bug.
> Wei, could you take a look pls?

So I was looking at sorting this out for the unused page reporting that
I am working on, and it occurred to me that I don't think we can do the
free page hinting if any sort of poison validation is present. The problem
is that free page hinting simply stops the page from being migrated. As a
result, if there was stale data present it will just leave it there instead
of zeroing it or writing it to alternating 1s and 0s.

Also, it looks like the VIRTIO_BALLOON_F_PAGE_POISON feature is assuming
that 0 means that page poisoning is disabled, when in reality it might
just mean we are using the value zero to poison pages instead of the 0xaa
pattern. As such, I think there are several cases where we could incorrectly
flag the pages with the hint and result in the migrated guest reporting
pages that contain non-poison values.

The zero assumption works for unused page reporting since we will be
zeroing out the page when it is faulted back into the guest. However, the
same doesn't work for the free page hint, since it simply skips the
migration of the recently dirtied page.
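
To make the distinction above concrete, here is a minimal sketch of the two
checks as described in this mail. The struct and function names are
hypothetical and exist only for illustration; they are not QEMU or kernel
interfaces. The one design assumption is that "poisoning enabled" is tracked
separately from the poison value, so that a value of 0 is not mistaken for
"disabled".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the guest's page poisoning setup. "enabled" is
 * kept apart from "value" so a poison value of 0 is not read as "off".
 */
struct poison_cfg {
    bool enabled;   /* guest validates freed pages against a poison value */
    uint8_t value;  /* byte pattern used for poisoning, e.g. 0x00 or 0xaa */
};

/*
 * Free page hinting only skips migration of the page, so whatever stale
 * data was there stays there. Any poison validation can then trip.
 */
static bool free_page_hint_ok(const struct poison_cfg *cfg)
{
    return !cfg->enabled;
}

/*
 * Unused page reporting discards the page, and it comes back zeroed when
 * faulted in again. That only matches the guest's expectation if
 * poisoning is off or the poison value itself is zero.
 */
static bool unused_page_report_ok(const struct poison_cfg *cfg)
{
    return !cfg->enabled || cfg->value == 0;
}
```

With this model, a guest poisoning with zero would allow unused page
reporting but not free page hinting, and a guest poisoning with 0xaa would
allow neither.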