References: <788878CE-2578-4991-A5A6-669DCABAC2F2@amazon.com> <20201017033606.GA14014@1wt.eu>
In-Reply-To: <20201017033606.GA14014@1wt.eu>
From: Jann Horn
Date: Sat, 17 Oct 2020 06:02:36 +0200
Subject: Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver
To: Willy Tarreau
Cc: "Catangiu, Adrian Costin", Andy Lutomirski, Jason Donenfeld,
 "Theodore Y. Ts'o", Eric Biggers, "linux-doc@vger.kernel.org",
 "linux-kernel@vger.kernel.org",
 "virtualization@lists.linux-foundation.org", "Graf (AWS), Alexander",
 "MacCarthaigh, Colm", "Woodhouse, David", "bonzini@gnu.org",
 "Singh, Balbir", "Weiss, Radu", "oridgar@gmail.com",
 "ghammer@redhat.com", "corbet@lwn.net", "gregkh@linuxfoundation.org",
 "mst@redhat.com", "qemu-devel@nongnu.org", KVM list, Michal Hocko,
 "Rafael J. Wysocki", Pavel Machek, Linux API
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Oct 17, 2020 at 5:36 AM Willy Tarreau wrote:
> On Sat, Oct 17, 2020 at 03:40:08AM +0200, Jann Horn wrote:
> > [adding some more people who are interested in RNG stuff: Andy, Jason,
> > Theodore, Willy Tarreau, Eric Biggers.
> > also linux-api@, because this
> > concerns some pretty fundamental API stuff related to RNG usage]
> >
> > On Fri, Oct 16, 2020 at 4:33 PM Catangiu, Adrian Costin wrote:
> > > This patch is a driver which exposes the Virtual Machine Generation ID
> > > via a char-dev FS interface that provides ID update sync and async
> > > notification, retrieval and confirmation mechanisms:
> > >
> > > When the device is 'open()'ed a copy of the current vm UUID is
> > > associated with the file handle. 'read()' operations block until the
> > > associated UUID is no longer up to date - until HW vm gen id changes -
> > > at which point the new UUID is provided/returned. Nonblocking 'read()'
> > > uses EWOULDBLOCK to signal that there is no _new_ UUID available.
> > >
> > > 'poll()' is implemented to allow polling for UUID updates. Such
> > > updates result in 'EPOLLIN' events.
> > >
> > > Subsequent read()s following a UUID update no longer block, but return
> > > the updated UUID. The application needs to acknowledge the UUID update
> > > by confirming it through a 'write()'.
> > > Only on writing back to the driver the right/latest UUID, will the
> > > driver mark this "watcher" as up to date and remove EPOLLIN status.
> > >
> > > 'mmap()' support allows mapping a single read-only shared page which
> > > will always contain the latest UUID value at offset 0.
> >
> > It would be nicer if that page just contained an incrementing counter,
> > instead of a UUID. It's not like the application cares *what* the UUID
> > changed to, just that it *did* change and all RNGs state now needs to
> > be reseeded from the kernel, right? And an application can't reliably
> > read the entire UUID from the memory mapping anyway, because the VM
> > might be forked in the middle.
> >
> > So I think your kernel driver should detect UUID changes and then turn
> > those into a monotonically incrementing counter. (Probably 64 bits
> > wide?)
> > (That's probably also a little bit faster than comparing an
> > entire UUID.)
>
> I agree with this. Further, I'm observing there is a very common
> confusion between "universally unique" and "random". Randoms are
> needed when seeking unpredictability. A random number generator
> *must* be able to return the same value multiple times in a row
> (though this is rare), otherwise it's not random.
[...]
> If the UUIDs used there are real UUIDs, it could be as simple as
> updating them according to their format, i.e. updating the timestamp,
> and if the timestamp is already the same, just increase the seq counter.
> Doing this doesn't require entropy, doesn't need to block and doesn't
> needlessly leak randoms that sometimes make people feel nervous.

Those UUIDs are supplied by existing hypervisor code; in that regard,
this is almost like a driver for a hardware device. It is written
against a fixed API provided by the underlying machine. Making sure
that the sequence of UUIDs, as seen from inside the machine, never
changes back to a previous one is the responsibility of the hypervisor
and out of scope for this driver.

Microsoft's spec document (which is a .docx file for reasons I don't
understand) actually promises us that the ID is a cryptographically
random 128-bit integer value, which means that if you fork a VM 2^32
times, the probability that any two of those VMs see the same ID is
below 2^-64 (birthday bound: roughly n^2 / 2^129 for n forks). That
should be good enough.

But in userspace, we just need a simple counter. There's no need for
us to worry about anything else, like timestamps or whatever. If we
repeatedly fork a paused VM, the forked VMs will see the same counter
value, but that's totally fine, because the only thing that matters to
userspace is that the counter changes when the VM is forked.

And actually, since the value is a cryptographically random 128-bit
value, I think that we should definitely use it to help reseed the
kernel's RNG, and keep it secret from userspace.
That way, even if the VM image is public, we can ensure that going forward, the kernel RNG will return securely random data.
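
To make the counter idea a bit more concrete, here is a rough sketch in
plain userspace C of the two halves: something on the driver side that
turns UUID changes into a monotonically incrementing 64-bit counter, and
a userspace RNG wrapper that compares the counter against the value it
last saw. All names here (vmgen_tracker, user_rng, and so on) are made up
for illustration and are not taken from the patch; the "reseed" is only a
counter bump standing in for pulling fresh seed material from the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VMGEN_UUID_LEN 16

/* Hypothetical driver-side state. The last UUID seen from the hypervisor
 * stays private; only the counter would be exposed (e.g. in the shared
 * page). */
struct vmgen_tracker {
	uint8_t last_uuid[VMGEN_UUID_LEN];
	_Atomic uint64_t generation;
};

static void vmgen_tracker_init(struct vmgen_tracker *t,
			       const uint8_t uuid[VMGEN_UUID_LEN])
{
	memcpy(t->last_uuid, uuid, VMGEN_UUID_LEN);
	atomic_init(&t->generation, 0);
}

/* Called when the hypervisor reports a (possibly) new UUID. Returns true
 * if the UUID differed from the last one and the counter was bumped. The
 * counter never goes backwards, even if an old UUID value reappears. */
static bool vmgen_tracker_update(struct vmgen_tracker *t,
				 const uint8_t uuid[VMGEN_UUID_LEN])
{
	if (memcmp(t->last_uuid, uuid, VMGEN_UUID_LEN) == 0)
		return false;
	memcpy(t->last_uuid, uuid, VMGEN_UUID_LEN);
	atomic_fetch_add_explicit(&t->generation, 1, memory_order_release);
	return true;
}

/* Hypothetical userspace-side RNG wrapper. In a real setup, 'generation'
 * would point into a read-only mapping provided by the driver. */
struct user_rng {
	_Atomic uint64_t *generation;
	uint64_t seen;
	unsigned int reseeds;	/* stand-in for actual reseeding work */
};

static void user_rng_init(struct user_rng *r, _Atomic uint64_t *generation)
{
	r->generation = generation;
	r->seen = atomic_load_explicit(generation, memory_order_acquire);
	r->reseeds = 0;
}

/* To be called before handing out random data. Returns true if the
 * counter moved since the last check, i.e. a reseed was needed. */
static bool user_rng_check(struct user_rng *r)
{
	uint64_t now = atomic_load_explicit(r->generation,
					    memory_order_acquire);

	if (now == r->seen)
		return false;
	r->seen = now;
	r->reseeds++;	/* real code would fetch new seed, e.g. getrandom() */
	return true;
}
```

A single 64-bit load can be done atomically on the relevant platforms,
which sidesteps the torn-read problem with a 16-byte UUID in the mapping,
and userspace never gets to see the UUID itself.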