References: <788878CE-2578-4991-A5A6-669DCABAC2F2@amazon.com>
 <20201017033606.GA14014@1wt.eu>
 <6CC3DB03-27BA-4F5E-8ADA-BE605D83A85C@amazon.com>
In-Reply-To: <6CC3DB03-27BA-4F5E-8ADA-BE605D83A85C@amazon.com>
From: Jann Horn
Date: Sat, 17 Oct 2020 07:01:31 +0200
Subject: Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver
To: Colm MacCarthaigh
Cc: Willy Tarreau, "Catangiu, Adrian Costin", Andy Lutomirski,
 Jason Donenfeld, "Theodore Y. Ts'o", Eric Biggers,
 "open list:DOCUMENTATION", kernel list,
 "open list:VIRTIO GPU DRIVER", "Graf (AWS), Alexander",
 "Woodhouse, David", bonzini@gnu.org, "Singh, Balbir", "Weiss, Radu",
 oridgar@gmail.com, ghammer@redhat.com, Jonathan Corbet,
 Greg Kroah-Hartman, "Michael S. Tsirkin", Qemu Developers, KVM list,
 Michal Hocko, "Rafael J. Wysocki", Pavel Machek, Linux API
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Oct 17, 2020 at 6:34 AM Colm MacCarthaigh wrote:
> On 16 Oct 2020, at 21:02, Jann Horn wrote:
> > On Sat, Oct 17, 2020 at 5:36 AM Willy Tarreau wrote:
> > But in userspace, we just need a simple counter.
> > There's no need for us to worry about anything else, like timestamps
> > or whatever. If we repeatedly fork a paused VM, the forked VMs will
> > see the same counter value, but that's totally fine, because the only
> > thing that matters to userspace is that the counter changes when the
> > VM is forked.
>
> For user-space, even a single bit would do. We added MADV_WIPEONFORK
> so that userspace libraries can detect fork()/clone() robustly, for the
> same reasons. It just wipes a page as the indicator, which is
> effectively a single-bit signal, and it works well. On the user-space
> side of this, I’m keen to find a solution like that that we can use
> fairly easily inside of portable libraries and applications. The “have
> I forked” checks do end up in hot paths, so it’s nice if they can be
> CPU cache friendly. Comparing a whole 128-bit value wouldn’t be my
> favorite.

I'm pretty sure a single bit is not enough if you want to have a
single page, shared across the entire system, that stores the VM
forking state; you need a counter for that.

> > And actually, since the value is a cryptographically random 128-bit
> > value, I think that we should definitely use it to help reseed the
> > kernel's RNG, and keep it secret from userspace. That way, even if the
> > VM image is public, we can ensure that going forward, the kernel RNG
> > will return securely random data.
>
> If the image is public, you need some extra new raw entropy from
> somewhere. The gen-id could be mixed in, that can’t do any harm as
> long as rigorous cryptographic mixing with the prior state is used, but
> if that’s all you do then the final state is still deterministic and
> non-secret.

Microsoft's documentation
(http://go.microsoft.com/fwlink/?LinkId=260709) says that the VM
Generation ID that we get after a fork "is a 128-bit,
cryptographically random integer value".
If multiple people use the same image, it guarantees that each use of
the image gets its own, fresh ID: The table in section "How to
implement virtual machine generation ID support in a virtualization
platform" says that (among other things) "Virtual machine is imported,
copied, or cloned" generates a new generation ID.

So the RNG state after mixing in the new VM Generation ID would
contain 128 bits of secret entropy not known to anyone else, including
people with access to the VM image.

Now, 128 bits of cryptographically random data aren't _optimal_; I
think something on the order of 256 bits would be nicer from a
theoretical standpoint. But in practice I think we'll be good with the
128 bits we're getting (since the number of users who fork a VM image
is probably not going to be so large that worst-case collision
probabilities matter).

> The kernel would need to use the change as a trigger to
> measure some entropy (e.g. interrupts and RDRAND, or whatever). Or just
> define the machine contract as “this has to be unique random data and
> if it’s not unique, or if it’s public, you’re toast”.

As far as I can tell from Microsoft's spec, that is a guarantee we're
already getting.
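
To illustrate the earlier point about a counter versus a single bit, here
is a minimal userspace sketch. It is not the proposed driver's interface:
vm_gen_counter, struct gen_watch and simulate_vm_fork() are hypothetical
names, and a plain static variable stands in for the read-only page that
would be shared across the whole system. Each consumer keeps its own copy
of the last value it saw, so the hot-path check is one 32-bit load and
compare, and one consumer noticing a fork does not hide it from the others
(which is what would happen if consumers had to clear a shared bit).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter in a read-only, system-wide page. */
static volatile uint32_t vm_gen_counter;

/* Per-consumer state: the counter value this consumer last acted on. */
struct gen_watch {
	uint32_t last_seen;
};

static void gen_watch_init(struct gen_watch *w)
{
	assert(w);
	w->last_seen = vm_gen_counter;
}

/*
 * Hot-path check: returns true if the counter has changed since this
 * consumer last looked, and records the new value. Several forks
 * between two checks are reported as a single change.
 */
static bool gen_watch_changed(struct gen_watch *w)
{
	uint32_t now = vm_gen_counter;

	if (now == w->last_seen)
		return false;
	w->last_seen = now;
	return true;
}

/* Simulates what the kernel side would do when the VM is forked. */
static void simulate_vm_fork(void)
{
	vm_gen_counter++;
}
```

A library would call gen_watch_init() once and gen_watch_changed() before
using any state that must not be duplicated across VM forks, such as a
userspace RNG's buffered output.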