Date: Sat, 17 Oct 2020 08:44:42 +0200
From: Willy Tarreau
To: Jann Horn
Cc: Colm MacCarthaigh, "Catangiu, Adrian Costin", Andy Lutomirski,
    Jason Donenfeld, "Theodore Y. Ts'o", Eric Biggers,
    "open list:DOCUMENTATION", kernel list,
    "open list:VIRTIO GPU DRIVER", "Graf (AWS), Alexander",
    "Woodhouse, David", bonzini@gnu.org, "Singh, Balbir",
    "Weiss, Radu", oridgar@gmail.com, ghammer@redhat.com,
    Jonathan Corbet, Greg Kroah-Hartman, "Michael S. Tsirkin",
    Qemu Developers, KVM list, Michal Hocko, "Rafael J. Wysocki",
    Pavel Machek, Linux API
Subject: Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver
Message-ID: <20201017064442.GA14117@1wt.eu>
References: <788878CE-2578-4991-A5A6-669DCABAC2F2@amazon.com>
    <20201017033606.GA14014@1wt.eu>
    <6CC3DB03-27BA-4F5E-8ADA-BE605D83A85C@amazon.com>
    <20201017053712.GA14105@1wt.eu>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Oct 17, 2020 at 07:52:48AM +0200, Jann Horn wrote:
> On Sat, Oct 17, 2020 at 7:37 AM Willy Tarreau wrote:
> > On Sat, Oct 17, 2020 at 07:01:31AM +0200, Jann Horn wrote:
> > > Microsoft's documentation
> > > (http://go.microsoft.com/fwlink/?LinkId=260709) says that the VM
> > > Generation ID that we get after a fork "is a 128-bit,
> > > cryptographically random integer value". If multiple people use the
> > > same image, it guarantees that each use of the image gets its own,
> > > fresh ID:
> >
> > No. It cannot be more unique than the source that feeds that cryptographic
> > transformation. All it guarantees is that the entropy source is protected
> > from being guessed based on the output. Applying cryptography on a simple
> > counter provides apparently random numbers that will be unique for a long
> > period for the same source, but as soon as you duplicate that code between
> > users and they start from the same counter they'll get the same IDs.
> >
> > This is why I think that using a counter is better if you really need something
> > unique. Randoms only reduce predictability which helps avoiding collisions.
>
> Microsoft's spec tells us that they're giving us cryptographically
> random numbers. Where they're getting those from is not our problem.
> (And if even the hypervisor is not able to collect enough entropy to
> securely generate random numbers, worrying about RNG reseeding in the
> guest would be kinda pointless, we'd be fairly screwed anyway.)

Sorry if I sound annoying, but it's a matter of terminology and needs.

Cryptographically random means safe for use with cryptography, in that
it is unguessable enough that you can use it for encryption keys that
nobody will be able to guess. It in no way guarantees uniqueness, just
like you don't really care if the symmetric crypto key of your VPN has
already been used once somewhere else as long as there's no way to know.
However, with the good enough distribution that a CSPRNG provides,
collisions within the *same* generator are bounded to a very low,
predictable rate which is generally considered acceptable for all use
cases.

Something random (cryptographically or not) *cannot* be unique by
definition, otherwise it's not random anymore, since each draw has an
influence on the remaining list of possible draws, which is contrary to
randomness. And conversely, something unique cannot be completely random
because if you know it's unique, you can already rule out all other
known values from the candidates, thus it's more predictable than
random.

With this in mind, picking randoms from the same RNG is often more than
sufficient to consider them highly likely to be unique over a long
period. But it's not a guarantee. And it's even less of one between two
RNGs (e.g. if uniqueness is required between multiple hypervisors in
case VMs are migrated or centrally managed, which I don't know).

If what is sought here is a strong guarantee of uniqueness, using a
counter as you first suggested is better. If what is sought is pure
randomness (in the sense that it's unpredictable, which I don't think
is needed here), then randoms are better. If both are required, just
concatenate a counter and a random. And if you need them to be
spatially unique, just include a node identifier.
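
To make the "counter + random + node identifier" idea concrete, here is
a rough sketch in Python. It is only an illustration: the 16/48/64-bit
field split, the name make_id() and the node_id parameter are
assumptions made up for this example, not anything taken from the
vmgenid patch or from Microsoft's spec, and the in-memory counter stands
in for one that would have to be kept by whoever is not being cloned.

```python
import itertools
import secrets

# Hypothetical monotonic counter. Kept in memory here for illustration
# only; to guarantee uniqueness it must live with the party that is not
# duplicated (e.g. the hypervisor), otherwise clones restart from the
# same value.
_counter = itertools.count(1)

NODE_BITS, COUNTER_BITS, RANDOM_BITS = 16, 48, 64  # assumed layout, 128 bits total


def make_id(node_id: int) -> int:
    """Return a 128-bit ID laid out as node | counter | random."""
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node_id must fit in 16 bits")
    count = next(_counter)
    if count >= (1 << COUNTER_BITS):
        raise OverflowError("counter exhausted")
    # The counter part gives uniqueness on one node, the node part gives
    # uniqueness across nodes, and the random part gives unpredictability.
    rand = secrets.randbits(RANDOM_BITS)
    return (node_id << (COUNTER_BITS + RANDOM_BITS)) | (count << RANDOM_BITS) | rand
```

Two IDs from the same node then differ in the counter field even if the
random parts happened to collide, and IDs from two nodes differ in the
node field.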

Now the initial needs in the forwarded message are not entirely clear to
me, but I wanted to clear up the apparent mismatch between the expressed
need for uniqueness and the proposed solutions based solely on
randomness.

Cheers,
Willy