References: <20201017033606.GA14014@1wt.eu> <6CC3DB03-27BA-4F5E-8ADA-BE605D83A85C@amazon.com> <20201017053712.GA14105@1wt.eu> <20201017064442.GA14117@1wt.eu> <20201018114625-mutt-send-email-mst@kernel.org>
In-Reply-To: <20201018114625-mutt-send-email-mst@kernel.org>
From: Andy Lutomirski
Date: Sun, 18 Oct 2020 08:54:36 -0700
Subject: Re: [PATCH] drivers/virt: vmgenid: add vm generation id driver
To: "Michael S. Tsirkin"
Cc: "Jason A. Donenfeld", Jann Horn, Willy Tarreau, Colm MacCarthaigh,
    "Catangiu, Adrian Costin", Andy Lutomirski, "Theodore Y. Ts'o",
    Eric Biggers, "open list:DOCUMENTATION", kernel list,
    "open list:VIRTIO GPU DRIVER", "Graf (AWS), Alexander",
    "Woodhouse, David", bonzini@gnu.org, "Singh, Balbir", "Weiss, Radu",
    oridgar@gmail.com, ghammer@redhat.com, Jonathan Corbet,
    Greg Kroah-Hartman, Qemu Developers, KVM list, Michal Hocko,
    "Rafael J. Wysocki", Pavel Machek, Linux API
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Oct 18, 2020 at 8:52 AM Michael S. Tsirkin wrote:
>
> On Sat, Oct 17, 2020 at 03:24:08PM +0200, Jason A. Donenfeld wrote:
> > 4c. The guest kernel maintains an array of physical addresses that are
> > MADV_WIPEONFORK. The hypervisor knows about this array and its
> > location through whatever protocol, and before resuming a
> > moved/snapshotted/duplicated VM, it takes the responsibility for
> > memzeroing this memory. The huge pro here would be that this
> > eliminates all races, and reduces complexity quite a bit, because the
> > hypervisor can perfectly synchronize its bringup (and SMP bringup)
> > with this, and it can even optimize things like on-disk memory
> > snapshots to simply not write out those pages to disk.
> >
> > A 4c-like approach seems like it'd be a lot of bang for the buck -- we
> > reuse the existing mechanism (MADV_WIPEONFORK), so there's no new
> > userspace API to deal with, and it'd be race free, and eliminate a lot
> > of kernel complexity.
>
> Clearly this has a chance to break applications, right?
> If there's an app that uses this as a non-system-calls way
> to find out whether there was a fork, it will break
> when wipe triggers without a fork ...
> For example, imagine:
>
> MADV_WIPEONFORK
> copy secret data to MADV_DONTFORK
> fork
>
>
> used to work, with this change it gets 0s instead of the secret data.
>
>
> I am also not sure it's wise to expose each guest process
> to the hypervisor like this. E.g. each process needs a
> guest physical address of its own then. This is a finite resource.
>
>
> The mmap interface proposed here is somewhat baroque, but it is
> certainly simple to implement ...

Wipe of fork/vmgenid/whatever could end up being much more problematic
than it naively appears -- it could be wiped in the middle of a read.
Either the API needs to handle this cleanly, or we need something more
aggressive like signal-on-fork.

--Andy
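
To make the "wiped in the middle of a read" concern concrete, here is a
minimal sketch of one way an API could handle it cleanly. It assumes a
hypothetical region that pairs the wipeable data with a generation counter
that is bumped on every wipe, and it assumes the wipe and the bump appear
atomic to the reader (as they would if both happened while the guest was
paused). None of the names below (wipe_region, simulate_wipe,
region_read_begin, region_read_retry, read_secret) come from the patch or
from this thread; they are illustrative only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SECRET_LEN 32

/* Hypothetical layout: wipeable data plus a counter bumped on each wipe. */
struct wipe_region {
    _Atomic unsigned int generation;
    unsigned char secret[SECRET_LEN];
};

/* Stand-in for whatever performs the wipe (fork, VM clone, ...).
 * Assumption: the reader never observes the zeroing without the bump. */
static void simulate_wipe(struct wipe_region *r)
{
    memset(r->secret, 0, SECRET_LEN);
    atomic_fetch_add(&r->generation, 1);
}

/* Sample the generation before touching the data. */
static unsigned int region_read_begin(struct wipe_region *r)
{
    return atomic_load(&r->generation);
}

/* True if a wipe happened since region_read_begin() returned 'start',
 * meaning anything copied in between may be partly zeroed. */
static bool region_read_retry(struct wipe_region *r, unsigned int start)
{
    return atomic_load(&r->generation) != start;
}

/* Copy the secret out. Returns false, and clears 'out', if a wipe was
 * detected during the copy; the caller must then regenerate the secret
 * instead of using a torn copy. */
static bool read_secret(struct wipe_region *r, unsigned char *out, size_t len)
{
    assert(len <= SECRET_LEN);

    unsigned int start = region_read_begin(r);
    memcpy(out, r->secret, len);
    if (region_read_retry(r, start)) {
        memset(out, 0, len);
        return false;
    }
    return true;
}
```

The begin/retry split mirrors the shape of a sequence-counter read: the
reader never blocks the wipe, but it only trusts a copy bracketed by two
equal generation values. A plain MADV_WIPEONFORK mapping offers no such
counter, so without something like this (or a signal) a reader cannot tell
a torn copy from a valid one.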