References: <20190910151206.4671-1-mszeredi@redhat.com> <20190911155208.GA20527@stefanha-x1.localdomain>
In-Reply-To: <20190911155208.GA20527@stefanha-x1.localdomain>
From: Miklos Szeredi
Date: Thu, 12 Sep 2019 10:14:11 +0200
Subject: Re: [PATCH v5 0/4] virtio-fs: shared file system for virtual machines
To: Stefan Hajnoczi
Cc: Miklos Szeredi, virtualization@lists.linux-foundation.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    "Michael S. Tsirkin", Vivek Goyal, "Dr. David Alan Gilbert"

On Wed, Sep 11, 2019 at 5:54 PM Stefan Hajnoczi wrote:
>
> On Tue, Sep 10, 2019 at 05:12:02PM +0200, Miklos Szeredi wrote:
> > I've folded the series from Vivek and fixed a couple of TODO comments
> > myself. AFAICS two issues remain that need to be resolved in the short
> > term, one way or the other: freeze/restore and full virtqueue.
>
> I have researched freeze/restore and come to the conclusion that it
> needs to be a future feature. It will probably come together with live
> migration support for reasons mentioned below.
>
> Most virtio devices have fairly simple power management freeze/restore
> functions that shut down the device and bring it back to the state held
> in memory, respectively. virtio-fs, as well as virtio-9p and
> virtio-gpu, are different because they contain session state. It is not
> easily possible to bring back the state held in memory after the device
> has been reset.
>
> The following areas of the FUSE protocol are stateful and need special
> attention:
>
>  * FUSE_INIT - this is pretty easy, we must re-negotiate the same
>    settings as before.
>
>  * FUSE_LOOKUP -> fuse_inode (inode_map)
>
>    The session contains a set of inode numbers that have been looked up
>    using FUSE_LOOKUP. They are ephemeral in the current virtiofsd
>    implementation and vary across device reset. Therefore we are unable
>    to restore the same inode numbers upon restore.
>
>    The solution is persistent inode numbers in virtiofsd. This is also
>    needed to make open_by_handle_at(2) work and probably for live
>    migration.
>
>  * FUSE_OPEN -> fh (fd_map)
>
>    The session contains FUSE file handles for open files. There is
>    currently no way of re-opening a file so that a specific fh is
>    returned. A mechanism to do so probably isn't necessary if the
>    driver can update the fh to the new one produced by the device for
>    all open files instead.
>
>  * FUSE_OPENDIR -> fh (dirp_map)
>
>    Same story as for FUSE_OPEN but for open directories.
>
>  * FUSE_GETLK/SETLK/SETLKW -> (inode->posix_locks and fcntl(F_OFD_GET/SETLK))
>
>    The session contains file locks. The driver must reacquire them upon
>    restore. It's unclear what to do when locking fails.
>
> Live migration has the same problem since the FUSE session will be moved
> to a new virtio-fs device instance. It makes sense to tackle both
> features together. This is something that can be implemented in the
> next year, but it's not a quick fix.

Right. The question for now is: should the freeze silently succeed
(as it seems to do now) or should it fail instead?

I guess normally freezing should be okay, as long as the virtiofsd
remains connected while the system is frozen.

I tried to test this with "echo -n mem > /sys/power/state", which
indeed resulted in the virtio_fs_freeze() callback being called.
However, I couldn't find a way to wake up the system...

Thanks,
Miklos
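
For illustration only, the fh-update idea from the FUSE_OPEN item in the quoted list can be modelled with a toy Python sketch. This is not kernel or virtiofsd code: ToyDevice, ToyDriver and all their methods are hypothetical names invented here, and re-opening by path is a simplification that sidesteps the inode-number problem described under FUSE_LOOKUP.

```python
import itertools


class ToyDevice:
    """Hypothetical stand-in for a virtio-fs device; not virtiofsd code."""

    def __init__(self, first_fh):
        # File handles are ephemeral: each device instance numbers them
        # from a different starting point, as after a device reset.
        self._next_fh = itertools.count(first_fh)
        self.open_files = {}  # fh -> path

    def open(self, path):
        fh = next(self._next_fh)
        self.open_files[fh] = path
        return fh


class ToyDriver:
    """Hypothetical guest-side driver that remembers what it has open."""

    def __init__(self, device):
        self.device = device
        self.files = {}  # guest-side file id -> (path, current fh)

    def open(self, file_id, path):
        self.files[file_id] = (path, self.device.open(path))

    def restore(self, new_device):
        # After reset the old handles are meaningless, so re-open every
        # file and record whatever fh the new device instance returns.
        self.device = new_device
        for file_id, (path, _old_fh) in self.files.items():
            self.files[file_id] = (path, new_device.open(path))


driver = ToyDriver(ToyDevice(first_fh=1))
driver.open("a", "/etc/hostname")
driver.open("b", "/etc/hosts")
before = dict(driver.files)

# Simulate freeze/restore: the driver is handed a fresh device instance.
driver.restore(ToyDevice(first_fh=100))
after = dict(driver.files)
```

In this model the guest-side ids ("a", "b") stay stable across the restore while the fh values change, which is the property the quoted text suggests makes a "re-open with a specific fh" mechanism unnecessary. File locks are left out; the sketch says nothing about what to do when re-acquiring them fails.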