Date: Sat, 6 May 2023 21:07:40 -0700
From: Luis Chamberlain
To: "Darrick J. Wong"
Cc: hch@infradead.org, song@kernel.org, rafael@kernel.org,
 gregkh@linuxfoundation.org, viro@zeniv.linux.org.uk, jack@suse.cz,
 bvanassche@acm.org, ebiederm@xmission.com, mchehab@kernel.org,
 keescook@chromium.org, p.raghav@samsung.com,
 linux-fsdevel@vger.kernel.org, kernel@tuxforce.de,
 kexec@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC v3 05/24] fs: add automatic kernel fs freeze / thaw and
 remove kthread freezing
References: <20230114003409.1168311-1-mcgrof@kernel.org>
 <20230114003409.1168311-6-mcgrof@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 23, 2023 at 07:08:37PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 13, 2023 at 04:33:50PM -0800, Luis Chamberlain wrote:
> > Add support to automatically handle freezing and thawing filesystems
> > during the kernel's suspend/resume cycle.
> >
> > This is needed so that we properly really stop IO in flight without
> > races after userspace has been frozen. Without this we rely on
> > kthread freezing and its semantics are loose and error prone.
> > For instance, even though a kthread may use try_to_freeze() and end
> > up being frozen we have no way of being sure that everything that
> > has been spawned asynchronously from it (such as timers) have also
> > been stopped as well.
> >
> > A long term advantage of also adding filesystem freeze / thawing
> > supporting during suspend / hibernation is that long term we may
> > be able to eventually drop the kernel's thread freezing completely
> > as it was originally added to stop disk IO in flight as we hibernate
> > or suspend.
>
> Hooray!
>
> One evil question though --
>
> Say you have dm devices A and B. Each has a distinct fs on it.
> If you mount A and then B and initiate a suspend, that should result in
> first B and then A freezing, right?
>
> After resuming, you then change A's dm-table definition to point it
> at a loop device backed by a file on B.
>
> What happens now when you initiate a suspend? B freezes, then A tries
> to flush data to the loop-mounted file on B, but it's too late for that.
> That sounds like a deadlock?
>
> Though I don't know how much we care about this corner case,

As you suggest, this is not the only corner case that one could draw
upon. There was that evil ioctl added years ago to allow flipping an
installed system booted from a USB or ISO over to the real, freshly
installed root mount point.

To make this bullet-proof we'll eventually need to add a simple graph
implementation to keep tabs on ordering requirements for the super
blocks. I have some C code which tries to implement a graph Linux style,
but since these are all corner cases at this time, I think it's best we
first fix suspend for most users and later address a proper graph
solution.

> Anyway, just wondering if you'd thought about that kind of doomsday
> scenario that a nutty sysadmin could set up.
>
> The only way I can think of to solve that kind of thing would be to hook
> filesystems and loop devices into the device model, make fs "device"
> suspend actually freeze, hope the suspend code suspends from the leaves
> inward, and hope I actually understand how the device model works (I
> don't.)

There are probably really odd things one can do, and one thing I think
we can do later is simply annotate those cases and, over time, *not*
allow auto-freeze for those horrible situations. A real long-term
solution, I think, will involve a graph.

  Luis
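
For illustration only, and not code from this thread or from the kernel: a
minimal Python sketch of the ordering problem in the A/B scenario quoted
above and of what a graph-based ordering might look like. The names
`reverse_mount_order`, `freeze_order`, and the `depends_on` mapping are
hypothetical. The sketch assumes each filesystem can name the filesystems
its backing storage lives on, and it uses a plain topological sort; the C
graph code mentioned above is not shown in the thread and may work
differently.

```python
from graphlib import CycleError, TopologicalSorter


def reverse_mount_order(mounts):
    """Naive ordering: freeze in the reverse of mount order."""
    return list(reversed(mounts))


def freeze_order(depends_on):
    """Return filesystems in an order that respects storage dependencies.

    depends_on maps each filesystem name to the set of filesystems its
    backing storage lives on (hypothetical bookkeeping, assumed to exist).
    Raises CycleError if the dependencies form a loop.
    """
    # static_order() yields dependencies before their dependents, which is
    # a safe thaw order. Freezing is the reverse: dependents go first so
    # they can still flush to the filesystems underneath them.
    thaw = list(TopologicalSorter(depends_on).static_order())
    return thaw[::-1]


# A was mounted first, then B, and A was later re-pointed at a loop device
# backed by a file on B, so A now depends on B.
mounts = ["A", "B"]
depends_on = {"A": {"B"}, "B": set()}

print(reverse_mount_order(mounts))  # B is frozen before A can flush to it
print(freeze_order(depends_on))     # A is frozen first, then B
```

A dependency loop (A on B and B on A) makes `freeze_order` raise
`CycleError`; those would be the cases to annotate and exclude from
auto-freeze rather than attempt to order.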