Date: Thu, 6 Oct 2016 07:21:58 +1100
From: Dave Chinner
To: Jan Kara
Cc: Mateusz Guzik, Pierre Morel, viro@zeniv.linux.org.uk,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    farman@linux.vnet.ibm.com, cornelia.huck@de.ibm.com, Jens Axboe,
    Josef Bacik
Subject: Re: [PATCH] fs/block_dev.c: return the right error in thaw_bdev()
Message-ID: <20161005202158.GA9806@dastard>
References: <1475571220-2522-1-git-send-email-pmorel@linux.vnet.ibm.com>
    <1475571220-2522-2-git-send-email-pmorel@linux.vnet.ibm.com>
    <20161004090615.GF17515@quack2.suse.cz>
    <20161005064740.yekp2jl3nler2won@mguzik>
    <20161005095903.GC7291@quack2.suse.cz>
In-Reply-To: <20161005095903.GC7291@quack2.suse.cz>

On Wed, Oct 05, 2016 at 11:59:03AM +0200, Jan Kara wrote:
> On Wed 05-10-16 08:47:42, Mateusz Guzik wrote:
> > On Tue, Oct 04, 2016 at 11:06:15AM +0200, Jan Kara wrote:
> > > On Tue 04-10-16 10:53:40, Pierre Morel wrote:
> > > > When triggering thaw-filesystems via magic sysrq, the system enters a
> > > > loop in do_thaw_one(), as thaw_bdev() still returns success if
> > > > bd_fsfreeze_count
> > > > == 0. To fix this, let thaw_bdev() always return
> > > > error (and simplify the code a bit at the same time).
> > > >
> > >
> > > The patch looks good.
> > >
> >
> > Now that I had a closer look, while the patch indeed gets rid of the
> > infinite loop, the functionality itself does not work properly.
> >
> > Note I'm not familiar with this code, so chances are I got details
> > wrong. I also don't know the original reasoning.
> >
> > The current state is that you can freeze by calling either freeze_super
> > or freeze_bdev. The latter bumps bd_fsfreeze_count and calls
> > freeze_super. freeze_super does NOT modify bd_fsfreeze_count.
> >
> > freeze_bdev is used by device mapper, xfs and e2fs.
>
> Where is freeze_bdev() used by e2fs?

typo - it's f2fs, and its usage is copied from XFS.

Both of those usages can be completely ignored for the purposes of
this argument, because they are done in the context of
{X,F2}FS_IOC_FSGOINGDOWN, i.e. ioctls to shut down a filesystem and
completely deny access to it. Freezing is just a means to an end -
it's just used to prevent any more IO from being issued by the
block device while we do the filesystem shutdown...

IOWs, these are never used in normal usage by users - they are a
diagnostic tool and, potentially, a get-out-of-gaol-free card that
can be played when something goes wrong with the underlying storage
and the filesystem hangs waiting for it.

Indeed, the code in both does:

	sb = freeze_bdev()
	if (sb && !IS_ERR(sb)) {
		do_shutdown()
		thaw_bdev(sb)
	}

Which is perfectly sane and will not cause any sort of unbalanced
freeze behaviour to occur unless the shutdown oopses, in which case
the user has bigger problems than a frozen device...

So in production systems, dm_suspend()->lock_fs()->freeze_bdev() is
the only path that will ever be exercised.
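
To make the counting explicit, here is a rough userspace model of
the balanced freeze/shutdown/thaw pattern. It is a sketch only, not
kernel code: struct model_bdev and the model_*() names are made up
for illustration, and -EINVAL is assumed as the error returned by a
thaw that has no matching freeze.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the per-device freeze state. */
struct model_bdev {
	int fsfreeze_count;	/* plays the role of bd_fsfreeze_count */
	int frozen;		/* 1 while the modelled filesystem is frozen */
};

/* Nested freezes only bump the counter; the first one freezes. */
int model_freeze(struct model_bdev *b)
{
	if (++b->fsfreeze_count > 1)
		return 0;
	b->frozen = 1;
	return 0;
}

/*
 * A thaw with no freeze outstanding is an error (assumed -EINVAL),
 * so a caller that loops until thaw fails will terminate.
 */
int model_thaw(struct model_bdev *b)
{
	if (b->fsfreeze_count == 0)
		return -EINVAL;
	if (--b->fsfreeze_count > 0)
		return 0;	/* still frozen by an outer freeze */
	b->frozen = 0;
	return 0;
}

/* The shutdown pattern: freeze, do the work, thaw. Always balanced. */
int model_goingdown(struct model_bdev *b)
{
	int err = model_freeze(b);

	if (err)
		return err;
	assert(b->frozen);	/* no new IO while the shutdown runs */
	/* the filesystem shutdown itself would happen here */
	return model_thaw(b);
}
```

In this model a model_goingdown() call leaves fsfreeze_count exactly
where it found it, so only freezes taken by some other path can
leave the device frozen afterwards.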

> > Now, looking at *_bdev functions:
> >
> > > struct super_block *freeze_bdev(struct block_device *bdev)
> > > {
> > > 	struct super_block *sb;
> > > 	int error = 0;
> > >
> > > 	mutex_lock(&bdev->bd_fsfreeze_mutex);
> > > 	if (++bdev->bd_fsfreeze_count > 1) {
> >
> > No limit is put in place so in principle this will eventually turn negative.
>
> Yeah, ok, send a fix...

FWIW, that will only overflow if someone tries to freeze their block
device more than *2 billion times* without a thaw. If that happens,
the user has bigger problems to worry about, like the days/weeks the
system couldn't write to the filesystem because it was frozen.... :/

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com