Date: Tue, 31 May 2022 12:38:34 +0200
From: Jan Kara
To: Donald Buczek
Cc: linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org, dm-devel@redhat.com, it+linux@molgen.mpg.de, Linux Kernel Mailing List
Subject: Re: ext4_writepages: jbd2_start: 5120 pages, ino 11; err -5
Message-ID: <20220531103834.vhscyk3yzsocorco@quack3.lan>
In-Reply-To: <4e83fb26-4d4a-d482-640c-8104973b7ebf@molgen.mpg.de>

Late reply but maybe it is still useful :)

On Thu 14-04-22 17:19:49, Donald Buczek wrote:
> We have a cluster scheduler which provides each cluster job with a
> private scratch
> filesystem (TMPDIR). These are created when a job starts
> and removed when a job completes. The setup works by fallocate, losetup,
> mkfs.ext4, mkdir, mount, "losetup -d", rm and the teardown just does a
> umount and rmdir.
>
> This works but there is one nuisance: The systems usually have a lot of
> memory and some jobs write a lot of data to their scratch filesystems. So
> when a job finishes, there often is a lot to sync by umount which
> sometimes takes many minutes and wastes a lot of I/O bandwidth.
> Additionally, the reserved space can't be returned and reused until the
> umount is finished and the backing file is deleted.
>
> So I was looking for a way to avoid that but didn't find something
> straightforward. The workaround I've found so far is using a dm-device
> (linear target) between the filesystem and the loop device and then use
> this sequence for teardown:
>
> - ioctl EXT4_IOC_SHUTDOWN with EXT4_GOING_FLAGS_NOLOGFLUSH
> - dmsetup reload $dmname --table "0 $sectors zero"
> - dmsetup resume $dmname --noflush
> - umount $mountpoint
> - dmsetup remove --deferred $dmname
> - rmdir $mountpoint
>
> This seems to do what I want. The unnecessary flushing of the temporary
> data is redirected from the backing file into the zero target and it works
> really fast. There is one remaining problem though, which might be just a
> cosmetic one: Although ext4 is shut down to prevent it from writing, I
> sometimes get the error message from the subject in the logs:
>
> [2963044.462043] EXT4-fs (dm-1): mounted filesystem without journal. Opts: (null)
> [2963044.686994] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null)
> [2963044.728391] EXT4-fs (dm-2): mounted filesystem without journal. Opts: (null)
> [2963055.585198] EXT4-fs (dm-2): shut down requested (2)
> [2963064.821246] EXT4-fs (dm-2): mounted filesystem without journal. Opts: (null)
> [2963074.838259] EXT4-fs (dm-2): shut down requested (2)
> [2963095.979089] EXT4-fs (dm-0): shut down requested (2)
> [2963096.066376] EXT4-fs (dm-0): ext4_writepages: jbd2_start: 5120 pages, ino 11; err -5
> [2963108.636648] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null)
> [2963125.194740] EXT4-fs (dm-0): shut down requested (2)
> [2963166.708088] EXT4-fs (dm-1): shut down requested (2)
> [2963169.334437] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null)
> [2963227.515974] EXT4-fs (dm-0): shut down requested (2)
> [2966222.515143] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null)
> [2966222.523390] EXT4-fs (dm-1): mounted filesystem without journal. Opts: (null)
> [2966222.598071] EXT4-fs (dm-2): mounted filesystem without journal. Opts: (null)
>
> So I'd like to ask a few questions:
>
> - Is this error message expected or is it a bug?

Well, shutdown is not 100% tuned for clean teardown. It is mostly a testing
/ debugging aid.

> - Can it be ignored or is there a leak or something on that error path?

The error recovery path should be cleaning up everything. If not, that
would be a bug :)

> - Is there a better way to do what I want? Something I've overlooked?

Why not just rm -rf $mountpoint/*? That will remove all dirty data from
memory without writing it back. It will cost you more in terms of disk IOs
than the above dance with shutdown but unless you have many files, it
should be fast... And it is a much more standard path than shutdown :).

> - I consider to create a new dm target or add an option to an existing
> one, because I feel that "zero" underneath a filesystem asks for problems
> because a filesystem expects to read back the data that it wrote, and the
> "error" target would trigger lots of errors during the writeback
> attempts. What I really want is a target which silently discards writes
> and returns errors on reads. Any opinion about that?
> - But to use devicemapper to eat away the I/O is also just a workaround
> to the fact that we can't pass some flag to umount to say that we are
> okay to lose all data and leave the filesystem in a corrupted state if
> this was the last reference to it. Would this be a useful feature?

I think something like this might be useful if the "rm -rf" solution is too
slow. But it is a bit of a niche usecase ;).

								Honza
-- 
Jan Kara
SUSE Labs, CR
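
The "rm -rf" teardown suggested in the reply can be sketched as a small shell
function. This is only a sketch under assumptions, not something posted in the
thread: the name scratch_teardown and the DRY_RUN switch are hypothetical, and
it uses GNU find instead of the $mountpoint/* glob so that dotfiles are removed
as well.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): delete the scratch data first,
# then unmount. With DRY_RUN=1 the commands are printed, not executed.
scratch_teardown() {
    mountpoint=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then run=echo; else run=; fi

    # Remove everything below the mount point, dotfiles included, so the
    # dirty data of those files is dropped instead of written back at umount.
    # -xdev keeps find on this filesystem; -mindepth 1 spares the mount
    # point directory itself (GNU find options).
    $run find "$mountpoint" -xdev -mindepth 1 -delete || return 1
    $run umount "$mountpoint" || return 1
    $run rmdir "$mountpoint"
}
```

Running it with DRY_RUN=1 prints the three commands for review; without
DRY_RUN it executes them and needs the privileges required for umount.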