Date: Wed, 11 Oct 2023 10:59:44 +1100
From: Dave Chinner
To: Sarthak Kukreti
Cc: dm-devel@redhat.com, linux-block@vger.kernel.org,
    linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Jens Axboe, Alasdair Kergon,
    Mike Snitzer, Christoph Hellwig, Brian Foster, Theodore Ts'o,
    Andreas Dilger, Bart Van Assche, "Darrick J. Wong"
Subject: Re: [PATCH v8 3/5] loop: Add support for provision requests
References: <20231007012817.3052558-1-sarthakkukreti@chromium.org>
    <20231007012817.3052558-4-sarthakkukreti@chromium.org>
X-Mailing-List: linux-ext4@vger.kernel.org

On Tue, Oct 10, 2023 at 03:43:10PM -0700, Sarthak Kukreti wrote:
> On Sun, Oct 8, 2023 at 4:37 PM Dave Chinner wrote:
> >
> > On Fri, Oct 06, 2023 at 06:28:15PM -0700, Sarthak Kukreti wrote:
> > > Add support for provision requests to loopback devices.
> > > Loop devices will configure provision support based on
> > > whether the underlying block device/file can support
> > > the provision request and upon receiving a provision bio,
> > > will map it to the backing device/storage. For loop devices
> > > over files, a REQ_OP_PROVISION request will translate to
> > > an fallocate mode 0 call on the backing file.
> > >
> > > Signed-off-by: Sarthak Kukreti
> > > Signed-off-by: Mike Snitzer
> >
> > Hmmmm.
> >
> > This doesn't actually implement the required semantics of
> > REQ_PROVISION. Yes, it passes the command to the filesystem
> > fallocate() implementation, but fallocate() at the filesystem level
> > does not have the same semantics as REQ_PROVISION.
> >
> > i.e. at the filesystem level, fallocate() only guarantees the next
> > write to the provisioned range will succeed without ENOSPC, it does
> > not guarantee *every* write to the range will succeed without
> > ENOSPC. If someone clones the loop file while it is in use (i.e.
> > snapshots it via cp --reflink) then all guarantees that the next
> > write to a provisioned LBA range will succeed without ENOSPC are
> > voided.
> >
> > So while this will work for basic testing that the filesystem is
> > issuing REQ_PROVISION based IO correctly, it can't actually be used
> > for hosting production filesystems that need full REQ_PROVISION
> > guarantees when the loop device backing file is independently
> > snapshotted via FICLONE....
> >
> > At minimum, this set of implementation constraints needs to be
> > documented somewhere...
> >
> Fair point. I wanted to have a separate fallocate() mode
> (FALLOC_FL_PROVISION) in the earlier series of the patchset so that we
> can distinguish between a provision request and a regular fallocate()
> call; I dropped it from the series after feedback that the default
> case should suffice. But this might be one of the cases where we need
> an explicit intent that we want to provision space.

ISTR that I commented that filesystems like XFS can't implement
REQ_PROVISION semantics for extents without on-disk format changes.
Hence that needs to happen before we expose a new API to
userspace....

> Given a separate FALLOC_FL_PROVISION mode in the scenario you
> mentioned, the filesystem could copy previously 'provisioned' blocks
> to new blocks (which implicitly provisions them) or reserve blocks for
> use (and passing through REQ_OP_PROVISION below). That also means that
> the filesystem should track 'provisioned' blocks and take appropriate
> actions to ensure the provisioning guarantees.

Yes, tracking provisioned ranges persistently and the reservations
they require needs on-disk filesystem format changes compared to just
preallocating space.

None of this functionality currently exists in any filesystem that
supports shared extents, and it's a fairly significant chunk of
development work to support it. Nobody has planned to do this sort
of complex surgery to XFS at this point in time. I doubt that anyone
on the btrfs side of things is really even following this discussion
because this is largely for block device thinp and snapshot support
and btrfs just doesn't care about that.

> For filesystems without copy-on-write semantics (eg. ext4),
> REQ_OP_PROVISION should still be equivalent to mode == 0.

Well, yes. This is the same situation as "for non-sparse block
devices, REQ_PROVISION can just be ignored." This is not an
interesting use case, nor a use case that the functionality or APIs
should be designed around.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
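
Illustrative sketch of the accounting argument in this thread. It is a
toy model in Python, not kernel code and not how any real filesystem is
implemented: ToyFS and its fallocate(), reflink() and write() methods
are hypothetical names made up for this example. It only assumes what
the thread states: a mode 0 fallocate() backs the range now, a reflink
shares extents without consuming space, and a write to a shared extent
has to allocate a new block.

```python
import errno


class ToyFS:
    """Toy block accounting for a filesystem with shared extents."""

    def __init__(self, total_blocks):
        self.free = total_blocks
        self.refs = {}    # extent id -> number of files sharing it
        self.files = {}   # file name -> {logical block: extent id}
        self._next = 0

    def _alloc(self):
        if self.free == 0:
            raise OSError(errno.ENOSPC, "No space left on device")
        self.free -= 1
        self._next += 1
        self.refs[self._next] = 1
        return self._next

    def fallocate(self, name, blocks):
        """Mode 0 style preallocation: back every requested block now."""
        fmap = self.files.setdefault(name, {})
        for b in blocks:
            if b not in fmap:
                fmap[b] = self._alloc()

    def reflink(self, src, dst):
        """Clone src to dst by sharing extents; uses no free blocks."""
        self.files[dst] = dict(self.files[src])
        for ext in self.files[dst].values():
            self.refs[ext] += 1

    def write(self, name, block):
        """Overwrite in place if the extent is private, else allocate."""
        fmap = self.files.setdefault(name, {})
        ext = fmap.get(block)
        if ext is not None and self.refs[ext] == 1:
            return "in-place"
        new = self._alloc()           # may raise ENOSPC
        if ext is not None:
            self.refs[ext] -= 1       # drop our share of the old extent
        fmap[block] = new
        return "cow" if ext is not None else "alloc"


def demo():
    fs = ToyFS(total_blocks=4)
    fs.fallocate("loop.img", range(4))    # whole device is now backed
    first = fs.write("loop.img", 0)       # covered by the preallocation
    fs.reflink("loop.img", "snap.img")    # snapshot shares all extents
    try:
        second = fs.write("loop.img", 0)  # shared extent: needs a block
    except OSError as e:
        second = errno.errorcode[e.errno]
    return first, second


if __name__ == "__main__":
    print(demo())
```

In this model the write before the clone lands in place, while the
same write after the clone needs a fresh block and fails once the
filesystem is full. Closing that gap would mean reserving blocks at
clone time for every provisioned range, which is the persistent
tracking discussed above.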