Date: Fri, 23 Sep 2022 01:51:28
 -0700
From: Christoph Hellwig
To: Daniil Lunev
Cc: Christoph Hellwig, Sarthak Kukreti, Stefan Hajnoczi,
 dm-devel@redhat.com, linux-block@vger.kernel.org,
 linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
 virtualization@lists.linux-foundation.org, Jens Axboe,
 "Michael S . Tsirkin", Jason Wang, Paolo Bonzini, Alasdair Kergon,
 Mike Snitzer, Theodore Ts'o, Andreas Dilger, Bart Van Assche,
 Evan Green, Gwendal Grignou
Subject: Re: [PATCH RFC 0/8] Introduce provisioning primitives for thinly
 provisioned storage
References: <20220915164826.1396245-1-sarthakkukreti@google.com>
X-Mailing-List: linux-ext4@vger.kernel.org

On Wed, Sep 21, 2022 at 07:48:50AM +1000, Daniil Lunev wrote:
> > There is no such thing as WRITE UNAVAILABLE in NVMe.
> Apologies, that is WRITE UNCORRECTABLE. Chapter 3.2.7 of
> NVM Express NVM Command Set Specification 1.0b

Write Uncorrectable is a very different thing, and the equivalent of the
horribly misnamed SCSI WRITE LONG command. It injects an unrecoverable
error, and does not provision anything.

> * Each application is potentially allowed to consume the entirety
>   of the disk space - there is no strict size limit for an application
> * Applications need to pre-allocate space sometimes, for which
>   they use fallocate. Once the operation has succeeded, the application
>   assumes the space is guaranteed to be there for it.
> * Since filesystems on the volumes are independent, filesystem
>   level enforcement of size constraints is impossible and the only
>   common level is the thin pool, thus, each fallocate has to find its
>   representation in the thin pool one way or another - otherwise you
>   may end up in a situation where the FS thinks it has allocated space
>   but when it tries to actually write to it, the thin pool is already
>   exhausted.
> * Hole-Punching fallocate will not reach the thin pool, so the only
>   solution presently is zero-writing pre-allocate.

To me it sounds like you want a non-thin pool in dm-thin and/or
guaranteed space reservations for it.

> * Thus, a provisioning block operation allows an interface specific
>   operation that guarantees the presence of the block in the
>   mapped space. LVM Thin-pool itself is the primary target for our
>   use case but the argument is that this operation maps well to
>   other interfaces which allow thinly provisioned units.

I think where you are trying to go here is badly mistaken. With flash
(or hard drive SMR) there is no such thing as provisioning LBAs. Every
write is out of place, and a one time space allocation does not help
you at all. So fundamentally what you try to do here just goes against
the actual physics of modern storage media. While there are some layers
that keep up a pretence, trying to do that at an exposed API level is a
really bad idea.
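
For readers following the thread, the "zero-writing pre-allocate" workaround mentioned in the quoted list can be sketched roughly as below. This is only an illustration in Python, not code from the patch series: the function names and the 1 MiB chunk size are made up for the example. The idea is that a plain fallocate may only reserve space at the filesystem level, whereas explicitly writing zeros issues real writes that the layers underneath have to map. Note that this overwrites anything already stored in the range, so it is only suitable for fresh regions, and per the reply above it still guarantees nothing on media where every write goes out of place.

```python
import os

CHUNK = 1 << 20  # hypothetical chunk size: write zeros 1 MiB at a time


def preallocate_by_zero_writing(fd: int, offset: int, length: int) -> None:
    """Write zeros over [offset, offset + length) and flush them.

    Destroys any existing data in the range; intended for new regions only.
    """
    zeros = bytes(CHUNK)
    end = offset + length
    pos = offset
    while pos < end:
        n = min(CHUNK, end - pos)
        # pwrite may write fewer bytes than requested, so advance by its result
        pos += os.pwrite(fd, zeros[:n], pos)
    os.fsync(fd)  # push the writes down the stack instead of leaving them cached


def preallocate(fd: int, offset: int, length: int,
                force_zero_write: bool = False) -> str:
    """Try filesystem-level preallocation first, else fall back to zero writes.

    Returns which method was used, purely for illustration.
    """
    if not force_zero_write:
        try:
            os.posix_fallocate(fd, offset, length)
            return "fallocate"
        except OSError:
            pass  # e.g. the filesystem refuses; fall through to zero-writing
    preallocate_by_zero_writing(fd, offset, length)
    return "zero-write"
```

A caller that cares about the thin pool rather than just the filesystem would pass force_zero_write=True, which is exactly the cost the proposed provisioning operation is meant to avoid.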