Date: Wed, 19 Aug 2020 12:43:49 +0200
From: Christoph Hellwig
To: Damien Le Moal
Cc: Javier Gonzalez, Christoph Hellwig, Kanchan Joshi,
    "kbusch@kernel.org", "axboe@kernel.dk", "sagi@grimberg.me",
    "linux-nvme@lists.infradead.org", "linux-kernel@vger.kernel.org",
    Johannes Thumshirn, Nitesh Shetty, SelvaKumar S
Subject: Re: [PATCH 2/2] nvme: add emulation for zone-append
Message-ID: <20200819104349.GA2697@lst.de>

On Wed, Aug 19, 2020 at 09:14:13AM +0000, Damien Le Moal wrote:
> While defining a zone append command for SCSI/ZBC is possible (using sense data
> for returning the written offset), there is no way to define zone append for
> SATA/ZAC without entirely breaking the ATA command model. This is why we went
> after an emulation implementation instead of trying to standardize native
> commands. That implementation does not have any performance impact over regular
> writes *and* zone write locking does not in general degrade HDD write
> performance (only a few corner cases suffer from it). Comparing things equally,
> the same could be said of NVMe drives that do not have zone append native
> support: performance will be essentially the same using regular writes and
> emulated zone append. But mq-deadline and zone write locking will significantly
> lower performance for emulated zone append compared to a native zone append
> support by the drive.

And to summarize the most important point - Zone Append doesn't exist
in ZAC/ZBC. For people that spent the last years trying to make zoned
storage work, the lack of such a primitive has been the major pain
point. That's why I came up with the Zone Append design in response to
a request for such an operation from another company that is now
heavily involved in both Linux development and hosting Linux VMs.

For ZAC and ZBC the best we can do is to emulate the approach in the
driver, but for NVMe we can do it natively. Until just before its
release, ZNS had Zone Append as mandatory, and it did so for a very
good reason. While making it optional allows OEMs to request drives
without it, I fundamentally think we should not support that in Linux,
and should request that vendors implement writes to zones the right
way.

And just as some OEMs can request certain TPs or optional features to
be implemented, so can Linux. Just to give an example from the zone
world - Linux requires uniform and power-of-two zone sizes, which in
ZAC and ZBC are not required.
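To illustrate why the uniform, power-of-two zone size requirement is convenient, here is a minimal user-space sketch, not kernel code. All helper names are hypothetical, and sizes are assumed to be in 512-byte sectors. With such a zone size, mapping a sector to its zone is a shift and finding the offset inside the zone is a mask, with no division needed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names are hypothetical; sizes and addresses are in 512-byte sectors. */

static bool is_power_of_two(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* log2 of a power-of-two zone size, computed once at setup time. */
static unsigned int zone_shift_of(uint64_t zone_sectors)
{
	unsigned int shift = 0;

	assert(is_power_of_two(zone_sectors));
	while ((UINT64_C(1) << shift) < zone_sectors)
		shift++;
	return shift;
}

/* Zone index of a sector: a shift, valid only for uniform power-of-two zones. */
static uint64_t zone_no(uint64_t sector, unsigned int zone_shift)
{
	return sector >> zone_shift;
}

/* Offset of a sector inside its zone: a mask instead of a modulo. */
static uint64_t zone_offset(uint64_t sector, uint64_t zone_sectors)
{
	return sector & (zone_sectors - 1);
}
```

For example, with 524288-sector (256 MiB) zones the shift is 19, and sector 3 * 524288 + 100 maps to zone 3 at offset 100. With a non-power-of-two or non-uniform zone size, both lookups would need a division or a table search instead.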
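As a rough model of the driver-side emulation discussed in this message, the sketch below shows the semantics only: writers to a zone are serialized, the data is placed at the current write pointer, and the caller is told where it landed. This is an assumption-laden user-space illustration, not the actual driver code. The struct, the function names, and the spinning atomic_flag standing in for zone write locking are all hypothetical, and no I/O is issued.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space model of one zone; units are sectors. */
struct zone_model {
	uint64_t start;    /* first sector of the zone */
	uint64_t capacity; /* writable sectors in the zone */
	uint64_t wp;       /* write pointer, as an offset from start */
	atomic_flag lock;  /* stands in for per-zone write locking */
};

static void zone_model_init(struct zone_model *z, uint64_t start,
			    uint64_t capacity)
{
	assert(capacity > 0);
	z->start = start;
	z->capacity = capacity;
	z->wp = 0;
	atomic_flag_clear(&z->lock);
}

/*
 * Emulated append: returns 0 and stores the sector the data was placed at,
 * or -1 if the zone has no room. A real driver would issue a regular write
 * at that sector while the zone is still locked; this model only does the
 * bookkeeping.
 */
static int zone_model_append(struct zone_model *z, uint64_t nr_sectors,
			     uint64_t *written)
{
	int ret = -1;

	while (atomic_flag_test_and_set(&z->lock))
		; /* spin until the zone is free */

	if (nr_sectors <= z->capacity - z->wp) {
		*written = z->start + z->wp;
		z->wp += nr_sectors;
		ret = 0;
	}

	atomic_flag_clear(&z->lock);
	return ret;
}
```

The lock is the point of contention: every writer to a zone waits its turn on the host. A native Zone Append lets the drive pick the location instead, so the host can keep several writes to the same zone in flight.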