Date: Mon, 4 Dec 2023 10:30:14 +0800
From: Ming Lei
To: John Garry
Cc: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
	jejb@linux.ibm.com, martin.petersen@oracle.com, djwong@kernel.org,
	viro@zeniv.linux.org.uk, brauner@kernel.org, chandan.babu@oracle.com,
	dchinner@redhat.com, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, tytso@mit.edu,
	jbongio@google.com, linux-api@vger.kernel.org, ming.lei@redhat.com
Subject: Re: [PATCH 10/21] block: Add fops atomic write support
References: <20230929102726.2985188-1-john.g.garry@oracle.com>
	<20230929102726.2985188-11-john.g.garry@oracle.com>
In-Reply-To: <20230929102726.2985188-11-john.g.garry@oracle.com>

On Fri, Sep 29, 2023 at 10:27:15AM +0000, John Garry wrote:
> Add support for atomic writes, as follows:
> - Ensure that the IO follows all the atomic writes rules, like must be
>   naturally aligned
> - Set REQ_ATOMIC
>
> Signed-off-by: John Garry
> ---
>  block/fops.c | 42 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/block/fops.c b/block/fops.c
> index acff3d5d22d4..516669ad69e5 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -41,6 +41,29 @@ static bool blkdev_dio_unaligned(struct block_device *bdev, loff_t pos,
>  		!bdev_iter_is_aligned(bdev, iter);
>  }
>
> +static bool blkdev_atomic_write_valid(struct block_device *bdev, loff_t pos,
> +			      struct iov_iter *iter)
> +{
> +	unsigned int atomic_write_unit_min_bytes =
> +			queue_atomic_write_unit_min_bytes(bdev_get_queue(bdev));
> +	unsigned int atomic_write_unit_max_bytes =
> +			queue_atomic_write_unit_max_bytes(bdev_get_queue(bdev));
> +
> +	if (!atomic_write_unit_min_bytes)
> +		return false;

The above check should be moved to the limit-setting code path.

> +	if (pos % atomic_write_unit_min_bytes)
> +		return false;
> +	if (iov_iter_count(iter) % atomic_write_unit_min_bytes)
> +		return false;
> +	if (!is_power_of_2(iov_iter_count(iter)))
> +		return false;
> +	if (iov_iter_count(iter) > atomic_write_unit_max_bytes)
> +		return false;
> +	if (pos % iov_iter_count(iter))
> +		return false;

I am a bit confused about the relationship between
atomic_write_unit_max_bytes and atomic_write_max_bytes.

Here the max IO length is limited to be <= atomic_write_unit_max_bytes,
so it looks like userspace can only submit naturally aligned IO in
atomic write unit sizes (such as 4k, 8k, 16k, 32k, ...), but these user
IOs are allowed to be merged into a bigger one if natural alignment is
respected and the merged IO size is <= atomic_write_max_bytes.

Is my understanding right? If yes, I'd suggest documenting the point,
and the last two checks could be changed to:

	/* naturally aligned */
	if (pos % iov_iter_count(iter))
		return false;

	if (iov_iter_count(iter) > atomic_write_max_bytes)
		return false;

Thanks,
Ming
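
For reference, the rules enforced by the quoted blkdev_atomic_write_valid()
can be sketched in plain user-space C. This is only an illustration under
stated assumptions, not the kernel code: atomic_write_valid(), is_pow2() and
the plain integer parameters are hypothetical stand-ins for the queue limits
and for iov_iter_count(iter).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if n is a non-zero power of two. */
static bool is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical user-space mirror of the checks in the quoted patch.
 * pos/len are the write offset and length in bytes; unit_min/unit_max
 * stand in for atomic_write_unit_{min,max}_bytes.
 */
static bool atomic_write_valid(uint64_t pos, uint64_t len,
			       uint32_t unit_min, uint32_t unit_max)
{
	if (!unit_min)		/* no atomic write support advertised */
		return false;
	if (pos % unit_min)	/* offset must be a multiple of unit_min */
		return false;
	if (len % unit_min)	/* length must be a multiple of unit_min */
		return false;
	if (!is_pow2(len))	/* also rejects len == 0 before pos % len */
		return false;
	if (len > unit_max)
		return false;
	if (pos % len)		/* naturally aligned */
		return false;
	return true;
}
```

With unit_min = 4096 and unit_max = 65536, an 8k write at offset 8k passes
all checks, while an 8k write at offset 4k fails the natural alignment
check.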