Date: Wed, 23 Feb 2022 12:43:35 +1100
From: Dave Chinner
To: Nitesh Shetty
Cc: javier@javigon.com, chaitanyak@nvidia.com, linux-block@vger.kernel.org,
    linux-scsi@vger.kernel.org, dm-devel@redhat.com,
    linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
    axboe@kernel.dk, msnitzer@redhat.com, bvanassche@acm.org,
    martin.petersen@oracle.com, hare@suse.de, kbusch@kernel.org, hch@lst.de,
    Frederick.Knight@netapp.com, osandov@fb.com,
    lsf-pc@lists.linux-foundation.org, djwong@kernel.org,
    josef@toxicpanda.com, clm@fb.com, dsterba@suse.com, tytso@mit.edu,
    jack@suse.com, joshi.k@samsung.com, arnav.dawn@samsung.com,
    nitheshshetty@gmail.com, Alasdair Kergon, Mike Snitzer, Sagi Grimberg,
    James Smart, Chaitanya Kulkarni, Alexander Viro,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 00/10] Add Copy offload support
Message-ID: <20220223014335.GH3061737@dread.disaster.area>
References: <20220214080002.18381-1-nj.shetty@samsung.com>
    <20220214220741.GB2872883@dread.disaster.area>
    <20220217130215.GB3781@test-zns>
In-Reply-To: <20220217130215.GB3781@test-zns>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 17, 2022 at 06:32:15PM +0530, Nitesh Shetty wrote:
> Tue, Feb 15, 2022 at 09:08:12AM +1100, Dave Chinner wrote:
> > On Mon, Feb 14, 2022 at 01:29:50PM +0530, Nitesh Shetty wrote:
> > > [LSF/MM/BFP TOPIC] Storage: Copy Offload[0].
> > The biggest missing piece - and arguably the single most useful
> > piece of this functionality for users - is hooking this up to the
> > copy_file_range() syscall so that user file copies can be offloaded
> > to the hardware efficiently.
> >
> > This seems like it would be relatively easy to do with an fs/iomap iter
> > loop that maps src + dst file ranges and issues block copy offload
> > commands on the extents.
> > We already do similar "read from source,
> > write to destination" operations in iomap, so it's not a huge
> > stretch to extend the iomap interfaces to provide a copy offload
> > mechanism using this infrastructure.
> >
> > Also, hooking this up to copy_file_range() will get you
> > immediate data integrity testing right down to the hardware via fsx
> > in fstests - it uses copy_file_range() as one of its operations and
> > it will find all the off-by-one failures in both the Linux IO stack
> > implementation and the hardware itself.
> >
> > And, in reality, I wouldn't trust a block copy offload mechanism
> > until it is integrated with filesystems, the page cache and has
> > solid end-to-end data integrity testing available to shake out all
> > the bugs that will inevitably exist in this stack....
>
> We had planned copy_file_range (CFR) in the next phase of the copy offload patch series.
> Thinking that we will get to CFR when everything else is robust.
> But if that is needed to make things robust, will start looking into that.

How do you make it robust when there is no locking/serialisation to
prevent overlapping concurrent IO while the copy-offload is in
progress? Or that you don't have overlapping concurrent
copy-offloads running at the same time?

You've basically created a block dev ioctl interface that looks
impossible to use safely. It doesn't appear to be coherent with the
blockdev page cache, nor does it appear to have any documented data
integrity semantics. e.g. how does this interact with the
guarantees that fsync_bdev() and/or sync_blockdev() are supposed to
provide?

IOWs, if you don't have either CFR or some other strictly bound
kernel user with well defined access, synchronisation and integrity
semantics, how can anyone actually robustly test that these ioctls
work correctly in all the situations in which they might be called?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
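
For anyone following the thread who hasn't used the syscall being
discussed, here is a rough userspace sketch of the kind of
copy_file_range() call that a test tool would issue. The helper name
copy_range_all() is made up for illustration and is not part of the
patch series or of fsx. Whether the kernel services the call with a
block copy offload is entirely up to the kernel and the filesystem;
nothing in this code selects or guarantees that.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): copy len bytes from src_fd at
 * src_off to dst_fd at dst_off using copy_file_range(2).
 * Returns 0 on success, -1 on error with errno set. Hitting end-of-file
 * on the source before len bytes are copied is reported as ENODATA.
 */
static int copy_range_all(int src_fd, off_t src_off,
			  int dst_fd, off_t dst_off, size_t len)
{
	off64_t in = src_off, out = dst_off;

	while (len > 0) {
		/* The kernel may copy fewer bytes than requested, so loop. */
		ssize_t n = copy_file_range(src_fd, &in, dst_fd, &out, len, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {
			/* Source ended before the whole range was copied. */
			errno = ENODATA;
			return -1;
		}
		len -= (size_t)n;
	}
	return 0;
}
```

A caller would open both files, call copy_range_all(src, 0, dst, 0, len),
and then read the destination back to compare it with the source. That
read-back comparison is what makes this path usable for end-to-end
integrity testing.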