From: Amir Goldstein
Date: Mon, 13 Apr 2020 11:57:40 +0300
Subject: Re: Same mountpoint restriction in FICLONE ioctls
To: Keno Fischer
Cc: linux-fsdevel, Miklos Szeredi, linux-xfs, CIFS, Linux NFS Mailing List, Olga Kornievskaia, Steve French
X-Mailing-List: linux-nfs@vger.kernel.org

On Mon, Apr 13, 2020 at 1:28 AM Keno Fischer wrote:
>
> > You did not specify your use case.
>
> My use case is recording (https://rr-project.org/) executions

Cool! I should try that ;-)

> of containers (which often make heavy use of bind mounts on
> the same file system, thus me running into this restriction).
> In essence, at relevant read or mmap operations,
> rr needs to checkpoint the file that was opened,
> in case it later gets deleted or modified.
> It always tries to FICLONE the file first,
> before deciding heuristically whether to
> instead create a copy (if it decides there is a low
> likelihood the file will get changed - e.g. because
> it's a system file - it may decide to take the chance and
> not copy it at the risk of creating a broken recording).
> That's often a decent trade-off, but of course it's not
> 100% perfect.
>
> > The question is: do you *really* need cross mount clone?
> > Can you use copy_file_range() instead?
>
> Good question. copy_file_range doesn't quite work
> for that initial clone, because we do want it to fail if
> cloning doesn't work (so that we can apply the
> heuristics). However, you make a good point that
> the copy fallback should probably use copy_file_range.
> At least that way, if it does decide to copy, the
> performance will be better.
>
> It would still be nice for FICLONE to ease this restriction,
> since it reduces the chance of the heuristics getting
> it wrong and preventing the copy, even if such
> a copy would have been cheap.
>

You make it sound like the heuristic decision must be made
*after* trying to clone, but it can be made before, with flags
passed to the kernel saying whether or not to fall back to copy.

copy_file_range(2) has an unused flags argument.
Adding support for flags like:

COPY_FILE_RANGE_BY_FS
COPY_FILE_RANGE_BY_KERNEL

or any other names chosen after bike shedding, could be used
to control whether the user intends to use filesystem-internal
clone/copy methods and/or to fall back to kernel copy.

I think this functionality will be useful to many.

> > Across which filesystems mounts are you trying to clone?
>
> This functionality was written with btrfs in mind, so that's
> what I was testing with. The mounts themselves are just
> different bindmounts into the same filesystem.
>

I can also suggest a workaround for you.
If your only problem is bind mounts and if the recorder is a privileged
process (CAP_DAC_READ_SEARCH), then you can use a "master" bind mount
to perform all clone operations on.
Use name_to_handle_at(2) to get the sb file handle of the source file.
Use open_by_handle_at(2) to get an open file descriptor of the source file
under the "master" bind mount.

Thanks,
Amir.