Received: by 2002:a05:6358:9144:b0:117:f937:c515 with SMTP id r4csp875071rwr; Thu, 20 Apr 2023 07:21:52 -0700 (PDT) X-Google-Smtp-Source: AKy350Y4qfpDx9hLLJUZhP18fC1xMkXc/1klhprtRp2SJERu3K4jO9YbTqJ4MihdZFdSTfFOMDaN X-Received: by 2002:a05:6a00:182a:b0:63d:2648:f93e with SMTP id y42-20020a056a00182a00b0063d2648f93emr1944249pfa.20.1682000512617; Thu, 20 Apr 2023 07:21:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1682000512; cv=none; d=google.com; s=arc-20160816; b=ZvXL4dT7WkovH0yMaOpXYi4t0Dk6sLi498qCoUiG5FPK3ImjEIVEJgcOpg2mS07KE9 anKS8zfaMdfQiGFagV9Z9lTfe3G4rFll5OnyFdik7axZedrCpGmY6nIIvzf3IG3TRjgq h7KNdsU0jvr9m1GbmUDFPL82Nh/d4EcU8CJtI42u9BW/Y3aQ3BtqqN+nqb9/UpaCieXc Qpzfkl0LQKVYhSKGEYhcMRHnlQ8v1uI/8dZySwkRozQzhADA+3pevrhV3+1EUN/T1bGR UNp20cIkwhCu0xfqnGhNEc5W++7pqjm2jJJoEjW0FqgapfPV9pqzM7xVFyXM4SfxlOWK 0w2g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=5bvHlG0/JTB7HrnkZNbj8ncjZGNIrjux5UiB5tAycP8=; b=yUaitSmKc1w+nXvxXu0JxLANeho3SYXAiyN/CayzQPtvZyEuAaY/fT2F2IqnUPyJV3 02ZUwWC8zmTM1MvpzG8ntWzi+gX00IyYgEIfwHD+ds+bEK0dA90ZMo7sI4oU16qMP+GD +rRBykflaFhv5w9kQ0PRVMCwRZCaukLlJcZ5fwq3yqj+mFByJcoCBr0klUMulGV5ZB16 IASUkbZVG9FkLxV+IGr6rqcfeezr8jvlLoop7amtpzYnydkjFBR7LejYeoO802ctswSl w85XKOuN9adzNMW1OYLFhcwlI3bW12t0X9O+vRy9IaQ5otqycbqzmEMkOdpzfEq5ayyw CdcA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=Npay1TVh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k196-20020a6284cd000000b0063b886a1786si1812159pfd.22.2023.04.20.07.21.40; Thu, 20 Apr 2023 07:21:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=Npay1TVh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231345AbjDTOTv (ORCPT + 99 others); Thu, 20 Apr 2023 10:19:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40794 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231997AbjDTOTb (ORCPT ); Thu, 20 Apr 2023 10:19:31 -0400 Received: from mail-wm1-x334.google.com (mail-wm1-x334.google.com [IPv6:2a00:1450:4864:20::334]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7640C61B4; Thu, 20 Apr 2023 07:19:05 -0700 (PDT) Received: by mail-wm1-x334.google.com with SMTP id bi21-20020a05600c3d9500b003f17a8eaedbso3155696wmb.1; Thu, 20 Apr 2023 07:19:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1682000343; x=1684592343; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=5bvHlG0/JTB7HrnkZNbj8ncjZGNIrjux5UiB5tAycP8=; b=Npay1TVhojwYB+yz7Ysjw6StA9B11tmjVHgRd0PstSs12o90dEemEl7STZn4axgCAZ OM0zb6TDhk4+v+TPBRFM/72qZ2pHWuV8TdrVyRVeArba9e2ZakVBE9YbrBLowET/mVkn y9CfDStdJ/jAQ5UoJOgJyVl3DFq67AXzIVS5CfCyC6vA1d6xYGzIsV3OmETy/1MFXxVg 33rX0L2U/nL3XGXk7cTu/AfGaxVfRVwJR5zfHhLZ99DxjiSIX9Y5Lqxw0pyz9LzxX60h jXpOcFDei0fF6puh5ZTK6rvSWrdUJyKyqZT3GNjyXdq2yC6PQ5e0rOr4X94x0HmeunRP H/Mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682000343; x=1684592343; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=5bvHlG0/JTB7HrnkZNbj8ncjZGNIrjux5UiB5tAycP8=; b=ielwcZWQstvXcxNGnDpPCvGcOrnMe/MeYzLNzvpDOESE3tuVgSV8YoKVBEafklCqux Fhb7c319joV8TU2NOurjSQMQK2n5ZtQzG/SvQ6t8dZfEnkLgVy1kUU+6cG0NsIxYc46o JX87+Tw5aT9a1Hv+N/ldeuc5vWkUAeRIy0W6hDfP4CYN7VlV09HpcRbE/FCt4c+slCGs HUQCSiYX3QbWkbzd7J+aaO6XcDTiY9g/63R87Dwcc9imb9RECoxz9Ou9+L4fE4FTe3SF KkJeb9mBL3pEGuNW2NAaNHlfq/ztcqtA44s636x2ISgEQXuh3WV8TV8sdUiiraU0Xy8n UBvw== X-Gm-Message-State: AAQBX9c9o7vDTpf3jenvW3DACC+RzYSr7N22MCe5dl5lKN8iFVXkzCA1 e6x76VMVMCZJk613gKEzR8w= X-Received: by 2002:a7b:cd10:0:b0:3f0:a785:f0e0 with SMTP id f16-20020a7bcd10000000b003f0a785f0e0mr1430470wmj.40.1682000342479; Thu, 20 Apr 2023 07:19:02 -0700 (PDT) Received: from localhost (host86-156-84-164.range86-156.btcentralplus.com. [86.156.84.164]) by smtp.gmail.com with ESMTPSA id s9-20020a05600c45c900b003f092f0e0a0sm8082757wmo.3.2023.04.20.07.19.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 20 Apr 2023 07:19:01 -0700 (PDT) Date: Thu, 20 Apr 2023 15:19:01 +0100 From: Lorenzo Stoakes To: Pavel Begunkov Cc: Matthew Wilcox , Jason Gunthorpe , Jens Axboe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , David Hildenbrand , io-uring@vger.kernel.org Subject: Re: [PATCH v4 4/6] io_uring: rsrc: avoid use of vmas parameter in pin_user_pages() Message-ID: <5e4a23f1-c99e-45d6-8402-6c2df8fa06e0@lucifer.local> References: <69f48cc6-8fc6-0c49-5a79-6c7d248e4ad5@kernel.dk> <8af483d2-0d3d-5ece-fb1d-a3654411752b@kernel.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Apr 20, 2023 at 02:36:47PM +0100, Pavel Begunkov wrote: > On 4/19/23 19:35, Matthew Wilcox wrote: > > On Wed, Apr 19, 2023 at 03:24:55PM -0300, Jason Gunthorpe wrote: > > > On Wed, Apr 19, 2023 at 07:23:00PM +0100, Matthew Wilcox wrote: > > > > On Wed, Apr 19, 2023 at 07:18:26PM +0100, Lorenzo Stoakes wrote: > > > > > So even if I did the FOLL_ALLOW_BROKEN_FILE_MAPPING patch series first, I > > > > > would still need to come along and delete a bunch of your code > > > > > afterwards. And unfortunately Pavel's recent change which insists on not > > > > > having different vm_file's across VMAs for the buffer would have to be > > > > > reverted so I expect it might not be entirely without discussion. > > > > > > > > I don't even understand why Pavel wanted to make this change. The > > > > commit log really doesn't say. > > > > > > > > commit edd478269640 > > > > Author: Pavel Begunkov > > > > Date: Wed Feb 22 14:36:48 2023 +0000 > > > > > > > > io_uring/rsrc: disallow multi-source reg buffers > > > > > > > > If two or more mappings go back to back to each other they can be passed > > > > into io_uring to be registered as a single registered buffer. That would > > > > even work if mappings came from different sources, e.g. it's possible to > > > > mix in this way anon pages and pages from shmem or hugetlb. That is not > > > > a problem but it'd rather be less prone if we forbid such mixing. > > > > > > > > Cc: > > > > Signed-off-by: Pavel Begunkov > > > > Signed-off-by: Jens Axboe > > > > > > > > It even says "That is not a problem"! So why was this patch merged > > > > if it's not fixing a problem? > > > > > > > > It's now standing in the way of an actual cleanup. So why don't we > > > > revert it? There must be more to it than this ... > > > > > > https://lore.kernel.org/all/61ded378-51a8-1dcb-b631-fda1903248a9@gmail.com/ > > > > So um, it's disallowed because Pavel couldn't understand why it > > should be allowed? This gets less and less convincing. > > Excuse me? I'm really sorry you "couldn't understand" the explanation > as it has probably been too much of a "mental load", but let me try to > elaborate. > > Because it's currently limited what can be registered, it's indeed not > a big deal, but that will most certainly change, and I usually and > apparently nonsensically prefer to tighten things up _before_ it becomes > a problem. And again, taking a random set of buffers created for > different purposes and registering it as a single entity is IMHO not a > sane approach. > > Take p2pdma for instance, if would have been passed without intermixing > there might not have been is_pci_p2pdma_page()/etc. for every single page > in a bvec. That's why in general, it won't change for p2pdma but there > might be other cases in the future. > The proposed changes for GUP are equally intended to tighten things up both in advance of issues (e.g. misuse of vmas parameter), for the purposes of future improvements to GUP (optimising how we perform these operations) and most pertinently here - ensuring broken uses of GUP do not occur. So both are 'cleanups' in some sense :) I think it's important to point out that this change is not simply a desire to improve an interface but has practical implications going forward (which maybe aren't obvious at this point yet). The new approach to this changes goes further in that we essentially perform the existing check here (anon/shmem/hugetlb) but for _all_ FOLL_WRITE operations in GUP to avoid broken writes to file mappings, at which point we can just remove the check here altogether (I will post a series for this soon). I think that you guys should not have to be performing sanity checks here but rather we should handle it in GUP so you don't need to think about it at all. It feels like an mm failing that you have had to do so at all. So I guess the question is, do you feel the requirement for vm_file being the same should apply across _any_ GUP operation over a range or is it io_uring-specific? If the former then we should do it in GUP alongside the other checks (and you can comment accordingly on that patch series when it comes), if the latter then I would restate my opinion that we shouldn't be prevented from making improvements to GUP in for one caller that wants to iterate over these VMAs for custom checks + that should be done separately. > > > FWIW, what I was suggesting was that we should have a FOLL_SINGLE_VMA > > flag, which would use our shiny new VMA lock infrastructure to look > > up and lock _one_ VMA instead of having the caller take the mmap_lock. > > Passing that flag would be a tighter restriction that Pavel implemented, > > but would certainly relieve some of his mental load. > > > > By the way, even if all pages are from the same VMA, they may still be a > > mixture of anon and file pages; think a MAP_PRIVATE of a file when > > only some pages have been written to. Or an anon MAP_SHARED which is > > accessible by a child process. > > -- > Pavel Begunkov