References: <20230914152620.2743033-1-surenb@google.com>
    <20230914152620.2743033-3-surenb@google.com>
From: Suren Baghdasaryan
Date: Thu, 21 Sep 2023 18:04:30 +0000
Subject: Re: [PATCH 2/3] userfaultfd: UFFDIO_REMAP uABI
To: David Hildenbrand
Cc: Matthew Wilcox, akpm@linux-foundation.org, viro@zeniv.linux.org.uk,
    brauner@kernel.org, shuah@kernel.org, aarcange@redhat.com,
    lokeshgidra@google.com, peterx@redhat.com, hughd@google.com,
    mhocko@suse.com, axelrasmussen@google.com, rppt@kernel.org,
    Liam.Howlett@oracle.com, jannh@google.com, zhangpeng362@huawei.com,
    bgeffon@google.com, kaleshsingh@google.com, ngeoffray@google.com,
    jdduke@google.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    kernel-team@android.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 14, 2023 at 6:45 PM David Hildenbrand wrote:
>
> On 14.09.23 20:43, David Hildenbrand wrote:
> > On 14.09.23 20:11, Matthew Wilcox wrote:
> >> On Thu, Sep 14, 2023 at 08:26:12AM -0700, Suren Baghdasaryan wrote:
> >>> +++ b/include/linux/userfaultfd_k.h
> >>> @@ -93,6 +93,23 @@ extern int mwriteprotect_range(struct mm_struct *dst_mm,
> >>>  extern long uffd_wp_range(struct vm_area_struct *vma,
> >>>                            unsigned long start, unsigned long len, bool enable_wp);
> >>>
> >>> +/* remap_pages */
> >>> +extern void double_pt_lock(spinlock_t *ptl1, spinlock_t *ptl2);
> >>> +extern void double_pt_unlock(spinlock_t *ptl1, spinlock_t *ptl2);
> >>> +extern ssize_t remap_pages(struct mm_struct *dst_mm,
> >>> +                           struct mm_struct *src_mm,
> >>> +                           unsigned long dst_start,
> >>> +                           unsigned long src_start,
> >>> +                           unsigned long len, __u64 flags);
> >>> +extern int remap_pages_huge_pmd(struct mm_struct *dst_mm,
> >>> +                                struct mm_struct *src_mm,
> >>> +                                pmd_t *dst_pmd, pmd_t *src_pmd,
> >>> +                                pmd_t dst_pmdval,
> >>> +                                struct vm_area_struct *dst_vma,
> >>> +                                struct vm_area_struct *src_vma,
> >>> +                                unsigned long dst_addr,
> >>> +                                unsigned long src_addr);
> >>
> >> Drop the 'extern' markers from function
> >> declarations.
> >>
> >>> +int remap_pages_huge_pmd(struct mm_struct *dst_mm,
> >>> +                         struct mm_struct *src_mm,
> >>> +                         pmd_t *dst_pmd, pmd_t *src_pmd,
> >>> +                         pmd_t dst_pmdval,
> >>> +                         struct vm_area_struct *dst_vma,
> >>> +                         struct vm_area_struct *src_vma,
> >>> +                         unsigned long dst_addr,
> >>> +                         unsigned long src_addr)
> >>> +{
> >>> +        pmd_t _dst_pmd, src_pmdval;
> >>> +        struct page *src_page;
> >>> +        struct anon_vma *src_anon_vma, *dst_anon_vma;
> >>> +        spinlock_t *src_ptl, *dst_ptl;
> >>> +        pgtable_t pgtable;
> >>> +        struct mmu_notifier_range range;
> >>> +
> >>> +        src_pmdval = *src_pmd;
> >>> +        src_ptl = pmd_lockptr(src_mm, src_pmd);
> >>> +
> >>> +        BUG_ON(!pmd_trans_huge(src_pmdval));
> >>> +        BUG_ON(!pmd_none(dst_pmdval));
> >>> +        BUG_ON(!spin_is_locked(src_ptl));
> >>> +        mmap_assert_locked(src_mm);
> >>> +        mmap_assert_locked(dst_mm);
> >>> +        BUG_ON(src_addr & ~HPAGE_PMD_MASK);
> >>> +        BUG_ON(dst_addr & ~HPAGE_PMD_MASK);
> >>> +
> >>> +        src_page = pmd_page(src_pmdval);
> >>> +        BUG_ON(!PageHead(src_page));
> >>> +        BUG_ON(!PageAnon(src_page));
> >>
> >> Better to add a src_folio = page_folio(src_page);
> >> and then folio_test_anon() here.
> >>
> >>> +        if (unlikely(page_mapcount(src_page) != 1)) {
> >>
> >> Brr, this is going to miss PTE mappings of this folio.  I think you
> >> actually want folio_mapcount() instead, although it'd be more efficient
> >> to look at folio->_entire_mapcount == 1 and _nr_pages_mapped == 0.
> >> Not sure what a good name for that predicate would be.
> >
> > We have
> >
> >   * It only works on non shared anonymous pages because those can
> >   * be relocated without generating non linear anon_vmas in the rmap
> >   * code.
> >   *
> >   * It provides a zero copy mechanism to handle userspace page faults.
> >   * The source vma pages should have mapcount == 1, which can be
> >   * enforced by using madvise(MADV_DONTFORK) on src vma.
> >
> > Use PageAnonExclusive().
> > As long as KSM is not involved and you don't
> > use fork(), that flag should be good enough for that use case here.
> >
> ... and similarly don't do any of that swapcount stuff and only check if
> the swap pte is anon exclusive.

I'm preparing v2 and this is the only part left for me to address, but
I'm not clear on how to do it. David, could you please clarify how I
should check that a swap pte is exclusive without using swapcount?

>
> --
> Cheers,
>
> David / dhildenb
>
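
To make the quoted mapcount concern concrete, here is a minimal userspace
sketch. It is a toy model, not kernel code: struct toy_folio, TOY_NR_PAGES
and all toy_* functions are hypothetical names, the counters are plain
non-negative counts, and modelling the head-page count as "whole-folio
mappings plus PTE mappings of the head page only" is an assumption made
for illustration. It shows how a head-page-only check can read 1 while
another page of the same folio is still PTE-mapped, and how the stricter
predicate suggested in the review rejects that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 4	/* hypothetical folio size used by this model */

/* Toy stand-in for a large folio's mapping counters (not struct folio). */
struct toy_folio {
	int entire_mapcount;			/* whole-folio (PMD) mappings */
	int pte_mapcount[TOY_NR_PAGES];		/* per-page PTE mappings; [0] is the head page */
};

/* Model assumption: a head-page count sees PMD mappings plus the head page's own PTEs. */
int toy_head_page_mapcount(const struct toy_folio *folio)
{
	return folio->entire_mapcount + folio->pte_mapcount[0];
}

/* Number of pages in the folio that have at least one PTE mapping. */
int toy_nr_pages_mapped(const struct toy_folio *folio)
{
	int nr = 0;

	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		assert(folio->pte_mapcount[i] >= 0);
		if (folio->pte_mapcount[i] > 0)
			nr++;
	}
	return nr;
}

/* Stricter predicate from the review: exactly one PMD mapping and no PTE mappings. */
bool toy_folio_single_pmd_mapped(const struct toy_folio *folio)
{
	return folio->entire_mapcount == 1 && toy_nr_pages_mapped(folio) == 0;
}
```

With one PMD mapping and a PTE mapping of the second page, the head-page
count in this model is still 1, while toy_folio_single_pmd_mapped()
returns false. This only illustrates the counting argument; it says
nothing about how PageAnonExclusive() or the swap pte check should be
used in v2.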