Date: Mon, 12 Apr 2021 19:17:36 -0400
From: Peter Xu
To: Axel Rasmussen
Cc: Alexander Viro, Andrea Arcangeli, Andrew Morton, Daniel Colascione,
    Hugh Dickins, Jerome Glisse, Joe Perches, Lokesh Gidra, Mike Kravetz,
    Mike Rapoport, Shaohua Li, Shuah Khan, Stephen Rothwell, Wang Qing,
    linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
    linux-mm@kvack.org, Brian Geffon, "Dr. David Alan Gilbert",
    Mina Almasry, Oliver Upton
Subject: Re: [PATCH 4/9] userfaultfd/shmem: support UFFDIO_CONTINUE for shmem
Message-ID: <20210412231736.GA1002612@xz-x1>
References: <20210408234327.624367-1-axelrasmussen@google.com>
 <20210408234327.624367-5-axelrasmussen@google.com>
In-Reply-To: <20210408234327.624367-5-axelrasmussen@google.com>

On Thu, Apr 08, 2021 at 04:43:22PM -0700, Axel Rasmussen wrote:
> +/*
> + * Install PTEs, to map dst_addr (within dst_vma) to page.
> + *
> + * This function handles MCOPY_ATOMIC_CONTINUE (which is always file-backed),
> + * whether or not dst_vma is VM_SHARED. It also handles the more general
> + * MCOPY_ATOMIC_NORMAL case, when dst_vma is *not* VM_SHARED (it may be file
> + * backed, or not).
> + *
> + * Note that MCOPY_ATOMIC_NORMAL for a VM_SHARED dst_vma is handled by
> + * shmem_mcopy_atomic_pte instead.
> + */
> +static int mcopy_atomic_install_ptes(struct mm_struct *dst_mm, pmd_t *dst_pmd,
> +                                     struct vm_area_struct *dst_vma,
> +                                     unsigned long dst_addr, struct page *page,
> +                                     bool newly_allocated, bool wp_copy)
> +{
> +        int ret;
> +        pte_t _dst_pte, *dst_pte;
> +        int writable;
> +        bool vm_shared = dst_vma->vm_flags & VM_SHARED;
> +        spinlock_t *ptl;
> +        struct inode *inode;
> +        pgoff_t offset, max_off;
> +
> +        _dst_pte = mk_pte(page, dst_vma->vm_page_prot);
> +        writable = dst_vma->vm_flags & VM_WRITE;
> +        /* For private, non-anon we need CoW (don't write to page cache!) */
> +        if (!vma_is_anonymous(dst_vma) && !vm_shared)
> +                writable = 0;
> +
> +        if (writable || vma_is_anonymous(dst_vma))
> +                _dst_pte = pte_mkdirty(_dst_pte);
> +        if (writable) {
> +                if (wp_copy)
> +                        _dst_pte = pte_mkuffd_wp(_dst_pte);
> +                else
> +                        _dst_pte = pte_mkwrite(_dst_pte);
> +        } else if (vm_shared) {
> +                /*
> +                 * Since we didn't pte_mkdirty(), mark the page dirty or it
> +                 * could be freed from under us. We could do this
> +                 * unconditionally, but doing it only if !writable is faster.
> +                 */
> +                set_page_dirty(page);
> +        }
> +
> +        dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
> +
> +        if (vma_is_shmem(dst_vma)) {
> +                /* The shmem MAP_PRIVATE case requires checking the i_size */

When you start to use this function in the last patch, won't this check be
needed even in the MAP_SHARED case too?

How about directly stating the reason for doing it ("serialize against
truncate with the PT lock") instead of commenting on "who will need it"?
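I.e., the block could read something like this (just a sketch; the code is
unchanged from the hunk quoted below, only the comment is reworded):

        if (vma_is_shmem(dst_vma)) {
                /* Serialize against truncate with the PT lock */
                inode = dst_vma->vm_file->f_inode;
                offset = linear_page_index(dst_vma, dst_addr);
                max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
                ret = -EFAULT;
                if (unlikely(offset >= max_off))
                        goto out_unlock;
        }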
> +                inode = dst_vma->vm_file->f_inode;
> +                offset = linear_page_index(dst_vma, dst_addr);
> +                max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
> +                ret = -EFAULT;
> +                if (unlikely(offset >= max_off))
> +                        goto out_unlock;
> +        }

[...]

> +/* Handles UFFDIO_CONTINUE for all shmem VMAs (shared or private). */
> +static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
> +                                pmd_t *dst_pmd,
> +                                struct vm_area_struct *dst_vma,
> +                                unsigned long dst_addr,
> +                                bool wp_copy)
> +{
> +        struct inode *inode = file_inode(dst_vma->vm_file);
> +        pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
> +        struct page *page;
> +        int ret;
> +
> +        ret = shmem_getpage(inode, pgoff, &page, SGP_READ);

SGP_READ looks right, as we don't want page allocation here.  However, I
noticed there is a very slight difference when the page was just fallocated:

        /* fallocated page? */
        if (page && !PageUptodate(page)) {
                if (sgp != SGP_READ)
                        goto clear;
                unlock_page(page);
                put_page(page);
                page = NULL;
                hindex = index;
        }

I think this won't happen in your case, since the page should already be
uptodate (the other thread should check and modify the page before issuing
CONTINUE), but I still want to raise it: if the page was already allocated,
it seems cleaner to still install the fallocated page (do we need to clear
the page and SetUptodate?).

--
Peter Xu
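For illustration, a rough, untested sketch of what installing the fallocated
page could look like on the caller side.  This assumes shmem_getpage() handed
back the not-yet-uptodate page instead of dropping it (which is not what
SGP_READ does today); the clearing mirrors shmem's own clear-on-allocation
path:

        ret = shmem_getpage(inode, pgoff, &page, SGP_READ);

        /*
         * Hypothetical handling: if the lookup returned a fallocated page
         * that is not yet uptodate, zero it and mark it uptodate before
         * installing the PTE, instead of treating it as a missing page.
         */
        if (!ret && page && !PageUptodate(page)) {
                clear_highpage(page);
                flush_dcache_page(page);
                SetPageUptodate(page);
        }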