Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp5345671ybi; Wed, 12 Jun 2019 00:38:30 -0700 (PDT) X-Google-Smtp-Source: APXvYqx0Uu2EKV1J6p7ELfCmX4O4KaRPLeOsV5DhceiIwYXa+83nu84Vi38zuHwHCs2gbY6vrI+C X-Received: by 2002:a63:4d05:: with SMTP id a5mr22451617pgb.19.1560325110165; Wed, 12 Jun 2019 00:38:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1560325110; cv=none; d=google.com; s=arc-20160816; b=oBlYW8Y3gTy6TvCANPewfqVEdsyp5GYH58PKRN3QcKpWoUOzCsz6gH2kG86PzK22lh HDPMm4gSgTU68JriRPa2xpbhrznMZPC+p7CnaTNZZ4I7h2Hn2Ym7HlHUO+TTtZ/g3C4C u4F0D0rkKxRvrHN29g109UIa+Cj6llkIjLfk7rDLsb/9+UMde04u0kWiEfUh6HReDRmJ uGEoSc76qXTghHSU9h+vpvyU278MVgOslXqbzCxxk97/kc6wKkYsJkyZSkAVN+27HyWS TQqlOMq6ooKOPatngM+a9kdjcQcsxMqmtpzKRXRdxsk+GSwKVV1myIksDW0LKcLQZwtU Vafw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=MAd9VAiJz95Llz8BbTxV4sHt55dwKHdH+hoX3bTDkOM=; b=UVzGXnL7lkCWCW8HksbiaWW+Mvsi3xevbvfWpJrsgpok9XMV5kFKuwnTya1MlJNJ0y Hk4Mg1EuxTBIkkFRIeMYKrZ0VxOblbMQPUhk4VIAj2TcpojmKeUz+ojwJD0t+ngb7V6n 02XwlsiCCaOh7/3lnga0Xwgmr/ZfRPlwOyQTdbibqDlqgKVroEAoOtxIoRsO8GiQjuND iR/5c19uXiMDzAprMPVR1e+qkPBu8WTwkr6GOzrYMTcS7FXyYzbsZ0A+ekUAJXqvXH+H 0QMr2HC1qekr9RLPgKvbG74rorUc1dRWRVnrVvLqz95Avv1QT+Yey1sATsvwFSYBPKz9 4OdQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@shutemov-name.20150623.gappssmtp.com header.s=20150623 header.b=aGF0m5qD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 14si16018437pfw.143.2019.06.12.00.38.11; Wed, 12 Jun 2019 00:38:30 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@shutemov-name.20150623.gappssmtp.com header.s=20150623 header.b=aGF0m5qD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2408460AbfFLCdj (ORCPT + 99 others); Tue, 11 Jun 2019 22:33:39 -0400 Received: from mail-ed1-f67.google.com ([209.85.208.67]:38489 "EHLO mail-ed1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2406202AbfFLCdj (ORCPT ); Tue, 11 Jun 2019 22:33:39 -0400 Received: by mail-ed1-f67.google.com with SMTP id g13so23150629edu.5 for ; Tue, 11 Jun 2019 19:33:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shutemov-name.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=MAd9VAiJz95Llz8BbTxV4sHt55dwKHdH+hoX3bTDkOM=; b=aGF0m5qDaDj87qvNo5iPDYjLHNHkvWmDaEpCNpwbFSa4HzQDh4bwYk/ghq2akBh7gf mvYvGFQKDZIuN9zsuuccUcDr6CwuBRbat9mNrwjJNqAHXk5LoA66Npn1J32iT2rcDfpw f+i5zMYkljrX+ld9mYuVrJTtg30/15VL5s9Sh6uBssTfZ57JtMnhkfDzQKGLfLD7rQjV yZWlPSz0HiFXV9SmWRGoEhN14rWIRN5p0NtTtgElcRvgA+7FrrckEiVoMyoE3ze2sQbQ dYnPsIewsFSO+WvWMcODJaKcHE2ujN58RlX4DWlbu9H2vKMtJdLSA9GH2kSSCsbTERm9 IihA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=MAd9VAiJz95Llz8BbTxV4sHt55dwKHdH+hoX3bTDkOM=; b=WM+aUL+goIGGpUPIvFHbyWY7KEv9ertfg/gpblz84mX6ZnBcRWs+ULa8Qhy6ntcu5z u/lIEfrQrzcBwgWpkMbLLGz5TM2LJyTL9Y36tpN3Xk7VCFxUNRcBM9ZPwutdma2ouwfC LeBL98I09q5hAzNJZ7Nc06SS++tXvFUc6UJY11/1jgLNbOP2s46U2jru+66pfmrxfd4r BzeRHEiYKNwvsjqzRL8ptqhUXQeHsRHXF8fdFpF3iSaCVusztPnVHxQ/9k2Ug+0BIqEL DINkBAF4jmSrszyr+barkZPqCFDGTjKUZ5LX3Q7x7TEUmUyDPgkku4j5ZS5uiCZ/0Upe WAAw== X-Gm-Message-State: APjAAAWhP4brdKav6z3m2YhGDHf6+FnXrakaDr9S7t7yszLN3XUU6lFb mCjKt2iPrS/u/aqUlQ1odHAfkw== X-Received: by 2002:a17:906:53c1:: with SMTP id p1mr5138534ejo.241.1560306817323; Tue, 11 Jun 2019 19:33:37 -0700 (PDT) Received: from box.localdomain ([86.57.175.117]) by smtp.gmail.com with ESMTPSA id i31sm4161836edd.90.2019.06.11.19.33.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 11 Jun 2019 19:33:36 -0700 (PDT) Received: by box.localdomain (Postfix, from userid 1000) id C01BA10081B; Wed, 12 Jun 2019 05:33:36 +0300 (+03) Date: Wed, 12 Jun 2019 05:33:36 +0300 From: "Kirill A. Shutemov" To: Larry Bassel Cc: mike.kravetz@oracle.com, willy@infradead.org, dan.j.williams@intel.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-nvdimm@lists.01.org Subject: Re: [RFC PATCH v2 2/2] Implement sharing/unsharing of PMDs for FS/DAX Message-ID: <20190612023336.hbqs2ag4bv2qv2eh@box> References: <1559937063-8323-1-git-send-email-larry.bassel@oracle.com> <1559937063-8323-3-git-send-email-larry.bassel@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1559937063-8323-3-git-send-email-larry.bassel@oracle.com> User-Agent: NeoMutt/20180716 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jun 07, 2019 at 12:51:03PM -0700, Larry Bassel wrote: > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 3a54c9d..1c1ed4e 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -4653,9 +4653,9 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end, > } > > #ifdef CONFIG_ARCH_HAS_HUGE_PMD_SHARE > -static unsigned long page_table_shareable(struct vm_area_struct *svma, > - struct vm_area_struct *vma, > - unsigned long addr, pgoff_t idx) > +unsigned long page_table_shareable(struct vm_area_struct *svma, > + struct vm_area_struct *vma, > + unsigned long addr, pgoff_t idx) > { > unsigned long saddr = ((idx - svma->vm_pgoff) << PAGE_SHIFT) + > svma->vm_start; > @@ -4678,7 +4678,7 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma, > return saddr; > } > > -static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr) > +bool vma_shareable(struct vm_area_struct *vma, unsigned long addr) > { > unsigned long base = addr & PUD_MASK; > unsigned long end = base + PUD_SIZE; This is going to be build error. mm/hugetlb.o doesn't build unlessp CONFIG_HUGETLBFS=y. And I think both functions doesn't cover all DAX cases: VMA can be not aligned (due to vm_start and/or vm_pgoff) to 2M even if the file has 2M ranges allocated. See transhuge_vma_suitable(). And as I said before, nothing guarantees contiguous 2M ranges on backing storage. And in general I found piggybacking on hugetlb hacky. The solution has to stand on its own with own justification. Saying it worked for hugetlb and it has to work here would not fly. hugetlb is much more restrictive on use cases. THP has more corner cases. > diff --git a/mm/memory.c b/mm/memory.c > index ddf20bd..1ca8f75 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3932,6 +3932,109 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) > return 0; > } > > +#ifdef CONFIG_ARCH_HAS_HUGE_PMD_SHARE > +static pmd_t *huge_pmd_offset(struct mm_struct *mm, > + unsigned long addr, unsigned long sz) > +{ > + pgd_t *pgd; > + p4d_t *p4d; > + pud_t *pud; > + pmd_t *pmd; > + > + pgd = pgd_offset(mm, addr); > + if (!pgd_present(*pgd)) > + return NULL; > + p4d = p4d_offset(pgd, addr); > + if (!p4d_present(*p4d)) > + return NULL; > + > + pud = pud_offset(p4d, addr); > + if (sz != PUD_SIZE && pud_none(*pud)) > + return NULL; > + /* hugepage or swap? */ > + if (pud_huge(*pud) || !pud_present(*pud)) > + return (pmd_t *)pud; So do we or do we not support PUD pages? This is just broken. > + > + pmd = pmd_offset(pud, addr); > + if (sz != PMD_SIZE && pmd_none(*pmd)) > + return NULL; > + /* hugepage or swap? */ > + if (pmd_huge(*pmd) || !pmd_present(*pmd)) > + return pmd; > + > + return NULL; > +} > + -- Kirill A. Shutemov