From: Mike Rapoport
To: Jonathan Corbet
Cc: linux-doc, linux-mm, lkml, Mike Rapoport
Subject: [PATCH 1/3] docs/vm: transhuge: change sections order
Date: Mon, 14 May 2018 11:13:38 +0300
Message-Id: <1526285620-453-2-git-send-email-rppt@linux.vnet.ibm.com>
In-Reply-To: <1526285620-453-1-git-send-email-rppt@linux.vnet.ibm.com>
References: <1526285620-453-1-git-send-email-rppt@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

so that userspace interface and implementation
description will be grouped together

Signed-off-by: Mike Rapoport
---
 Documentation/vm/transhuge.rst | 82 +++++++++++++++++++++---------------------
 1 file changed, 41 insertions(+), 41 deletions(-)

diff --git a/Documentation/vm/transhuge.rst b/Documentation/vm/transhuge.rst
index 2c6867f..56d04cbb 100644
--- a/Documentation/vm/transhuge.rst
+++ b/Documentation/vm/transhuge.rst
@@ -38,31 +38,6 @@ are using hugepages but a significant speedup already happens if only
 one of the two is using hugepages just because of the fact the TLB
 miss is going to run faster.
 
-Design
-======
-
-- "graceful fallback": mm components which don't have transparent hugepage
-  knowledge fall back to breaking huge pmd mapping into table of ptes and,
-  if necessary, split a transparent hugepage. Therefore these components
-  can continue working on the regular pages or regular pte mappings.
-
-- if a hugepage allocation fails because of memory fragmentation,
-  regular pages should be gracefully allocated instead and mixed in
-  the same vma without any failure or significant delay and without
-  userland noticing
-
-- if some task quits and more hugepages become available (either
-  immediately in the buddy or through the VM), guest physical memory
-  backed by regular pages should be relocated on hugepages
-  automatically (with khugepaged)
-
-- it doesn't require memory reservation and in turn it uses hugepages
-  whenever possible (the only possible reservation here is kernelcore=
-  to avoid unmovable pages to fragment all the memory but such a tweak
-  is not specific to transparent hugepage support and it's a generic
-  feature that applies to all dynamic high order allocations in the
-  kernel)
-
 Transparent Hugepage Support maximizes the usefulness of free memory
 if compared to the reservation approach of hugetlbfs by allowing all
 unused memory to be used as cache or other movable (or even unmovable
@@ -401,6 +376,47 @@ tracer to record how long was spent in __alloc_pages_nodemask and using
 the mm_page_alloc tracepoint to identify which allocations were for huge
 pages.
 
+Optimizing the applications
+===========================
+
+To be guaranteed that the kernel will map a 2M page immediately in any
+memory region, the mmap region has to be hugepage naturally
+aligned. posix_memalign() can provide that guarantee.
+
+Hugetlbfs
+=========
+
+You can use hugetlbfs on a kernel that has transparent hugepage
+support enabled just fine as always. No difference can be noted in
+hugetlbfs other than there will be less overall fragmentation. All
+usual features belonging to hugetlbfs are preserved and
+unaffected. libhugetlbfs will also work fine as usual.
+
+Design principles
+=================
+
+- "graceful fallback": mm components which don't have transparent hugepage
+  knowledge fall back to breaking huge pmd mapping into table of ptes and,
+  if necessary, split a transparent hugepage. Therefore these components
+  can continue working on the regular pages or regular pte mappings.
+
+- if a hugepage allocation fails because of memory fragmentation,
+  regular pages should be gracefully allocated instead and mixed in
+  the same vma without any failure or significant delay and without
+  userland noticing
+
+- if some task quits and more hugepages become available (either
+  immediately in the buddy or through the VM), guest physical memory
+  backed by regular pages should be relocated on hugepages
+  automatically (with khugepaged)
+
+- it doesn't require memory reservation and in turn it uses hugepages
+  whenever possible (the only possible reservation here is kernelcore=
+  to avoid unmovable pages to fragment all the memory but such a tweak
+  is not specific to transparent hugepage support and it's a generic
+  feature that applies to all dynamic high order allocations in the
+  kernel)
+
 get_user_pages and follow_page
 ==============================
 
@@ -432,22 +448,6 @@ hugepages being returned (as it's not only checking the pfn of the page
 and pinning it during the copy but it pretends to migrate the memory in
 regular page sizes and with regular pte/pmd mappings).
 
-Optimizing the applications
-===========================
-
-To be guaranteed that the kernel will map a 2M page immediately in any
-memory region, the mmap region has to be hugepage naturally
-aligned. posix_memalign() can provide that guarantee.
-
-Hugetlbfs
-=========
-
-You can use hugetlbfs on a kernel that has transparent hugepage
-support enabled just fine as always. No difference can be noted in
-hugetlbfs other than there will be less overall fragmentation. All
-usual features belonging to hugetlbfs are preserved and
-unaffected. libhugetlbfs will also work fine as usual.
-
 Graceful fallback
 =================
 
-- 
2.7.4
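
For readers of the "Optimizing the applications" section moved by this patch, the alignment requirement it describes could be met from userspace roughly as sketched below. This is only an illustration, not part of the patch: the helper name alloc_hugepage_aligned() and the HPAGE_SIZE constant are hypothetical, the 2M size assumes the huge page size named in that section (other architectures may differ), and the madvise(MADV_HUGEPAGE) call is an optional hint that this section does not mention.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: 2M huge pages, as in the section text. */
#define HPAGE_SIZE (2UL * 1024 * 1024)

/*
 * Hypothetical helper: return a buffer of at least len bytes whose start
 * is naturally aligned to HPAGE_SIZE and whose size is a whole number of
 * huge pages. Returns NULL on failure. Release it with free().
 */
static void *alloc_hugepage_aligned(size_t len)
{
    void *buf = NULL;
    size_t rounded = (len + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);

    /* Reject zero-length requests and size_t overflow in the rounding. */
    if (rounded == 0 || rounded < len)
        return NULL;

    /* posix_memalign() returns 0 on success and an errno value otherwise. */
    if (posix_memalign(&buf, HPAGE_SIZE, rounded) != 0)
        return NULL;

    /* Document the guarantee posix_memalign() just gave us. */
    assert(((uintptr_t)buf & (HPAGE_SIZE - 1)) == 0);

#ifdef MADV_HUGEPAGE
    /* Optional hint only; a failure here is not treated as an error. */
    (void)madvise(buf, rounded, MADV_HUGEPAGE);
#endif
    return buf;
}
```

A caller would use the returned pointer like any malloc()ed buffer. Whether the kernel actually backs it with a huge page still depends on the system's transparent hugepage settings and on memory fragmentation at fault time.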