From: Ajay Kaher <akaher@vmware.com>
To:
CC:
Subject: [PATCH v2 6/8] mm: prevent get_user_pages() from overflowing page refcount
Date: Wed, 9 Oct 2019 06:14:21 +0530
Message-ID: <1570581863-12090-7-git-send-email-akaher@vmware.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1570581863-12090-1-git-send-email-akaher@vmware.com>
References: <1570581863-12090-1-git-send-email-akaher@vmware.com>
MIME-Version: 1.0
Content-Type: text/plain

From: Linus Torvalds

commit 8fde12ca79aff9b5ba951fce1a2641901b8d8e64 upstream.

If the page refcount wraps around past zero, it will be freed while
there are still four billion references to it.  One of the possible
avenues for an attacker to try to make this happen is by doing direct
IO on a page multiple times.  This patch makes get_user_pages() refuse
to take a new page reference if there are already more than two billion
references to the page.

Reported-by: Jann Horn
Acked-by: Matthew Wilcox
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds
[ 4.4.y backport notes:
  Ajay: Added local variable 'err' within follow_hugetlb_page()
        from 2be7cfed995e, to resolve compilation error
  Srivatsa: Replaced call to get_page_foll() with try_get_page_foll() ]
Signed-off-by: Srivatsa S. Bhat (VMware)
Signed-off-by: Ajay Kaher
---
 mm/gup.c     | 43 ++++++++++++++++++++++++++++++++-----------
 mm/hugetlb.c | 16 +++++++++++++++-
 2 files changed, 47 insertions(+), 12 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index fae4d1e..171b460 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -126,8 +126,12 @@ retry:
 		}
 	}
 
-	if (flags & FOLL_GET)
-		get_page_foll(page);
+	if (flags & FOLL_GET) {
+		if (unlikely(!try_get_page_foll(page))) {
+			page = ERR_PTR(-ENOMEM);
+			goto out;
+		}
+	}
 	if (flags & FOLL_TOUCH) {
 		if ((flags & FOLL_WRITE) &&
 		    !pte_dirty(pte) && !PageDirty(page))
@@ -289,7 +293,10 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
 			goto unmap;
 		*page = pte_page(*pte);
 	}
-	get_page(*page);
+	if (unlikely(!try_get_page(*page))) {
+		ret = -ENOMEM;
+		goto unmap;
+	}
 out:
 	ret = 0;
 unmap:
@@ -1053,6 +1060,20 @@ struct page *get_dump_page(unsigned long addr)
  */
 #ifdef CONFIG_HAVE_GENERIC_RCU_GUP
 
+/*
+ * Return the compound head page with ref appropriately incremented,
+ * or NULL if that failed.
+ */
+static inline struct page *try_get_compound_head(struct page *page, int refs)
+{
+	struct page *head = compound_head(page);
+	if (WARN_ON_ONCE(atomic_read(&head->_count) < 0))
+		return NULL;
+	if (unlikely(!page_cache_add_speculative(head, refs)))
+		return NULL;
+	return head;
+}
+
 #ifdef __HAVE_ARCH_PTE_SPECIAL
 static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 			 int write, struct page **pages, int *nr)
@@ -1082,9 +1103,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 
 		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
 		page = pte_page(pte);
 
-		head = compound_head(page);
-		if (!page_cache_get_speculative(head))
+		head = try_get_compound_head(page, 1);
+		if (!head)
 			goto pte_unmap;
 
 		if (unlikely(pte_val(pte) != pte_val(*ptep))) {
@@ -1141,8 +1162,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 		refs++;
 	} while (addr += PAGE_SIZE, addr != end);
 
-	head = compound_head(pmd_page(orig));
-	if (!page_cache_add_speculative(head, refs)) {
+	head = try_get_compound_head(pmd_page(orig), refs);
+	if (!head) {
 		*nr -= refs;
 		return 0;
 	}
@@ -1187,8 +1208,8 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 		refs++;
 	} while (addr += PAGE_SIZE, addr != end);
 
-	head = compound_head(pud_page(orig));
-	if (!page_cache_add_speculative(head, refs)) {
+	head = try_get_compound_head(pud_page(orig), refs);
+	if (!head) {
 		*nr -= refs;
 		return 0;
 	}
@@ -1229,8 +1250,8 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 		refs++;
 	} while (addr += PAGE_SIZE, addr != end);
 
-	head = compound_head(pgd_page(orig));
-	if (!page_cache_add_speculative(head, refs)) {
+	head = try_get_compound_head(pgd_page(orig), refs);
+	if (!head) {
 		*nr -= refs;
 		return 0;
 	}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fd932e7..3a1501e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3886,6 +3886,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	unsigned long vaddr = *position;
 	unsigned long remainder = *nr_pages;
 	struct hstate *h = hstate_vma(vma);
+	int err = -EFAULT;
 
 	while (vaddr < vma->vm_end && remainder) {
 		pte_t *pte;
@@ -3957,6 +3958,19 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 
 		pfn_offset = (vaddr & ~huge_page_mask(h)) >> PAGE_SHIFT;
 		page = pte_page(huge_ptep_get(pte));
+
+		/*
+		 * Instead of doing 'try_get_page_foll()' below in the same_page
+		 * loop, just check the count once here.
+		 */
+		if (unlikely(page_count(page) <= 0)) {
+			if (pages) {
+				spin_unlock(ptl);
+				remainder = 0;
+				err = -ENOMEM;
+				break;
+			}
+		}
 same_page:
 		if (pages) {
 			pages[i] = mem_map_offset(page, pfn_offset);
@@ -3983,7 +3997,7 @@ same_page:
 	*nr_pages = remainder;
 	*position = vaddr;
 
-	return i ? i : -EFAULT;
+	return i ? i : err;
 }
 
 unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
-- 
2.7.4
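
The guard this patch adds is easiest to see in isolation. Below is a minimal
user-space sketch of the idea, not kernel code: struct page_model and
try_get_page_model() are invented stand-ins for the page refcount and for
try_get_page_foll()/try_get_compound_head(). The sketch only assumes what the
commit message states, namely that the page refcount is a signed 32-bit
counter, so once it has been pushed past roughly two billion it reads as
negative, and the guard refuses another reference instead of letting further
increments wrap the counter back through zero and free a page that still has
users.

/*
 * Illustrative user-space model of the refcount saturation guard; the
 * names below are hypothetical and are not kernel APIs.
 */
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

struct page_model {
	int refcount;	/* stands in for the signed 32-bit page->_count */
};

static bool try_get_page_model(struct page_model *p)
{
	/*
	 * Once more than ~2 billion references exist, the signed counter
	 * reads as negative; refuse the new reference rather than keep
	 * incrementing toward a wrap back to zero.
	 */
	if (p->refcount <= 0)
		return false;
	p->refcount++;
	return true;
}

int main(void)
{
	struct page_model normal = { .refcount = 1 };
	/* A page whose counter has already been pushed past INT_MAX. */
	struct page_model overflowed = { .refcount = INT_MIN + 5 };

	printf("normal page:     %s\n",
	       try_get_page_model(&normal) ? "ref taken" : "refused");
	printf("overflowed page: %s\n",
	       try_get_page_model(&overflowed) ? "ref taken" : "refused");
	return 0;
}

get_user_pages() is the natural enforcement point because long-lived
references taken there, for example by repeated direct IO as the commit
message notes, are what allow an attacker to drive the counter that high.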