Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp5132666imm; Tue, 18 Sep 2018 05:00:44 -0700 (PDT) X-Google-Smtp-Source: ANB0VdbpGZO/0FsN7vhVFk4frb0EGFHPiJaAYgMbpM2wBcRb4cLOavVCtzbJEgEZkk5uEzwQ3qRm X-Received: by 2002:a17:902:209:: with SMTP id 9-v6mr29008313plc.270.1537272044848; Tue, 18 Sep 2018 05:00:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1537272044; cv=none; d=google.com; s=arc-20160816; b=lIi5OsP2US2eADAOomaPTa5fYTUADkhszZlXGrJIVUi+yPx0od2aXfht9sYgejWBAA Jwlazvy8uI650hRv8Lq7noT9DLSS0CY+QjAs9uPxyM3y7W4CrotBjKyDKPucPCJhzOfG LSBUgepEyQ6Jltz7i+OAn0lQjUsqtnrh4ScdoNwYJu/XO/e304UkfglMoDmcLzzkoiY6 hZi/hzZrRwkReZqWsU/LnMOvBCg7fGWiO29lvr+JZAoAJOz6kMdJlLxdPHCephXI7XkK XPjeeojPLeAhqpcZCWi6IMWANqvqlerXB/64hHYW0FHsYyDEAY8UdbW40iadOBrXHBug j7CA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:references:in-reply-to:date :subject:cc:to:from; bh=MH1xubOAD/a0Ch5UTzxxgumIftrVXnQELtP4CffADK0=; b=Y9QDNMFXrfkE42H5W4Oamrba3oxkpMxJItZlKcbb/+J+fSE3hZSbrZL7inNo8jRp5V kzQyvMgyWTWAhQ2KB32bQ6kWqT5HMJd499HLjPejBU/F1upRmhQ5eAkRKXGfeI7XjJyJ vi0aUzeaHBO0DYrNUTvLl96ymYbfMhRf9nPRjX/zRcuaI2swGZp1F2CET9kYfdcD+XNE rHDNzztb9RFogPwuCASwhffkca7KfR+rByzYf9S+46MSc/yJV0lMyitg9RlaUUKRA7sa RM/SLhDq3sK7Mog1Ycz1knNXkAlOQe5w2bEHCc+RTN/R95HadEKqbaqYcrMIpDIqBETJ J72w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g27-v6si17941673pgm.208.2018.09.18.05.00.23; Tue, 18 Sep 2018 05:00:44 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=ibm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729626AbeIRRbZ (ORCPT + 99 others); Tue, 18 Sep 2018 13:31:25 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:51160 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726812AbeIRRbZ (ORCPT ); Tue, 18 Sep 2018 13:31:25 -0400 Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w8IBrqtm126036 for ; Tue, 18 Sep 2018 07:59:08 -0400 Received: from e17.ny.us.ibm.com (e17.ny.us.ibm.com [129.33.205.207]) by mx0a-001b2d01.pphosted.com with ESMTP id 2mk0t08e7d-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 18 Sep 2018 07:59:08 -0400 Received: from localhost by e17.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 18 Sep 2018 07:59:07 -0400 Received: from b01cxnp22034.gho.pok.ibm.com (9.57.198.24) by e17.ny.us.ibm.com (146.89.104.204) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 18 Sep 2018 07:59:04 -0400 Received: from b01ledav005.gho.pok.ibm.com (b01ledav005.gho.pok.ibm.com [9.57.199.110]) by b01cxnp22034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w8IBx3xM5177752 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 18 Sep 2018 11:59:03 GMT Received: from b01ledav005.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 901E8AE063; Tue, 18 Sep 2018 07:57:47 -0400 (EDT) Received: from b01ledav005.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C96E0AE062; Tue, 18 Sep 2018 07:57:44 -0400 (EDT) Received: from skywalker.ibmuc.com (unknown [9.85.75.167]) by b01ledav005.gho.pok.ibm.com (Postfix) with ESMTP; Tue, 18 Sep 2018 07:57:44 -0400 (EDT) From: "Aneesh Kumar K.V" To: akpm@linux-foundation.org, Michal Hocko , Alexey Kardashevskiy , mpe@ellerman.id.au Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, "Aneesh Kumar K.V" Subject: [PATCH V3 2/2] powerpc/mm/iommu: Allow migration of cma allocated pages during mm_iommu_get Date: Tue, 18 Sep 2018 17:28:39 +0530 X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180918115839.22154-1-aneesh.kumar@linux.ibm.com> References: <20180918115839.22154-1-aneesh.kumar@linux.ibm.com> X-TM-AS-GCONF: 00 x-cbid: 18091811-0040-0000-0000-000004719C57 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00009728; HX=3.00000242; KW=3.00000007; PH=3.00000004; SC=3.00000266; SDB=6.01090083; UDB=6.00563121; IPR=6.00870103; MB=3.00023366; MTD=3.00000008; XFM=3.00000015; UTC=2018-09-18 11:59:06 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18091811-0041-0000-0000-00000878D61A Message-Id: <20180918115839.22154-3-aneesh.kumar@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:,, definitions=2018-09-18_04:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1809180122 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Current code doesn't do page migration if the page allocated is a compound page. With HugeTLB migration support, we can end up allocating hugetlb pages from CMA region. Also THP pages can be allocated from CMA region. This patch updates the code to handle compound pages correctly. This use the new helper get_user_pages_cma_migrate. It does one get_user_pages with right count, instead of doing one get_user_pages per page. That avoids reading page table multiple times. The patch also convert the hpas member of mm_iommu_table_group_mem_t to a union. We use the same storage location to store pointers to struct page. We cannot update alll the code path use struct page *, because we access hpas in real mode and we can't do that struct page * to pfn conversion in real mode. Signed-off-by: Aneesh Kumar K.V --- arch/powerpc/mm/mmu_context_iommu.c | 120 ++++++++-------------------- 1 file changed, 35 insertions(+), 85 deletions(-) diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c index c9ee9e23845f..f0d8645872cb 100644 --- a/arch/powerpc/mm/mmu_context_iommu.c +++ b/arch/powerpc/mm/mmu_context_iommu.c @@ -20,6 +20,7 @@ #include #include #include +#include static DEFINE_MUTEX(mem_list_mutex); @@ -30,8 +31,18 @@ struct mm_iommu_table_group_mem_t { atomic64_t mapped; unsigned int pageshift; u64 ua; /* userspace address */ - u64 entries; /* number of entries in hpas[] */ - u64 *hpas; /* vmalloc'ed */ + u64 entries; /* number of entries in hpages[] */ + /* + * in mm_iommu_get we temporarily use this to store + * struct page address. + * + * We need to convert ua to hpa in real mode. Make it + * simpler by storing physicall address. + */ + union { + struct page **hpages; /* vmalloc'ed */ + phys_addr_t *hpas; + }; }; static long mm_iommu_adjust_locked_vm(struct mm_struct *mm, @@ -74,63 +85,14 @@ bool mm_iommu_preregistered(struct mm_struct *mm) } EXPORT_SYMBOL_GPL(mm_iommu_preregistered); -/* - * Taken from alloc_migrate_target with changes to remove CMA allocations - */ -struct page *new_iommu_non_cma_page(struct page *page, unsigned long private) -{ - gfp_t gfp_mask = GFP_USER; - struct page *new_page; - - if (PageCompound(page)) - return NULL; - - if (PageHighMem(page)) - gfp_mask |= __GFP_HIGHMEM; - - /* - * We don't want the allocation to force an OOM if possibe - */ - new_page = alloc_page(gfp_mask | __GFP_NORETRY | __GFP_NOWARN); - return new_page; -} - -static int mm_iommu_move_page_from_cma(struct page *page) -{ - int ret = 0; - LIST_HEAD(cma_migrate_pages); - - /* Ignore huge pages for now */ - if (PageCompound(page)) - return -EBUSY; - - lru_add_drain(); - ret = isolate_lru_page(page); - if (ret) - return ret; - - list_add(&page->lru, &cma_migrate_pages); - put_page(page); /* Drop the gup reference */ - - ret = migrate_pages(&cma_migrate_pages, new_iommu_non_cma_page, - NULL, 0, MIGRATE_SYNC, MR_CONTIG_RANGE); - if (ret) { - if (!list_empty(&cma_migrate_pages)) - putback_movable_pages(&cma_migrate_pages); - } - - return 0; -} - long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries, struct mm_iommu_table_group_mem_t **pmem) { struct mm_iommu_table_group_mem_t *mem; - long i, j, ret = 0, locked_entries = 0; + long i, ret = 0, locked_entries = 0; unsigned int pageshift; unsigned long flags; unsigned long cur_ua; - struct page *page = NULL; mutex_lock(&mem_list_mutex); @@ -177,41 +139,24 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries, goto unlock_exit; } + ret = get_user_pages_cma_migrate(ua, entries, 1, mem->hpages); + if (ret != entries) { + /* free the reference taken */ + for (i = 0; i < ret; i++) + put_page(mem->hpages[i]); + + vfree(mem->hpas); + kfree(mem); + ret = -EFAULT; + goto unlock_exit; + } else + ret = 0; + + pageshift = PAGE_SHIFT; for (i = 0; i < entries; ++i) { + struct page *page = mem->hpages[i]; cur_ua = ua + (i << PAGE_SHIFT); - if (1 != get_user_pages_fast(cur_ua, - 1/* pages */, 1/* iswrite */, &page)) { - ret = -EFAULT; - for (j = 0; j < i; ++j) - put_page(pfn_to_page(mem->hpas[j] >> - PAGE_SHIFT)); - vfree(mem->hpas); - kfree(mem); - goto unlock_exit; - } - /* - * If we get a page from the CMA zone, since we are going to - * be pinning these entries, we might as well move them out - * of the CMA zone if possible. NOTE: faulting in + migration - * can be expensive. Batching can be considered later - */ - if (is_migrate_cma_page(page)) { - if (mm_iommu_move_page_from_cma(page)) - goto populate; - if (1 != get_user_pages_fast(cur_ua, - 1/* pages */, 1/* iswrite */, - &page)) { - ret = -EFAULT; - for (j = 0; j < i; ++j) - put_page(pfn_to_page(mem->hpas[j] >> - PAGE_SHIFT)); - vfree(mem->hpas); - kfree(mem); - goto unlock_exit; - } - } -populate: - pageshift = PAGE_SHIFT; + if (mem->pageshift > PAGE_SHIFT && PageCompound(page)) { pte_t *pte; struct page *head = compound_head(page); @@ -229,7 +174,12 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries, local_irq_restore(flags); } mem->pageshift = min(mem->pageshift, pageshift); + /* + * We don't need struct page reference any more, switch + * physicall address. + */ mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT; + } atomic64_set(&mem->mapped, 1); -- 2.17.1