Date: Tue, 19 Jun 2018 12:02:18 +0200
From: Peter Zijlstra
To: Yang Shi
Cc: mhocko@kernel.org, willy@infradead.org, ldufour@linux.vnet.ibm.com,
	akpm@linux-foundation.org, mingo@redhat.com, acme@kernel.org,
	alexander.shishkin@linux.intel.com, jolsa@redhat.com,
	namhyung@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC v2 PATCH 2/2] mm: mmap: zap pages with read mmap_sem for large mapping
Message-ID: <20180619100218.GN2458@hirez.programming.kicks-ass.net>
References: <1529364856-49589-1-git-send-email-yang.shi@linux.alibaba.com>
	<1529364856-49589-3-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1529364856-49589-3-git-send-email-yang.shi@linux.alibaba.com>

On Tue, Jun 19, 2018 at 07:34:16AM +0800, Yang Shi wrote:

> diff --git a/mm/mmap.c b/mm/mmap.c
> index fc41c05..e84f80c 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2686,6 +2686,141 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
>  	return __split_vma(mm, vma, addr, new_below);
>  }
>  
> +/* Consider PUD size or 1GB mapping as large mapping */
> +#ifdef HPAGE_PUD_SIZE
> +#define LARGE_MAP_THRESH	HPAGE_PUD_SIZE
> +#else
> +#define LARGE_MAP_THRESH	(1 * 1024 * 1024 * 1024)
> +#endif
> +
> +/* Unmap large mapping early with acquiring read mmap_sem */
> +static int do_munmap_zap_early(struct mm_struct *mm, unsigned long start,
> +			       size_t len, struct list_head *uf)
> +{
> +	unsigned long end = 0;
> +	struct vm_area_struct *vma = NULL, *prev, *last, *tmp;
> +	bool success = false;
> +	int ret = 0;
> +
> +	if ((offset_in_page(start)) || start > TASK_SIZE || len > TASK_SIZE - start)
> +		return -EINVAL;
> +
> +	len = (PAGE_ALIGN(len));
> +	if (len == 0)
> +		return -EINVAL;
> +
> +	/* Just deal with uf in regular path */
> +	if (unlikely(uf))
> +		goto regular_path;
> +
> +	if (len >= LARGE_MAP_THRESH) {
> +		down_read(&mm->mmap_sem);
> +		vma = find_vma(mm, start);
> +		if (!vma) {
> +			up_read(&mm->mmap_sem);
> +			return 0;
> +		}
> +
> +		prev = vma->vm_prev;
> +
> +		end = start + len;
> +		if (vma->vm_start > end) {
> +			up_read(&mm->mmap_sem);
> +			return 0;
> +		}
> +
> +		if (start > vma->vm_start) {
> +			int error;
> +
> +			if (end < vma->vm_end &&
> +			    mm->map_count > sysctl_max_map_count) {
> +				up_read(&mm->mmap_sem);
> +				return -ENOMEM;
> +			}
> +
> +			error = __split_vma(mm, vma, start, 0);
> +			if (error) {
> +				up_read(&mm->mmap_sem);
> +				return error;
> +			}
> +			prev = vma;
> +		}
> +
> +		last = find_vma(mm, end);
> +		if (last && end > last->vm_start) {
> +			int error = __split_vma(mm, last, end, 1);
> +
> +			if (error) {
> +				up_read(&mm->mmap_sem);
> +				return error;
> +			}
> +		}
> +		vma = prev ? prev->vm_next : mm->mmap;

Hold up, two things: you having to copy most of do_munmap() didn't seem
to suggest a helper function? And second, since when are we allowed to
split VMAs under a read lock?
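
Both objections can be illustrated with a small user-space sketch. This is
not kernel code and not a proposed fix: struct range, struct space,
enum lock_mode, split_at() and isolate_range() are hypothetical names made up
for the illustration, and the "lock" is only a single-threaded mode flag
standing in for mmap_sem. The sketch assumes that splitting a range rewrites
list links, so it is refused unless the lock is held for writing, and it puts
the split-both-edges logic into one helper that any caller could share instead
of copying it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the mmap_sem state; tracks mode only, no real locking. */
enum lock_mode { UNLOCKED, HELD_READ, HELD_WRITE };

/* Toy stand-in for a VMA: a half-open range [start, end) in a sorted list. */
struct range {
	unsigned long start, end;
	struct range *next;
};

/* Toy stand-in for mm_struct. */
struct space {
	enum lock_mode mode;
	struct range *head;
};

/*
 * Split the range containing addr so that addr becomes a boundary.
 * This rewrites list links that readers may be walking, so it refuses
 * to run unless the lock is held for writing.
 */
static int split_at(struct space *s, unsigned long addr)
{
	struct range *r, *tail;

	if (s->mode != HELD_WRITE)
		return -EPERM;

	for (r = s->head; r; r = r->next) {
		if (addr <= r->start || addr >= r->end)
			continue;
		assert(r->start < addr && addr < r->end);
		tail = malloc(sizeof(*tail));
		if (!tail)
			return -ENOMEM;
		tail->start = addr;
		tail->end = r->end;
		tail->next = r->next;
		r->end = addr;
		r->next = tail;
		return 0;
	}
	return 0;	/* addr is already a boundary or falls in a hole */
}

/* Shared helper: isolate [start, end) by splitting at both edges. */
static int isolate_range(struct space *s, unsigned long start,
			 unsigned long end)
{
	int err = split_at(s, start);

	return err ? err : split_at(s, end);
}
```

With the mode set to HELD_READ, isolate_range() returns -EPERM and leaves the
list untouched; with HELD_WRITE it carves [start, end) out as its own range.
In the kernel, the rwsem API also provides downgrade_write(), which lets a
caller do the structural work under the write lock and then continue under
the read lock; whether that suits this patch is a separate question.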