Date: Wed, 21 Mar 2018 10:29:32 -0700
From: Matthew Wilcox
To: Yang Shi
Cc: Michal Hocko, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/8] mm: mmap: unmap large mapping by section
Message-ID: <20180321172932.GE4780@bombadil.infradead.org>
References: <1521581486-99134-1-git-send-email-yang.shi@linux.alibaba.com>
 <1521581486-99134-2-git-send-email-yang.shi@linux.alibaba.com>
 <20180321130833.GM23100@dhcp22.suse.cz>

On Wed, Mar 21, 2018 at 09:31:22AM -0700, Yang Shi wrote:
> On 3/21/18 6:08 AM, Michal Hocko wrote:
> > Yes, this definitely sucks. One way to work that around is to split the
> > unmap to two phases. One to drop all the pages. That would only need
> > mmap_sem for read and then tear down the mapping with the mmap_sem for
> > write. This wouldn't help for parallel mmap_sem writers but those really
> > need a different approach (e.g. the range locking).
>
> page fault might sneak in to map a page which has been unmapped before?
>
> range locking should help a lot on manipulating small sections of a large
> mapping in parallel or multiple small mappings. It may not achieve too much
> for single large mapping.

I don't think we need range locking.  What if we do munmap this way:

Take the mmap_sem for write
Find the VMA
  If the VMA is large(*)
    Mark the VMA as deleted
    Drop the mmap_sem
    zap all of the entries
    Take the mmap_sem
  Else
    zap all of the entries
Continue finding VMAs
Drop the mmap_sem

Now we need to change every place that looks up a VMA to check whether it
needs to care that the VMA is deleted (page faults, e.g., will need to
SIGBUS; mmap does not care; munmap will need to wait for the existing
munmap operation to complete), but it gives us the atomicity, at least on
a per-VMA basis.
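
For what it's worth, the first flow can be modelled in a few lines of
user-space C.  This is only a sketch under loud assumptions: toy_sem,
toy_vma, toy_mm, LARGE_VMA_PAGES and every function below are made up for
illustration and are not kernel interfaces.  The model is single-threaded,
so it shows the lock ordering and the deleted check on the fault path, not
real concurrency, and it only handles VMAs that lie wholly inside the range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff; the proposal leaves "large(*)" undefined. */
#define LARGE_VMA_PAGES 1024UL

/* Toy stand-in for mmap_sem: only tracks whether it is held for write. */
struct toy_sem { bool write_held; };

struct toy_vma {
	unsigned long start, end;	/* page range [start, end) */
	unsigned long mapped;		/* pages still mapped */
	bool deleted;			/* marked before an unlocked zap */
	struct toy_vma *next;
};

struct toy_mm {
	struct toy_sem mmap_sem;
	struct toy_vma *vmas;		/* singly linked, caller-owned nodes */
	unsigned long unlocked_zaps;	/* zaps that ran without mmap_sem */
};

static void toy_down_write(struct toy_sem *s)
{
	assert(!s->write_held);
	s->write_held = true;
}

static void toy_up_write(struct toy_sem *s)
{
	assert(s->write_held);
	s->write_held = false;
}

static void toy_zap(struct toy_mm *mm, struct toy_vma *vma)
{
	if (!mm->mmap_sem.write_held)
		mm->unlocked_zaps++;
	vma->mapped = 0;
}

/* Unmap every VMA that lies fully inside [start, end); returns how many. */
static int toy_munmap(struct toy_mm *mm, unsigned long start, unsigned long end)
{
	int count = 0;

	toy_down_write(&mm->mmap_sem);
	for (;;) {
		struct toy_vma **link = &mm->vmas;
		struct toy_vma *vma;

		/* Find the next VMA in range that is not already marked deleted. */
		while ((vma = *link) != NULL &&
		       (vma->deleted || vma->start < start || vma->end > end))
			link = &vma->next;
		if (vma == NULL)
			break;

		if (vma->end - vma->start >= LARGE_VMA_PAGES) {
			vma->deleted = true;
			toy_up_write(&mm->mmap_sem);
			toy_zap(mm, vma);	/* the slow part runs unlocked */
			toy_down_write(&mm->mmap_sem);
			/* The list may have changed while unlocked; find our link again. */
			for (link = &mm->vmas; *link != vma; link = &(*link)->next)
				;
		} else {
			toy_zap(mm, vma);
		}
		*link = vma->next;	/* unlink under the lock */
		vma->next = NULL;
		count++;
	}
	toy_up_write(&mm->mmap_sem);
	return count;
}

enum toy_fault { TOY_FAULT_OK, TOY_FAULT_SIGBUS, TOY_FAULT_SIGSEGV };

/* Fault side: a lookup that lands on a deleted VMA reports SIGBUS. */
static enum toy_fault toy_handle_fault(const struct toy_mm *mm, unsigned long page)
{
	const struct toy_vma *vma;

	for (vma = mm->vmas; vma != NULL; vma = vma->next)
		if (page >= vma->start && page < vma->end)
			return vma->deleted ? TOY_FAULT_SIGBUS : TOY_FAULT_OK;
	return TOY_FAULT_SIGSEGV;
}
```

The re-lookup after retaking the lock is the part that matters: once the
mmap_sem has been dropped, nothing found before the drop can be trusted
except the VMA we marked ourselves.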

We could also do:

Take the mmap_sem for write
Mark all VMAs in the range as deleted & modify any partial VMAs
Drop mmap_sem
zap pages from deleted VMAs

That would give us the same atomicity that we have today.

Deleted VMAs would need a pointer to a completion, so operations that
need to wait can queue themselves up.  I'd recommend we use the low bit
of vm_file and treat it as a pointer to a struct completion if set.
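
The low-bit trick in the last paragraph can be sketched like this.
Assumptions, all mine: tagged_vma, toy_file, toy_completion and the helper
names are hypothetical stand-ins, vm_file is held as a uintptr_t rather
than a struct file pointer, and both pointed-to types are at least 2-byte
aligned so bit 0 is free.  The sketch simply overwrites the file pointer
when the VMA is marked deleted; the proposal does not say where the file
reference would go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_file { int fd; };			/* stand-in for struct file */
struct toy_completion { unsigned int done; };	/* stand-in for struct completion */

/* Bit 0 of vm_file set: the rest of the word points to a completion. */
#define VMA_DELETED_BIT ((uintptr_t)1)

struct tagged_vma {
	uintptr_t vm_file;	/* either a toy_file or a tagged toy_completion */
};

static bool vma_is_deleted(const struct tagged_vma *vma)
{
	return (vma->vm_file & VMA_DELETED_BIT) != 0;
}

/* Returns the file for a live VMA, NULL for a deleted one. */
static struct toy_file *vma_file(const struct tagged_vma *vma)
{
	if (vma_is_deleted(vma))
		return NULL;
	return (struct toy_file *)vma->vm_file;
}

/* Returns the completion to wait on for a deleted VMA, NULL otherwise. */
static struct toy_completion *vma_completion(const struct tagged_vma *vma)
{
	if (!vma_is_deleted(vma))
		return NULL;
	return (struct toy_completion *)(vma->vm_file & ~VMA_DELETED_BIT);
}

static void vma_mark_deleted(struct tagged_vma *vma, struct toy_completion *done)
{
	/* The tag only works if the pointer's own bit 0 is clear. */
	assert(((uintptr_t)done & VMA_DELETED_BIT) == 0);
	vma->vm_file = (uintptr_t)done | VMA_DELETED_BIT;
}
```

Anything that reads vm_file directly would have to go through accessors
like these, which is the cost of not adding a new field to the VMA.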