From: Jann Horn
Date: Fri, 12 Oct 2018 05:21:32 +0200
Subject: Re: [PATCH] mm: Speed up mremap on large regions
To: joel@joelfernandes.org
Cc: kernel list, Linux-MM, kernel-team@android.com, Minchan Kim, Hugh Dickins, lokeshgidra@google.com, Andrew Morton, Greg Kroah-Hartman, Kate Stewart, pombredanne@nexb.com, Thomas Gleixner, Boris Ostrovsky, Juergen Gross, Paolo Bonzini, Radim Krčmář, kvm@vger.kernel.org
In-Reply-To: <20181009201400.168705-1-joel@joelfernandes.org>
List-ID: linux-kernel@vger.kernel.org

+cc xen maintainers and kvm folks

On Fri, Oct 12, 2018 at 4:40 AM Joel Fernandes (Google) wrote:
> Android needs to mremap large regions of memory during memory management
> related operations. The mremap system call can be really slow if THP is
> not enabled. The bottleneck is move_page_tables, which is copying each
> pte at a time, and can be really slow across a large map. Turning on THP
> may not be a viable option, and is not for us. This patch speeds up the
> performance for non-THP systems by copying at the PMD level when possible.
[...]
> +bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
> +		  unsigned long new_addr, unsigned long old_end,
> +		  pmd_t *old_pmd, pmd_t *new_pmd, bool *need_flush)
> +{
[...]
> +	/*
> +	 * We don't have to worry about the ordering of src and dst
> +	 * ptlocks because exclusive mmap_sem prevents deadlock.
> +	 */
> +	old_ptl = pmd_lock(vma->vm_mm, old_pmd);
> +	if (old_ptl) {
> +		pmd_t pmd;
> +
> +		new_ptl = pmd_lockptr(mm, new_pmd);
> +		if (new_ptl != old_ptl)
> +			spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
> +
> +		/* Clear the pmd */
> +		pmd = *old_pmd;
> +		pmd_clear(old_pmd);
> +
> +		VM_BUG_ON(!pmd_none(*new_pmd));
> +
> +		/* Set the new pmd */
> +		set_pmd_at(mm, new_addr, new_pmd, pmd);
> +		if (new_ptl != old_ptl)
> +			spin_unlock(new_ptl);
> +		spin_unlock(old_ptl);

How does this interact with Xen PV? From a quick look at the Xen PV
integration code in xen_alloc_ptpage(), it looks to me as if, in a
config that doesn't use split ptlocks, this is going to temporarily
drop Xen's type count for the page to zero, causing Xen to de-validate
and then re-validate the L1 pagetable; if you first set the new pmd
before clearing the old one, that wouldn't happen.

I don't know how this interacts with shadow paging implementations.

> +		*need_flush = true;
> +		return true;
> +	}
> +	return false;
> +}
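Concretely, the reordering I have in mind would look roughly like this (an untested sketch against the quoted patch, not a verified change; the VM_BUG_ON placement and flush requirements would need checking):

```c
	/* Populate the new pmd entry before tearing down the old one.
	 * Both entries transiently reference the same page table, so
	 * the page's pagetable type count (as tracked by e.g. Xen PV)
	 * never drops to zero in between. */
	pmd = *old_pmd;
	VM_BUG_ON(!pmd_none(*new_pmd));
	set_pmd_at(mm, new_addr, new_pmd, pmd);
	pmd_clear(old_pmd);
```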