Date: Fri, 2 Oct 2020 08:35:47 +0300
From: "Kirill A. Shutemov"
Shutemov" To: Lokesh Gidra Cc: Kalesh Singh , Suren Baghdasaryan , Minchan Kim , Joel Fernandes , "Cc: Android Kernel" , Catalin Marinas , Will Deacon , Thomas Gleixner , Ingo Molnar , Borislav Petkov , the arch/x86 maintainers , "H. Peter Anvin" , Andrew Morton , Shuah Khan , "Aneesh Kumar K.V" , Kees Cook , Peter Zijlstra , Sami Tolvanen , Masahiro Yamada , Arnd Bergmann , Frederic Weisbecker , Krzysztof Kozlowski , Hassan Naveed , Christian Brauner , Mark Rutland , Mike Rapoport , Gavin Shan , Zhenyu Ye , Jia He , John Hubbard , William Kucharski , Sandipan Das , Ralph Campbell , Mina Almasry , Ram Pai , Dave Hansen , Kamalesh Babulal , Masami Hiramatsu , Brian Geffon , SeongJae Park , linux-kernel , "moderated list:ARM64 PORT (AARCH64 ARCHITECTURE)" , "open list:MEMORY MANAGEMENT" , "open list:KERNEL SELFTEST FRAMEWORK" Subject: Re: [PATCH 0/5] Speed up mremap on large regions Message-ID: <20201002053547.7roe7b4mpamw4uk2@black.fi.intel.com> References: <20200930222130.4175584-1-kaleshsingh@google.com> <20200930223207.5xepuvu6wr6xw5bb@black.fi.intel.com> <20201001122706.jp2zr23a43hfomyg@black.fi.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Oct 01, 2020 at 05:09:02PM -0700, Lokesh Gidra wrote: > On Thu, Oct 1, 2020 at 9:00 AM Kalesh Singh wrote: > > > > On Thu, Oct 1, 2020 at 8:27 AM Kirill A. Shutemov > > wrote: > > > > > > On Wed, Sep 30, 2020 at 03:42:17PM -0700, Lokesh Gidra wrote: > > > > On Wed, Sep 30, 2020 at 3:32 PM Kirill A. Shutemov > > > > wrote: > > > > > > > > > > On Wed, Sep 30, 2020 at 10:21:17PM +0000, Kalesh Singh wrote: > > > > > > mremap time can be optimized by moving entries at the PMD/PUD level if > > > > > > the source and destination addresses are PMD/PUD-aligned and > > > > > > PMD/PUD-sized. Enable moving at the PMD and PUD levels on arm64 and > > > > > > x86. Other architectures where this type of move is supported and known to > > > > > > be safe can also opt-in to these optimizations by enabling HAVE_MOVE_PMD > > > > > > and HAVE_MOVE_PUD. > > > > > > > > > > > > Observed Performance Improvements for remapping a PUD-aligned 1GB-sized > > > > > > region on x86 and arm64: > > > > > > > > > > > > - HAVE_MOVE_PMD is already enabled on x86 : N/A > > > > > > - Enabling HAVE_MOVE_PUD on x86 : ~13x speed up > > > > > > > > > > > > - Enabling HAVE_MOVE_PMD on arm64 : ~ 8x speed up > > > > > > - Enabling HAVE_MOVE_PUD on arm64 : ~19x speed up > > > > > > > > > > > > Altogether, HAVE_MOVE_PMD and HAVE_MOVE_PUD > > > > > > give a total of ~150x speed up on arm64. > > > > > > > > > > Is there a *real* workload that benefit from HAVE_MOVE_PUD? > > > > > > > > > We have a Java garbage collector under development which requires > > > > moving physical pages of multi-gigabyte heap using mremap. During this > > > > move, the application threads have to be paused for correctness. It is > > > > critical to keep this pause as short as possible to avoid jitters > > > > during user interaction. This is where HAVE_MOVE_PUD will greatly > > > > help. > > > > > > Any chance to quantify the effect of mremap() with and without > > > HAVE_MOVE_PUD? > > > > > > I doubt it's a major contributor to the GC pause. I expect you need to > > > move tens of gigs to get sizable effect. And if your GC routinely moves > > > tens of gigs, maybe problem somewhere else? > > > > > > I'm asking for numbers, because increase in complexity comes with cost. 
> > > > >
> > > > > Is there a *real* workload that benefits from HAVE_MOVE_PUD?
> > > > >
> > > > We have a Java garbage collector under development which requires
> > > > moving physical pages of a multi-gigabyte heap using mremap. During
> > > > this move, the application threads have to be paused for
> > > > correctness. It is critical to keep this pause as short as possible
> > > > to avoid jitters during user interaction. This is where
> > > > HAVE_MOVE_PUD will greatly help.
> > >
> > > Any chance to quantify the effect of mremap() with and without
> > > HAVE_MOVE_PUD?
> > >
> > > I doubt it's a major contributor to the GC pause. I expect you need
> > > to move tens of gigs to get a sizable effect. And if your GC
> > > routinely moves tens of gigs, maybe the problem is somewhere else?
> > >
> > > I'm asking for numbers, because an increase in complexity comes with
> > > a cost. If it doesn't provide a substantial benefit to a real
> > > workload, maintaining the code forever doesn't make sense.
>
> mremap is indeed the biggest contributor to the GC pause. It has to
> take place in what is typically known as a 'stop-the-world' pause,
> wherein all application threads are paused. During this pause the GC
> thread flips the GC roots (threads' stacks, globals, etc.), and then
> resumes threads along with concurrent compaction of the heap. This
> GC-root flip differs depending on which compaction algorithm is being
> used.
>
> In our case it involves updating object references in threads' stacks
> and remapping the Java heap to a different location. The threads'
> stacks can be handled in parallel with the mremap. Therefore, the
> dominant factor is indeed the cost of mremap. From patches 2 and 4, it
> is clear that remapping 1GB without this optimization will take ~9ms
> on arm64.
>
> Although this mremap has to happen only once every GC cycle, and the
> typical size is also not going to be more than a GB or two, pausing
> application threads for ~9ms is guaranteed to cause jitters. OTOH,
> with this optimization, mremap is reduced to ~60us, which is a totally
> acceptable pause time.
>
> Unfortunately, the implementation of the new GC algorithm hasn't yet
> reached the point where I can quantify the effect of this
> optimization. But I can confirm that without this optimization the
> new GC will not be approved.

IIUC, the 9ms -> 90us improvement is attributed to the combination of
HAVE_MOVE_PMD and HAVE_MOVE_PUD, right?

I expect HAVE_MOVE_PMD to be reasonable for some workloads, but the
marginal benefit of HAVE_MOVE_PUD is in doubt. Do you see it being
useful for your workload?

--
 Kirill A. Shutemov
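The ~9ms and ~60us figures in the thread come from microbenchmarks along
the lines of the selftests added in patches 2 and 4. A rough way to take
a comparable measurement is sketched below; this is a minimal,
self-contained illustration rather than the selftest itself, and it
assumes a 64-bit kernel with 4K pages, about 3 GiB of free address
space, and roughly 1 GiB of memory to prefault:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <sys/mman.h>

#define REGION (1UL << 30)	/* 1 GiB, the size discussed in the thread */

int main(void)
{
	char *area = mmap(NULL, 3 * REGION, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* PUD-aligned source and destination, as in the earlier sketch. */
	char *src = (char *)(((uintptr_t)area + REGION - 1) & ~(REGION - 1));
	char *dst = src + REGION;

	memset(src, 1, REGION);	/* prefault: the move cost is page tables */

	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	void *p = mremap(src, REGION, REGION,
			 MREMAP_MAYMOVE | MREMAP_FIXED, dst);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	if (p == MAP_FAILED) {
		perror("mremap");
		return 1;
	}

	long long ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
		       (t1.tv_nsec - t0.tv_nsec);
	printf("mremap(1 GiB) took %lld ns\n", ns);
	return 0;
}

On a kernel without HAVE_MOVE_PUD the move falls back to PMD-level (or
PTE-level) copying, which is where the order-of-magnitude difference
described above should show up.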