From: Huang Ying
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Huang, Ying", Zi Yan, Yang Shi, Baolin Wang, Oscar Salvador, Matthew Wilcox, Bharata B Rao, Alistair Popple, haoxin
Subject: [PATCH 0/8] migrate_pages(): batch TLB flushing
Date: Tue, 27 Dec 2022 08:28:51 +0800
Message-Id: <20221227002859.27740-1-ying.huang@intel.com>

From: "Huang, Ying"

Currently, migrate_pages() migrates folios one by one, as in the following pseudo-code:

  for each folio
    unmap
    flush TLB
    copy
    restore map

If multiple folios are passed to migrate_pages(), there are opportunities to batch the TLB flushing and copying.  That is, we can change the code to something like the following:

  for each folio
    unmap
  for each folio
    flush TLB
  for each folio
    copy
  for each folio
    restore map

The total number of TLB flushing IPIs can be reduced considerably, and we may use a hardware accelerator such as DSA to accelerate the folio copying.

So in this patchset, we refactor the migrate_pages() implementation and implement batched TLB flushing.  Based on this, hardware-accelerated folio copying can be implemented later.
If too many folios are passed to migrate_pages(), the naive batched implementation may unmap too many folios at the same time.  That increases the chance that a task has to wait for the migrated folios to be mapped again, so latency may suffer.  To deal with this issue, the maximum number of folios unmapped in one batch is restricted to no more than HPAGE_PMD_NR, in units of base pages.  That is, the influence is at the same level as that of THP migration.

We used the following test to measure the performance impact of the patchset.  On a 2-socket Intel server:

- Run the pmbench memory-accessing benchmark.
- Run `migratepages` to migrate the pages of pmbench between node 0 and node 1 back and forth.

With the patchset, the number of TLB flushing IPIs is reduced by 99.1% during the test, and the number of pages migrated successfully per second increases by 291.7%.

This patchset is based on mm-unstable.

Changes from RFC to v1:

- Rebased on v6.2-rc1.
- Fixed the deadlock issue caused by locking multiple pages synchronously, per Alistair's comments.  Thanks!
- Fixed the autonumabench panic, per Rao's comments and fix.  Thanks!
- Other minor fixes per comments.  Thanks!

Best Regards,
Huang, Ying