From: Prathu Baronia
To: linux-kernel@vger.kernel.org
Cc: chintan.pandya@oneplus.com, Prathu Baronia, Catalin Marinas,
    Will Deacon, Vincenzo Frascino, "glider@google.com",
    Anshuman Khandual, Andrew Morton, Andrey Konovalov,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH 1/1] mm: Optimizing hugepage zeroing in arm64
Date: Thu, 21 Jan 2021 22:21:51 +0530
Message-Id: <20210121165153.17828-2-prathu.baronia@oneplus.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210121165153.17828-1-prathu.baronia@oneplus.com>
References: <20210121165153.17828-1-prathu.baronia@oneplus.com>
X-Mailing-List: linux-kernel@vger.kernel.org

In !HIGHMEM cases, especially on 64-bit architectures, we don't need a
temporary mapping of pages. Hence, k(map|unmap)_atomic() acts as nothing
more than multiple barrier() calls. For example, for a 2MB hugepage in
clear_huge_page() these are called 512 times, i.e. to map and unmap each
subpage, which means 2048 barrier calls in total. This called for
optimization.
Simply getting the VADDR from the page does the job for us.

We profiled clear_huge_page() using ftrace and observed an improvement
of 62%.

Setup:-
The data below was collected on Qualcomm's SM7250 SoC with THP enabled
(kernel v4.19.113), with only CPU-0 (Cortex-A55) and CPU-7 (Cortex-A76)
switched on and set to max frequency, and with DDR set to the perf
governor.

FTRACE Data:-

Base data:-
Number of iterations: 48
Mean of allocation time: 349.5 us
std deviation: 74.5 us

v1 data:-
Number of iterations: 48
Mean of allocation time: 131 us
std deviation: 32.7 us

The following simple userspace experiment, which allocates 100MB (BUF_SZ)
of pages and writes to each of them, gave us good insight: we observed an
improvement of 42% in allocation and writing timings.
-------------------------------------------------------------
Test code snippet
-------------------------------------------------------------
clock_start();
buf = malloc(BUF_SZ); /* Allocate 100 MB of memory */

for(i=0; i < BUF_SZ_PAGES; i++)
{
	*((int *)(buf + (i*PAGE_SIZE))) = 1;
}
clock_end();
-------------------------------------------------------------

Malloc test timings for 100MB anon allocation:-

Base data:-
Number of iterations: 100
Mean of allocation time: 31831 us
std deviation: 4286 us

v1 data:-
Number of iterations: 100
Mean of allocation time: 18193 us
std deviation: 4915 us

Reported-by: Chintan Pandya
Signed-off-by: Prathu Baronia
---
 arch/arm64/include/asm/page.h | 3 +++
 arch/arm64/mm/copypage.c      | 8 ++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 012cffc574e8..8f9d005a11bb 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -35,6 +35,9 @@ void copy_highpage(struct page *to, struct page *from);
 #define clear_user_page(page, vaddr, pg)	clear_page(page)
 #define copy_user_page(to, from, vaddr, pg)	copy_page(to, from)
 
+#define clear_user_highpage clear_user_highpage
+void clear_user_highpage(struct page *page, unsigned long vaddr);
+
 typedef struct page *pgtable_t;
 
 extern int pfn_valid(unsigned long);
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index b5447e53cd73..7f5943c6fc12 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -44,3 +44,11 @@ void copy_user_highpage(struct page *to, struct page *from,
 	flush_dcache_page(to);
 }
 EXPORT_SYMBOL_GPL(copy_user_highpage);
+
+inline void clear_user_highpage(struct page *page, unsigned long vaddr)
+{
+	void *addr = page_address(page);
+
+	clear_user_page(addr, vaddr, page);
+}
+EXPORT_SYMBOL_GPL(clear_user_highpage);
-- 
2.17.1
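
For anyone wanting to repeat the userspace experiment from the commit
message, the snippet there could be fleshed out roughly as below. This is
only a sketch under assumptions, not the program that produced the numbers
above: the helper names now_us() and time_alloc_and_touch() are
hypothetical, CLOCK_MONOTONIC stands in for the undefined
clock_start()/clock_end(), and the page size is read with sysconf()
instead of a PAGE_SIZE constant. Whether the buffer is actually backed by
hugepages depends on the THP settings and the allocator in use.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define BUF_SZ (100UL * 1024 * 1024) /* 100 MB, as in the commit message */

/* Monotonic clock reading in microseconds (assumed stand-in for clock_start/clock_end). */
int64_t now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/*
 * Allocate buf_sz bytes and write one int at the start of every page so
 * that each page is faulted in. Returns the elapsed time in microseconds,
 * or -1 if the allocation fails.
 */
int64_t time_alloc_and_touch(size_t buf_sz)
{
	long page_size = sysconf(_SC_PAGESIZE);
	size_t nr_pages;
	int64_t start, elapsed;
	char *buf;

	assert(page_size > 0);
	nr_pages = buf_sz / (size_t)page_size;

	start = now_us();
	buf = malloc(buf_sz);
	if (!buf)
		return -1;

	/* volatile keeps the compiler from dropping the stores. */
	for (size_t i = 0; i < nr_pages; i++)
		*(volatile int *)(buf + i * (size_t)page_size) = 1;

	elapsed = now_us() - start;
	free(buf);
	return elapsed;
}
```

Calling time_alloc_and_touch(BUF_SZ) in a loop from main() and averaging
the results would give a figure comparable in spirit to the "Malloc test
timings" above, though absolute values will differ by platform.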