From: Bin Yang
To: tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com, x86@kernel.org, bin.yang@intel.com, linux-kernel@vger.kernel.org
Subject: [PATCH] x86/mm: fix cpu stuck issue in __change_page_attr_set_clr
Date: Thu, 28 Jun 2018 10:05:40 +0000
Message-Id: <1530180340-18593-1-git-send-email-bin.yang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This issue can be easily triggered by the free_initmem() function on x86_64 CPUs.
When changing page attributes, __change_page_attr_set_clr() calls
__change_page_attr() for every 4K page, and try_preserve_large_page() is
called to check whether the large page needs to be split.

If the CPU supports "pdpe1gb", the kernel will try to use 1G large pages.
In the worst case, try_preserve_large_page() has to check every 4K page in
the 1G range. If __change_page_attr_set_clr() needs to change many 4K
pages, the CPU will be stuck for a long time.

This patch caches the last address that was checked. If the next address
is in the same large page, the cached result is used without a full range
check.

Signed-off-by: Bin Yang
---
 arch/x86/mm/pageattr.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 3bded76e..b9241ac 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -552,16 +552,20 @@ static int
 try_preserve_large_page(pte_t *kpte, unsigned long address,
 			struct cpa_data *cpa)
 {
+	static unsigned long address_cache;
+	static unsigned long do_split_cache = 1;
 	unsigned long nextpage_addr, numpages, pmask, psize, addr, pfn, old_pfn;
 	pte_t new_pte, old_pte, *tmp;
 	pgprot_t old_prot, new_prot, req_prot;
 	int i, do_split = 1;
 	enum pg_level level;
 
-	if (cpa->force_split)
+	spin_lock(&pgd_lock);
+	if (cpa->force_split) {
+		do_split_cache = 1;
 		return 1;
+	}
 
-	spin_lock(&pgd_lock);
 	/*
 	 * Check for races, another CPU might have split this page
 	 * up already:
@@ -627,13 +631,25 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 
 	new_prot = static_protections(req_prot, address, pfn);
 
+	addr = address & pmask;
+	pfn = old_pfn;
+	/*
+	 * If an address in same range had been checked just now, re-use the
+	 * cache value without full range check. In the worst case, it needs to
+	 * check every 4K page in 1G range, which causes cpu stuck for long
+	 * time.
+	 */
+	if (!do_split_cache &&
+	    address_cache >= addr && address_cache < nextpage_addr &&
+	    pgprot_val(new_prot) == pgprot_val(old_prot)) {
+		do_split = do_split_cache;
+		goto out_unlock;
+	}
 	/*
 	 * We need to check the full range, whether
 	 * static_protection() requires a different pgprot for one of
 	 * the pages in the range we try to preserve:
 	 */
-	addr = address & pmask;
-	pfn = old_pfn;
 	for (i = 0; i < (psize >> PAGE_SHIFT); i++, addr += PAGE_SIZE, pfn++) {
 		pgprot_t chk_prot = static_protections(req_prot, addr, pfn);
 
@@ -670,6 +686,8 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	}
 
 out_unlock:
+	address_cache = address;
+	do_split_cache = do_split;
 	spin_unlock(&pgd_lock);
 
 	return do_split;
-- 
2.7.4
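
For readers who want to see the effect of the cache outside the kernel, here is a rough userspace sketch of the idea in the changelog. It is not kernel code: the MODEL_* constants, page_needs_other_prot(), scan_large_page(), must_split_cached() and demo_full_scans() are all hypothetical names made up for illustration, the stand-in protection check always answers "no change needed", and the pgd_lock locking and the new_prot/old_prot comparison from the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one 1G large page made up of 4K pages. */
#define MODEL_PAGE_SIZE  (1UL << 12)
#define MODEL_LARGE_SIZE (1UL << 30)
#define MODEL_LARGE_MASK (~(MODEL_LARGE_SIZE - 1))

static unsigned long full_scans;    /* number of full-range checks run */
static unsigned long pages_checked; /* number of 4K pages inspected */

/*
 * Stand-in for static_protections() (assumption): in this model no 4K
 * page ever requires a different protection.
 */
static bool page_needs_other_prot(unsigned long addr)
{
	(void)addr;
	pages_checked++;
	return false;
}

/* The expensive path: walk every 4K page of the enclosing 1G range. */
static bool scan_large_page(unsigned long address)
{
	unsigned long addr = address & MODEL_LARGE_MASK;
	unsigned long end = addr + MODEL_LARGE_SIZE;

	full_scans++;
	for (; addr < end; addr += MODEL_PAGE_SIZE)
		if (page_needs_other_prot(addr))
			return true;
	return false;
}

/*
 * Cached variant: if the previous call found "no split needed" for an
 * address inside the same large page, reuse that answer instead of
 * rescanning. The cache lives in function-local statics, as in the patch.
 */
static bool must_split_cached(unsigned long address)
{
	static unsigned long address_cache;
	static bool do_split_cache = true;
	unsigned long start = address & MODEL_LARGE_MASK;
	bool do_split;

	if (!do_split_cache &&
	    address_cache >= start &&
	    address_cache < start + MODEL_LARGE_SIZE)
		do_split = false;
	else
		do_split = scan_large_page(address);

	address_cache = address;
	do_split_cache = do_split;
	return do_split;
}

/* Change npages consecutive 4K pages inside one 1G-aligned range. */
static unsigned long demo_full_scans(unsigned long npages)
{
	unsigned long base = MODEL_LARGE_SIZE; /* 1G-aligned start address */
	unsigned long i;

	for (i = 0; i < npages; i++)
		must_split_cached(base + i * MODEL_PAGE_SIZE);
	return full_scans;
}
```

In this model, calling demo_full_scans(1024) walks the 1G range once for the first page and answers the remaining 1023 calls from the cache. Without the cache each call would walk all 262144 4K pages again.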