From: Pavel Tatashin
To: steven.sistare@oracle.com, daniel.m.jordan@oracle.com,
    akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
    tglx@linutronix.de, mhocko@suse.com, linux-mm@kvack.org,
    mgorman@techsingularity.net, mingo@kernel.org, peterz@infradead.org,
    rostedt@goodmis.org, fengguang.wu@intel.com, dennisszhou@gmail.com
Subject: [PATCH v5] mm: don't allow deferred pages with NEED_PER_CPU_KM
Date: Tue, 15 May 2018 13:51:24 -0400
Message-Id: <20180515175124.1770-1-pasha.tatashin@oracle.com>
X-Mailer: git-send-email 2.17.0

It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS). This is because, when deferred struct
pages are used, struct pages for all the allocated memory are initialized
only in mm_init().

My recent fix exposed this problem because it greatly reduced the number
of pages that are initialized before mm_init(), but the problem existed
even before my fix, as Fengguang Wu found.

Below is a more detailed explanation of the problem.

We initialize struct pages in four places:

1. Early in boot a small set of struct pages is initialized to fill the
   first section, and lower zones.
2. During mm_init() we initialize struct pages for all the memory that is
   allocated, i.e. reserved in memblock.
3. Using on-demand logic when pages are allocated after the mm_init()
   call (when memblock is finished).
4. After smp_init(), when the rest of the free deferred pages are
   initialized.

The problem occurs if we try to do a va to phys translation of memory
between steps 1 and 2. Because we have not yet initialized struct pages
for all the reserved pages, it is inherently unsafe to do va to phys if
the translation itself requires access to struct page, as in the case of
this combination:

  CONFIG_SPARSEMEM && !CONFIG_SPARSEMEM_VMEMMAP

The following path exposes the problem:

  start_kernel()
   trap_init()
    setup_cpu_entry_areas()
     setup_cpu_entry_area(cpu)
      get_cpu_gdt_paddr(cpu)
       per_cpu_ptr_to_phys(addr)
        pcpu_addr_to_page(addr)
         virt_to_page(addr)
          pfn_to_page(__pa(addr) >> PAGE_SHIFT)

We disable this path by not allowing NEED_PER_CPU_KM with the deferred
struct pages feature.

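To illustrate why an uninitialized struct page breaks the translation,
here is a minimal user-space sketch. It is a toy model, not kernel code:
every toy_* name, the section sizes, and the slot layout are assumptions
made up for illustration. It only mimics the idea that, without vmemmap,
the section number is read back from page->flags, so a page whose flags
are still zero is attributed to section 0.

```c
#include <assert.h>

/* Toy model only: all toy_* names and sizes are hypothetical. */
#define TOY_PAGES_PER_SECTION 4L
#define TOY_NR_SECTIONS       3L
#define TOY_NR_PAGES          (TOY_PAGES_PER_SECTION * TOY_NR_SECTIONS)

struct toy_page {
	unsigned long flags;	/* section number, valid only after init */
};

/* Backing storage for the per-section page arrays. */
static struct toy_page toy_backing[TOY_NR_PAGES];

/* Slot of each section's page array; deliberately not in pfn order. */
static const long toy_section_slot[TOY_NR_SECTIONS] = { 2, 0, 1 };

/* pfn -> page uses only the section table, never the page contents. */
static struct toy_page *toy_pfn_to_page(long pfn)
{
	long sec, off;

	assert(pfn >= 0 && pfn < TOY_NR_PAGES);
	sec = pfn / TOY_PAGES_PER_SECTION;
	off = pfn % TOY_PAGES_PER_SECTION;
	return &toy_backing[toy_section_slot[sec] * TOY_PAGES_PER_SECTION + off];
}

/* page -> pfn trusts the section number stored in page->flags. */
static long toy_page_to_pfn(const struct toy_page *page)
{
	long sec = (long)page->flags;
	long off = (page - toy_backing) -
		   toy_section_slot[sec] * TOY_PAGES_PER_SECTION;

	return sec * TOY_PAGES_PER_SECTION + off;
}

/* Stand-in for struct page initialization: record the section number. */
static void toy_init_page(long pfn)
{
	toy_pfn_to_page(pfn)->flags =
		(unsigned long)(pfn / TOY_PAGES_PER_SECTION);
}
```

In this model, calling toy_page_to_pfn() on a page outside section 0
before toy_init_page() has run for it yields a wrong pfn, and the round
trip only becomes correct after initialization. That mirrors the window
between steps 1 and 2 described above.
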
The problems are discussed in these threads:
http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com

Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin
---
 mm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index d5004d82a1d6..e14c01513bfd 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -636,6 +636,7 @@ config DEFERRED_STRUCT_PAGE_INIT
 	default n
 	depends on NO_BOOTMEM
 	depends on !FLATMEM
+	depends on !NEED_PER_CPU_KM
 	help
 	  Ordinarily all struct pages are initialised during early boot in a
 	  single thread. On very large machines this can take a considerable
-- 
2.17.0