From: Pavel Tatashin
Date: Tue, 13 Mar 2018 20:38:57 -0400
Subject: Re: [PATCH v2 1/2] mm: uninitialized struct page poisoning sanity checking
To: Sasha Levin
Cc: "steven.sistare@oracle.com", "daniel.m.jordan@oracle.com", "akpm@linux-foundation.org", "mgorman@techsingularity.net", "mhocko@suse.com", "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "gregkh@linuxfoundation.org", "vbabka@suse.cz", "bharata@linux.vnet.ibm.com"
In-Reply-To: <20180313234333.j3i43yxeawx5d67x@sasha-lappy>
References: <20180131210300.22963-1-pasha.tatashin@oracle.com> <20180131210300.22963-2-pasha.tatashin@oracle.com> <20180313234333.j3i43yxeawx5d67x@sasha-lappy>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Sasha,

It seems the patch is doing the right thing, and it catches bugs. Here we
access an uninitialized struct page. The question is: why does this happen?

register_mem_sect_under_node(struct memory_block *mem_blk, int nid)
	page_nid = get_nid_for_pfn(pfn);

The node id is stored in the page flags, and since the struct page is
poisoned and the pattern is recognized, the panic is triggered.

Do you have a config file? Also, instructions on how to reproduce it?

Thank you,
Pasha

On Tue, Mar 13, 2018 at 7:43 PM, Sasha Levin wrote:
> On Wed, Jan 31, 2018 at 04:02:59PM -0500, Pavel Tatashin wrote:
>>During boot we poison struct page memory in order to ensure that no one is
>>accessing this memory until the struct pages are initialized in
>>__init_single_page().
>>
>>This patch adds more scrutiny to this checking, by making sure that flags
>>do not equal the poison pattern when they are accessed. The pattern is all
>>ones.
>>
>>Since the node id is also stored in struct page, and may be accessed quite
>>early, we add the enforcement into the page_to_nid() function as well.
>>
>>Signed-off-by: Pavel Tatashin
>>---
>
> Hey Pasha,
>
> This patch is causing the following on boot:
>
> [ 1.253732] BUG: unable to handle kernel paging request at fffffffffffffffe
> [ 1.254000] PGD 2284e19067 P4D 2284e19067 PUD 2284e1b067 PMD 0
> [ 1.254000] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [ 1.254000] Modules linked in:
> [ 1.254000] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc5-next-20180313 #10
> [ 1.254000] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090007 06/02/2017
> [ 1.254000] RIP: 0010:__dump_page (??:?)
> [ 1.254000] RSP: 0000:ffff881c63c17810 EFLAGS: 00010246
> [ 1.254000] RAX: dffffc0000000000 RBX: ffffea0084000000 RCX: 1ffff1038c782f2b
> [ 1.254000] RDX: 1fffffffffffffff RSI: ffffffff9e160640 RDI: ffffea0084000000
> [ 1.254000] RBP: ffff881c63c17c00 R08: ffff8840107e8880 R09: ffffed0802167a4d
> [ 1.254000] R10: 0000000000000001 R11: ffffed0802167a4c R12: 1ffff1038c782f07
> [ 1.254000] R13: ffffea0084000020 R14: fffffffffffffffe R15: ffff881c63c17bd8
> [ 1.254000] FS: 0000000000000000(0000) GS:ffff881c6ac00000(0000) knlGS:0000000000000000
> [ 1.254000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1.254000] CR2: fffffffffffffffe CR3: 0000002284e16000 CR4: 00000000003406e0
> [ 1.254000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 1.254000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 1.254000] Call Trace:
> [ 1.254000] dump_page (/mm/debug.c:80)
> [ 1.254000] get_nid_for_pfn (/./include/linux/mm.h:900 /drivers/base/node.c:396)
> [ 1.254000] register_mem_sect_under_node (/drivers/base/node.c:438)
> [ 1.254000] link_mem_sections (/drivers/base/node.c:517)
> [ 1.254000] topology_init (/./include/linux/nodemask.h:271 /arch/x86/kernel/topology.c:164)
> [ 1.254000] do_one_initcall (/init/main.c:835)
> [ 1.254000] kernel_init_freeable (/init/main.c:901 /init/main.c:909 /init/main.c:927 /init/main.c:1076)
> [ 1.254000] kernel_init (/init/main.c:1004)
> [ 1.254000] ret_from_fork (/arch/x86/entry/entry_64.S:417)
> [ 1.254000] Code: ff a8 01 4c 0f 44 f3 4d 85 f6 0f 84 31 0e 00 00 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 2d 11 00 00 <49> 83 3e ff 0f 84 a9 06 00 00 4d 8d b7 c0 fd ff ff 48 b8 00 00
> All code
> ========
>    0:	ff a8 01 4c 0f 44    	ljmp   *0x440f4c01(%rax)
>    6:	f3 4d 85 f6          	repz test %r14,%r14
>    a:	0f 84 31 0e 00 00    	je     0xe41
>   10:	4c 89 f2             	mov    %r14,%rdx
>   13:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
>   1a:	fc ff df
>   1d:	48 c1 ea 03          	shr    $0x3,%rdx
>   21:	80 3c 02 00          	cmpb   $0x0,(%rdx,%rax,1)
>   25:	0f 85 2d 11 00 00    	jne    0x1158
>   2b:*	49 83 3e ff          	cmpq   $0xffffffffffffffff,(%r14)		<-- trapping instruction
>   2f:	0f 84 a9 06 00 00    	je     0x6de
>   35:	4d 8d b7 c0 fd ff ff 	lea    -0x240(%r15),%r14
>   3c:	48                   	rex.W
>   3d:	b8                   	.byte 0xb8
> 	...
>
> Code starting with the faulting instruction
> ===========================================
>    0:	49 83 3e ff          	cmpq   $0xffffffffffffffff,(%r14)
>    4:	0f 84 a9 06 00 00    	je     0x6b3
>    a:	4d 8d b7 c0 fd ff ff 	lea    -0x240(%r15),%r14
>   11:	48                   	rex.W
>   12:	b8                   	.byte 0xb8
> 	...
> [ 1.254000] RIP: __dump_page+0x1c8/0x13c0 RSP: ffff881c63c17810 (/./include/asm-generic/sections.h:42)
> [ 1.254000] CR2: fffffffffffffffe
> [ 1.254000] ---[ end trace e643dfbc44b562ca ]---
>
> --
>
> Thanks,
> Sasha
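
For readers following the thread, here is a minimal userspace sketch of the check discussed above. It is not the kernel code: `struct fake_page`, the `fake_*` helpers, and the 10-bit node field are all made up for illustration. It assumes only what the thread states (the poison pattern is all ones, and the node id lives in the page flags), plus the assumption that a set low bit in the compound-head word marks a tail page. Under that assumption, a poisoned page yields a head address of all ones minus one, which appears consistent with the faulting address fffffffffffffffe in CR2 and R14.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct page. */
struct fake_page {
    uint64_t flags;         /* in this model the node id sits in the top bits */
    uint64_t compound_head; /* assumed: bit 0 set means "tail page", rest is head + 1 */
};

#define FAKE_POISON_PATTERN UINT64_MAX  /* "The pattern is all ones." */
#define FAKE_NODES_WIDTH    10          /* assumed width of the node id field */
#define FAKE_NODES_PGSHIFT  (64 - FAKE_NODES_WIDTH)

/* Model of boot-time poisoning: fill the whole struct with 0xff bytes. */
static void fake_poison_page(struct fake_page *p)
{
    memset(p, 0xff, sizeof(*p));
}

/* Model of page initialization: clear the struct and record the node id. */
static void fake_init_single_page(struct fake_page *p, unsigned int nid)
{
    memset(p, 0, sizeof(*p));
    p->flags = (uint64_t)nid << FAKE_NODES_PGSHIFT;
}

/* True if the flags word still holds the poison pattern. */
static int fake_page_poisoned(const struct fake_page *p)
{
    return p->flags == FAKE_POISON_PATTERN;
}

/* Returns the node id, or -1 where the kernel check would complain instead. */
static int fake_page_to_nid(const struct fake_page *p)
{
    if (fake_page_poisoned(p))
        return -1;
    return (int)(p->flags >> FAKE_NODES_PGSHIFT);
}

/* Address that a compound-head style lookup would dereference next. */
static uint64_t fake_compound_head_addr(const struct fake_page *p)
{
    if (p->compound_head & 1)
        return p->compound_head - 1; /* poisoned page: 0xff..ff - 1 = 0xff..fe */
    return (uint64_t)(uintptr_t)p;
}
```

Calling `fake_poison_page()` and then `fake_page_to_nid()` returns -1, and `fake_compound_head_addr()` returns 0xfffffffffffffffe. After `fake_init_single_page(&page, 3)` the node id reads back as 3 and the head address is the page itself. If this model is right, the oops comes from the reporting path following a pointer derived from poisoned memory, not from the sanity check itself.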