Date: Tue, 29 Jan 2019 14:14:47 +0100
From: Gerald Schaefer
To: Michal Hocko
Cc: Mikhail Zaslonko, Mikhail Gavrilov, Andrew Morton, Pavel Tatashin,
    schwidefsky@de.ibm.com, heiko.carstens@de.ibm.com, LKML
Subject: Re: [PATCH 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.
In-Reply-To: <20190128144506.15603-1-mhocko@kernel.org>
References: <20190128144506.15603-1-mhocko@kernel.org>
Message-Id: <20190129141447.34aa9d0c@thinkpad>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 28 Jan 2019 15:45:04 +0100
Michal Hocko wrote:

> Hi,
> Mikhail has posted fixes for the two bugs quite some time ago [1]. I
> have pushed back on those fixes because I believed that it is much
> better to plug the problem at the initialization time rather than play
> whack-a-mole all over the hotplug code and find all the places which
> expect the full memory section to be initialized. We have ended up with
> 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full
> memory section") merged and cause a regression [2][3]. The reason is
> that there might be memory layouts when two NUMA nodes share the same
> memory section so the merged fix is simply incorrect.
>
> In order to plug this hole we really have to be zone range aware in
> those handlers. I have split up the original patch into two. One is
> unchanged (patch 2) and I took a different approach for `removable'
> crash.
> It would be great if Mikhail could test it still works for his
> memory layout.
>
> [1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
> [3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz

I verified that both patches fix the issues we had with valid_zones
(with mem=2050M) and removable (with mem=3075M).

However, the call trace in the description of your patch 1 is wrong.
You basically have the same call trace for test_pages_in_a_zone in
both patches. The "removable" patch should have the call trace for
is_mem_section_removable from Mikhail's original patches:

 CONFIG_DEBUG_VM_PGFLAGS=y
 kernel parameter mem=3075M
 --------------------------
 page:000003d08300c000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
 ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
  [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
  [<00000000008cf9c4>] dev_attr_show+0x34/0x70
  [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
  [<00000000003e4194>] seq_read+0x204/0x480
  [<00000000003b53ea>] __vfs_read+0x32/0x178
  [<00000000003b55b2>] vfs_read+0x82/0x138
  [<00000000003b5be2>] ksys_read+0x5a/0xb0
  [<0000000000b86ba0>] system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
  [<000000000038596c>] is_mem_section_removable+0xb4/0x190

 Kernel panic - not syncing: Fatal exception: panic_on_oops
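
As a rough illustration of why mem=2050M and mem=3075M are useful reproducers, the sketch below computes how much of the last memory section is actually backed by memory. It assumes a 256 MiB memory section size, which is not stated anywhere in this thread, and the helper name partial_section is made up for the example. Under that assumption both values end a few MiB into a section, which would leave the rest of that section's struct pages uninitialized for handlers that walk the full section.

```python
MIB = 1 << 20
# Assumption: 256 MiB memory sections (not stated in this thread).
SECTION_SIZE = 256 * MIB


def partial_section(mem_bytes, section_size=SECTION_SIZE):
    """Return (section_index, bytes_present) for a trailing, partially
    populated section, or None if memory ends on a section boundary."""
    remainder = mem_bytes % section_size
    if remainder == 0:
        return None
    return mem_bytes // section_size, remainder


if __name__ == "__main__":
    # The two mem= values used as reproducers in this thread.
    for mem_mib in (2050, 3075):
        idx, present = partial_section(mem_mib * MIB)
        print(f"mem={mem_mib}M: section {idx} holds only "
              f"{present // MIB} MiB of {SECTION_SIZE // MIB} MiB")
```

With a different section size the indices change, but any mem= value that is not a multiple of the section size leaves a partially populated last section in this simple model.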