Subject: Re: [PATCH v1] mm: hwpoison: disable memory error handling on 1GB hugepage
To: Michal Hocko, Naoya Horiguchi
Cc: Andrew Morton, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", Anshuman Khandual, "Aneesh Kumar K.V"
References: <1517207283-15769-1-git-send-email-n-horiguchi@ah.jp.nec.com> <20180129063054.GA5205@hori1.linux.bs1.fc.nec.co.jp> <20180129095425.GA21609@dhcp22.suse.cz>
From: Mike Kravetz
Date: Mon, 29 Jan 2018 10:08:53 -0800
In-Reply-To: <20180129095425.GA21609@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/29/2018 01:54 AM, Michal Hocko wrote:
> On Mon 29-01-18 06:30:55, Naoya Horiguchi wrote:
>> My apology, I forgot to CC to the mailing lists.
>>
>> On Mon, Jan 29, 2018 at 03:28:03PM +0900, Naoya Horiguchi wrote:
>>> Recently the following BUG was reported:
>>>
>>>   Injecting memory failure for pfn 0x3c0000 at process virtual address 0x7fe300000000
>>>   Memory failure: 0x3c0000: recovery action for huge page: Recovered
>>>   BUG: unable to handle kernel paging request at ffff8dfcc0003000
>>>   IP: gup_pgd_range+0x1f0/0xc20
>>>   PGD 17ae72067 P4D 17ae72067 PUD 0
>>>   Oops: 0000 [#1] SMP PTI
>>>   ...
>>>   CPU: 3 PID: 5467 Comm: hugetlb_1gb Not tainted 4.15.0-rc8-mm1-abc+ #3
>>>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
>>>
>>> You can easily reproduce this by calling madvise(MADV_HWPOISON) twice on
>>> a 1GB hugepage. This happens because get_user_pages_fast() is not aware
>>> of a migration entry on pud that was created in the 1st madvise() event.
>
> Do pgd size pages work properly?

Adding Anshuman and Aneesh as they added pgd support for power.  And,
this patch will disable that as well IIUC.

This patch makes sense for x86.  My only concern/question is for other
archs which may have huge page sizes defined which are > MAX_ORDER and
< PUD_SIZE.  These would also be classified as gigantic and impacted by
this patch.  Do these also have the same issue?

--
Mike Kravetz

>>> I think that conversion to pud-aligned migration entry is working,
>>> but other MM code walking over page table isn't prepared for it.
>>> We need some time and effort to make all this work properly, so
>>> this patch avoids the reported bug by just disabling error handling
>>> for 1GB hugepage.
>
> Can we also get some documentation which would describe all requirements
> for HWPoison pages to work properly please?
>
>>> Signed-off-by: Naoya Horiguchi
>
> Acked-by: Michal Hocko
>
> We probably want a backport to stable as well. Although regular process
> cannot get giga pages easily without admin help it is still not nice to
> oops like this.
>
>>> ---
>>>  include/linux/mm.h  | 1 +
>>>  mm/memory-failure.c | 7 +++++++
>>>  2 files changed, 8 insertions(+)
>>>
>>> diff --git v4.15-rc8-mmotm-2018-01-18-16-31/include/linux/mm.h v4.15-rc8-mmotm-2018-01-18-16-31_patched/include/linux/mm.h
>>> index 63f7ba1..166864e 100644
>>> --- v4.15-rc8-mmotm-2018-01-18-16-31/include/linux/mm.h
>>> +++ v4.15-rc8-mmotm-2018-01-18-16-31_patched/include/linux/mm.h
>>> @@ -2607,6 +2607,7 @@ enum mf_action_page_type {
>>>  	MF_MSG_POISONED_HUGE,
>>>  	MF_MSG_HUGE,
>>>  	MF_MSG_FREE_HUGE,
>>> +	MF_MSG_GIGANTIC,
>>>  	MF_MSG_UNMAP_FAILED,
>>>  	MF_MSG_DIRTY_SWAPCACHE,
>>>  	MF_MSG_CLEAN_SWAPCACHE,
>>> diff --git v4.15-rc8-mmotm-2018-01-18-16-31/mm/memory-failure.c v4.15-rc8-mmotm-2018-01-18-16-31_patched/mm/memory-failure.c
>>> index d530ac1..c497588 100644
>>> --- v4.15-rc8-mmotm-2018-01-18-16-31/mm/memory-failure.c
>>> +++ v4.15-rc8-mmotm-2018-01-18-16-31_patched/mm/memory-failure.c
>>> @@ -508,6 +508,7 @@ static const char * const action_page_types[] = {
>>>  	[MF_MSG_POISONED_HUGE]		= "huge page already hardware poisoned",
>>>  	[MF_MSG_HUGE]			= "huge page",
>>>  	[MF_MSG_FREE_HUGE]		= "free huge page",
>>> +	[MF_MSG_GIGANTIC]		= "gigantic page",
>>>  	[MF_MSG_UNMAP_FAILED]		= "unmapping failed page",
>>>  	[MF_MSG_DIRTY_SWAPCACHE]	= "dirty swapcache page",
>>>  	[MF_MSG_CLEAN_SWAPCACHE]	= "clean swapcache page",
>>> @@ -1090,6 +1091,12 @@ static int memory_failure_hugetlb(unsigned long pfn, int trapno, int flags)
>>>  		return 0;
>>>  	}
>>>
>>> +	if (hstate_is_gigantic(page_hstate(head))) {
>>> +		action_result(pfn, MF_MSG_GIGANTIC, MF_IGNORED);
>>> +		res = -EBUSY;
>>> +		goto out;
>>> +	}
>>> +
>>>  	if (!hwpoison_user_mappings(p, pfn, trapno, flags, &head)) {
>>>  		action_result(pfn, MF_MSG_UNMAP_FAILED, MF_IGNORED);
>>>  		res = -EBUSY;
>>> --
>>> 2.7.0
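
To make the "> MAX_ORDER and < PUD_SIZE" question in the reply concrete, here is a small userspace model of the classification, not kernel code. It assumes that hstate_is_gigantic() compares the hstate's page order against MAX_ORDER, which is my understanding of kernels of this era. Every model_* name is hypothetical, and the constants are assumed x86-64-like values with 4 KiB base pages; other architectures and configurations use different ones.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed constants (x86-64-like, 4 KiB base pages); not taken from the patch. */
#define MODEL_PAGE_SHIFT 12u
#define MODEL_MAX_ORDER  11u	/* buddy allocator serves orders 0..10 */
#define MODEL_PUD_SHIFT  30u	/* one PUD entry maps 1 GiB */

/* Hypothetical stand-in for struct hstate: only the page order is modeled. */
struct model_hstate {
	unsigned int order;	/* log2(huge page size / base page size) */
};

/* Build a model hstate from log2 of the huge page size in bytes. */
static inline struct model_hstate model_hstate_for_shift(unsigned int huge_shift)
{
	assert(huge_shift >= MODEL_PAGE_SHIFT);
	struct model_hstate h = { .order = huge_shift - MODEL_PAGE_SHIFT };
	return h;
}

/* Assumed to mirror the kernel helper: gigantic means order >= MAX_ORDER. */
static inline bool model_hstate_is_gigantic(const struct model_hstate *h)
{
	return h->order >= MODEL_MAX_ORDER;
}

/* True only when the huge page is exactly one PUD entry in size. */
static inline bool model_is_pud_sized(const struct model_hstate *h)
{
	return h->order + MODEL_PAGE_SHIFT == MODEL_PUD_SHIFT;
}

/* In this model, the check added by the patch ignores every gigantic hstate. */
static inline bool model_patch_ignores_error(const struct model_hstate *h)
{
	return model_hstate_is_gigantic(h);
}
```

With these assumed constants, a 2 MiB page (shift 21) is handled as before, a 1 GiB page (shift 30) is ignored, and a hypothetical 16 MiB page (shift 24) is also ignored even though it is smaller than PUD_SIZE. That last case is the one the question above is about. Whether any real architecture has such a size with the same underlying bug is not something this model can answer.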