Date: Wed, 16 Sep 2020 09:27:02 +0200
From: Oscar Salvador
To: Aristeu Rozanski
Cc: naoya.horiguchi@nec.com, akpm@linux-foundation.org, mhocko@kernel.org, tony.luck@intel.com, cai@lca.pw, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 0/5] HWpoison: further fixes and cleanups
Message-ID: <20200916072658.GA10692@linux>
References: <20200914101559.17103-1-osalvador@suse.de> <20200915212222.GA18315@cathedrallabs.org>
In-Reply-To: <20200915212222.GA18315@cathedrallabs.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 15, 2020 at 05:22:22PM -0400, Aristeu Rozanski wrote:
> Hi Oscar, Naoya,

Hi Aristeu,

thanks for reporting this.

> I've run these tests using mmotm and mmotm with this patchset on top.

Could you please re-run the tests with the below patch applied, and then
attach the logs here?

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 84a7f228af36..d7b6e7724e47 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -67,6 +67,7 @@ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
 static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, bool release)
 {
+	dump_page(page, "page_handle_poison");
 	if (release) {
 		put_page(page);
 		drain_all_pages(page_zone(page));
@@ -77,7 +78,7 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
 		 * Doing this check for free pages is also fine since dissolve_free_huge_page
 		 * returns 0 for non-hugetlb pages as well.
 		 */
-		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page))
+		if (dissolve_free_huge_page(page) || !take_page_off_buddy(page)) {
 			/*
 			 * We could fail to take off the target page from buddy
 			 * for example due to racy page allocaiton, but that's
@@ -85,7 +86,9 @@ static bool page_handle_poison(struct page *page, bool hugepage_or_freepage, boo
 			 * and if someone really want to use it, they should
 			 * take it.
 			 */
+			pr_info("%s: hugepage_or_freepage failed\n", __func__);
 			return false;
+		}
 	}
 
 	SetPageHWPoison(page);
@@ -1858,8 +1861,11 @@ static int __soft_offline_page(struct page *page)
 		if (!ret) {
 			bool release = !huge;
 
-			if (!page_handle_poison(page, true, release))
+			if (!page_handle_poison(page, true, release)) {
+				pr_info("%s: page_handle_poison -EBUSY\n", __func__);
+				dump_page(page, "__soft_offline_page after migrate");
 				ret = -EBUSY;
+			}
 		} else {
 			if (!list_empty(&pagelist))
 				putback_movable_pages(&pagelist);
@@ -1872,6 +1878,7 @@ static int __soft_offline_page(struct page *page)
 	} else {
 		pr_info("soft offline: %#lx: %s isolation failed: %d, page count %d, type %lx (%pGp)\n",
 			pfn, msg_page[huge], ret, page_count(page), page->flags, &page->flags);
+		dump_page(page, "__soft_offline_page isolation failed");
 		ret = -EBUSY;
 	}
 	return ret;
@@ -1882,8 +1889,11 @@ static int soft_offline_in_use_page(struct page *page)
 	struct page *hpage = compound_head(page);
 
 	if (!PageHuge(page) && PageTransHuge(hpage))
-		if (try_to_split_thp_page(page, "soft offline") < 0)
+		if (try_to_split_thp_page(page, "soft offline") < 0) {
+			pr_info("%s: try_to_split_thp_page -EBUSY\n", __func__);
+			dump_page(page, "try_to_split_thp_page");
 			return -EBUSY;
+		}
 	return __soft_offline_page(page);
 }
 
@@ -1891,8 +1901,11 @@ static int soft_offline_free_page(struct page *page)
 {
 	int rc = 0;
 
-	if (!page_handle_poison(page, true, false))
+	if (!page_handle_poison(page, true, false)) {
+		pr_info("%s: page_handle_poison -EBUSY\n", __func__);
+		dump_page(page, "soft_offline_free_page");
 		rc = -EBUSY;
+	}
 
 	return rc;
 }

Thanks

-- 
Oscar Salvador
SUSE L3