Date: Tue, 30 Oct 2018 09:16:39 +0100
From: Michal Hocko
To: Naoya Horiguchi
Cc: Andrew Morton, "linux-mm@kvack.org", "xishi.qiuxishi@alibaba-inc.com",
 "zy.zhengyi@alibaba-inc.com", "linux-kernel@vger.kernel.org", Mike Kravetz
Subject: Re: [PATCH v2 0/2] mm: soft-offline: fix race against page allocation
Message-ID: <20181030081639.GW32673@dhcp22.suse.cz>
References: <1531805552-19547-1-git-send-email-n-horiguchi@ah.jp.nec.com>
 <20180815154334.f3eecd1029a153421631413a@linux-foundation.org>
 <20180822013748.GA10343@hori1.linux.bs1.fc.nec.co.jp>
 <20180822080025.GD29735@dhcp22.suse.cz>
 <20181026084636.GY18839@dhcp22.suse.cz>
 <20181030065433.GA1119@hori1.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20181030065433.GA1119@hori1.linux.bs1.fc.nec.co.jp>

On Tue 30-10-18 06:54:33, Naoya Horiguchi wrote:
> On Fri, Oct 26, 2018 at 10:46:36AM +0200, Michal Hocko wrote:
> > On Wed 22-08-18 10:00:25, Michal Hocko wrote:
> > > On Wed 22-08-18 01:37:48, Naoya Horiguchi wrote:
> > > > On Wed, Aug 15, 2018 at 03:43:34PM -0700, Andrew Morton wrote:
> > > > > On Tue, 17 Jul 2018 14:32:30 +0900 Naoya Horiguchi wrote:
> > > > >
> > > > > > I've updated the patchset based on feedback:
> > > > > >
> > > > > > - updated comments (from Andrew),
> > > > > > - moved calling set_hwpoison_free_buddy_page() from mm/migrate.c to mm/memory-failure.c,
> > > > > >   which is necessary to check the return code of set_hwpoison_free_buddy_page(),
> > > > > > - lkp bot reported a build error when only 1/2 is applied.
> > > > > >
> > > > > >   >    mm/memory-failure.c: In function 'soft_offline_huge_page':
> > > > > >   > >> mm/memory-failure.c:1610:8: error: implicit declaration of function
> > > > > >   >     'set_hwpoison_free_buddy_page'; did you mean 'is_free_buddy_page'?
> > > > > >   >     [-Werror=implicit-function-declaration]
> > > > > >   >         if (set_hwpoison_free_buddy_page(page))
> > > > > >   >             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > > > >   >             is_free_buddy_page
> > > > > >   >     cc1: some warnings being treated as errors
> > > > > >
> > > > > >   set_hwpoison_free_buddy_page() is defined in 2/2, so we can't use it
> > > > > >   in 1/2. Simply doing s/set_hwpoison_free_buddy_page/!TestSetPageHWPoison/
> > > > > >   will fix this.
> > > > > >
> > > > > > v1: https://lkml.org/lkml/2018/7/12/968
> > > > > >
> > > > >
> > > > > Quite a bit of discussion on these two, but no actual acks or
> > > > > review-by's?
> > > >
> > > > Really sorry for the late response.
> > > > Xishi provided feedback on the previous version, but no final ack/reviewed-by.
> > > > This fix should work on the reported issue, but rewriting soft-offlining
> > > > without the PageHWPoison flag would be the better fix (no actual patch yet.)
> > >
> > > If we can go with the latter then I would obviously prefer that. I cannot
> > > promise to work on the patch though. I can help with reviewing of
> > > course.
> > >
> > > If this is important enough that people are hitting the issue in normal
> > > workloads then sure, let's go with the simple fix and continue on top of
> > > that.
> >
> > Naoya, did you have any chance to look at this or have any plans to look?
> > I am willing to review and help with the overall design but I cannot
> > really promise to work on the code.
>
> I have a draft version of a patch to isolate a page in a buddy-friendly manner
> without the PageHWPoison flag (that was written weeks ago, but I couldn't finish
> because my other project interrupted me ...).
> I'll post it after testing, especially confirming that the hotplug code properly
> resets the isolated page.

Thanks a lot Naoya. It is highly appreciated!
-- 
Michal Hocko
SUSE Labs