Date: Wed, 25 Jul 2018 22:03:36 +0200
From: Michal Hocko
To: John Allen
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kamezawa.hiroyu@jp.fujitsu.com, n-horiguchi@ah.jp.nec.com, mgorman@suse.de
Subject: Re: Infinite looping observed in __offline_pages
Message-ID: <20180725200336.GP28386@dhcp22.suse.cz>
References: <20180725181115.hmlyd3tmnu3mn3sf@p50.austin.ibm.com>
In-Reply-To: <20180725181115.hmlyd3tmnu3mn3sf@p50.austin.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 25-07-18 13:11:15, John Allen wrote:
[...]
> Does a failure in do_migrate_range indicate that the range is unmigratable
> and the loop in __offline_pages should terminate and goto failed_removal? Or
> should we allow a certain number of retries before we
> give up on migrating the range?

Unfortunately not. Migration code doesn't distinguish between ephemeral and
permanent failures. We are relying on start_isolate_page_range to tell
us this. So the question is, what kind of page is not migratable and for
what reason. Are you able to add some debugging to give us more
information? The current debugging code in the hotplug/migration sucks...

-- 
Michal Hocko
SUSE Labs
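
To make the retry question in the quoted text concrete, here is a minimal userspace sketch of a bounded-retry loop. It is not kernel code: `migrate_range_stub`, `offline_range_sketch`, and `MAX_MIGRATE_RETRIES` are hypothetical names, and the stub only simulates failures. It also illustrates the limitation raised in the reply: because the migration step reports transient and permanent failures the same way, a retry cap can only guess which kind it saw.

```c
#include <stdio.h>

/* Hypothetical retry cap; the thread does not propose a specific value. */
#define MAX_MIGRATE_RETRIES 5

/*
 * Hypothetical stand-in for one migration attempt. It fails (-1) until
 * *failures_left reaches zero, then succeeds (0). The return value gives
 * the caller no way to tell a transient failure from a permanent one.
 */
int migrate_range_stub(int *failures_left)
{
    if (*failures_left > 0) {
        (*failures_left)--;
        return -1;
    }
    return 0;
}

/*
 * Bounded retry loop: returns 0 as soon as an attempt succeeds, or -1
 * after MAX_MIGRATE_RETRIES failed attempts (roughly the point where the
 * real loop would jump to failed_removal).
 */
int offline_range_sketch(int failures_before_success)
{
    int failures_left = failures_before_success;
    int attempt;

    for (attempt = 1; attempt <= MAX_MIGRATE_RETRIES; attempt++) {
        if (migrate_range_stub(&failures_left) == 0)
            return 0;
        fprintf(stderr, "attempt %d of %d failed\n",
                attempt, MAX_MIGRATE_RETRIES);
    }
    return -1;
}
```

With this cap, a failure that clears within five attempts still succeeds, while a transient failure that takes longer is treated the same as a permanent one. That trade-off is why a retry count alone cannot replace knowing which page failed to migrate and why.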