Subject: Re: [PATCH] mm/memory_hotplug: drain per-cpu pages again during memory offline
From: Vlastimil Babka
To: Pavel Tatashin, Michal Hocko
Cc: LKML, Andrew Morton, linux-mm
Date: Wed, 2 Sep 2020 16:55:05 +0200
Message-ID: <74f2341a-7834-3e37-0346-7fbc48d74df3@suse.cz>
References: <20200901124615.137200-1-pasha.tatashin@soleen.com> <20200902140851.GJ4617@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 9/2/20 4:26 PM, Pavel Tatashin wrote:
> On Wed, Sep 2, 2020 at 10:08 AM Michal Hocko wrote:
>>
>> >
>> > Thread#1 - continue
>> >         free_unref_page_commit
>> >           migratetype = get_pcppage_migratetype(page);
>> >              // get old migration type
>> >           list_add(&page->lru, &pcp->lists[migratetype]);
>> >              // add new page to already drained pcp list
>> >
>> > Thread#2
>> > Never drains pcp again, and therefore gets stuck in the loop.
>> >
>> > The fix is to try to drain per-cpu lists again after
>> > check_pages_isolated_cb() fails.
>>
>> But this means that the page is not isolated and so it could be reused
>> for something else. No?
>
> The page is in a movable zone, has zero references, and the section is
> isolated (i.e. set_pageblock_migratetype(page, MIGRATE_ISOLATE);) is
> set. The page should be offlinable, but it is lost in a pcp list as
> that list is never drained again after the first failure to migrate
> all pages in the range.

Yeah. To answer Michal's "it could be reused for something else" - yes,
somebody could allocate it from the pcplist before we do the extra drain.
But then it becomes "visible again" and the loop in __offline_pages()
should catch it by scan_movable_pages() - do_migrate_range(). And this
time the pageblock is already marked as isolated, so the page (freed by
migration) won't end up on the pcplist again.
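
The sequence discussed above can be sketched as a toy, single-threaded
userspace model in C. This is not kernel code: every toy_* name, the
four-page range, the page states, and the bounded retry count are
assumptions made for illustration only (the real loop has no such bound,
which is why it gets stuck). The model replays the interleaving from the
thread: the freeing path samples the migratetype before the pageblock is
isolated, the per-cpu list is drained once, and only then is the page
added to that list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; they only mirror the ideas in the thread. */
enum toy_migratetype { TOY_MOVABLE, TOY_ISOLATE };
enum toy_page_state { TOY_IN_USE, TOY_ON_PCP, TOY_FREE, TOY_FREE_ISOLATED };

#define TOY_NPAGES 4

static enum toy_page_state toy_pages[TOY_NPAGES];
static enum toy_migratetype toy_block_mt;

/* Free path: the destination depends on the migratetype sampled earlier,
 * which may be stale by the time the page is actually queued. */
static void toy_free_page(size_t i, enum toy_migratetype sampled_mt)
{
    toy_pages[i] = (sampled_mt == TOY_ISOLATE) ? TOY_FREE_ISOLATED
                                               : TOY_ON_PCP;
}

/* Drain the per-cpu list: pages go to the free list that matches the
 * pageblock's current migratetype. */
static void toy_drain_pcp(void)
{
    for (size_t i = 0; i < TOY_NPAGES; i++)
        if (toy_pages[i] == TOY_ON_PCP)
            toy_pages[i] = (toy_block_mt == TOY_ISOLATE) ? TOY_FREE_ISOLATED
                                                         : TOY_FREE;
}

/* Stand-in for scan_movable_pages() + do_migrate_range(): every in-use
 * page is migrated away and the old page is freed with the current type. */
static void toy_migrate_range(void)
{
    for (size_t i = 0; i < TOY_NPAGES; i++)
        if (toy_pages[i] == TOY_IN_USE)
            toy_free_page(i, toy_block_mt);
}

/* Stand-in for the "are all pages isolated?" check. */
static bool toy_range_isolated(void)
{
    for (size_t i = 0; i < TOY_NPAGES; i++)
        if (toy_pages[i] != TOY_FREE_ISOLATED)
            return false;
    return true;
}

/* Returns the iteration on which the range became fully isolated,
 * or -1 if it never did within max_iters (toy bound only). */
int toy_offline(bool redrain, bool realloc_from_pcp, int max_iters)
{
    assert(max_iters > 0);

    for (size_t i = 0; i < TOY_NPAGES; i++)
        toy_pages[i] = TOY_IN_USE;
    toy_block_mt = TOY_MOVABLE;

    enum toy_migratetype sampled = toy_block_mt; /* Thread#1: old type */
    toy_block_mt = TOY_ISOLATE;                  /* Thread#2: isolate  */
    toy_drain_pcp();                             /* Thread#2: drain once */
    toy_free_page(0, sampled);                   /* Thread#1: continue */

    for (int iter = 1; iter <= max_iters; iter++) {
        toy_migrate_range();
        if (toy_range_isolated())
            return iter;
        if (realloc_from_pcp)                    /* page gets reused */
            for (size_t i = 0; i < TOY_NPAGES; i++)
                if (toy_pages[i] == TOY_ON_PCP)
                    toy_pages[i] = TOY_IN_USE;
        if (redrain)                             /* the proposed extra drain */
            toy_drain_pcp();
    }
    return -1;
}
```

In this model, toy_offline(false, false, 8) returns -1 because page 0
stays on the per-cpu list, while toy_offline(true, false, 8) returns 2
once the extra drain moves it to the isolated free list. With
toy_offline(true, true, 8) the page is reallocated before the extra
drain, becomes visible to the migration step again, and is freed
straight to the isolated list, so the result is also 2.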