Date: Tue, 1 Sep 2020 11:37:41 -0700 (PDT)
From: David Rientjes
To: Pavel Tatashin
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org, mhocko@suse.com, linux-mm@kvack.org
Subject: Re: [PATCH] mm/memory_hotplug: drain per-cpu pages again during memory offline
In-Reply-To: <20200901124615.137200-1-pasha.tatashin@soleen.com>
References: <20200901124615.137200-1-pasha.tatashin@soleen.com>
User-Agent: Alpine 2.23 (DEB 453 2020-06-18)
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 1 Sep 2020, Pavel Tatashin wrote:

> There is a race during page offline that can lead to infinite loop:
> a page never ends up on a buddy list and __offline_pages() keeps
> retrying infinitely or until a termination signal is received.
>
> Thread#1 - a new process:
>
> load_elf_binary
>  begin_new_exec
>   exec_mmap
>    mmput
>     exit_mmap
>      tlb_finish_mmu
>       tlb_flush_mmu
>        release_pages
>         free_unref_page_list
>          free_unref_page_prepare
>           set_pcppage_migratetype(page, migratetype);
>             // Set page->index migration type below MIGRATE_PCPTYPES
>
> Thread#2 - hot-removes memory
> __offline_pages
>   start_isolate_page_range
>     set_migratetype_isolate
>       set_pageblock_migratetype(page, MIGRATE_ISOLATE);
>         Set migration type to MIGRATE_ISOLATE-> set
>         drain_all_pages(zone);
>              // drain per-cpu page lists to buddy allocator.
>
> Thread#1 - continue
>  free_unref_page_commit
>    migratetype = get_pcppage_migratetype(page);
>       // get old migration type
>    list_add(&page->lru, &pcp->lists[migratetype]);
>       // add new page to already drained pcp list
>
> Thread#2
> Never drains pcp again, and therefore gets stuck in the loop.
>
> The fix is to try to drain per-cpu lists again after
> check_pages_isolated_cb() fails.
>
> Signed-off-by: Pavel Tatashin
> Cc: stable@vger.kernel.org

Acked-by: David Rientjes