Date: Thu, 14 Jan 2021 14:38:58 +0900
From: Sergey Senozhatsky
To: Hugh Dickins
Cc: Sergey Senozhatsky, Andrew Morton, "Kirill A. Shutemov",
    Suleiman Souhlal, Matthew Wilcox, Andrea Arcangeli,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: madvise(MADV_REMOVE) deadlocks on shmem THP

On (21/01/13 20:31), Hugh Dickins wrote:
> > We are running into lockups during the memory pressure tests on our
> > boards, which essentially NMI panic them.
> > In short the test case is
> >
> >  - THP shmem
> >      echo advise > /sys/kernel/mm/transparent_hugepage/shmem_enabled
> >
> >  - And a user-space process doing madvise(MADV_HUGEPAGE) on new mappings,
> >    and madvise(MADV_REMOVE) when it wants to remove the page range
> >
> > The problem boils down to the reverse locking chain:
> >     kswapd does
> >
> >         lock_page(page) -> down_read(page->mapping->i_mmap_rwsem)
> >
> >     madvise() process does
> >
> >         down_write(page->mapping->i_mmap_rwsem) -> lock_page(page)
> >
> >
> >
> > CPU0                                                      CPU1
> >
> > kswapd                                                    vfs_fallocate()
> > shrink_node()                                             shmem_fallocate()
> > shrink_active_list()                                      unmap_mapping_range()
> > page_referenced() << lock page:PG_locked >>               unmap_mapping_pages() << down_write(mapping->i_mmap_rwsem) >>
> > rmap_walk_file()                                          zap_page_range_single()
> > down_read(mapping->i_mmap_rwsem) << W-locked on CPU1>>    unmap_page_range()
> > rwsem_down_read_failed()                                  __split_huge_pmd()
> > __rwsem_down_read_failed_common()                         __lock_page() << PG_locked on CPU0 >>
> > schedule()                                                wait_on_page_bit_common()
> >                                                           io_schedule()
>
> Very interesting, Sergey: many thanks for this report.

Thanks for the quick feedback.

> There is no doubt that kswapd is right in its lock ordering:
> __split_huge_pmd() is in the wrong to be attempting lock_page().
>
> Which used not to be done, but was added in 5.8's c444eb564fb1 ("mm:
> thp: make the THP mapcount atomic against __split_huge_pmd_locked()").

Hugh, I forgot to mention, we are facing these issues on 4.19.
Let me check if (maybe) we have cherry picked c444eb564fb1.

	-ss
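
The reverse locking chain quoted above can be modelled in user space. What
follows is only a minimal sketch, not kernel code: a pthread mutex stands in
for PG_locked and a pthread rwlock stands in for mapping->i_mmap_rwsem, and
every name in it (page_lock, kswapd_side(), madvise_side(),
run_inversion_model()) is made up for illustration. Each side takes its first
lock, waits until the other side holds its own, and then only *tries* the
second lock, so the model reports the conflict instead of hanging. The real
paths use blocking acquisitions, which is why both CPUs sleep forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-ins for the two kernel locks in the report. */
static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;      /* models PG_locked */
static pthread_rwlock_t i_mmap_rwsem = PTHREAD_RWLOCK_INITIALIZER; /* models mapping->i_mmap_rwsem */

static pthread_barrier_t holding_first; /* passed once both sides hold their first lock */
static pthread_barrier_t tried_second;  /* passed once both sides have tried the second */
static int kswapd_result, madvise_result;

/* kswapd order: lock_page(page) -> down_read(i_mmap_rwsem) */
static void *kswapd_side(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&page_lock);
    pthread_barrier_wait(&holding_first);
    kswapd_result = pthread_rwlock_tryrdlock(&i_mmap_rwsem);
    pthread_barrier_wait(&tried_second);
    if (kswapd_result == 0)
        pthread_rwlock_unlock(&i_mmap_rwsem);
    pthread_mutex_unlock(&page_lock);
    return NULL;
}

/* madvise order: down_write(i_mmap_rwsem) -> lock_page(page) */
static void *madvise_side(void *unused)
{
    (void)unused;
    pthread_rwlock_wrlock(&i_mmap_rwsem);
    pthread_barrier_wait(&holding_first);
    madvise_result = pthread_mutex_trylock(&page_lock);
    pthread_barrier_wait(&tried_second);
    if (madvise_result == 0)
        pthread_mutex_unlock(&page_lock);
    pthread_rwlock_unlock(&i_mmap_rwsem);
    return NULL;
}

/* Runs both sides once. Returns 0 and stores each side's second-lock result. */
int run_inversion_model(int *kswapd_rc, int *madvise_rc)
{
    pthread_t k, m;
    int rc = -1;

    assert(kswapd_rc != NULL && madvise_rc != NULL);
    if (pthread_barrier_init(&holding_first, NULL, 2) != 0)
        return -1;
    if (pthread_barrier_init(&tried_second, NULL, 2) != 0) {
        pthread_barrier_destroy(&holding_first);
        return -1;
    }
    if (pthread_create(&k, NULL, kswapd_side, NULL) == 0) {
        if (pthread_create(&m, NULL, madvise_side, NULL) == 0) {
            pthread_join(m, NULL);
            rc = 0;
        } else {
            /* No second thread: stand in at both barriers so the first can exit. */
            pthread_barrier_wait(&holding_first);
            pthread_barrier_wait(&tried_second);
        }
        pthread_join(k, NULL);
    }
    pthread_barrier_destroy(&holding_first);
    pthread_barrier_destroy(&tried_second);
    if (rc == 0) {
        *kswapd_rc = kswapd_result;
        *madvise_rc = madvise_result;
    }
    return rc;
}
```

Built with something like `cc -pthread` and called from a small main(), both
stored results are EBUSY: neither side can get its second lock while the other
still holds its first. Taking the two locks in the same order on both paths,
or not taking lock_page() in the madvise path at all, removes the cycle.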