Subject: Re: [PATCH] mm/huge_memory.c: __split_huge_page() use atomic ClearPageDirty()
To: Hugh Dickins, Andrew Morton
Cc: "Kirill A. Shutemov", Nicholas Piggin, linux-kernel@vger.kernel.org, linux-mm@kvack.org
From: Konstantin Khlebnikov
Message-ID: <6c069202-c963-07a4-fc35-630acb223041@yandex-team.ru>
Date: Wed, 30 May 2018 10:59:00 +0300

On 30.05.2018 04:50, Hugh Dickins wrote:
> Swapping load on huge=always tmpfs (with khugepaged tuned up to be very
> eager, but I'm not sure that is relevant) soon hung uninterruptibly,
> waiting for page lock in shmem_getpage_gfp()'s find_lock_entry(), most
> often when "cp -a" was trying to write to a smallish file. Debug showed
> that the page in question was not locked, and page->mapping NULL by now,
> but page->index consistent with having been in a huge page before.
>
> Reproduced in minutes on a 4.15 kernel, even with 4.17's 605ca5ede764
> ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
> added in; but took hours to reproduce on a 4.17 kernel (no idea why).
>
> The culprit proved to be the __ClearPageDirty() on tails beyond i_size
> in __split_huge_page(): the non-atomic __bitoperation may have been safe
> when 4.8's baa355fd3314 ("thp: file pages support for split_huge_page()")
> introduced it, but liable to erase PageWaiters after 4.10's 62906027091f
> ("mm: add PageWaiters indicating tasks are waiting for a page bit").
>
> Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit")
> Signed-off-by: Hugh Dickins
> ---
>
> It's not a 4.17-rc regression that this fixes, so no great need to slip
> this into 4.17 at the last moment - though it makes a good companion to
> Konstantin's 605ca5ede764. I think they both should go to stable, but
> since Konstantin's already went into rc1 without that tag, we shall
> have to recommend Konstantin's to GregKH out-of-band.

Good catch. This is the same issue, so all 4.10+ needs them both.
Preserving known regressions in core pieces like lock_page() is a bad idea.

>
>  mm/huge_memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> --- 4.17-rc7/mm/huge_memory.c	2018-04-26 10:48:36.019288258 -0700
> +++ linux/mm/huge_memory.c	2018-05-29 18:14:52.095512715 -0700
> @@ -2431,7 +2431,7 @@ static void __split_huge_page(struct pag
>  		__split_huge_page_tail(head, i, lruvec, list);
>  		/* Some pages can be beyond i_size: drop them from page cache */
>  		if (head[i].index >= end) {
> -			__ClearPageDirty(head + i);
> +			ClearPageDirty(head + i);
>  			__delete_from_page_cache(head + i, NULL);
>  			if (IS_ENABLED(CONFIG_SHMEM) && PageSwapBacked(head))
>  				shmem_uncharge(head->mapping->host, 1);
>
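
For readers who want the failure mode from the changelog spelled out, here is
a minimal userspace sketch, not kernel code. It replays by hand the
interleaving in which a plain read-modify-write clear of the dirty bit
overwrites a waiters bit that was set atomically in between. The DEMO_PG_*
bit positions and both function names are made up for illustration and do not
match the kernel's PG_* flags; C11 atomics stand in for the kernel's bitops.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions for illustration; not the kernel's PG_* values. */
#define DEMO_PG_WAITERS (1UL << 0)
#define DEMO_PG_DIRTY   (1UL << 1)

/*
 * Replays the bad interleaving in a single thread: the "splitter" clears the
 * dirty bit with a separate load and store (what a non-atomic __clear_bit()
 * amounts to), and a "sleeper" atomically sets the waiters bit in between.
 */
bool waiters_survive_nonatomic_clear(void)
{
	atomic_ulong flags;
	atomic_init(&flags, DEMO_PG_DIRTY);

	unsigned long snapshot = atomic_load(&flags);     /* splitter: read  */
	atomic_fetch_or(&flags, DEMO_PG_WAITERS);         /* sleeper: set    */
	atomic_store(&flags, snapshot & ~DEMO_PG_DIRTY);  /* splitter: write */

	/* The write-back used a stale snapshot, so the waiters bit is gone. */
	return (atomic_load(&flags) & DEMO_PG_WAITERS) != 0;
}

/*
 * Same scenario with the clear done as one atomic read-modify-write. The
 * sleeper's set can only land before or after it, never inside it; the
 * "before" order is shown here.
 */
bool waiters_survive_atomic_clear(void)
{
	atomic_ulong flags;
	atomic_init(&flags, DEMO_PG_DIRTY);

	atomic_fetch_or(&flags, DEMO_PG_WAITERS);         /* sleeper: set    */
	atomic_fetch_and(&flags, ~DEMO_PG_DIRTY);         /* splitter: clear */

	return (atomic_load(&flags) & DEMO_PG_WAITERS) != 0;
}
```

In the first function the waiters bit is lost, which matches the observed
hang: the page is no longer locked, but the sleeper is never woken because
nothing indicates that anyone is waiting. In the second the bit survives.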