Date: Tue, 15 Mar 2022 13:34:02 -0700 (PDT)
From: Hugh Dickins
To: David Hildenbrand
cc: Andrew Morton, Andrew Yang, Matthias Brugger, Matthew Wilcox,
    Vlastimil Babka, David Howells, William Kucharski, Yang Shi,
    Marc Zyngier, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-arm-kernel@lists.infradead.org,
    linux-mediatek@lists.infradead.org, wsd_upstream@mediatek.com,
    Nicholas Tang, Kuan-Ying Lee
Subject: Re: [PATCH] mm/migrate: fix race between lock page and clear PG_Isolated
In-Reply-To: <4cb789a5-c49c-f095-1f7e-67be65ba508a@redhat.com>
Message-ID: <883877a-30b0-96e0-48a6-7cfc3c59de93@google.com>
References: <20220315030515.20263-1-andrew.yang@mediatek.com>
    <20220314212127.a2797926ee0ef8a7ad05dcaa@linux-foundation.org>
    <4cb789a5-c49c-f095-1f7e-67be65ba508a@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 15 Mar 2022, David Hildenbrand wrote:
> On 15.03.22 05:21, Andrew Morton wrote:
> > On Tue, 15 Mar 2022 11:05:15 +0800 Andrew Yang wrote:
> >
> >> When memory is tight, system may start to compact memory for large
> >> continuous memory demands. If one process tries to lock a memory page
> >> that is being locked and isolated for compaction, it may wait a long time
> >> or even forever. This is because compaction will perform non-atomic
> >> PG_Isolated clear while holding page lock, this may overwrite PG_waiters
> >> set by the process that can't obtain the page lock and add itself to the
> >> waiting queue to wait for the lock to be unlocked.
> >>
> >> CPU1                            CPU2
> >> lock_page(page); (successful)
> >>                                 lock_page(); (failed)
> >> __ClearPageIsolated(page);      SetPageWaiters(page) (may be overwritten)
> >> unlock_page(page);
> >>
> >> The solution is to not perform non-atomic operation on page flags while
> >> holding page lock.
> >
> > Sure, the non-atomic bitop optimization is really risky and I suspect
> > we reach for it too often.  Or at least without really clearly
> > demonstrating that it is safe, and documenting our assumptions.
>
> I agree. IIRC, non-atomic variants are mostly only safe while the
> refcount is 0. Everything else is just absolutely fragile.

It is normal and correct to use __SetPageFlag(page) on a page just
allocated from the buddy, and not yet logically visible to others:
that has refcount 1.

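For readers following the quoted CPU1/CPU2 diagram, the lost update can
be replayed deterministically in userspace. This is only a sketch under
stated assumptions, not kernel code: the FLAG_* bit positions and the
replay() helper are hypothetical names made up for illustration, and C11
atomics stand in for the kernel's atomic bitops. The non-atomic clear is
modelled as a separate load and store, with the waiter's atomic set
landing in between.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define FLAG_LOCKED   (1UL << 0)
#define FLAG_WAITERS  (1UL << 1)
#define FLAG_ISOLATED (1UL << 2)

/*
 * Replay the interleaving from the patch description, one step at a time,
 * on a single thread so that the outcome is deterministic.
 *
 * atomic_clear == 0: the clear is a plain load/modify/store, and the
 *                    waiter's atomic OR lands between the load and the store.
 * atomic_clear != 0: both updates are atomic read-modify-write operations.
 *
 * Returns the final flags word.
 */
unsigned long replay(int atomic_clear)
{
    _Atomic unsigned long flags;

    /* The page starts out locked and isolated by CPU1. */
    atomic_init(&flags, FLAG_LOCKED | FLAG_ISOLATED);

    if (atomic_clear) {
        /* CPU2: failed to take the lock, so it marks itself as a waiter. */
        atomic_fetch_or(&flags, FLAG_WAITERS);
        /* CPU1: atomic clear touches only the isolated bit. */
        atomic_fetch_and(&flags, ~FLAG_ISOLATED);
    } else {
        /* CPU1: first half of the non-atomic clear, load the old word. */
        unsigned long snapshot = atomic_load(&flags);
        /* CPU2: atomic set of the waiters bit, inside CPU1's window. */
        atomic_fetch_or(&flags, FLAG_WAITERS);
        /* CPU1: second half, store the stale word back minus one bit. */
        atomic_store(&flags, snapshot & ~FLAG_ISOLATED);
    }

    return atomic_load(&flags);
}
```

Calling replay(0) returns a word with the waiters bit gone, which is the
"wait forever" case in the quoted report: the unlocker sees no waiters
and wakes nobody. Calling replay(1) keeps the bit. A freshly allocated
page avoids the problem in a different way: no second party exists yet
to write into the window.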
Of course, it might have refcount 2 or more, through being speculatively
visible to get_page_unless_zero() users: perhaps through earlier usage
of the same struct page, or by physical scan of memmap.

Those few such others - compaction's isolate_migratepages_block() is
the one I know best - must be very careful in their sequence of operations.
Preliminary read-only checks are usually okay (but some VM_BUG_ON_PGFLAGS
are increasingly problematic: I've had to turn off that CONFIG), then
get_page_unless_zero(), then read-only check that the page is of the
manageable kind (PageLRU in my world), and only then can it be safe to
lock the page - which of course touches page flags, and so would be
problematic for a racing user's __SetPageFlag(page).

But PageMovable and PageIsolated are beyond my ken: I can't say there.

Hugh
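
The ordering described above (read-only precheck, speculative reference,
read-only recheck, and only then a write to the flags word) can be
sketched in userspace roughly as follows. Everything here is an
assumption made for illustration: struct fake_page, the PG_* bit values,
and the helper names are hypothetical, C11 atomics stand in for the
kernel primitives, and this is not the actual code of
isolate_migratepages_block().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits and page structure, for illustration only. */
#define PG_LOCKED (1UL << 0)
#define PG_LRU    (1UL << 1)

struct fake_page {
    _Atomic unsigned long flags;
    _Atomic int refcount;
};

/* Take a reference only if the count is not already zero. */
bool get_unless_zero(struct fake_page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference. Unlike the kernel's put_page(), this frees nothing. */
void put_ref(struct fake_page *p)
{
    atomic_fetch_sub(&p->refcount, 1);
}

/* Atomic test-and-set of the lock bit; true if we took the lock. */
bool try_lock(struct fake_page *p)
{
    return !(atomic_fetch_or(&p->flags, PG_LOCKED) & PG_LOCKED);
}

/*
 * Returns true with the page referenced and locked, or false with
 * nothing held and the flags word never written.
 */
bool scanner_try_isolate(struct fake_page *p)
{
    /* 1. Read-only precheck: cheap filter, no writes to the page. */
    if (!(atomic_load(&p->flags) & PG_LRU))
        return false;

    /* 2. Speculative reference: fails if the page is free. */
    if (!get_unless_zero(p))
        return false;

    /* 3. Read-only recheck, now that the page cannot be freed under us. */
    if (!(atomic_load(&p->flags) & PG_LRU)) {
        put_ref(p);
        return false;
    }

    /* 4. Only now is it reasonable to write to the flags word. */
    if (!try_lock(p)) {
        put_ref(p);
        return false;
    }
    return true;
}
```

The point of the ordering is that steps 1 to 3 never store to the flags
word, so a page whose owner may still be using non-atomic flag updates
is left alone unless it passes the recheck. Whether the recheck alone is
sufficient depends on the page type, which is the open question for
PageMovable and PageIsolated.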