From: Jiaqi Yan
Date: Tue, 11 Jul 2023 10:05:21 -0700
Subject: Re: [PATCH v3 2/4] mm/hwpoison: check if a subpage of a hugetlb folio is raw HWPOISON
To: Miaohe Lin
Cc: akpm@linux-foundation.org, mike.kravetz@oracle.com, naoya.horiguchi@nec.com, songmuchun@bytedance.com, shy828301@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, duenwen@google.com, axelrasmussen@google.com, jthoughton@google.com
References: <20230707201904.953262-1-jiaqiyan@google.com> <20230707201904.953262-3-jiaqiyan@google.com> <6682284d-7ad3-9b59-687d-899f4d08d911@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 10, 2023 at 8:16 AM Jiaqi Yan wrote:
>
> On Fri, Jul 7, 2023 at 7:57 PM Miaohe Lin wrote:
> >
> > On 2023/7/8 4:19, Jiaqi Yan wrote:
> > > Add the functionality, is_raw_hwp_subpage, to tell if a subpage of a
> > > hugetlb folio is a raw HWPOISON page. This functionality relies on
> > > RawHwpUnreliable to be not set; otherwise hugepage's raw HWPOISON list
> > > becomes meaningless.
> > >
> > > is_raw_hwp_subpage needs to hold hugetlb_lock in order to synchronize
> > > with __get_huge_page_for_hwpoison, who iterates and inserts an entry to
> > > raw_hwp_list. llist itself doesn't ensure insertion is synchronized with
> > > the iterating used by __is_raw_hwp_list. Caller can minimize the
> > > overhead of lock cycles by first checking if folio / head page's
> > > HWPOISON flag is set.
> > >
> > > Exports this functionality to be immediately used in the read operation
> > > for hugetlbfs.
> > >
> > > Reviewed-by: Mike Kravetz
> > > Reviewed-by: Naoya Horiguchi
> > > Signed-off-by: Jiaqi Yan
> > > ---
> > >  include/linux/hugetlb.h | 19 +++++++++++++++++++
> > >  include/linux/mm.h      |  7 +++++++
> > >  mm/hugetlb.c            | 10 ++++++++++
> > >  mm/memory-failure.c     | 34 ++++++++++++++++++++++++----------
> > >  4 files changed, 60 insertions(+), 10 deletions(-)
> > >
> > > ...
> > > -static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
> > > +bool __is_raw_hwp_subpage(struct folio *folio, struct page *subpage)
> > >  {
> > > -	return (struct llist_head *)&folio->_hugetlb_hwpoison;
> > > +	struct llist_head *raw_hwp_head;
> > > +	struct raw_hwp_page *p, *tmp;
> > > +	bool ret = false;
> > > +
> > > +	if (!folio_test_hwpoison(folio))
> > > +		return false;
> > > +
> > > +	/*
> > > +	 * When RawHwpUnreliable is set, kernel lost track of which subpages
> > > +	 * are HWPOISON. So return as if ALL subpages are HWPOISONed.
> > > +	 */
> > > +	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
> > > +		return true;
> > > +
> > > +	raw_hwp_head = raw_hwp_list_head(folio);
> > > +	llist_for_each_entry_safe(p, tmp, raw_hwp_head->first, node) {
> >
> > Since we don't free the raw_hwp_list, does llist_for_each_entry works same as llist_for_each_entry_safe?

Sorry, I missed this comment. Yes, they behave the same here, but
llist_for_each_entry doesn't need "tmp". I will switch to
llist_for_each_entry in v4.

>
> >
> > > +		if (subpage == p->page) {
> > > +			ret = true;
> > > +			break;
> > > +		}
> > > +	}
> > > +
> > > +	return ret;
> > >  }
> >
> > It seems there's a race between __is_raw_hwp_subpage and unpoison_memory:
> >   unpoison_memory               __is_raw_hwp_subpage
> >                                   if (!folio_test_hwpoison(folio)) -- hwpoison is set
> >     folio_free_raw_hwp            llist_for_each_entry_safe raw_hwp_list
> >       llist_del_all               ..
> >     folio_test_clear_hwpoison
> >
>
> Thanks Miaohe for raising this concern.
>
> > But __is_raw_hwp_subpage is used in hugetlbfs, unpoison_memory couldn't reach here because there's a
> > folio_mapping == NULL check before folio_free_raw_hwp.
>
> I agree. But in near future I do want to make __is_raw_hwp_subpage
> work for shared-mapping hugetlb, so it would be nice to work with
> unpoison_memory. It doesn't seem to me that holding mf_mutex in
> __is_raw_hwp_subpage is nice or even absolutely correct. Let me think
> if I can come up with something in v4.

On second thought, if __is_raw_hwp_subpage simply takes mf_mutex
before llist_for_each_entry, it will introduce a deadlock:

unpoison_memory                       __is_raw_hwp_subpage
  held mf_mutex                         held hugetlb_lock
  get_hwpoison_hugetlb_folio            attempts mf_mutex
    attempts hugetlb lock

Not for this patch series, but for the future: is it a good idea to make
mf_mutex available to hugetlb code, and then enforce the locking order
to be mf_mutex first, hugetlb_lock second? I believe this is the
current locking pattern / order for try_memory_failure_hugetlb.

>
> >
> > Anyway, this patch looks good to me.
> >
> > Reviewed-by: Miaohe Lin
>
> Thanks.
> >
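
To make the proposed ordering concrete (mf_mutex first, hugetlb_lock
second), here is a rough userspace sketch, not kernel code. It assumes
pthread mutexes as stand-ins for both locks (in the kernel hugetlb_lock
is a spinlock and mf_mutex is currently local to mm/memory-failure.c),
and the function names locked_update, unpoison_path, subpage_check_path
and run_ordered_demo are hypothetical. Because both paths take the
locks in the same order, the ABBA pattern from the diagram above cannot
form.

```c
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins for the two kernel locks (assumption, see above). */
static pthread_mutex_t mf_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t hugetlb_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shared state protected by holding both locks. */
static long shared_counter;

#define ITERATIONS 100000

/* Every path takes the outer lock (mf_mutex) before the inner one. */
static void locked_update(void)
{
	pthread_mutex_lock(&mf_mutex);
	pthread_mutex_lock(&hugetlb_lock);
	shared_counter++;
	pthread_mutex_unlock(&hugetlb_lock);
	pthread_mutex_unlock(&mf_mutex);
}

/* Hypothetical stand-in for the unpoison_memory side. */
static void *unpoison_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++)
		locked_update();
	return NULL;
}

/* Hypothetical stand-in for the __is_raw_hwp_subpage side. */
static void *subpage_check_path(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERATIONS; i++)
		locked_update();
	return NULL;
}

/* Runs both paths concurrently; returns the final counter, or -1 on error. */
long run_ordered_demo(void)
{
	pthread_t a, b;

	shared_counter = 0;
	if (pthread_create(&a, NULL, unpoison_path, NULL) != 0)
		return -1;
	if (pthread_create(&b, NULL, subpage_check_path, NULL) != 0) {
		pthread_join(a, NULL);
		return -1;
	}
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return shared_counter;
}
```

Calling run_ordered_demo() from a small main() (built with -pthread)
should finish with the counter at 2 * ITERATIONS. If one of the two
paths took hugetlb_lock before mf_mutex instead, the same program could
hang, which is the deadlock described above.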