Date: Fri, 23 Jun 2023 16:40:11 +0000
Message-ID: <20230623164015.3431990-1-jiaqiyan@google.com>
Subject: [PATCH v2 0/4] Improve hugetlbfs read on HWPOISON hugepages
From: Jiaqi Yan
To: mike.kravetz@oracle.com, naoya.horiguchi@nec.com
Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com,
    akpm@linux-foundation.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, duenwen@google.com,
    axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan
X-Mailing-List: linux-kernel@vger.kernel.org

Today when hardware memory is
corrupted in a hugetlb hugepage, the kernel leaves the hugepage in the
pagecache [1]; otherwise future mmap or read would be subject to silent
data corruption. This is implemented by returning -EIO from
hugetlbfs_read_iter immediately if the hugepage has the HWPOISON flag set.

Since memory_failure already tracks the raw HWPOISON subpages in a
hugepage, a natural improvement is possible: if userspace only asks for
healthy subpages in the pagecache, the kernel can return that data.

This patchset implements this improvement. The 1st commit fixes an issue
in __folio_free_raw_hwp. The 2nd commit exports the functionality to tell
if a subpage inside a hugetlb hugepage is a raw HWPOISON page. The 3rd
commit teaches hugetlbfs_read_iter to return as many healthy bytes as
possible. The last commit tests this new feature.

[1] commit 8625147cafaa ("hugetlbfs: don't delete error page from pagecache")

Changelog

v1 => v2
* __folio_free_raw_hwp deletes all entries in raw_hwp_list before it
  traverses and frees raw_hwp_page.
* find_raw_hwp_page => __is_raw_hwp_subpage, and __is_raw_hwp_subpage
  only returns bool instead of a raw_hwp_page entry.
* is_raw_hwp_subpage holds hugetlb_lock while checking
  __is_raw_hwp_subpage.
* No need to do folio_lock in adjust_range_hwpoison.
* v2 is based on commit a6e79df92e4a ("mm/gup: disallow FOLL_LONGTERM
  GUP-fast writing to file-backed mappings")

Jiaqi Yan (4):
  mm/hwpoison: delete all entries before traversal in __folio_free_raw_hwp
  mm/hwpoison: check if a subpage of a hugetlb folio is raw HWPOISON
  hugetlbfs: improve read HWPOISON hugepage
  selftests/mm: add tests for HWPOISON hugetlbfs read

 fs/hugetlbfs/inode.c                          |  58 +++-
 include/linux/hugetlb.h                       |  19 ++
 include/linux/mm.h                            |   7 +
 mm/hugetlb.c                                  |  10 +
 mm/memory-failure.c                           |  42 ++-
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   1 +
 .../selftests/mm/hugetlb-read-hwpoison.c      | 322 ++++++++++++++++++
 8 files changed, 439 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/mm/hugetlb-read-hwpoison.c

-- 
2.41.0.162.gfafddb0af9-goog
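
For readers who want a concrete picture of "return as many healthy bytes as
possible", here is a minimal userspace sketch of the idea. It is not the
code from this series: struct hugepage_model, healthy_bytes() and the
2 MiB / 4 KiB sizes are all hypothetical, and a plain bool array stands in
for whatever bookkeeping memory_failure keeps for raw HWPOISON subpages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes: a 2 MiB hugepage made of 4 KiB subpages. */
#define SUBPAGE_SIZE   ((size_t)4096)
#define NR_SUBPAGES    ((size_t)512)
#define HUGEPAGE_SIZE  (SUBPAGE_SIZE * NR_SUBPAGES)

/*
 * Hypothetical stand-in for the per-hugepage record of raw HWPOISON
 * subpages: poisoned[i] is true if subpage i is corrupted.
 */
struct hugepage_model {
    bool poisoned[NR_SUBPAGES];
};

/*
 * Return how many of the `want` bytes starting at `offset` can be read
 * before the first poisoned subpage. A result of 0 with want > 0 means
 * the read starts inside a poisoned subpage, where a caller would
 * report an error such as -EIO.
 */
static size_t healthy_bytes(const struct hugepage_model *hp,
                            size_t offset, size_t want)
{
    assert(offset < HUGEPAGE_SIZE);

    /* Clamp the request to the end of the hugepage. */
    if (want > HUGEPAGE_SIZE - offset)
        want = HUGEPAGE_SIZE - offset;

    size_t done = 0;
    while (done < want) {
        size_t pos = offset + done;
        size_t idx = pos / SUBPAGE_SIZE;

        /* Stop at the first poisoned subpage in the range. */
        if (hp->poisoned[idx])
            break;

        /* Take the rest of this healthy subpage, or whatever is left. */
        size_t chunk = SUBPAGE_SIZE - pos % SUBPAGE_SIZE;
        if (chunk > want - done)
            chunk = want - done;
        done += chunk;
    }
    return done;
}
```

In this model, with only subpage 3 marked poisoned, a read of the whole
hugepage from offset 0 yields the first three subpages (a short read), and
a read that begins inside subpage 3 yields nothing. A read that begins
after subpage 3 is unaffected by the poison.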