Date: Fri, 24 May 2024 11:13:24 -0700
From: Muhammad Usama Anjum
To: Donet Tom, Andrew Morton, Shuah Khan, Matthew Wilcox, Tony Battersby
Cc: Muhammad Usama Anjum, linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
    linux-kernel@vger.kernel.org, Ritesh Harjani, Mike Rapoport, Muchun Song,
    David Hildenbrand, stable@vger.kernel.org
Subject: Re: [PATCH] selftest: mm: Test if hugepage does not get leaked during __bio_release_pages()
In-Reply-To: <20240523063905.3173-1-donettom@linux.ibm.com>
References: <20240523063905.3173-1-donettom@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Thank you for submitting a patch.

On 5/22/24 11:39 PM, Donet Tom wrote:
> Commit 1b151e2435fc ("block: Remove special-casing of compound
> pages") caused a change in behaviour when releasing the pages
> if the buffer does not start at the beginning of the page. This
> was because the calculation of the number of pages to release
> was incorrect.
> This was fixed by commit 38b43539d64b ("block: Fix page refcounts
> for unaligned buffers in __bio_release_pages()").
>
> We pin the user buffer during direct I/O writes. If this buffer is a
> hugepage, bio_release_page() will unpin it and decrement all references
> and pin counts at ->bi_end_io. However, if any references to the hugepage
> remain post-I/O, the hugepage will not be freed upon unmap, leading
> to a memory leak.
>
> This patch verifies that a hugepage, used as a user buffer for DIO
> operations, is correctly freed upon unmapping, regardless of whether
> the offsets are aligned or unaligned w.r.t page boundary.
>
> Test Result Fail Scenario (Without the fix)
> --------------------------------------------------------
> []# ./hugetlb_dio
> TAP version 13
> 1..4
> No. Free pages before allocation : 7
> No. Free pages after munmap : 7
> ok 1 : Huge pages freed successfully !
> No. Free pages before allocation : 7
> No. Free pages after munmap : 7
> ok 2 : Huge pages freed successfully !
> No. Free pages before allocation : 7
> No. Free pages after munmap : 7
> ok 3 : Huge pages freed successfully !
> No. Free pages before allocation : 7
> No. Free pages after munmap : 6
> not ok 4 : Huge pages not freed!
> Totals: pass:3 fail:1 xfail:0 xpass:0 skip:0 error:0
>
> Test Result PASS Scenario (With the fix)
> ---------------------------------------------------------
> []#./hugetlb_dio
> TAP version 13
> 1..4
> No. Free pages before allocation : 7
> No. Free pages after munmap : 7
> ok 1 : Huge pages freed successfully !
> No. Free pages before allocation : 7
> No. Free pages after munmap : 7
> ok 2 : Huge pages freed successfully !
> No. Free pages before allocation : 7
> No. Free pages after munmap : 7
> ok 3 : Huge pages freed successfully !
> No. Free pages before allocation : 7
> No. Free pages after munmap : 7
> ok 4 : Huge pages freed successfully !
> Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
>
> Signed-off-by: Donet Tom
> Signed-off-by: Ritesh Harjani (IBM)
> ---
>  tools/testing/selftests/mm/Makefile      |   1 +
>  tools/testing/selftests/mm/hugetlb_dio.c | 118 +++++++++++++++++++++++

Add this test to run_vmtests.sh, as all the tests are run from this
script in the CIs.

>  2 files changed, 119 insertions(+)
>  create mode 100644 tools/testing/selftests/mm/hugetlb_dio.c
>
> diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
> index eb5f39a2668b..87d8130b3376 100644
> --- a/tools/testing/selftests/mm/Makefile
> +++ b/tools/testing/selftests/mm/Makefile
> @@ -71,6 +71,7 @@ TEST_GEN_FILES += ksm_functional_tests
>  TEST_GEN_FILES += mdwe_test
>  TEST_GEN_FILES += hugetlb_fault_after_madv
>  TEST_GEN_FILES += hugetlb_madv_vs_map
> +TEST_GEN_FILES += hugetlb_dio
>
>  ifneq ($(ARCH),arm64)
>  TEST_GEN_FILES += soft-dirty
> diff --git a/tools/testing/selftests/mm/hugetlb_dio.c b/tools/testing/selftests/mm/hugetlb_dio.c
> new file mode 100644
> index 000000000000..6f6587c7913c
> --- /dev/null
> +++ b/tools/testing/selftests/mm/hugetlb_dio.c
> @@ -0,0 +1,118 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This program tests for hugepage leaks after DIO writes to a file using a
> + * hugepage as the user buffer. During DIO, the user buffer is pinned and
> + * should be properly unpinned upon completion. This patch verifies that the
> + * kernel correctly unpins the buffer at DIO completion for both aligned and
> + * unaligned user buffer offsets (w.r.t page boundary), ensuring the hugepage
> + * is freed upon unmapping.
> + */
> +
> +#define _GNU_SOURCE
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include
> +#include "vm_util.h"
> +#include "../kselftest.h"
> +
> +void run_dio_using_hugetlb(unsigned int start_off, unsigned int end_off)
> +{
> +	int fd;
> +	char *buffer = NULL;
> +	char *orig_buffer = NULL;
> +	size_t h_pagesize = 0;
> +	size_t writesize;
> +	int free_hpage_b = 0;
> +	int free_hpage_a = 0;
> +
> +	writesize = end_off - start_off;
> +
> +	/* Get the default huge page size */
> +	h_pagesize = default_huge_page_size();
> +	if (!h_pagesize)
> +		ksft_exit_fail_msg("Unable to determine huge page size\n");
> +
> +	/* Open the file to DIO */
> +	fd = open("/tmp", O_TMPFILE | O_RDWR | O_DIRECT);
> +	if (fd < 0)
> +		ksft_exit_fail_msg("Error opening file");

Use ksft_exit_fail_perror() here so that the information about the
error is printed as well.

> +
> +	/* Get the free huge pages before allocation */
> +	free_hpage_b = get_free_hugepages();
> +	if (free_hpage_b == 0) {
> +		close(fd);
> +		ksft_exit_skip("No free hugepage, exiting!\n");
> +	}
> +
> +	/* Allocate a hugetlb page */
> +	orig_buffer = mmap(NULL, h_pagesize, PROT_READ | PROT_WRITE, MAP_PRIVATE
> +			| MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

Better to align the arguments. Put all the flags on one line instead of
splitting them like this.

> +	if (orig_buffer == MAP_FAILED) {
> +		close(fd);
> +		ksft_exit_fail_msg("Error mapping memory");

nit: the "\n" is missing here.

> +	}
> +	buffer = orig_buffer;
> +	buffer += start_off;
> +
> +	memset(buffer, 'A', writesize);
> +
> +	/* Write the buffer to the file */
> +	if (write(fd, buffer, writesize) != (writesize)) {
> +		munmap(orig_buffer, h_pagesize);
> +		close(fd);
> +		ksft_exit_fail_msg("Error writing to file");
> +	}
> +
> +	/* unmap the huge page */
> +	munmap(orig_buffer, h_pagesize);
> +	close(fd);
> +
> +	/* Get the free huge pages after unmap*/
> +	free_hpage_a = get_free_hugepages();
> +
> +	/*
> +	 * If the no. of free hugepages before allocation and after unmap does
> +	 * not match - that means there could still be a page which is pinned.
> +	 */
> +	if (free_hpage_a != free_hpage_b) {
> +		printf("No. Free pages before allocation : %d\n", free_hpage_b);

Use ksft_print_msg() instead of printf().

> +		printf("No. Free pages after munmap : %d\n", free_hpage_a);
> +		ksft_test_result_fail(": Huge pages not freed!\n");
> +	} else {
> +		printf("No. Free pages before allocation : %d\n", free_hpage_b);
> +		printf("No. Free pages after munmap : %d\n", free_hpage_a);
> +		ksft_test_result_pass(": Huge pages freed successfully !\n");
> +	}
> +}
> +
> +int main(void)
> +{
> +	size_t pagesize = 0;
> +
> +	ksft_print_header();
> +	ksft_set_plan(4);
> +
> +	/* Get base page size */
> +	pagesize = psize();
> +
> +	/* start and end is aligned to pagesize */
> +	run_dio_using_hugetlb(0, (pagesize * 3));
> +
> +	/* start is aligned but end is not aligned */
> +	run_dio_using_hugetlb(0, (pagesize * 3) - (pagesize / 2));
> +
> +	/* start is unaligned and end is aligned */
> +	run_dio_using_hugetlb(pagesize / 2, (pagesize * 3));
> +
> +	/* both start and end are unaligned */
> +	run_dio_using_hugetlb(pagesize / 2, (pagesize * 3) + (pagesize / 2));
> +
> +	ksft_finished();

ksft_finished() never returns. Remove the following line.

> +	return 0;
> +}
> +

--
BR,
Muhammad Usama Anjum
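
P.S. In case it helps when reading the four start/end cases in main():
below is a rough sketch of how many base pages each byte range touches.
It is plain arithmetic under my own assumptions, not kernel code and not
part of the patch; both helper names are made up for illustration.
pages_spanned() takes the start offset into account, while
pages_from_length_only() rounds up the length alone.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): number of base pages touched
 * by the byte range [start_off, end_off) inside a page-aligned mapping.
 */
size_t pages_spanned(size_t start_off, size_t end_off, size_t pagesize)
{
	assert(pagesize > 0 && end_off > start_off);
	/* index of the last touched page - index of the first one + 1 */
	return (end_off - 1) / pagesize - start_off / pagesize + 1;
}

/*
 * Hypothetical contrast: round up the length only, ignoring where in
 * the first page the range starts.
 */
size_t pages_from_length_only(size_t start_off, size_t end_off,
			      size_t pagesize)
{
	assert(pagesize > 0 && end_off > start_off);
	return (end_off - start_off + pagesize - 1) / pagesize;
}
```

Assuming a 4096-byte base page, both helpers give 3 pages for the first
three ranges used in main(). They differ only for the last one (2048 to
14336), where the range touches 4 pages but the length alone rounds up
to 3.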