Date: Thu, 14 Mar 2024 19:25:40 -0700
From: "Darrick J. Wong"
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
	John Hubbard, Jason Gunthorpe, Hugh Dickins
Subject: Re: [PATCH v1 0/2] mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY properly
Message-ID: <20240315022540.GD6226@frogsfrogsfrogs>
References: <20240314161300.382526-1-david@redhat.com>
In-Reply-To: <20240314161300.382526-1-david@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 14, 2024 at 05:12:58PM +0100, David Hildenbrand wrote:
> Darrick reports that in some cases where pread() would fail with -EIO and
> mmap()+access would generate a SIGBUS signal, MADV_POPULATE_READ /
> MADV_POPULATE_WRITE will keep retrying forever and not fail with -EFAULT.
>
> It all boils down to missing VM_FAULT_RETRY handling. Let's try to handle
> that in a better way, similar to how ordinary GUP handles it.
>
> Details in patch #1. In short, move special MADV_POPULATE_(READ|WRITE)
> VMA handling into __get_user_pages(), and make faultin_page_range()
> call __get_user_pages_locked(), which handles VM_FAULT_RETRY. Further,
> avoid the now-useless madvise VMA walk, because __get_user_pages() will
> perform the VMA lookup either way.
>
> I briefly played with handling the FOLL_MADV_POPULATE checks in
> __get_user_pages() a bit differently, integrating them with existing
> handling, but it ended up looking worse. So I decided to keep it simple.
>
> Likely, we need better selftests, but the reproducer from Darrick might
> be a bit hard to convert into a simple selftest.

No worries, I can convert my reproducer into an fstest. I actually had
no idea that there were so many madvise flags; it's tempting to wire up
fsx and fsstress so that the long soak group tests will exercise them.

> Note that using mlock() in Darrick's reproducer results in a similar
> endless retry. Likely, that is not what we want, and we should handle
> VM_FAULT_RETRY in populate_vma_page_range() / __mm_populate() as well.
> However, similarly using __get_user_pages_locked() might be more
> complicated, because of the advanced VMA handling in
> populate_vma_page_range().
>
> Further, most populate_vma_page_range() callers simply ignore the return
> values, so it's unclear in which cases we expect to just silently fail, or
> where we'd want to retry+fail or endlessly retry instead.

With this patchset applied, my reproducer no longer gets stuck in an
infinite loop. I'll throw this at fstests overnight and see if anything
else falls out.

Thank you!

--D

> Cc: Andrew Morton
> Cc: Darrick J. Wong
> Cc: John Hubbard
> Cc: Jason Gunthorpe
> Cc: Hugh Dickins
>
> David Hildenbrand (2):
>   mm/madvise: make MADV_POPULATE_(READ|WRITE) handle VM_FAULT_RETRY
>     properly
>   mm/madvise: don't perform madvise VMA walk for
>     MADV_POPULATE_(READ|WRITE)
>
>  mm/gup.c      | 54 ++++++++++++++++++++++++++++++---------------------
>  mm/internal.h | 10 ++++++----
>  mm/madvise.c  | 43 +++++++++++++---------------------------
>  3 files changed, 52 insertions(+), 55 deletions(-)
>
>
> base-commit: f48159f866f422371bb1aad10eb4d05b29ca4d8c
> --
> 2.43.2
>
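
For reference, a minimal userspace sketch of the call under discussion. This is not the reproducer mentioned in the thread, which is not shown here; the helper name populate_read_file() and its return convention are made up for illustration. It maps a file read-only and asks the kernel to prefault it with MADV_POPULATE_READ. Going by the cover letter, a range whose access would raise SIGBUS should make the call fail with -EFAULT rather than retry forever; the sketch only reports whatever errno the kernel returns.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallback for older libc headers; 22 is the value in the Linux uapi headers. */
#ifndef MADV_POPULATE_READ
#define MADV_POPULATE_READ 22
#endif

/*
 * Hypothetical helper (not part of the patch set): map the first `len`
 * bytes of `path` read-only and ask the kernel to prefault them.
 * Returns 0 on success or a negative errno value.
 */
int populate_read_file(const char *path, size_t len)
{
	assert(path != NULL && len > 0);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	void *addr = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		int err = errno;	/* save before close() can clobber it */
		close(fd);
		return -err;
	}

	int ret = 0;
	if (madvise(addr, len, MADV_POPULATE_READ) != 0)
		ret = -errno;	/* e.g. -EINVAL on kernels without MADV_POPULATE_READ */

	munmap(addr, len);
	close(fd);
	return ret;
}
```

A caller would pass a path and a length no larger than the file and treat any negative return as a failed populate. Kernels older than 5.14 do not know MADV_POPULATE_READ and are expected to return -EINVAL.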