References: <20230101230042.244286-1-jthoughton@google.com>
From: James Houghton
Date: Tue, 3 Jan 2023 20:26:01 +0000
Subject: Re: [PATCH] hugetlb: unshare some PMDs when splitting VMAs
To: Mike Kravetz
Cc: Muchun Song, Peter Xu, Axel Rasmussen, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org

> Thanks James. I am just trying to determine if we may have any issues/bugs/
> undesired behavior based on this today. Consider the cases mentioned above:
> mbind - I do not think this would cause any user visible issues. mbind is
> only dealing with newly allocated pages.
> We do not unshare as the result of an mbind call today.
> madvise(MADV_DONTDUMP) - It looks like this results in a flag (VM_DONTDUMP)
> being set on the vma. So, I do not believe sharing page tables
> would cause any user visible issue.
>
> One somewhat strange thing about two vmas sharing a PMD after a split is
> that operations on one VMA can impact the other. For example, suppose
> a VMA split via mbind happens. Then later, mprotect is done on one of
> the VMAs in the range that is shared. That would result in the area being
> unshared in both VMAs. So, the 'other' vma could see minor faults after
> the mprotect.
>
> Just curious if you (or anyone) knows of a user visible issue caused by this
> today. Trying to determine if we need a Fixes: tag.

I think I've come up with one... :) It only took many many hours of
staring at code to come up with:

1. Fault in PUD_SIZE-aligned hugetlb mapping
2. fork() (to actually share the PMDs)
3. Split VMA with MADV_DONTDUMP
4. Register the lower piece of the newly split VMA with
   UFFDIO_REGISTER_MODE_WRITEPROTECT (this will call
   hugetlb_unshare_all_pmds, but it will not attempt to unshare in the
   unaligned bits now)
5. Now calling UFFDIO_WRITEPROTECT will drop into
   hugetlb_change_protection and succeed in unsharing. That will hit the
   WARN_ON_ONCE and *not write-protect anything*.

I'll see if I can confirm that this is indeed possible and send a
repro if it is.

60dfaad65a ("mm/hugetlb: allow uffd wr-protect none ptes") is the
commit that introduced the WARN_ON_ONCE; perhaps it's a good choice
for a Fixes: tag (if the above is indeed true).

>
> Code changes look fine to me.

Thanks Mike!

- James
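
For readers following step 4 above, here is a minimal userspace sketch of
the alignment arithmetic involved. It is not kernel code. It assumes a
PUD_SIZE of 1 GiB (x86-64 with 4 KiB base pages), and the helper names
(align_up, align_down, pud_aligned_subrange) are hypothetical. The idea it
models is that only PUD_SIZE-aligned ranges lying entirely inside a VMA are
candidates for unsharing, so a VMA that covers only part of one PUD_SIZE
region yields an empty range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 1 GiB PUD_SIZE, as on x86-64 with 4 KiB base pages. */
#define PUD_SIZE (1ULL << 30)

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Round x down to a multiple of a; a must be a power of two. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return x & ~(a - 1);
}

/*
 * Hypothetical helper: compute the PUD_SIZE-aligned sub-range that lies
 * entirely inside the VMA [vm_start, vm_end). Returns false when the VMA
 * contains no complete PUD_SIZE-aligned region, in which case there is
 * nothing in it that an aligned-range unshare would visit.
 */
static bool pud_aligned_subrange(uint64_t vm_start, uint64_t vm_end,
                                 uint64_t *start, uint64_t *end)
{
    *start = align_up(vm_start, PUD_SIZE);
    *end = align_down(vm_end, PUD_SIZE);
    return *start < *end;
}
```

With a mapping at a PUD_SIZE-aligned base, pud_aligned_subrange(base,
base + PUD_SIZE, ...) reports the whole mapping, while the lower half
[base, base + PUD_SIZE / 2) left over after the split in step 3 reports no
aligned range at all. That matches the "will not attempt to unshare in the
unaligned bits" remark in step 4.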