Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp8871554rwd; Tue, 20 Jun 2023 23:17:29 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6Daxgk9fwQh1+6I0WpSu0OvHHGbu2Nz5KUC06p8EszQuz0Nom9FcuK8GC0ntd1azH/ukJK X-Received: by 2002:a05:6a20:3cac:b0:122:550f:aad4 with SMTP id b44-20020a056a203cac00b00122550faad4mr5664455pzj.9.1687328248969; Tue, 20 Jun 2023 23:17:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687328248; cv=none; d=google.com; s=arc-20160816; b=OUFo62FYBOhSMpqDjXnrkbISqUZexhApbR7nXd1lXFl5O08EiiIqBLYfPH1f7kVSKS Y3mL2riFVibMyMftQ9cFYcMYsnQEGUU95f1w9PVFNnSm04r5FsYX3fGZZ7H6k2ML052v EBEaRsZeqObPYiUo7I4kGWymM815xwuDhf4ySyNzP+j+ghhJ6xN5xUSOe0Z8+wEbbNxF wQqLIxGV7oSKKlm1ReLRxrU6ZqRJlM42FEx58l3vdLj3vUxHGfeTe4MIEUwYkWtqG6zq Sppnnk3GIXtB1uPpqEniWD6Z8Ho2jMXbCVy57+uGPSIZxxdne6/EcRFxe8G2qgaZmD+9 dpFg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=dNod9SmAh5bXGJqsnotODRNDpF91TOvUX2oEAIJT7fs=; b=mXIqGYAg4Un+1BI6UYSviEWcId5BezWhJ7Q7+rKiQPtOvxkWDNe5WfStmZcBr91CZ8 mi2Bj3ZJPXZDzUru2ODq8a4M0shyeGYB8RZGBtdjLItj7i4+Bga+8NJb8S/SBUTH53bL UzPOLMYf5LFq7INqQ5MdnMA6En2hL6D/rRv5D6IB914NBRij11EHIV8muGqyWxqqN4um 09ceq1nb7UQ+0TuLwylJDxiiObSOTZTCcE9At7o7HV61mNcMrfn4KHg0P7cuXywbIANb DyXY568AnXUYWPoEl7+Sdbs/VxMtg+mnmTedqFbEmW0KHOqcW5VoHrHnEcbHIYn/FEXo VSag== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ApWBaV3T; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id kk3-20020a170903070300b001ab2a9fcd3esi3367433plb.378.2023.06.20.23.17.16; Tue, 20 Jun 2023 23:17:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=ApWBaV3T; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229803AbjFUF4k (ORCPT + 99 others); Wed, 21 Jun 2023 01:56:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47656 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229470AbjFUF4h (ORCPT ); Wed, 21 Jun 2023 01:56:37 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3AB51CA; Tue, 20 Jun 2023 22:56:36 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CB13661472; Wed, 21 Jun 2023 05:56:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 24983C433C8; Wed, 21 Jun 2023 05:56:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1687326995; bh=US3Xe7yz1+AVCinByPs1eOKpdoQy+FtobpB/5DLulWo=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=ApWBaV3TgZW9WTYCyLlgF0j+nqO+nqmitwslpm8g44yprfBubzuc/R2N7ybTRvLh/ Ns2uFpGWP95Ae6rWEKexoj9fyhiRxZA/wLLqVZFEM73cbQIxwPlddy3soGTxo8yvU+ KmhjFIivoFQ3qB5ah+k5e8Gio3hzQKnNmch9z4ONG0+pLoezU4BG4f+2eIC028XG9j utIhP2VICtz5KV8tRvjIeDWimNFy4DzoSprB2oAuTb+YWOSZbTF0QLt2bsSZrnA02M 1XRU1qkVxPK3DQN1U2NTHyiuI13Tc/yQ9JPpS4V1ZAVmfzUQehSI+M00JtumYmXzHB sFdwPG5FVkIVw== Date: Wed, 21 Jun 2023 08:55:51 +0300 From: Mike Rapoport To: Jeff Xu Cc: "Liam R. Howlett" , Peter Xu , linux-mm@kvack.org, linux-hardening@vger.kernel.org, zhangpeng.00@bytedance.com, akpm@linux-foundation.org, koct9i@gmail.com, david@redhat.com, ak@linux.intel.com, hughd@google.com, emunson@akamai.com, rppt@linux.ibm.com, aarcange@redhat.com, linux-kernel@vger.kernel.org, Lorenzo Stoakes Subject: Re: inconsistence in mprotect_fixup mlock_fixup madvise_update_vma Message-ID: <20230621055551.GE52412@kernel.org> References: <20230614011814.sz2l6z6wbaubabk2@revolver> <20230614125731.GY52412@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jun 20, 2023 at 03:29:34PM -0700, Jeff Xu wrote: > On Wed, Jun 14, 2023 at 5:58 AM Mike Rapoport wrote: > > > > On Tue, Jun 13, 2023 at 09:18:14PM -0400, Liam R. Howlett wrote: > > > * Jeff Xu [230613 17:29]: > > > > Hello Peter, > > > > > > > > Thanks for responding. > > > > > > > > On Tue, Jun 13, 2023 at 1:16 PM Peter Xu wrote: > > > > > > > > > > Hi, Jeff, > > > > > > > > > > On Tue, Jun 13, 2023 at 08:26:26AM -0700, Jeff Xu wrote: > > > > > > + more ppl to the list. > > > > > > > > > > > > On Mon, Jun 12, 2023 at 6:04 PM Jeff Xu wrote: > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > There seems to be inconsistency in different VMA fixup > > > > > > > implementations, for example: > > > > > > > mlock_fixup will skip VMA that is hugettlb, etc, but those checks do > > > > > > > not exist in mprotect_fixup and madvise_update_vma. Wouldn't this be a > > > > > > > problem? the merge/split skipped by mlock_fixup, might get acted on in > > > > > > > the madvice/mprotect case. > > > > > > > > > > > > > > mlock_fixup currently check for > > > > > > > if (newflags == oldflags || > > > > > > newflags == oldflags, then we don't need to do anything here, it's > > > already at the desired mlock. mprotect does this, madvise does this.. > > > probably.. it's ugly. > > > > > > > > > > (oldflags & VM_SPECIAL) || > > > > > > It's special, merging will fail always. I don't know about splitting, > > > but I guess we don't want to alter the mlock state on special mappings. > > > > > > > > > > is_vm_hugetlb_page(vma) || vma == get_gate_vma(current->mm) || > > > > > > > vma_is_dax(vma) || vma_is_secretmem(vma)) > > > > > > > > > > The special handling you mentioned in mlock_fixup mostly makes sense to me. > > > > > > > > > > E.g., I think we can just ignore mlock a hugetlb page if it won't be > > > > > swapped anyway. > > > > > > > > > > Do you encounter any issue with above? > > > > > > > > > > > > Should there be a common function to handle VMA merge/split ? > > > > > > > > > > IMHO vma_merge() and split_vma() are the "common functions". Copy Lorenzo > > > > > as I think he has plan to look into the interface to make it even easier to > > > > > use. > > > > > > > > > The mprotect_fixup doesn't have the same check as mlock_fixup. When > > > > userspace calls mlock(), two VMAs might not merge or split because of > > > > vma_is_secretmem check, However, when user space calls mprotect() with > > > > the same address range, it will merge/split. If mlock() is doing the > > > > right thing to merge/split the VMAs, then mprotect() is not ? > > > > > > It looks like secretmem is mlock'ed to begin with so they don't want it > > > to be touched. So, I think they will be treated differently and I think > > > it is correct. > > > > Right, they don't :) > > > > secretmem VMAs are always mlocked, they cannot be munlocked and there is no > > point trying to mlock them again. > > > > The mprotect for secretmem is Ok though, so e.g. if we (unlikely) have two > > adjacent secretmem VMAs in a range passed to mprotect, it's fine to merge > > them. > > > > I m thinking/brainstorming below, assuming: > Address range 1: 0x5000 to 0x6000 (regular mmap) > Address range 2: 0x6000 to 0x7000 (allocated to secretmem) > Address range 3: 0x7000 to 0x8000 (regular mmap) > > User space call: mlock(0x5000,0x3000) > range 1 and 2 won't merge. > range 2 and 3 could merge, when mlock_fixup checks current vma > (range 3), it is not secretmem, so it will merge with prev vma. But 2 and 3 have different vm_file, they won't merge. > user space call: mprotect(0x5000,0x3000) > range 1 2 3 could merge, all three can have the same flags. > Note: vma_is_secretmem() isn't checked in mprotect_fixup, same for > vma_is_dax and get_gate_vma, those doesn't have included in > vma->vm_flags > > Once 1 and 2 are merged, maybe user space is able to use > munlock(0x5000,0x3000) > to unlock range 1 to 3, this will include 2, right ? (haven't used the > code to prove it) But 1 and 2 won't merge because their vm_file's are different. > I'm using secretmem as an example here, having 3 different _fixup > implementations seems to be error prone to me. The actual decision whether to merge VMAs is taken in vma_merge rather than by the _fixup functions. So while the checks around vma_merge might be different in these functions, it does not mean it's possible to wrongly merge VMA unless there is a bug in vma_merge. So in the end it boils down to a single core implementation, don't you agree? > Thanks > -Jeff -- Sincerely yours, Mike.