Received: by 2002:a05:7412:2a91:b0:fc:a2b0:25d7 with SMTP id u17csp30772rdh; Tue, 13 Feb 2024 08:29:58 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCXqI4xRvJfnOWySdnegEXUolYjau90nqJZhf2gAfPcKbgagtKmLDuwuFNt6KqtfYK9WVBKPv/2uF4XGoQ0gSvip7/JNloH5qKP3f0jfdw== X-Google-Smtp-Source: AGHT+IEwDdCI/YoGnDO3axFl9nfIE9rEDu9a7mGKjJZ+Icp1vG/m5LYXXEnVoib/GA/ZccOSqt9i X-Received: by 2002:a81:a14e:0:b0:607:7f93:45b9 with SMTP id y75-20020a81a14e000000b006077f9345b9mr2867882ywg.11.1707841798457; Tue, 13 Feb 2024 08:29:58 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1707841798; cv=pass; d=google.com; s=arc-20160816; b=QFzVkXNH4ER806ym5PYUc8OIxVF7mR5l1LYTMXDpJFYYSeuUmwINvzekVRjXcQ4Tj/ a4re0VDtDh6lmiAySeqpb45bKH+ZriNu3segBmldPX7boIvquR3DxnAEaixr0BNFyftG nYTv7SjvLDZ4+vX7ZBPMdcBFNb00dBViE6Id0qToaM2JK6q2rrRRKsz12hyJxPfYQd7h eM3D7l0H4PnxzokoDgiEDIuiAkzcwPBzr40bl+d8suu7aXRJ2WHu+1j11PcajwA+31HF UEofJ7tAE+1qhCdItqGdLpSCu35oIyD7dvTnl1lZwSLEHPZzxRYuFh25RTeKbWruu6ma H62g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:message-id:subject:cc :to:from:dkim-signature:date; bh=mF7J0g2Nxf+VBOHYwoPE53ZcMPEVUheZGZS5pd1pQoU=; fh=UYyLWSnTrroa8sMQtREwOh+LZ2dzpr75NNt6y1jKmAc=; b=eH91Ib4daNBkdZUcvxH7+T+AJTKe+wcB0K5j1CfFHL2gnkXl/4ZH9SQhsEetGycrtl fCaHE0Q34UbWDmKRRqLNjaXRSRLslYYSRknD25C3aE2LK1+lbZHhsvRptyjJ+yeY1pYS AT/xfNEjDvNO+LXQHVEyymzOU4WXjfPtYhQmKTVMlbl19xAqb3xM2F2BAUEWxZjemUhf 47GtP3XW/tVBEIjh/KmQrZO9vHVSwPg1AJGys3erKtiFB2ZnSHulaHnw/bMG3aLrp5on PxATvGAq0JmC5C5C/TnG4dfbiHH6HBRVPUhGcTDHcioHLJip6/cOMCizhOlhqEC69t9j 1SEw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=GZTwqKcD; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-kernel+bounces-63888-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-63888-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev X-Forwarded-Encrypted: i=2; AJvYcCXNJBG4qO3cFQbrYAReaMy8DU3pW6EXJUuT+RDOmz77NO1Vnpac85b8WW/Yk6M5qZruKvQhwgqlrhfvjBmLrno43sUvUITCge93/+69YQ== Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id o6-20020a05620a228600b00786117e1cffsi3358370qkh.7.2024.02.13.08.29.58 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Feb 2024 08:29:58 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-63888-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=GZTwqKcD; arc=pass (i=1 spf=pass spfdomain=linux.dev dkim=pass dkdomain=linux.dev dmarc=pass fromdomain=linux.dev); spf=pass (google.com: domain of linux-kernel+bounces-63888-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-63888-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 359121C25688 for ; Tue, 13 Feb 2024 16:29:58 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 269A25F87B; Tue, 13 Feb 2024 16:29:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="GZTwqKcD" Received: from out-180.mta1.migadu.com (out-180.mta1.migadu.com [95.215.58.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7A6245D91C for ; Tue, 13 Feb 2024 16:29:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.180 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707841791; cv=none; b=YZT7JKnNesOlAgbgk0O4G5iJCnaao2KNb1V70Ye8poCxHxyk/4qWFXh4eYwbOOysT5Ea8LNpKoXH7seTIduSflXuiHAxXzajjrbWYyQZYNoJaduN6YN62sujeKk+zUrSrerVzP7nJuViRYTiJHvkakYsHzR8SjsoZ9r6Xc/RnAU= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707841791; c=relaxed/simple; bh=pwq/fcA6g9rlxCQRDTT5fhtljDd9PZwU2WOBG2ktlBs=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=pHUj/JOiAIp5gK9QFEjv+Bq2DgOyv2Smrn/+ZooTGgfHPB4Oroq+nW0dNWAZzUgnmyPRSBgfhc3mJoZO6DyDRxJ0WH7PP5Q0P5tYdV0q8YqQ9jKezpoBuzHd786H0D1R8RNFC1XXDemxxLuexAC+SoC+Wo8uZBlx/8IINMke/TI= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=GZTwqKcD; arc=none smtp.client-ip=95.215.58.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Date: Tue, 13 Feb 2024 16:29:42 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1707841787; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=mF7J0g2Nxf+VBOHYwoPE53ZcMPEVUheZGZS5pd1pQoU=; b=GZTwqKcD2J+jk67hFJUI3xSvi3EfxyJevFW4Mm24S4k7iXdUVJDaHRgP+ME3UCJML/Srlw mz9++W2jHHizB72MeD3m/8UTo9wxGeZFTjbl4O28g8kSfkXpLD0yjGCESVeW9t/In3J4jm 1Yds7/qCgTwuF5j0aeLIT+dHrfTQyaA= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Oliver Upton To: Marc Zyngier Cc: Will Deacon , kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Ricardo Koller Subject: Re: [RFC PATCH] KVM: arm64: Fix double-free following kvm_pgtable_stage2_free_unlinked() Message-ID: References: <20240212193052.27765-1-will@kernel.org> <86cyt062jh.wl-maz@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <86cyt062jh.wl-maz@kernel.org> X-Migadu-Flow: FLOW_OUT On Tue, Feb 13, 2024 at 11:12:34AM +0000, Marc Zyngier wrote: > On Mon, 12 Feb 2024 20:14:37 +0000, > Oliver Upton wrote: > > > > On Mon, Feb 12, 2024 at 07:30:52PM +0000, Will Deacon wrote: > > > kvm_pgtable_stage2_free_unlinked() does the final put_page() on the > > > root page of the sub-tree before returning, so remove the additional > > > put_page() invocations in the callers. > > > > > > Cc: Marc Zyngier > > > Cc: Oliver Upton > > > Cc: Ricardo Koller > > > Signed-off-by: Will Deacon > > > --- > > > > > > Hi folks, > > > > > > Sending this as an RFC as I only spotted it from code inspection and I'm > > > surprised others aren't seeing fireworks if it's a genuine bug. I also > > > couldn't come up with a sensible Fixes tag, as all of: > > > > > > e7c05540c694b ("KVM: arm64: Add helper for creating unlinked stage2 subtrees") > > > 8f5a3eb7513fc ("KVM: arm64: Add kvm_pgtable_stage2_split()") > > > f6a27d6dc51b2 ("KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()") > > I'd blame it on the last commit, as we really ought to have it if we > have the others. > > > > > > > are actually ok in isolation. Hrm. Please tell me I'm wrong? > > > > > > arch/arm64/kvm/hyp/pgtable.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c > > > index c651df904fe3..ab9d05fcf98b 100644 > > > --- a/arch/arm64/kvm/hyp/pgtable.c > > > +++ b/arch/arm64/kvm/hyp/pgtable.c > > > @@ -1419,7 +1419,6 @@ kvm_pte_t *kvm_pgtable_stage2_create_unlinked(struct kvm_pgtable *pgt, > > > level + 1); > > > if (ret) { > > > kvm_pgtable_stage2_free_unlinked(mm_ops, pgtable, level); > > > - mm_ops->put_page(pgtable); > > > return ERR_PTR(ret); > > > } > > > > AFAICT, this entire branch is effectively dead code, unless there's a > > KVM bug lurking behind the page table walk. The sub-tree isn't visible > > to other software or hardware walkers yet, so none of the PTE races > > could cause this to pop. > > > > So while this is very obviously a bug, it might be pure luck that folks > > haven't seen smoke here. Perhaps while fixing the bug we should take the > > opportunity to promote the condition to WARN_ON_ONCE(). > > Can't you construct a case where an allocation fails during the walk > (memcache empty), and we end up on this exact path? Possibly, but AFAICT that can only happen if there was a bug in KVM. We don't start the walk at all if userspace set the split chunk size to 0, and otherwise we expect it to be at least PMD_SIZE, which will top up the cache to 1 every pass. stage2_split_walker() will 'do the right thing' if there aren't enough preallocated pages to get down to level 3. It really doesn't matter either way, I'm just trying to convince myself of the reasons why we haven't seen this explode yet :) > > > > > @@ -1502,7 +1501,6 @@ static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx, > > > > > > if (!stage2_try_break_pte(ctx, mmu)) { > > > kvm_pgtable_stage2_free_unlinked(mm_ops, childp, level); > > > - mm_ops->put_page(childp); > > > return -EAGAIN; > > > } > > > > This, on the other hand, seems possible. There exists a race where an > > old block PTE could have the AF set on it and the underlying cmpxchg() > > could fail. There shouldn't be a race with any software walkers, as we > > hold the MMU lock for write here. > > AF update is indeed a likely candidate. > > In any case, this patch looks good to me as it is, and we can always > have a separate tweak to adjust the severity of the first case as > required. Unless anyone objects, I'd like to queue it shortly. Agreed, happy with the way this looks and should've added: Reviewed-by: Oliver Upton the first time around. -- Thanks, Oliver