Date: Wed, 27 Mar 2024 13:26:23 -0400
From: "Michael S.
 Tsirkin"
To: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org, Gavin Shan, Will Deacon, Jason Wang,
 Stefano Garzarella, Eugenio Pérez, Stefan Hajnoczi, "David S. Miller",
 kvm@vger.kernel.org, virtualization@lists.linux.dev,
 netdev@vger.kernel.org
Subject: [PATCH untested] vhost: order avail ring reads after index updates

vhost_get_vq_desc() (correctly) uses smp_rmb() to order avail ring reads
after index reads. However, over time we added two more places that read
the index and do not bother with barriers: vhost_vq_avail_empty() and
vhost_enable_notify(). vhost_get_vq_desc() was written on the assumption
that it is the only reader of the index, so when it sees that a new index
value is already cached it does not bother with a barrier either. As a
result, on the nvidia-gracehopper platform (arm64), available ring entry
reads have been observed bypassing index reads, causing ring corruption.

To fix, factor out the correct index access code from vhost_get_vq_desc().
As a side benefit, we also validate the index on all paths now, which
will hopefully help catch future errors earlier.

Note: current code is inconsistent in how it handles errors:
some places treat an error as an empty ring, others as non-empty.
This patch does not attempt to change the existing behaviour.

Cc: stable@vger.kernel.org
Reported-by: Gavin Shan
Reported-by: Will Deacon
Suggested-by: Will Deacon
Fixes: 275bf960ac69 ("vhost: better detection of available buffers")
Cc: "Jason Wang"
Fixes: d3bb267bbdcb ("vhost: cache avail index in vhost_enable_notify()")
Cc: "Stefano Garzarella"
Signed-off-by: Michael S. Tsirkin
---

I think it's better to bite the bullet and clean up the code.
Note: this is still only built, not tested.
Gavin, could you help test please?
Especially on the arm platform you have?

Will, thanks so much for finding this race!

 drivers/vhost/vhost.c | 80 +++++++++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 38 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 045f666b4f12..26b70b1fd9ff 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1290,10 +1290,38 @@ static void vhost_dev_unlock_vqs(struct vhost_dev *d)
 		mutex_unlock(&d->vqs[i]->mutex);
 }
 
-static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq,
-				      __virtio16 *idx)
+static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq)
 {
-	return vhost_get_avail(vq, *idx, &vq->avail->idx);
+	__virtio16 idx;
+	u16 avail_idx;
+	int r = vhost_get_avail(vq, idx, &vq->avail->idx);
+
+	if (unlikely(r < 0)) {
+		vq_err(vq, "Failed to access avail idx at %p: %d\n",
+		       &vq->avail->idx, r);
+		return -EFAULT;
+	}
+
+	avail_idx = vhost16_to_cpu(vq, idx);
+
+	/* Check it isn't doing very strange things with descriptor numbers. */
+	if (unlikely((u16)(avail_idx - vq->last_avail_idx) > vq->num)) {
+		vq_err(vq, "Guest moved used index from %u to %u",
+		       vq->last_avail_idx, avail_idx);
+		return -EFAULT;
+	}
+
+	/* Nothing new? We are done. */
+	if (avail_idx == vq->avail_idx)
+		return 0;
+
+	vq->avail_idx = avail_idx;
+
+	/* We updated vq->avail_idx so we need a memory barrier between
+	 * the index read above and the caller reading avail ring entries.
+	 */
+	smp_rmb();
+	return 1;
 }
 
 static inline int vhost_get_avail_head(struct vhost_virtqueue *vq,
@@ -2498,38 +2526,21 @@ int vhost_get_vq_desc(struct vhost_virtqueue *vq,
 {
 	struct vring_desc desc;
 	unsigned int i, head, found = 0;
-	u16 last_avail_idx;
-	__virtio16 avail_idx;
+	u16 last_avail_idx = vq->last_avail_idx;
 	__virtio16 ring_head;
 	int ret, access;
 
-	/* Check it isn't doing very strange things with descriptor numbers. */
-	last_avail_idx = vq->last_avail_idx;
 
 	if (vq->avail_idx == vq->last_avail_idx) {
-		if (unlikely(vhost_get_avail_idx(vq, &avail_idx))) {
-			vq_err(vq, "Failed to access avail idx at %p\n",
-				&vq->avail->idx);
-			return -EFAULT;
-		}
-		vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
-
-		if (unlikely((u16)(vq->avail_idx - last_avail_idx) > vq->num)) {
-			vq_err(vq, "Guest moved used index from %u to %u",
-				last_avail_idx, vq->avail_idx);
-			return -EFAULT;
-		}
+		ret = vhost_get_avail_idx(vq);
+		if (unlikely(ret < 0))
+			return ret;
 
 		/* If there's nothing new since last we looked, return
 		 * invalid.
 		 */
-		if (vq->avail_idx == last_avail_idx)
+		if (!ret)
 			return vq->num;
-
-		/* Only get avail ring entries after they have been
-		 * exposed by guest.
-		 */
-		smp_rmb();
 	}
 
 	/* Grab the next descriptor number they're advertising, and increment
@@ -2790,25 +2801,21 @@ EXPORT_SYMBOL_GPL(vhost_add_used_and_signal_n);
 /* return true if we're sure that avaiable ring is empty */
 bool vhost_vq_avail_empty(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
-	__virtio16 avail_idx;
 	int r;
 
 	if (vq->avail_idx != vq->last_avail_idx)
 		return false;
 
-	r = vhost_get_avail_idx(vq, &avail_idx);
-	if (unlikely(r))
-		return false;
-	vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
+	r = vhost_get_avail_idx(vq);
 
-	return vq->avail_idx == vq->last_avail_idx;
+	/* Note: we treat error as non-empty here */
+	return r == 0;
 }
 EXPORT_SYMBOL_GPL(vhost_vq_avail_empty);
 
 /* OK, now we need to know about added descriptors. */
 bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 {
-	__virtio16 avail_idx;
 	int r;
 
 	if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY))
@@ -2832,13 +2839,10 @@ bool vhost_enable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
 	/* They could have slipped one in as we were doing that: make
 	 * sure it's written, then check again. */
 	smp_mb();
-	r = vhost_get_avail_idx(vq, &avail_idx);
-	if (r) {
-		vq_err(vq, "Failed to check avail idx at %p: %d\n",
-		       &vq->avail->idx, r);
+	r = vhost_get_avail_idx(vq);
+	/* Note: we treat error as empty here */
+	if (r < 0)
 		return false;
-	}
-	vq->avail_idx = vhost16_to_cpu(vq, avail_idx);
 
 	return vq->avail_idx != vq->last_avail_idx;
 }
-- 
MST
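
For readers who want to poke at the ordering rule outside the kernel, here
is a rough userspace sketch of the pattern the patch centralizes: read the
shared index, and whenever the cached copy is updated, issue a read barrier
before any ring entry is read. It is not kernel code. It assumes C11
atomics, with atomic_thread_fence(memory_order_acquire) standing in for
smp_rmb(); the demo_* names, RING_NUM and both structs are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring size for this sketch; a power of two keeps the
 * 16-bit index wraparound consistent with the modulo used below. */
#define RING_NUM 8u
static_assert((RING_NUM & (RING_NUM - 1u)) == 0,
	      "RING_NUM must be a power of two");

struct demo_ring {			/* shared by producer and consumer */
	_Atomic uint16_t avail_idx;	/* free-running, written by producer */
	uint16_t ring[RING_NUM];	/* entries below avail_idx are valid */
};

struct demo_reader {			/* consumer-private state */
	uint16_t last_avail_idx;	/* next entry to consume */
	uint16_t avail_idx;		/* cached copy of the shared index */
};

/* Producer side: write the entry first, then publish the new index with
 * release semantics so the entry is visible before the index is. */
void demo_publish(struct demo_ring *r, uint16_t head)
{
	uint16_t idx = atomic_load_explicit(&r->avail_idx,
					    memory_order_relaxed);

	r->ring[idx % RING_NUM] = head;
	atomic_store_explicit(&r->avail_idx, (uint16_t)(idx + 1),
			      memory_order_release);
}

/* Consumer side, same return convention as the patched helper:
 * 1 = cached index updated, 0 = nothing new, -1 = implausible index. */
int demo_get_avail_idx(struct demo_ring *r, struct demo_reader *rd)
{
	uint16_t idx = atomic_load_explicit(&r->avail_idx,
					    memory_order_relaxed);

	/* More than RING_NUM entries outstanding cannot be right. */
	if ((uint16_t)(idx - rd->last_avail_idx) > RING_NUM)
		return -1;

	if (idx == rd->avail_idx)
		return 0;

	rd->avail_idx = idx;

	/* The cached index changed, so order the index load above before
	 * the caller's ring entry loads (the smp_rmb() stand-in). */
	atomic_thread_fence(memory_order_acquire);
	return 1;
}

/* Returns the next entry, RING_NUM if the ring is empty, -1 on error. */
int demo_get_head(struct demo_ring *r, struct demo_reader *rd)
{
	if (rd->avail_idx == rd->last_avail_idx) {
		int ret = demo_get_avail_idx(r, rd);

		if (ret < 0)
			return ret;
		if (!ret)
			return RING_NUM;
	}
	return r->ring[rd->last_avail_idx++ % RING_NUM];
}
```

Every path that refreshes the cached index goes through
demo_get_avail_idx(), so no caller can pick up a new index without the
fence. In a single thread the fence has no visible effect; it only matters
once demo_publish() and demo_get_head() run on different CPUs.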