From: Albert Huang
To: mst@redhat.com
Cc: yuanzhu@bytedance.com, "huangjie.albert", Jason Wang,
	virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4] virtio_ring : keep used_wrap_counter in vq->last_used_idx
Date: Fri, 17 Jun 2022 10:04:11 +0800
Message-Id: <20220617020411.80367-1-huangjie.albert@bytedance.com>
In-Reply-To: <20220616101823-mutt-send-email-mst@kernel.org>
References: <20220616101823-mutt-send-email-mst@kernel.org>

From: "huangjie.albert"

used_wrap_counter and vq->last_used_idx can get out of sync because they
are updated by separate assignments, so an interrupt may check the used
index against an inconsistent pair of values. This can lead to an
out-of-bounds access, for example:

ksoftirqd may consume a packet and call:

virtnet_poll
  -->virtnet_receive
    -->virtqueue_get_buf_ctx
      -->virtqueue_get_buf_ctx_packed

and in virtqueue_get_buf_ctx_packed:

vq->last_used_idx += vq->packed.desc_state[id].num;
if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
	vq->last_used_idx -= vq->packed.vring.num;
	vq->packed.used_wrap_counter ^= 1;
}

If a vring interrupt arrives at the same time, vring_interrupt calls:

vring_interrupt
  -->more_used
    -->more_used_packed
      -->is_used_desc_packed

In is_used_desc_packed, last_used_idx may be >= vq->packed.vring.num,
which can cause an out-of-bounds memory access.

This patch keeps the used_wrap_counter in vq->last_used_idx so that the
interrupt path gets a consistent value when checking the used index.

v3->v4:
- use READ_ONCE/WRITE_ONCE to get/set vq->last_used_idx

v2->v3:
- add inline functions to get used_wrap_counter and last_used
- when using vq->last_used_idx, read it only once;
  if vq->last_used_idx is read twice, the values can be inconsistent.
- use last_used_idx & ~(-(1 << VRING_PACKED_EVENT_F_WRAP_CTR))
  to get all the bits below VRING_PACKED_EVENT_F_WRAP_CTR

v1->v2:
- reuse the VRING_PACKED_EVENT_F_WRAP_CTR
- remove the parameter check in is_used_desc_packed,
  because the value can't be invalid

Signed-off-by: huangjie.albert
---
 drivers/virtio/virtio_ring.c | 75 ++++++++++++++++++++++--------------
 1 file changed, 47 insertions(+), 28 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 13a7348cedff..719fbbe716d6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -111,7 +111,12 @@ struct vring_virtqueue {
 	/* Number we've added since last sync. */
 	unsigned int num_added;
 
-	/* Last used index we've seen. */
+	/* Last used index we've seen.
+	 * for split ring, it just contains last used index
+	 * for packed ring:
+	 * bits up to VRING_PACKED_EVENT_F_WRAP_CTR include the last used index.
+	 * bits from VRING_PACKED_EVENT_F_WRAP_CTR include the used wrap counter.
+	 */
 	u16 last_used_idx;
 
 	/* Hint for event idx: already triggered no need to disable. */
@@ -154,9 +159,6 @@ struct vring_virtqueue {
 		/* Driver ring wrap counter. */
 		bool avail_wrap_counter;
 
-		/* Device ring wrap counter. */
-		bool used_wrap_counter;
-
 		/* Avail used flags. */
 		u16 avail_used_flags;
 
@@ -973,6 +975,15 @@ static struct virtqueue *vring_create_virtqueue_split(
 /*
  * Packed ring specific functions - *_packed().
  */
+static inline bool packed_used_wrap_counter(u16 last_used_idx)
+{
+	return !!(last_used_idx & (1 << VRING_PACKED_EVENT_F_WRAP_CTR));
+}
+
+static inline u16 packed_last_used(u16 last_used_idx)
+{
+	return last_used_idx & ~(-(1 << VRING_PACKED_EVENT_F_WRAP_CTR));
+}
 
 static void vring_unmap_extra_packed(const struct vring_virtqueue *vq,
 				     struct vring_desc_extra *extra)
@@ -1406,8 +1417,14 @@ static inline bool is_used_desc_packed(const struct vring_virtqueue *vq,
 
 static inline bool more_used_packed(const struct vring_virtqueue *vq)
 {
-	return is_used_desc_packed(vq, vq->last_used_idx,
-			vq->packed.used_wrap_counter);
+	u16 last_used;
+	u16 last_used_idx;
+	bool used_wrap_counter;
+
+	last_used_idx = READ_ONCE(vq->last_used_idx);
+	last_used = packed_last_used(last_used_idx);
+	used_wrap_counter = packed_used_wrap_counter(last_used_idx);
+	return is_used_desc_packed(vq, last_used, used_wrap_counter);
 }
 
 static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
@@ -1415,7 +1432,8 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
 					  void **ctx)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
-	u16 last_used, id;
+	u16 last_used, id, last_used_idx;
+	bool used_wrap_counter;
 	void *ret;
 
 	START_USE(vq);
@@ -1434,7 +1452,9 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
 	/* Only get used elements after they have been exposed by host. */
 	virtio_rmb(vq->weak_barriers);
 
-	last_used = vq->last_used_idx;
+	last_used_idx = READ_ONCE(vq->last_used_idx);
+	used_wrap_counter = packed_used_wrap_counter(last_used_idx);
+	last_used = packed_last_used(last_used_idx);
 	id = le16_to_cpu(vq->packed.vring.desc[last_used].id);
 	*len = le32_to_cpu(vq->packed.vring.desc[last_used].len);
 
@@ -1451,12 +1471,15 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
 	ret = vq->packed.desc_state[id].data;
 	detach_buf_packed(vq, id, ctx);
 
-	vq->last_used_idx += vq->packed.desc_state[id].num;
-	if (unlikely(vq->last_used_idx >= vq->packed.vring.num)) {
-		vq->last_used_idx -= vq->packed.vring.num;
-		vq->packed.used_wrap_counter ^= 1;
+	last_used += vq->packed.desc_state[id].num;
+	if (unlikely(last_used >= vq->packed.vring.num)) {
+		last_used -= vq->packed.vring.num;
+		used_wrap_counter ^= 1;
 	}
 
+	last_used = (last_used | (used_wrap_counter << VRING_PACKED_EVENT_F_WRAP_CTR));
+	WRITE_ONCE(vq->last_used_idx, last_used);
+
 	/*
 	 * If we expect an interrupt for the next entry, tell host
 	 * by writing event index and flush out the write before
@@ -1465,9 +1488,7 @@ static void *virtqueue_get_buf_ctx_packed(struct virtqueue *_vq,
 	if (vq->packed.event_flags_shadow == VRING_PACKED_EVENT_FLAG_DESC)
 		virtio_store_mb(vq->weak_barriers,
 				&vq->packed.vring.driver->off_wrap,
-				cpu_to_le16(vq->last_used_idx |
-					(vq->packed.used_wrap_counter <<
-					 VRING_PACKED_EVENT_F_WRAP_CTR)));
+				cpu_to_le16(vq->last_used_idx));
 
 	LAST_ADD_TIME_INVALID(vq);
 
@@ -1499,9 +1520,7 @@ static unsigned int virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
 
 	if (vq->event) {
 		vq->packed.vring.driver->off_wrap =
-			cpu_to_le16(vq->last_used_idx |
-				(vq->packed.used_wrap_counter <<
-				 VRING_PACKED_EVENT_F_WRAP_CTR));
+			cpu_to_le16(vq->last_used_idx);
 		/*
 		 * We need to update event offset and event wrap
 		 * counter first before updating event flags.
@@ -1518,8 +1537,7 @@ static unsigned int virtqueue_enable_cb_prepare_packed(struct virtqueue *_vq)
 	}
 
 	END_USE(vq);
-	return vq->last_used_idx | ((u16)vq->packed.used_wrap_counter <<
-			VRING_PACKED_EVENT_F_WRAP_CTR);
+	return vq->last_used_idx;
 }
 
 static bool virtqueue_poll_packed(struct virtqueue *_vq, u16 off_wrap)
@@ -1537,7 +1555,7 @@ static bool virtqueue_poll_packed(struct virtqueue *_vq, u16 off_wrap)
 static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
-	u16 used_idx, wrap_counter;
+	u16 used_idx, wrap_counter, last_used_idx;
 	u16 bufs;
 
 	START_USE(vq);
@@ -1550,9 +1568,10 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
 	if (vq->event) {
 		/* TODO: tune this threshold */
 		bufs = (vq->packed.vring.num - vq->vq.num_free) * 3 / 4;
-		wrap_counter = vq->packed.used_wrap_counter;
+		last_used_idx = READ_ONCE(vq->last_used_idx);
+		wrap_counter = packed_used_wrap_counter(last_used_idx);
 
-		used_idx = vq->last_used_idx + bufs;
+		used_idx = packed_last_used(last_used_idx) + bufs;
 		if (used_idx >= vq->packed.vring.num) {
 			used_idx -= vq->packed.vring.num;
 			wrap_counter ^= 1;
@@ -1582,9 +1601,10 @@ static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
 	 */
 	virtio_mb(vq->weak_barriers);
 
-	if (is_used_desc_packed(vq,
-				vq->last_used_idx,
-				vq->packed.used_wrap_counter)) {
+	last_used_idx = READ_ONCE(vq->last_used_idx);
+	wrap_counter = packed_used_wrap_counter(last_used_idx);
+	used_idx = packed_last_used(last_used_idx);
+	if (is_used_desc_packed(vq, used_idx, wrap_counter)) {
 		END_USE(vq);
 		return false;
 	}
@@ -1689,7 +1709,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
 	vq->broken = true;
-	vq->last_used_idx = 0;
+	vq->last_used_idx = 0 | (1 << VRING_PACKED_EVENT_F_WRAP_CTR);
 	vq->event_triggered = false;
 	vq->num_added = 0;
 	vq->packed_ring = true;
@@ -1720,7 +1740,6 @@ static struct virtqueue *vring_create_virtqueue_packed(
 
 	vq->packed.next_avail_idx = 0;
 	vq->packed.avail_wrap_counter = 1;
-	vq->packed.used_wrap_counter = 1;
 	vq->packed.event_flags_shadow = 0;
 	vq->packed.avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL;
 
-- 
2.31.1
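
Illustration only, not part of the patch: the encoding used above (used
index in the bits below VRING_PACKED_EVENT_F_WRAP_CTR, wrap counter in
that bit) can be tried out in userspace with the minimal sketch below.
It assumes VRING_PACKED_EVENT_F_WRAP_CTR is 15, as in the virtio ring
UAPI header, and uses uint16_t in place of the kernel's u16. The two
packed_* accessors mirror the helpers added by the patch; packed_advance()
and packed_idx_demo() are hypothetical names introduced here, and the ring
size of 256 is an arbitrary example value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position of the wrap counter (assumed to match the UAPI header). */
#define VRING_PACKED_EVENT_F_WRAP_CTR 15

/* Same bit test as the helper in the patch, on a plain uint16_t. */
static inline bool packed_used_wrap_counter(uint16_t last_used_idx)
{
	return !!(last_used_idx & (1 << VRING_PACKED_EVENT_F_WRAP_CTR));
}

/* Same mask as the helper in the patch: keep only the bits below the wrap bit. */
static inline uint16_t packed_last_used(uint16_t last_used_idx)
{
	return last_used_idx & ~(-(1 << VRING_PACKED_EVENT_F_WRAP_CTR));
}

/*
 * Hypothetical helper (not in the patch): compute the next combined value
 * after consuming 'num' descriptors in a ring of 'ring_num' entries.
 * It mirrors the update in virtqueue_get_buf_ctx_packed: work on local
 * copies, then produce one value that can be stored with a single write.
 */
static inline uint16_t packed_advance(uint16_t last_used_idx, uint16_t num,
				      uint16_t ring_num)
{
	uint16_t last_used = packed_last_used(last_used_idx);
	bool wrap = packed_used_wrap_counter(last_used_idx);

	last_used += num;
	if (last_used >= ring_num) {
		last_used -= ring_num;
		wrap ^= 1;
	}
	return last_used | (wrap << VRING_PACKED_EVENT_F_WRAP_CTR);
}

/* Hypothetical demo: walk a 256-entry ring across one wrap-around. */
void packed_idx_demo(void)
{
	/* Initial state used by the patch: index 0, wrap counter 1. */
	uint16_t idx = 0 | (1 << VRING_PACKED_EVENT_F_WRAP_CTR);

	assert(packed_last_used(idx) == 0);
	assert(packed_used_wrap_counter(idx));

	/* Consume 254 entries: no wrap yet. */
	idx = packed_advance(idx, 254, 256);
	assert(packed_last_used(idx) == 254);
	assert(packed_used_wrap_counter(idx));

	/* Consume 4 more: index wraps to 2 and the counter flips. */
	idx = packed_advance(idx, 4, 256);
	assert(packed_last_used(idx) == 2);
	assert(!packed_used_wrap_counter(idx));
}
```

Because the index and the wrap counter travel in one 16-bit value, a
reader that loads it once always sees a matching pair. That is the
property the patch relies on when it pairs READ_ONCE/WRITE_ONCE with
these accessors.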