Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp763451ybi; Wed, 17 Jul 2019 04:32:39 -0700 (PDT) X-Google-Smtp-Source: APXvYqxU0PqtEmf84/eQmaqYDYjMOa3oUNReXRl+JjVQiNz1VfsDA1KlAfHe0rSSSl14vv0w3Xhb X-Received: by 2002:a17:902:b206:: with SMTP id t6mr43143489plr.195.1563363159157; Wed, 17 Jul 2019 04:32:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1563363159; cv=none; d=google.com; s=arc-20160816; b=Vccqbsdd/WEC/KycFBYBC0+4K+IV1Ad39nIgDPEPVF22fPj015joyWsl06auY7+dwh UEz8dXPROMw5bAUwtN6tD3R9pZ4Uk8fL+7Mi62gRMHvT+Xkje82BdRjrxy03Rshbbx91 8Th+9B9OKEMMD5luS2lYG1oAczarLCENHshVtSRBMRISJRZFUq5nEtiln+5Mbhyqi6p2 oGpYYCBGZYWjTcR2ViP9zVcG4eFR/JsXpaZyALKPgyL+i4+worAbW9aKXv44pGGMsh+L /qSpEyg07MbrsCP9hbuLabbwOPbf+VMn4cjhknYMN93OI/xAX4Vog7NXEUZJ8t5pYfk6 iqRw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=u9Y2xlzbpyC3yc5DJGj0o8s6j3V7HY8hj7hOBqyHyBI=; b=byzD7EDwlxJz/JTO3OOA9J5rqBGs22/BGSnBrTjtVEReoPoWNyhpSRyXgEIp3krGjj e+78sG4+9AKJ8dpn534/D1lRWVc/yIQBwCm9st2NMEBf8D0uhCpnjFEvFGFozznaULke Ds956FRpYHjzU/DX5cLPgoTaWV2XSRQDmcr22PoW/xBHXPPa7bCyMTwXdHVp+cpyJtbk QqcyCNjv5cFBJzX2X5KV/TV0G6Qj/cIHnVGqQVAFQHMun2fozMbVRFsDdXEBRCgmHWIE 9LPBi9rlZQNsIz/L1+nyOcM4z5EtOcNtAKx+Rt/5xDWx6ifgHfG44CqkK/G6kgy05xWT YISA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b14si17785079pgi.434.2019.07.17.04.32.22; Wed, 17 Jul 2019 04:32:39 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731044AbfGQLar (ORCPT + 99 others); Wed, 17 Jul 2019 07:30:47 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43306 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725873AbfGQLap (ORCPT ); Wed, 17 Jul 2019 07:30:45 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A2DB8307D860; Wed, 17 Jul 2019 11:30:44 +0000 (UTC) Received: from steredhat.redhat.com (ovpn-116-100.ams2.redhat.com [10.36.116.100]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8F3495E1A8; Wed, 17 Jul 2019 11:30:42 +0000 (UTC) From: Stefano Garzarella To: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org, Stefan Hajnoczi , "David S. Miller" , virtualization@lists.linux-foundation.org, Jason Wang , kvm@vger.kernel.org, "Michael S. Tsirkin" Subject: [PATCH v4 1/5] vsock/virtio: limit the memory used per-socket Date: Wed, 17 Jul 2019 13:30:26 +0200 Message-Id: <20190717113030.163499-2-sgarzare@redhat.com> In-Reply-To: <20190717113030.163499-1-sgarzare@redhat.com> References: <20190717113030.163499-1-sgarzare@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Wed, 17 Jul 2019 11:30:44 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Since virtio-vsock was introduced, the buffers filled by the host and pushed to the guest using the vring, are directly queued in a per-socket list. These buffers are preallocated by the guest with a fixed size (4 KB). The maximum amount of memory used by each socket should be controlled by the credit mechanism. The default credit available per-socket is 256 KB, but if we use only 1 byte per packet, the guest can queue up to 262144 of 4 KB buffers, using up to 1 GB of memory per-socket. In addition, the guest will continue to fill the vring with new 4 KB free buffers to avoid starvation of other sockets. This patch mitigates this issue copying the payload of small packets (< 128 bytes) into the buffer of last packet queued, in order to avoid wasting memory. Reviewed-by: Stefan Hajnoczi Signed-off-by: Stefano Garzarella --- drivers/vhost/vsock.c | 2 + include/linux/virtio_vsock.h | 1 + net/vmw_vsock/virtio_transport.c | 1 + net/vmw_vsock/virtio_transport_common.c | 60 +++++++++++++++++++++---- 4 files changed, 55 insertions(+), 9 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 6a50e1d0529c..6c8390a2af52 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -329,6 +329,8 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq, return NULL; } + pkt->buf_len = pkt->len; + nbytes = copy_from_iter(pkt->buf, pkt->len, &iov_iter); if (nbytes != pkt->len) { vq_err(vq, "Expected %u byte payload, got %zu bytes\n", diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h index e223e2632edd..7d973903f52e 100644 --- a/include/linux/virtio_vsock.h +++ b/include/linux/virtio_vsock.h @@ -52,6 +52,7 @@ struct virtio_vsock_pkt { /* socket refcnt not held, only use for cancellation */ struct vsock_sock *vsk; void *buf; + u32 buf_len; u32 len; u32 off; bool reply; diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 0815d1357861..082a30936690 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -307,6 +307,7 @@ static void virtio_vsock_rx_fill(struct virtio_vsock *vsock) break; } + pkt->buf_len = buf_len; pkt->len = buf_len; sg_init_one(&hdr, &pkt->hdr, sizeof(pkt->hdr)); diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 6f1a8aff65c5..095221f94786 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -26,6 +26,9 @@ /* How long to wait for graceful shutdown of a connection */ #define VSOCK_CLOSE_TIMEOUT (8 * HZ) +/* Threshold for detecting small packets to copy */ +#define GOOD_COPY_LEN 128 + static const struct virtio_transport *virtio_transport_get_ops(void) { const struct vsock_transport *t = vsock_core_get_transport(); @@ -64,6 +67,9 @@ virtio_transport_alloc_pkt(struct virtio_vsock_pkt_info *info, pkt->buf = kmalloc(len, GFP_KERNEL); if (!pkt->buf) goto out_pkt; + + pkt->buf_len = len; + err = memcpy_from_msg(pkt->buf, info->msg, len); if (err) goto out; @@ -841,24 +847,60 @@ virtio_transport_recv_connecting(struct sock *sk, return err; } +static void +virtio_transport_recv_enqueue(struct vsock_sock *vsk, + struct virtio_vsock_pkt *pkt) +{ + struct virtio_vsock_sock *vvs = vsk->trans; + bool free_pkt = false; + + pkt->len = le32_to_cpu(pkt->hdr.len); + pkt->off = 0; + + spin_lock_bh(&vvs->rx_lock); + + virtio_transport_inc_rx_pkt(vvs, pkt); + + /* Try to copy small packets into the buffer of last packet queued, + * to avoid wasting memory queueing the entire buffer with a small + * payload. + */ + if (pkt->len <= GOOD_COPY_LEN && !list_empty(&vvs->rx_queue)) { + struct virtio_vsock_pkt *last_pkt; + + last_pkt = list_last_entry(&vvs->rx_queue, + struct virtio_vsock_pkt, list); + + /* If there is space in the last packet queued, we copy the + * new packet in its buffer. + */ + if (pkt->len <= last_pkt->buf_len - last_pkt->len) { + memcpy(last_pkt->buf + last_pkt->len, pkt->buf, + pkt->len); + last_pkt->len += pkt->len; + free_pkt = true; + goto out; + } + } + + list_add_tail(&pkt->list, &vvs->rx_queue); + +out: + spin_unlock_bh(&vvs->rx_lock); + if (free_pkt) + virtio_transport_free_pkt(pkt); +} + static int virtio_transport_recv_connected(struct sock *sk, struct virtio_vsock_pkt *pkt) { struct vsock_sock *vsk = vsock_sk(sk); - struct virtio_vsock_sock *vvs = vsk->trans; int err = 0; switch (le16_to_cpu(pkt->hdr.op)) { case VIRTIO_VSOCK_OP_RW: - pkt->len = le32_to_cpu(pkt->hdr.len); - pkt->off = 0; - - spin_lock_bh(&vvs->rx_lock); - virtio_transport_inc_rx_pkt(vvs, pkt); - list_add_tail(&pkt->list, &vvs->rx_queue); - spin_unlock_bh(&vvs->rx_lock); - + virtio_transport_recv_enqueue(vsk, pkt); sk->sk_data_ready(sk); return err; case VIRTIO_VSOCK_OP_CREDIT_UPDATE: -- 2.20.1