Date: Mon, 29 Jul 2019 15:33:49 -0400
From: "Michael S. Tsirkin"
To: Stefano Garzarella
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Stefan Hajnoczi,
    "David S. Miller", virtualization@lists.linux-foundation.org,
    Jason Wang, kvm@vger.kernel.org
Subject: Re: [PATCH v4 1/5] vsock/virtio: limit the memory used per-socket
Message-ID: <20190729152634-mutt-send-email-mst@kernel.org>
References: <20190717113030.163499-1-sgarzare@redhat.com>
    <20190717113030.163499-2-sgarzare@redhat.com>
    <20190729095956-mutt-send-email-mst@kernel.org>
    <20190729153656.zk4q4rob5oi6iq7l@steredhat>
    <20190729115904-mutt-send-email-mst@kernel.org>

On Mon, Jul 29, 2019 at 06:41:27PM +0200, Stefano Garzarella wrote:
> On Mon, Jul 29, 2019 at 12:01:37PM -0400, Michael S. Tsirkin wrote:
> > On Mon, Jul 29, 2019 at 05:36:56PM +0200, Stefano Garzarella wrote:
> > > On Mon, Jul 29, 2019 at 10:04:29AM -0400, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 17, 2019 at 01:30:26PM +0200, Stefano Garzarella wrote:
> > > > > Since virtio-vsock was introduced, the buffers filled by the host
> > > > > and pushed to the guest using the vring, are directly queued in
> > > > > a per-socket list. These buffers are preallocated by the guest
> > > > > with a fixed size (4 KB).
> > > > >
> > > > > The maximum amount of memory used by each socket should be
> > > > > controlled by the credit mechanism.
> > > > > The default credit available per-socket is 256 KB, but if we use
> > > > > only 1 byte per packet, the guest can queue up to 262144 of 4 KB
> > > > > buffers, using up to 1 GB of memory per-socket.
> > > > > In addition, the guest will continue to fill the vring with new
> > > > > 4 KB free buffers to avoid starvation of other sockets.
> > > > >
> > > > > This patch mitigates this issue by copying the payload of small
> > > > > packets (< 128 bytes) into the buffer of last packet queued, in
> > > > > order to avoid wasting memory.
> > > > >
> > > > > Reviewed-by: Stefan Hajnoczi
> > > > > Signed-off-by: Stefano Garzarella
> > > >
> > > > This is good enough for net-next, but for net I think we
> > > > should figure out how to address the issue completely.
> > > > Can we make the accounting precise? What happens to
> > > > performance if we do?
> > > >
> > >
> > > In order to do more precise accounting maybe we can use the buffer size,
> > > instead of payload size when we update the credit available.
> > > In this way, the credit available for each socket will reflect the memory
> > > actually used.
> > >
> > > I should check better, because I'm not sure what happens if the peer sees
> > > 1KB of space available, then it sends 1KB of payload (using a 4KB
> > > buffer).
> > > The other option is to copy each packet in a new buffer like I did in
> > > the v2 [2], but this forces us to make a copy for each packet that does
> > > not fill the entire buffer, perhaps too expensive.
> > >
> > > [2] https://patchwork.kernel.org/patch/10938741/
> > >
> > >
> > So one thing we can easily do is to under-report the
> > available credit. E.g. if we copy up to 256 bytes,
> > then report just 256 bytes for every buffer in the queue.
> >
>
> Ehm sorry, I got lost :(
> Can you explain better?
>
>
> Thanks,
> Stefano

I think I suggested a better idea more recently.
But to clarify this option: we are adding a 4K buffer.
Let's say we know we will always copy 128 bytes.

So we just tell remote we have 128.
If we add another 4K buffer we add another 128 credits.

So we are charging local socket 32x more (4K for a 128 byte packet) but
we are paying remote 32x less (128 credits for a 4K buffer).
It evens out.

Way fewer credits to go around so I'm not sure it's a good idea,
at least as the only solution. Can be combined with other
optimizations and probably in a less drastic fashion
(e.g. 2x rather than 32x).

--
MST
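
For reference, the arithmetic in this thread can be checked with a small model. This is only a sketch of the numbers discussed above, not kernel code: the function names are invented for illustration, and the "memory budget" reading of the under-reporting option is an assumption. Note that 4 KB / 128 bytes is 32, while a 16x ratio corresponds to the 256-byte copy threshold mentioned earlier in the thread.

```python
BUF_SIZE = 4 * 1024          # fixed rx buffer size preallocated by the guest (4 KB)
DEFAULT_CREDIT = 256 * 1024  # default per-socket credit (256 KB)


def worst_case_memory(credit: int, payload: int, buf_size: int = BUF_SIZE) -> int:
    """Memory queued if the peer spends all `credit` on `payload`-byte
    packets and every packet occupies a whole buffer (no copying)."""
    packets = credit // payload
    return packets * buf_size


def under_report_ratio(buf_size: int, credit_per_buffer: int) -> int:
    """How many bytes of local memory back each byte of advertised credit
    when only `credit_per_buffer` is reported per queued buffer."""
    return buf_size // credit_per_buffer


def advertised_credit(memory_budget: int, credit_per_buffer: int,
                      buf_size: int = BUF_SIZE) -> int:
    """Credit the remote sees if the local side caps queued buffers at
    `memory_budget` bytes (hypothetical cap, assumed for illustration)."""
    buffers = memory_budget // buf_size
    return buffers * credit_per_buffer


if __name__ == "__main__":
    print(worst_case_memory(DEFAULT_CREDIT, 1))    # 1-byte packets, 256 KB credit
    print(under_report_ratio(BUF_SIZE, 128))       # 128 credits per 4 KB buffer
    print(advertised_credit(DEFAULT_CREDIT, 128))  # credit left for the remote
```

With a 256 KB budget and 128 credits per buffer, the remote is left with only 8 KB of credit, which is the "fewer credits to go around" cost of this option.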