From: Eric Dumazet
Date: Wed, 9 Dec 2020 00:50:10 +0100
Subject: Re: [PATCH net v1 1/2] lan743x: improve performance: fix rx_napi_poll/interrupt ping-pong
To: Jakub Kicinski
Cc: Sven Van Asbroeck, Bryan Whitehead, Microchip Linux Driver Support,
    David S Miller, Andrew Lunn, netdev, Linux Kernel Mailing List
In-Reply-To: <20201208152948.006606b3@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>
References: <20201206034408.31492-1-TheSven73@gmail.com>
    <20201208115035.74221c31@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>
    <20201208152948.006606b3@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 9, 2020 at 12:29 AM Jakub Kicinski wrote:
>
> On Tue, 8 Dec 2020 17:23:08 -0500 Sven Van Asbroeck wrote:
> > On Tue, Dec 8, 2020 at 2:50 PM Jakub Kicinski wrote:
> > >
> > > > +done:
> > > >  	/* update RX_TAIL */
> > > >  	lan743x_csr_write(adapter, RX_TAIL(rx->channel_number),
> > > >  			  rx_tail_flags | rx->last_tail);
> > > > -done:
> > > > +
> > >
> > > I assume this rings the doorbell to let the device know that more
> > > buffers are available? If so it's a little unusual to do this at the
> > > end of NAPI poll. The more usual place would be to do this every n
> > > times a new buffer is allocated (in lan743x_rx_init_ring_element()?)
> > > That's to say for example ring the doorbell every time a buffer is put
> > > at an index divisible by 16.
> >
> > Yes, I believe it tells the device that new buffers have become available.
> >
> > I wonder why it's so unusual to do this at the end of a napi poll? Omitting
> > this could result in sub-optimal use of buffers, right?
> >
> > Example:
> > - tail is at position 0
> > - core calls napi_poll(weight=64)
> > - napi poll consumes 15 buffers and pushes 15 skbs, then ring empty
> > - index not divisible by 16, so tail is not updated
> > - weight not reached, so napi poll re-enables interrupts and bails out
> >
> > Result: now there are 15 buffers which the device could potentially use, but
> > because the tail wasn't updated, it doesn't know about them.
>
> Perhaps 16 for a device with 64 descriptors is rather high indeed.
> Let's say 8. If the device misses 8 packet buffers on the ring,
> that should be negligible.

mlx4 uses 8 as the threshold (mlx4_en_refill_rx_buffers()).

> Depends on the cost of the CSR write, usually packet processing is
> putting a lot of pressure on the memory subsystem of the CPU, hence
> amortizing the write over multiple descriptors helps. The other thing
> is that you could delay the descriptor writes to write full cache lines,
> but I don't think that will help on IMX6.
>
> > It does make sense to update the tail more frequently than only at the end
> > of the napi poll, though?
> >
> > I'm new to napi polling, so I'm quite interested to learn about this.
>
> There is a tracepoint which records how many packets NAPI has polled:
> napi:napi_poll, you can see easily what your system is doing.
>
> What you want to avoid is the situation where the device has already used
> up all the descriptors by the time the driver finishes the Rx processing.
> That'd result in drops. So the driver should push the buffers back to
> the device reasonably early.
>
> With a ring of 64 descriptors and a NAPI budget of 64 it's not unlikely
> that the ring will be completely used when processing runs.
>
> > > > +	/* up to half of elements in a full rx ring are
> > > > +	 * extension frames. these do not generate skbs.
> > > > +	 * to prevent napi/interrupt ping-pong, limit default
> > > > +	 * weight to the smallest no. of skbs that can be
> > > > +	 * generated by a full rx ring.
> > > > +	 */
> > > >  	netif_napi_add(adapter->netdev,
> > > >  		       &rx->napi, lan743x_rx_napi_poll,
> > > > -		       rx->ring_size - 1);
> > > > +		       (rx->ring_size - 1) / 2);
> > >
> > > This is rather unusual, drivers should generally pass NAPI_POLL_WEIGHT
> > > here.
> >
> > I agree. The problem is that a full ring buffer of 64 buffers will only
> > contain 32 buffers with network data - the others are timestamps.
> >
> > So napi_poll(weight=64) can never reach its full weight. Even with a full
> > buffer, it always assumes that it has to stop polling, and re-enable
> > interrupts, which results in a ping-pong.
>
> Interesting, I don't think we ever had this problem before. Let me CC
> Eric to see if he has any thoughts on the case. AFAIU you should think
> of the weight as a way of arbitrating between devices (if there is more
> than one).

The driver could be called with an arbitrary budget (of 64), and if its
ring buffer has been depleted, it should return @budget instead of the
skb count, and not re-arm the interrupt:

	if (count < budget && !rx_ring_fully_processed) {
		if (napi_complete_done(napi, count))
			re_arm_irqs();
		return count;
	}

	return budget;

> NAPI does not do any deferral (in wall clock time terms) of processing,
> so the only difference you may get for lower weight is that the softirq
> kthread will get a chance to kick in earlier.
>
> > Would it be better to fix the weight counting? Increase the count
> > for every buffer consumed, instead of for every skb pushed?
>
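
A minimal sketch of the refill-threshold ("doorbell every N buffers")
approach discussed above, assuming a batch of 8 as in mlx4. All names
here (struct my_rx_ring, my_rx_refill_one(), MY_RX_REFILL_THRESHOLD,
the tail_reg field) are hypothetical placeholders, not the lan743x
driver's real fields, registers, or functions:

#include <linux/io.h>
#include <linux/types.h>

#define MY_RX_REFILL_THRESHOLD	8	/* same order as mlx4's refill batch */

struct my_rx_ring {
	u32 ring_size;		/* number of descriptors, e.g. 64 */
	u32 next_to_refill;	/* next descriptor index handed back */
	u32 refilled;		/* buffers refilled since last doorbell */
	void __iomem *tail_reg;	/* mapped RX tail doorbell register */
};

/* Publish the current refill index to the device. */
static void my_rx_ring_doorbell(struct my_rx_ring *rx)
{
	writel(rx->next_to_refill, rx->tail_reg);
	rx->refilled = 0;
}

/* Called once per descriptor given back to the device during Rx processing. */
static void my_rx_refill_one(struct my_rx_ring *rx)
{
	rx->next_to_refill = (rx->next_to_refill + 1) % rx->ring_size;

	/* Amortize the CSR write over several descriptors instead of
	 * paying for it on every buffer or only once per NAPI poll.
	 */
	if (++rx->refilled >= MY_RX_REFILL_THRESHOLD)
		my_rx_ring_doorbell(rx);
}

/* At the end of the poll, flush any partial batch so the device is never
 * left unaware of already-refilled buffers.
 */
static void my_rx_refill_flush(struct my_rx_ring *rx)
{
	if (rx->refilled)
		my_rx_ring_doorbell(rx);
}

The final flush keeps the existing behaviour of publishing whatever is
left at the end of the poll, so the scenario Sven describes (15 refilled
buffers the device never learns about) cannot occur even when fewer than
a full batch was refilled.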
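
Eric's completion snippet, expanded into a sketch of a full poll
function. The struct and the helpers my_rx_process_ring() and
my_rx_irq_enable() are placeholders, not the driver's actual API; only
the budget/napi_complete_done() handling is the point. With this shape
the driver can keep the default NAPI_POLL_WEIGHT and still avoid the
interrupt ping-pong when a full ring yields only 32 skbs:

#include <linux/netdevice.h>

struct my_rx {
	struct napi_struct napi;
	/* ring state elided */
};

/* Stand-ins for the driver's real Rx routines. */
int my_rx_process_ring(struct my_rx *rx, int budget, bool *ring_depleted);
void my_rx_irq_enable(struct my_rx *rx);

static int my_rx_napi_poll(struct napi_struct *napi, int budget)
{
	struct my_rx *rx = container_of(napi, struct my_rx, napi);
	bool ring_depleted;	/* was a full ring's worth of descriptors consumed? */
	int skbs;

	/* Consumes up to a full ring of descriptors and pushes at most
	 * @budget skbs; extension (timestamp) frames are consumed but do
	 * not count towards the skb total.
	 */
	skbs = my_rx_process_ring(rx, budget, &ring_depleted);

	if (skbs < budget && !ring_depleted) {
		/* Caught up with the device with budget to spare: stop
		 * polling and re-arm the interrupt, unless NAPI was
		 * rescheduled in the meantime.
		 */
		if (napi_complete_done(napi, skbs))
			my_rx_irq_enable(rx);
		return skbs;
	}

	/* Budget exhausted, or the ring was fully used up (possibly while
	 * producing fewer than @budget skbs because half of it was
	 * extension frames): report the full budget so the core polls
	 * again instead of ping-ponging between interrupt and NAPI.
	 */
	return budget;
}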