Received: by 10.192.165.148 with SMTP id m20csp93794imm; Thu, 26 Apr 2018 16:45:31 -0700 (PDT) X-Google-Smtp-Source: AB8JxZoCW9BG3aUdHItEn8Df3RzNOni5fcW2DTYwjcII6TzhUh3IiNSosSyUWpDytOltbSY7A9EM X-Received: by 10.98.242.74 with SMTP id y10mr67932pfl.75.1524786331803; Thu, 26 Apr 2018 16:45:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1524786331; cv=none; d=google.com; s=arc-20160816; b=kppjytKNcZjDtcF0NZRRxLlTULFIkM1tDRJ26AYm9RcCg9KuF0UNSESjHe8iB54GvO aDS/x7UQmeWxQ8h6bCCZsxKI4/aIf7AzKD5afgQV0fQ7IDjfK2+LbswqqsxcVR7fHzir 59dYZruIOCbzpB2Wdy2tg/V52rDB4V/93hFPH06oXYrMACT2w2wNYMvr7vNRyPSmljcG iURfR39OE9t9fBf6p3pbZTLBuW0rxvkJXSzxEfx7rIjNCXFlP0Q5LTMuj1EjRw+FIipp D4exg2GtjbGv1HoG88k+cqKMCuRpJvVoTVxTrX+qwGpf+hJaaFT4oh2LDH7yTI1gjuFf y32w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=RG3vvdU32+nyaEg0+2iIuCthEfrd9RdFGXjrzMwdgWE=; b=Xg2UyntXde0Li4ok0iJF5GYFfD7ApOErvTM+qb5mcav7cpZ6rNSpndjtaVO9QRnWC2 YYTp6DQnryxRkxWE/AYsqTVuGXcN9rTWiT9KdzyaImeMg/4xZ+pBzQelIVdINTxNq+ns jaDjHD8E5aCHKfvIa+9/hwpireDdqhlBBkseITw/UjDIONhiUcJLRC8RuYiOvxBD3iZS 2ycx1G+m+vspyiYazZ7hQ2qFdunuq0gh1IuOt7VIjcjOUr64O9xmYWlSJTPPA5HaQ/zL /VVQdldfMg64z+fLSH94ixQrfLzjtprUqALqPwhWV0qoM5PorRoHdv31W3JpJ6lq0G8j dDlA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@dev-mellanox-co-il.20150623.gappssmtp.com header.s=20150623 header.b=CkbfjyOz; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mellanox.co.il Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b7si4360pfl.223.2018.04.26.16.45.17; Thu, 26 Apr 2018 16:45:31 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@dev-mellanox-co-il.20150623.gappssmtp.com header.s=20150623 header.b=CkbfjyOz; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=mellanox.co.il Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757187AbeDZXn6 (ORCPT + 99 others); Thu, 26 Apr 2018 19:43:58 -0400 Received: from mail-lf0-f65.google.com ([209.85.215.65]:35388 "EHLO mail-lf0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757106AbeDZXny (ORCPT ); Thu, 26 Apr 2018 19:43:54 -0400 Received: by mail-lf0-f65.google.com with SMTP id r125-v6so119030lfe.2 for ; Thu, 26 Apr 2018 16:43:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dev-mellanox-co-il.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=RG3vvdU32+nyaEg0+2iIuCthEfrd9RdFGXjrzMwdgWE=; b=CkbfjyOzG/+LWfXauHaAJWNHmSxbsAGuaVZbjSUD/Pq1NIy6/fgifR08SdBEIEYJIv tHGMmgMHLRXQdn9fmz9bRUvhtYzZfI4x1YP201aAS/kgHq+5zvxcAZhCmLk6bfAYFKUZ S54rJ0/Y3mYI3m2BgNKFtmeeoJG+e2Il4VkqqaY3pPeGajsc3v/23grApZjx2RqxhU5b tM45aVFIXBNRw6KEFDkpp/p9tYkV5h8LLUHncP1eFWuiwgScSMK0iU8l49G8uHX6zthB N1UTwZwZqd8hvYptVoCbX+YLlo6lBEWucffeb6Ytg7b+AtQNfCKEGrqmdnIXJP7SRfo5 /44g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=RG3vvdU32+nyaEg0+2iIuCthEfrd9RdFGXjrzMwdgWE=; b=BYpswYd9r64OJvlRPEKJrk3hiUOaHcp2mVpS4x1qhHfMQnmhGRu2gYLbcu/3tLsMbx aCyBuwABvce77exWuHUfTC594LMNXFI/+8w20WUsHrhwp5uYl1NGqMGYlq85LXN0Wfu1 MwjdfbI1ATKaAO7KXTU6XmRUtiftDwDJ4Kh4ZsdeMMbnWeU8CNB7aOBRrQg774u7pD3a 8x4i9/2DPt05PzLiydBP65pwk/m+S6MhCb8R8y2+Y9YKGLiagPoOz+5SdD7QDHQoKedq 1TDqSmX5PRuJvh5BEzog/m/6APE6Ib8RU7EBeGMrSyrwfbAkxM2wJDOLd7vkCWmfLIyt 8tTw== X-Gm-Message-State: ALQs6tC+JyU2q9JQ3uuJ9HkMoaqX95oU9pO9qbtrIpxGa4trTNJjHvAh jLJaoKHB7cIu7ypQx58tah6vddro+qXGKA6ESc/2hbnH X-Received: by 2002:a2e:5c8:: with SMTP id 191-v6mr44930ljf.136.1524786232937; Thu, 26 Apr 2018 16:43:52 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a19:2947:0:0:0:0:0 with HTTP; Thu, 26 Apr 2018 16:43:32 -0700 (PDT) In-Reply-To: References: <1524775057-8012-1-git-send-email-qing.huang@oracle.com> <5cd29779-fc70-930a-7060-ee3332c15558@oracle.com> From: Saeed Mahameed Date: Thu, 26 Apr 2018 16:43:32 -0700 Message-ID: Subject: Re: [PATCH] net/mlx5: report persistent netdev stats across ifdown/ifup commands To: Qing Huang Cc: Eran Ben Elisha , linux-kernel , RDMA mailing list , Linux Netdev List , Leon Romanovsky , Matan Barak , Saeed Mahameed Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Apr 26, 2018 at 4:30 PM, Saeed Mahameed wrote: > On Thu, Apr 26, 2018 at 3:37 PM, Qing Huang wrote: >> >> >> On 04/26/2018 02:50 PM, Saeed Mahameed wrote: >>> >>> On Thu, Apr 26, 2018 at 1:37 PM, Qing Huang wrote: >>>> >>>> Current stats collecting scheme in mlx5 driver is to periodically fetch >>>> aggregated stats from all the active mlx5 software channels associated >>>> with the device. However when a mlx5 interface is brought down(ifdown), >>>> all the channels will be deactivated and closed. A new set of channels >>>> will be created when next ifup command or a similar command is called. >>>> Unfortunately the new channels will have all stats reset to 0. So you >>>> lose the accumulated stats information. This behavior is different from >>>> other netdev drivers including the mlx4 driver. In order to fix it, we >>>> now save prior mlx5 software stats into netdev stats fields, so all the >>>> accumulated stats will survive multiple runs of ifdown/ifup commands and >>>> be shown correctly. >>>> >>>> Orabug: 27548610 > > I don't think we need this internal bug reference, let's remove it. > >>>> >>>> Signed-off-by: Qing Huang >>>> --- >>> >>> Hi Qing, >>> >>> I am adding Eran since he is currently working on a similar patch, >>> He is also taking care of all cores/rings stats to make them >>> persistent, so you won't have discrepancy between >>> ethtool and ifconfig stats. >>> >>> I am ok with this patch, but this means Eran has to work his way around >>> it. >>> >>> so we have two options: >>> >>> 1. Temporary accept this patch, and change it later with Eran's work. >>> 2. Wait for Eran's work. >>> >>> I am ok with either one of them, please let me know. >>> >>> Thanks ! >> >> Hi Saeed, >> >> Any idea on rough ETA of Eran's stats work to be in upstream? If it will be > > All i know that it is planned for v4.18. > >> available soon, I think >> we can wait a bit. If it will take a while to redesign the whole stats >> scheme (for both ethtool and ifconfig), >> maybe we can go with this incremental fix first? >> Qing, Before you address my comments, it looks like Eran's work is converging and we will finalize the internal review next week, in which case submission will take 2-3 weeks from today. So i suggest to wait. please checkout: https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=testing/mlx5e-stats&id=15ffb7f87432d073e8ac0e6b895188d40fdda4d4 > > Yes we can go with incremental fix, > but let's fix the commit message as requested above and make the patch > title prefix "net/mlx5e:" > since this is a mlx5 netdev related change, not mlx5/core. > also please see a comments below. > > Thanks, > Saeed. > >> Thanks! >> >> >>> >>>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 30 >>>> +++++++++++++++++++---- >>>> 1 file changed, 25 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> index f1fe490..5d50e69 100644 >>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c >>>> @@ -2621,6 +2621,23 @@ static void mlx5e_netdev_set_tcs(struct net_device >>>> *netdev) >>>> netdev_set_tc_queue(netdev, tc, nch, 0); >>>> } >>>> >>>> +static void mlx5e_netdev_save_stats(struct mlx5e_priv *priv) >>>> +{ >>>> + struct net_device *netdev = priv->netdev; >>>> + >>>> + netdev->stats.rx_packets += priv->stats.sw.rx_packets; >>>> + netdev->stats.rx_bytes += priv->stats.sw.rx_bytes; >>>> + netdev->stats.tx_packets += priv->stats.sw.tx_packets; >>>> + netdev->stats.tx_bytes += priv->stats.sw.tx_bytes; >>>> + netdev->stats.tx_dropped += priv->stats.sw.tx_queue_dropped; >>>> + >>>> + priv->stats.sw.rx_packets = 0; >>>> + priv->stats.sw.rx_bytes = 0; >>>> + priv->stats.sw.tx_packets = 0; >>>> + priv->stats.sw.tx_bytes = 0; >>>> + priv->stats.sw.tx_queue_dropped = 0; > > Note: it is safe to do this here since writing to priv->stats.sw is > currently done under priv->state_lock, > Please add a comment to this function that it must be called under > priv->state_lock > >>>> +} >>>> + >>> >>> This means that we are now explicitly clearing channels stats on >>> ifconfig down or switch_channels. >>> and now after ifconfing down, ethtool will always show 0, before this >>> patch it didn't. >>> Anyway update sw stats function will always override them with the new >>> channels stats next time we load new channels. >>> so it is not that big of a deal. >>> >>> >>>> static void mlx5e_build_channels_tx_maps(struct mlx5e_priv *priv) >>>> { >>>> struct mlx5e_channel *c; >>>> @@ -2691,6 +2708,7 @@ void mlx5e_switch_priv_channels(struct mlx5e_priv >>>> *priv, >>>> netif_set_real_num_tx_queues(netdev, new_num_txqs); >>>> >>>> mlx5e_deactivate_priv_channels(priv); >>>> + mlx5e_netdev_save_stats(priv); > > This function call doesn't have to be between deactivate and close, > lets move it to after > mlx5e_close_channels here and below to avoid confusion. > > also you need to accumulate what is left in the rings/channels stats, > before calling > mlx5e_close_channels, see mlx5e_grp_sw_update_stats. > >>>> mlx5e_close_channels(&priv->channels); >>>> >>>> priv->channels = *new_chs; >>>> @@ -2770,6 +2788,7 @@ int mlx5e_close_locked(struct net_device *netdev) >>>> >>>> netif_carrier_off(priv->netdev); >>>> mlx5e_deactivate_priv_channels(priv); >>>> + mlx5e_netdev_save_stats(priv); >>>> mlx5e_close_channels(&priv->channels); >>>> >>>> return 0; >>>> @@ -3215,11 +3234,12 @@ static int mlx5e_setup_tc(struct net_device *dev, >>>> enum tc_setup_type type, >>>> stats->tx_packets = PPORT_802_3_GET(pstats, >>>> a_frames_transmitted_ok); >>>> stats->tx_bytes = PPORT_802_3_GET(pstats, >>>> a_octets_transmitted_ok); >>>> } else { >>>> - stats->rx_packets = sstats->rx_packets; >>>> - stats->rx_bytes = sstats->rx_bytes; >>>> - stats->tx_packets = sstats->tx_packets; >>>> - stats->tx_bytes = sstats->tx_bytes; >>>> - stats->tx_dropped = sstats->tx_queue_dropped; >>>> + stats->rx_packets = sstats->rx_packets + >>>> dev->stats.rx_packets; >>>> + stats->rx_bytes = sstats->rx_bytes + >>>> dev->stats.rx_bytes; >>>> + stats->tx_packets = sstats->tx_packets + >>>> dev->stats.tx_packets; >>>> + stats->tx_bytes = sstats->tx_bytes + >>>> dev->stats.tx_bytes; >>>> + stats->tx_dropped = sstats->tx_queue_dropped + >>>> + dev->stats.tx_dropped; >>>> } >>>> >>>> stats->rx_dropped = priv->stats.qcnt.rx_out_of_buffer; >>>> -- >>>> 1.8.3.1 >>>> >>