Some fixes to normalize how rx_dropped is calculated. This is the
product of a discussion on netdev on or about 18 August 2005 w/
the subject '[RFC] stats: how to count "good" packets dropped by
hardware?'
Patches for 3c59x, e1000, e100, ixgb, and tg3 to follow.
Do not count frames dropped by the hardware as part of rx_dropped.
Signed-off-by: John W. Linville <[email protected]>
---
drivers/net/e1000/e1000_main.c | 1 -
1 files changed, 1 deletion(-)
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2544,7 +2544,6 @@ e1000_update_stats(struct e1000_adapter
adapter->stats.crcerrs + adapter->stats.algnerrc +
adapter->stats.rlec + adapter->stats.mpc +
adapter->stats.cexterr;
- adapter->net_stats.rx_dropped = adapter->stats.mpc;
adapter->net_stats.rx_length_errors = adapter->stats.rlec;
adapter->net_stats.rx_crc_errors = adapter->stats.crcerrs;
adapter->net_stats.rx_frame_errors = adapter->stats.algnerrc;
Do not count non-error frames dropped by the hardware as
rx_errors. Instead, count them as part of rx_missed_errors.
Signed-off-by: John W. Linville <[email protected]>
---
drivers/net/tg3.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -6893,8 +6893,7 @@ static struct net_device_stats *tg3_get_
get_stat64(&hw_stats->tx_octets);
stats->rx_errors = old_stats->rx_errors +
- get_stat64(&hw_stats->rx_errors) +
- get_stat64(&hw_stats->rx_discards);
+ get_stat64(&hw_stats->rx_errors);
stats->tx_errors = old_stats->tx_errors +
get_stat64(&hw_stats->tx_errors) +
get_stat64(&hw_stats->tx_mac_errors) +
@@ -6922,6 +6921,9 @@ static struct net_device_stats *tg3_get_
stats->rx_crc_errors = old_stats->rx_crc_errors +
calc_crc_errors(tp);
+ stats->rx_missed_errors = old_stats->rx_missed_errors +
+ get_stat64(&hw_stats->rx_discards);
+
return stats;
}
Only increment rx_dropped in case of lack of resources (i.e. not for
frames with errors).
Signed-off-by: John W. Linville <[email protected]>
---
drivers/net/3c59x.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -2598,8 +2598,8 @@ static int vortex_rx(struct net_device *
} else if (vortex_debug > 0)
printk(KERN_NOTICE "%s: No memory to allocate a sk_buff of "
"size %d.\n", dev->name, pkt_len);
+ vp->stats.rx_dropped++;
}
- vp->stats.rx_dropped++;
issue_and_wait(dev, RxDiscard);
}
Do not count frames dropped by the hardware as part of rx_dropped.
Signed-off-by: John W. Linville <[email protected]>
---
drivers/net/ixgb/ixgb_main.c | 2 --
1 files changed, 2 deletions(-)
diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
--- a/drivers/net/ixgb/ixgb_main.c
+++ b/drivers/net/ixgb/ixgb_main.c
@@ -1616,8 +1616,6 @@ ixgb_update_stats(struct ixgb_adapter *a
adapter->stats.icbc +
adapter->stats.ecbc + adapter->stats.mpc;
- adapter->net_stats.rx_dropped = adapter->stats.mpc;
-
/* see above
* adapter->net_stats.rx_length_errors = adapter->stats.rlec;
*/
Do not count non-error frames dropped by the hardware as
part of rx_dropped. Instead, count those frames dropped as
rx_missed_errors. Also, do not count other error frames as part of
rx_dropped. Finally, do not count oversized frames in rx_dropped
(since they are counted as part of rx_length_errors).
Signed-off-by: John W. Linville <[email protected]>
---
drivers/net/e100.c | 4 +---
1 files changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/e100.c b/drivers/net/e100.c
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -1387,13 +1387,13 @@ static void e100_update_stats(struct nic
ns->collisions += nic->tx_collisions;
ns->tx_errors += le32_to_cpu(s->tx_max_collisions) +
le32_to_cpu(s->tx_lost_crs);
- ns->rx_dropped += le32_to_cpu(s->rx_resource_errors);
ns->rx_length_errors += le32_to_cpu(s->rx_short_frame_errors) +
nic->rx_over_length_errors;
ns->rx_crc_errors += le32_to_cpu(s->rx_crc_errors);
ns->rx_frame_errors += le32_to_cpu(s->rx_alignment_errors);
ns->rx_over_errors += le32_to_cpu(s->rx_overrun_errors);
ns->rx_fifo_errors += le32_to_cpu(s->rx_overrun_errors);
+ ns->rx_missed_errors += le32_to_cpu(s->rx_resource_errors);
ns->rx_errors += le32_to_cpu(s->rx_crc_errors) +
le32_to_cpu(s->rx_alignment_errors) +
le32_to_cpu(s->rx_short_frame_errors) +
@@ -1727,12 +1727,10 @@ static inline int e100_rx_indicate(struc
if(unlikely(!(rfd_status & cb_ok))) {
/* Don't indicate if hardware indicates errors */
- nic->net_stats.rx_dropped++;
dev_kfree_skb_any(skb);
} else if(actual_size > ETH_DATA_LEN + VLAN_ETH_HLEN) {
/* Don't indicate oversized frames */
nic->rx_over_length_errors++;
- nic->net_stats.rx_dropped++;
dev_kfree_skb_any(skb);
} else {
nic->net_stats.rx_packets++;
John W. Linville wrote:
> Some fixes to normalize how rx_dropped is calculated. This is the
> product of a discussion on netdev on or about 18 August 2005 w/
> the subject '[RFC] stats: how to count "good" packets dropped by
> hardware?'
>
> Patches for 3c59x, e1000, e100, ixgb, and tg3 to follow.
For e.g. e1000, are we sure that packets dropped by hardware are
accounted elsewhere?
Jeff
On Mon, Sep 12, 2005 at 02:53:31PM -0400, Jeff Garzik wrote:
> For e.g. e1000, are we sure that packets dropped by hardware are
> accounted elsewhere?
The e100 and tg3 patches move the count of those frames to
rx_missed_errors. e1000 and ixgb were already counting them there in
addition to rx_discards, so they were simply removed from rx_discards.
3c59x was counting other errors in rx_discards, so they were removed
from that count.
John
--
John W. Linville
[email protected]
John W. Linville wrote:
> On Mon, Sep 12, 2005 at 02:53:31PM -0400, Jeff Garzik wrote:
>
>
>>For e.g. e1000, are we sure that packets dropped by hardware are
>>accounted elsewhere?
>
>
> The e100 and tg3 patches move the count of those frames to
> rx_missed_errors. e1000 and ixgb were already counting them there in
> addition to rx_discards, so they were simply removed from rx_discards.
> 3c59x was counting other errors in rx_discards, so they were removed
> from that count.
Whatever became of this discussion? It seems that the e1000 driver
in 2.6.13.2 uses rx_errors as the total of all receive errors, while
the patch John was proposing broke them out into separate counters.
It doesn't matter too much to me either way, but I'd like for there to
be a precisely documented definition for the various net-stats so that
I can correctly show the values to user-space (I can certainly add rx_discards
to rx_errors for a 'total rx errors' value, but I need to know whether
rx_discards is already in rx_errors to keep from counting things twice.)
Jeff: Could you lay down the law somewhere in the Documentation/
directory and then let us start fixing any driver that does it differently?
Thanks,
Ben
--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com
On Mon, Oct 24, 2005 at 02:35:42PM -0700, Ben Greear wrote:
> It doesn't matter too much to me either way, but I'd like for there to
> be a precisely documented definition for the various net-stats so that
> I can correctly show the values to user-space (I can certainly add
> rx_discards
> to rx_errors for a 'total rx errors' value, but I need to know whether
> rx_discards is already in rx_errors to keep from counting things twice.)
My opinion is that:
-- rx_errors should count all "on the wire" hardware errors;
-- rx_missed_errors should count frames w/ no "on the wire"
errors that cannot be received by the hardware (generally
due to lack of DMA bufers); and,
-- rx_discards should count frames dropped by the kernel
after successful reception by the hardware.
I do _not_ think rx_missed_errors should be counted as part of
rx_errors, but I could be persuaded otherwise.
> Jeff: Could you lay down the law somewhere in the Documentation/
> directory and then let us start fixing any driver that does it differently?
It does seem like a netdev stats clarification doc would be
appropriate. Does anyone have the beginnings of this?
John
--
John W. Linville
[email protected]
John W. Linville wrote:
> On Mon, Oct 24, 2005 at 02:35:42PM -0700, Ben Greear wrote:
>
>
>>It doesn't matter too much to me either way, but I'd like for there to
>>be a precisely documented definition for the various net-stats so that
>>I can correctly show the values to user-space (I can certainly add
>>rx_discards
>>to rx_errors for a 'total rx errors' value, but I need to know whether
>>rx_discards is already in rx_errors to keep from counting things twice.)
>
>
> My opinion is that:
>
> -- rx_errors should count all "on the wire" hardware errors;
>
> -- rx_missed_errors should count frames w/ no "on the wire"
> errors that cannot be received by the hardware (generally
> due to lack of DMA bufers); and,
>
> -- rx_discards should count frames dropped by the kernel
> after successful reception by the hardware.
>
> I do _not_ think rx_missed_errors should be counted as part of
> rx_errors, but I could be persuaded otherwise.
Well, if we have rx_errors containing any of the other more specific
error counts (reported in the net-stats struct), I don't see a reason
not to include all of them in the counter. I think my preference would
be to have rx_errors be every conceivable frame that we know was sent to
us but which did not get properly delivered to the software stack.
Each error would also fall into it's more specific counters.
That way, the rx-errors counter can be used for folks who just care
that the packet was not correctly received, and those that care about
the details can look at the individual errors (and sum them up in various
configurations due to personal taste, etc.)
That said, rx-errors would then be duplicate info because we could arrive
at it's value by just adding up all the other error counters....
> It does seem like a netdev stats clarification doc would be
> appropriate. Does anyone have the beginnings of this?
It's not authoritative, but I scrounged this info from various people,
including Mr Becker some years ago. I undoubtedly made some of this up
myself, and there could be errors of course:
rx_errors: Total of all rx errors
rx_dropped: Dropped on receive, usually due to kernel being over-worked.
rx_length: Dropped because pkt-length was invalid.
rx_over: Dropped because we over-ran the NIC's rx buffers.
rx_crc: Packets received with bad CRC errors.
rx_frame: Framing errors (errors at the physical layer), usually cable or hardware error.
rx_fifo: Dropped due to Kernel buffers being full (I guess rx-over could be NIC only, rx-fifo be kernel/driver only.)
rx_missed: Dropped due to not handling IRQ in time.
tx_abort: Failed to TX due to driver abort.
tx_carrier: Failed to TX due to lack of carrier signal.
tx_fifo: Over-ran the driver/kernel buffer(s).
tx_heartbeat: Failed to TX due to transceiver heartbeat errors.
tx_window: Failed to TX due to out-of-window error.
Thanks,
Ben
>
> John
--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com
On Mon, 2005-24-10 at 18:15 -0700, Ben Greear wrote:
[..]
> That way, the rx-errors counter can be used for folks who just care
> that the packet was not correctly received, and those that care about
> the details can look at the individual errors (and sum them up in various
> configurations due to personal taste, etc.)
>
Look at http://www.faqs.org/rfcs/rfc2665.html
for a more finer breakdown.
rx/tx_errors may be further broken down.
cheers,
jamal
On Tuesday 25 October 2005 03:15, Ben Greear wrote:
> rx_errors: Total of all rx errors
If it is a total, then we don't need to store it, since we can calculate
that explicity on request, no?
> rx_dropped: Dropped on receive, usually due to kernel being over-worked.
> rx_length: Dropped because pkt-length was invalid.
> rx_over: Dropped because we over-ran the NIC's rx buffers.
> rx_crc: Packets received with bad CRC errors.
> rx_frame: Framing errors (errors at the physical layer), usually cable or hardware error.
> rx_fifo: Dropped due to Kernel buffers being full (I guess rx-over could be NIC only, rx-fifo be kernel/driver only.)
> rx_missed: Dropped due to not handling IRQ in time.
Regards
Ingo Oeser