commit 5337407c changed NETDEV WATCHDOG messages into a message
that will only print once per driver load. This removed a significant amount
of information from an admin who might be missing that his system was having
NETDEV WATCHDOGs, esp since there is no other global counter available to
count these events.

simply check the __warned flag and print a simple version of the message
without the full stack dump if the (kerneloops related) WARN_ON_ONCE has
already logged the hardware type and one hang.

Signed-off-by: Jesse Brandeburg <[email protected]>
CC: Arjan <[email protected]>
---
include/asm-generic/bug.h | 5 +++++
net/sched/sch_generic.c | 9 +++++++--
2 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 18c435d..ad810a0 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -132,6 +132,11 @@ extern void warn_slowpath_null(const char *file, const int line);
unlikely(__ret_warn_once); \
})
+#define WARNED_ALREADY() ({ \
+ static bool __warned; \
+ unlikely(__warned); \
+})
+
#define WARN_ON_RATELIMIT(condition, state) \
WARN_ON((condition) && __ratelimit(state))
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 5173c1e..28fb14f 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -251,8 +251,13 @@ static void dev_watchdog(unsigned long arg)
if (some_queue_timedout) {
char drivername[64];
- WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
- dev->name, netdev_drivername(dev, drivername, 64), i);
+ /* FIXME: is there a way to const char string[] = "NETDEV WATCHDOG..." */
+ if (!WARNED_ALREADY())
+ WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
+ dev->name, netdev_drivername(dev, drivername, 64), i);
+ else
+ printk(KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
+ dev->name, netdev_drivername(dev, drivername, 64), i);
dev->netdev_ops->ndo_tx_timeout(dev);
}
if (!mod_timer(&dev->watchdog_timer,
On Fri, 2010-01-22 at 13:43 -0800, Jesse Brandeburg wrote:
> commit 5337407c changed NETDEV WATCHDOG messages into a message
> that will only print once per driver load. This removed a significant amount
> of information from an admin who might be missing that his system was having
> NETDEV WATCHDOGs, esp since there is no other global counter available to
> count these events.
>
> simply check the __warned flag and print a simple version of the message
> without the full stack dump if the (kerneloops related) WARN_ON_ONCE has
> already logged the hardware type and one hang.
[...]
> diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
> index 18c435d..ad810a0 100644
> --- a/include/asm-generic/bug.h
> +++ b/include/asm-generic/bug.h
> @@ -132,6 +132,11 @@ extern void warn_slowpath_null(const char *file, const int line);
> unlikely(__ret_warn_once); \
> })
>
> +#define WARNED_ALREADY() ({ \
> + static bool __warned; \
> + unlikely(__warned); \
> +})
It is indeed unlikely that __warned will be true, given there is no
statement to set it...
I think this could be a generic macro:
#define first_time() ({ \
static bool __been_here; \
!__been_here++; \
})
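
For illustration, here is a minimal userspace sketch of the one-shot idea, assuming GNU C statement expressions (gcc/clang). FIRST_TIME(), site_a(), site_b() and report_timeout() are hypothetical names, not existing kernel interfaces. Two things it tries to show: every expansion declares its own static flag, so a flag declared in one macro can never observe a flag set inside another macro, and the flag has to be read before it is set (a bare post-increment yields the previous value) for the helper to be true on the first call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not an existing kernel macro: evaluates to true the
 * first time a given call site is reached and to false afterwards.
 * Relies on GNU C statement expressions. Not thread-safe.
 */
#define FIRST_TIME() ({				\
	static bool been_here_;			\
	bool first_ = !been_here_;		\
	been_here_ = true;			\
	first_;					\
})

/* Each expansion owns a separate static flag, so call sites are independent. */
bool site_a(void) { return FIRST_TIME(); }
bool site_b(void) { return FIRST_TIME(); }

/*
 * Report a timeout: a verbose report on the first hit, one short line on
 * later hits. Returns true when the verbose report was chosen.
 */
bool report_timeout(const char *name, unsigned int queue)
{
	assert(name != NULL);

	if (FIRST_TIME()) {
		printf("WARNING (full report): %s: transmit queue %u timed out\n",
		       name, queue);
		return true;
	}
	printf("%s: transmit queue %u timed out\n", name, queue);
	return false;
}
```

Calling report_timeout() twice should give the verbose line once and the short line afterwards, while site_a() and site_b() each report their own first call separately.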
> #define WARN_ON_RATELIMIT(condition, state) \
> WARN_ON((condition) && __ratelimit(state))
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 5173c1e..28fb14f 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -251,8 +251,13 @@ static void dev_watchdog(unsigned long arg)
>
> if (some_queue_timedout) {
> char drivername[64];
> - WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
> - dev->name, netdev_drivername(dev, drivername, 64), i);
> + /* FIXME: is there a way to const char string[] = "NETDEV WATCHDOG..." */
[...]
Maybe you could, you know, just write that declaration... though a
'static' in front wouldn't hurt.
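
A rough userspace sketch of that declaration, with snprintf() standing in for the kernel print calls. watchdog_fmt and format_watchdog() are hypothetical names chosen for the example. One caveat worth keeping in mind: KERN_INFO is glued onto the message by string-literal concatenation, which does not work with an array, so in kernel code the level prefix would presumably have to move into the initializer.

```c
#include <assert.h>
#include <stdio.h>

/* Defined once, so both the first report and later reports share the text. */
static const char watchdog_fmt[] =
	"NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n";

/*
 * Hypothetical stand-in for the print call: formats the message into buf
 * and returns what snprintf() returns (the untruncated message length).
 */
int format_watchdog(char *buf, size_t len, const char *dev,
		    const char *driver, unsigned int queue)
{
	assert(buf != NULL);

	return snprintf(buf, len, watchdog_fmt, dev, driver, queue);
}
```

Because the array is const and its initializer is visible, the compiler can usually still check the arguments against the format.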
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
From: Jesse Brandeburg <[email protected]>
Date: Fri, 22 Jan 2010 13:43:33 -0800
> commit 5337407c changed NETDEV WATCHDOG messages into a message
> that will only print once per driver load. This removed a significant amount
> of information from an admin who might be missing that his system was having
> NETDEV WATCHDOGs, esp since there is no other global counter available to
> count these events.
It's not once per driver load, it's once globally.

Once per driver load would in fact be what I would consider
more reasonable, so put the boolean state into
struct net_device, and test it to decide whether to do the
WARN_ON() print.

Doing a message every time is way overboard and is going to
spam some people's systems to the point where they can't
even diagnose the problem, so I'm not accepting a patch
which does that.
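
As a sketch of the per-device approach suggested above (userspace only, not the actual kernel change): struct fake_netdev is a trimmed-down stand-in for the device structure, and the watchdog_warned field and report_tx_timeout() are assumed names. The flag lives in the device, so each device reports its first timeout once and stays quiet afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, trimmed-down stand-in for the kernel's device structure. */
struct fake_netdev {
	const char *name;
	bool watchdog_warned;	/* assumed new field: set after the first report */
};

/*
 * Called on every transmit timeout. Emits one report per device and is
 * silent for that device afterwards. Returns true if a report was emitted.
 */
bool report_tx_timeout(struct fake_netdev *dev, unsigned int queue)
{
	assert(dev != NULL);

	if (dev->watchdog_warned)
		return false;

	dev->watchdog_warned = true;
	fprintf(stderr, "NETDEV WATCHDOG: %s: transmit queue %u timed out\n",
		dev->name, queue);
	return true;
}
```

A second device starts with its own cleared flag, so a hang on eth1 is still reported after eth0 has already warned.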