2008-02-25 20:37:13

by Steven Hawkes

Subject: printk_ratelimit and net_ratelimit conflict and tunable behavior

From: Steve Hawkes <[email protected]>

The printk_ratelimit() and net_ratelimit() functions each have their own
tunable parameters to control their respective rate limiting feature, but
they share common state variables, preventing independent tuning of the
parameters from working correctly. Also, changes to rate limiting tunable
parameters do not always take effect properly since state is not recomputed
when changes occur. For example, if ratelimit_burst is increased while rate
limiting is occurring, the change won't take full effect until at least
enough time between messages occurs so that the toks value reaches
ratelimit_burst * ratelimit_jiffies. This can result in messages being
suppressed when they should be allowed.

Implement independent state for printk_ratelimit() and net_ratelimit(), and
update state when tunables are changed.

Signed-off-by: Steve Hawkes <[email protected]>
---
diff -uprN linux-2.6.24/include/linux/kernel.h linux-2.6.24-printk_ratelimit/include/linux/kernel.h
--- linux-2.6.24/include/linux/kernel.h 2008-01-24 16:58:37.000000000 -0600
+++ linux-2.6.24-printk_ratelimit/include/linux/kernel.h 2008-02-21 11:20:41.751197312 -0600
@@ -196,8 +196,19 @@ static inline int log_buf_copy(char *des

unsigned long int_sqrt(unsigned long);

+struct printk_ratelimit_state
+{
+ unsigned long toks;
+ unsigned long last_jiffies;
+ int missed;
+ int limit_jiffies;
+ int limit_burst;
+ char const *facility;
+};
+
extern int printk_ratelimit(void);
-extern int __printk_ratelimit(int ratelimit_jiffies, int ratelimit_burst);
+extern int __printk_ratelimit(int ratelimit_jiffies, int ratelimit_burst,
+ struct printk_ratelimit_state *state);
extern bool printk_timed_ratelimit(unsigned long *caller_jiffies,
unsigned int interval_msec);

diff -uprN linux-2.6.24/kernel/printk.c linux-2.6.24-printk_ratelimit/kernel/printk.c
--- linux-2.6.24/kernel/printk.c 2008-01-24 16:58:37.000000000 -0600
+++ linux-2.6.24-printk_ratelimit/kernel/printk.c 2008-02-21 11:22:27.442319625 -0600
@@ -1238,35 +1238,41 @@ void tty_write_message(struct tty_struct
/*
* printk rate limiting, lifted from the networking subsystem.
*
- * This enforces a rate limit: not more than one kernel message
- * every printk_ratelimit_jiffies to make a denial-of-service
- * attack impossible.
+ * This enforces a rate limit to mitigate denial-of-service attacks:
+ * not more than ratelimit_burst messages every ratelimit_jiffies.
*/
-int __printk_ratelimit(int ratelimit_jiffies, int ratelimit_burst)
+int __printk_ratelimit(int ratelimit_jiffies,
+ int ratelimit_burst,
+ struct printk_ratelimit_state *state)
{
static DEFINE_SPINLOCK(ratelimit_lock);
- static unsigned long toks = 10 * 5 * HZ;
- static unsigned long last_msg;
- static int missed;
unsigned long flags;
unsigned long now = jiffies;

spin_lock_irqsave(&ratelimit_lock, flags);
- toks += now - last_msg;
- last_msg = now;
- if (toks > (ratelimit_burst * ratelimit_jiffies))
- toks = ratelimit_burst * ratelimit_jiffies;
- if (toks >= ratelimit_jiffies) {
- int lost = missed;
-
- missed = 0;
- toks -= ratelimit_jiffies;
+ state->toks += now - state->last_jiffies;
+ /* Reset limiting if tunables changed */
+ if ((state->limit_jiffies != ratelimit_jiffies) ||
+ (state->limit_burst != ratelimit_burst)) {
+ state->toks = ratelimit_burst * ratelimit_jiffies;
+ state->limit_jiffies = ratelimit_jiffies;
+ state->limit_burst = ratelimit_burst;
+ }
+ state->last_jiffies = now;
+ if (state->toks > (ratelimit_burst * ratelimit_jiffies))
+ state->toks = ratelimit_burst * ratelimit_jiffies;
+ if (state->toks >= ratelimit_jiffies) {
+ int lost = state->missed;
+ state->missed = 0;
+ state->toks -= ratelimit_jiffies;
spin_unlock_irqrestore(&ratelimit_lock, flags);
- if (lost)
- printk(KERN_WARNING "printk: %d messages suppressed.\n", lost);
+ if (lost) {
+ pr_warning("%s ratelimit suppressed message count: %d\n",
+ state->facility, lost);
+ }
return 1;
}
- missed++;
+ state->missed++;
spin_unlock_irqrestore(&ratelimit_lock, flags);
return 0;
}
@@ -1280,8 +1286,17 @@ int printk_ratelimit_burst = 10;

int printk_ratelimit(void)
{
+ static struct printk_ratelimit_state limit_state = {
+ .toks = 10 * 5 * HZ,
+ .last_jiffies = 0,
+ .missed = 0,
+ .limit_jiffies = 5 * HZ,
+ .limit_burst = 10,
+ .facility = "printk"
+ };
+
return __printk_ratelimit(printk_ratelimit_jiffies,
- printk_ratelimit_burst);
+ printk_ratelimit_burst, &limit_state);
}
EXPORT_SYMBOL(printk_ratelimit);

diff -uprN linux-2.6.24/net/core/utils.c linux-2.6.24-printk_ratelimit/net/core/utils.c
--- linux-2.6.24/net/core/utils.c 2008-01-24 16:58:37.000000000 -0600
+++ linux-2.6.24-printk_ratelimit/net/core/utils.c 2008-02-21 11:03:44.644337698 -0600
@@ -41,7 +41,16 @@ EXPORT_SYMBOL(net_msg_warn);
*/
int net_ratelimit(void)
{
- return __printk_ratelimit(net_msg_cost, net_msg_burst);
+ static struct printk_ratelimit_state limit_state = {
+ .toks = 10 * 5 * HZ,
+ .last_jiffies = 0,
+ .missed = 0,
+ .limit_jiffies = 5 * HZ,
+ .limit_burst = 10,
+ .facility = "net"
+ };
+
+ return __printk_ratelimit(net_msg_cost, net_msg_burst, &limit_state);
}
EXPORT_SYMBOL(net_ratelimit);


2008-02-25 23:47:34

by Andrew Morton

Subject: Re: printk_ratelimit and net_ratelimit conflict and tunable behavior

On Mon, 25 Feb 2008 14:36:40 -0600 Steven Hawkes <[email protected]> wrote:

> From: Steve Hawkes <[email protected]>
>
> The printk_ratelimit() and net_ratelimit() functions each have their own
> tunable parameters to control their respective rate limiting feature, but
> they share common state variables, preventing independent tuning of the
> parameters from working correctly. Also, changes to rate limiting tunable
> parameters do not always take effect properly since state is not recomputed
> when changes occur. For example, if ratelimit_burst is increased while rate
> limiting is occurring, the change won't take full effect until at least
> enough time between messages occurs so that the toks value reaches
> ratelimit_burst * ratelimit_jiffies. This can result in messages being
> suppressed when they should be allowed.
>
> Implement independent state for printk_ratelimit() and net_ratelimit(), and
> update state when tunables are changed.
>

This patch causes a large and nasty reject.

> ---
> --- linux-2.6.24/include/linux/kernel.h 2008-01-24 16:58:37.000000000 -0600
> +++ linux-2.6.24-printk_ratelimit/include/linux/kernel.h 2008-02-21 11:20:41.751197312 -0600

Probably because you patched 2.6.24. We're developing 2.6.25 now, and the
difference between the two is very large indeed. Please raise patches
against Linus's latest tree?

There are other patches pending against printk.c (in -mm and in git-sched)
but afaict they won't collide.

> @@ -196,8 +196,19 @@ static inline int log_buf_copy(char *des
>
> unsigned long int_sqrt(unsigned long);
>
> +struct printk_ratelimit_state
> +{

Please do

struct printk_ratelimit_state {

> + unsigned long toks;
> + unsigned long last_jiffies;
> + int missed;
> + int limit_jiffies;
> + int limit_burst;
> + char const *facility;
> +};

I find that the best-value comments one can add to kernel code are to the
members of structures. If the reader understands what all the fields do, the
code becomes simple to follow.
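
As an illustration of the kind of member comments being asked for, here is a userspace copy of the struct from the patch. The comments are inferred from how __printk_ratelimit() uses each field in the posted diff, so they are an assumption rather than the author's wording, and ratelimit_capacity() is a hypothetical helper added only to make the relationship between the fields explicit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace copy of the struct proposed in the patch, with member comments
 * inferred from how __printk_ratelimit() uses each field (an assumption).
 */
struct printk_ratelimit_state {
	unsigned long toks;         /* tokens left in the bucket, in jiffies */
	unsigned long last_jiffies; /* jiffies value at the previous call */
	int missed;                 /* messages suppressed since the last allowed one */
	int limit_jiffies;          /* ratelimit_jiffies seen on the previous call */
	int limit_burst;            /* ratelimit_burst seen on the previous call */
	char const *facility;       /* name printed in the "suppressed" warning */
};

/* Hypothetical helper: bucket capacity implied by the cached tunables. */
static unsigned long ratelimit_capacity(const struct printk_ratelimit_state *s)
{
	assert(s != NULL);
	return (unsigned long)s->limit_jiffies * (unsigned long)s->limit_burst;
}
```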

> --- linux-2.6.24/net/core/utils.c 2008-01-24 16:58:37.000000000 -0600
> +++ linux-2.6.24-printk_ratelimit/net/core/utils.c 2008-02-21 11:03:44.644337698 -0600
> @@ -41,7 +41,16 @@ EXPORT_SYMBOL(net_msg_warn);
> */
> int net_ratelimit(void)
> {
> - return __printk_ratelimit(net_msg_cost, net_msg_burst);
> + static struct printk_ratelimit_state limit_state = {
> + .toks = 10 * 5 * HZ,
> + .last_jiffies = 0,
> + .missed = 0,
> + .limit_jiffies = 5 * HZ,
> + .limit_burst = 10,
> + .facility = "net"
> + };
> +
> + return __printk_ratelimit(net_msg_cost, net_msg_burst, &limit_state);

I don't get it. There's one instance of limit_state, kernel-wide, and
__printk_ratelimit() modifies it. What prevents one CPU's activities from
interfering with a second CPU's activities?

2008-02-26 00:05:18

by Joe Perches

Subject: RE: printk_ratelimit and net_ratelimit conflict and tunable behavior

On Mon, 2008-02-25 at 17:49 -0600, Hawkes Steve-FSH016 wrote:
> Are you saying the few lines of code to handle changes to the tunables
> aren't worth keeping?

Yes.

I think the tunables, if needed at all, should be set by modifying
the struct and the call might as well be:

bool __printk_ratelimit(struct printk_ratelimit_state *state)

Another quibble is not directed to your change because it's
preexisting but "tok" isn't a good name and may not even need
to be in the structure. It does save a multiply though.

I think that anything that attempts a printk is slow path
so it doesn't matter much though.
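
A rough userspace sketch of that call shape, with the limits stored next to the running state. Everything here is hypothetical naming (struct ratelimit_state, RATELIMIT_STATE_INIT, ratelimit_allow); the extra "now" argument exists only so the model can run without jiffies, and the spinlock the kernel version would need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model: limits and running state live in one struct. */
struct ratelimit_state {
	unsigned long interval; /* cost of one message, in ticks (tunable) */
	unsigned long burst;    /* messages allowed back to back (tunable) */
	unsigned long toks;     /* tokens currently available, in ticks */
	unsigned long last;     /* tick of the previous call */
	unsigned int missed;    /* messages suppressed since the last allowed one */
};

/* Start with a full bucket, as the static initializers in the patch do. */
#define RATELIMIT_STATE_INIT(interval_, burst_) \
	{ (interval_), (burst_), (interval_) * (burst_), 0, 0 }

/*
 * State-only form of the check. "now" is passed in only so the model can
 * run without jiffies; kernel code would read jiffies itself and take a
 * spinlock around the update.
 */
static bool ratelimit_allow(struct ratelimit_state *rs, unsigned long now)
{
	unsigned long cap;

	assert(rs != NULL);
	cap = rs->interval * rs->burst;
	rs->toks += now - rs->last;
	rs->last = now;
	if (rs->toks > cap)
		rs->toks = cap;
	if (rs->toks >= rs->interval) {
		rs->toks -= rs->interval;
		rs->missed = 0;
		return true;
	}
	rs->missed++;
	return false;
}

/* Each facility owns its state, so draining one leaves the other untouched. */
static struct ratelimit_state printk_rs = RATELIMIT_STATE_INIT(5, 10);
static struct ratelimit_state net_rs = RATELIMIT_STATE_INIT(5, 10);
```

In this model a tunable is changed by writing to the struct member, which is why the question of locking around such writes comes up.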

2008-02-28 16:10:57

by Steven Hawkes

Subject: Re: printk_ratelimit and net_ratelimit conflict and tunable behavior

Joe Perches wrote:

> On Mon, 2008-02-25 at 17:49 -0600, Hawkes Steve-FSH016 wrote:
> > Are you saying the few lines of code to handle changes to the tunables
> > aren't worth keeping?
>
> Yes.
>
> I think the tunables, if needed at all, should be set by modifying
> the struct and the call might as well be:
>
> bool __printk_ratelimit(struct printk_ratelimit_state *state)

The tunables are used in the current rate-limiting algorithm. Wouldn't
incorporating them into the structure require protecting modification of the
tunables by the same spinlock used in the rate limiting? That could be done
by pulling the spinlock variable out into printk_ratelimit() and
net_ratelimit() and into the struct (the spinlock is needed internal to
__printk_ratelimit to allow the spin_unlock() done right before actually
printing the message). That seems a bit more complex.

Or are you suggesting copying the tunables into the struct each time
__printk_ratelimit() is called? I was looking at them as not part of the
state of rate limiting, but rather external attributes controlling rate
limiting.

Joe Perches wrote:

> Another quibble is not directed to your change because it's
> preexisting but "tok" isn't a good name and may not even need
> to be in the structure. It does save a multiply though.

I agree the original name can be improved upon.

The toks state variable is needed because it actually maintains the current
rate-limiting water level, so to speak. The "bucket" is initially filled to
its capacity, ratelimit_jiffies * ratelimit_burst. Each time
__printk_ratelimit is called, water gets added to the bucket in proportion
to the time since the last call (capped by the capacity of the
bucket). Prints are allowed as long as the bucket has at least
ratelimit_jiffies of water left. Each allowed print sucks a
ratelimit_jiffies amount out of the bucket. (At least I think that's the way
the current kernel works; it wasn't immediately obvious to me.)
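
The bucket description above can be modelled in a few lines of userspace C. This is a sketch, not the kernel code: struct bucket and bucket_allow() are hypothetical names, "now" stands in for jiffies, cost and burst stand in for ratelimit_jiffies and ratelimit_burst, and locking and the "messages suppressed" report are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Running state only; the tunables stay outside, as in the existing code. */
struct bucket {
	unsigned long toks; /* current water level, in ticks */
	unsigned long last; /* tick of the previous call */
};

/*
 * Model of the rate-limit check. "now" stands in for jiffies; cost and burst
 * stand in for ratelimit_jiffies and ratelimit_burst.
 */
static bool bucket_allow(struct bucket *b, unsigned long now,
			 unsigned long cost, unsigned long burst)
{
	unsigned long capacity = cost * burst;

	assert(b != NULL);
	b->toks += now - b->last;   /* water added in proportion to elapsed time */
	b->last = now;
	if (b->toks > capacity)     /* never above the bucket's capacity */
		b->toks = capacity;
	if (b->toks < cost)         /* not enough water left: suppress */
		return false;
	b->toks -= cost;            /* each allowed print drains one cost unit */
	return true;
}
```

With cost 5 and burst 3, a bucket that starts full at 15 allows three calls at tick 0, refuses a fourth, still refuses at tick 4 (level 4), and allows one more at tick 5.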

2008-02-28 16:19:20

by Hawkes Steve-FSH016

Subject: RE: printk_ratelimit and net_ratelimit conflict and tunable behavior

Andrew Morton wrote:

> This patch causes a large and nasty reject.
> Probably because you patched 2.6.24. We're developing 2.6.25 now, and
> the difference between the two is very large indeed. Please raise patches
> against Linus's latest tree?

Will do. I'm learning the process. I assume Linus's latest tree is the one
listed as the latest prepatch for the stable Linux kernel tree.

Andrew Morton wrote:

> > struct printk_ratelimit_state {
> > + unsigned long toks;
> > + unsigned long last_jiffies;
> > + int missed;
> > + int limit_jiffies;
> > + int limit_burst;
> > + char const *facility;
> > +};
>
> I find that the best-value comments one can add to kernel code are to the
> members of structures. If the reader understands what all the fields do, the
> code becomes simple to follow.

Agreed. Although the current kernel source doesn't document these
attributes, there's no reason I couldn't add documentation for them.

Andrew Morton wrote:

> > int net_ratelimit(void)
> > {
> > - return __printk_ratelimit(net_msg_cost, net_msg_burst);
> > + static struct printk_ratelimit_state limit_state = {
> > + .toks = 10 * 5 * HZ,
> > + .last_jiffies = 0,
> > + .missed = 0,
> > + .limit_jiffies = 5 * HZ,
> > + .limit_burst = 10,
> > + .facility = "net"
> > + };
> > +
> > + return __printk_ratelimit(net_msg_cost, net_msg_burst, &limit_state);
>
> I don't get it. There's one instance of limit_state, kernel-wide, and
> __printk_ratelimit() modifies it. What prevents one CPU's activities from
> interfering with a second CPU's activities?

The state is protected by the spinlock in __printk_ratelimit, like it is in
the current kernel. Am I missing something?

2008-02-28 18:40:47

by Andrew Morton

Subject: Re: printk_ratelimit and net_ratelimit conflict and tunable behavior

On Thu, 28 Feb 2008 10:19:02 -0600 "Hawkes Steve-FSH016" <[email protected]> wrote:

> Andrew Morton wrote:
>
> > This patch causes a large and nasty reject.
> > Probably because you patched 2.6.24. We're developing 2.6.25 now, and
> > the difference between the two is very large indeed. Please raise patches
> > against Linus's latest tree?
>
> Will do. I'm learning the process. I assume Linus's latest tree is the one
> listed as the latest prepatch for the stable Linux kernel tree.

No, the stable tree is 2.6.24. You'll want
ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots/

> > > int net_ratelimit(void)
> > > {
> > > - return __printk_ratelimit(net_msg_cost, net_msg_burst);
> > > + static struct printk_ratelimit_state limit_state = {
> > > + .toks = 10 * 5 * HZ,
> > > + .last_jiffies = 0,
> > > + .missed = 0,
> > > + .limit_jiffies = 5 * HZ,
> > > + .limit_burst = 10,
> > > + .facility = "net"
> > > + };
> > > +
> > > + return __printk_ratelimit(net_msg_cost, net_msg_burst, &limit_state);
> >
> > I don't get it. There's one instance of limit_state, kernel-wide, and
> > __printk_ratelimit() modifies it. What prevents one CPU's activities from
> > interfering with a second CPU's activities?
>
> The state is protected by the spinlock in __printk_ratelimit, like it is in
> the current kernel. Am I missing something?

ah, OK.

I've occasionally wondered if ratelimiting should be per-callsite rather
than kernel-wide, but I'm not aware of the present setup causing anyone any
problems.
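
For illustration only, per-callsite limiting could be modelled in userspace C by letting a macro declare a static state at each point of use. Nothing below comes from the patch: struct site_state, site_allow(), RATELIMITED_DO and the SITE_COST/SITE_BURST values are all hypothetical, "now" stands in for jiffies, and no locking is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct site_state {
	unsigned long toks; /* tokens available, in ticks */
	unsigned long last; /* tick of the previous call */
};

#define SITE_COST  5UL /* hypothetical: ticks charged per message */
#define SITE_BURST 2UL /* hypothetical: messages allowed back to back */

static bool site_allow(struct site_state *s, unsigned long now)
{
	assert(s != NULL);
	s->toks += now - s->last;
	s->last = now;
	if (s->toks > SITE_COST * SITE_BURST)
		s->toks = SITE_COST * SITE_BURST;
	if (s->toks < SITE_COST)
		return false;
	s->toks -= SITE_COST;
	return true;
}

/*
 * Each expansion declares its own static state, so every call site gets an
 * independent budget. No locking here; kernel code would need some.
 */
#define RATELIMITED_DO(now, action)					\
	do {								\
		static struct site_state site_ = {			\
			SITE_COST * SITE_BURST, 0			\
		};							\
		if (site_allow(&site_, (now))) {			\
			action;						\
		}							\
	} while (0)
```

A noisy call site then exhausts only its own budget; a second RATELIMITED_DO elsewhere in the code starts with a full bucket regardless of what the first one has consumed.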