2008-12-06 06:59:39

by Nick Andrew

[permalink] [raw]
Subject: [RFC] Recursive printk

REQUEST FOR COMMENTS ...

Here is a set of two patches + a demo patch to enable recursive processing
of a printf format string in vsnprintf(), which is used by printk().

Specifically, when called like this:

printk(KERN_INFO "He said: %v, and I said: %s\n", fmt, args, "no");

the format string 'fmt' will be processed using the supplied args ('args'
is a va_list) and the resulting string will be 1 kernel message.

This is useful in functions which sit between kernel code and printk and
prepend or append to a kernel message. For example, from sound/core/misc.c

void snd_verbose_printk(const char *file, int line, const char *format, ...)
{
va_list args;

if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
char tmp[] = "<0>";
tmp[1] = format[1];
printk("%sALSA %s:%d: ", tmp, file, line);
format += 3;
} else {
printk("ALSA %s:%d: ", file, line);
}
va_start(args, format);
vprintk(format, args);
va_end(args);
}

Making 2 calls to printk for a single output line is undesirable
because the output is only guaranteed to be contiguous within a
single call to printk. Other code works around this by pre-processing
the format string into its own buffer, like drivers/cpufreq/cpufreq.c

void cpufreq_debug_printk(unsigned int type, const char *prefix,
const char *fmt, ...)
{
char s[256];
va_list args;
unsigned int len;
unsigned long flags;

WARN_ON(!prefix);
if (type & debug) {
spin_lock_irqsave(&disable_ratelimit_lock, flags);
if (!disable_ratelimit && debug_ratelimit
&& !printk_ratelimit()) {
spin_unlock_irqrestore(&disable_ratelimit_lock, flags);
return;
}
spin_unlock_irqrestore(&disable_ratelimit_lock, flags);

len = snprintf(s, 256, KERN_DEBUG "%s: ", prefix);

va_start(args, fmt);
len += vsnprintf(&s[len], (256 - len), fmt, args);
va_end(args);

printk(s);

WARN_ON(len < 5);
}
}

(using 256 bytes of stack) and drivers/scsi/arm/fas216.c

fas216_do_log(FAS216_Info *info, char target, char *fmt, va_list ap)
{
static char buf[1024];

vsnprintf(buf, sizeof(buf), fmt, ap);
printk("scsi%d.%c: %s", info->host->host_no, target, buf);
}

(using 1024 bytes of stack) and then issuing a single call to printk at
the end. On 4th December Matt Mackall posted a patch to the list which
essentially changed functions like the latter two into a type which
calls printk (or vprintk) 2 or 3 times.

I propose doing away with 2-and-3-printk code by enabling printk's
formatting function (vsnprintf) to process format strings with a '%'
directive (%v at the moment). When processing a format string, if %v
is seen then two arguments will be retrieved: the sub-format and
va_list args. These are used to recurse back into the processing
function, process that format and arguments, and return to the
original format string and continue to process it and its arguments.
With recursion all the formatting is done into the same buffer
(i.e. 1024 byte printk_buf in kernel/printk.c) saving stack space
and eliminating double-handling of the formatted strings.

I'm looking for feedback on this method.

The first patch prepares vsnprintf() by splitting it into two functions;
the inner function is suitable for recursion. The inner function needs
to know buf (start of buffer), str (current position within buffer)
and end (first character beyond end of buffer) at all times. 'buf'
is there because it has to be able to process a %n directive, which
requires calculation of the total number of characters (potentially)
written. The function also needs to be able to recurse into formats
which can't be expanded because they are beyond the end of the buffer,
because it needs to return the total number of characters which 'would
be' written, even if it can't write them.

The second patch adds the recursion on '%v', it's very simple.

The third patch is a sample net_printk() function which shows the
recursion working on a handful of messages from net/socket.c ...

void net_printk(char *level, char *fmt, ...) {
va_list ap;
va_start(ap, fmt);

printk("%sNET: %v\n", level, fmt, ap);
va_end(ap);
}

The only problem I've come up against so far is that actually using
'%v' causes gcc to warn thusly:

net/socket.c: In function ‘net_printk’:
net/socket.c:183: warning: unknown conversion type character ‘v’ in format
net/socket.c:183: warning: too many arguments for format

I haven't been able to find any way to tell gcc that '%v' is actually
alright and the alternative is to turn off printf format/argument
checking which is also not desirable.

To keep gcc quiet, the problem can be hacked around with another extension
to '%p', however 2 arguments must be taken so the specifier would look
like one of these:

%pV%s ... printk("%s: %pV%s\n", "NET", args, fmt);

%s%pV ... printk("%s: %s%pV\n", "NET", fmt, args);

The former one is easier to parse because it won't upset %s parsing
but the arguments look bad. I prefer to see fmt first.

Comments please,
Nick.

lib/vsprintf.c | 114 +++++++++++++++++++++++++++++++++++++++-----------------
net/socket.c | 27 ++++++++++---
2 files changed, 99 insertions(+), 42 deletions(-)

Nick Andrew (3):
Sample refactor of socket.c to use recursive printk
Add %v support to vsnprintf()
Split the vsnprintf function into two parts


2008-12-06 06:59:56

by Nick Andrew

[permalink] [raw]
Subject: [RFC PATCH v1 1/3] Split the vsnprintf function into two parts

Split the vsnprintf function into two parts

Two functions are required for recursion because we want the
recursive part to be able to run (and calcalate the output
size) even beyond its buffer size.

Signed-off-by: Nick Andrew <[email protected]>
---

lib/vsprintf.c | 100 ++++++++++++++++++++++++++++++++++++--------------------
1 files changed, 65 insertions(+), 35 deletions(-)


diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index a013bbc..1e2dcaf 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -616,10 +616,15 @@ static char *pointer(const char *fmt, char *buf, char *end, void *ptr, int field
return number(buf, end, (unsigned long) ptr, 16, field_width, precision, flags);
}

-/**
- * vsnprintf - Format a string and place it in a buffer
- * @buf: The buffer to place the result into
- * @size: The size of the buffer, including the trailing null space
+/*
+ * __vsnprintfr: Recursive implementation of vsnprintf.
+ *
+ * See the description for vsnprintf, which calls __vsnprintfr.
+ * __vsnprintfr can call itself recursively.
+ *
+ * @buf: The start address of the output buffer
+ * @str: The current write position (may extend beyond the buffer)
+ * @end: One address past the end of the buffer
* @fmt: The format string to use
* @args: Arguments for the format string
*
@@ -628,22 +633,14 @@ static char *pointer(const char *fmt, char *buf, char *end, void *ptr, int field
* %pF output the name of a function pointer
* %pR output the address range in a struct resource
*
- * The return value is the number of characters which would
- * be generated for the given input, excluding the trailing
- * '\0', as per ISO C99. If you want to have the exact
- * number of characters written into @buf as return value
- * (not including the trailing '\0'), use vscnprintf(). If the
- * return is greater than or equal to @size, the resulting
- * string is truncated.
- *
- * Call this function if you are already dealing with a va_list.
- * You probably want snprintf() instead.
*/
-int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
+
+static char *__vsnprintfr(char *buf, char *str, char *end,
+ const char *fmt, va_list args)
{
unsigned long long num;
int base;
- char *str, *end, c;
+ char c;

int flags; /* flags to number() */

@@ -655,25 +652,6 @@ int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
/* 'z' changed to 'Z' --davidm 1/25/99 */
/* 't' added for ptrdiff_t */

- /* Reject out-of-range values early. Large positive sizes are
- used for unknown buffer sizes. */
- if (unlikely((int) size < 0)) {
- /* There can be only one.. */
- static char warn = 1;
- WARN_ON(warn);
- warn = 0;
- return 0;
- }
-
- str = buf;
- end = buf + size;
-
- /* Make sure end is always >= buf */
- if (end < buf) {
- end = ((void *)-1);
- size = end - buf;
- }
-
for (; *fmt ; ++fmt) {
if (*fmt != '%') {
if (str < end)
@@ -844,12 +822,64 @@ int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
str = number(str, end, num, base,
field_width, precision, flags);
}
+
+ return str;
+}
+
+/**
+ * vsnprintf - Format a string and place it in a buffer
+ * @buf: The buffer to place the result into
+ * @size: The size of the buffer, including the trailing null space
+ * @fmt: The format string to use
+ * @args: Arguments for the format string
+ *
+ * Also see __vsnprintfr above.
+ *
+ * The return value is the number of characters which would
+ * be generated for the given input, excluding the trailing
+ * '\0', as per ISO C99. If you want to have the exact
+ * number of characters written into @buf as return value
+ * (not including the trailing '\0'), use vscnprintf(). If the
+ * return is greater than or equal to @size, the resulting
+ * string is truncated.
+ *
+ * Call this function if you are already dealing with a va_list.
+ * You probably want snprintf() instead.
+ */
+
+int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
+{
+ char *str, *end;
+
+ /* Reject out-of-range values early. Large positive sizes are
+ * used for unknown buffer sizes.
+ */
+ if (unlikely((int) size < 0)) {
+ /* There can be only one.. */
+ static char warn = 1;
+ WARN_ON(warn);
+ warn = 0;
+ return 0;
+ }
+
+ end = buf + size;
+
+ /* Make sure end is always >= buf */
+ if (end < buf) {
+ end = ((void *)-1);
+ size = end - buf;
+ }
+
+ /* format the args, writing them between 'buf' and 'end' */
+ str = __vsnprintfr(buf, buf, end, fmt, args);
+
if (size > 0) {
if (str < end)
*str = '\0';
else
end[-1] = '\0';
}
+
/* the trailing null byte doesn't count towards the total */
return str-buf;
}

2008-12-06 07:00:34

by Nick Andrew

[permalink] [raw]
Subject: [RFC PATCH v1 2/3] Add %v support to vsnprintf()

Add %v support to vsnprintf()

vsnprintf() is extended to support a '%v' in the format string. '%v'
takes two arguments, the first is a format string and the second is a
va_list. vsnprintf() is recursively called to process the specified
format string and arguments, before returning to the original format
string.

This allows functions like the following, which prepend a
subsystem-specific string to an error message:

int subsystem_error(char *fmt, ...) {
va_list ap;
int rc;

va_start(ap, fmt);
rc = printk("%s: %v\n", "SUBSYSTEM", fmt, ap);
va_end(ap);
return rc;
}

Functions printing errors inside the subsystem would do like:

subsystem_error("Failed to get widget: %d", 3);

The resulting kernel message would be:

SUBSYSTEM: Failed to get widget: 3\n

The advantage over using say 2 printk() calls is that message output
will be contiguous, whereas separate printk() calls might be broken
by an intervening message from another kernel thread. Also a single
message buffer is used, which avoids double-handling and excessive
memory use.

Signed-off-by: Nick Andrew <[email protected]>
---

lib/vsprintf.c | 14 ++++++++++++++
1 files changed, 14 insertions(+), 0 deletions(-)


diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 1e2dcaf..e779f79 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -632,6 +632,7 @@ static char *pointer(const char *fmt, char *buf, char *end, void *ptr, int field
* %pS output the name of a text symbol
* %pF output the name of a function pointer
* %pR output the address range in a struct resource
+ * %v processes a supplied format string and args, recursively
*
*/

@@ -787,6 +788,19 @@ static char *__vsnprintfr(char *buf, char *str, char *end,
case 'u':
break;

+ case 'v': {
+ char *fmt2;
+ va_list args2;
+
+ fmt2 = va_arg(args, char *);
+ args2 = va_arg(args, va_list);
+ /* recurse to format the (fmt,args) pair */
+ str = __vsnprintfr(buf, str, end, fmt2, args2);
+ va_end(args2);
+
+ continue;
+ }
+
default:
if (str < end)
*str = '%';

2008-12-06 07:00:51

by Nick Andrew

[permalink] [raw]
Subject: [RFC PATCH v1 3/3] Sample refactor of socket.c to use recursive printk

Sample refactor of socket.c to use recursive printk

Add a new function net_printk which takes a level argument
(kernel message level), format string and varargs and
calls printk with this format string and arguments:

printk("%sNET: %v\n", level, fmt, args);

This is a proof of concept to show that recursive printk
works and is useful.

Signed-off-by: Nick Andrew <[email protected]>
---

net/socket.c | 27 ++++++++++++++++++++-------
1 files changed, 20 insertions(+), 7 deletions(-)


diff --git a/net/socket.c b/net/socket.c
index 92764d8..d2e0d67 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -171,6 +171,19 @@ static DEFINE_PER_CPU(int, sockets_in_use) = 0;
:unix_mkname()).
*/

+/* printk for net subsystem.
+ * Prefixes every message with NET:
+ *
+ */
+
+void net_printk(char *level, char *fmt, ...) {
+ va_list ap;
+ va_start(ap, fmt);
+
+ printk("%sNET: %v\n", level, fmt, ap);
+ va_end(ap);
+}
+
/**
* move_addr_to_kernel - copy a socket address into kernel space
* @uaddr: Address in user space
@@ -535,7 +548,7 @@ void sock_release(struct socket *sock)
}

if (sock->fasync_list)
- printk(KERN_ERR "sock_release: fasync list not empty!\n");
+ net_printk(KERN_ERR, "sock_release: fasync list not empty!");

get_cpu_var(sockets_in_use)--;
put_cpu_var(sockets_in_use);
@@ -987,7 +1000,7 @@ static int sock_close(struct inode *inode, struct file *filp)
*/

if (!inode) {
- printk(KERN_DEBUG "sock_close: NULL inode\n");
+ net_printk(KERN_DEBUG, "sock_close: NULL inode");
return 0;
}
sock_release(SOCKET_I(inode));
@@ -1116,7 +1129,7 @@ static int __sock_create(struct net *net, int family, int type, int protocol,
static int warned;
if (!warned) {
warned = 1;
- printk(KERN_INFO "%s uses obsolete (PF_INET,SOCK_PACKET)\n",
+ net_printk(KERN_INFO, "%s uses obsolete (PF_INET,SOCK_PACKET)",
current->comm);
}
family = PF_PACKET;
@@ -1134,7 +1147,7 @@ static int __sock_create(struct net *net, int family, int type, int protocol,
sock = sock_alloc();
if (!sock) {
if (net_ratelimit())
- printk(KERN_WARNING "socket: no more sockets\n");
+ net_printk(KERN_WARNING, "socket: no more sockets");
return -ENFILE; /* Not exactly a match, but its the
closest posix thing */
}
@@ -2162,7 +2175,7 @@ int sock_register(const struct net_proto_family *ops)
int err;

if (ops->family >= NPROTO) {
- printk(KERN_CRIT "protocol %d >= NPROTO(%d)\n", ops->family,
+ net_printk(KERN_CRIT, "protocol %d >= NPROTO(%d)", ops->family,
NPROTO);
return -ENOBUFS;
}
@@ -2176,7 +2189,7 @@ int sock_register(const struct net_proto_family *ops)
}
spin_unlock(&net_family_lock);

- printk(KERN_INFO "NET: Registered protocol family %d\n", ops->family);
+ net_printk(KERN_INFO, "Registered protocol family %d", ops->family);
return err;
}

@@ -2203,7 +2216,7 @@ void sock_unregister(int family)

synchronize_rcu();

- printk(KERN_INFO "NET: Unregistered protocol family %d\n", family);
+ net_printk(KERN_INFO, "Unregistered protocol family %d", family);
}

static int __init sock_init(void)

2008-12-06 07:03:19

by David Miller

[permalink] [raw]
Subject: Re: [RFC PATCH v1 3/3] Sample refactor of socket.c to use recursive printk

From: Nick Andrew <[email protected]>
Date: Sat, 06 Dec 2008 18:00:23 +1100

> Sample refactor of socket.c to use recursive printk
>
> Add a new function net_printk which takes a level argument
> (kernel message level), format string and varargs and
> calls printk with this format string and arguments:
>
> printk("%sNET: %v\n", level, fmt, args);
>
> This is a proof of concept to show that recursive printk
> works and is useful.
>
> Signed-off-by: Nick Andrew <[email protected]>

Looks not that useful to me, to be honest.

2008-12-06 07:18:52

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [RFC PATCH v1 3/3] Sample refactor of socket.c to use recursive printk

On Sat, 06 Dec 2008 18:00:23 +1100, Nick Andrew said:
> Sample refactor of socket.c to use recursive printk

I think we'd all appreciate it a lot if, instead of pointless code churn
in socket.c, you used it to fix an *actual* problem - like any of the
many places where we have issues with interleaved output in dmesg because
a function had to use 2 or 3 printk calls to build up a line.

In your 0/3 you point at sound/core/misc.c and drivers/cpufreq/cpufreq.c,
both of those would be better than socket.c


Attachments:
(No filename) (226.00 B)

2008-12-06 07:20:41

by Andrew Morton

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Sat, 06 Dec 2008 17:59:22 +1100 Nick Andrew <[email protected]> wrote:

> REQUEST FOR COMMENTS ...
>
> Here is a set of two patches + a demo patch to enable recursive processing
> of a printf format string in vsnprintf(), which is used by printk().
>
> Specifically, when called like this:
>
> printk(KERN_INFO "He said: %v, and I said: %s\n", fmt, args, "no");

Yeah, there's no way to shut gcc up. You'll have to use %p.

> the format string 'fmt' will be processed using the supplied args ('args'
> is a va_list) and the resulting string will be 1 kernel message.
>
> This is useful in functions which sit between kernel code and printk and
> prepend or append to a kernel message. For example, from sound/core/misc.c
>
> void snd_verbose_printk(const char *file, int line, const char *format, ...)
> {
> va_list args;
>
> if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
> char tmp[] = "<0>";
> tmp[1] = format[1];
> printk("%sALSA %s:%d: ", tmp, file, line);
> format += 3;

That's racy. Two threads can fight over tmp[1]. It should do:

printk("<%c>ALSA %s:%d: ", format[1], tmp, file, line);

(I didn't know that you can even modify literal strings - shouldn't
they be in read-only storage?)

> } else {
> printk("ALSA %s:%d: ", file, line);
> }
> va_start(args, format);
> vprintk(format, args);
> va_end(args);
> }
>
> Making 2 calls to printk for a single output line is undesirable
> because the output is only guaranteed to be contiguous within a
> single call to printk. Other code works around this by pre-processing
> the format string into its own buffer, like drivers/cpufreq/cpufreq.c
>
> void cpufreq_debug_printk(unsigned int type, const char *prefix,
> const char *fmt, ...)
> {
> char s[256];
> va_list args;
> unsigned int len;
> unsigned long flags;
>
> WARN_ON(!prefix);
> if (type & debug) {
> spin_lock_irqsave(&disable_ratelimit_lock, flags);
> if (!disable_ratelimit && debug_ratelimit
> && !printk_ratelimit()) {
> spin_unlock_irqrestore(&disable_ratelimit_lock, flags);
> return;
> }
> spin_unlock_irqrestore(&disable_ratelimit_lock, flags);
>
> len = snprintf(s, 256, KERN_DEBUG "%s: ", prefix);
>
> va_start(args, fmt);
> len += vsnprintf(&s[len], (256 - len), fmt, args);
> va_end(args);
>
> printk(s);
>
> WARN_ON(len < 5);
> }
> }
>
> (using 256 bytes of stack) and drivers/scsi/arm/fas216.c
>
> fas216_do_log(FAS216_Info *info, char target, char *fmt, va_list ap)
> {
> static char buf[1024];
>
> vsnprintf(buf, sizeof(buf), fmt, ap);
> printk("scsi%d.%c: %s", info->host->host_no, target, buf);
> }
>
> (using 1024 bytes of stack)

Look closer. It uses no stack, but is racy.

> and then issuing a single call to printk at
> the end. On 4th December Matt Mackall posted a patch to the list which
> essentially changed functions like the latter two into a type which
> calls printk (or vprintk) 2 or 3 times.
>
> I propose doing away with 2-and-3-printk code by enabling printk's
> formatting function (vsnprintf) to process format strings with a '%'
> directive (%v at the moment). When processing a format string, if %v
> is seen then two arguments will be retrieved: the sub-format and
> va_list args. These are used to recurse back into the processing
> function, process that format and arguments, and return to the
> original format string and continue to process it and its arguments.
> With recursion all the formatting is done into the same buffer
> (i.e. 1024 byte printk_buf in kernel/printk.c) saving stack space
> and eliminating double-handling of the formatted strings.
>
> I'm looking for feedback on this method.
>
> The first patch prepares vsnprintf() by splitting it into two functions;
> the inner function is suitable for recursion. The inner function needs
> to know buf (start of buffer), str (current position within buffer)
> and end (first character beyond end of buffer) at all times. 'buf'
> is there because it has to be able to process a %n directive, which
> requires calculation of the total number of characters (potentially)
> written. The function also needs to be able to recurse into formats
> which can't be expanded because they are beyond the end of the buffer,
> because it needs to return the total number of characters which 'would
> be' written, even if it can't write them.
>
> The second patch adds the recursion on '%v', it's very simple.
>
> The third patch is a sample net_printk() function which shows the
> recursion working on a handful of messages from net/socket.c ...
>
> void net_printk(char *level, char *fmt, ...) {
> va_list ap;
> va_start(ap, fmt);
>
> printk("%sNET: %v\n", level, fmt, ap);
> va_end(ap);
> }
>
> The only problem I've come up against so far is that actually using
> '%v' causes gcc to warn thusly:
>
> net/socket.c: In function ___net_printk___:
> net/socket.c:183: warning: unknown conversion type character ___v___ in format
> net/socket.c:183: warning: too many arguments for format
>
> I haven't been able to find any way to tell gcc that '%v' is actually
> alright and the alternative is to turn off printf format/argument
> checking which is also not desirable.
>
> To keep gcc quiet, the problem can be hacked around with another extension
> to '%p', however 2 arguments must be taken so the specifier would look
> like one of these:
>
> %pV%s ... printk("%s: %pV%s\n", "NET", args, fmt);
>
> %s%pV ... printk("%s: %s%pV\n", "NET", fmt, args);
>
> The former one is easier to parse because it won't upset %s parsing
> but the arguments look bad. I prefer to see fmt first.
>
> Comments please,
> Nick.
>
> lib/vsprintf.c | 114 +++++++++++++++++++++++++++++++++++++++-----------------
> net/socket.c | 27 ++++++++++---

mutter.

Tejun had a thing a while ago which was kinda intended to solve the
same problem. iirc his approach added a lot more code (bad), but
didn't go and add strange new semantics to printk.

Is this really a big enough problem to justify the effort?

We have a lot of trouble educating developers about core kernel
infrastructure, particularly new stuff. Many, perhaps most developers
will still go off and cheerily do local hacks. These go into the tree,
because the people who merge that code _do_ know about the
infrastructure, but they just merge stuff without looking at it.

IOW, for this to be halfway as useful as you expect, we need a
look-out-for-local-printk-hacks maintainer.

2008-12-06 07:33:35

by Willy Tarreau

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Fri, Dec 05, 2008 at 11:20:16PM -0800, Andrew Morton wrote:
> > void snd_verbose_printk(const char *file, int line, const char *format, ...)
> > {
> > va_list args;
> >
> > if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
> > char tmp[] = "<0>";
> > tmp[1] = format[1];
> > printk("%sALSA %s:%d: ", tmp, file, line);
> > format += 3;
>
> That's racy. Two threads can fight over tmp[1]. It should do:
>
> printk("<%c>ALSA %s:%d: ", format[1], tmp, file, line);
>
> (I didn't know that you can even modify literal strings - shouldn't
> they be in read-only storage?)

no Andrew, this tmp[] is declared on the stack, and gcc emits code to
copy the constant "<0>" onto the stack every time this code is called
(which is basically just a 32-bit integer copy). So there's no race
either.

Regards,
Willy

2008-12-06 07:41:17

by Nick Andrew

[permalink] [raw]
Subject: Re: [RFC PATCH v1 3/3] Sample refactor of socket.c to use recursive printk

On Sat, Dec 06, 2008 at 02:18:34AM -0500, [email protected] wrote:
> On Sat, 06 Dec 2008 18:00:23 +1100, Nick Andrew said:
> > Sample refactor of socket.c to use recursive printk
>
> I think we'd all appreciate it a lot if, instead of pointless code churn
> in socket.c, you used it to fix an *actual* problem - like any of the
> many places where we have issues with interleaved output in dmesg because
> a function had to use 2 or 3 printk calls to build up a line.

Fair enough, socket.c was only how I tested if it works, I never intended
it to be committed anywhere.

> In your 0/3 you point at sound/core/misc.c and drivers/cpufreq/cpufreq.c,
> both of those would be better than socket.c

Ok. This is how those two files would change, after Matt Mackall's patch
is applied. In each case we end up with a single printk, no interleaved
output, the code's smaller, still readable.

Nick.
---

drivers/cpufreq/cpufreq.c | 3 +--
sound/core/misc.c | 14 +++++---------
2 files changed, 6 insertions(+), 11 deletions(-)


diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index bc8bebc..27b323f 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -234,8 +234,7 @@ void cpufreq_debug_printk(unsigned int type, const char *prefix,
spin_unlock_irqrestore(&disable_ratelimit_lock, flags);

va_start(args, fmt);
- printk(KERN_DEBUG "%s: ", prefix);
- vprintk(fmt, args);
+ printk(KERN_DEBUG "%s: %v", prefix, fmt, args);
va_end(args);
}
}
diff --git a/sound/core/misc.c b/sound/core/misc.c
index 38524f6..ba9655f 100644
--- a/sound/core/misc.c
+++ b/sound/core/misc.c
@@ -39,16 +39,14 @@ void snd_verbose_printk(const char *file, int line, const char *format, ...)
{
va_list args;

+ va_start(args, format);
if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
char tmp[] = "<0>";
tmp[1] = format[1];
- printk("%sALSA %s:%d: ", tmp, file, line);
- format += 3;
+ printk("%sALSA %s:%d: %v", tmp, file, line, format + 3, args);
} else {
- printk("ALSA %s:%d: ", file, line);
+ printk("ALSA %s:%d: %v", file, line, format, args);
}
- va_start(args, format);
- vprintk(format, args);
va_end(args);
}

@@ -60,16 +58,14 @@ void snd_verbose_printd(const char *file, int line, const char *format, ...)
{
va_list args;

+ va_start(args, format);
if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
char tmp[] = "<0>";
tmp[1] = format[1];
- printk("%sALSA %s:%d: ", tmp, file, line);
- format += 3;
+ printk("%sALSA %s:%d: %v", tmp, file, line, format + 3, args);
} else {
printk(KERN_DEBUG "ALSA %s:%d: ", file, line);
}
- va_start(args, format);
- vprintk(format, args);
va_end(args);

}

2008-12-06 07:42:31

by Andrew Morton

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Sat, 6 Dec 2008 08:33:05 +0100 Willy Tarreau <[email protected]> wrote:

> On Fri, Dec 05, 2008 at 11:20:16PM -0800, Andrew Morton wrote:
> > > void snd_verbose_printk(const char *file, int line, const char *format, ...)
> > > {
> > > va_list args;
> > >
> > > if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
> > > char tmp[] = "<0>";
> > > tmp[1] = format[1];
> > > printk("%sALSA %s:%d: ", tmp, file, line);
> > > format += 3;
> >
> > That's racy. Two threads can fight over tmp[1]. It should do:
> >
> > printk("<%c>ALSA %s:%d: ", format[1], tmp, file, line);
> >
> > (I didn't know that you can even modify literal strings - shouldn't
> > they be in read-only storage?)
>
> no Andrew, this tmp[] is declared on the stack, and gcc emits code to
> copy the constant "<0>" onto the stack every time this code is called

Good heavens, so it does. It is unusual for C to cater to such stupid
code by generating such inefficient code. A bit of a trap, really.

2008-12-06 07:42:54

by Joe Perches

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Fri, 2008-12-05 at 23:20 -0800, Andrew Morton wrote:
> IOW, for this to be halfway as useful as you expect, we need a
> look-out-for-local-printk-hacks maintainer.

Which is really just a reviewer unless
the nominal maintainer is overridden.

I think recursive printk is poor style and
should be avoided.

2008-12-06 08:16:42

by Willy Tarreau

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Fri, Dec 05, 2008 at 11:41:05PM -0800, Andrew Morton wrote:
> On Sat, 6 Dec 2008 08:33:05 +0100 Willy Tarreau <[email protected]> wrote:
>
> > On Fri, Dec 05, 2008 at 11:20:16PM -0800, Andrew Morton wrote:
> > > > void snd_verbose_printk(const char *file, int line, const char *format, ...)
> > > > {
> > > > va_list args;
> > > >
> > > > if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
> > > > char tmp[] = "<0>";
> > > > tmp[1] = format[1];
> > > > printk("%sALSA %s:%d: ", tmp, file, line);
> > > > format += 3;
> > >
> > > That's racy. Two threads can fight over tmp[1]. It should do:
> > >
> > > printk("<%c>ALSA %s:%d: ", format[1], tmp, file, line);
> > >
> > > (I didn't know that you can even modify literal strings - shouldn't
> > > they be in read-only storage?)
> >
> > no Andrew, this tmp[] is declared on the stack, and gcc emits code to
> > copy the constant "<0>" onto the stack every time this code is called
>
> Good heavens, so it does. It is unusual for C to cater to such stupid
> code by generating such inefficient code. A bit of a trap, really.

especially since teachers tend to explain students that char foo[] is the
same thing as char *foo, which obviously it is not.

Willy

2008-12-06 08:30:47

by Nick Andrew

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Fri, Dec 05, 2008 at 11:20:16PM -0800, Andrew Morton wrote:
> On Sat, 06 Dec 2008 17:59:22 +1100 Nick Andrew <[email protected]> wrote:
>
> > REQUEST FOR COMMENTS ...
> >
> > Here is a set of two patches + a demo patch to enable recursive processing
> > of a printf format string in vsnprintf(), which is used by printk().

> > (using 256 bytes of stack) and drivers/scsi/arm/fas216.c
> >
> > fas216_do_log(FAS216_Info *info, char target, char *fmt, va_list ap)
> > {
> > static char buf[1024];
> >
> > vsnprintf(buf, sizeof(buf), fmt, ap);
> > printk("scsi%d.%c: %s", info->host->host_no, target, buf);
> > }
> >
> > (using 1024 bytes of stack)
>
> Look closer. It uses no stack, but is racy.

Ok, static. And that's 1024 bytes you won't get back, unless you
apply Matt Mackall's patch which changes the function to this:

static void
fas216_do_log(FAS216_Info *info, char target, char *fmt, va_list ap)
{
printk("scsi%d.%c: ", info->host->host_no, target, buf);
vprintk(fmt, ap);
}

(looks like he made a mistake in that one)

which allows a race condition on the kernel log, and then you apply
my patch, to get this:


static void
fas216_do_log(FAS216_Info *info, char target, char *fmt, va_list ap)
{
printk("scsi%d.%c: %v", info->host->host_no, target, fmt, ap);
}


> Tejun had a thing a while ago which was kinda intended to solve the
> same problem. iirc his approach added a lot more code (bad), but
> didn't go and add strange new semantics to printk.

I'm not trying to solve a huge problem (e.g. as Jason Baron's dynamic
debug does), just a small problem. Why do tasks have to choose
between double-buffering their messages and making multiple calls
to printk?

I was really pleased to discover that the processing of multiple
format strings could be performed in the _same_ buffer, so
the cost for doing so reduces to a couple of stack vars.

Subsystems seem to like putting functions in front of printk()
to prepend or append output. Matt Mackall found 27 functions
which look to be doing just that. My patch makes those functions
cleaner, and it might be usable to give more subsystems a
message prefix defined in one place, rather than repeated in
each call to printk().

> Is this really a big enough problem to justify the effort?

Seeing kernel messages corrupted like this really annoys me:

checking if image is initramfs...<6>Switched to high resolution mode on CPU 0
it is

I don't know if logcheck is figuring out that that first line is
actually two independent kernel messages.

> We have a lot of trouble educating developers about core kernel
> infrastructure, particularly new stuff.

There's Documentation/printk-formats.txt at least. If any change
gets committed, I expect that file to be updated accordingly.

> Many, perhaps most developers
> will still go off and cheerily do local hacks. These go into the tree,
> because the people who merge that code _do_ know about the
> infrastructure, but they just merge stuff without looking at it.
>
> IOW, for this to be halfway as useful as you expect, we need a
> look-out-for-local-printk-hacks maintainer.

If my patch makes it in, it will be followed by fixes to all
those local hacks. We might discover some common element so they
can be made non-local. Once a better way to interface to printk
becomes prevalent, it will probably be reused. And if it's not,
then it affects only the one subsystem.

Nick.

2008-12-06 08:40:54

by Nick Andrew

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Fri, Dec 05, 2008 at 11:42:37PM -0800, Joe Perches wrote:
> I think recursive printk is poor style and
> should be avoided.

I think this is one of those cases where it's a lot harder (or uglier)
to solve a problem non-recursively than it is to solve recursively.

The basic problem is that subsystems want to prepend or append their
own data to kernel messages generated by subsystem code, and they
both want to use printf-type strings. How to combine them into a
single kernel log message without using extra memory, or double
handling of strings?

Nick.

2008-12-06 09:11:27

by Joe Perches

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Sat, 2008-12-06 at 19:40 +1100, Nick Andrew wrote:
> The basic problem is that subsystems want to prepend or append their
> own data to kernel messages generated by subsystem code, and they
> both want to use printf-type strings. How to combine them into a
> single kernel log message without using extra memory, or double
> handling of strings?

pr_fmt seems to work for that.

I think the question is how to output messages that use
multiple calls to printk so they are not interleaved.

Giving up printf argument verification seems too high a
price to pay for the benefit of recursion.

2008-12-06 09:44:09

by Takashi Iwai

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

At Fri, 5 Dec 2008 23:41:05 -0800,
Andrew Morton wrote:
>
> On Sat, 6 Dec 2008 08:33:05 +0100 Willy Tarreau <[email protected]> wrote:
>
> > On Fri, Dec 05, 2008 at 11:20:16PM -0800, Andrew Morton wrote:
> > > > void snd_verbose_printk(const char *file, int line, const char *format, ...)
> > > > {
> > > > va_list args;
> > > >
> > > > if (format[0] == '<' && format[1] >= '0' && format[1] <= '7' && format[2] == '>') {
> > > > char tmp[] = "<0>";
> > > > tmp[1] = format[1];
> > > > printk("%sALSA %s:%d: ", tmp, file, line);
> > > > format += 3;
> > >
> > > That's racy. Two threads can fight over tmp[1]. It should do:
> > >
> > > printk("<%c>ALSA %s:%d: ", format[1], tmp, file, line);
> > >
> > > (I didn't know that you can even modify literal strings - shouldn't
> > > they be in read-only storage?)
> >
> > no Andrew, this tmp[] is declared on the stack, and gcc emits code to
> > copy the constant "<0>" onto the stack every time this code is called
>
> Good heavens, so it does. It is unusual for C to cater to such stupid
> code by generating such inefficient code. A bit of a trap, really.

Indeed the tmp string isn't needed to be there.
Simply
printk("<%c>ALSA %s:%d: ", format[1], file, line);
should suffice. Though, the efficiency doesn't matter so much in this
case...


Takashi

2008-12-06 17:27:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH v1 2/3] Add %v support to vsnprintf()



On Sat, 6 Dec 2008, Nick Andrew wrote:
>
> Add %v support to vsnprintf()

No can do. gcc will always warn, and I don't want to see an extension that
will generate unfixable warnings.

If you want to do this, you'd need to do it the same way we did some othe
printf extensions, ie by not passing the valist entry itself, but
passing the _pointer_ to it, and using "%pV" as the format modifier.

(Quite frankly, "va_list" can be rather large, so passing a pointer to it
is probably a better idea anyway).

Of course, gcc may implement "va_list" as an array (or just a pointer)
internally, and then it will be passed as a pointer anyway (and you'd not
need the address-of), but that's certainly not in any way guaranteed, and
is very possibly architecture-dependent.

Linus

2008-12-06 19:35:56

by Al Viro

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Sat, Dec 06, 2008 at 07:40:38PM +1100, Nick Andrew wrote:

> The basic problem is that subsystems want to prepend or append their
> own data to kernel messages generated by subsystem code, and they
> both want to use printf-type strings. How to combine them into a
> single kernel log message without using extra memory, or double
> handling of strings?

The basic problem with your approach is that printk interpretation of
arguments becomes dependent on more than its first (and hopefully
an explicit constant) argument.

2008-12-06 23:16:40

by Nick Andrew

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

On Sat, Dec 06, 2008 at 01:11:33AM -0800, Joe Perches wrote:
> I think the question is how to output messages that use
> multiple calls to printk so they are not interleaved.

This is certainly needed e.g. for code which outputs line fragments
and expects full lines in the log, such as hex dumps. But its arguable
that that code should use its own buffer to format lines, then send
only complete lines to printk.

> Giving up printf argument verification seems too high a
> price to pay for the benefit of recursion.

I agree completely. I was hoping somebody could tell me how to include
'%v' in gcc's acceptable list, but it seems impossible. The alternative
then is to use '%pV%s' which gcc _should_ like (I haven't tested it,
but it seems reasonable) and reverse the order of the fmt and args
in the function call arguments.

Nick.

2008-12-06 23:25:39

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC] Recursive printk



On Sun, 7 Dec 2008, Nick Andrew wrote:
>
> I agree completely. I was hoping somebody could tell me how to include
> '%v' in gcc's acceptable list, but it seems impossible. The alternative
> then is to use '%pV%s' which gcc _should_ like (I haven't tested it,
> but it seems reasonable) and reverse the order of the fmt and args
> in the function call arguments.

Don't do %pV%s, do just %pV, and pass it a pointer to a struct, something
like

struct va_format {
const char *fmt;
va_list va;
};

and make people who use this just always pass things around like that.

(Maybe use "va_list *va" to add another level of indirection if necessary)

Linus

2008-12-08 01:43:01

by Tejun Heo

[permalink] [raw]
Subject: Re: [RFC] Recursive printk

Hello,

Nick Andrew wrote:
>> Tejun had a thing a while ago which was kinda intended to solve the
>> same problem. iirc his approach added a lot more code (bad), but
>> didn't go and add strange new semantics to printk.
>
> I'm not trying to solve a huge problem (e.g. as Jason Baron's dynamic
> debug does), just a small problem. Why do tasks have to choose
> between double-buffering their messages and making multiple calls
> to printk?

Yeah, my patch happened to solve about the same problem (and mprintk
proper was ~370 lines, so it wasn't too bad). Mine went the way of
assembling messages piece-by-piece and most of the complexities came
from buffer management and fallback (e.g. even if kmalloc fails to
allocate messages, messages still get printed out, albeit a bit
uglier).

>> IOW, for this to be halfway as useful as you expect, we need a
>> look-out-for-local-printk-hacks maintainer.
>
> If my patch makes it in, it will be followed by fixes to all
> those local hacks. We might discover some common element so they
> can be made non-local. Once a better way to interface to printk
> becomes prevalent, it will probably be reused. And if it's not,
> then it affects only the one subsystem.

No matter which mechanism, I'd love to have something to solve this in
kernel. I face this problem quite often and each and every time I
have to come up with some custom hack to work around it.

FWIW, this was the last take of the mprintk patchset.

http://thread.gmane.org/gmane.linux.ide/28415

Thanks.

--
tejun