2019-05-09 12:20:59

by Petr Mladek

[permalink] [raw]
Subject: [PATCH] vsprintf: Do not break early boot with probing addresses

The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
invalid pointers") broke boot on several architectures. The common
pattern is that probe_kernel_read() is not working during early
boot because userspace access framework is not ready.

The check is only the best effort. Let's not rush with it during
the early boot.

Details:

1. Report on Power:

Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG

The problem is the combination of some new code called via printk(),
check_pointer() which calls probe_kernel_read(). That then calls
allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
(before we've patched features). With the JUMP_LABEL debug enabled that
causes us to call printk() & dump_stack() and we end up recursing and
overflowing the stack.

Because it happens so early you don't get any output, just an apparently
dead system.

The stack trace (which you don't see) is something like:

...
dump_stack+0xdc
probe_kernel_read+0x1a4
check_pointer+0x58
string+0x3c
vsnprintf+0x1bc
vscnprintf+0x20
printk_safe_log_store+0x7c
printk+0x40
dump_stack_print_info+0xbc
dump_stack+0x8
probe_kernel_read+0x1a4
probe_kernel_read+0x19c
check_pointer+0x58
string+0x3c
vsnprintf+0x1bc
vscnprintf+0x20
vprintk_store+0x6c
vprintk_emit+0xec
vprintk_func+0xd4
printk+0x40
cpufeatures_process_feature+0xc8
scan_cpufeatures_subnodes+0x380
of_scan_flat_dt_subnodes+0xb4
dt_cpu_ftrs_scan_callback+0x158
of_scan_flat_dt+0xf0
dt_cpu_ftrs_scan+0x3c
early_init_devtree+0x360
early_setup+0x9c

2. Report on s390:

vsnprintf invocations, are broken on s390. For example, the early boot
output now looks like this where the first (efault) should be
the linux_banner:

[ 0.099985] (efault)
[ 0.099985] setup: Linux is running as a z/VM guest operating system in 64-bit mode
[ 0.100066] setup: The maximum memory size is 8192MB
[ 0.100070] cma: Reserved 4 MiB at (efault)
[ 0.100100] numa: NUMA mode: (efault)

The reason for this, is that the code assumes that
probe_kernel_address() works very early. This however is not true on
at least s390. Uaccess on KERNEL_DS works only after page tables have
been setup on s390, which happens with setup_arch()->paging_init().

Any probe_kernel_address() invocation before that will return -EFAULT.

Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid pointers")
Signed-off-by: Petr Mladek <[email protected]>
---
lib/vsprintf.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 7b0a6140bfad..8b43a883be6b 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -640,8 +640,13 @@ static const char *check_pointer_msg(const void *ptr)
if (!ptr)
return "(null)";

- if (probe_kernel_address(ptr, byte))
- return "(efault)";
+ /* User space address handling is not ready during early boot. */
+ if (system_state <= SYSTEM_BOOTING) {
+ if ((unsigned long)ptr < PAGE_SIZE)
+ return "(efault)";
+ } else {
+ if (probe_kernel_address(ptr, byte))
+ return "(efault)";

return NULL;
}
--
2.16.4


2019-05-09 13:11:59

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Thu, May 09, 2019 at 02:19:23PM +0200, Petr Mladek wrote:
> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
>
> The check is only the best effort. Let's not rush with it during
> the early boot.
>
> Details:
>
> 1. Report on Power:
>
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
>
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.
>
> Because it happens so early you don't get any output, just an apparently
> dead system.
>
> The stack trace (which you don't see) is something like:
>
> ...
> dump_stack+0xdc
> probe_kernel_read+0x1a4
> check_pointer+0x58
> string+0x3c
> vsnprintf+0x1bc
> vscnprintf+0x20
> printk_safe_log_store+0x7c
> printk+0x40
> dump_stack_print_info+0xbc
> dump_stack+0x8
> probe_kernel_read+0x1a4
> probe_kernel_read+0x19c
> check_pointer+0x58
> string+0x3c
> vsnprintf+0x1bc
> vscnprintf+0x20
> vprintk_store+0x6c
> vprintk_emit+0xec
> vprintk_func+0xd4
> printk+0x40
> cpufeatures_process_feature+0xc8
> scan_cpufeatures_subnodes+0x380
> of_scan_flat_dt_subnodes+0xb4
> dt_cpu_ftrs_scan_callback+0x158
> of_scan_flat_dt+0xf0
> dt_cpu_ftrs_scan+0x3c
> early_init_devtree+0x360
> early_setup+0x9c
>
> 2. Report on s390:
>
> vsnprintf invocations, are broken on s390. For example, the early boot
> output now looks like this where the first (efault) should be
> the linux_banner:
>
> [ 0.099985] (efault)
> [ 0.099985] setup: Linux is running as a z/VM guest operating system in 64-bit mode
> [ 0.100066] setup: The maximum memory size is 8192MB
> [ 0.100070] cma: Reserved 4 MiB at (efault)
> [ 0.100100] numa: NUMA mode: (efault)
>
> The reason for this, is that the code assumes that
> probe_kernel_address() works very early. This however is not true on
> at least s390. Uaccess on KERNEL_DS works only after page tables have
> been setup on s390, which happens with setup_arch()->paging_init().
>
> Any probe_kernel_address() invocation before that will return -EFAULT.
>

It's seems as a good enough fix.
Reviewed-by: Andy Shevchenko <[email protected]>

Though in all cases would be nice to distinguish error pointers as well.
Something like

if (IS_ERR(ptr))
return err_pointer_str(ptr);

in check_pointer_msg().

> Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid pointers")
> Signed-off-by: Petr Mladek <[email protected]>
> ---
> lib/vsprintf.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 7b0a6140bfad..8b43a883be6b 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -640,8 +640,13 @@ static const char *check_pointer_msg(const void *ptr)
> if (!ptr)
> return "(null)";
>
> - if (probe_kernel_address(ptr, byte))
> - return "(efault)";
> + /* User space address handling is not ready during early boot. */
> + if (system_state <= SYSTEM_BOOTING) {
> + if ((unsigned long)ptr < PAGE_SIZE)
> + return "(efault)";
> + } else {
> + if (probe_kernel_address(ptr, byte))
> + return "(efault)";
>
> return NULL;
> }
> --
> 2.16.4
>

--
With Best Regards,
Andy Shevchenko


2019-05-09 13:15:23

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Thu, 9 May 2019 14:19:23 +0200
Petr Mladek <[email protected]> wrote:

> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
>
> The check is only the best effort. Let's not rush with it during
> the early boot.
>
> Details:
>
> 1. Report on Power:
>
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
>
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.
>
> Because it happens so early you don't get any output, just an apparently
> dead system.
>
> The stack trace (which you don't see) is something like:
>
> ...
> dump_stack+0xdc
> probe_kernel_read+0x1a4
> check_pointer+0x58
> string+0x3c
> vsnprintf+0x1bc
> vscnprintf+0x20
> printk_safe_log_store+0x7c
> printk+0x40
> dump_stack_print_info+0xbc
> dump_stack+0x8
> probe_kernel_read+0x1a4
> probe_kernel_read+0x19c
> check_pointer+0x58
> string+0x3c
> vsnprintf+0x1bc
> vscnprintf+0x20
> vprintk_store+0x6c
> vprintk_emit+0xec
> vprintk_func+0xd4
> printk+0x40
> cpufeatures_process_feature+0xc8
> scan_cpufeatures_subnodes+0x380
> of_scan_flat_dt_subnodes+0xb4
> dt_cpu_ftrs_scan_callback+0x158
> of_scan_flat_dt+0xf0
> dt_cpu_ftrs_scan+0x3c
> early_init_devtree+0x360
> early_setup+0x9c
>
> 2. Report on s390:
>
> vsnprintf invocations, are broken on s390. For example, the early boot
> output now looks like this where the first (efault) should be
> the linux_banner:
>
> [ 0.099985] (efault)
> [ 0.099985] setup: Linux is running as a z/VM guest operating system in 64-bit mode
> [ 0.100066] setup: The maximum memory size is 8192MB
> [ 0.100070] cma: Reserved 4 MiB at (efault)
> [ 0.100100] numa: NUMA mode: (efault)
>
> The reason for this, is that the code assumes that
> probe_kernel_address() works very early. This however is not true on
> at least s390. Uaccess on KERNEL_DS works only after page tables have
> been setup on s390, which happens with setup_arch()->paging_init().
>
> Any probe_kernel_address() invocation before that will return -EFAULT.

Hmm, this sounds to me that probe_kernel_address() is broken for these
architectures. Perhaps the system_state check should be in
probe_kernel_address() for those architectures?


>
> Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid pointers")
> Signed-off-by: Petr Mladek <[email protected]>
> ---
> lib/vsprintf.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 7b0a6140bfad..8b43a883be6b 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -640,8 +640,13 @@ static const char *check_pointer_msg(const void *ptr)
> if (!ptr)
> return "(null)";
>
> - if (probe_kernel_address(ptr, byte))
> - return "(efault)";

Either that, or we add a macro to those architectures and add:

#ifdef ARCH_NO_EARLY_PROBE_KERNEL_ADDRESS

> + /* User space address handling is not ready during early boot. */
> + if (system_state <= SYSTEM_BOOTING) {
> + if ((unsigned long)ptr < PAGE_SIZE)
> + return "(efault)";
> + } else {

#endif

Why add an unnecessary branch for archs this is not a problem for?

-- Steve

> + if (probe_kernel_address(ptr, byte))
> + return "(efault)";
>
> return NULL;
> }

2019-05-09 13:39:46

by Michal Suchánek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Thu, 9 May 2019 14:19:23 +0200
Petr Mladek <[email protected]> wrote:

> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
>
> The check is only the best effort. Let's not rush with it during
> the early boot.
>
> Details:
>
> 1. Report on Power:
>
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
>
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features).

There is early_mmu_has_feature for this case. mmu_has_feature does not
work before patching so parts of kernel that can run before patching
must use the early_ variant which actually runs code reading the
feature bitmap to determine the answer.

Thanks

Michal

2019-05-09 13:48:47

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] vsprintf: Do not break early boot with probing addresses

From: Michal Suchánek
> Sent: 09 May 2019 14:38
...
> > The problem is the combination of some new code called via printk(),
> > check_pointer() which calls probe_kernel_read(). That then calls
> > allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> > (before we've patched features).
>
> There is early_mmu_has_feature for this case. mmu_has_feature does not
> work before patching so parts of kernel that can run before patching
> must use the early_ variant which actually runs code reading the
> feature bitmap to determine the answer.

Does the early_ variant get patched so the it is reasonably
efficient after the 'patching' is done?
Or should there be a third version which gets patched across?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2019-05-09 14:47:57

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Thu 2019-05-09 09:13:57, Steven Rostedt wrote:
> On Thu, 9 May 2019 14:19:23 +0200
> Petr Mladek <[email protected]> wrote:
>
> > The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> > invalid pointers") broke boot on several architectures. The common
> > pattern is that probe_kernel_read() is not working during early
> > boot because userspace access framework is not ready.
> >
> > The check is only the best effort. Let's not rush with it during
> > the early boot.
> >
> > Details:
> >
> > 1. Report on Power:
> >
> > Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> > CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
> >
> > The problem is the combination of some new code called via printk(),
> > check_pointer() which calls probe_kernel_read(). That then calls
> > allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> > (before we've patched features). With the JUMP_LABEL debug enabled that
> > causes us to call printk() & dump_stack() and we end up recursing and
> > overflowing the stack.
> >
> > Because it happens so early you don't get any output, just an apparently
> > dead system.
> >
> > The stack trace (which you don't see) is something like:
> >
> > ...
> > dump_stack+0xdc
> > probe_kernel_read+0x1a4
> > check_pointer+0x58
> > string+0x3c
> > vsnprintf+0x1bc
> > vscnprintf+0x20
> > printk_safe_log_store+0x7c
> > printk+0x40
> > dump_stack_print_info+0xbc
> > dump_stack+0x8
> > probe_kernel_read+0x1a4
> > probe_kernel_read+0x19c
> > check_pointer+0x58
> > string+0x3c
> > vsnprintf+0x1bc
> > vscnprintf+0x20
> > vprintk_store+0x6c
> > vprintk_emit+0xec
> > vprintk_func+0xd4
> > printk+0x40
> > cpufeatures_process_feature+0xc8
> > scan_cpufeatures_subnodes+0x380
> > of_scan_flat_dt_subnodes+0xb4
> > dt_cpu_ftrs_scan_callback+0x158
> > of_scan_flat_dt+0xf0
> > dt_cpu_ftrs_scan+0x3c
> > early_init_devtree+0x360
> > early_setup+0x9c
> >
> > 2. Report on s390:
> >
> > vsnprintf invocations, are broken on s390. For example, the early boot
> > output now looks like this where the first (efault) should be
> > the linux_banner:
> >
> > [ 0.099985] (efault)
> > [ 0.099985] setup: Linux is running as a z/VM guest operating system in 64-bit mode
> > [ 0.100066] setup: The maximum memory size is 8192MB
> > [ 0.100070] cma: Reserved 4 MiB at (efault)
> > [ 0.100100] numa: NUMA mode: (efault)
> >
> > The reason for this, is that the code assumes that
> > probe_kernel_address() works very early. This however is not true on
> > at least s390. Uaccess on KERNEL_DS works only after page tables have
> > been setup on s390, which happens with setup_arch()->paging_init().
> >
> > Any probe_kernel_address() invocation before that will return -EFAULT.
>
> Hmm, this sounds to me that probe_kernel_address() is broken for these
> architectures. Perhaps the system_state check should be in
> probe_kernel_address() for those architectures?

Yeah. Well, these problems are hard to debug. It left a dead power
system with a blank screen. I am not sure if the added check is
worth the pain.

I hope that the check would help to debug problems. But it is yet
another complexity in printk() path. I think that it is fine
to keep it enabled only on the booted system for a while
and get some more feedback.

Best Regards,
Petr

2019-05-10 04:49:17

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On (05/09/19 14:19), Petr Mladek wrote:
> 1. Report on Power:
>
> Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
> CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
>
> The problem is the combination of some new code called via printk(),
> check_pointer() which calls probe_kernel_read(). That then calls
> allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
> (before we've patched features). With the JUMP_LABEL debug enabled that
> causes us to call printk() & dump_stack() and we end up recursing and
> overflowing the stack.

Hmm... hmm... PPC does an .opd-based symbol dereference, which
eventually probe_kernel_read()-s. So early printk(%pS) will do

printk(%pS)
dereference_function_descriptor()
probe_kernel_address()
dump_stack()
printk(%pS)
dereference_function_descriptor()
probe_kernel_address()
dump_stack()
printk(%pS)
...

I'd say... that it's not vsprintf that we want to fix, it's
the idea that probe_kernel_address() can dump_stack() on any
platform. On some archs probe_kernel_address()->dump_stack()
is going nowhere:
dump_stack() does probe_kernel_address(), which calls dump_stack(),
which calls printk(%pS)->probe_kernel_address() again and again,
and again.

-ss

2019-05-10 05:08:17

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On (05/09/19 21:47), Linus Torvalds wrote:
> [ Sorry about html and mobile crud, I'm not at the computer right now ]
> How about we just undo the whole misguided probe_kernel_address() thing?

But the problem will remain - %pS/%pF on PPC (and some other arch-s)
do dereference_function_descriptor(), which calls probe_kernel_address().
So if probe_kernel_address() starts to dump_stack(), then we are heading
towards stack overflow. Unless I'm totally missing something.

-ss

2019-05-10 06:43:48

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

Sergey Senozhatsky <[email protected]> writes:
> On (05/09/19 21:47), Linus Torvalds wrote:
>> [ Sorry about html and mobile crud, I'm not at the computer right now ]
>> How about we just undo the whole misguided probe_kernel_address() thing?
>
> But the problem will remain - %pS/%pF on PPC (and some other arch-s)
> do dereference_function_descriptor(), which calls probe_kernel_address().

(Only on 64-bit big endian, and we may even change that one day)

> So if probe_kernel_address() starts to dump_stack(), then we are heading
> towards stack overflow. Unless I'm totally missing something.

We only ended up calling dump_stack() from probe_kernel_address() due to
a combination of things:
1. probe_kernel_address() actually uses __copy_from_user_inatomic()
which is silly because it's not doing a user access.
2. our user access code uses mmu_has_feature() which uses jump labels,
and so isn't safe to call until we've initialised those jump labels.
This is unnecessarily fragile, we can easily make the user access
code safe to call before the jump labels are initialised.
3. we had extra debug code enabled in mmu_has_feature() which calls
dump_stack().

I've fixed 2, and plan to fix 1 as well at some point. And 3 is behind a
CONFIG option that no one except me is going to have enabled in
practice.

So in future we shouldn't be calling dump_stack() in that path.

cheers

2019-05-10 08:07:19

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri 2019-05-10 14:07:09, Sergey Senozhatsky wrote:
> On (05/09/19 21:47), Linus Torvalds wrote:
> > [ Sorry about html and mobile crud, I'm not at the computer right now ]
> > How about we just undo the whole misguided probe_kernel_address() thing?
>
> But the problem will remain - %pS/%pF on PPC (and some other arch-s)
> do dereference_function_descriptor(), which calls probe_kernel_address().
> So if probe_kernel_address() starts to dump_stack(), then we are heading
> towards stack overflow. Unless I'm totally missing something.

That is true. On the other hand, %pS/%pF uses
dereference_function_descriptor() only on three architectures.
And these modifiers are used only rarely (ok, in dump_stack()
but still).

On the other hand, any infinite loop in vsprintf() via
probe_kernel_address() would break any printk(). And would be
hard to debug.

I tend to agree with Linus. probe_kernel_address() is too complicated.
It is prone to these infinite loops and should not be used in
the default printk() path.

It would be nice to have a lightweight and safe alternative. But
I can't find any. And I think that it is not worth any huge
complexity. It was just a nice to have idea...


I am going to send a patch replacing probe_kernel_address() with
a simple check:

if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
return "(efault)";

The original patch still makes sense because it adds the check
into more locations and replaces some custom variants.

Best Regards,
Petr

2019-05-10 08:17:53

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On (05/10/19 10:06), Petr Mladek wrote:
[..]
> I am going to send a patch replacing probe_kernel_address() with
> a simple check:
>
> if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> return "(efault)";

I'm OK with this.
Probing ptrs was a good idea, it just didn't work out.

-ss

2019-05-10 08:44:13

by Petr Mladek

[permalink] [raw]
Subject: [PATCH] vsprintf: Do not break early boot with probing addresses

The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
invalid pointers") broke boot on several architectures. The common
pattern is that probe_kernel_read() is not working during early
boot because userspace access framework is not ready.

It is a generic problem. We have to avoid any complex external
functions in vsprintf() code, especially in the common path.
They might break printk() easily and are hard to debug.

Replace probe_kernel_read() with some simple checks for obvious
problems.

Details:

1. Report on Power:

Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG

The problem is the combination of some new code called via printk(),
check_pointer() which calls probe_kernel_read(). That then calls
allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
(before we've patched features). With the JUMP_LABEL debug enabled that
causes us to call printk() & dump_stack() and we end up recursing and
overflowing the stack.

Because it happens so early you don't get any output, just an apparently
dead system.

The stack trace (which you don't see) is something like:

...
dump_stack+0xdc
probe_kernel_read+0x1a4
check_pointer+0x58
string+0x3c
vsnprintf+0x1bc
vscnprintf+0x20
printk_safe_log_store+0x7c
printk+0x40
dump_stack_print_info+0xbc
dump_stack+0x8
probe_kernel_read+0x1a4
probe_kernel_read+0x19c
check_pointer+0x58
string+0x3c
vsnprintf+0x1bc
vscnprintf+0x20
vprintk_store+0x6c
vprintk_emit+0xec
vprintk_func+0xd4
printk+0x40
cpufeatures_process_feature+0xc8
scan_cpufeatures_subnodes+0x380
of_scan_flat_dt_subnodes+0xb4
dt_cpu_ftrs_scan_callback+0x158
of_scan_flat_dt+0xf0
dt_cpu_ftrs_scan+0x3c
early_init_devtree+0x360
early_setup+0x9c

2. Report on s390:

vsnprintf invocations, are broken on s390. For example, the early boot
output now looks like this where the first (efault) should be
the linux_banner:

[ 0.099985] (efault)
[ 0.099985] setup: Linux is running as a z/VM guest operating system in 64-bit mode
[ 0.100066] setup: The maximum memory size is 8192MB
[ 0.100070] cma: Reserved 4 MiB at (efault)
[ 0.100100] numa: NUMA mode: (efault)

The reason for this, is that the code assumes that
probe_kernel_address() works very early. This however is not true on
at least s390. Uaccess on KERNEL_DS works only after page tables have
been setup on s390, which happens with setup_arch()->paging_init().

Any probe_kernel_address() invocation before that will return -EFAULT.

Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid pointers")
Signed-off-by: Petr Mladek <[email protected]>
---
lib/vsprintf.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 7b0a6140bfad..2f003cfe340e 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -628,19 +628,16 @@ static char *error_string(char *buf, char *end, const char *s,
}

/*
- * This is not a fool-proof test. 99% of the time that this will fault is
- * due to a bad pointer, not one that crosses into bad memory. Just test
- * the address to make sure it doesn't fault due to a poorly added printk
- * during debugging.
+ * Do not call any complex external code here. Nested printk()/vsprintf()
+ * might cause infinite loops. Failures might break printk() and would
+ * be hard to debug.
*/
static const char *check_pointer_msg(const void *ptr)
{
- char byte;
-
if (!ptr)
return "(null)";

- if (probe_kernel_address(ptr, byte))
+ if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
return "(efault)";

return NULL;
--
2.16.4

2019-05-10 08:53:41

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On (05/10/19 10:42), Petr Mladek wrote:
[..]
> Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid pointers")
> Signed-off-by: Petr Mladek <[email protected]>

FWIW
Reviewed-by: Sergey Senozhatsky <[email protected]>

-ss

2019-05-10 10:22:12

by Michael Ellerman

[permalink] [raw]
Subject: RE: [PATCH] vsprintf: Do not break early boot with probing addresses

David Laight <[email protected]> writes:
> From: Michal Suchánek
>> Sent: 09 May 2019 14:38
> ...
>> > The problem is the combination of some new code called via printk(),
>> > check_pointer() which calls probe_kernel_read(). That then calls
>> > allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
>> > (before we've patched features).
>>
>> There is early_mmu_has_feature for this case. mmu_has_feature does not
>> work before patching so parts of kernel that can run before patching
>> must use the early_ variant which actually runs code reading the
>> feature bitmap to determine the answer.
>
> Does the early_ variant get patched so the it is reasonably
> efficient after the 'patching' is done?

No they don't get patched ever. The name is a bit misleading I guess.

> Or should there be a third version which gets patched across?

For a case like this it's entirely safe to just skip the code early in
boot, so if it was a static_key_false everything would just work.

Unfortunately the way the code is currently written we would have to
change all MMU features to static_key_false and that risks breaking
something else.

We have a long standing TODO to rework all our feature logic and unify
CPU/MMU/firmware/etc. features. Possibly as part of that we can come up
with a scheme where the default value is per-feature bit.

Having said all that, in this case the overhead of the test and branch
is small compared to the cost of writing to the SPR which controls user
access and then doing an isync, so it's all somewhat premature
optimisation.

cheers

2019-05-10 14:53:03

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri 2019-05-10 10:42:13, Petr Mladek wrote:
> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because userspace access framework is not ready.
>
> It is a generic problem. We have to avoid any complex external
> functions in vsprintf() code, especially in the common path.
> They might break printk() easily and are hard to debug.
>
> Replace probe_kernel_read() with some simple checks for obvious
> problems.

JFYI, I have sent a pull request with this patch, see
https://lkml.kernel.org/r/[email protected]

Best Regards,
Petr

2019-05-10 16:26:55

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri, 10 May 2019 10:42:13 +0200
Petr Mladek <[email protected]> wrote:

> static const char *check_pointer_msg(const void *ptr)
> {
> - char byte;
> -
> if (!ptr)
> return "(null)";
>
> - if (probe_kernel_address(ptr, byte))
> + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> return "(efault)";
>


< PAGE_SIZE ?

do you mean: < TASK_SIZE ?

-- Steve

2019-05-10 16:35:04

by Martin Schwidefsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri, 10 May 2019 12:24:01 -0400
Steven Rostedt <[email protected]> wrote:

> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek <[email protected]> wrote:
>
> > static const char *check_pointer_msg(const void *ptr)
> > {
> > - char byte;
> > -
> > if (!ptr)
> > return "(null)";
> >
> > - if (probe_kernel_address(ptr, byte))
> > + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > return "(efault)";
> >
>
>
> < PAGE_SIZE ?
>
> do you mean: < TASK_SIZE ?

The check with < TASK_SIZE would break on s390. The 'ptr' is
in the kernel address space, *not* in the user address space.
Remember s390 has two separate address spaces for kernel/user
the check < TASK_SIZE only makes sense with a __user pointer.

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.

2019-05-10 16:43:48

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri, 10 May 2019 18:32:58 +0200
Martin Schwidefsky <[email protected]> wrote:

> On Fri, 10 May 2019 12:24:01 -0400
> Steven Rostedt <[email protected]> wrote:
>
> > On Fri, 10 May 2019 10:42:13 +0200
> > Petr Mladek <[email protected]> wrote:
> >
> > > static const char *check_pointer_msg(const void *ptr)
> > > {
> > > - char byte;
> > > -
> > > if (!ptr)
> > > return "(null)";
> > >
> > > - if (probe_kernel_address(ptr, byte))
> > > + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > return "(efault)";
> > >
> >
> >
> > < PAGE_SIZE ?
> >
> > do you mean: < TASK_SIZE ?
>
> The check with < TASK_SIZE would break on s390. The 'ptr' is
> in the kernel address space, *not* in the user address space.
> Remember s390 has two separate address spaces for kernel/user
> the check < TASK_SIZE only makes sense with a __user pointer.
>

So we allow this to read user addresses? Can't that cause a fault?

If the condition is true, we return "(efault)".

-- Steve

2019-05-10 16:44:41

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri, May 10, 2019 at 12:24:01PM -0400, Steven Rostedt wrote:
> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek <[email protected]> wrote:
>
> > static const char *check_pointer_msg(const void *ptr)
> > {
> > - char byte;
> > -
> > if (!ptr)
> > return "(null)";
> >
> > - if (probe_kernel_address(ptr, byte))
> > + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > return "(efault)";
> >
>
>
> < PAGE_SIZE ?
>
> do you mean: < TASK_SIZE ?

Original code used PAGE_SIZE. If it needs to be changed, that it might be a
separate explanation / patch.

--
With Best Regards,
Andy Shevchenko


2019-05-10 16:48:18

by Martin Schwidefsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri, 10 May 2019 12:40:58 -0400
Steven Rostedt <[email protected]> wrote:

> On Fri, 10 May 2019 18:32:58 +0200
> Martin Schwidefsky <[email protected]> wrote:
>
> > On Fri, 10 May 2019 12:24:01 -0400
> > Steven Rostedt <[email protected]> wrote:
> >
> > > On Fri, 10 May 2019 10:42:13 +0200
> > > Petr Mladek <[email protected]> wrote:
> > >
> > > > static const char *check_pointer_msg(const void *ptr)
> > > > {
> > > > - char byte;
> > > > -
> > > > if (!ptr)
> > > > return "(null)";
> > > >
> > > > - if (probe_kernel_address(ptr, byte))
> > > > + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > > return "(efault)";
> > > >
> > >
> > >
> > > < PAGE_SIZE ?
> > >
> > > do you mean: < TASK_SIZE ?
> >
> > The check with < TASK_SIZE would break on s390. The 'ptr' is
> > in the kernel address space, *not* in the user address space.
> > Remember s390 has two separate address spaces for kernel/user
> > the check < TASK_SIZE only makes sense with a __user pointer.
> >
>
> So we allow this to read user addresses? Can't that cause a fault?
>
> If the condition is true, we return "(efault)".

On x86 this would allow a user space access as kernel and user live
in the same address space, on s390 it would not.
h
--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.

2019-05-10 17:37:46

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses



Le 10/05/2019 à 18:24, Steven Rostedt a écrit :
> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek <[email protected]> wrote:
>
>> static const char *check_pointer_msg(const void *ptr)
>> {
>> - char byte;
>> -
>> if (!ptr)
>> return "(null)";
>>
>> - if (probe_kernel_address(ptr, byte))
>> + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>> return "(efault)";
>>
>
>
> < PAGE_SIZE ?
>
> do you mean: < TASK_SIZE ?

I guess not.

Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
struct)

Christophe

---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus

2019-05-13 09:15:19

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] vsprintf: Do not break early boot with probing addresses

From: christophe leroy
> Sent: 10 May 2019 18:35
> Le 10/05/2019 à 18:24, Steven Rostedt a écrit :
> > On Fri, 10 May 2019 10:42:13 +0200
> > Petr Mladek <[email protected]> wrote:
> >
> >> static const char *check_pointer_msg(const void *ptr)
> >> {
> >> - char byte;
> >> -
> >> if (!ptr)
> >> return "(null)";
> >>
> >> - if (probe_kernel_address(ptr, byte))
> >> + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> >> return "(efault)";

"efault" looks a bit like a spellling mistake for "default".

> > < PAGE_SIZE ?
> >
> > do you mean: < TASK_SIZE ?
>
> I guess not.
>
> Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
> struct)

Maybe the caller should pass in a short buffer so that you can return
"(err-%d)" or "(null+%#x)" ?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2019-05-13 09:19:19

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Mon, May 13, 2019 at 08:52:41AM +0000, David Laight wrote:
> From: christophe leroy
> > Sent: 10 May 2019 18:35
> > Le 10/05/2019 ? 18:24, Steven Rostedt a ?crit?:
> > > On Fri, 10 May 2019 10:42:13 +0200
> > > Petr Mladek <[email protected]> wrote:

> > >> - if (probe_kernel_address(ptr, byte))
> > >> + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > >> return "(efault)";
>
> "efault" looks a bit like a spellling mistake for "default".

It's a special, thus it's in parenthesis, though somebody can be
misguided.

> > Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
> > struct)
>
> Maybe the caller should pass in a short buffer so that you can return
> "(err-%d)"
> or "(null+%#x)" ?

In both cases it should be limited to the size of pointer (8 or 16
characters). Something like "(e:%4d)" would work for error codes.

The "(null)" is good enough by itself and already an established
practice..

--
With Best Regards,
Andy Shevchenko


2019-05-13 13:41:23

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Fri 2019-05-10 12:40:58, Steven Rostedt wrote:
> On Fri, 10 May 2019 18:32:58 +0200
> Martin Schwidefsky <[email protected]> wrote:
>
> > On Fri, 10 May 2019 12:24:01 -0400
> > Steven Rostedt <[email protected]> wrote:
> >
> > > On Fri, 10 May 2019 10:42:13 +0200
> > > Petr Mladek <[email protected]> wrote:
> > >
> > > > static const char *check_pointer_msg(const void *ptr)
> > > > {
> > > > - char byte;
> > > > -
> > > > if (!ptr)
> > > > return "(null)";
> > > >
> > > > - if (probe_kernel_address(ptr, byte))
> > > > + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > > return "(efault)";
> > > >
> > >
> > >
> > > < PAGE_SIZE ?
> > >
> > > do you mean: < TASK_SIZE ?
> >
> > The check with < TASK_SIZE would break on s390. The 'ptr' is
> > in the kernel address space, *not* in the user address space.
> > Remember s390 has two separate address spaces for kernel/user
> > the check < TASK_SIZE only makes sense with a __user pointer.
> >
>
> So we allow this to read user addresses? Can't that cause a fault?

I did some quick check and did not found anywhere a user pointer
being dereferenced via vsprintf().

In each case, %s did the check (ptr < PAGE_SIZE) even before this
patchset. The other checks are in %p format modifiers that are
used to print various kernel structures.

Finally, it accesses the pointer directly. I am not completely sure
but I think that it would not work that easily with an address
from the user address space.

Best Regards,
Petr

2019-05-13 14:25:04

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Mon, 13 May 2019 14:42:20 +0200
Petr Mladek <[email protected]> wrote:

> > The "(null)" is good enough by itself and already an established
> > practice..
>
> (efault) made more sense with the probe_kernel_read() that
> checked wide range of addresses. Well, I still think that
> it makes sense to distinguish a pure NULL. And it still
> used also for IS_ERR_VALUE().

Why not just "(fault)"? That is self descriptive enough.

-- Steve

2019-05-13 15:17:08

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Mon 2019-05-13 12:13:20, Andy Shevchenko wrote:
> On Mon, May 13, 2019 at 08:52:41AM +0000, David Laight wrote:
> > From: christophe leroy
> > > Sent: 10 May 2019 18:35
> > > Le 10/05/2019 ? 18:24, Steven Rostedt a ?crit?:
> > > > On Fri, 10 May 2019 10:42:13 +0200
> > > > Petr Mladek <[email protected]> wrote:
>
> > > >> - if (probe_kernel_address(ptr, byte))
> > > >> + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > >> return "(efault)";
> >
> > "efault" looks a bit like a spellling mistake for "default".
>
> It's a special, thus it's in parenthesis, though somebody can be
> misguided.
>
> > > Usually, < PAGE_SIZE means NULL pointer dereference (via the member of a
> > > struct)
> >
> > Maybe the caller should pass in a short buffer so that you can return
> > "(err-%d)"
> > or "(null+%#x)" ?

There are many vsprintf() users. I am afraid that nobody would want
to define extra buffers for error messages. It must fit into
the one for the formatted string.


> In both cases it should be limited to the size of pointer (8 or 16
> characters). Something like "(e:%4d)" would work for error codes.

I am afraid that we could get 10 different proposals from a huge
enough group of people. I wanted to start with something simple
(source code). I know that (efault) might be confused with
"default" but it is still just one string to grep.


> The "(null)" is good enough by itself and already an established
> practice..

(efault) made more sense with the probe_kernel_read() that
checked wide range of addresses. Well, I still think that
it makes sense to distinguish a pure NULL. And it still
used also for IS_ERR_VALUE().

Best Regards,
Petr

2019-05-14 02:08:41

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On (05/13/19 14:42), Petr Mladek wrote:
> > The "(null)" is good enough by itself and already an established
> > practice..
>
> (efault) made more sense with the probe_kernel_read() that
> checked wide range of addresses. Well, I still think that
> it makes sense to distinguish a pure NULL. And it still
> used also for IS_ERR_VALUE().

Wouldn't anything within first PAGE_SIZE bytes be reported as
a NULL deref?

char *p = (char *)(PAGE_SIZE - 2);
*p = 'a';

gives

kernel: BUG: kernel NULL pointer dereference, address = 0000000000000ffe
kernel: #PF: supervisor-privileged write access from kernel code
kernel: #PF: error_code(0x0002) - not-present page


And I like Steven's "(fault)" idea.
How about this:

if ptr < PAGE_SIZE -> "(null)"
if IS_ERR_VALUE(ptr) -> "(fault)"

-ss

2019-05-14 02:27:46

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On (05/14/19 11:07), Sergey Senozhatsky wrote:
> How about this:
>
> if ptr < PAGE_SIZE -> "(null)"

No, this is totally stupid. Forget about it. Sorry.


> if IS_ERR_VALUE(ptr) -> "(fault)"

But Steven's "(fault)" is nice.

-ss

2019-05-14 08:30:22

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] vsprintf: Do not break early boot with probing addresses

> And I like Steven's "(fault)" idea.
> How about this:
>
> if ptr < PAGE_SIZE -> "(null)"
> if IS_ERR_VALUE(ptr) -> "(fault)"
>
> -ss

Or:
if (ptr < PAGE_SIZE)
return ptr ? "(null+)" : "(null)";
if IS_ERR_VALUE(ptr)
return "(errno)"

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2019-05-14 09:04:08

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Tue, May 14, 2019 at 10:29 AM David Laight <[email protected]> wrote:
> > And I like Steven's "(fault)" idea.
> > How about this:
> >
> > if ptr < PAGE_SIZE -> "(null)"
> > if IS_ERR_VALUE(ptr) -> "(fault)"
> >
> > -ss
>
> Or:
> if (ptr < PAGE_SIZE)
> return ptr ? "(null+)" : "(null)";
> if IS_ERR_VALUE(ptr)
> return "(errno)"

Do we care about the value? "(-E%u)"?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-05-14 18:39:13

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses


[ Purple is a nice shade on the bike shed. ;-) ]

On Tue, 14 May 2019 11:02:17 +0200
Geert Uytterhoeven <[email protected]> wrote:

> On Tue, May 14, 2019 at 10:29 AM David Laight <[email protected]> wrote:
> > > And I like Steven's "(fault)" idea.
> > > How about this:
> > >
> > > if ptr < PAGE_SIZE -> "(null)"
> > > if IS_ERR_VALUE(ptr) -> "(fault)"
> > >
> > > -ss
> >
> > Or:
> > if (ptr < PAGE_SIZE)
> > return ptr ? "(null+)" : "(null)";

Hmm, that is useful.

> > if IS_ERR_VALUE(ptr)
> > return "(errno)"

I still prefer "(fault)" as is pretty much all I would expect from a
pointer dereference, even if it is just bad parsing of, say, a parsing
an MAC address. "fault" is generic enough. "errno" will be confusing,
because that's normally a variable not a output.

>
> Do we care about the value? "(-E%u)"?

That too could be confusing. What would (-E22) be considered by a user
doing an sprintf() on some string. I know that would confuse me, or I
would think that it was what the %pX displayed, and wonder why it
displayed it that way. Whereas "(fault)" is quite obvious for any %p
use case.

-- Steve

2019-05-14 19:14:55

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

Hi Steve,

On Tue, May 14, 2019 at 8:37 PM Steven Rostedt <[email protected]> wrote:
> On Tue, 14 May 2019 11:02:17 +0200
> Geert Uytterhoeven <[email protected]> wrote:
> > On Tue, May 14, 2019 at 10:29 AM David Laight <[email protected]> wrote:
> > > > And I like Steven's "(fault)" idea.
> > > > How about this:
> > > >
> > > > if ptr < PAGE_SIZE -> "(null)"
> > > > if IS_ERR_VALUE(ptr) -> "(fault)"
> > > >
> > > > -ss
> > >
> > > Or:
> > > if (ptr < PAGE_SIZE)
> > > return ptr ? "(null+)" : "(null)";
>
> Hmm, that is useful.
>
> > > if IS_ERR_VALUE(ptr)
> > > return "(errno)"
>
> I still prefer "(fault)" as is pretty much all I would expect from a
> pointer dereference, even if it is just bad parsing of, say, a parsing
> an MAC address. "fault" is generic enough. "errno" will be confusing,
> because that's normally a variable not a output.
>
> >
> > Do we care about the value? "(-E%u)"?
>
> That too could be confusing. What would (-E22) be considered by a user
> doing an sprintf() on some string. I know that would confuse me, or I
> would think that it was what the %pX displayed, and wonder why it
> displayed it that way. Whereas "(fault)" is quite obvious for any %p
> use case.

I would immediately understand there's a missing IS_ERR() check in a
function that can return -EINVAL, without having to add a new printk()
to find out what kind of bogus value has been received, and without
having to reboot, and trying to reproduce...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-05-14 19:38:38

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Tue, 14 May 2019 21:13:06 +0200
Geert Uytterhoeven <[email protected]> wrote:

> > > Do we care about the value? "(-E%u)"?
> >
> > That too could be confusing. What would (-E22) be considered by a user
> > doing an sprintf() on some string. I know that would confuse me, or I
> > would think that it was what the %pX displayed, and wonder why it
> > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > use case.
>
> I would immediately understand there's a missing IS_ERR() check in a
> function that can return -EINVAL, without having to add a new printk()
> to find out what kind of bogus value has been received, and without
> having to reboot, and trying to reproduce...

Hi Geert,

I have to ask. Has there actually been a case that you used a %pX and
it faulted, and you had to go back to find what the value of the
failure was?

IMO, sprintf() should not be a tool to do this, because then people
will not add their IS_ERR() and just let sprintf() do the job for them.
I don't think that would be wise to allow.

-- Steve

2019-05-15 06:22:53

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On (05/14/19 21:13), Geert Uytterhoeven wrote:
> I would immediately understand there's a missing IS_ERR() check in a
> function that can return -EINVAL, without having to add a new printk()
> to find out what kind of bogus value has been received, and without
> having to reboot, and trying to reproduce...

But chances are that missing IS_ERR() will crash the kernel sooner
or later (in general case), if not in sprintf() then somewhere else.

-ss

2019-05-15 07:25:49

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

Hi Steve,

On Tue, May 14, 2019 at 9:35 PM Steven Rostedt <[email protected]> wrote:
> On Tue, 14 May 2019 21:13:06 +0200
> Geert Uytterhoeven <[email protected]> wrote:
> > > > Do we care about the value? "(-E%u)"?
> > >
> > > That too could be confusing. What would (-E22) be considered by a user
> > > doing an sprintf() on some string. I know that would confuse me, or I
> > > would think that it was what the %pX displayed, and wonder why it
> > > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > > use case.
> >
> > I would immediately understand there's a missing IS_ERR() check in a
> > function that can return -EINVAL, without having to add a new printk()
> > to find out what kind of bogus value has been received, and without
> > having to reboot, and trying to reproduce...
>
> I have to ask. Has there actually been a case that you used a %pX and
> it faulted, and you had to go back to find what the value of the
> failure was?

If it faulted, the bad pointer value is obvious from the backtrace.
If the code avoids the fault by verifying the pointer and returning
"(efault)" instead, the bad pointer value is lost.

Or am I missing something?

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-05-15 07:37:25

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Tue 2019-05-14 14:37:51, Steven Rostedt wrote:
>
> [ Purple is a nice shade on the bike shed. ;-) ]
>
> On Tue, 14 May 2019 11:02:17 +0200
> Geert Uytterhoeven <[email protected]> wrote:
>
> > On Tue, May 14, 2019 at 10:29 AM David Laight <[email protected]> wrote:
> > > > And I like Steven's "(fault)" idea.
> > > > How about this:
> > > >
> > > > if ptr < PAGE_SIZE -> "(null)"
> > > > if IS_ERR_VALUE(ptr) -> "(fault)"
> > > >
> > > > -ss
> > >
> > > Or:
> > > if (ptr < PAGE_SIZE)
> > > return ptr ? "(null+)" : "(null)";
>
> Hmm, that is useful.
>
> > > if IS_ERR_VALUE(ptr)
> > > return "(errno)"
>
> I still prefer "(fault)" as is pretty much all I would expect from a
> pointer dereference, even if it is just bad parsing of, say, a parsing
> an MAC address. "fault" is generic enough. "errno" will be confusing,
> because that's normally a variable not a output.
>
> >
> > Do we care about the value? "(-E%u)"?
>
> That too could be confusing. What would (-E22) be considered by a user
> doing an sprintf() on some string. I know that would confuse me, or I
> would think that it was what the %pX displayed, and wonder why it
> displayed it that way. Whereas "(fault)" is quite obvious for any %p
> use case.

This discussion clearly shows that it is hard to make anyone happy.

I considered switching to "(fault)" because there seems to be more
people in favor of this.

But there is used also "(einval)" when an unsupported pointer
modifier is passed. The idea is to show error codes that people
are familiar with.

It might have been better to use the uppercase "(EFAULT)" and
"(EINVAL)" to make it more obvious. But I wanted to follow
the existing style with the lowercase "(null)".

As of now, I think that we should keep it as is unless there is
some wider agreement on a change.

Best Regards,
Petr

2019-05-15 07:55:15

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH] vsprintf: Do not break early boot with probing addresses

On Wed 2019-05-15 09:23:05, Geert Uytterhoeven wrote:
> Hi Steve,
>
> On Tue, May 14, 2019 at 9:35 PM Steven Rostedt <[email protected]> wrote:
> > On Tue, 14 May 2019 21:13:06 +0200
> > Geert Uytterhoeven <[email protected]> wrote:
> > > > > Do we care about the value? "(-E%u)"?
> > > >
> > > > That too could be confusing. What would (-E22) be considered by a user
> > > > doing an sprintf() on some string. I know that would confuse me, or I
> > > > would think that it was what the %pX displayed, and wonder why it
> > > > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > > > use case.
> > >
> > > I would immediately understand there's a missing IS_ERR() check in a
> > > function that can return -EINVAL, without having to add a new printk()
> > > to find out what kind of bogus value has been received, and without
> > > having to reboot, and trying to reproduce...
> >
> > I have to ask. Has there actually been a case that you used a %pX and
> > it faulted, and you had to go back to find what the value of the
> > failure was?
>
> If it faulted, the bad pointer value is obvious from the backtrace.
> If the code avoids the fault by verifying the pointer and returning
> "(efault)" instead, the bad pointer value is lost.
>
> Or am I missing something?

Should buggy printk() crash the system?

Another problem is that vsprintf() is called in printk() under
lockbuf_lock. The messages are stored into printk_safe per CPU
buffers. It allows to see the nested messages. But there is still
a bigger risk of missing them than with a "normal" fault.

Finally, various variants of these checks were already used
in "random" printf formats. The only change is that we are
using them consistently everywhere[*] a pointer is accessed.

[*] Just the top level pointer is checked. Some pointer modifiers
are accessing ptr->ptr->val. The lower level pointers are not
checked to avoid too much complexity.

Best Regards,
Petr

2019-05-15 09:03:06

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] vsprintf: Do not break early boot with probing addresses

From: Petr Mladek
> Sent: 15 May 2019 08:36
> On Tue 2019-05-14 14:37:51, Steven Rostedt wrote:
> >
> > [ Purple is a nice shade on the bike shed. ;-) ]
> >
> > On Tue, 14 May 2019 11:02:17 +0200
> > Geert Uytterhoeven <[email protected]> wrote:
> >
> > > On Tue, May 14, 2019 at 10:29 AM David Laight <[email protected]> wrote:
> > > > > And I like Steven's "(fault)" idea.
> > > > > How about this:
> > > > >
> > > > > if ptr < PAGE_SIZE -> "(null)"
> > > > > if IS_ERR_VALUE(ptr) -> "(fault)"
> > > > >
> > > > > -ss
> > > >
> > > > Or:
> > > > if (ptr < PAGE_SIZE)
> > > > return ptr ? "(null+)" : "(null)";
> >
> > Hmm, that is useful.
> >
> > > > if IS_ERR_VALUE(ptr)
> > > > return "(errno)"
> >
> > I still prefer "(fault)" as is pretty much all I would expect from a
> > pointer dereference, even if it is just bad parsing of, say, a parsing
> > an MAC address. "fault" is generic enough. "errno" will be confusing,
> > because that's normally a variable not a output.
> >
> > >
> > > Do we care about the value? "(-E%u)"?
> >
> > That too could be confusing. What would (-E22) be considered by a user
> > doing an sprintf() on some string. I know that would confuse me, or I
> > would think that it was what the %pX displayed, and wonder why it
> > displayed it that way. Whereas "(fault)" is quite obvious for any %p
> > use case.
>
> This discussion clearly shows that it is hard to make anyone happy.
>
> I considered switching to "(fault)" because there seems to be more
> people in favor of this.
>
> But there is used also "(einval)" when an unsupported pointer
> modifier is passed. The idea is to show error codes that people
> are familiar with.
>
> It might have been better to use the uppercase "(EFAULT)" and
> "(EINVAL)" to make it more obvious. But I wanted to follow
> the existing style with the lowercase "(null)".

Printing 'fault' when the code was (trying to) validate the
address was ok.
When the only check is for an -errno value it seems wrong as
most invalid addresses will actually fault (and panic).

The reason modern printf generate "(null)" is that it is far too
easy for a diagnostic print to fail to test a pointer.
It also makes it easier when 'throwing in' printf while debugging
to add a single trace that will work regardless of whether a
call had succeeded or not.

With the Linux kernel putting errno values into pointers it
seems likely that most invalid pointers in printf will actaully
be error values.
Printing the value will be helpful during debugging - as a
trace can be put after a call and show the parameters and result.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)