2013-10-01 20:35:27

by Helge Deller

Subject: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

print_worker_info() includes no validity check on the pwq and wq
pointers before handing them over to the probe_kernel_read() functions.

It seems that most architectures don't care about that, but at least on
the parisc architecture this leads to a kernel crash since accesses to
page zero are protected by the kernel for security reasons.

Fix this problem by verifying the contents of pwq and wq before usage.
Even if probe_kernel_read() usually prevents such crashes by disabling
page faults, clean code should always include such checks.

Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
crash the Linux kernel on the parisc architecture.

CC: Tejun Heo <[email protected]>
CC: Libin <[email protected]>
CC: [email protected]
CC: [email protected]
Signed-off-by: Helge Deller <[email protected]>

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 987293d..c03b47f 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4512,8 +4512,10 @@ void print_worker_info(const char *log_lvl, struct task_struct *task)
 	 */
 	probe_kernel_read(&fn, &worker->current_func, sizeof(fn));
 	probe_kernel_read(&pwq, &worker->current_pwq, sizeof(pwq));
-	probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
-	probe_kernel_read(name, wq->name, sizeof(name) - 1);
+	if (pwq)
+		probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
+	if (wq)
+		probe_kernel_read(name, wq->name, sizeof(name) - 1);

 	/* copy worker description */
 	probe_kernel_read(&desc_valid, &worker->desc_valid, sizeof(desc_valid));


2013-10-01 20:44:10

by Tejun Heo

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

Hello,

On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
> print_worker_info() includes no validity check on the pwq and wq
> pointers before handing them over to the probe_kernel_read() functions.
>
> It seems that most architectures don't care about that, but at least on
> the parisc architecture this leads to a kernel crash since accesses to
> page zero are protected by the kernel for security reasons.
>
> Fix this problem by verifying the contents of pwq and wq before usage.
> Even if probe_kernel_read() usually prevents such crashes by disabling
> page faults, clean code should always include such checks.
>
> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
> crash the Linux kernel on the parisc architecture.

Hmm... um had similar problem but the root cause here is that the arch
isn't implementing probe_kernel_read() properly. We really have no
idea what the pointer value may be at the dump point and that's why we
use probe_kernel_read(). If something like the above is necessary for
the time being, the correct place would be the arch
probe_kernel_read() implementation. James, would it be difficult
to implement proper probe_kernel_read() on parisc?

Thanks.

--
tejun

2013-10-01 20:53:36

by Helge Deller

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On 10/01/2013 10:43 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>> print_worker_info() includes no validity check on the pwq and wq
>> pointers before handing them over to the probe_kernel_read() functions.
>>
>> It seems that most architectures don't care about that, but at least on
>> the parisc architecture this leads to a kernel crash since accesses to
>> page zero are protected by the kernel for security reasons.
>>
>> Fix this problem by verifying the contents of pwq and wq before usage.
>> Even if probe_kernel_read() usually prevents such crashes by disabling
>> page faults, clean code should always include such checks.
>>
>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>> crash the Linux kernel on the parisc architecture.
>
> Hmm... um had similar problem but the root cause here is that the arch
> isn't implementing probe_kernel_read() properly. We really have no
> idea what the pointer value may be at the dump point and that's why we
> use probe_kernel_read(). If something like the above is necessary for
> the time being, the correct place would be the arch
> probe_kernel_read() implementation. James, would it be difficult
> implement proper probe_kernel_read() on parisc?

No, it's not really complicated.
That was my initial way to work around that problem.

But is this really necessary? Isn't a pointer which points to mem zero most
likely wrong on any architecture?

In addition I wrote another patch to work around that problem in the parisc
page fault handler (which is needed anyway) too:
https://patchwork.kernel.org/patch/2971701/

So, in summary my patch here is not really necessary, but for the sake of
clean code I think it doesn't hurt either and as such it would be nice if
you could apply it.

Helge

2013-10-01 21:03:55

by Tejun Heo

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On Tue, Oct 01, 2013 at 10:53:31PM +0200, Helge Deller wrote:
> So, in summary my patch here is not really necessary, but for the sake of
> clean code I think it doesn't hurt either and as such it would be nice if
> you could apply it.

What? The function *must* take any value and try to access it and not
cause failure. That's the *whole* purpose of that interface. How is
having incomplete spurious checks around it "clean code" in any sense
of the word? That doesn't make any sense.

Nacked-by: Tejun Heo <[email protected]>

and *please* don't add any checks like that anywhere else in the
kernel.

Thanks.

--
tejun

2013-10-01 21:07:41

by Tejun Heo

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On Tue, Oct 01, 2013 at 05:03:48PM -0400, Tejun Heo wrote:
> On Tue, Oct 01, 2013 at 10:53:31PM +0200, Helge Deller wrote:
> > So, in summary my patch here is not really necessary, but for the sake of
> > clean code I think it doesn't hurt either and as such it would be nice if
> > you could apply it.
>
> What? function *must* take any value and try to access it and not
> cause failure. That's the *whole* purpose of that interface. How is
> having incomplete spurious checks around it "clean code" in any sense
> of the word? That doesn't make any sense.

Just in case you didn't know already. probe_kernel_read()'s role is
to take any ulong value and dereference it if it can. If not, it can
return any value, but it shouldn't crash in any case. If you're just
adding NULL test in probe_kernel_read(), you're just masking a common
failure pattern and the kernel still *will* panic while dumping the
states. If a specific arch doesn't have proper probe_kernel_read()
implementation, adding if (!NULL) test there could be a temporary
workaround, but it should be clearly marked as such.

--
tejun
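
The contract described above (accept any address, copy the data if it can be read, never crash) can be illustrated in user space. What follows is a minimal sketch under stated assumptions, not the kernel implementation: probe_read() is a hypothetical helper, and it relies on the Linux behaviour that write() reports an unreadable source buffer as EFAULT instead of delivering a signal.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical user-space analogue of the probe_kernel_read() contract:
 * copy `size` bytes from `src` to `dst`. Returns 0 on success and a
 * negative errno value otherwise. It must not crash, whatever `src` holds.
 *
 * Trick: write() makes the kernel read from `src` on our behalf, so a bad
 * address comes back as an error code (EFAULT on Linux) rather than SIGSEGV.
 */
int probe_read(void *dst, const void *src, size_t size)
{
	int fds[2];
	int ret = -EFAULT;

	if (size == 0)
		return 0;
	if (size > 4096)	/* stay well below the pipe capacity */
		return -EINVAL;
	if (pipe(fds) != 0)
		return -errno;

	/* No NULL test on `src`: every bad address takes the same path. */
	if (write(fds[1], src, size) == (ssize_t)size &&
	    read(fds[0], dst, size) == (ssize_t)size)
		ret = 0;

	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

A caller passes any pointer value and inspects only the return code. NULL, 1 and an all-ones address are all rejected by the same mechanism, which is why a separate NULL test in front of the call adds nothing.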

2013-10-01 21:40:54

by James Bottomley

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
> Hello,
>
> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
> > print_worker_info() includes no validity check on the pwq and wq
> > pointers before handing them over to the probe_kernel_read() functions.
> >
> > It seems that most architectures don't care about that, but at least on
> > the parisc architecture this leads to a kernel crash since accesses to
> > page zero are protected by the kernel for security reasons.
> >
> > Fix this problem by verifying the contents of pwq and wq before usage.
> > Even if probe_kernel_read() usually prevents such crashes by disabling
> > page faults, clean code should always include such checks.
> >
> > Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
> > crash the Linux kernel on the parisc architecture.
>
> Hmm... um had similar problem but the root cause here is that the arch
> isn't implementing probe_kernel_read() properly. We really have no
> idea what the pointer value may be at the dump point and that's why we
> use probe_kernel_read(). If something like the above is necessary for
> the time being, the correct place would be the arch
> probe_kernel_read() implementation. James, would it be difficult
> implement proper probe_kernel_read() on parisc?

The problem seems to be that some traps bypass our exception table
handling. Helge, do you have the actual stack trace for this? That
should show where the exception handling is missing.

Thanks,

James
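
The exception table handling mentioned above can be pictured as a sorted table that maps the address of an instruction allowed to fault to the address where execution should resume. The sketch below is a simplified model for illustration only: struct ex_entry, ex_cmp() and ex_lookup() are hypothetical names, and real kernel tables differ in layout between architectures.

```c
#include <stddef.h>
#include <stdlib.h>

/* One entry: an instruction that may fault and where to resume if it does. */
struct ex_entry {
	unsigned long insn;	/* address of the instruction that may fault */
	unsigned long fixup;	/* address to continue at after a fault */
};

/* bsearch() comparator: the key is a faulting address, elem a table entry. */
int ex_cmp(const void *key, const void *elem)
{
	unsigned long addr = *(const unsigned long *)key;
	const struct ex_entry *e = elem;

	if (addr < e->insn)
		return -1;
	if (addr > e->insn)
		return 1;
	return 0;
}

/*
 * Look up `addr` in a table sorted by `insn`. Returns the fixup address,
 * or 0 when there is no entry and the fault has to be treated as fatal.
 */
unsigned long ex_lookup(const struct ex_entry *table, size_t n,
			unsigned long addr)
{
	const struct ex_entry *e;

	e = bsearch(&addr, table, n, sizeof(*table), ex_cmp);
	return e ? e->fixup : 0;
}
```

In this model, a trap handler that skips the lookup treats every kernel fault as fatal, including faults at instructions that do have an entry.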

2013-10-01 22:07:41

by Helge Deller

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On 10/01/2013 11:40 PM, James Bottomley wrote:
> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>> print_worker_info() includes no validity check on the pwq and wq
>>> pointers before handing them over to the probe_kernel_read() functions.
>>>
>>> It seems that most architectures don't care about that, but at least on
>>> the parisc architecture this leads to a kernel crash since accesses to
>>> page zero are protected by the kernel for security reasons.
>>>
>>> Fix this problem by verifying the contents of pwq and wq before usage.
>>> Even if probe_kernel_read() usually prevents such crashes by disabling
>>> page faults, clean code should always include such checks.
>>>
>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>>> crash the Linux kernel on the parisc architecture.
>>
>> Hmm... um had similar problem but the root cause here is that the arch
>> isn't implementing probe_kernel_read() properly. We really have no
>> idea what the pointer value may be at the dump point and that's why we
>> use probe_kernel_read(). If something like the above is necessary for
>> the time being, the correct place would be the arch
>> probe_kernel_read() implementation. James, would it be difficult
>> implement proper probe_kernel_read() on parisc?
>
> The problem seems to be that some traps bypass our exception table
> handling.

Yes, that's correct.
It's trap #26 and we directly call parisc_terminate() for fault_space==0
without checking the exception table.
See my patch I posted a few hours ago which fixes this:
https://patchwork.kernel.org/patch/2971701/

> Helge, do you have the actual stack trace for this? That
> should show where the exception handling is missing.

Here it is:
[47072.976000] ksoftirqd/0 R running task 0 3 2 0x00000000
[47072.976000] Backtrace:
[47072.976000] [<0000000040113a54>] __schedule+0x62c/0x808
[47072.976000]
[47072.976000] kworker/0:0H S 00000000401040c0 0 5 2 0x00000000
[47073.468000] Backtrace:
[47073.468000] [<0000000040464264>] pa_memcpy+0x44/0xb0
[47073.468000] [<00000000404643e0>] __copy_from_user+0x60/0x90
[47073.468000] [<00000000401d99bc>] __probe_kernel_read+0x54/0x90
[47073.468000] [<000000004016cc70>] print_worker_info+0x158/0x2c0
[47073.468000] [<0000000040185a60>] sched_show_task+0x1c8/0x210
[47073.468000] [<0000000040185b64>] show_state_filter+0xbc/0x138
[47073.468000] [<00000000404e85c4>] sysrq_handle_showstate+0x34/0x48
[47073.468000] [<00000000404e9154>] __handle_sysrq+0x174/0x2f0
[47073.468000] [<00000000404e933c>] write_sysrq_trigger+0x6c/0x90
[47073.468000] [<00000000402ca2fc>] proc_reg_write+0xbc/0x130
[47073.468000] [<0000000040236d44>] vfs_write+0x114/0x268
[47073.468000] [<00000000402373a4>] SyS_write+0x94/0xf8
[47073.468000] [<0000000040105fc0>] syscall_exit+0x0/0x14
[47073.468000]
[47073.468000]
[47073.468000] Kernel Fault: Code=26 regs=00000000958a09b0 (Addr=0000000000000008)
[47073.468000] CPU: 0 PID: 30189 Comm: bash Not tainted 3.12.0-rc3-64bit+ #1
[47073.468000] task: 000000007ba64100 ti: 00000000958a0000 task.ti: 00000000958a0000
[47073.468000]
[47073.468000] YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[47073.468000] PSW: 00001000000001001111111100001110 Not tainted
[47073.468000] r00-03 000000ff0804ff0e 00000000958a08c0 0000000040464264 00000000958a0960
[47073.468000] r04-07 0000000040d73db0 0000000000000008 0000000000000008 00000000958a06f8
[47073.468000] r08-11 00000000958a0600 0000000040c49d18 00000000af535494 00000000958a0370
[47073.468000] r12-15 0000000000000000 0000000000000000 000000000010e7e8 00000000000fde28
[47073.468000] r16-19 0000000000000000 00000000000c7800 0000000000000000 0000000000000000
[47073.468000] r20-23 00000000958a06e0 0000000000000018 0000000000000018 0000000000000003
[47073.468000] r24-27 0000000000000008 0000000000000008 00000000958a06f8 0000000040d73db0
[47073.468000] r28-31 00000000958a06f8 00000000958a0930 00000000958a09b0 0000000000000008
[47073.468000] sr00-03 0000000005dc5000 0000000000000000 0000000000000000 0000000005dc5000
[47073.468000] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[47073.468000]
[47073.468000] IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040463fdc 0000000040463fe0
[47073.468000] IIR: 0fe25033 ISR: 0000000000000000 IOR: 0000000000000008
[47073.468000] CPU: 0 CR30: 00000000958a0000 CR31: 0000000011111111
[47073.468000] ORIG_R28: 00000000958a0b40
[47073.468000] IAOQ[0]: pa_memcpy_internal+0xec/0x2b4
[47073.468000] IAOQ[1]: pa_memcpy_internal+0xf0/0x2b4
[47073.468000] RP(r2): pa_memcpy+0x44/0xb0
[47073.468000] Backtrace:
[47073.468000] [<0000000040464264>] pa_memcpy+0x44/0xb0
[47073.468000] [<00000000404643e0>] __copy_from_user+0x60/0x90
[47073.468000] [<00000000401d99bc>] __probe_kernel_read+0x54/0x90
[47073.468000] [<000000004016cc70>] print_worker_info+0x158/0x2c0
[47073.468000] [<0000000040185a60>] sched_show_task+0x1c8/0x210
[47073.468000] [<0000000040185b64>] show_state_filter+0xbc/0x138
[47073.468000] [<00000000404e85c4>] sysrq_handle_showstate+0x34/0x48
[47073.468000] [<00000000404e9154>] __handle_sysrq+0x174/0x2f0
[47073.468000] [<00000000404e933c>] write_sysrq_trigger+0x6c/0x90
[47073.468000] [<00000000402ca2fc>] proc_reg_write+0xbc/0x130
[47073.468000] [<0000000040236d44>] vfs_write+0x114/0x268
[47073.468000] [<00000000402373a4>] SyS_write+0x94/0xf8
[47073.468000] [<0000000040105fc0>] syscall_exit+0x0/0x14
[47073.468000]
[47073.468000] Kernel panic - not syncing: Kernel Fault

2013-10-01 22:34:58

by Helge Deller

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On 10/01/2013 11:07 PM, Tejun Heo wrote:
> On Tue, Oct 01, 2013 at 05:03:48PM -0400, Tejun Heo wrote:
>> On Tue, Oct 01, 2013 at 10:53:31PM +0200, Helge Deller wrote:
>>> So, in summary my patch here is not really necessary, but for the sake of
>>> clean code I think it doesn't hurt either and as such it would be nice if
>>> you could apply it.
>>
>> What? function *must* take any value and try to access it and not
>> cause failure. That's the *whole* purpose of that interface. How is
>> having incomplete spurious checks around it "clean code" in any sense
>> of the word? That doesn't make any sense.
>
> Just in case you didn't know already. probe_kernel_read()'s role is
> to take any ulong value and dereference it if it can. If not, it can
> return any value, but it shouldn't crash in any case. If you're just
> adding NULL test in probe_kernel_read(), you're just masking a common
> failure pattern and the kernel still *will* panic while dumping the
> states. If a specific arch doesn't have proper probe_kernel_read()
> implementation, adding if (!NULL) test there could be a temporary
> workaround, but it should be clearly marked as such.

Sure, probe_kernel_read() takes care that no segfaults will happen.
Nevertheless, if we know that "pwq" might become NULL, why access pwq->wq at all?
	struct pool_workqueue *pwq = NULL;
	probe_kernel_read(&wq, &pwq->wq, sizeof(wq));

If you wouldn't have used probe_kernel_read() you would never code it
like that. That's what I meant when I wrote "clean coding" (aka "similar
to what you would have done without probe_kernel_read()").

Helge

2013-10-01 22:40:30

by Tejun Heo

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

Hello,

On Wed, Oct 02, 2013 at 12:34:53AM +0200, Helge Deller wrote:
> Sure, probe_kernel_read() takes care that no segfaults will happen.
> Nevertheless, if we know that "pwq" might become NULL, why access pwq->wq at all?
> 	struct pool_workqueue *pwq = NULL;
> 	probe_kernel_read(&wq, &pwq->wq, sizeof(wq));
>
> If you wouldn't have used probe_kernel_read() you would never code it
> like that. That's what I meant when I wrote "clean coding" (aka "similar
> to what you would have done without probe_kernel_read()").

Because it is using probe_kernel_read() and such test wouldn't mean
anything? It may be NULL, it may be 1 or full Fs. NULL is just one
of many illegal pointers which may happen. Why add code which doesn't
achieve anything when you're explicitly trying to access pointers
which you know could be invalid? Why is that "clean"? Is "if (p)
kfree(p)" cleaner than "kfree(p)"?

Thanks.

--
tejun
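
The point that NULL is only one of many illegal pointer values can be made concrete with a small sketch. rejected_by_null_check() is a hypothetical helper written for this illustration; it counts how many candidate values a plain "if (p)" test would filter out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many of the candidate pointer values in `vals` a plain
 * "if (p)" test would reject. Only an all-zero value is caught; every
 * other bogus address passes the test unchanged.
 */
size_t rejected_by_null_check(const uintptr_t *vals, size_t n)
{
	size_t rejected = 0;

	for (size_t i = 0; i < n; i++) {
		const void *p = (const void *)vals[i];

		if (!p)
			rejected++;
	}
	return rejected;
}
```

Called with the values 0, 1, 8 and UINTPTR_MAX, the helper reports a single rejection. The other three still reach the access and have to be handled by the fault path anyway.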

2013-10-01 22:47:10

by Tejun Heo

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On Tue, Oct 01, 2013 at 06:40:23PM -0400, Tejun Heo wrote:
> Because it is using probe_kernel_read() and such test wouldn't mean
> anything? It may be NULL, it may be 1 or full Fs. NULL is just one
> of many illegal pointers which may happen. Why add code which doesn't
> achieve anything when you're explicitly trying to access pointers
> which you know could be invalid? Why is that "clean"? Is "if (p)
> kfree(p)" cleaner than "kfree(p)"?

Here's one general rule of thumb for "cleanliness" - try to do the
minimum, because that's something many people can agree on. If people
do things which aren't necessary, different people will naturally have
different opinions on what's cleaner / better and inevitably end up
making different choices. As those choices are functionally superfluous,
none would fail, and we'd end up with various variants of the same
thing for no good reason, which is messy. Adding if (p) in front of
probe_kernel_read(p) is inherently superfluous. You wouldn't have
any way to enforce or even encourage such a practice, and the end result
would inevitably be if (p) being sprayed randomly, which is the
opposite of cleanliness.

So, no, please don't add random tests which aren't essential. It is
an inherently messy thing to do.

Thanks.

--
tejun

2013-10-01 22:51:16

by James Bottomley

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
> On 10/01/2013 11:40 PM, James Bottomley wrote:
> > On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
> >> Hello,
> >>
> >> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
> >>> print_worker_info() includes no validity check on the pwq and wq
> >>> pointers before handing them over to the probe_kernel_read() functions.
> >>>
> >>> It seems that most architectures don't care about that, but at least on
> >>> the parisc architecture this leads to a kernel crash since accesses to
> >>> page zero are protected by the kernel for security reasons.
> >>>
> >>> Fix this problem by verifying the contents of pwq and wq before usage.
> >>> Even if probe_kernel_read() usually prevents such crashes by disabling
> >>> page faults, clean code should always include such checks.
> >>>
> >>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
> >>> crash the Linux kernel on the parisc architecture.
> >>
> >> Hmm... um had similar problem but the root cause here is that the arch
> >> isn't implementing probe_kernel_read() properly. We really have no
> >> idea what the pointer value may be at the dump point and that's why we
> >> use probe_kernel_read(). If something like the above is necessary for
> >> the time being, the correct place would be the arch
> >> probe_kernel_read() implementation. James, would it be difficult
> >> implement proper probe_kernel_read() on parisc?
> >
> > The problem seems to be that some traps bypass our exception table
> > handling.
>
> Yes, that's correct.
> It's trap #26 and we directly call parisc_terminate() for fault_space==0
> without checking the exception table.
> See my patch I posted a few hours ago which fixes this:
> https://patchwork.kernel.org/patch/2971701/

That doesn't quite look right ... I guessed it was probably access
rights, so we should do an exception table fixup. Isn't this the fix?
We shouldn't call do_page_fault if there's no exception table.

James

---
diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
index 04e47c6..25a088a 100644
--- a/arch/parisc/kernel/traps.c
+++ b/arch/parisc/kernel/traps.c
@@ -684,6 +684,8 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
 		/* Fall Through */
 	case 26:
 		/* PCXL: Data memory access rights trap */
+		if (!user_mode(regs) && fixup_exception(regs))
+			return;
 		fault_address = regs->ior;
 		fault_space = regs->isr;
 		break;

2013-10-02 00:48:18

by John David Anglin

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On 1-Oct-13, at 6:50 PM, James Bottomley wrote:

> On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
>> On 10/01/2013 11:40 PM, James Bottomley wrote:
>>> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>>>> Hello,
>>>>
>>>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>>>> print_worker_info() includes no validity check on the pwq and wq
>>>>> pointers before handing them over to the probe_kernel_read()
>>>>> functions.
>>>>>
>>>>> It seems that most architectures don't care about that, but at
>>>>> least on
>>>>> the parisc architecture this leads to a kernel crash since
>>>>> accesses to
>>>>> page zero are protected by the kernel for security reasons.
>>>>>
>>>>> Fix this problem by verifying the contents of pwq and wq before
>>>>> usage.
>>>>> Even if probe_kernel_read() usually prevents such crashes by
>>>>> disabling
>>>>> page faults, clean code should always include such checks.
>>>>>
>>>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will
>>>>> immediately
>>>>> crash the Linux kernel on the parisc architecture.
>>>>
>>>> Hmm... um had similar problem but the root cause here is that the
>>>> arch
>>>> isn't implementing probe_kernel_read() properly. We really have no
>>>> idea what the pointer value may be at the dump point and that's
>>>> why we
>>>> use probe_kernel_read(). If something like the above is
>>>> necessary for
>>>> the time being, the correct place would be the arch
>>>> probe_kernel_read() implementation. James, would it be difficult
>>>> implement proper probe_kernel_read() on parisc?
>>>
>>> The problem seems to be that some traps bypass our exception table
>>> handling.
>>
>> Yes, that's correct.
>> It's trap #26 and we directly call parisc_terminate() for
>> fault_space==0
>> without checking the exception table.
>> See my patch I posted a few hours ago which fixes this:
>> https://patchwork.kernel.org/patch/2971701/
>
> That doesn't quite look right ... I guessed it was probably access
> rights, so we should do an exception table fixup, so isn't this the
> fix?
> because we shouldn't call do_page_fault if there's no exception table.

What about trap #18? It appears the same problem can occur on PCXS.

I have the strong feeling that __copy_from_user still won't be bullet
proof.
See attached fault. As far as I know, we don't have an OS HPMC handler.

>
> James
>
> ---
> diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
> index 04e47c6..25a088a 100644
> --- a/arch/parisc/kernel/traps.c
> +++ b/arch/parisc/kernel/traps.c
> @@ -684,6 +684,8 @@ void notrace handle_interruption(int code,
> struct pt_regs *regs)
> /* Fall Through */
> case 26:
> /* PCXL: Data memory access rights trap */
> + if (!user_mode(regs) && fixup_exception(regs))
> + return;
> fault_address = regs->ior;
> fault_space = regs->isr;
> break;
>
>

--
John David Anglin [email protected]



Attachments:
hpmc-20130929.txt (6.07 kB)

2013-10-02 01:58:44

by John David Anglin

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On 1-Oct-13, at 6:50 PM, James Bottomley wrote:

> On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
>> On 10/01/2013 11:40 PM, James Bottomley wrote:
>>> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>>>> Hello,
>>>>
>>>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>>>> print_worker_info() includes no validity check on the pwq and wq
>>>>> pointers before handing them over to the probe_kernel_read()
>>>>> functions.
>>>>>
>>>>> It seems that most architectures don't care about that, but at
>>>>> least on
>>>>> the parisc architecture this leads to a kernel crash since
>>>>> accesses to
>>>>> page zero are protected by the kernel for security reasons.
>>>>>
>>>>> Fix this problem by verifying the contents of pwq and wq before
>>>>> usage.
>>>>> Even if probe_kernel_read() usually prevents such crashes by
>>>>> disabling
>>>>> page faults, clean code should always include such checks.
>>>>>
>>>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will
>>>>> immediately
>>>>> crash the Linux kernel on the parisc architecture.
>>>>
>>>> Hmm... um had similar problem but the root cause here is that the
>>>> arch
>>>> isn't implementing probe_kernel_read() properly. We really have no
>>>> idea what the pointer value may be at the dump point and that's
>>>> why we
>>>> use probe_kernel_read(). If something like the above is
>>>> necessary for
>>>> the time being, the correct place would be the arch
>>>> probe_kernel_read() implementation. James, would it be difficult
>>>> implement proper probe_kernel_read() on parisc?
>>>
>>> The problem seems to be that some traps bypass our exception table
>>> handling.
>>
>> Yes, that's correct.
>> It's trap #26 and we directly call parisc_terminate() for
>> fault_space==0
>> without checking the exception table.
>> See my patch I posted a few hours ago which fixes this:
>> https://patchwork.kernel.org/patch/2971701/
>
> That doesn't quite look right ... I guessed it was probably access
> rights, so we should do an exception table fixup, so isn't this the
> fix?
> because we shouldn't call do_page_fault if there's no exception table.
>
> James
>
> ---
> diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
> index 04e47c6..25a088a 100644
> --- a/arch/parisc/kernel/traps.c
> +++ b/arch/parisc/kernel/traps.c
> @@ -684,6 +684,8 @@ void notrace handle_interruption(int code,
> struct pt_regs *regs)
> /* Fall Through */
> case 26:
> /* PCXL: Data memory access rights trap */
> + if (!user_mode(regs) && fixup_exception(regs))
> + return;
> fault_address = regs->ior;
> fault_space = regs->isr;
> break;


With this change, boot on rp3440 hangs here:

Freeing unused kernel memory: 256K (000000004079c000 - 00000000407dc000)
Loading, please wait...

Dave
--
John David Anglin [email protected]


2013-10-02 08:28:23

by Helge Deller

Subject: Re: [PATCH] [workqueue] check values of pwq and wq in print_worker_info() before use

On 10/02/2013 12:50 AM, James Bottomley wrote:
> On Wed, 2013-10-02 at 00:07 +0200, Helge Deller wrote:
>> On 10/01/2013 11:40 PM, James Bottomley wrote:
>>> On Tue, 2013-10-01 at 16:43 -0400, Tejun Heo wrote:
>>>> Hello,
>>>>
>>>> On Tue, Oct 01, 2013 at 10:35:20PM +0200, Helge Deller wrote:
>>>>> print_worker_info() includes no validity check on the pwq and wq
>>>>> pointers before handing them over to the probe_kernel_read() functions.
>>>>>
>>>>> It seems that most architectures don't care about that, but at least on
>>>>> the parisc architecture this leads to a kernel crash since accesses to
>>>>> page zero are protected by the kernel for security reasons.
>>>>>
>>>>> Fix this problem by verifying the contents of pwq and wq before usage.
>>>>> Even if probe_kernel_read() usually prevents such crashes by disabling
>>>>> page faults, clean code should always include such checks.
>>>>>
>>>>> Without this fix issuing "echo t > /proc/sysrq-trigger" will immediately
>>>>> crash the Linux kernel on the parisc architecture.
>>>>
>>>> Hmm... um had similar problem but the root cause here is that the arch
>>>> isn't implementing probe_kernel_read() properly. We really have no
>>>> idea what the pointer value may be at the dump point and that's why we
>>>> use probe_kernel_read(). If something like the above is necessary for
>>>> the time being, the correct place would be the arch
>>>> probe_kernel_read() implementation. James, would it be difficult
>>>> implement proper probe_kernel_read() on parisc?
>>>
>>> The problem seems to be that some traps bypass our exception table
>>> handling.
>>
>> Yes, that's correct.
>> It's trap #26 and we directly call parisc_terminate() for fault_space==0
>> without checking the exception table.
>> See my patch I posted a few hours ago which fixes this:
>> https://patchwork.kernel.org/patch/2971701/
>
> That doesn't quite look right ... I guessed it was probably access
> rights, so we should do an exception table fixup, so isn't this the fix?
> because we shouldn't call do_page_fault if there's no exception table.
>
> diff --git a/arch/parisc/kernel/traps.c b/arch/parisc/kernel/traps.c
> index 04e47c6..25a088a 100644
> --- a/arch/parisc/kernel/traps.c
> +++ b/arch/parisc/kernel/traps.c
> @@ -684,6 +684,8 @@ void notrace handle_interruption(int code, struct pt_regs *regs)
> /* Fall Through */
> case 26:
> /* PCXL: Data memory access rights trap */
> + if (!user_mode(regs) && fixup_exception(regs))
> + return;

You need to check for preempt_count() != 0 too, which has been increased by
pagefault_disable() inside of probe_kernel_read(). Otherwise every simple
memcpy(dest, NULL, count) (*) will successfully be handled here and we won't
trap on generic invalid memory accesses inside the kernel.

But basically your patch does exactly the same as mine.

Helge

(*) memcpy() internally uses pa_memcpy(), which defines the fixup tables.
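
The condition argued for above can be written down as a small decision function. This is a toy model, not parisc trap code: struct fault_ctx, enum fault_action and classify_fault() are hypothetical names, and the model assumes, as the message states, that pagefault_disable() raises preempt_count.

```c
#include <stdbool.h>

/* Hypothetical summary of the state a trap handler would look at. */
struct fault_ctx {
	bool user_mode;			/* fault came from user space */
	unsigned int preempt_count;	/* raised by pagefault_disable() */
	bool has_fixup;			/* exception table entry exists */
};

enum fault_action {
	FAULT_FIXUP,		/* resume at the fixup address */
	FAULT_NORMAL_PATH,	/* continue with regular fault handling */
};

/*
 * Apply the fixup only for kernel-mode faults that happen while page
 * faults are disabled and that have an exception table entry. A plain
 * memcpy(dest, NULL, count) has an entry but runs with a zero count,
 * so it still takes the regular path and traps.
 */
enum fault_action classify_fault(const struct fault_ctx *c)
{
	if (!c->user_mode && c->preempt_count != 0 && c->has_fixup)
		return FAULT_FIXUP;
	return FAULT_NORMAL_PATH;
}
```

Dropping the preempt_count term from the condition makes the second case (kernel mode, entry present, count of zero) return FAULT_FIXUP as well, which is the silent handling of generic invalid accesses that the message warns about.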