In trying to track down a bug (see below), I noticed that
math_state_restore() in arch/x86/kernel/traps.c appears to unconditionally
disable interrupts when called. Is this intended behavior or a bug?
The bug in question is triggered by dumping core on an ecryptfs file
system when aesni-intel is loaded. (See
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1265841 for the
original report.) The symptom is that __find_get_block() gets called with
interrupts disabled, causing a BUG(). I tried to find where interrupts
were getting disabled and wound up in aes_set_key_common() in
arch/x86/crypto/aesni-intel_glue.c. It calls aesni_set_key(), and since
that uses the FPU, it wraps it in kernel_fpu_begin()/kernel_fpu_end().
But kernel_fpu_end() calls math_state_restore() which disables interrupts.
I've verified that interrupts are still enabled just before the call to
kernel_fpu_end().
math_state_restore() does:
local_irq_enable();
init_fpu(tsk);
local_irq_disable();
with the result that interrupts are disabled when it finishes, even if
they were enabled to begin with. That looks strange to me; are we sure it
shouldn't just save and restore the interrupt flag? Or are we not
supposed to call it with interrupts enabled?
Given the intimidating comment preceding math_state_restore() ("Don't
touch unless you *really* know how it works"), it's entirely possible I am
missing something...
Any suggestions appreciated. Thanks!
--
Nate Eldredge
[email protected]
THANK YOU!
I've been having a problem with ext4 metadata checksums, which use SSE
for large blocks, and traced it to kernel_fpu_end() disabling interrupts,
but had paused to debug this (I assumed well-tested) piece of kernel
code before pushing it harder.
(Search October-December LKML archives for "3.11.4: kernel BUG at
fs/buffer.c:1268".")
No, I'm pretty sure it's a real bug. At least, it's biting me on the
ass.
[Here's my original message, since George's reply didn't quote or
reference it: https://lkml.org/lkml/2014/1/18/3. Summary:
math_state_restore() always leaves interrupts disabled, and I think this
is a bug.]
On Sun, 19 Jan 2014, George Spelvin wrote:
> THANK YOU!
>
> I've been having a problem with ext4 metadata checksums, which use SSE
> for large blocks, and traced it to kernel_fpu_end() disabling interrupts,
> but had paused to debug this (I assumed well-tested) piece of kernel
> code before pushing it harder.
Interesting. I guess it's not surprising this has other effects.
Here's the commit that added the code in question (about 6 years ago):
https://github.com/torvalds/linux/commit/aa283f49276e7d840a40fb01eee6de97eaa7e012
It's credited to Suresh Siddha, whom I've cc'ed (along with others who
signed off). Suresh, if you're still around, could you comment on why
math_state_restore always leaves interrupts disabled, regardless of their
state on entry? Is there a deep reason or is it a bug?
Assuming it's a bug, here's the obvious patch:
--- linux-source-3.11.0/arch/x86/kernel/traps.c 2013-09-02 14:46:10.000000000 -0600
+++ linux-source-3.11.0-nate/arch/x86/kernel/traps.c 2014-01-19 11:25:32.977221476 -0700
@@ -624,6 +624,9 @@
struct task_struct *tsk = current;
if (!tsk_used_math(tsk)) {
+ unsigned long flags;
+
+ local_save_flags(flags);
local_irq_enable();
/*
* does a slab alloc which can sleep
@@ -635,7 +638,7 @@
do_group_exit(SIGKILL);
return;
}
- local_irq_disable();
+ local_irq_restore(flags);
}
__thread_fpu_begin(tsk);
I tested it briefly: the kernel still boots fine, and it fixes the problem
I was seeing (BUG() when core dumping on ecryptfs). George, does it help
your problem?
Thanks everyone!
> (Search October-December LKML archives for "3.11.4: kernel BUG at
> fs/buffer.c:1268".")
--
Nate Eldredge
[email protected]
> It's credited to Suresh Siddha, whom I've cc'ed (along with others who
> signed off). Suresh, if you're still around, could you comment on why
> math_state_restore always leaves interrupts disabled, regardless of their
> state on entry? Is there a deep reason or is it a bug?
What the comments seemed to be implying was that it was a bug to enter
this code with interrupts enabled. So the problem may be a little bit
more systemic; expert counsel is required.
> George, does it help your problem?
I'm trying it now. But it takes a while for me to reproduce, and even
longer to be sure the problem has gone away. So anything you hear from
me within a week will be bad news.
On Sun, 19 Jan 2014, George Spelvin wrote:
>> It's credited to Suresh Siddha, whom I've cc'ed (along with others who
>> signed off). Suresh, if you're still around, could you comment on why
>> math_state_restore always leaves interrupts disabled, regardless of their
>> state on entry? Is there a deep reason or is it a bug?
>
> What the comments seemed to be implying was that it was a bug to enter
> this code with interrupts enabled. So the problem may be a little bit
> more systemic; expert counsel is required.
It would be kind of weird for code that requires disabled interrupts on
entry to turn around and enable interrupts itself. I agree that it would
really help for a guru to take a look...
On which note, Suresh's email bounced :-(
--
Nate Eldredge
[email protected]
On Mon 20-01-14 19:22:51, Nate Eldredge wrote:
> On Sun, 19 Jan 2014, George Spelvin wrote:
>
> >>It's credited to Suresh Siddha, whom I've cc'ed (along with others who
> >>signed off). Suresh, if you're still around, could you comment on why
> >>math_state_restore always leaves interrupts disabled, regardless of their
> >>state on entry? Is there a deep reason or is it a bug?
> >
> >What the comments seemed to be implying was that it was a bug to enter
> >this code with interrupts enabled. So the problem may be a little bit
> >more systemic; expert counsel is required.
>
> It would be kind of weird for code that requires disabled interrupts
> on entry to turn around and enable interrupts itself. I agree that
> it would really help for a guru to take a look...
>
> On which note, Suresh's email bounced :-(
Guys, since this is lurking without proper attention for a rather long
time, can you just write a patch which fixes the problem as far as you can
tell, test that the patch indeed fixes your problems and if yes, send it
to Ingo & LKML. If nothing else it should get you more attention.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
Nate Eldredge wrote:
> On Sun, 19 Jan 2014, George Spelvin wrote:
>>> It's credited to Suresh Siddha, whom I've cc'ed (along with others who
>>> signed off). Suresh, if you're still around, could you comment on why
>>> math_state_restore always leaves interrupts disabled, regardless of their
>>> state on entry? Is there a deep reason or is it a bug?
>>
>> What the comments seemed to be implying was that it was a bug to enter
>> this code with interrupts enabled. So the problem may be a little bit
>> more systemic; expert counsel is required.
>
> It would be kind of weird for code that requires disabled interrupts on
> entry to turn around and enable interrupts itself. I agree that it would
> really help for a guru to take a look...
Actually, on second look, it appears the comments are just confused.
Suresh's patch appears to have just failed to consider the possibility
of calling math_state_restore with interrupts enabled, rather than
having made a deliberate choice.
> On which note, Suresh's email bounced :-(
I found a newer address. Let's see if this works. Suresh, here's
Nate's proposed patch:
--- linux-source-3.11.0/arch/x86/kernel/traps.c 2013-09-02 14:46:10.000000000 -0600
+++ linux-source-3.11.0-nate/arch/x86/kernel/traps.c 2014-01-19 11:25:32.977221476 -0700
@@ -624,6 +624,9 @@ asmlinkage void math_state_restore(void)
struct task_struct *tsk = current;
if (!tsk_used_math(tsk)) {
+ unsigned long flags;
+
+ local_save_flags(flags);
local_irq_enable();
/*
* does a slab alloc which can sleep
@@ -635,7 +638,7 @@
do_group_exit(SIGKILL);
return;
}
- local_irq_disable();
+ local_irq_restore(flags);
}
__thread_fpu_begin(tsk);
BTW, my test case to reliably trigger the fault is Firefox crashing.
It usually takes a few days for it to run out of memory. It happened
once, with no problem, but I'd like to get one more before passing
judgement. Still, feel free to consider this:
Tested-by: George Spelvin <[email protected]>
It's the
Cc: <[email protected]> # 2.6.26+
I'm a bit anxious about.
> I'm trying it now. But it takes a while for me to reproduce, and even
> longer to be sure the problem has gone away. So anything you hear from
> me within a week will be bad news.
Well, it's been a week, and: good news!
I'd still wish for some review by someone who really understands this
code; in particular it seems dangerous to just enable interrupts for
a window without re-checking the condition afterward.
What if an interrupt hander wants to use the FPU and triggers the
allocate itself? Shouldn't it be:
* Enable interrupts
* Allocate
* Disable interrupts
* Check that tsk->thread.xstate is still NULL
* (If it has been filled in, free and return.)
* Fill in tsk->thread.xstate
I don't feel I really understand the irq_fpu_usable() logic in
arch/x86/i387.c.
But this patch clearly doesn't make these issues any *worse*, so
these concerns are no reason to block it.
Would you like add an appropriate commit message and send in the patch?
Something like:
Subject: arch/x86/kernel/traps.c: make math_state_restore preserve IRQ status.
Commit aa283f4927 (in 2.6.26!) to add lazy FPU save are allocation did
an local_irq_enable()/local_irq_disable() around the allocate. However,
that assumes that it is only called with interrupts disabled.
math_state_restore() can also be called from kernel_fpu_end() with
interrupts enabled. Very occasionally, this triggers an FPU state
allocation. Disabling interrupts unconditionally is Bad.
Not-yet-Signed-off-by: Nate Eldredge <[email protected]>
Tested-by: George Spelvin <[email protected]>
Cc: <[email protected]>
Fixes: aa283f4927
On Tue, 28 Jan 2014, George Spelvin wrote:
>> I'm trying it now. But it takes a while for me to reproduce, and even
>> longer to be sure the problem has gone away. So anything you hear from
>> me within a week will be bad news.
>
> Well, it's been a week, and: good news!
>
> I'd still wish for some review by someone who really understands this
> code; in particular it seems dangerous to just enable interrupts for
> a window without re-checking the condition afterward.
Yeah, I was thinking the same. If interrupts were disabled on entry it
was probably for a good reason, and it is strange for math_state_restore()
and/or kernel_fpu_end() to unilaterally enable them. Maybe the caller is
supposed to know this and account for it, but I didn't see any
documentation of the intended semantics of kernel_fpu_{begin,end}() with
respect to interrupts. The current behavior doesn't really make sense
either way, but it does make me wonder if something deeper is wrong.
> [...]
>
> But this patch clearly doesn't make these issues any *worse*, so
> these concerns are no reason to block it.
>
>
> Would you like add an appropriate commit message and send in the patch?
Ok, when I have a chance I will write something up and send it in, and
maybe having it as a formal patch submission will get more attention.
Thanks, George!
--
Nate Eldredge
[email protected]