LKDTM test visibility shouldn't change, so remove the ifdefs on
DOUBLE_FAULT and make sure test failure doesn't crash the system.
Link: https://lore.kernel.org/lkml/[email protected]
Fixes: b09511c253e5 ("lkdtm: Add a DOUBLE_FAULT crash type on x86")
Signed-off-by: Kees Cook <[email protected]>
---
applies on top of tip/x86/urgent
---
drivers/misc/lkdtm/bugs.c | 8 +++++---
drivers/misc/lkdtm/core.c | 4 +---
drivers/misc/lkdtm/lkdtm.h | 2 --
3 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
index a4fdad04809a..22f5293414cc 100644
--- a/drivers/misc/lkdtm/bugs.c
+++ b/drivers/misc/lkdtm/bugs.c
@@ -342,9 +342,9 @@ void lkdtm_UNSET_SMEP(void)
#endif
}
-#ifdef CONFIG_X86_32
void lkdtm_DOUBLE_FAULT(void)
{
+#ifdef CONFIG_X86_32
/*
* Trigger #DF by setting the stack limit to zero. This clobbers
* a GDT TLS slot, which is okay because the current task will die
@@ -373,6 +373,8 @@ void lkdtm_DOUBLE_FAULT(void)
asm volatile ("movw %0, %%ss; addl $0, (%%esp)" ::
"r" ((unsigned short)(GDT_ENTRY_TLS_MIN << 3)));
- panic("tried to double fault but didn't die\n");
-}
+ pr_err("FAIL: tried to double fault but didn't die!\n");
+#else
+ pr_err("FAIL: this test is only available on 32-bit x86.\n");
#endif
+}
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index ee0d6e721441..7082ef8a2b99 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -116,6 +116,7 @@ static const struct crashtype crashtypes[] = {
CRASHTYPE(STACK_GUARD_PAGE_LEADING),
CRASHTYPE(STACK_GUARD_PAGE_TRAILING),
CRASHTYPE(UNSET_SMEP),
+ CRASHTYPE(DOUBLE_FAULT),
CRASHTYPE(UNALIGNED_LOAD_STORE_WRITE),
CRASHTYPE(OVERWRITE_ALLOCATION),
CRASHTYPE(WRITE_AFTER_FREE),
@@ -171,9 +172,6 @@ static const struct crashtype crashtypes[] = {
CRASHTYPE(USERCOPY_KERNEL_DS),
CRASHTYPE(STACKLEAK_ERASING),
CRASHTYPE(CFI_FORWARD_PROTO),
-#ifdef CONFIG_X86_32
- CRASHTYPE(DOUBLE_FAULT),
-#endif
};
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index c56d23e37643..f4952efd6785 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -28,9 +28,7 @@ void lkdtm_CORRUPT_USER_DS(void);
void lkdtm_STACK_GUARD_PAGE_LEADING(void);
void lkdtm_STACK_GUARD_PAGE_TRAILING(void);
void lkdtm_UNSET_SMEP(void);
-#ifdef CONFIG_X86_32
void lkdtm_DOUBLE_FAULT(void);
-#endif
/* lkdtm_heap.c */
void __init lkdtm_heap_init(void);
--
2.17.1
--
Kees Cook
> On Nov 27, 2019, at 11:19 AM, Kees Cook <[email protected]> wrote:
>
> @@ -373,6 +373,8 @@ void lkdtm_DOUBLE_FAULT(void)
> asm volatile ("movw %0, %%ss; addl $0, (%%esp)" ::
> "r" ((unsigned short)(GDT_ENTRY_TLS_MIN << 3)));
>
> - panic("tried to double fault but didn't die\n");
> -}
> + pr_err("FAIL: tried to double fault but didn't die!\n");
> +#else
> + pr_err("FAIL: this test is only available on 32-bit x86.\n");
> #endif
> +}
I’m not familiar with the userspace tooling, but this seems unfortunate. The first FAIL is “the test case screwed up, and it’s a bug.” The second FAIL is “not applicable to this system.”
ISTM simply not exposing the test on systems that don’t support it makes sense. Can you clarify?
On Wed, Nov 27, 2019 at 01:01:40PM -0800, Andy Lutomirski wrote:
> I’m not familiar with the userspace tooling, but this seems unfortunate. The first FAIL is “the test case screwed up, and it’s a bug.” The second FAIL is “not applicable to this system.”
>
>
> ISTM simply not exposing the test on systems that don’t support it makes sense. Can you clarify?
I don't like the tests listed in the DIRECT file to change from build to
build (it should be stable per kernel version). Userspace needs to know
how to evaluate the results of running each test, so in both cases, I
consider it a failure: double fault didn't work or you tried to test
double fault on an unsupported architecture. (The SMEP test works
similarly.)
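
To illustrate the "stable list" idea, here is a minimal userspace sketch, not the actual LKDTM code. It assumes a hypothetical SKETCH_X86_32 macro standing in for CONFIG_X86_32, and the sketch_* names are made up for illustration. The point is only that the table entry is unconditional while the ifdef lives inside the function body:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for CONFIG_X86_32. Define it to model a
 * 32-bit x86 build; leave it undefined to model everything else.
 */
/* #define SKETCH_X86_32 */

struct sketch_crashtype {
	const char *name;
	void (*func)(void);
};

#define SKETCH_CRASHTYPE(_name) { .name = #_name, .func = sketch_##_name }

static void sketch_UNSET_SMEP(void)
{
	/* Placeholder body; the real test is not reproduced here. */
	fprintf(stderr, "FAIL: (sketch) test body omitted\n");
}

static void sketch_DOUBLE_FAULT(void)
{
#ifdef SKETCH_X86_32
	/* The real test would trigger #DF here and never return. */
	fprintf(stderr, "FAIL: tried to double fault but didn't die!\n");
#else
	fprintf(stderr, "FAIL: this test is only available on 32-bit x86.\n");
#endif
}

/* The table has the same entries on every build. */
static const struct sketch_crashtype sketch_crashtypes[] = {
	SKETCH_CRASHTYPE(UNSET_SMEP),
	SKETCH_CRASHTYPE(DOUBLE_FAULT),
};

/* Look a test up by name; returns NULL if the name is unknown. */
const struct sketch_crashtype *sketch_find_crashtype(const char *name)
{
	size_t n = sizeof(sketch_crashtypes) / sizeof(sketch_crashtypes[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(sketch_crashtypes[i].name, name) == 0)
			return &sketch_crashtypes[i];
	return NULL;
}
```

With this layout a lookup of DOUBLE_FAULT succeeds whether or not the macro is defined, and only the message printed by the function differs between builds.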
--
Kees Cook
> On Nov 27, 2019, at 5:50 PM, Kees Cook <[email protected]> wrote:
>
> I don't like the tests listed in the DIRECT file to change from build to
> build (it should be stable per kernel version). Userspace needs to know
> how to evaluate the results of running each test, so in both cases, I
> consider it a failure: double fault didn't work or you tried to test
> double fault on an unsupported architecture. (The SMEP test works
> similarly.)
>
So how is the test harness supposed to distinguish success from failure? If it printed UNSUPPORTED instead of FAIL, it would make more sense to me, but I’m not sure why that’s better than just not exposing it at all.
On Wed, Nov 27, 2019 at 06:15:17PM -0800, Andy Lutomirski wrote:
> So how is the test harness supposed to distinguish success from failure? If it printed UNSUPPORTED instead of FAIL, it would make more sense to me, but I’m not sure why that’s better than just not exposing it at all.
If kernelci or similar ever mentions this as a problem for them, I'm
happy to change it. I think it's an error to request this test in the
wrong environment (because that implies userspace doesn't know how to
evaluate the results). Also, I like it _available_ because having it
missing makes the code ugly (lots of ifdefs) and provides no signal to
userspace about it (EINVAL on the write to DIRECT doesn't tell me if I
have the wrong kernel version or the wrong architecture, etc). Since the
tester needs to be parsing dmesg and system state (did it panic, etc), I
much prefer keeping the signals there.
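
As a rough sketch of what "parsing dmesg" could look like on the harness side, under assumptions: the "FAIL:" marker is the one used by the pr_err() calls in the patch, while the "UNSUPPORTED:" marker is hypothetical (it is the alternative suggested earlier in this thread, not something the patch prints). The enum and function names are invented for illustration:

```c
#include <string.h>

enum sketch_outcome {
	SKETCH_UNKNOWN,      /* no recognised marker on this line */
	SKETCH_FAIL,         /* line carries the "FAIL:" marker */
	SKETCH_UNSUPPORTED,  /* line carries the hypothetical "UNSUPPORTED:" marker */
};

/* Classify a single log line by the marker it contains. */
enum sketch_outcome sketch_classify(const char *line)
{
	if (strstr(line, "UNSUPPORTED:"))
		return SKETCH_UNSUPPORTED;
	if (strstr(line, "FAIL:"))
		return SKETCH_FAIL;
	return SKETCH_UNKNOWN;
}
```

With the patch as posted, both the "didn't die" message and the "only available on 32-bit x86" message classify as SKETCH_FAIL, so a harness would have to match on the message text to tell them apart.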
--
Kees Cook
> On Nov 27, 2019, at 6:54 PM, Kees Cook <[email protected]> wrote:
>
> If kernelci or similar ever mentions this as a problem for them, I'm
> happy to change it. I think it's an error to request this test in the
> wrong environment (because that implies userspace doesn't know how to
> evaluate the results). Also, I like it _available_ because having it
> missing makes the code ugly (lots of ifdefs) and provides no signal to
> userspace about it (EINVAL on the write to DIRECT doesn't tell me if I
> have the wrong kernel version or the wrong architecture, etc). Since the
> tester needs to be parsing dmesg and system state (did it panic, etc), I
> much prefer keeping the signals there.
>
>
Could we perhaps standardize on some particular error code to say “I know about this test, but it’s not implemented on this kernel”?
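
Something like the following is what I have in mind, as a userspace sketch only. The choice of EOPNOTSUPP is an assumption for illustration (nothing in this thread settles on a code), and sketch_direct_write() and the table are hypothetical names, not LKDTM functions:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct sketch_test {
	const char *name;
	bool implemented;  /* false: known test, but not available on this build */
};

/* Hypothetical table; DOUBLE_FAULT is marked unimplemented to model non-32-bit-x86. */
static const struct sketch_test sketch_tests[] = {
	{ "UNSET_SMEP", true },
	{ "DOUBLE_FAULT", false },
};

/*
 * Model of a write to DIRECT:
 *   0            the test is known and would be run
 *   -EOPNOTSUPP  the test is known but not implemented here (assumed code)
 *   -EINVAL      the test name is not known at all
 */
int sketch_direct_write(const char *name)
{
	size_t n = sizeof(sketch_tests) / sizeof(sketch_tests[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(sketch_tests[i].name, name) != 0)
			continue;
		return sketch_tests[i].implemented ? 0 : -EOPNOTSUPP;
	}
	return -EINVAL;
}
```

That would let userspace tell "wrong kernel version" (EINVAL) apart from "wrong architecture or config" without parsing the log.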