Currently, the Linux kernel provides two types of reaction to kernel
warnings:
1. Do nothing (by default),
2. Call panic() if panic_on_warn is set. That's a very strong reaction,
so panic_on_warn is usually disabled on production systems.
From a safety point of view, the Linux kernel misses a middle way of
handling kernel warnings:
- The kernel should stop the activity that provokes a warning,
- But the kernel should avoid complete denial of service.
From a security point of view, kernel warning messages provide a lot of
useful information for attackers. Many GNU/Linux distributions allow
unprivileged users to read the kernel log, so attackers use kernel
warning infoleak in vulnerability exploits. See the examples:
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
Let's introduce the pkill_on_warn boot parameter.
If this parameter is set, the kernel kills all threads in a process
that provoked a kernel warning. This behavior is reasonable from a safety
point of view described above. It is also useful for kernel security
hardening because the system kills an exploit process that hits a
kernel warning.
Signed-off-by: Alexander Popov <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 4 ++++
kernel/panic.c | 5 +++++
2 files changed, 9 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91ba391f9b32..86c748907666 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4112,6 +4112,10 @@
pirq= [SMP,APIC] Manual mp-table setup
See Documentation/x86/i386/IO-APIC.rst.
+ pkill_on_warn= Kill all threads in a process that provoked a
+ kernel warning.
+ Format: { "0" | "1" }
+
plip= [PPT,NET] Parallel port network link
Format: { parport<nr> | timid | 0 }
See also Documentation/admin-guide/parport.rst.
diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..47b728bfb1d3 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -53,6 +53,7 @@ static int pause_on_oops_flag;
static DEFINE_SPINLOCK(pause_on_oops_lock);
bool crash_kexec_post_notifiers;
int panic_on_warn __read_mostly;
+int pkill_on_warn __read_mostly;
unsigned long panic_on_taint;
bool panic_on_taint_nousertaint = false;
@@ -610,6 +611,9 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
print_oops_end_marker();
+ if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
+ do_group_exit(SIGKILL);
+
/* Just a warning, don't kill lockdep. */
add_taint(taint, LOCKDEP_STILL_OK);
}
@@ -694,6 +698,7 @@ core_param(panic, panic_timeout, int, 0644);
core_param(panic_print, panic_print, ulong, 0644);
core_param(pause_on_oops, pause_on_oops, int, 0644);
core_param(panic_on_warn, panic_on_warn, int, 0644);
+core_param(pkill_on_warn, pkill_on_warn, int, 0644);
core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
static int __init oops_setup(char *s)
--
2.31.1
On 29.09.2021 21:58, Alexander Popov wrote:
> Currently, the Linux kernel provides two types of reaction to kernel
> warnings:
> 1. Do nothing (by default),
> 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> so panic_on_warn is usually disabled on production systems.
>
> From a safety point of view, the Linux kernel misses a middle way of
> handling kernel warnings:
> - The kernel should stop the activity that provokes a warning,
> - But the kernel should avoid complete denial of service.
>
> From a security point of view, kernel warning messages provide a lot of
> useful information for attackers. Many GNU/Linux distributions allow
> unprivileged users to read the kernel log, so attackers use kernel
> warning infoleak in vulnerability exploits. See the examples:
> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
>
> Let's introduce the pkill_on_warn boot parameter.
> If this parameter is set, the kernel kills all threads in a process
> that provoked a kernel warning. This behavior is reasonable from a safety
> point of view described above. It is also useful for kernel security
> hardening because the system kills an exploit process that hits a
> kernel warning.
>
> Signed-off-by: Alexander Popov <[email protected]>
This patch was tested using CONFIG_LKDTM.
The kernel kills a process that performs this:
echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
If you are fine with this approach, I will prepare a patch adding the
pkill_on_warn sysctl.
Best regards,
Alexander
> ---
> Documentation/admin-guide/kernel-parameters.txt | 4 ++++
> kernel/panic.c | 5 +++++
> 2 files changed, 9 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 91ba391f9b32..86c748907666 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4112,6 +4112,10 @@
> pirq= [SMP,APIC] Manual mp-table setup
> See Documentation/x86/i386/IO-APIC.rst.
>
> + pkill_on_warn= Kill all threads in a process that provoked a
> + kernel warning.
> + Format: { "0" | "1" }
> +
> plip= [PPT,NET] Parallel port network link
> Format: { parport<nr> | timid | 0 }
> See also Documentation/admin-guide/parport.rst.
> diff --git a/kernel/panic.c b/kernel/panic.c
> index cefd7d82366f..47b728bfb1d3 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -53,6 +53,7 @@ static int pause_on_oops_flag;
> static DEFINE_SPINLOCK(pause_on_oops_lock);
> bool crash_kexec_post_notifiers;
> int panic_on_warn __read_mostly;
> +int pkill_on_warn __read_mostly;
> unsigned long panic_on_taint;
> bool panic_on_taint_nousertaint = false;
>
> @@ -610,6 +611,9 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>
> print_oops_end_marker();
>
> + if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
> + do_group_exit(SIGKILL);
> +
> /* Just a warning, don't kill lockdep. */
> add_taint(taint, LOCKDEP_STILL_OK);
> }
> @@ -694,6 +698,7 @@ core_param(panic, panic_timeout, int, 0644);
> core_param(panic_print, panic_print, ulong, 0644);
> core_param(pause_on_oops, pause_on_oops, int, 0644);
> core_param(panic_on_warn, panic_on_warn, int, 0644);
> +core_param(pkill_on_warn, pkill_on_warn, int, 0644);
> core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
>
> static int __init oops_setup(char *s)
>
On 9/29/21 11:58 AM, Alexander Popov wrote:
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -53,6 +53,7 @@ static int pause_on_oops_flag;
> static DEFINE_SPINLOCK(pause_on_oops_lock);
> bool crash_kexec_post_notifiers;
> int panic_on_warn __read_mostly;
> +int pkill_on_warn __read_mostly;
> unsigned long panic_on_taint;
> bool panic_on_taint_nousertaint = false;
>
> @@ -610,6 +611,9 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>
> print_oops_end_marker();
>
> + if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
> + do_group_exit(SIGKILL);
> +
> /* Just a warning, don't kill lockdep. */
> add_taint(taint, LOCKDEP_STILL_OK);
> }
Doesn't this tie into the warning *printing* code? That's better than
nothing, for sure. But, if we're doing this for hardening, I think we
would want to kill anyone provoking a warning, not just the first one
that triggered *printing* the warning.
On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> On 29.09.2021 21:58, Alexander Popov wrote:
> > Currently, the Linux kernel provides two types of reaction to kernel
> > warnings:
> > 1. Do nothing (by default),
> > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > so panic_on_warn is usually disabled on production systems.
> >
> > From a safety point of view, the Linux kernel misses a middle way of
> > handling kernel warnings:
> > - The kernel should stop the activity that provokes a warning,
> > - But the kernel should avoid complete denial of service.
> >
> > From a security point of view, kernel warning messages provide a lot of
> > useful information for attackers. Many GNU/Linux distributions allow
> > unprivileged users to read the kernel log, so attackers use kernel
> > warning infoleak in vulnerability exploits. See the examples:
> > https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> > https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
> >
> > Let's introduce the pkill_on_warn boot parameter.
> > If this parameter is set, the kernel kills all threads in a process
> > that provoked a kernel warning. This behavior is reasonable from a safety
> > point of view described above. It is also useful for kernel security
> > hardening because the system kills an exploit process that hits a
> > kernel warning.
> >
> > Signed-off-by: Alexander Popov <[email protected]>
>
> This patch was tested using CONFIG_LKDTM.
> The kernel kills a process that performs this:
> echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
>
> If you are fine with this approach, I will prepare a patch adding the
> pkill_on_warn sysctl.
I suspect that you need a list of kthreads for which you are better
off just invoking panic(). RCU's various kthreads, for but one set
of examples.
Thanx, Paul
> Best regards,
> Alexander
>
> > ---
> > Documentation/admin-guide/kernel-parameters.txt | 4 ++++
> > kernel/panic.c | 5 +++++
> > 2 files changed, 9 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index 91ba391f9b32..86c748907666 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -4112,6 +4112,10 @@
> > pirq= [SMP,APIC] Manual mp-table setup
> > See Documentation/x86/i386/IO-APIC.rst.
> >
> > + pkill_on_warn= Kill all threads in a process that provoked a
> > + kernel warning.
> > + Format: { "0" | "1" }
> > +
> > plip= [PPT,NET] Parallel port network link
> > Format: { parport<nr> | timid | 0 }
> > See also Documentation/admin-guide/parport.rst.
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index cefd7d82366f..47b728bfb1d3 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -53,6 +53,7 @@ static int pause_on_oops_flag;
> > static DEFINE_SPINLOCK(pause_on_oops_lock);
> > bool crash_kexec_post_notifiers;
> > int panic_on_warn __read_mostly;
> > +int pkill_on_warn __read_mostly;
> > unsigned long panic_on_taint;
> > bool panic_on_taint_nousertaint = false;
> >
> > @@ -610,6 +611,9 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
> >
> > print_oops_end_marker();
> >
> > + if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
> > + do_group_exit(SIGKILL);
> > +
> > /* Just a warning, don't kill lockdep. */
> > add_taint(taint, LOCKDEP_STILL_OK);
> > }
> > @@ -694,6 +698,7 @@ core_param(panic, panic_timeout, int, 0644);
> > core_param(panic_print, panic_print, ulong, 0644);
> > core_param(pause_on_oops, pause_on_oops, int, 0644);
> > core_param(panic_on_warn, panic_on_warn, int, 0644);
> > +core_param(pkill_on_warn, pkill_on_warn, int, 0644);
> > core_param(crash_kexec_post_notifiers, crash_kexec_post_notifiers, bool, 0644);
> >
> > static int __init oops_setup(char *s)
> >
>
On Wed, 29 Sep 2021 22:01:33 +0300 Alexander Popov <[email protected]> wrote:
> On 29.09.2021 21:58, Alexander Popov wrote:
> > Currently, the Linux kernel provides two types of reaction to kernel
> > warnings:
> > 1. Do nothing (by default),
> > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > so panic_on_warn is usually disabled on production systems.
> >
> > From a safety point of view, the Linux kernel misses a middle way of
> > handling kernel warnings:
> > - The kernel should stop the activity that provokes a warning,
> > - But the kernel should avoid complete denial of service.
> >
> > From a security point of view, kernel warning messages provide a lot of
> > useful information for attackers. Many GNU/Linux distributions allow
> > unprivileged users to read the kernel log, so attackers use kernel
> > warning infoleak in vulnerability exploits. See the examples:
> > https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> > https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
> >
> > Let's introduce the pkill_on_warn boot parameter.
> > If this parameter is set, the kernel kills all threads in a process
> > that provoked a kernel warning. This behavior is reasonable from a safety
> > point of view described above. It is also useful for kernel security
> > hardening because the system kills an exploit process that hits a
> > kernel warning.
> >
> > Signed-off-by: Alexander Popov <[email protected]>
>
> This patch was tested using CONFIG_LKDTM.
> The kernel kills a process that performs this:
> echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
>
> If you are fine with this approach, I will prepare a patch adding the
> pkill_on_warn sysctl.
Why do we need a boot parameter? Isn't a sysctl all we need for this
feature?
Also,
if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
do_group_exit(SIGKILL);
- why do we care about system_state? An explanatory code comment
seems appropriate.
- do we really want to do this in states > SYSTEM_RUNNING? If so, why?
On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> > On 29.09.2021 21:58, Alexander Popov wrote:
> > > Currently, the Linux kernel provides two types of reaction to kernel
> > > warnings:
> > > 1. Do nothing (by default),
> > > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > > so panic_on_warn is usually disabled on production systems.
Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
work as expected. I wonder who uses it in practice and what is
the experience.
The problem is that many developers do not know about this behavior.
They use WARN() when they are lazy to write more useful message or when
they want to see all the provided details: task, registry, backtrace.
Also it is inconsistent with pr_warn() behavior. Why a single line
warning would be innocent and full info WARN() cause panic/pkill?
What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
about even more serious problems. Why a warning should cause panic/pkill
while an alert message is just printed?
It somehow reminds me the saga with %pK. We were not able to teach
developers to use it correctly for years and ended with hashed
pointers.
Well, this might be different. Developers might learn this the hard
way from bug reports. But there will be bug reports only when
anyone really enables this behavior. They will enable it only
when it works the right way most of the time.
> > > From a safety point of view, the Linux kernel misses a middle way of
> > > handling kernel warnings:
> > > - The kernel should stop the activity that provokes a warning,
> > > - But the kernel should avoid complete denial of service.
> > >
> > > From a security point of view, kernel warning messages provide a lot of
> > > useful information for attackers. Many GNU/Linux distributions allow
> > > unprivileged users to read the kernel log, so attackers use kernel
> > > warning infoleak in vulnerability exploits. See the examples:
> > > https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> > > https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
> > >
> > > Let's introduce the pkill_on_warn boot parameter.
> > > If this parameter is set, the kernel kills all threads in a process
> > > that provoked a kernel warning. This behavior is reasonable from a safety
> > > point of view described above. It is also useful for kernel security
> > > hardening because the system kills an exploit process that hits a
> > > kernel warning.
> > >
> > > Signed-off-by: Alexander Popov <[email protected]>
> >
> > This patch was tested using CONFIG_LKDTM.
> > The kernel kills a process that performs this:
> > echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
> >
> > If you are fine with this approach, I will prepare a patch adding the
> > pkill_on_warn sysctl.
>
> I suspect that you need a list of kthreads for which you are better
> off just invoking panic(). RCU's various kthreads, for but one set
> of examples.
I wonder if kernel could survive killing of any kthread. I have never
seen a code that would check whether a kthread was killed and
restart it.
Best Regards,
Petr
On 30.09.2021 12:15, Petr Mladek wrote:
> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
>> On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
>>> On 29.09.2021 21:58, Alexander Popov wrote:
>>>> Currently, the Linux kernel provides two types of reaction to kernel
>>>> warnings:
>>>> 1. Do nothing (by default),
>>>> 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
>>>> so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.
>
> The problem is that many developers do not know about this behavior.
> They use WARN() when they are lazy to write more useful message or when
> they want to see all the provided details: task, registry, backtrace.
>
> Also it is inconsistent with pr_warn() behavior. Why a single line
> warning would be innocent and full info WARN() cause panic/pkill?
>
> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why a warning should cause panic/pkill
> while an alert message is just printed?
That's a good question.
I guess various kernel continuous integration (CI) systems have panic_on_warn
enabled.
[Adding Dmitry Vyukov to this discussion]
If we look at the syzbot dashboard [1] with the results of Linux kernel fuzzing,
we see the issues that appear as various kernel crashes and warnings.
We don't see anything from pr_err(), pr_crit(), pr_alert(), pr_emerg(). Maybe
these situations are not considered as kernel bugs that require fixing.
Anyway, from a security point of view, a kernel warning output is interesting
for attackers as an infoleak. The messages printed by pr_err(), pr_crit(),
pr_alert(), pr_emerg() provide less information.
[1]: https://syzkaller.appspot.com/upstream
> It somehow reminds me the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended with hashed
> pointers.
>
> Well, this might be different. Developers might learn this the hard
> way from bug reports. But there will be bug reports only when
> anyone really enables this behavior. They will enable it only
> when it works the right way most of the time.
>
>
>>>> From a safety point of view, the Linux kernel misses a middle way of
>>>> handling kernel warnings:
>>>> - The kernel should stop the activity that provokes a warning,
>>>> - But the kernel should avoid complete denial of service.
>>>>
>>>> From a security point of view, kernel warning messages provide a lot of
>>>> useful information for attackers. Many GNU/Linux distributions allow
>>>> unprivileged users to read the kernel log, so attackers use kernel
>>>> warning infoleak in vulnerability exploits. See the examples:
>>>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>>>> https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
>>>>
>>>> Let's introduce the pkill_on_warn boot parameter.
>>>> If this parameter is set, the kernel kills all threads in a process
>>>> that provoked a kernel warning. This behavior is reasonable from a safety
>>>> point of view described above. It is also useful for kernel security
>>>> hardening because the system kills an exploit process that hits a
>>>> kernel warning.
>>>>
>>>> Signed-off-by: Alexander Popov <[email protected]>
>>>
>>> This patch was tested using CONFIG_LKDTM.
>>> The kernel kills a process that performs this:
>>> echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
>>>
>>> If you are fine with this approach, I will prepare a patch adding the
>>> pkill_on_warn sysctl.
>>
>> I suspect that you need a list of kthreads for which you are better
>> off just invoking panic(). RCU's various kthreads, for but one set
>> of examples.
>
> I wonder if kernel could survive killing of any kthread. I have never
> seen a code that would check whether a kthread was killed and
> restart it.
The do_group_exit() function calls do_exit() from kernel/exit.c, which is also
called during a kernel oops. This function cares about a lot of special cases
depending on the current task_struct. Is it fine?
Best regards,
Alexander
On Thu, 30 Sep 2021 11:15:41 +0200
Petr Mladek <[email protected]> wrote:
> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> > On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> > > On 29.09.2021 21:58, Alexander Popov wrote:
> > > > Currently, the Linux kernel provides two types of reaction to kernel
> > > > warnings:
> > > > 1. Do nothing (by default),
> > > > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > > > so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.
Several people use it, as I see reports all the time when someone can
trigger a warn on from user space, and it's listed as a DOS of the
system.
>
> The problem is that many developers do not know about this behavior.
> They use WARN() when they are lazy to write more useful message or when
> they want to see all the provided details: task, registry, backtrace.
WARN() Should never be used just because of laziness. If it is, then
that's a bug. Let's not use that as an excuse to shoot down this
proposal. WARN() should only be used to test assumptions where you do
not believe something can happen. I use it all the time when the logic
prevents some state, and have the WARN() enabled if that state is hit.
Because to me, it shows something that shouldn't happen happened, and I
need to fix the code.
Basically, WARN should be used just like BUG. But Linus hates BUG,
because in most cases, these bad areas shouldn't take down the entire
kernel, but for some people, they WANT it to take down the system.
>
> Also it is inconsistent with pr_warn() behavior. Why a single line
> warning would be innocent and full info WARN() cause panic/pkill?
pr_warn() can be used for things that the user can hit. I'll use
pr_warn, for memory failures, and such. Something that says "we ran out
of resources, this will not work the way you expect". That is perfect
for pr_warn. But not something that requires a stack dump.
>
> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why a warning should cause panic/pkill
> while an alert message is just printed?
Because really, WARN() == BUG() but like I said, Linus doesn't like
taking down the entire system on these areas.
>
>
> It somehow reminds me the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended with hashed
> pointers.
>
> Well, this might be different. Developers might learn this the hard
> way from bug reports. But there will be bug reports only when
> anyone really enables this behavior. They will enable it only
> when it works the right way most of the time.
The panic_on_warn() has been used for years now. I do not think this is
an issue.
>
> I wonder if kernel could survive killing of any kthread. I have never
> seen a code that would check whether a kthread was killed and
> restart it.
We can easily check if the thread is a kernel thread or a user thread,
and make the decision on that.
-- Steve
On Thu, Sep 30, 2021 at 11:15:41AM +0200, Petr Mladek wrote:
> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> > On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> > > On 29.09.2021 21:58, Alexander Popov wrote:
> > > > Currently, the Linux kernel provides two types of reaction to kernel
> > > > warnings:
> > > > 1. Do nothing (by default),
> > > > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > > > so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.
panic_on_warn() gets used by folks with paranoid security concerns.
> The problem is that many developers do not know about this behavior.
> They use WARN() when they are lazy to write more useful message or when
> they want to see all the provided details: task, registry, backtrace.
The documentation[1] on this hopefully clarifies the situation:
Note that the WARN()-family should only be used for “expected to be
unreachable” situations. If you want to warn about “reachable but
undesirable” situations, please use the pr_warn()-family of functions.
System owners may have set the panic_on_warn sysctl, to make sure their
systems do not continue running in the face of “unreachable” conditions.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> Also it is inconsistent with pr_warn() behavior. Why a single line
> warning would be innocent and full info WARN() cause panic/pkill?
Because pr_warn() is intended for system admins. WARN() is for
developers and should not be reachable through any known path.
> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why a warning should cause panic/pkill
> while an alert message is just printed?
Additionally, pr_*() don't include stack traces, etc. WARN() is for
situations that should never happen. pr_warn() is about undesirable but
reachable states.
For example:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4689846881d160a4d12a514e991a740bcb5d65a
> It somehow reminds me the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended with hashed
> pointers.
And this was pointed out when %pK was introduced, but Linus couldn't be
convinced. He changed his mind, thankfully.
--
Kees Cook
On Thu, Sep 30, 2021 at 09:27:43PM +0300, Alexander Popov wrote:
> On 30.09.2021 02:31, Andrew Morton wrote:
> > On Wed, 29 Sep 2021 22:01:33 +0300 Alexander Popov <[email protected]> wrote:
> >
> >> On 29.09.2021 21:58, Alexander Popov wrote:
> >> [...]
> >> If you are fine with this approach, I will prepare a patch adding the
> >> pkill_on_warn sysctl.
> >
> > Why do we need a boot parameter? Isn't a sysctl all we need for this
> > feature?
>
> I would say we need both sysctl and boot parameter for pkill_on_warn.
> That would be consistent with panic_on_warn, ftrace_dump_on_oops and
> oops/panic_on_oops.
If you want to change it at runtime, just make a sysctl: it will
be available as a bootparam since v5.8. See commit 3db978d480e2
("kernel/sysctl: support setting sysctl parameters from kernel command
line")
--
Kees Cook
On 30.09.2021 02:31, Andrew Morton wrote:
> On Wed, 29 Sep 2021 22:01:33 +0300 Alexander Popov <[email protected]> wrote:
>
>> On 29.09.2021 21:58, Alexander Popov wrote:
>>> Currently, the Linux kernel provides two types of reaction to kernel
>>> warnings:
>>> 1. Do nothing (by default),
>>> 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
>>> so panic_on_warn is usually disabled on production systems.
>>>
>>> From a safety point of view, the Linux kernel misses a middle way of
>>> handling kernel warnings:
>>> - The kernel should stop the activity that provokes a warning,
>>> - But the kernel should avoid complete denial of service.
>>>
>>> From a security point of view, kernel warning messages provide a lot of
>>> useful information for attackers. Many GNU/Linux distributions allow
>>> unprivileged users to read the kernel log, so attackers use kernel
>>> warning infoleak in vulnerability exploits. See the examples:
>>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>>> https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
>>>
>>> Let's introduce the pkill_on_warn boot parameter.
>>> If this parameter is set, the kernel kills all threads in a process
>>> that provoked a kernel warning. This behavior is reasonable from a safety
>>> point of view described above. It is also useful for kernel security
>>> hardening because the system kills an exploit process that hits a
>>> kernel warning.
>>>
>>> Signed-off-by: Alexander Popov <[email protected]>
>>
>> This patch was tested using CONFIG_LKDTM.
>> The kernel kills a process that performs this:
>> echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
>>
>> If you are fine with this approach, I will prepare a patch adding the
>> pkill_on_warn sysctl.
>
> Why do we need a boot parameter? Isn't a sysctl all we need for this
> feature?
I would say we need both sysctl and boot parameter for pkill_on_warn.
That would be consistent with panic_on_warn, ftrace_dump_on_oops and
oops/panic_on_oops.
> Also,
>
> if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
> do_group_exit(SIGKILL);
>
> - why do we care about system_state? An explanatory code comment
> seems appropriate.
>
> - do we really want to do this in states > SYSTEM_RUNNING? If so, why?
A kernel warning may occur at any moment.
I don't have a deep understanding of possible side effects on early boot stages.
So I decided that at least it's safer to avoid interfering before SYSTEM_RUNNING.
Best regards,
Alexander
On Thu 2021-09-30 12:59:03, Steven Rostedt wrote:
> On Thu, 30 Sep 2021 11:15:41 +0200
> Petr Mladek <[email protected]> wrote:
>
> > On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> > > On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> > > > On 29.09.2021 21:58, Alexander Popov wrote:
> > > > > Currently, the Linux kernel provides two types of reaction to kernel
> > > > > warnings:
> > > > > 1. Do nothing (by default),
> > > > > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > > > > so panic_on_warn is usually disabled on production systems.
> >
> > Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> > work as expected. I wonder who uses it in practice and what is
> > the experience.
>
> Several people use it, as I see reports all the time when someone can
> trigger a warn on from user space, and it's listed as a DOS of the
> system.
Good to know.
> > The problem is that many developers do not know about this behavior.
> > They use WARN() when they are lazy to write more useful message or when
> > they want to see all the provided details: task, registry, backtrace.
>
> WARN() Should never be used just because of laziness. If it is, then
> that's a bug. Let's not use that as an excuse to shoot down this
> proposal. WARN() should only be used to test assumptions where you do
> not believe something can happen. I use it all the time when the logic
> prevents some state, and have the WARN() enabled if that state is hit.
> Because to me, it shows something that shouldn't happen happened, and I
> need to fix the code.
I have just double checked code written or reviewed by me and it
mostly follow the rules. But it is partly just by chance. I did not
have these rather clear rules in my head.
But for example, the following older WARN() in format_decode() in
lib/vsprintf.c is questionable:
WARN_ONCE(1, "Please remove unsupported %%%c in format string\n", *fmt);
I guess that the WARN() was used to easily locate the caller. But it
is not a reason the reboot the system or kill the process, definitely.
Maybe, we could implement an alternative macro for these situations,
e.g. DEBUG() or warn().
> > Well, this might be different. Developers might learn this the hard
> > way from bug reports. But there will be bug reports only when
> > anyone really enables this behavior. They will enable it only
> > when it works the right way most of the time.
>
> The panic_on_warn() has been used for years now. I do not think this is
> an issue.
If panic_on_warn() is widely used then pkill_on_warn() is fine as well.
Best Regards,
Petr
On Thu 2021-09-30 18:05:54, Alexander Popov wrote:
> On 30.09.2021 12:15, Petr Mladek wrote:
> > On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> >> On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> >>> This patch was tested using CONFIG_LKDTM.
> >>> The kernel kills a process that performs this:
> >>> echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT
> >>>
> >>> If you are fine with this approach, I will prepare a patch adding the
> >>> pkill_on_warn sysctl.
> >>
> >> I suspect that you need a list of kthreads for which you are better
> >> off just invoking panic(). RCU's various kthreads, for but one set
> >> of examples.
> >
> > I wonder if kernel could survive killing of any kthread. I have never
> > seen a code that would check whether a kthread was killed and
> > restart it.
>
> The do_group_exit() function calls do_exit() from kernel/exit.c, which is also
> called during a kernel oops. This function cares about a lot of special cases
> depending on the current task_struct. Is it fine?
IMHO, the bigger problem is that nobody will start the kthreads again.
As a result, some kernel functionality will not longer work.
User space threads are different. The user/admin typically
have a chance to start them again.
We might get inspiration in OOM killer. It never kills kthreads
and the init process, see oom_unkillable_task().
It would be better to panic() when WARN() is called from a kthread
or the init process.
Best Regards,
Petr
On Thu, Sep 30, 2021 at 2:15 AM Petr Mladek <[email protected]> wrote:
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.
Afaik, there are only two valid uses for panic-on-warn:
(a) test boxes (particularly VM's) that are literally running things
like syzbot and want to report any kernel warnings
(b) the "interchangeable production machinery" fail-fast kind of situation
So in that (a) case, it's literally that you consider a warning to be
a failure case, and just want to stop. Very useful as a way to get
notified by syzbot that "oh, that assert can actually trigger".
And the (b) case is more of a "we have 150 million machines, we expect
about a thousand of them to fail for any random reason any day
_anyway_ - perhaps simply due to hardware failure, and we'd rather
take a machine down quickly and then perhaps look at why only much
later when we have some pattern to the failures".
You shouldn't expect panic-on-warn to ever be the case for any actual
production machine that _matters_. If it is, that production
maintainer only has themselves to blame if they set that flag.
But yes, the expectation is that warnings are for "this can't happen,
but if it does, it's not necessarily fatal, I want to know about it so
that I can think about it".
So it might be a case that you don't handle, but that isn't
necessarily _wrong_ to not handle. You are ok returning an error like
-ENOSYS for that case, for example, but at the same time you are "If
somebody uses this, we should perhaps react to it".
In many cases, a "pr_warn()" is much better. But if you are unsure
just _how_ the situation can happen, and want a call trace and
information about what process did it, and it really is a "this
shouldn't ever happen" situation, a WARN_ON() or a WARN_ON_ONCE() is
certainly not wrong.
So think of WARN_ON() as basically an assert, but an assert with the
intention to be able to continue so that the assert can actually be
reported. BUG_ON() and friends easily result in a machine that is
dead. That's unacceptable.
And think of "panic-on-warn" as people who can deal with their own
problems. It's fundamentally not your issue. They took that choice,
it's their problem, and the security arguments are pure BS - because
WARN_ON() just shouldn't be something you can trigger anyway.
Linus
On 01.10.2021 22:59, Linus Torvalds wrote:
> On Thu, Sep 30, 2021 at 2:15 AM Petr Mladek <[email protected]> wrote:
>>
>> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
>> work as expected. I wonder who uses it in practice and what is
>> the experience.
>
> Afaik, there are only two valid uses for panic-on-warn:
>
> (a) test boxes (particularly VM's) that are literally running things
> like syzbot and want to report any kernel warnings
>
> (b) the "interchangeable production machinery" fail-fast kind of situation
>
> So in that (a) case, it's literally that you consider a warning to be
> a failure case, and just want to stop. Very useful as a way to get
> notified by syzbot that "oh, that assert can actually trigger".
>
> And the (b) case is more of a "we have 150 million machines, we expect
> about a thousand of them to fail for any random reason any day
> _anyway_ - perhaps simply due to hardware failure, and we'd rather
> take a machine down quickly and then perhaps look at why only much
> later when we have some pattern to the failures".
>
> You shouldn't expect panic-on-warn to ever be the case for any actual
> production machine that _matters_. If it is, that production
> maintainer only has themselves to blame if they set that flag.
>
> But yes, the expectation is that warnings are for "this can't happen,
> but if it does, it's not necessarily fatal, I want to know about it so
> that I can think about it".
>
> So it might be a case that you don't handle, but that isn't
> necessarily _wrong_ to not handle. You are ok returning an error like
> -ENOSYS for that case, for example, but at the same time you are "If
> somebody uses this, we should perhaps react to it".
>
> In many cases, a "pr_warn()" is much better. But if you are unsure
> just _how_ the situation can happen, and want a call trace and
> information about what process did it, and it really is a "this
> shouldn't ever happen" situation, a WARN_ON() or a WARN_ON_ONCE() is
> certainly not wrong.
>
> So think of WARN_ON() as basically an assert, but an assert with the
> intention to be able to continue so that the assert can actually be
> reported. BUG_ON() and friends easily result in a machine that is
> dead. That's unacceptable.
>
> And think of "panic-on-warn" as people who can deal with their own
> problems. It's fundamentally not your issue. They took that choice,
> it's their problem, and the security arguments are pure BS - because
> WARN_ON() just shouldn't be something you can trigger anyway.
Thanks, Linus.
And what do you think about the proposed pkill_on_warn?
Let me quote the rationale behind it.
Currently, the Linux kernel provides two types of reaction to kernel warnings:
1. Do nothing (by default),
2. Call panic() if panic_on_warn is set. That's a very strong reaction,
so panic_on_warn is usually disabled on production systems.
From a safety point of view, the Linux kernel misses a middle way of handling
kernel warnings:
- The kernel should stop the activity that provokes a warning,
- But the kernel should avoid complete denial of service.
From a security point of view, kernel warning messages provide a lot of useful
information for attackers. Many GNU/Linux distributions allow unprivileged users
to read the kernel log (for various reasons), so attackers use kernel warning
infoleak in vulnerability exploits. See the examples:
https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
Let's introduce the pkill_on_warn parameter.
If this parameter is set, the kernel kills all threads in a process that
provoked a kernel warning. This behavior is reasonable from a safety point of
view described above. It is also useful for kernel security hardening because
the system kills an exploit process that hits a kernel warning.
Linus, how do you see the proper way of handling WARN_ON() in kthreads if
pkill_on_warn is enabled?
Thanks!
Best regards,
Alexander
On Sat, 2 Oct 2021 14:41:34 +0300
Alexander Popov <[email protected]> wrote:
> Currently, the Linux kernel provides two types of reaction to kernel warnings:
> 1. Do nothing (by default),
> 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> so panic_on_warn is usually disabled on production systems.
>
> >From a safety point of view, the Linux kernel misses a middle way of handling
> kernel warnings:
> - The kernel should stop the activity that provokes a warning,
> - But the kernel should avoid complete denial of service.
>
> >From a security point of view, kernel warning messages provide a lot of useful
> information for attackers. Many GNU/Linux distributions allow unprivileged users
> to read the kernel log (for various reasons), so attackers use kernel warning
> infoleak in vulnerability exploits. See the examples:
> https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
> https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
>
> Let's introduce the pkill_on_warn parameter.
> If this parameter is set, the kernel kills all threads in a process that
> provoked a kernel warning. This behavior is reasonable from a safety point of
> view described above. It is also useful for kernel security hardening because
> the system kills an exploit process that hits a kernel warning.
How does this help? It only kills the process that caused the warning,
it doesn't kill the process that spawned it. This is trivial to get
around. Just fork a process, trigger the warning (it gets killed) and
then read the kernel log.
If this is your rationale, then I'm not convinced this helps at all.
-- Steve
On 02.10.2021 15:13, Steven Rostedt wrote:
> On Sat, 2 Oct 2021 14:41:34 +0300
> Alexander Popov <[email protected]> wrote:
>
>> Currently, the Linux kernel provides two types of reaction to kernel warnings:
>> 1. Do nothing (by default),
>> 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
>> so panic_on_warn is usually disabled on production systems.
>>
>> >From a safety point of view, the Linux kernel misses a middle way of handling
>> kernel warnings:
>> - The kernel should stop the activity that provokes a warning,
>> - But the kernel should avoid complete denial of service.
>>
>> >From a security point of view, kernel warning messages provide a lot of useful
>> information for attackers. Many GNU/Linux distributions allow unprivileged users
>> to read the kernel log (for various reasons), so attackers use kernel warning
>> infoleak in vulnerability exploits. See the examples:
>> https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>> https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html
>>
>> Let's introduce the pkill_on_warn parameter.
>> If this parameter is set, the kernel kills all threads in a process that
>> provoked a kernel warning. This behavior is reasonable from a safety point of
>> view described above. It is also useful for kernel security hardening because
>> the system kills an exploit process that hits a kernel warning.
>
> How does this help? It only kills the process that caused the warning,
> it doesn't kill the process that spawned it. This is trivial to get
> around. Just fork a process, trigger the warning (it gets killed) and
> then read the kernel log.
>
> If this is your rationale, then I'm not convinced this helps at all.
Steven, as I understand, here you ask about the security implications of
pkill_on_warn (not about the safety implications that I mentioned).
Killing the exploit process that hit a warning is MUCH better than ignoring and
proceeding with execution. That may influence the stability of the exploits that
hit WARN_ON() or rely on WARN_ON() infoleak.
Exploit development is the constant struggle for attack stability. Exploiting a
heap memory corruption is especially painful when the kernel works with the
attacked slab caches in parallel with your exploit.
So when the kernel kills the exploit process, some of the WARN_ON() infoleak
data becomes obsolete; the attacker loses the execution in that particular
kernel task on that particular CPU. Moreover, restarting the exploit process
would bring a lot of noise to the system. That may decrease the attack stability
even more.
So killing the exploit process is the best option that we have here to distress
the attacker who uses the WARN_ON() infoleak technique. I.e. that is
probabilistic attack mitigation, which is reasonable for kernel safety as well.
I hope I managed to show this from the attacker's side.
Best regards,
Alexander
On Sat, Oct 2, 2021 at 4:41 AM Alexander Popov <[email protected]> wrote:
>
> And what do you think about the proposed pkill_on_warn?
Honestly, I don't see the point.
If you can reliably trigger the WARN_ON some way, you can probably
cause more problems by fooling some other process to trigger it.
And if it's unintentional, then what does the signal help?
So rather than a "rationale" that makes little sense, I'd like to hear
of an actual _use_ case. That's different. That's somebody actually
_using_ that pkill to good effect for some particular load.
That said, I don't much care in the end. But it sounds like a
pointless option to just introduce yet another behavior to something
that should never happen anyway, and where the actual
honest-to-goodness reason for WARN_ON() existing is already being
fulfilled (ie syzbot has been very effective at flushing things like
that out).
Linus
On Wed, Sep 29, 2021 at 09:58:23PM +0300, Alexander Popov wrote:
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -53,6 +53,7 @@ static int pause_on_oops_flag;
> static DEFINE_SPINLOCK(pause_on_oops_lock);
> bool crash_kexec_post_notifiers;
> int panic_on_warn __read_mostly;
> +int pkill_on_warn __read_mostly;
> unsigned long panic_on_taint;
> bool panic_on_taint_nousertaint = false;
>
> @@ -610,6 +611,9 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>
> print_oops_end_marker();
>
> + if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
> + do_group_exit(SIGKILL);
> +
Wait a sec... do_group_exit() is very much not locking-neutral.
Aren't you introducing a bunch of potential deadlocks by adding
that?
On Sat, 2 Oct 2021 18:04:10 +0000
Al Viro <[email protected]> wrote:
> > @@ -610,6 +611,9 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
> >
> > print_oops_end_marker();
> >
> > + if (pkill_on_warn && system_state >= SYSTEM_RUNNING)
> > + do_group_exit(SIGKILL);
> > +
>
> Wait a sec... do_group_exit() is very much not locking-neutral.
> Aren't you introducing a bunch of potential deadlocks by adding
> that?
Perhaps add an irq_work() here to trigger the do_group_exit() from a
"safe" interrupt context?
-- Steve
On 02.10.2021 19:52, Linus Torvalds wrote:
> On Sat, Oct 2, 2021 at 4:41 AM Alexander Popov <[email protected]> wrote:
>>
>> And what do you think about the proposed pkill_on_warn?
>
> Honestly, I don't see the point.
>
> If you can reliably trigger the WARN_ON some way, you can probably
> cause more problems by fooling some other process to trigger it.
>
> And if it's unintentional, then what does the signal help?
>
> So rather than a "rationale" that makes little sense, I'd like to hear
> of an actual _use_ case. That's different. That's somebody actually
> _using_ that pkill to good effect for some particular load.
I was thinking about a use case for you and got an insight.
Bugs usually don't come alone. Killing the process that got WARN_ON() prevents
possible bad effects **after** the warning. For example, in my exploit for
CVE-2019-18683, the kernel warning happens **before** the memory corruption
(use-after-free in the V4L2 subsystem).
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
So pkill_on_warn allows the kernel to stop the process when the first signs of
wrong behavior are detected. In other words, proceeding with the code execution
from the wrong state can bring more disasters later.
> That said, I don't much care in the end. But it sounds like a
> pointless option to just introduce yet another behavior to something
> that should never happen anyway, and where the actual
> honest-to-goodness reason for WARN_ON() existing is already being
> fulfilled (ie syzbot has been very effective at flushing things like
> that out).
Yes, we slowly get rid of kernel warnings.
However, the syzbot dashboard still shows a lot of them.
Even my small syzkaller setup finds plenty of new warnings.
I believe fixing all of them will take some time.
And during that time, pkill_on_warn may be a better reaction to WARN_ON() than
ignoring and proceeding with the execution.
Is that reasonable?
Best regards,
Alexander
Alexander Popov <[email protected]> writes:
> On 02.10.2021 19:52, Linus Torvalds wrote:
>> On Sat, Oct 2, 2021 at 4:41 AM Alexander Popov <[email protected]> wrote:
>>>
>>> And what do you think about the proposed pkill_on_warn?
>>
>> Honestly, I don't see the point.
>>
>> If you can reliably trigger the WARN_ON some way, you can probably
>> cause more problems by fooling some other process to trigger it.
>>
>> And if it's unintentional, then what does the signal help?
>>
>> So rather than a "rationale" that makes little sense, I'd like to hear
>> of an actual _use_ case. That's different. That's somebody actually
>> _using_ that pkill to good effect for some particular load.
>
> I was thinking about a use case for you and got an insight.
>
> Bugs usually don't come alone. Killing the process that got WARN_ON() prevents
> possible bad effects **after** the warning. For example, in my exploit for
> CVE-2019-18683, the kernel warning happens **before** the memory corruption
> (use-after-free in the V4L2 subsystem).
> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>
> So pkill_on_warn allows the kernel to stop the process when the first signs of
> wrong behavior are detected. In other words, proceeding with the code execution
> from the wrong state can bring more disasters later.
>
>> That said, I don't much care in the end. But it sounds like a
>> pointless option to just introduce yet another behavior to something
>> that should never happen anyway, and where the actual
>> honest-to-goodness reason for WARN_ON() existing is already being
>> fulfilled (ie syzbot has been very effective at flushing things like
>> that out).
>
> Yes, we slowly get rid of kernel warnings.
> However, the syzbot dashboard still shows a lot of them.
> Even my small syzkaller setup finds plenty of new warnings.
> I believe fixing all of them will take some time.
> And during that time, pkill_on_warn may be a better reaction to WARN_ON() than
> ignoring and proceeding with the execution.
>
> Is that reasonable?
I won't comment on the sanity of the feature but I will say that calling
it oops_on_warn (rather than pkill_on_warn), and using the usual oops
facilities rather than rolling oops by hand sounds like a better
implementation.
Especially as calling do_group_exit(SIGKILL) from a random location is
not a clean way to kill a process. Strictly speaking it is not even
killing the process.
Partly this is just me seeing the introduction of a
do_group_exit(SIGKILL) call and not likely the maintenance that will be
needed. I am still sorting out the problems with other randomly placed
calls to do_group_exit(SIGKILL) and interactions with ptrace and
PTRACE_EVENT_EXIT in particular.
Which is a long winded way of saying if I can predictably trigger a
warning that calls do_group_exit(SIGKILL), on some architectures I can
use ptrace and can convert that warning into a way to manipulate the
kernel stack to have the contents of my choice.
If anyone goes forward with this please use the existing oops
infrastructure so the ptrace interactions and anything else that comes
up only needs to be fixed once.
Eric
On 05.10.2021 22:48, Eric W. Biederman wrote:
> Alexander Popov <[email protected]> writes:
>
>> On 02.10.2021 19:52, Linus Torvalds wrote:
>>> On Sat, Oct 2, 2021 at 4:41 AM Alexander Popov <[email protected]> wrote:
>>>>
>>>> And what do you think about the proposed pkill_on_warn?
>>>
>>> Honestly, I don't see the point.
>>>
>>> If you can reliably trigger the WARN_ON some way, you can probably
>>> cause more problems by fooling some other process to trigger it.
>>>
>>> And if it's unintentional, then what does the signal help?
>>>
>>> So rather than a "rationale" that makes little sense, I'd like to hear
>>> of an actual _use_ case. That's different. That's somebody actually
>>> _using_ that pkill to good effect for some particular load.
>>
>> I was thinking about a use case for you and got an insight.
>>
>> Bugs usually don't come alone. Killing the process that got WARN_ON() prevents
>> possible bad effects **after** the warning. For example, in my exploit for
>> CVE-2019-18683, the kernel warning happens **before** the memory corruption
>> (use-after-free in the V4L2 subsystem).
>> https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
>>
>> So pkill_on_warn allows the kernel to stop the process when the first signs of
>> wrong behavior are detected. In other words, proceeding with the code execution
>> from the wrong state can bring more disasters later.
>>
>>> That said, I don't much care in the end. But it sounds like a
>>> pointless option to just introduce yet another behavior to something
>>> that should never happen anyway, and where the actual
>>> honest-to-goodness reason for WARN_ON() existing is already being
>>> fulfilled (ie syzbot has been very effective at flushing things like
>>> that out).
>>
>> Yes, we slowly get rid of kernel warnings.
>> However, the syzbot dashboard still shows a lot of them.
>> Even my small syzkaller setup finds plenty of new warnings.
>> I believe fixing all of them will take some time.
>> And during that time, pkill_on_warn may be a better reaction to WARN_ON() than
>> ignoring and proceeding with the execution.
>>
>> Is that reasonable?
>
> I won't comment on the sanity of the feature but I will say that calling
> it oops_on_warn (rather than pkill_on_warn), and using the usual oops
> facilities rather than rolling oops by hand sounds like a better
> implementation.
>
> Especially as calling do_group_exit(SIGKILL) from a random location is
> not a clean way to kill a process. Strictly speaking it is not even
> killing the process.
>
> Partly this is just me seeing the introduction of a
> do_group_exit(SIGKILL) call and not likely the maintenance that will be
> needed. I am still sorting out the problems with other randomly placed
> calls to do_group_exit(SIGKILL) and interactions with ptrace and
> PTRACE_EVENT_EXIT in particular.
>
> Which is a long winded way of saying if I can predictably trigger a
> warning that calls do_group_exit(SIGKILL), on some architectures I can
> use ptrace and can convert that warning into a way to manipulate the
> kernel stack to have the contents of my choice.
>
> If anyone goes forward with this please use the existing oops
> infrastructure so the ptrace interactions and anything else that comes
> up only needs to be fixed once.
Eric, thanks a lot.
I will learn the oops infrastructure deeper.
I will do more experiments and come with version 2.
Currently, I think I will save the pkill_on_warn option name because I want to
avoid kernel crashes.
Thanks to everyone who gave feedback on this patch!
Best regards,
Alexander
On 05.10.2021 22:48, Eric W. Biederman wrote:
> Especially as calling do_group_exit(SIGKILL) from a random location is
> not a clean way to kill a process. Strictly speaking it is not even
> killing the process.
>
> Partly this is just me seeing the introduction of a
> do_group_exit(SIGKILL) call and not likely the maintenance that will be
> needed. I am still sorting out the problems with other randomly placed
> calls to do_group_exit(SIGKILL) and interactions with ptrace and
> PTRACE_EVENT_EXIT in particular.
>
> Which is a long winded way of saying if I can predictably trigger a
> warning that calls do_group_exit(SIGKILL), on some architectures I can
> use ptrace and can convert that warning into a way to manipulate the
> kernel stack to have the contents of my choice.
>
> If anyone goes forward with this please use the existing oops
> infrastructure so the ptrace interactions and anything else that comes
> up only needs to be fixed once.
Hello Eric, hello everyone.
I learned the oops infrastructure and see that it's arch-specific.
The architectures have separate implementations of the die() function with
different prototypes. I don't see how to use the oops infrastructure for killing
all threads in a process that hits a kernel warning.
What do you think about doing the same as the oom_killer (and some other
subsystems)? It kills all threads in a process this way:
do_send_sig_info(SIGKILL, SEND_SIG_PRIV, current, PIDTYPE_TGID).
The oom_killer also shows a nice way to avoid killing init and kthreads:
static bool oom_unkillable_task(struct task_struct *p)
{
if (is_global_init(p))
return true;
if (p->flags & PF_KTHREAD)
return true;
return false;
}
I want to do something similar.
I would appreciate your comments.
Best regards,
Alexander
On 01.10.2021 22:59, Linus Torvalds wrote:
> On Thu, Sep 30, 2021 at 2:15 AM Petr Mladek <[email protected]> wrote:
>>
>> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
>> work as expected. I wonder who uses it in practice and what is
>> the experience.
>
> Afaik, there are only two valid uses for panic-on-warn:
>
> (a) test boxes (particularly VM's) that are literally running things
> like syzbot and want to report any kernel warnings
>
> (b) the "interchangeable production machinery" fail-fast kind of situation
>
> So in that (a) case, it's literally that you consider a warning to be
> a failure case, and just want to stop. Very useful as a way to get
> notified by syzbot that "oh, that assert can actually trigger".
>
> And the (b) case is more of a "we have 150 million machines, we expect
> about a thousand of them to fail for any random reason any day
> _anyway_ - perhaps simply due to hardware failure, and we'd rather
> take a machine down quickly and then perhaps look at why only much
> later when we have some pattern to the failures".
>
> You shouldn't expect panic-on-warn to ever be the case for any actual
> production machine that _matters_. If it is, that production
> maintainer only has themselves to blame if they set that flag.
>
> But yes, the expectation is that warnings are for "this can't happen,
> but if it does, it's not necessarily fatal, I want to know about it so
> that I can think about it".
>
> So it might be a case that you don't handle, but that isn't
> necessarily _wrong_ to not handle. You are ok returning an error like
> -ENOSYS for that case, for example, but at the same time you are "If
> somebody uses this, we should perhaps react to it".
>
> In many cases, a "pr_warn()" is much better. But if you are unsure
> just _how_ the situation can happen, and want a call trace and
> information about what process did it, and it really is a "this
> shouldn't ever happen" situation, a WARN_ON() or a WARN_ON_ONCE() is
> certainly not wrong.
>
> So think of WARN_ON() as basically an assert, but an assert with the
> intention to be able to continue so that the assert can actually be
> reported. BUG_ON() and friends easily result in a machine that is
> dead. That's unacceptable.
Hi Linus,
Coming back to the discussion of WARN_ON()/pr_warn("WARNING:") semantics.
We see a number of cases where WARNING is used to inform userspace that
it is doing something wrong, e.g.
https://elixir.bootlin.com/linux/v5.19-rc8/source/net/can/j1939/socket.c#L181
https://elixir.bootlin.com/linux/v5.19-rc8/source/drivers/video/fbdev/core/fbmem.c#L1023
It is definitely useful, but it does not make sense in case of fuzzing
when the userspace should do wrong things and check if kernel behaves
correctly.
As a result we have warnings with two different intentions:
- warn that something wrong happens in kernel, but we are able to continue;
- warn userspace that it is doing something wrong.
During fuzzing we would like to report the former and to ignore the
latter. Are any ideas how these intentions can be recognized automatically?
Best regards,
Alexey
On 27.07.2022 19:30, Jann Horn wrote:
> On Wed, Jul 27, 2022 at 6:17 PM Alexey Khoroshilov
> <[email protected]> wrote:
>> On 01.10.2021 22:59, Linus Torvalds wrote:
>> Coming back to the discussion of WARN_ON()/pr_warn("WARNING:") semantics.
>>
>> We see a number of cases where WARNING is used to inform userspace that
>> it is doing something wrong, e.g.
>> https://elixir.bootlin.com/linux/v5.19-rc8/source/net/can/j1939/socket.c#L181
>> https://elixir.bootlin.com/linux/v5.19-rc8/source/drivers/video/fbdev/core/fbmem.c#L1023
>>
>> It is definitely useful, but it does not make sense in case of fuzzing
>> when the userspace should do wrong things and check if kernel behaves
>> correctly.
>>
>> As a result we have warnings with two different intentions:
>> - warn that something wrong happens in kernel, but we are able to continue;
>> - warn userspace that it is doing something wrong.
>>
>> During fuzzing we would like to report the former and to ignore the
>> latter. Are any ideas how these intentions can be recognized automatically?
>
> https://elixir.bootlin.com/linux/v5.19-rc8/source/include/asm-generic/bug.h#L74
> says:
>
> * WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
> * significant kernel issues that need prompt attention if they should ever
> * appear at runtime.
> *
> * Do not use these macros when checking for invalid external inputs
> * (e.g. invalid system call arguments, or invalid data coming from
> * network/devices), and on transient conditions like ENOMEM or EAGAIN.
> * These macros should be used for recoverable kernel issues only.
> * For invalid external inputs, transient conditions, etc use
> * pr_err[_once/_ratelimited]() followed by dump_stack(), if necessary.
> * Do not include "BUG"/"WARNING" in format strings manually to make these
> * conditions distinguishable from kernel issues.
>
> So if you see drivers intentionally using WARN() or printing
> "WARNING:" on codepaths that are reachable with bogus inputs from
> userspace, those codepaths should be fixed to log warnings in a
> different format.
Thank you, Jann!
I have missed that.
--
Alexey
On Wed, Jul 27, 2022 at 9:17 AM Alexey Khoroshilov
<[email protected]> wrote:
>
> We see a number of cases where WARNING is used to inform userspace that
> it is doing something wrong, e.g.
> https://elixir.bootlin.com/linux/v5.19-rc8/source/net/can/j1939/socket.c#L181
> https://elixir.bootlin.com/linux/v5.19-rc8/source/drivers/video/fbdev/core/fbmem.c#L1023
That first case is entirely bogus.
WARN_ON() should only be used for "This cannot happen, but if it does,
I want to know how we got here".
But the second case is fine: Using "pr_warn()" is fine. A kernel
warning (without a backtrace) is a normal thing for something that is
deprecated or questionable, and you want to tell the user that "this
app is doing something wrong".
So if that j1939 thing is something that can be triggered by a user,
then the backtrace should be reported to the driver maintainer, and
then either
(a) the WARN_ON_ONCE() should just be removed ("ok, this can happen,
we understand why it can happen, and it's fine")
(b) the problem the WARN_ON_ONCE() reports about should be made
impossible some way
(c) it might be downgraded to a pr_warn() if people really want to
tell user space that "guys, you're doing something wrong" and it's
considered a useful warning.
Honestly, for something like that j1939 can driver, I doubt (c) is
ever an option. The "return -EBUSY" is the only real information that
a user needs.
Linus
On Wed, Jul 27, 2022 at 6:17 PM Alexey Khoroshilov
<[email protected]> wrote:
> On 01.10.2021 22:59, Linus Torvalds wrote:
> Coming back to the discussion of WARN_ON()/pr_warn("WARNING:") semantics.
>
> We see a number of cases where WARNING is used to inform userspace that
> it is doing something wrong, e.g.
> https://elixir.bootlin.com/linux/v5.19-rc8/source/net/can/j1939/socket.c#L181
> https://elixir.bootlin.com/linux/v5.19-rc8/source/drivers/video/fbdev/core/fbmem.c#L1023
>
> It is definitely useful, but it does not make sense in case of fuzzing
> when the userspace should do wrong things and check if kernel behaves
> correctly.
>
> As a result we have warnings with two different intentions:
> - warn that something wrong happens in kernel, but we are able to continue;
> - warn userspace that it is doing something wrong.
>
> During fuzzing we would like to report the former and to ignore the
> latter. Are any ideas how these intentions can be recognized automatically?
https://elixir.bootlin.com/linux/v5.19-rc8/source/include/asm-generic/bug.h#L74
says:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
* significant kernel issues that need prompt attention if they should ever
* appear at runtime.
*
* Do not use these macros when checking for invalid external inputs
* (e.g. invalid system call arguments, or invalid data coming from
* network/devices), and on transient conditions like ENOMEM or EAGAIN.
* These macros should be used for recoverable kernel issues only.
* For invalid external inputs, transient conditions, etc use
* pr_err[_once/_ratelimited]() followed by dump_stack(), if necessary.
* Do not include "BUG"/"WARNING" in format strings manually to make these
* conditions distinguishable from kernel issues.
So if you see drivers intentionally using WARN() or printing
"WARNING:" on codepaths that are reachable with bogus inputs from
userspace, those codepaths should be fixed to log warnings in a
different format.
On 27.07.2022 19:42, Linus Torvalds wrote:
> On Wed, Jul 27, 2022 at 9:17 AM Alexey Khoroshilov
> <[email protected]> wrote:
>>
>> We see a number of cases where WARNING is used to inform userspace that
>> it is doing something wrong, e.g.
>> https://elixir.bootlin.com/linux/v5.19-rc8/source/net/can/j1939/socket.c#L181
>> https://elixir.bootlin.com/linux/v5.19-rc8/source/drivers/video/fbdev/core/fbmem.c#L1023
>
> That first case is entirely bogus.
>
> WARN_ON() should only be used for "This cannot happen, but if it does,
> I want to know how we got here".
>
> But the second case is fine: Using "pr_warn()" is fine. A kernel
> warning (without a backtrace) is a normal thing for something that is
> deprecated or questionable, and you want to tell the user that "this
> app is doing something wrong".
Agree with the only note that I like the requirement:
* Do not include "BUG"/"WARNING" in format strings manually to make
* these conditions distinguishable from kernel issues.
very much.
Thank you,
Alexey