[kees: re-sending this series on behalf of John Wood <[email protected]>
also visible at https://github.com/johwood/linux fbfam]
From: John Wood <[email protected]>
The goal of this patch serie is to detect and mitigate a fork brute force
attack.
Attacks with the purpose to break ASLR or bypass canaries traditionaly use
some level of brute force with the help of the fork system call. This is
possible since when creating a new process using fork its memory contents
are the same as those of the parent process (the process that called the
fork system call). So, the attacker can test the memory infinite times to
find the correct memory values or the correct memory addresses without
worrying about crashing the application.
Based on the above scenario it would be nice to have this detected and
mitigated, and this is the goal of this implementation.
Other implementations
---------------------
The public version of grsecurity, as a summary, is based on the idea of
delay the fork system call if a child died due to a fatal error. This has
some issues:
1.- Bad practices: Add delays to the kernel is, in general, a bad idea.
2.- Weak points: This protection can be bypassed using two different
methods since it acts only when the fork is called after a child has
crashed.
2.1.- Bypass 1: So, it would still be possible for an attacker to fork
a big amount of children (in the order of thousands), then probe
all of them, and finally wait the protection time before repeat
the steps.
2.2.- Bypass 2: This method is based on the idea that the protection
doesn't act if the parent crashes. So, it would still be possible
for an attacker to fork a process and probe itself. Then, fork
the child process and probe itself again. This way, these steps
can be repeated infinite times without any mitigation.
This implementation
-------------------
The main idea behind this implementation is to improve the existing ones
focusing on the weak points annotated before. So, the solution for the
first bypass method is to detect a fast crash rate instead of only one
simple crash. For the second bypass method the solution is to detect both
the crash of parent and child processes. Moreover, as a mitigation method
it is better to kill all the offending tasks involve in the attack instead
of use delays.
So, the solution to the two bypass methods previously commented is to use
some statistical data shared across all the processes that can have the
same memory contents. Or in other words, a statistical data shared between
all the processes that fork the task 0, and all the processes that fork
after an execve system call.
These statistics hold the timestamp for the first fork (case of a fork of
task zero) or the timestamp for the execve system call (the other case).
Also, hold the number of faults of all the tasks that share the same
statistical data since the commented timestamp.
With this information it is possible to detect a brute force attack when a
task die in a fatal way computing the crashing rate. This rate shows the
milliseconds per fault and when it goes under a certain threshold there is
a clear signal that something malicious is happening.
Once detected, the mitigation only kills the processes that share the same
statistical data and so, all the tasks that can have the same memory
contents. This way, an attack is rejected.
The fbfam feature can be enabled, disabled and tuned as follows:
1.- Per system enabling: This feature can be enabled in build time using
the config application under:
Security options ---> Fork brute force attack mitigation
2.- Per process enabling/disabling: To allow that specific applications can
turn off or turn on the detection and mitigation of a fork brute force
attack when required, there are two new prctls.
prctl(PR_FBFAM_ENABLE, 0, 0, 0, 0) -> To enable the feature
prctl(PR_FBFAM_DISABLE, 0, 0, 0, 0) -> To disable the feature
Both functions return zero on success and -EFAULT if the current task
doesn't have statistical data.
3.- Fine tuning: To customize the detection's sensibility there is a new
sysctl that allows to set the crashing rate threshold. It is accessible
through the file:
/proc/sys/kernel/fbfam/crashing_rate_threshold
The units are in milliseconds per fault and the attack's mitigation is
triggered if the crashing rate of an application goes under this
threshold. So, the higher this value, the faster an attack will be
detected.
So, knowing all this information I will explain now the different patches:
The 1/9 patch adds a new config for the fbfam feature.
The 2/9 and 3/9 patches add and use the api to manage the statistical data
necessary to compute the crashing rate of an application.
The 4/9 patch adds a new sysctl to fine tuning the detection's sensibility.
The 5/9 patch detects a fork brute force attack calculating the crashing
rate.
The 6/9 patch mitigates the attack killing all the offending tasks.
The 7/9 patch adds two new prctls to allow per task enabling/disabling.
The 8/9 patch adds general documentation.
The 9/9 patch adds an entry to the maintainers list.
This patch series is a task of the KSPP [1] and it is worth to mention
that there is a previous attempt without any continuation [2].
[1] https://github.com/KSPP/linux/issues/39
[2] https://lore.kernel.org/linux-fsdevel/[email protected]/
Any constructive comments are welcome.
Note: During the compilation these warnings were shown:
kernel/exit.o: warning: objtool: __x64_sys_exit_group()+0x18: unreachable instruction
arch/x86/kernel/cpu/mce/core.o: warning: objtool: mce_panic()+0x123: unreachable instruction
arch/x86/kernel/smpboot.o: warning: objtool: native_play_dead()+0x122: unreachable instruction
net/core/skbuff.o: warning: objtool: skb_push.cold()+0x14: unreachable instruction
John Wood (6):
security/fbfam: Add a Kconfig to enable the fbfam feature
security/fbfam: Add the api to manage statistics
security/fbfam: Use the api to manage statistics
security/fbfam: Add a new sysctl to control the crashing rate
threshold
security/fbfam: Detect a fork brute force attack
security/fbfam: Mitigate a fork brute force attack
fs/coredump.c | 2 +
fs/exec.c | 2 +
include/fbfam/fbfam.h | 24 ++++
include/linux/sched.h | 4 +
kernel/exit.c | 2 +
kernel/fork.c | 4 +
kernel/sysctl.c | 9 ++
security/Kconfig | 1 +
security/Makefile | 4 +
security/fbfam/Kconfig | 10 ++
security/fbfam/Makefile | 3 +
security/fbfam/fbfam.c | 279 ++++++++++++++++++++++++++++++++++++++++
security/fbfam/sysctl.c | 20 +++
13 files changed, 364 insertions(+)
create mode 100644 include/fbfam/fbfam.h
create mode 100644 security/fbfam/Kconfig
create mode 100644 security/fbfam/Makefile
create mode 100644 security/fbfam/fbfam.c
create mode 100644 security/fbfam/sysctl.c
--
2.25.1
From: John Wood <[email protected]>
To detect a fork brute force attack it is necessary to compute the
crashing rate of the application. This calculation is performed in each
fatal fail of a task, or in other words, when a core dump is triggered.
If this rate shows that the application is crashing quickly, there is a
clear signal that an attack is happening.
Since the crashing rate is computed in milliseconds per fault, if this
rate goes under a certain threshold a warning is triggered.
Signed-off-by: John Wood <[email protected]>
---
fs/coredump.c | 2 ++
include/fbfam/fbfam.h | 2 ++
security/fbfam/fbfam.c | 39 +++++++++++++++++++++++++++++++++++++++
3 files changed, 43 insertions(+)
diff --git a/fs/coredump.c b/fs/coredump.c
index 76e7c10edfc0..d4ba4e1828d5 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -51,6 +51,7 @@
#include "internal.h"
#include <trace/events/sched.h>
+#include <fbfam/fbfam.h>
int core_uses_pid;
unsigned int core_pipe_limit;
@@ -825,6 +826,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
fail_creds:
put_cred(cred);
fail:
+ fbfam_handle_attack(siginfo->si_signo);
return;
}
diff --git a/include/fbfam/fbfam.h b/include/fbfam/fbfam.h
index 2cfe51d2b0d5..9ac8e33d8291 100644
--- a/include/fbfam/fbfam.h
+++ b/include/fbfam/fbfam.h
@@ -12,10 +12,12 @@ extern struct ctl_table fbfam_sysctls[];
int fbfam_fork(struct task_struct *child);
int fbfam_execve(void);
int fbfam_exit(void);
+int fbfam_handle_attack(int signal);
#else
static inline int fbfam_fork(struct task_struct *child) { return 0; }
static inline int fbfam_execve(void) { return 0; }
static inline int fbfam_exit(void) { return 0; }
+static inline int fbfam_handle_attack(int signal) { return 0; }
#endif
#endif /* _FBFAM_H_ */
diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
index 9be4639b72eb..3aa669e4ea51 100644
--- a/security/fbfam/fbfam.c
+++ b/security/fbfam/fbfam.c
@@ -4,7 +4,9 @@
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/jiffies.h>
+#include <linux/printk.h>
#include <linux/refcount.h>
+#include <linux/signal.h>
#include <linux/slab.h>
/**
@@ -172,3 +174,40 @@ int fbfam_exit(void)
return 0;
}
+/**
+ * fbfam_handle_attack() - Fork brute force attack detection.
+ * @signal: Signal number that causes the core dump.
+ *
+ * The crashing rate of an application is computed in milliseconds per fault in
+ * each crash. So, if this rate goes under a certain threshold there is a clear
+ * signal that the application is crashing quickly. At this moment, a fork brute
+ * force attack is happening.
+ *
+ * Return: -EFAULT if the current task doesn't have statistical data. Zero
+ * otherwise.
+ */
+int fbfam_handle_attack(int signal)
+{
+ struct fbfam_stats *stats = current->fbfam_stats;
+ u64 delta_jiffies, delta_time;
+ u64 crashing_rate;
+
+ if (!stats)
+ return -EFAULT;
+
+ if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
+ signal == SIGSEGV || signal == SIGSYS))
+ return 0;
+
+ stats->faults += 1;
+
+ delta_jiffies = get_jiffies_64() - stats->jiffies;
+ delta_time = jiffies64_to_msecs(delta_jiffies);
+ crashing_rate = delta_time / (u64)stats->faults;
+
+ if (crashing_rate < (u64)sysctl_crashing_rate_threshold)
+ pr_warn("fbfam: Fork brute force attack detected\n");
+
+ return 0;
+}
+
--
2.25.1
From: John Wood <[email protected]>
Create a statistical data structure to hold all the necessary
information involve in a fork brute force attack. This info is a
timestamp for the first fork or execve and the number of crashes since
then. Moreover, due to this statitistical data will be shared between
different tasks, a reference counter it is necessary.
For a correct management of an attack it is also necessary that all the
tasks hold statistical data. The same statistical data needs to be
shared between all the tasks that hold the same memory contents or in
other words, between all the tasks that have been forked without any
execve call.
When a forked task calls the execve system call, the memory contents are
set with new values. So, in this scenario the parent's statistical data
no need to be share. Instead, a new statistical data structure must be
allocated to start a new cycle.
The statistical data that every task holds needs to be clear when a task
exits. Due to this data is shared across multiples tasks, the reference
counter is useful to free the previous allocated data only when there
are not other pointers to the same data. Or in other words, when the
reference counter reaches zero.
So, based in all the previous information add the api to manage all the
commented cases.
Also, add to the struct task_struct a new field to point to the
statitistical data related to an attack. This way, all the tasks will
have access to this information.
Signed-off-by: John Wood <[email protected]>
---
include/fbfam/fbfam.h | 18 +++++
include/linux/sched.h | 4 +
security/Makefile | 4 +
security/fbfam/Makefile | 2 +
security/fbfam/fbfam.c | 163 ++++++++++++++++++++++++++++++++++++++++
5 files changed, 191 insertions(+)
create mode 100644 include/fbfam/fbfam.h
create mode 100644 security/fbfam/Makefile
create mode 100644 security/fbfam/fbfam.c
diff --git a/include/fbfam/fbfam.h b/include/fbfam/fbfam.h
new file mode 100644
index 000000000000..b5b7d1127a52
--- /dev/null
+++ b/include/fbfam/fbfam.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _FBFAM_H_
+#define _FBFAM_H_
+
+#include <linux/sched.h>
+
+#ifdef CONFIG_FBFAM
+int fbfam_fork(struct task_struct *child);
+int fbfam_execve(void);
+int fbfam_exit(void);
+#else
+static inline int fbfam_fork(struct task_struct *child) { return 0; }
+static inline int fbfam_execve(void) { return 0; }
+static inline int fbfam_exit(void) { return 0; }
+#endif
+
+#endif /* _FBFAM_H_ */
+
diff --git a/include/linux/sched.h b/include/linux/sched.h
index afe01e232935..00e1aa5e00cd 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1315,6 +1315,10 @@ struct task_struct {
struct callback_head mce_kill_me;
#endif
+#ifdef CONFIG_FBFAM
+ struct fbfam_stats *fbfam_stats;
+#endif
+
/*
* New fields for task_struct should be added above here, so that
* they are included in the randomized portion of task_struct.
diff --git a/security/Makefile b/security/Makefile
index 3baf435de541..36dc4b536349 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -36,3 +36,7 @@ obj-$(CONFIG_BPF_LSM) += bpf/
# Object integrity file lists
subdir-$(CONFIG_INTEGRITY) += integrity
obj-$(CONFIG_INTEGRITY) += integrity/
+
+# Object fbfam file lists
+subdir-$(CONFIG_FBFAM) += fbfam
+obj-$(CONFIG_FBFAM) += fbfam/
diff --git a/security/fbfam/Makefile b/security/fbfam/Makefile
new file mode 100644
index 000000000000..f4b9f0b19c44
--- /dev/null
+++ b/security/fbfam/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_FBFAM) += fbfam.o
diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
new file mode 100644
index 000000000000..0387f95f6408
--- /dev/null
+++ b/security/fbfam/fbfam.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <asm/current.h>
+#include <fbfam/fbfam.h>
+#include <linux/errno.h>
+#include <linux/gfp.h>
+#include <linux/jiffies.h>
+#include <linux/refcount.h>
+#include <linux/slab.h>
+
+/**
+ * struct fbfam_stats - Fork brute force attack mitigation statistics.
+ * @refc: Reference counter.
+ * @faults: Number of crashes since jiffies.
+ * @jiffies: First fork or execve timestamp.
+ *
+ * The purpose of this structure is to manage all the necessary information to
+ * compute the crashing rate of an application. So, it holds a first fork or
+ * execve timestamp and a number of crashes since then. This way the crashing
+ * rate in milliseconds per fault can be compute when necessary with the
+ * following formula:
+ *
+ * u64 delta_jiffies = get_jiffies_64() - fbfam_stats::jiffies;
+ * u64 delta_time = jiffies64_to_msecs(delta_jiffies);
+ * u64 crashing_rate = delta_time / (u64)fbfam_stats::faults;
+ *
+ * If the fbfam_stats::faults is zero, the above formula can't be used. In this
+ * case, the crashing rate is zero.
+ *
+ * Moreover, since the same allocated structure will be used in every fork
+ * since the first one or execve, it's also necessary a reference counter.
+ */
+struct fbfam_stats {
+ refcount_t refc;
+ unsigned int faults;
+ u64 jiffies;
+};
+
+/**
+ * fbfam_new_stats() - Allocation of new statistics structure.
+ *
+ * If the allocation is successful the reference counter is set to one to
+ * indicate that there will be one task that points to this structure. The
+ * faults field is initialize to zero and the timestamp for this moment is set.
+ *
+ * Return: NULL if the allocation fails. A pointer to the new allocated
+ * statistics structure if it success.
+ */
+static struct fbfam_stats *fbfam_new_stats(void)
+{
+ struct fbfam_stats *stats = kmalloc(sizeof(struct fbfam_stats),
+ GFP_KERNEL);
+
+ if (stats) {
+ refcount_set(&stats->refc, 1);
+ stats->faults = 0;
+ stats->jiffies = get_jiffies_64();
+ }
+
+ return stats;
+}
+
+/*
+ * fbfam_fork() - Fork management.
+ * @child: Pointer to the child task that will be created with the fork system
+ * call.
+ *
+ * For a correct management of a fork brute force attack it is necessary that
+ * all the tasks hold statistical data. The same statistical data needs to be
+ * shared between all the tasks that hold the same memory contents or in other
+ * words, between all the tasks that have been forked without any execve call.
+ *
+ * To ensure this, if the current task doesn't have statistical data when forks
+ * (only possible in the first fork of the zero task), it is mandatory to
+ * allocate a new one. This way, the child task always will share the statistics
+ * with its parent.
+ *
+ * Return: -ENOMEN if the allocation of the new statistics structure fails.
+ * Zero otherwise.
+ */
+int fbfam_fork(struct task_struct *child)
+{
+ struct fbfam_stats **stats = ¤t->fbfam_stats;
+
+ if (!*stats) {
+ *stats = fbfam_new_stats();
+ if (!*stats)
+ return -ENOMEM;
+ }
+
+ refcount_inc(&(*stats)->refc);
+ child->fbfam_stats = *stats;
+ return 0;
+}
+
+/**
+ * fbfam_execve() - Execve management.
+ *
+ * When a forked task calls the execve system call, the memory contents are set
+ * with new values. So, in this scenario the parent's statistical data no need
+ * to be share. Instead, a new statistical data structure must be allocated to
+ * start a new cycle. This condition is detected when the statistics reference
+ * counter holds a value greater than or equal to two (a fork always sets the
+ * statistics reference counter to two since the parent and the child task are
+ * sharing the same data).
+ *
+ * However, if the execve function is called immediately after another execve
+ * call, althought the memory contents are reset, there is no need to allocate
+ * a new statistical data structure. This is possible because at this moment
+ * only one task (the task that calls the execve function) points to the data.
+ * In this case, the previous allocation is used and only the faults and time
+ * fields are reset.
+ *
+ * Return: -ENOMEN if the allocation of the new statistics structure fails.
+ * -EFAULT if the current task doesn't have statistical data. Zero
+ * otherwise.
+ */
+int fbfam_execve(void)
+{
+ struct fbfam_stats **stats = ¤t->fbfam_stats;
+
+ if (!*stats)
+ return -EFAULT;
+
+ if (!refcount_dec_not_one(&(*stats)->refc)) {
+ /* execve call after an execve call */
+ (*stats)->faults = 0;
+ (*stats)->jiffies = get_jiffies_64();
+ return 0;
+ }
+
+ /* execve call after a fork call */
+ *stats = fbfam_new_stats();
+ if (!*stats)
+ return -ENOMEM;
+
+ return 0;
+}
+
+/**
+ * fbfam_exit() - Exit management.
+ *
+ * The statistical data that every task holds needs to be clear when a task
+ * exits. Due to this data is shared across multiples tasks, the reference
+ * counter is useful to free the previous allocated data only when there are
+ * not other pointers to the same data. Or in other words, when the reference
+ * counter reaches zero.
+ *
+ * Return: -EFAULT if the current task doesn't have statistical data. Zero
+ * otherwise.
+ */
+int fbfam_exit(void)
+{
+ struct fbfam_stats *stats = current->fbfam_stats;
+
+ if (!stats)
+ return -EFAULT;
+
+ if (refcount_dec_and_test(&stats->refc))
+ kfree(stats);
+
+ return 0;
+}
+
--
2.25.1
From: John Wood <[email protected]>
In order to mitigate a fork brute force attack it is necessary to kill
all the offending tasks. This tasks are all the ones that share the
statistical data with the current task (the task that has crashed).
Since the attack detection is done in the function fbfam_handle_attack()
that is called every time a core dump is triggered, only is needed to
kill the others tasks that share the same statistical data, not the
current one as this is in the path to be killed.
When the SIGKILL signal is sent to the offending tasks from the function
fbfam_kill_tasks(), this one will be called again during the core dump
due to the shared statistical data shows a quickly crashing rate. So, to
avoid kill again the same tasks due to a recursive call of this
function, it is necessary to disable the attack detection.
To disable this attack detection, add a condition in the function
fbfam_handle_attack() to not compute the crashing rate when the jiffies
stored in the statistical data are set to zero.
Signed-off-by: John Wood <[email protected]>
---
security/fbfam/fbfam.c | 76 +++++++++++++++++++++++++++++++++++++++---
1 file changed, 71 insertions(+), 5 deletions(-)
diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
index 3aa669e4ea51..173a6122390f 100644
--- a/security/fbfam/fbfam.c
+++ b/security/fbfam/fbfam.c
@@ -4,8 +4,11 @@
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/jiffies.h>
+#include <linux/pid.h>
#include <linux/printk.h>
+#include <linux/rcupdate.h>
#include <linux/refcount.h>
+#include <linux/sched/signal.h>
#include <linux/signal.h>
#include <linux/slab.h>
@@ -24,7 +27,8 @@ unsigned long sysctl_crashing_rate_threshold = 30000;
* struct fbfam_stats - Fork brute force attack mitigation statistics.
* @refc: Reference counter.
* @faults: Number of crashes since jiffies.
- * @jiffies: First fork or execve timestamp.
+ * @jiffies: First fork or execve timestamp. If zero, the attack detection is
+ * disabled.
*
* The purpose of this structure is to manage all the necessary information to
* compute the crashing rate of an application. So, it holds a first fork or
@@ -175,13 +179,69 @@ int fbfam_exit(void)
}
/**
- * fbfam_handle_attack() - Fork brute force attack detection.
+ * fbfam_kill_tasks() - Kill the offending tasks
+ *
+ * When a fork brute force attack is detected it is necessary to kill all the
+ * offending tasks. Since this function is called from fbfam_handle_attack(),
+ * and so, every time a core dump is triggered, only is needed to kill the
+ * others tasks that share the same statistical data, not the current one as
+ * this is in the path to be killed.
+ *
+ * When the SIGKILL signal is sent to the offending tasks, this function will be
+ * called again during the core dump due to the shared statistical data shows a
+ * quickly crashing rate. So, to avoid kill again the same tasks due to a
+ * recursive call of this function, it is necessary to disable the attack
+ * detection setting the jiffies to zero.
+ *
+ * To improve the for_each_process loop it is possible to end it when all the
+ * tasks that shared the same statistics are found.
+ *
+ * Return: -EFAULT if the current task doesn't have statistical data. Zero
+ * otherwise.
+ */
+static int fbfam_kill_tasks(void)
+{
+ struct fbfam_stats *stats = current->fbfam_stats;
+ struct task_struct *p;
+ unsigned int to_kill, killed = 0;
+
+ if (!stats)
+ return -EFAULT;
+
+ to_kill = refcount_read(&stats->refc) - 1;
+ if (!to_kill)
+ return 0;
+
+ /* Disable the attack detection */
+ stats->jiffies = 0;
+ rcu_read_lock();
+
+ for_each_process(p) {
+ if (p == current || p->fbfam_stats != stats)
+ continue;
+
+ do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
+ pr_warn("fbfam: Offending process with PID %d killed\n",
+ p->pid);
+
+ killed += 1;
+ if (killed >= to_kill)
+ break;
+ }
+
+ rcu_read_unlock();
+ return 0;
+}
+
+/**
+ * fbfam_handle_attack() - Fork brute force attack detection and mitigation.
* @signal: Signal number that causes the core dump.
*
* The crashing rate of an application is computed in milliseconds per fault in
* each crash. So, if this rate goes under a certain threshold there is a clear
* signal that the application is crashing quickly. At this moment, a fork brute
- * force attack is happening.
+ * force attack is happening. Under this scenario it is necessary to kill all
+ * the offending tasks in order to mitigate the attack.
*
* Return: -EFAULT if the current task doesn't have statistical data. Zero
* otherwise.
@@ -195,6 +255,10 @@ int fbfam_handle_attack(int signal)
if (!stats)
return -EFAULT;
+ /* The attack detection is disabled */
+ if (!stats->jiffies)
+ return 0;
+
if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
signal == SIGSEGV || signal == SIGSYS))
return 0;
@@ -205,9 +269,11 @@ int fbfam_handle_attack(int signal)
delta_time = jiffies64_to_msecs(delta_jiffies);
crashing_rate = delta_time / (u64)stats->faults;
- if (crashing_rate < (u64)sysctl_crashing_rate_threshold)
- pr_warn("fbfam: Fork brute force attack detected\n");
+ if (crashing_rate >= (u64)sysctl_crashing_rate_threshold)
+ return 0;
+ pr_warn("fbfam: Fork brute force attack detected\n");
+ fbfam_kill_tasks();
return 0;
}
--
2.25.1
From: John Wood <[email protected]>
This is a previous step to add the detection feature.
A fork brute force attack will be detected when an application crashes
quickly. Since, a rate can be defined as a time per fault, add a new
sysctl to control the crashing rate threshold.
This way, each system can tune the detection's sensibility adjusting the
milliseconds per fault. So, if the application's crashing rate falls
under this threshold an attack will be detected. So, the higher this
value, the faster an attack will be detected.
Signed-off-by: John Wood <[email protected]>
---
include/fbfam/fbfam.h | 4 ++++
kernel/sysctl.c | 9 +++++++++
security/fbfam/Makefile | 1 +
security/fbfam/fbfam.c | 11 +++++++++++
security/fbfam/sysctl.c | 20 ++++++++++++++++++++
5 files changed, 45 insertions(+)
create mode 100644 security/fbfam/sysctl.c
diff --git a/include/fbfam/fbfam.h b/include/fbfam/fbfam.h
index b5b7d1127a52..2cfe51d2b0d5 100644
--- a/include/fbfam/fbfam.h
+++ b/include/fbfam/fbfam.h
@@ -3,8 +3,12 @@
#define _FBFAM_H_
#include <linux/sched.h>
+#include <linux/sysctl.h>
#ifdef CONFIG_FBFAM
+#ifdef CONFIG_SYSCTL
+extern struct ctl_table fbfam_sysctls[];
+#endif
int fbfam_fork(struct task_struct *child);
int fbfam_execve(void);
int fbfam_exit(void);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 09e70ee2332e..c3b4d737bef3 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -77,6 +77,8 @@
#include <linux/uaccess.h>
#include <asm/processor.h>
+#include <fbfam/fbfam.h>
+
#ifdef CONFIG_X86
#include <asm/nmi.h>
#include <asm/stacktrace.h>
@@ -2660,6 +2662,13 @@ static struct ctl_table kern_table[] = {
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
},
+#endif
+#ifdef CONFIG_FBFAM
+ {
+ .procname = "fbfam",
+ .mode = 0555,
+ .child = fbfam_sysctls,
+ },
#endif
{ }
};
diff --git a/security/fbfam/Makefile b/security/fbfam/Makefile
index f4b9f0b19c44..b8d5751ecea4 100644
--- a/security/fbfam/Makefile
+++ b/security/fbfam/Makefile
@@ -1,2 +1,3 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_FBFAM) += fbfam.o
+obj-$(CONFIG_SYSCTL) += sysctl.o
diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
index 0387f95f6408..9be4639b72eb 100644
--- a/security/fbfam/fbfam.c
+++ b/security/fbfam/fbfam.c
@@ -7,6 +7,17 @@
#include <linux/refcount.h>
#include <linux/slab.h>
+/**
+ * sysctl_crashing_rate_threshold - Crashing rate threshold.
+ *
+ * The rate's units are in milliseconds per fault.
+ *
+ * A fork brute force attack will be detected if the application's crashing rate
+ * falls under this threshold. So, the higher this value, the faster an attack
+ * will be detected.
+ */
+unsigned long sysctl_crashing_rate_threshold = 30000;
+
/**
* struct fbfam_stats - Fork brute force attack mitigation statistics.
* @refc: Reference counter.
diff --git a/security/fbfam/sysctl.c b/security/fbfam/sysctl.c
new file mode 100644
index 000000000000..430323ad8e9f
--- /dev/null
+++ b/security/fbfam/sysctl.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/sysctl.h>
+
+extern unsigned long sysctl_crashing_rate_threshold;
+static unsigned long ulong_one = 1;
+static unsigned long ulong_max = ULONG_MAX;
+
+struct ctl_table fbfam_sysctls[] = {
+ {
+ .procname = "crashing_rate_threshold",
+ .data = &sysctl_crashing_rate_threshold,
+ .maxlen = sizeof(sysctl_crashing_rate_threshold),
+ .mode = 0644,
+ .proc_handler = proc_doulongvec_minmax,
+ .extra1 = &ulong_one,
+ .extra2 = &ulong_max,
+ },
+ { }
+};
+
--
2.25.1
On Thu, Sep 10, 2020 at 10:21 PM Kees Cook <[email protected]> wrote:
> [kees: re-sending this series on behalf of John Wood <[email protected]>
> also visible at https://github.com/johwood/linux fbfam]
[...]
> The goal of this patch serie is to detect and mitigate a fork brute force
> attack.
>
> Attacks with the purpose to break ASLR or bypass canaries traditionaly use
> some level of brute force with the help of the fork system call. This is
> possible since when creating a new process using fork its memory contents
> are the same as those of the parent process (the process that called the
> fork system call). So, the attacker can test the memory infinite times to
> find the correct memory values or the correct memory addresses without
> worrying about crashing the application.
For the next version of this patchset, you may want to clarify that
this is intended to stop brute force attacks *against vulnerable
userspace processes* that fork off worker processes. I was halfway
through the patch series before I realized that.
From: John Wood <[email protected]>
Add a menu entry under "Security options" to enable the "Fork brute
force attack mitigation" feature.
Signed-off-by: John Wood <[email protected]>
---
security/Kconfig | 1 +
security/fbfam/Kconfig | 10 ++++++++++
2 files changed, 11 insertions(+)
create mode 100644 security/fbfam/Kconfig
diff --git a/security/Kconfig b/security/Kconfig
index 7561f6f99f1d..00a90e25b8d5 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -290,6 +290,7 @@ config LSM
If unsure, leave this as the default.
source "security/Kconfig.hardening"
+source "security/fbfam/Kconfig"
endmenu
diff --git a/security/fbfam/Kconfig b/security/fbfam/Kconfig
new file mode 100644
index 000000000000..bbe7f6aad369
--- /dev/null
+++ b/security/fbfam/Kconfig
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0
+config FBFAM
+ bool "Fork brute force attack mitigation"
+ default n
+ help
+ This is a user defense that detects any fork brute force attack
+ based on the application's crashing rate. When this measure is
+ triggered the fork system call is blocked.
+
+ If you are unsure how to answer this question, answer N.
--
2.25.1
On Thu, Sep 10, 2020 at 10:22 PM Kees Cook <[email protected]> wrote:
> In order to mitigate a fork brute force attack it is necessary to kill
> all the offending tasks. This tasks are all the ones that share the
> statistical data with the current task (the task that has crashed).
>
> Since the attack detection is done in the function fbfam_handle_attack()
> that is called every time a core dump is triggered, only is needed to
> kill the others tasks that share the same statistical data, not the
> current one as this is in the path to be killed.
>
> When the SIGKILL signal is sent to the offending tasks from the function
> fbfam_kill_tasks(), this one will be called again during the core dump
> due to the shared statistical data shows a quickly crashing rate. So, to
> avoid kill again the same tasks due to a recursive call of this
> function, it is necessary to disable the attack detection.
>
> To disable this attack detection, add a condition in the function
> fbfam_handle_attack() to not compute the crashing rate when the jiffies
> stored in the statistical data are set to zero.
[...]
> /**
> - * fbfam_handle_attack() - Fork brute force attack detection.
> + * fbfam_kill_tasks() - Kill the offending tasks
> + *
> + * When a fork brute force attack is detected it is necessary to kill all the
> + * offending tasks. Since this function is called from fbfam_handle_attack(),
> + * and so, every time a core dump is triggered, only is needed to kill the
> + * others tasks that share the same statistical data, not the current one as
> + * this is in the path to be killed.
> + *
> + * When the SIGKILL signal is sent to the offending tasks, this function will be
> + * called again during the core dump due to the shared statistical data shows a
> + * quickly crashing rate. So, to avoid kill again the same tasks due to a
> + * recursive call of this function, it is necessary to disable the attack
> + * detection setting the jiffies to zero.
> + *
> + * To improve the for_each_process loop it is possible to end it when all the
> + * tasks that shared the same statistics are found.
This is not a fastpath, there's no need to be clever and optimize
things here, please get rid of that optimization. Especially since
that fastpath looks racy against concurrent execve().
> + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> + * otherwise.
> + */
> +static int fbfam_kill_tasks(void)
> +{
> + struct fbfam_stats *stats = current->fbfam_stats;
> + struct task_struct *p;
> + unsigned int to_kill, killed = 0;
> +
> + if (!stats)
> + return -EFAULT;
> +
> + to_kill = refcount_read(&stats->refc) - 1;
> + if (!to_kill)
> + return 0;
> +
> + /* Disable the attack detection */
> + stats->jiffies = 0;
> + rcu_read_lock();
> +
> + for_each_process(p) {
> + if (p == current || p->fbfam_stats != stats)
p->fbfam_stats could change concurrently, you should at least use
READ_ONCE() here.
Also, if this codepath is hit by a non-leader thread, "p == current"
will always be false, and you'll end up killing the caller, too. You
may want to compare with current->group_leader instead.
> + continue;
> +
> + do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> + pr_warn("fbfam: Offending process with PID %d killed\n",
> + p->pid);
Normally pr_*() messages about tasks mention not just the pid, but
also the ->comm name of the task.
> + killed += 1;
> + if (killed >= to_kill)
> + break;
> + }
> +
> + rcu_read_unlock();
> + return 0;
> +}
On Thu, Sep 10, 2020 at 10:22 PM Kees Cook <[email protected]> wrote:
> To detect a fork brute force attack it is necessary to compute the
> crashing rate of the application. This calculation is performed in each
> fatal fail of a task, or in other words, when a core dump is triggered.
> If this rate shows that the application is crashing quickly, there is a
> clear signal that an attack is happening.
>
> Since the crashing rate is computed in milliseconds per fault, if this
> rate goes under a certain threshold a warning is triggered.
[...]
> +/**
> + * fbfam_handle_attack() - Fork brute force attack detection.
> + * @signal: Signal number that causes the core dump.
> + *
> + * The crashing rate of an application is computed in milliseconds per fault in
> + * each crash. So, if this rate goes under a certain threshold there is a clear
> + * signal that the application is crashing quickly. At this moment, a fork brute
> + * force attack is happening.
> + *
> + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> + * otherwise.
> + */
> +int fbfam_handle_attack(int signal)
> +{
> + struct fbfam_stats *stats = current->fbfam_stats;
> + u64 delta_jiffies, delta_time;
> + u64 crashing_rate;
> +
> + if (!stats)
> + return -EFAULT;
> +
> + if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
> + signal == SIGSEGV || signal == SIGSYS))
> + return 0;
As far as I can tell, you can never get here with SIGKILL, since
SIGKILL doesn't trigger core dumping and also isn't used by seccomp?
> +
> + stats->faults += 1;
This is a data race. If you want to be able to increment a variable
that may be concurrently incremented by other tasks, use either
locking or the atomic_t helpers.
> + delta_jiffies = get_jiffies_64() - stats->jiffies;
> + delta_time = jiffies64_to_msecs(delta_jiffies);
> + crashing_rate = delta_time / (u64)stats->faults;
Do I see this correctly, is this computing the total runtime of this
process hierarchy divided by the total number of faults seen in this
process hierarchy? If so, you may want to reconsider whether that's
really the behavior you want. For example, if I configure the minimum
period between crashes to be 30s (as is the default in the sysctl
patch), and I try to attack a server that has been running without any
crashes for a month, I'd instantly be able to crash around
30*24*60*60/30 = 86400 times before the detection kicks in. That seems
suboptimal.
(By the way, it kind of annoys me that you call it the "rate" when
it's actually the inverse of the rate. "Period" might be more
appropriate?)
> + if (crashing_rate < (u64)sysctl_crashing_rate_threshold)
> + pr_warn("fbfam: Fork brute force attack detected\n");
> +
> + return 0;
> +}
> +
> --
> 2.25.1
>
On Thu, Sep 10, 2020 at 10:21 PM Kees Cook <[email protected]> wrote:
> From: John Wood <[email protected]>
>
> Add a menu entry under "Security options" to enable the "Fork brute
> force attack mitigation" feature.
[...]
> +config FBFAM
Please give this a more descriptive name than FBFAM. Some name where,
if a random kernel developer sees an "#ifdef" with that name in some
random piece of kernel code, they immediately have a rough idea for
what kind of feature this is.
Perhaps something like THROTTLE_FORK_CRASHES. Or something else that
is equally descriptive.
> + bool "Fork brute force attack mitigation"
> + default n
"default n" is superfluous and should AFAIK be omitted.
> + help
> + This is a user defense that detects any fork brute force attack
> + based on the application's crashing rate. When this measure is
> + triggered the fork system call is blocked.
This help text claims that the mitigation will block fork(), but patch
6/6 actually kills the process hierarchy.
On Thu, Sep 10, 2020 at 01:21:05PM -0700, Kees Cook wrote:
> From: John Wood <[email protected]>
>
> This is a previous step to add the detection feature.
>
> A fork brute force attack will be detected when an application crashes
> quickly. Since, a rate can be defined as a time per fault, add a new
> sysctl to control the crashing rate threshold.
>
> This way, each system can tune the detection's sensibility adjusting the
> milliseconds per fault. So, if the application's crashing rate falls
> under this threshold an attack will be detected. So, the higher this
> value, the faster an attack will be detected.
>
> Signed-off-by: John Wood <[email protected]>
> ---
> include/fbfam/fbfam.h | 4 ++++
> kernel/sysctl.c | 9 +++++++++
> security/fbfam/Makefile | 1 +
> security/fbfam/fbfam.c | 11 +++++++++++
> security/fbfam/sysctl.c | 20 ++++++++++++++++++++
> 5 files changed, 45 insertions(+)
> create mode 100644 security/fbfam/sysctl.c
>
> diff --git a/include/fbfam/fbfam.h b/include/fbfam/fbfam.h
> index b5b7d1127a52..2cfe51d2b0d5 100644
> --- a/include/fbfam/fbfam.h
> +++ b/include/fbfam/fbfam.h
> @@ -3,8 +3,12 @@
> #define _FBFAM_H_
>
> #include <linux/sched.h>
> +#include <linux/sysctl.h>
>
> #ifdef CONFIG_FBFAM
> +#ifdef CONFIG_SYSCTL
> +extern struct ctl_table fbfam_sysctls[];
> +#endif
Instead of doing the extern and adding to sysctl.c, this can all be done
directly (dynamically) from the fbfam.c file instead.
> int fbfam_fork(struct task_struct *child);
> int fbfam_execve(void);
> int fbfam_exit(void);
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 09e70ee2332e..c3b4d737bef3 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -77,6 +77,8 @@
> #include <linux/uaccess.h>
> #include <asm/processor.h>
>
> +#include <fbfam/fbfam.h>
> +
> #ifdef CONFIG_X86
> #include <asm/nmi.h>
> #include <asm/stacktrace.h>
> @@ -2660,6 +2662,13 @@ static struct ctl_table kern_table[] = {
> .extra1 = SYSCTL_ZERO,
> .extra2 = SYSCTL_ONE,
> },
> +#endif
> +#ifdef CONFIG_FBFAM
> + {
> + .procname = "fbfam",
> + .mode = 0555,
> + .child = fbfam_sysctls,
> + },
> #endif
> { }
> };
> diff --git a/security/fbfam/Makefile b/security/fbfam/Makefile
> index f4b9f0b19c44..b8d5751ecea4 100644
> --- a/security/fbfam/Makefile
> +++ b/security/fbfam/Makefile
> @@ -1,2 +1,3 @@
> # SPDX-License-Identifier: GPL-2.0
> obj-$(CONFIG_FBFAM) += fbfam.o
> +obj-$(CONFIG_SYSCTL) += sysctl.o
> diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
> index 0387f95f6408..9be4639b72eb 100644
> --- a/security/fbfam/fbfam.c
> +++ b/security/fbfam/fbfam.c
> @@ -7,6 +7,17 @@
> #include <linux/refcount.h>
> #include <linux/slab.h>
>
> +/**
> + * sysctl_crashing_rate_threshold - Crashing rate threshold.
> + *
> + * The rate's units are in milliseconds per fault.
> + *
> + * A fork brute force attack will be detected if the application's crashing rate
> + * falls under this threshold. So, the higher this value, the faster an attack
> + * will be detected.
> + */
> +unsigned long sysctl_crashing_rate_threshold = 30000;
I would move the sysctls here, instead. (Also, the above should be
const.)
> +
> /**
> * struct fbfam_stats - Fork brute force attack mitigation statistics.
> * @refc: Reference counter.
> diff --git a/security/fbfam/sysctl.c b/security/fbfam/sysctl.c
> new file mode 100644
> index 000000000000..430323ad8e9f
> --- /dev/null
> +++ b/security/fbfam/sysctl.c
> @@ -0,0 +1,20 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/sysctl.h>
> +
> +extern unsigned long sysctl_crashing_rate_threshold;
> +static unsigned long ulong_one = 1;
> +static unsigned long ulong_max = ULONG_MAX;
> +
> +struct ctl_table fbfam_sysctls[] = {
> + {
> + .procname = "crashing_rate_threshold",
> + .data = &sysctl_crashing_rate_threshold,
> + .maxlen = sizeof(sysctl_crashing_rate_threshold),
> + .mode = 0644,
> + .proc_handler = proc_doulongvec_minmax,
> + .extra1 = &ulong_one,
> + .extra2 = &ulong_max,
> + },
> + { }
> +};
I wouldn't bother splitting this into a separate file. (Just leave it in
fbfam.c)
--
Kees Cook
On Thu, Sep 10, 2020 at 01:21:02PM -0700, Kees Cook wrote:
> From: John Wood <[email protected]>
>
> Add a menu entry under "Security options" to enable the "Fork brute
> force attack mitigation" feature.
>
> Signed-off-by: John Wood <[email protected]>
> ---
> security/Kconfig | 1 +
> security/fbfam/Kconfig | 10 ++++++++++
> 2 files changed, 11 insertions(+)
> create mode 100644 security/fbfam/Kconfig
>
> diff --git a/security/Kconfig b/security/Kconfig
> index 7561f6f99f1d..00a90e25b8d5 100644
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -290,6 +290,7 @@ config LSM
> If unsure, leave this as the default.
>
> source "security/Kconfig.hardening"
> +source "security/fbfam/Kconfig"
Given the layout you've chosen and the interface you've got, I think
this should just be treated like a regular LSM.
>
> endmenu
>
> diff --git a/security/fbfam/Kconfig b/security/fbfam/Kconfig
> new file mode 100644
> index 000000000000..bbe7f6aad369
> --- /dev/null
> +++ b/security/fbfam/Kconfig
> @@ -0,0 +1,10 @@
> +# SPDX-License-Identifier: GPL-2.0
> +config FBFAM
To jump on the bikeshed: how about just calling this
FORK_BRUTE_FORCE_DETECTION or FORK_BRUTE, and the directory could be
"brute", etc. "fbfam" doesn't tell anyone anything.
--
Kees Cook
On Thu, Sep 10, 2020 at 01:21:03PM -0700, Kees Cook wrote:
> From: John Wood <[email protected]>
>
> Create a statistical data structure to hold all the necessary
> information involve in a fork brute force attack. This info is a
> timestamp for the first fork or execve and the number of crashes since
> then. Moreover, due to this statitistical data will be shared between
> different tasks, a reference counter it is necessary.
>
> For a correct management of an attack it is also necessary that all the
> tasks hold statistical data. The same statistical data needs to be
> shared between all the tasks that hold the same memory contents or in
> other words, between all the tasks that have been forked without any
> execve call.
>
> When a forked task calls the execve system call, the memory contents are
> set with new values. So, in this scenario the parent's statistical data
> no need to be share. Instead, a new statistical data structure must be
> allocated to start a new cycle.
>
> The statistical data that every task holds needs to be clear when a task
> exits. Due to this data is shared across multiples tasks, the reference
> counter is useful to free the previous allocated data only when there
> are not other pointers to the same data. Or in other words, when the
> reference counter reaches zero.
>
> So, based in all the previous information add the api to manage all the
> commented cases.
>
> Also, add to the struct task_struct a new field to point to the
> statitistical data related to an attack. This way, all the tasks will
> have access to this information.
>
> Signed-off-by: John Wood <[email protected]>
I think patch 1 should be merged into this one since the former doesn't
really *do* anything. ;)
> ---
> include/fbfam/fbfam.h | 18 +++++
> include/linux/sched.h | 4 +
> security/Makefile | 4 +
> security/fbfam/Makefile | 2 +
> security/fbfam/fbfam.c | 163 ++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 191 insertions(+)
> create mode 100644 include/fbfam/fbfam.h
> create mode 100644 security/fbfam/Makefile
> create mode 100644 security/fbfam/fbfam.c
>
> diff --git a/include/fbfam/fbfam.h b/include/fbfam/fbfam.h
> new file mode 100644
> index 000000000000..b5b7d1127a52
> --- /dev/null
> +++ b/include/fbfam/fbfam.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _FBFAM_H_
> +#define _FBFAM_H_
> +
> +#include <linux/sched.h>
> +
> +#ifdef CONFIG_FBFAM
> +int fbfam_fork(struct task_struct *child);
> +int fbfam_execve(void);
> +int fbfam_exit(void);
> +#else
> +static inline int fbfam_fork(struct task_struct *child) { return 0; }
This appears to map well to LSM hook "task_alloc".
> +static inline int fbfam_execve(void) { return 0; }
This appears to map well to LSM hook "bprm_committing_creds".
> +static inline int fbfam_exit(void) { return 0; }
This appears to map well to LSM hook "task_free".
> +#endif
> +
> +#endif /* _FBFAM_H_ */
> +
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index afe01e232935..00e1aa5e00cd 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1315,6 +1315,10 @@ struct task_struct {
> struct callback_head mce_kill_me;
> #endif
>
> +#ifdef CONFIG_FBFAM
> + struct fbfam_stats *fbfam_stats;
> +#endif
This could be part of the task_struct security blob used by LSMs.
> +
> /*
> * New fields for task_struct should be added above here, so that
> * they are included in the randomized portion of task_struct.
> diff --git a/security/Makefile b/security/Makefile
> index 3baf435de541..36dc4b536349 100644
> --- a/security/Makefile
> +++ b/security/Makefile
> @@ -36,3 +36,7 @@ obj-$(CONFIG_BPF_LSM) += bpf/
> # Object integrity file lists
> subdir-$(CONFIG_INTEGRITY) += integrity
> obj-$(CONFIG_INTEGRITY) += integrity/
> +
> +# Object fbfam file lists
> +subdir-$(CONFIG_FBFAM) += fbfam
> +obj-$(CONFIG_FBFAM) += fbfam/
> diff --git a/security/fbfam/Makefile b/security/fbfam/Makefile
> new file mode 100644
> index 000000000000..f4b9f0b19c44
> --- /dev/null
> +++ b/security/fbfam/Makefile
> @@ -0,0 +1,2 @@
> +# SPDX-License-Identifier: GPL-2.0
> +obj-$(CONFIG_FBFAM) += fbfam.o
> diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
> new file mode 100644
> index 000000000000..0387f95f6408
> --- /dev/null
> +++ b/security/fbfam/fbfam.c
> @@ -0,0 +1,163 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <asm/current.h>
> +#include <fbfam/fbfam.h>
> +#include <linux/errno.h>
> +#include <linux/gfp.h>
> +#include <linux/jiffies.h>
> +#include <linux/refcount.h>
> +#include <linux/slab.h>
> +
> +/**
> + * struct fbfam_stats - Fork brute force attack mitigation statistics.
> + * @refc: Reference counter.
> + * @faults: Number of crashes since jiffies.
> + * @jiffies: First fork or execve timestamp.
> + *
> + * The purpose of this structure is to manage all the necessary information to
> + * compute the crashing rate of an application. So, it holds a first fork or
> + * execve timestamp and a number of crashes since then. This way the crashing
> + * rate in milliseconds per fault can be compute when necessary with the
> + * following formula:
> + *
> + * u64 delta_jiffies = get_jiffies_64() - fbfam_stats::jiffies;
> + * u64 delta_time = jiffies64_to_msecs(delta_jiffies);
> + * u64 crashing_rate = delta_time / (u64)fbfam_stats::faults;
> + *
> + * If the fbfam_stats::faults is zero, the above formula can't be used. In this
> + * case, the crashing rate is zero.
> + *
> + * Moreover, since the same allocated structure will be used in every fork
> + * since the first one or execve, it's also necessary a reference counter.
> + */
> +struct fbfam_stats {
> + refcount_t refc;
> + unsigned int faults;
> + u64 jiffies;
> +};
> +
> +/**
> + * fbfam_new_stats() - Allocation of new statistics structure.
> + *
> + * If the allocation is successful the reference counter is set to one to
> + * indicate that there will be one task that points to this structure. The
> + * faults field is initialize to zero and the timestamp for this moment is set.
> + *
> + * Return: NULL if the allocation fails. A pointer to the new allocated
> + * statistics structure if it success.
> + */
> +static struct fbfam_stats *fbfam_new_stats(void)
> +{
> + struct fbfam_stats *stats = kmalloc(sizeof(struct fbfam_stats),
> + GFP_KERNEL);
> +
> + if (stats) {
> + refcount_set(&stats->refc, 1);
> + stats->faults = 0;
> + stats->jiffies = get_jiffies_64();
> + }
> +
> + return stats;
> +}
> +
> +/*
> + * fbfam_fork() - Fork management.
> + * @child: Pointer to the child task that will be created with the fork system
> + * call.
> + *
> + * For a correct management of a fork brute force attack it is necessary that
> + * all the tasks hold statistical data. The same statistical data needs to be
> + * shared between all the tasks that hold the same memory contents or in other
> + * words, between all the tasks that have been forked without any execve call.
> + *
> + * To ensure this, if the current task doesn't have statistical data when forks
> + * (only possible in the first fork of the zero task), it is mandatory to
> + * allocate a new one. This way, the child task always will share the statistics
> + * with its parent.
> + *
> + * Return: -ENOMEN if the allocation of the new statistics structure fails.
> + * Zero otherwise.
> + */
> +int fbfam_fork(struct task_struct *child)
> +{
> + struct fbfam_stats **stats = ¤t->fbfam_stats;
> +
> + if (!*stats) {
> + *stats = fbfam_new_stats();
> + if (!*stats)
> + return -ENOMEM;
> + }
> +
> + refcount_inc(&(*stats)->refc);
> + child->fbfam_stats = *stats;
> + return 0;
> +}
> +
> +/**
> + * fbfam_execve() - Execve management.
> + *
> + * When a forked task calls the execve system call, the memory contents are set
> + * with new values. So, in this scenario the parent's statistical data no need
> + * to be share. Instead, a new statistical data structure must be allocated to
> + * start a new cycle. This condition is detected when the statistics reference
> + * counter holds a value greater than or equal to two (a fork always sets the
> + * statistics reference counter to two since the parent and the child task are
> + * sharing the same data).
> + *
> + * However, if the execve function is called immediately after another execve
> + * call, althought the memory contents are reset, there is no need to allocate
> + * a new statistical data structure. This is possible because at this moment
> + * only one task (the task that calls the execve function) points to the data.
> + * In this case, the previous allocation is used and only the faults and time
> + * fields are reset.
> + *
> + * Return: -ENOMEN if the allocation of the new statistics structure fails.
> + * -EFAULT if the current task doesn't have statistical data. Zero
> + * otherwise.
> + */
> +int fbfam_execve(void)
> +{
> + struct fbfam_stats **stats = ¤t->fbfam_stats;
> +
> + if (!*stats)
> + return -EFAULT;
> +
> + if (!refcount_dec_not_one(&(*stats)->refc)) {
> + /* execve call after an execve call */
> + (*stats)->faults = 0;
> + (*stats)->jiffies = get_jiffies_64();
> + return 0;
> + }
> +
> + /* execve call after a fork call */
> + *stats = fbfam_new_stats();
> + if (!*stats)
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +/**
> + * fbfam_exit() - Exit management.
> + *
> + * The statistical data that every task holds needs to be clear when a task
> + * exits. Due to this data is shared across multiples tasks, the reference
> + * counter is useful to free the previous allocated data only when there are
> + * not other pointers to the same data. Or in other words, when the reference
> + * counter reaches zero.
> + *
> + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> + * otherwise.
> + */
> +int fbfam_exit(void)
> +{
> + struct fbfam_stats *stats = current->fbfam_stats;
> +
> + if (!stats)
> + return -EFAULT;
> +
> + if (refcount_dec_and_test(&stats->refc))
> + kfree(stats);
> +
> + return 0;
> +}
> +
> --
> 2.25.1
>
Jann mentions some concerns about races, and I'd agree: this doesn't
feel right to me, but I've not had a chance to study it yet. I'm
concerned about the lifetime management of the stats vs the task
hierarchy.
--
Kees Cook
On Thu, Sep 10, 2020 at 01:21:06PM -0700, Kees Cook wrote:
> From: John Wood <[email protected]>
>
> To detect a fork brute force attack it is necessary to compute the
> crashing rate of the application. This calculation is performed in each
> fatal fail of a task, or in other words, when a core dump is triggered.
> If this rate shows that the application is crashing quickly, there is a
> clear signal that an attack is happening.
>
> Since the crashing rate is computed in milliseconds per fault, if this
> rate goes under a certain threshold a warning is triggered.
>
> Signed-off-by: John Wood <[email protected]>
> ---
> fs/coredump.c | 2 ++
> include/fbfam/fbfam.h | 2 ++
> security/fbfam/fbfam.c | 39 +++++++++++++++++++++++++++++++++++++++
> 3 files changed, 43 insertions(+)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 76e7c10edfc0..d4ba4e1828d5 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -51,6 +51,7 @@
> #include "internal.h"
>
> #include <trace/events/sched.h>
> +#include <fbfam/fbfam.h>
>
> int core_uses_pid;
> unsigned int core_pipe_limit;
> @@ -825,6 +826,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> fail_creds:
> put_cred(cred);
> fail:
> + fbfam_handle_attack(siginfo->si_signo);
I don't think this is the right place for detecting a crash -- isn't
this only for the "dumping core" condition? In other words, don't you
want to do this in get_signal()'s "fatal" block? (i.e. very close to the
do_coredump, but without the "should I dump?" check?)
Hmm, but maybe I'm wrong? It looks like you're looking at noticing the
process taking a signal from SIG_KERNEL_COREDUMP_MASK ?
(Better yet: what are fatal conditions that do NOT match
SIG_KERNEL_COREDUMP_MASK, and should those be covered?)
Regardless, *this* looks like the only place without an LSM hook. And it
doesn't seem unreasonable to add one here. I assume it would probably
just take the siginfo pointer, which is also what you're checking.
e.g. for include/linux/lsm_hook_defs.h:
LSM_HOOK(int, 0, task_coredump, const kernel_siginfo_t *siginfo);
> return;
> }
>
> diff --git a/include/fbfam/fbfam.h b/include/fbfam/fbfam.h
> index 2cfe51d2b0d5..9ac8e33d8291 100644
> --- a/include/fbfam/fbfam.h
> +++ b/include/fbfam/fbfam.h
> @@ -12,10 +12,12 @@ extern struct ctl_table fbfam_sysctls[];
> int fbfam_fork(struct task_struct *child);
> int fbfam_execve(void);
> int fbfam_exit(void);
> +int fbfam_handle_attack(int signal);
> #else
> static inline int fbfam_fork(struct task_struct *child) { return 0; }
> static inline int fbfam_execve(void) { return 0; }
> static inline int fbfam_exit(void) { return 0; }
> +static inline int fbfam_handle_attack(int signal) { return 0; }
> #endif
>
> #endif /* _FBFAM_H_ */
> diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
> index 9be4639b72eb..3aa669e4ea51 100644
> --- a/security/fbfam/fbfam.c
> +++ b/security/fbfam/fbfam.c
> @@ -4,7 +4,9 @@
> #include <linux/errno.h>
> #include <linux/gfp.h>
> #include <linux/jiffies.h>
> +#include <linux/printk.h>
> #include <linux/refcount.h>
> +#include <linux/signal.h>
> #include <linux/slab.h>
>
> /**
> @@ -172,3 +174,40 @@ int fbfam_exit(void)
> return 0;
> }
>
> +/**
> + * fbfam_handle_attack() - Fork brute force attack detection.
> + * @signal: Signal number that causes the core dump.
> + *
> + * The crashing rate of an application is computed in milliseconds per fault in
> + * each crash. So, if this rate goes under a certain threshold there is a clear
> + * signal that the application is crashing quickly. At this moment, a fork brute
> + * force attack is happening.
> + *
> + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> + * otherwise.
> + */
> +int fbfam_handle_attack(int signal)
> +{
> + struct fbfam_stats *stats = current->fbfam_stats;
> + u64 delta_jiffies, delta_time;
> + u64 crashing_rate;
> +
> + if (!stats)
> + return -EFAULT;
> +
> + if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
> + signal == SIGSEGV || signal == SIGSYS))
> + return 0;
This will only be called for:
#define SIG_KERNEL_COREDUMP_MASK (\
rt_sigmask(SIGQUIT) | rt_sigmask(SIGILL) | \
rt_sigmask(SIGTRAP) | rt_sigmask(SIGABRT) | \
rt_sigmask(SIGFPE) | rt_sigmask(SIGSEGV) | \
rt_sigmask(SIGBUS) | rt_sigmask(SIGSYS) | \
rt_sigmask(SIGXCPU) | rt_sigmask(SIGXFSZ) | \
SIGEMT_MASK )
So you're skipping:
SIGQUIT
SIGTRAP
SIGABRT
SIGFPE
SIGXCPU
SIGXFSZ
SIGEMT_MASK
I would include SIGABRT (e.g. glibc will call abort() for stack
canary, malloc, etc failures, which may indicate a mitigation has
fired).
--
Kees Cook
On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> From: John Wood <[email protected]>
>
> In order to mitigate a fork brute force attack it is necessary to kill
> all the offending tasks. This tasks are all the ones that share the
> statistical data with the current task (the task that has crashed).
>
> Since the attack detection is done in the function fbfam_handle_attack()
> that is called every time a core dump is triggered, only is needed to
> kill the others tasks that share the same statistical data, not the
> current one as this is in the path to be killed.
>
> When the SIGKILL signal is sent to the offending tasks from the function
> fbfam_kill_tasks(), this one will be called again during the core dump
> due to the shared statistical data shows a quickly crashing rate. So, to
> avoid kill again the same tasks due to a recursive call of this
> function, it is necessary to disable the attack detection.
>
> To disable this attack detection, add a condition in the function
> fbfam_handle_attack() to not compute the crashing rate when the jiffies
> stored in the statistical data are set to zero.
>
> Signed-off-by: John Wood <[email protected]>
> ---
> security/fbfam/fbfam.c | 76 +++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 71 insertions(+), 5 deletions(-)
>
> diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
> index 3aa669e4ea51..173a6122390f 100644
> --- a/security/fbfam/fbfam.c
> +++ b/security/fbfam/fbfam.c
> @@ -4,8 +4,11 @@
> #include <linux/errno.h>
> #include <linux/gfp.h>
> #include <linux/jiffies.h>
> +#include <linux/pid.h>
> #include <linux/printk.h>
> +#include <linux/rcupdate.h>
> #include <linux/refcount.h>
> +#include <linux/sched/signal.h>
> #include <linux/signal.h>
> #include <linux/slab.h>
>
> @@ -24,7 +27,8 @@ unsigned long sysctl_crashing_rate_threshold = 30000;
> * struct fbfam_stats - Fork brute force attack mitigation statistics.
> * @refc: Reference counter.
> * @faults: Number of crashes since jiffies.
> - * @jiffies: First fork or execve timestamp.
> + * @jiffies: First fork or execve timestamp. If zero, the attack detection is
> + * disabled.
> *
> * The purpose of this structure is to manage all the necessary information to
> * compute the crashing rate of an application. So, it holds a first fork or
> @@ -175,13 +179,69 @@ int fbfam_exit(void)
> }
>
> /**
> - * fbfam_handle_attack() - Fork brute force attack detection.
> + * fbfam_kill_tasks() - Kill the offending tasks
> + *
> + * When a fork brute force attack is detected it is necessary to kill all the
> + * offending tasks. Since this function is called from fbfam_handle_attack(),
> + * and so, every time a core dump is triggered, only is needed to kill the
> + * others tasks that share the same statistical data, not the current one as
> + * this is in the path to be killed.
> + *
> + * When the SIGKILL signal is sent to the offending tasks, this function will be
> + * called again during the core dump due to the shared statistical data shows a
> + * quickly crashing rate. So, to avoid kill again the same tasks due to a
> + * recursive call of this function, it is necessary to disable the attack
> + * detection setting the jiffies to zero.
> + *
> + * To improve the for_each_process loop it is possible to end it when all the
> + * tasks that shared the same statistics are found.
> + *
> + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> + * otherwise.
> + */
> +static int fbfam_kill_tasks(void)
> +{
> + struct fbfam_stats *stats = current->fbfam_stats;
> + struct task_struct *p;
> + unsigned int to_kill, killed = 0;
> +
> + if (!stats)
> + return -EFAULT;
> +
> + to_kill = refcount_read(&stats->refc) - 1;
> + if (!to_kill)
> + return 0;
> +
> + /* Disable the attack detection */
> + stats->jiffies = 0;
> + rcu_read_lock();
> +
> + for_each_process(p) {
> + if (p == current || p->fbfam_stats != stats)
> + continue;
> +
> + do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> + pr_warn("fbfam: Offending process with PID %d killed\n",
> + p->pid);
I'd make this ratelimited (along with Jann's suggestions). Also, instead
of the explicit "fbfam:" prefix, use the regular prefixing method:
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> + killed += 1;
> + if (killed >= to_kill)
> + break;
> + }
> +
> + rcu_read_unlock();
Can't newly created processes escape this RCU read lock? I think this
need alternate locking, or something in the task_alloc hook that will
block any new process from being created within the stats group.
> + return 0;
> +}
> +
> +/**
> + * fbfam_handle_attack() - Fork brute force attack detection and mitigation.
> * @signal: Signal number that causes the core dump.
> *
> * The crashing rate of an application is computed in milliseconds per fault in
> * each crash. So, if this rate goes under a certain threshold there is a clear
> * signal that the application is crashing quickly. At this moment, a fork brute
> - * force attack is happening.
> + * force attack is happening. Under this scenario it is necessary to kill all
> + * the offending tasks in order to mitigate the attack.
> *
> * Return: -EFAULT if the current task doesn't have statistical data. Zero
> * otherwise.
> @@ -195,6 +255,10 @@ int fbfam_handle_attack(int signal)
> if (!stats)
> return -EFAULT;
>
> + /* The attack detection is disabled */
> + if (!stats->jiffies)
> + return 0;
> +
> if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
> signal == SIGSEGV || signal == SIGSYS))
> return 0;
> @@ -205,9 +269,11 @@ int fbfam_handle_attack(int signal)
> delta_time = jiffies64_to_msecs(delta_jiffies);
> crashing_rate = delta_time / (u64)stats->faults;
>
> - if (crashing_rate < (u64)sysctl_crashing_rate_threshold)
> - pr_warn("fbfam: Fork brute force attack detected\n");
> + if (crashing_rate >= (u64)sysctl_crashing_rate_threshold)
> + return 0;
>
> + pr_warn("fbfam: Fork brute force attack detected\n");
> + fbfam_kill_tasks();
> return 0;
> }
>
> --
> 2.25.1
>
--
Kees Cook
On Thu, Sep 10, 2020 at 01:21:01PM -0700, Kees Cook wrote:
> From: John Wood <[email protected]>
>
> The goal of this patch serie is to detect and mitigate a fork brute force
> attack.
Thanks for this RFC! I'm excited to get this problem finally handled in
the kernel. Hopefully the feedback is useful. :)
--
Kees Cook
On Fri, Sep 11, 2020 at 1:49 AM Kees Cook <[email protected]> wrote:
> On Thu, Sep 10, 2020 at 01:21:06PM -0700, Kees Cook wrote:
> > From: John Wood <[email protected]>
> >
> > To detect a fork brute force attack it is necessary to compute the
> > crashing rate of the application. This calculation is performed in each
> > fatal fail of a task, or in other words, when a core dump is triggered.
> > If this rate shows that the application is crashing quickly, there is a
> > clear signal that an attack is happening.
> >
> > Since the crashing rate is computed in milliseconds per fault, if this
> > rate goes under a certain threshold a warning is triggered.
> >
> > Signed-off-by: John Wood <[email protected]>
> > ---
> > fs/coredump.c | 2 ++
> > include/fbfam/fbfam.h | 2 ++
> > security/fbfam/fbfam.c | 39 +++++++++++++++++++++++++++++++++++++++
> > 3 files changed, 43 insertions(+)
> >
> > diff --git a/fs/coredump.c b/fs/coredump.c
> > index 76e7c10edfc0..d4ba4e1828d5 100644
> > --- a/fs/coredump.c
> > +++ b/fs/coredump.c
> > @@ -51,6 +51,7 @@
> > #include "internal.h"
> >
> > #include <trace/events/sched.h>
> > +#include <fbfam/fbfam.h>
> >
> > int core_uses_pid;
> > unsigned int core_pipe_limit;
> > @@ -825,6 +826,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> > fail_creds:
> > put_cred(cred);
> > fail:
> > + fbfam_handle_attack(siginfo->si_signo);
>
> I don't think this is the right place for detecting a crash -- isn't
> this only for the "dumping core" condition? In other words, don't you
> want to do this in get_signal()'s "fatal" block? (i.e. very close to the
> do_coredump, but without the "should I dump?" check?)
>
> Hmm, but maybe I'm wrong? It looks like you're looking at noticing the
> process taking a signal from SIG_KERNEL_COREDUMP_MASK ?
>
> (Better yet: what are fatal conditions that do NOT match
> SIG_KERNEL_COREDUMP_MASK, and should those be covered?)
>
> Regardless, *this* looks like the only place without an LSM hook. And it
> doesn't seem unreasonable to add one here. I assume it would probably
> just take the siginfo pointer, which is also what you're checking.
Good point, making this an LSM might be a good idea.
> e.g. for include/linux/lsm_hook_defs.h:
>
> LSM_HOOK(int, 0, task_coredump, const kernel_siginfo_t *siginfo);
I guess it should probably be an LSM_RET_VOID hook? And since, as you
said, it's not really semantically about core dumping, maybe it should
be named task_fatal_signal or something like that.
On Fri, Sep 11, 2020 at 1:56 AM Kees Cook <[email protected]> wrote:
> On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> > From: John Wood <[email protected]>
> >
> > In order to mitigate a fork brute force attack it is necessary to kill
> > all the offending tasks. This tasks are all the ones that share the
> > statistical data with the current task (the task that has crashed).
> >
> > Since the attack detection is done in the function fbfam_handle_attack()
> > that is called every time a core dump is triggered, only is needed to
> > kill the others tasks that share the same statistical data, not the
> > current one as this is in the path to be killed.
[...]
> > + for_each_process(p) {
> > + if (p == current || p->fbfam_stats != stats)
> > + continue;
> > +
> > + do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> > + pr_warn("fbfam: Offending process with PID %d killed\n",
> > + p->pid);
[...]
> > +
> > + killed += 1;
> > + if (killed >= to_kill)
> > + break;
> > + }
> > +
> > + rcu_read_unlock();
>
> Can't newly created processes escape this RCU read lock? I think this
> need alternate locking, or something in the task_alloc hook that will
> block any new process from being created within the stats group.
Good point; the proper way to deal with this would probably be to take
the tasklist_lock in read mode around this loop (with
read_lock(&tasklist_lock) / read_unlock(&tasklist_lock)), which pairs
with the write_lock_irq(&tasklist_lock) in copy_process(). Thanks to
the fatal_signal_pending() check while holding the lock in
copy_process(), that would be race-free - any fork() that has not yet
inserted the new task into the global task list would wait for us to
drop the tasklist_lock, then bail out at the fatal_signal_pending()
check.
Hi,
On Thu, Sep 10, 2020 at 04:58:29PM -0700, Kees Cook wrote:
> On Thu, Sep 10, 2020 at 01:21:01PM -0700, Kees Cook wrote:
> > From: John Wood <[email protected]>
> >
> > The goal of this patch serie is to detect and mitigate a fork brute force
> > attack.
>
> Thanks for this RFC! I'm excited to get this problem finally handled in
> the kernel. Hopefully the feedback is useful. :)
Kees and Jann,
Thank you very much for your comments and advices. I'm a newbie in the
linux kernel development and this is my first attempt. So, I would prefer
to study all your comments before to reply since a big amount of terms
you expose are unknown to me.
In other words, a late reply to this serie comments is not a lack of
interest. Moreover, I think it would be better that I try to understand and
to implement your ideas before anything else.
My original patch serie is composed of 9 patches, so the 3 lasts are lost.
Kees: Have you removed them for some reason? Can you send them for review?
security/fbfam: Add two new prctls to enable and disable the fbfam feature
https://github.com/johwood/linux/commit/8a36399847213e7eb7b45b853568a53666bd0083
Documentation/security: Add documentation for the fbfam feature
https://github.com/johwood/linux/commit/fb46804541f5c0915f3f48acefbe6dc998815609
MAINTAINERS: Add a new entry for the fbfam feature
https://github.com/johwood/linux/commit/4303bc8935334136c6ef47b5e50b87cd2c472c1f
Is there a problem if I ask for some guidance (replying to this thread)
during the process to do my second patch series?
My goal is to learn as much as possible doing something useful for the
linux kernel.
Thanks a lot,
John Wood
> --
> Kees Cook
On Thu, 10 Sep 2020, Kees Cook wrote:
> [kees: re-sending this series on behalf of John Wood <[email protected]>
> also visible at https://github.com/johwood/linux fbfam]
>
> From: John Wood <[email protected]>
Why are you resending this? The author of the code needs to be able to
send and receive emails directly as part of development and maintenance.
--
James Morris
<[email protected]>
On Sat, Sep 12, 2020 at 10:03:23AM +1000, James Morris wrote:
> On Thu, 10 Sep 2020, Kees Cook wrote:
>
> > [kees: re-sending this series on behalf of John Wood <[email protected]>
> > also visible at https://github.com/johwood/linux fbfam]
> >
> > From: John Wood <[email protected]>
>
> Why are you resending this? The author of the code needs to be able to
> send and receive emails directly as part of development and maintenance.
I wanted to flush it from my "review" TODO list, mainly.
--
Kees Cook
On Fri, Sep 11, 2020 at 04:48:06PM +0200, John Wood wrote:
> In other words, a late reply to this serie comments is not a lack of
> interest. Moreover, I think it would be better that I try to understand and
> to implement your ideas before anything else.
Understood! :)
> My original patch serie is composed of 9 patches, so the 3 lasts are lost.
> Kees: Have you removed them for some reason? Can you send them for review?
>
> security/fbfam: Add two new prctls to enable and disable the fbfam feature
> https://github.com/johwood/linux/commit/8a36399847213e7eb7b45b853568a53666bd0083
>
> Documentation/security: Add documentation for the fbfam feature
> https://github.com/johwood/linux/commit/fb46804541f5c0915f3f48acefbe6dc998815609
>
> MAINTAINERS: Add a new entry for the fbfam feature
> https://github.com/johwood/linux/commit/4303bc8935334136c6ef47b5e50b87cd2c472c1f
Oh, hm, I'm not sure where they went. I think they were missing from my
inbox when I saved your series from email. An oversight on my part;
apologies!
> Is there a problem if I ask for some guidance (replying to this thread)
> during the process to do my second patch series?
Please feel free! I'm happy to help. :)
> My goal is to learn as much as possible doing something useful for the
> linux kernel.
Sounds good; thanks!
--
Kees Cook
On Sat, Sep 12, 2020 at 12:56:18AM -0700, Kees Cook wrote:
> On Sat, Sep 12, 2020 at 10:03:23AM +1000, James Morris wrote:
> > On Thu, 10 Sep 2020, Kees Cook wrote:
> >
> > > [kees: re-sending this series on behalf of John Wood <[email protected]>
> > > also visible at https://github.com/johwood/linux fbfam]
> > >
> > > From: John Wood <[email protected]>
> >
> > Why are you resending this? The author of the code needs to be able to
> > send and receive emails directly as part of development and maintenance.
I tried to send the full patch serie by myself but my email got blocked. After
get support from my email provider it told to me that my account is young,
and due to its spam policie I am not allow, for now, to send a big amount
of mails in a short period. They also informed me that soon I will be able
to send more mails. The quantity increase with the age of the account.
I hope that for the next version all works as expected.
Apologies.
> I wanted to flush it from my "review" TODO list, mainly.
Thanks Kees for the re-send and review.
> --
> Kees Cook
Regards,
John Wood
On Sat, Sep 12, 2020 at 12:55:03AM -0700, Kees Cook wrote:
> On Fri, Sep 11, 2020 at 04:48:06PM +0200, John Wood wrote:
> > My original patch serie is composed of 9 patches, so the 3 lasts are lost.
> > Kees: Have you removed them for some reason? Can you send them for review?
> >
> > security/fbfam: Add two new prctls to enable and disable the fbfam feature
> > https://github.com/johwood/linux/commit/8a36399847213e7eb7b45b853568a53666bd0083
> >
> > Documentation/security: Add documentation for the fbfam feature
> > https://github.com/johwood/linux/commit/fb46804541f5c0915f3f48acefbe6dc998815609
> >
> > MAINTAINERS: Add a new entry for the fbfam feature
> > https://github.com/johwood/linux/commit/4303bc8935334136c6ef47b5e50b87cd2c472c1f
>
> Oh, hm, I'm not sure where they went. I think they were missing from my
> inbox when I saved your series from email. An oversight on my part;
> apologies!
I sent the full serie to Matthew Wilcox <[email protected]> only, as he
wanted to help re-sending the full serie. Then I saw that only 6 patches
appeared in the linux-doc mailing list.
I can try to send the three pending patches in different stages (for example
one patch every 4 or 5 hours) to avoid blocking my email. I hope. Or I can
send the three pending patches only to the kernel-hardening mailing list
and you re-send to all the people involved. Or any other solution you
propose. It's up to you.
> > Is there a problem if I ask for some guidance (replying to this thread)
> > during the process to do my second patch series?
>
> Please feel free! I'm happy to help. :)
It's a pleasure working with you. Thanks a lot.
> --
> Kees Cook
Regards,
John Wood
On Sat, Sep 12, 2020 at 11:36:52AM +0200, John Wood wrote:
> On Sat, Sep 12, 2020 at 12:56:18AM -0700, Kees Cook wrote:
> > On Sat, Sep 12, 2020 at 10:03:23AM +1000, James Morris wrote:
> > > On Thu, 10 Sep 2020, Kees Cook wrote:
> > >
> > > > [kees: re-sending this series on behalf of John Wood <[email protected]>
> > > > also visible at https://github.com/johwood/linux fbfam]
> > > >
> > > > From: John Wood <[email protected]>
> > >
> > > Why are you resending this? The author of the code needs to be able to
> > > send and receive emails directly as part of development and maintenance.
>
> I tried to send the full patch serie by myself but my email got blocked. After
> get support from my email provider it told to me that my account is young,
> and due to its spam policie I am not allow, for now, to send a big amount
> of mails in a short period. They also informed me that soon I will be able
> to send more mails. The quantity increase with the age of the account.
>
If you're using "git send-email" then specify --confirm=always and
either manually send a mail every few seconds or use an expect script
like
#!/bin/bash
EXPECT_SCRIPT=
function cleanup() {
if [ "$EXPECT_SCRIPT" != "" ]; then
rm $EXPECT_SCRIPT
fi
}
trap cleanup EXIT
EXPECT_SCRIPT=`mktemp`
cat > $EXPECT_SCRIPT <<EOF
spawn sh ./SEND
expect {
"Send this email" { sleep 10; exp_send y\\r; exp_continue }
}
EOF
expect -f $EXPECT_SCRIPT
exit $?
This will work if your provider limits the rate mails are sent rather
than the total amount.
--
Mel Gorman
SUSE Labs
On Sat, Sep 12, 2020 at 4:51 PM Mel Gorman <[email protected]> wrote:
> On Sat, Sep 12, 2020 at 11:36:52AM +0200, John Wood wrote:
> > On Sat, Sep 12, 2020 at 12:56:18AM -0700, Kees Cook wrote:
> > > On Sat, Sep 12, 2020 at 10:03:23AM +1000, James Morris wrote:
> > > > On Thu, 10 Sep 2020, Kees Cook wrote:
> > > >
> > > > > [kees: re-sending this series on behalf of John Wood <[email protected]>
> > > > > also visible at https://github.com/johwood/linux fbfam]
> > > > >
> > > > > From: John Wood <[email protected]>
> > > >
> > > > Why are you resending this? The author of the code needs to be able to
> > > > send and receive emails directly as part of development and maintenance.
> >
> > I tried to send the full patch serie by myself but my email got blocked. After
> > get support from my email provider it told to me that my account is young,
> > and due to its spam policie I am not allow, for now, to send a big amount
> > of mails in a short period. They also informed me that soon I will be able
> > to send more mails. The quantity increase with the age of the account.
> >
>
> If you're using "git send-email" then specify --confirm=always and
> either manually send a mail every few seconds or use an expect script
> like
>
> #!/bin/bash
> EXPECT_SCRIPT=
> function cleanup() {
> if [ "$EXPECT_SCRIPT" != "" ]; then
> rm $EXPECT_SCRIPT
> fi
> }
> trap cleanup EXIT
>
> EXPECT_SCRIPT=`mktemp`
> cat > $EXPECT_SCRIPT <<EOF
> spawn sh ./SEND
> expect {
> "Send this email" { sleep 10; exp_send y\\r; exp_continue }
> }
> EOF
>
> expect -f $EXPECT_SCRIPT
> exit $?
>
> This will work if your provider limits the rate mails are sent rather
> than the total amount.
...or you could keep it simple and just pass "--batch-size 1
--relogin-delay 10" to git send-email ;)
--
Ondrej Mosnacek
Software Engineer, Platform Security - SELinux kernel
Red Hat, Inc.
Hi,
On Sat, Sep 12, 2020 at 10:48:39PM +0200, Ondrej Mosnacek wrote:
> On Sat, Sep 12, 2020 at 4:51 PM Mel Gorman <[email protected]> wrote:
> > On Sat, Sep 12, 2020 at 11:36:52AM +0200, John Wood wrote:
> > > On Sat, Sep 12, 2020 at 12:56:18AM -0700, Kees Cook wrote:
> > > > On Sat, Sep 12, 2020 at 10:03:23AM +1000, James Morris wrote:
> > > > > On Thu, 10 Sep 2020, Kees Cook wrote:
> > > > >
> > > > > > [kees: re-sending this series on behalf of John Wood <[email protected]>
> > > > > > also visible at https://github.com/johwood/linux fbfam]
> > > > > >
> > > > > > From: John Wood <[email protected]>
> > > > >
> > > > > Why are you resending this? The author of the code needs to be able to
> > > > > send and receive emails directly as part of development and maintenance.
> > >
> > > I tried to send the full patch serie by myself but my email got blocked. After
> > > get support from my email provider it told to me that my account is young,
> > > and due to its spam policie I am not allow, for now, to send a big amount
> > > of mails in a short period. They also informed me that soon I will be able
> > > to send more mails. The quantity increase with the age of the account.
> > >
> >
> > If you're using "git send-email" then specify --confirm=always and
> > either manually send a mail every few seconds or use an expect script
> > like
> >
> > #!/bin/bash
> > EXPECT_SCRIPT=
> > function cleanup() {
> > if [ "$EXPECT_SCRIPT" != "" ]; then
> > rm $EXPECT_SCRIPT
> > fi
> > }
> > trap cleanup EXIT
> >
> > EXPECT_SCRIPT=`mktemp`
> > cat > $EXPECT_SCRIPT <<EOF
> > spawn sh ./SEND
> > expect {
> > "Send this email" { sleep 10; exp_send y\\r; exp_continue }
> > }
> > EOF
> >
> > expect -f $EXPECT_SCRIPT
> > exit $?
> >
> > This will work if your provider limits the rate mails are sent rather
> > than the total amount.
Yes, it seems to be what is happening.
> ...or you could keep it simple and just pass "--batch-size 1
> --relogin-delay 10" to git send-email ;)
Mel and Ondrej thanks a lot for the proposed solutions. I'm sure some of
your solutions will be used soon.
> --
> Ondrej Mosnacek
> Software Engineer, Platform Security - SELinux kernel
> Red Hat, Inc.
Regards,
John Wood
Hi, more inline.
On Thu, Sep 10, 2020 at 04:14:38PM -0700, Kees Cook wrote:
> > diff --git a/include/fbfam/fbfam.h b/include/fbfam/fbfam.h
> > index b5b7d1127a52..2cfe51d2b0d5 100644
> > --- a/include/fbfam/fbfam.h
> > +++ b/include/fbfam/fbfam.h
> > @@ -3,8 +3,12 @@
> > #define _FBFAM_H_
> >
> > #include <linux/sched.h>
> > +#include <linux/sysctl.h>
> >
> > #ifdef CONFIG_FBFAM
> > +#ifdef CONFIG_SYSCTL
> > +extern struct ctl_table fbfam_sysctls[];
> > +#endif
>
> Instead of doing the extern and adding to sysctl.c, this can all be done
> directly (dynamically) from the fbfam.c file instead.
Like Yama do in the yama_init_sysctl() function? As a reference code.
> > int fbfam_fork(struct task_struct *child);
> > int fbfam_execve(void);
> > int fbfam_exit(void);
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 09e70ee2332e..c3b4d737bef3 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -77,6 +77,8 @@
> > #include <linux/uaccess.h>
> > #include <asm/processor.h>
> >
> > +#include <fbfam/fbfam.h>
> > +
> > #ifdef CONFIG_X86
> > #include <asm/nmi.h>
> > #include <asm/stacktrace.h>
> > @@ -2660,6 +2662,13 @@ static struct ctl_table kern_table[] = {
> > .extra1 = SYSCTL_ZERO,
> > .extra2 = SYSCTL_ONE,
> > },
> > +#endif
> > +#ifdef CONFIG_FBFAM
> > + {
> > + .procname = "fbfam",
> > + .mode = 0555,
> > + .child = fbfam_sysctls,
> > + },
> > #endif
> > { }
> > };
> > diff --git a/security/fbfam/Makefile b/security/fbfam/Makefile
> > index f4b9f0b19c44..b8d5751ecea4 100644
> > --- a/security/fbfam/Makefile
> > +++ b/security/fbfam/Makefile
> > @@ -1,2 +1,3 @@
> > # SPDX-License-Identifier: GPL-2.0
> > obj-$(CONFIG_FBFAM) += fbfam.o
> > +obj-$(CONFIG_SYSCTL) += sysctl.o
> > diff --git a/security/fbfam/fbfam.c b/security/fbfam/fbfam.c
> > index 0387f95f6408..9be4639b72eb 100644
> > --- a/security/fbfam/fbfam.c
> > +++ b/security/fbfam/fbfam.c
> > @@ -7,6 +7,17 @@
> > #include <linux/refcount.h>
> > #include <linux/slab.h>
> >
> > +/**
> > + * sysctl_crashing_rate_threshold - Crashing rate threshold.
> > + *
> > + * The rate's units are in milliseconds per fault.
> > + *
> > + * A fork brute force attack will be detected if the application's crashing rate
> > + * falls under this threshold. So, the higher this value, the faster an attack
> > + * will be detected.
> > + */
> > +unsigned long sysctl_crashing_rate_threshold = 30000;
>
> I would move the sysctls here, instead. (Also, the above should be
> const.)
If the above variable is const how the sysctl interface can modify it?
I think it would be better to declare it as __read_mostly instead. What
do you think?
unsigned long sysctl_crashing_rate_threshold __read_mostly = 30000;
> > +
> > /**
> > * struct fbfam_stats - Fork brute force attack mitigation statistics.
> > * @refc: Reference counter.
> > diff --git a/security/fbfam/sysctl.c b/security/fbfam/sysctl.c
> > new file mode 100644
> > index 000000000000..430323ad8e9f
> > --- /dev/null
> > +++ b/security/fbfam/sysctl.c
> > @@ -0,0 +1,20 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/sysctl.h>
> > +
> > +extern unsigned long sysctl_crashing_rate_threshold;
> > +static unsigned long ulong_one = 1;
> > +static unsigned long ulong_max = ULONG_MAX;
> > +
> > +struct ctl_table fbfam_sysctls[] = {
> > + {
> > + .procname = "crashing_rate_threshold",
> > + .data = &sysctl_crashing_rate_threshold,
> > + .maxlen = sizeof(sysctl_crashing_rate_threshold),
> > + .mode = 0644,
> > + .proc_handler = proc_doulongvec_minmax,
> > + .extra1 = &ulong_one,
> > + .extra2 = &ulong_max,
> > + },
> > + { }
> > +};
>
> I wouldn't bother splitting this into a separate file. (Just leave it in
> fbfam.c)
>
> --
> Kees Cook
Thanks,
John Wood
Hi,
On Fri, Sep 11, 2020 at 02:01:56AM +0200, Jann Horn wrote:
> On Fri, Sep 11, 2020 at 1:49 AM Kees Cook <[email protected]> wrote:
> > On Thu, Sep 10, 2020 at 01:21:06PM -0700, Kees Cook wrote:
> > > diff --git a/fs/coredump.c b/fs/coredump.c
> > > index 76e7c10edfc0..d4ba4e1828d5 100644
> > > --- a/fs/coredump.c
> > > +++ b/fs/coredump.c
> > > @@ -51,6 +51,7 @@
> > > #include "internal.h"
> > >
> > > #include <trace/events/sched.h>
> > > +#include <fbfam/fbfam.h>
> > >
> > > int core_uses_pid;
> > > unsigned int core_pipe_limit;
> > > @@ -825,6 +826,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> > > fail_creds:
> > > put_cred(cred);
> > > fail:
> > > + fbfam_handle_attack(siginfo->si_signo);
> >
> > I don't think this is the right place for detecting a crash -- isn't
> > this only for the "dumping core" condition? In other words, don't you
> > want to do this in get_signal()'s "fatal" block? (i.e. very close to the
> > do_coredump, but without the "should I dump?" check?)
> >
> > Hmm, but maybe I'm wrong? It looks like you're looking at noticing the
> > process taking a signal from SIG_KERNEL_COREDUMP_MASK ?
> >
> > (Better yet: what are fatal conditions that do NOT match
> > SIG_KERNEL_COREDUMP_MASK, and should those be covered?)
> >
> > Regardless, *this* looks like the only place without an LSM hook. And it
> > doesn't seem unreasonable to add one here. I assume it would probably
> > just take the siginfo pointer, which is also what you're checking.
>
> Good point, making this an LSM might be a good idea.
>
> > e.g. for include/linux/lsm_hook_defs.h:
> >
> > LSM_HOOK(int, 0, task_coredump, const kernel_siginfo_t *siginfo);
>
> I guess it should probably be an LSM_RET_VOID hook? And since, as you
> said, it's not really semantically about core dumping, maybe it should
> be named task_fatal_signal or something like that.
If I understand correctly you propose to add a new LSM hook without return
value and place it here:
diff --git a/kernel/signal.c b/kernel/signal.c
index a38b3edc6851..074492d23e98 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2751,6 +2751,8 @@ bool get_signal(struct ksignal *ksig)
do_coredump(&ksig->info);
}
+ // Add the new LSM hook here
+
/*
* Death signals, no core dump.
*/
Thanks,
John Wood
Hi,
On Thu, Sep 10, 2020 at 11:10:38PM +0200, Jann Horn wrote:
> On Thu, Sep 10, 2020 at 10:22 PM Kees Cook <[email protected]> wrote:
> > To detect a fork brute force attack it is necessary to compute the
> > crashing rate of the application. This calculation is performed in each
> > fatal fail of a task, or in other words, when a core dump is triggered.
> > If this rate shows that the application is crashing quickly, there is a
> > clear signal that an attack is happening.
> >
> > Since the crashing rate is computed in milliseconds per fault, if this
> > rate goes under a certain threshold a warning is triggered.
> [...]
> > +/**
> > + * fbfam_handle_attack() - Fork brute force attack detection.
> > + * @signal: Signal number that causes the core dump.
> > + *
> > + * The crashing rate of an application is computed in milliseconds per fault in
> > + * each crash. So, if this rate goes under a certain threshold there is a clear
> > + * signal that the application is crashing quickly. At this moment, a fork brute
> > + * force attack is happening.
> > + *
> > + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> > + * otherwise.
> > + */
> > +int fbfam_handle_attack(int signal)
> > +{
> > + struct fbfam_stats *stats = current->fbfam_stats;
> > + u64 delta_jiffies, delta_time;
> > + u64 crashing_rate;
> > +
> > + if (!stats)
> > + return -EFAULT;
> > +
> > + if (!(signal == SIGILL || signal == SIGBUS || signal == SIGKILL ||
> > + signal == SIGSEGV || signal == SIGSYS))
> > + return 0;
>
> As far as I can tell, you can never get here with SIGKILL, since
> SIGKILL doesn't trigger core dumping and also isn't used by seccomp?
Understood.
> > +
> > + stats->faults += 1;
>
> This is a data race. If you want to be able to increment a variable
> that may be concurrently incremented by other tasks, use either
> locking or the atomic_t helpers.
Ok, I will correct this for the next version. Thanks.
> > + delta_jiffies = get_jiffies_64() - stats->jiffies;
> > + delta_time = jiffies64_to_msecs(delta_jiffies);
> > + crashing_rate = delta_time / (u64)stats->faults;
>
> Do I see this correctly, is this computing the total runtime of this
> process hierarchy divided by the total number of faults seen in this
> process hierarchy? If so, you may want to reconsider whether that's
> really the behavior you want. For example, if I configure the minimum
> period between crashes to be 30s (as is the default in the sysctl
> patch), and I try to attack a server that has been running without any
> crashes for a month, I'd instantly be able to crash around
> 30*24*60*60/30 = 86400 times before the detection kicks in. That seems
> suboptimal.
You are right. This is not the behaviour we want. So, for the next
version it would be better to compute the crashing period as the time
between two faults, or the time between the execve call and the first
fault (first fault case).
However, I am afraid of a premature detection if a child process fails
twice in a short period.
So, I think it would be a good idea add a new sysctl to setup a
minimum number of faults before the time between faults starts to be
computed. And so, the attack detection only will be triggered if the
application crashes quickly but after a number of crashes.
What do you think?
>
> (By the way, it kind of annoys me that you call it the "rate" when
> it's actually the inverse of the rate. "Period" might be more
> appropriate?)
Yes, "period" it's more appropiate. Thanks for the clarification.
> > + if (crashing_rate < (u64)sysctl_crashing_rate_threshold)
> > + pr_warn("fbfam: Fork brute force attack detected\n");
> > +
> > + return 0;
> > +}
> > +
> > --
> > 2.25.1
> >
Regards,
John Wood
On Sun, Sep 13, 2020 at 6:56 PM John Wood <[email protected]> wrote:
> On Fri, Sep 11, 2020 at 02:01:56AM +0200, Jann Horn wrote:
> > On Fri, Sep 11, 2020 at 1:49 AM Kees Cook <[email protected]> wrote:
> > > On Thu, Sep 10, 2020 at 01:21:06PM -0700, Kees Cook wrote:
> > > > diff --git a/fs/coredump.c b/fs/coredump.c
> > > > index 76e7c10edfc0..d4ba4e1828d5 100644
> > > > --- a/fs/coredump.c
> > > > +++ b/fs/coredump.c
> > > > @@ -51,6 +51,7 @@
> > > > #include "internal.h"
> > > >
> > > > #include <trace/events/sched.h>
> > > > +#include <fbfam/fbfam.h>
> > > >
> > > > int core_uses_pid;
> > > > unsigned int core_pipe_limit;
> > > > @@ -825,6 +826,7 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> > > > fail_creds:
> > > > put_cred(cred);
> > > > fail:
> > > > + fbfam_handle_attack(siginfo->si_signo);
> > >
> > > I don't think this is the right place for detecting a crash -- isn't
> > > this only for the "dumping core" condition? In other words, don't you
> > > want to do this in get_signal()'s "fatal" block? (i.e. very close to the
> > > do_coredump, but without the "should I dump?" check?)
> > >
> > > Hmm, but maybe I'm wrong? It looks like you're looking at noticing the
> > > process taking a signal from SIG_KERNEL_COREDUMP_MASK ?
> > >
> > > (Better yet: what are fatal conditions that do NOT match
> > > SIG_KERNEL_COREDUMP_MASK, and should those be covered?)
> > >
> > > Regardless, *this* looks like the only place without an LSM hook. And it
> > > doesn't seem unreasonable to add one here. I assume it would probably
> > > just take the siginfo pointer, which is also what you're checking.
> >
> > Good point, making this an LSM might be a good idea.
> >
> > > e.g. for include/linux/lsm_hook_defs.h:
> > >
> > > LSM_HOOK(int, 0, task_coredump, const kernel_siginfo_t *siginfo);
> >
> > I guess it should probably be an LSM_RET_VOID hook? And since, as you
> > said, it's not really semantically about core dumping, maybe it should
> > be named task_fatal_signal or something like that.
>
> If I understand correctly you propose to add a new LSM hook without return
> value and place it here:
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index a38b3edc6851..074492d23e98 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2751,6 +2751,8 @@ bool get_signal(struct ksignal *ksig)
> do_coredump(&ksig->info);
> }
>
> + // Add the new LSM hook here
> +
> /*
> * Death signals, no core dump.
> */
It should probably be in the "if (sig_kernel_coredump(signr)) {"
branch. And I'm not sure whether it should be before or after
do_coredump() - if you do it after do_coredump(), the hook will have
to wait until the core dump file has been written, which may take a
little bit of time.
On Sun, Sep 13, 2020 at 7:55 PM John Wood <[email protected]> wrote:
> On Thu, Sep 10, 2020 at 11:10:38PM +0200, Jann Horn wrote:
> > On Thu, Sep 10, 2020 at 10:22 PM Kees Cook <[email protected]> wrote:
> > > To detect a fork brute force attack it is necessary to compute the
> > > crashing rate of the application. This calculation is performed in each
> > > fatal fail of a task, or in other words, when a core dump is triggered.
> > > If this rate shows that the application is crashing quickly, there is a
> > > clear signal that an attack is happening.
> > >
> > > Since the crashing rate is computed in milliseconds per fault, if this
> > > rate goes under a certain threshold a warning is triggered.
[...]
> > > + delta_jiffies = get_jiffies_64() - stats->jiffies;
> > > + delta_time = jiffies64_to_msecs(delta_jiffies);
> > > + crashing_rate = delta_time / (u64)stats->faults;
> >
> > Do I see this correctly, is this computing the total runtime of this
> > process hierarchy divided by the total number of faults seen in this
> > process hierarchy? If so, you may want to reconsider whether that's
> > really the behavior you want. For example, if I configure the minimum
> > period between crashes to be 30s (as is the default in the sysctl
> > patch), and I try to attack a server that has been running without any
> > crashes for a month, I'd instantly be able to crash around
> > 30*24*60*60/30 = 86400 times before the detection kicks in. That seems
> > suboptimal.
>
> You are right. This is not the behaviour we want. So, for the next
> version it would be better to compute the crashing period as the time
> between two faults, or the time between the execve call and the first
> fault (first fault case).
>
> However, I am afraid of a premature detection if a child process fails
> twice in a short period.
>
> So, I think it would be a good idea add a new sysctl to setup a
> minimum number of faults before the time between faults starts to be
> computed. And so, the attack detection only will be triggered if the
> application crashes quickly but after a number of crashes.
>
> What do you think?
You could keep a list of the timestamps of the last five crashes or
so, and then take action if the last five crashes happened within
(5-1)*crash_period_limit time.
Hi,
On Mon, Sep 14, 2020 at 09:39:10PM +0200, Jann Horn wrote:
> On Sun, Sep 13, 2020 at 6:56 PM John Wood <[email protected]> wrote:
> > On Fri, Sep 11, 2020 at 02:01:56AM +0200, Jann Horn wrote:
> > > On Fri, Sep 11, 2020 at 1:49 AM Kees Cook <[email protected]> wrote:
> > > > On Thu, Sep 10, 2020 at 01:21:06PM -0700, Kees Cook wrote:
> > > > [...]
> > > > I don't think this is the right place for detecting a crash -- isn't
> > > > this only for the "dumping core" condition? In other words, don't you
> > > > want to do this in get_signal()'s "fatal" block? (i.e. very close to the
> > > > do_coredump, but without the "should I dump?" check?)
> > > >
> > > > Hmm, but maybe I'm wrong? It looks like you're looking at noticing the
> > > > process taking a signal from SIG_KERNEL_COREDUMP_MASK ?
> > > >
> > > > (Better yet: what are fatal conditions that do NOT match
> > > > SIG_KERNEL_COREDUMP_MASK, and should those be covered?)
> > > >
> > > > Regardless, *this* looks like the only place without an LSM hook. And it
> > > > doesn't seem unreasonable to add one here. I assume it would probably
> > > > just take the siginfo pointer, which is also what you're checking.
> > >
> > > Good point, making this an LSM might be a good idea.
> > >
> > > > e.g. for include/linux/lsm_hook_defs.h:
> > > >
> > > > LSM_HOOK(int, 0, task_coredump, const kernel_siginfo_t *siginfo);
> > >
> > > I guess it should probably be an LSM_RET_VOID hook? And since, as you
> > > said, it's not really semantically about core dumping, maybe it should
> > > be named task_fatal_signal or something like that.
> >
> > If I understand correctly you propose to add a new LSM hook without return
> > value and place it here:
> >
> > diff --git a/kernel/signal.c b/kernel/signal.c
> > index a38b3edc6851..074492d23e98 100644
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2751,6 +2751,8 @@ bool get_signal(struct ksignal *ksig)
> > do_coredump(&ksig->info);
> > }
> >
> > + // Add the new LSM hook here
> > +
> > /*
> > * Death signals, no core dump.
> > */
>
> It should probably be in the "if (sig_kernel_coredump(signr)) {"
> branch. And I'm not sure whether it should be before or after
> do_coredump() - if you do it after do_coredump(), the hook will have
> to wait until the core dump file has been written, which may take a
> little bit of time.
But if the LSM hook is placed in the "if (sig_kernel_coredump(signr)) {"
branch, then only the following signals will be passed to it.
SIGQUIT, SIGILL, SIGTRAP, SIGABRT, SIGFPE, SIGSEGV, SIGBUS, SIGSYS,
SIGXCPU, SIGXFSZ, SIGEMT
The above signals are extracted from SIG_KERNEL_COREDUMP_MASK macro, and
are only related to coredump.
So, if we add a new LSM hook (named task_fatal_signal) to detect a fatal
signal it would be better to place it just above the if statement.
diff --git a/kernel/signal.c b/kernel/signal.c
index a38b3edc6851..406af87f2f96 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2736,6 +2736,8 @@ bool get_signal(struct ksignal *ksig)
*/
current->flags |= PF_SIGNALED;
+ // Place the new LSM hook here
+
if (sig_kernel_coredump(signr)) {
if (print_fatal_signals)
print_fatal_signal(ksig->info.si_signo);
This way all the fatal signals are caught and we also avoid the commented
delay if a core dump is necessary.
Thanks,
John Wood
On Mon, Sep 14, 2020 at 09:42:37PM +0200, Jann Horn wrote:
> On Sun, Sep 13, 2020 at 7:55 PM John Wood <[email protected]> wrote:
> > On Thu, Sep 10, 2020 at 11:10:38PM +0200, Jann Horn wrote:
> > > > + delta_jiffies = get_jiffies_64() - stats->jiffies;
> > > > + delta_time = jiffies64_to_msecs(delta_jiffies);
> > > > + crashing_rate = delta_time / (u64)stats->faults;
> > >
> > > Do I see this correctly, is this computing the total runtime of this
> > > process hierarchy divided by the total number of faults seen in this
> > > process hierarchy? If so, you may want to reconsider whether that's
> > > really the behavior you want. For example, if I configure the minimum
> > > period between crashes to be 30s (as is the default in the sysctl
> > > patch), and I try to attack a server that has been running without any
> > > crashes for a month, I'd instantly be able to crash around
> > > 30*24*60*60/30 = 86400 times before the detection kicks in. That seems
> > > suboptimal.
> >
> > You are right. This is not the behaviour we want. So, for the next
> > version it would be better to compute the crashing period as the time
> > between two faults, or the time between the execve call and the first
> > fault (first fault case).
> >
> > However, I am afraid of a premature detection if a child process fails
> > twice in a short period.
> >
> > So, I think it would be a good idea add a new sysctl to setup a
> > minimum number of faults before the time between faults starts to be
> > computed. And so, the attack detection only will be triggered if the
> > application crashes quickly but after a number of crashes.
> >
> > What do you think?
>
> You could keep a list of the timestamps of the last five crashes or
> so, and then take action if the last five crashes happened within
> (5-1)*crash_period_limit time.
Ok, your proposed solution seems a more clever one. Anyway I think that a
new sysctl for fine tuning the number of timestamps would be needed.
Thanks,
John Wood
On Thu, Sep 10, 2020 at 11:21:58PM +0200, Jann Horn wrote:
> On Thu, Sep 10, 2020 at 10:21 PM Kees Cook <[email protected]> wrote:
> > From: John Wood <[email protected]>
> >
> > Add a menu entry under "Security options" to enable the "Fork brute
> > force attack mitigation" feature.
> [...]
> > +config FBFAM
>
> Please give this a more descriptive name than FBFAM. Some name where,
> if a random kernel developer sees an "#ifdef" with that name in some
> random piece of kernel code, they immediately have a rough idea for
> what kind of feature this is.
>
> Perhaps something like THROTTLE_FORK_CRASHES. Or something else that
> is equally descriptive.
Ok, understood. This will be fixed for the next version. Thanks.
> > + bool "Fork brute force attack mitigation"
> > + default n
>
> "default n" is superfluous and should AFAIK be omitted.
Ok. I will remove it. Thanks.
> > + help
> > + This is a user defense that detects any fork brute force attack
> > + based on the application's crashing rate. When this measure is
> > + triggered the fork system call is blocked.
>
> This help text claims that the mitigation will block fork(), but patch
> 6/6 actually kills the process hierarchy.
Sorry, it's a mistake. It was the first idea but finally the implementation
changed and this description not was modified. Apologies. It will be fixed
for the next version.
Thanks,
John Wood
Hi,
On Thu, Sep 10, 2020 at 04:18:08PM -0700, Kees Cook wrote:
> On Thu, Sep 10, 2020 at 01:21:02PM -0700, Kees Cook wrote:
> > From: John Wood <[email protected]>
> >
> > Add a menu entry under "Security options" to enable the "Fork brute
> > force attack mitigation" feature.
> >
> > Signed-off-by: John Wood <[email protected]>
> > ---
> > security/Kconfig | 1 +
> > security/fbfam/Kconfig | 10 ++++++++++
> > 2 files changed, 11 insertions(+)
> > create mode 100644 security/fbfam/Kconfig
> >
> > diff --git a/security/Kconfig b/security/Kconfig
> > index 7561f6f99f1d..00a90e25b8d5 100644
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -290,6 +290,7 @@ config LSM
> > If unsure, leave this as the default.
> >
> > source "security/Kconfig.hardening"
> > +source "security/fbfam/Kconfig"
>
> Given the layout you've chosen and the interface you've got, I think
> this should just be treated like a regular LSM.
Yes, throughout the review it seems the most appropiate is treat
this feature as a regular LSM. Thanks.
> >
> > endmenu
> >
> > diff --git a/security/fbfam/Kconfig b/security/fbfam/Kconfig
> > new file mode 100644
> > index 000000000000..bbe7f6aad369
> > --- /dev/null
> > +++ b/security/fbfam/Kconfig
> > @@ -0,0 +1,10 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +config FBFAM
>
> To jump on the bikeshed: how about just calling this
> FORK_BRUTE_FORCE_DETECTION or FORK_BRUTE, and the directory could be
> "brute", etc. "fbfam" doesn't tell anyone anything.
Understood. But how about use the fbfam abbreviation in the code? Like as
function name prefix, struct name prefix, ... It would be better to use a
more descriptive name in this scenario? It is not clear to me.
> --
> Kees Cook
Thanks,
John Wood
On Thu, Sep 17, 2020 at 08:40:06PM +0200, John Wood wrote:
> Hi,
>
> On Thu, Sep 10, 2020 at 04:18:08PM -0700, Kees Cook wrote:
> > On Thu, Sep 10, 2020 at 01:21:02PM -0700, Kees Cook wrote:
> > > From: John Wood <[email protected]>
> > >
> > > Add a menu entry under "Security options" to enable the "Fork brute
> > > force attack mitigation" feature.
> > >
> > > Signed-off-by: John Wood <[email protected]>
> > > ---
> > > security/Kconfig | 1 +
> > > security/fbfam/Kconfig | 10 ++++++++++
> > > 2 files changed, 11 insertions(+)
> > > create mode 100644 security/fbfam/Kconfig
> > >
> > > diff --git a/security/Kconfig b/security/Kconfig
> > > index 7561f6f99f1d..00a90e25b8d5 100644
> > > --- a/security/Kconfig
> > > +++ b/security/Kconfig
> > > @@ -290,6 +290,7 @@ config LSM
> > > If unsure, leave this as the default.
> > >
> > > source "security/Kconfig.hardening"
> > > +source "security/fbfam/Kconfig"
> >
> > Given the layout you've chosen and the interface you've got, I think
> > this should just be treated like a regular LSM.
>
> Yes, throughout the review it seems the most appropiate is treat
> this feature as a regular LSM. Thanks.
>
> > >
> > > endmenu
> > >
> > > diff --git a/security/fbfam/Kconfig b/security/fbfam/Kconfig
> > > new file mode 100644
> > > index 000000000000..bbe7f6aad369
> > > --- /dev/null
> > > +++ b/security/fbfam/Kconfig
> > > @@ -0,0 +1,10 @@
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +config FBFAM
> >
> > To jump on the bikeshed: how about just calling this
> > FORK_BRUTE_FORCE_DETECTION or FORK_BRUTE, and the directory could be
> > "brute", etc. "fbfam" doesn't tell anyone anything.
>
> Understood. But how about use the fbfam abbreviation in the code? Like as
> function name prefix, struct name prefix, ... It would be better to use a
> more descriptive name in this scenario? It is not clear to me.
I don't feel too strongly, but I think having the CONFIG roughly match
the directory name, roughly match the function prefixes should be best.
Maybe call the directory and function prefix "brute"?
--
Kees Cook
On Thu, Sep 17, 2020 at 03:05:18PM -0700, Kees Cook wrote:
> On Thu, Sep 17, 2020 at 08:40:06PM +0200, John Wood wrote:
> > > To jump on the bikeshed: how about just calling this
> > > FORK_BRUTE_FORCE_DETECTION or FORK_BRUTE, and the directory could be
> > > "brute", etc. "fbfam" doesn't tell anyone anything.
> >
> > Understood. But how about use the fbfam abbreviation in the code? Like as
> > function name prefix, struct name prefix, ... It would be better to use a
> > more descriptive name in this scenario? It is not clear to me.
>
> I don't feel too strongly, but I think having the CONFIG roughly match
> the directory name, roughly match the function prefixes should be best.
> Maybe call the directory and function prefix "brute"?
Thanks for the clarification.
> --
> Kees Cook
Regards,
John Wood
On Thu, Sep 10, 2020 at 04:56:19PM -0700, Kees Cook wrote:
> On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> > /**
> > + * fbfam_kill_tasks() - Kill the offending tasks
> > + *
> > + * When a fork brute force attack is detected it is necessary to kill all the
> > + * offending tasks. Since this function is called from fbfam_handle_attack(),
> > + * and so, every time a core dump is triggered, only is needed to kill the
> > + * others tasks that share the same statistical data, not the current one as
> > + * this is in the path to be killed.
> > + *
> > + * When the SIGKILL signal is sent to the offending tasks, this function will be
> > + * called again during the core dump due to the shared statistical data shows a
> > + * quickly crashing rate. So, to avoid kill again the same tasks due to a
> > + * recursive call of this function, it is necessary to disable the attack
> > + * detection setting the jiffies to zero.
> > + *
> > + * To improve the for_each_process loop it is possible to end it when all the
> > + * tasks that shared the same statistics are found.
> > + *
> > + * Return: -EFAULT if the current task doesn't have statistical data. Zero
> > + * otherwise.
> > + */
> > +static int fbfam_kill_tasks(void)
> > +{
> > + struct fbfam_stats *stats = current->fbfam_stats;
> > + struct task_struct *p;
> > + unsigned int to_kill, killed = 0;
> > +
> > + if (!stats)
> > + return -EFAULT;
> > +
> > + to_kill = refcount_read(&stats->refc) - 1;
> > + if (!to_kill)
> > + return 0;
> > +
> > + /* Disable the attack detection */
> > + stats->jiffies = 0;
> > + rcu_read_lock();
> > +
> > + for_each_process(p) {
> > + if (p == current || p->fbfam_stats != stats)
> > + continue;
> > +
> > + do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_PID);
> > + pr_warn("fbfam: Offending process with PID %d killed\n",
> > + p->pid);
>
> I'd make this ratelimited (along with Jann's suggestions).
Sorry, but I don't understand what you mean with "make this ratelimited".
A clarification would be greatly appreciated.
> Also, instead of the explicit "fbfam:" prefix, use the regular
> prefixing method:
>
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Understood.
> > +
> > + killed += 1;
> > + if (killed >= to_kill)
> > + break;
> > + }
> > +
> > + rcu_read_unlock();
>
> Can't newly created processes escape this RCU read lock? I think this
> need alternate locking, or something in the task_alloc hook that will
> block any new process from being created within the stats group.
I will work on this for the next version. Thanks.
> > + return 0;
> > +}
>
> --
> Kees Cook
Thanks
John Wood
On Fri, Sep 18, 2020 at 06:02:16PM +0200, John Wood wrote:
> On Thu, Sep 10, 2020 at 04:56:19PM -0700, Kees Cook wrote:
> > On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> > > + pr_warn("fbfam: Offending process with PID %d killed\n",
> > > + p->pid);
> >
> > I'd make this ratelimited (along with Jann's suggestions).
>
> Sorry, but I don't understand what you mean with "make this ratelimited".
> A clarification would be greatly appreciated.
Ah! Yes, sorry for not being more clear. There are ratelimit helpers for
the pr_*() family of functions, e.g.:
pr_warn_ratelimited("brute: Offending process with PID...
--
Kees Cook
On Fri, Sep 18, 2020 at 02:35:12PM -0700, Kees Cook wrote:
> On Fri, Sep 18, 2020 at 06:02:16PM +0200, John Wood wrote:
> > On Thu, Sep 10, 2020 at 04:56:19PM -0700, Kees Cook wrote:
> > > On Thu, Sep 10, 2020 at 01:21:07PM -0700, Kees Cook wrote:
> > > > + pr_warn("fbfam: Offending process with PID %d killed\n",
> > > > + p->pid);
> > >
> > > I'd make this ratelimited (along with Jann's suggestions).
> >
> > Sorry, but I don't understand what you mean with "make this ratelimited".
> > A clarification would be greatly appreciated.
>
> Ah! Yes, sorry for not being more clear. There are ratelimit helpers for
> the pr_*() family of functions, e.g.:
>
> pr_warn_ratelimited("brute: Offending process with PID...
Thanks for the clarification.
> --
> Kees Cook
Regards,
John Wood