LinuxLists.cc - [PATCH 0/4] forkbomb killer

2011-03-24 09:29:14

Subject: [PATCH 0/4] forkbomb killer

Cleaned up and fixed unclear logic, and dropped the RFC tag.
This version should be easier to read.

When a forkbomb appears, it tends to be fatal.

When a user creates a forkbomb (and sometimes reaches ulimit....
In this case,
- If the system is not in OOM, the admin may be able to kill all threads by
hand, but the forkbomb may spawn tasks faster than the admin's pkill kills them.
- If the system is in OOM, the admin needs to reboot the system.
The OOM killer is slower than a forkbomb.

So, I think a forkbomb killer would be appreciated. It's better than a reboot.

When implementing a forkbomb killer, one of the difficult cases looks like this:

# forkbomb(){ forkbomb|forkbomb & } ; forkbomb

With this, parent tasks will exit() before the system goes into OOM.
So, it's difficult to see the whole picture of the forkbomb.

This patch set introduces a subsystem that tracks each mm's history and keeps
the record even after the task exits. (It is flushed periodically.)

I tested with several forkbomb cases and this patch set seems to work fine.

Maybe some more heuristics can be added, but I think this simple
one works well enough. Any comments are welcome.
Thanks,
-Kame

2011-03-24 09:32:26

by Kamezawa Hiroyuki

Subject: [PATCH 1/5] forkbomb killer config and documentation

Kconfig and Documentation for forkbomb killer.

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
Documentation/vm/forkbomb.txt | 62 ++++++++++++++++++++++++++++++++++++++++++
mm/Kconfig | 16 ++++++++++
2 files changed, 78 insertions(+)

Index: mm-work2/Documentation/vm/forkbomb.txt
===================================================================
--- /dev/null
+++ mm-work2/Documentation/vm/forkbomb.txt
@@ -0,0 +1,62 @@
+Forkbomb.txt
+
+1. Introduction
+ Many programmers have probably written a fork-bomb program at some point.
+
+ One kind of fork-bomb makes the system unstable through the
+ memory pressure caused by the number of tasks. This kind of fork-bomb
+ can be limited by ulimit (max user processes). If it happens, a user
+ with the same user ID as the forkbomb will not be able to do anything,
+ but other users (the admin) may have a chance to kill the tasks. (Of course,
+ if the forkbomb is created by root, we have no chance to recover.)
+
+ Another kind of fork-bomb eats a lot of memory. This
+ kind of forkbomb causes heavy swapout, makes the system slow and finally
+ leads to OOM. On a swapless system, OOM comes quickly. To prevent this
+ type of bomb, memory cgroup or overcommit_memory can help. But
+ trouble happens when we don't expect it.
+
+ To recover from a fork-bomb, we generally need to kill all tasks in the
+ forkbomb tree. But if the system is in an OOM state, killing
+ them all tends to be difficult.
+
+2. Forkbomb Killer
+ The kernel provides a forkbomb killer. (see mm/Kconfig FORKBOMB_KILLER)
+ If enabled, the forkbomb killer provides 2 sysfs files.
+
+ /sys/kernel/mm/oom/mm_tracking_enabled
+ /sys/kernel/mm/oom/mm_tracking_reset_interval_msecs
+
+
+ If /sys/kernel/mm/oom/mm_tracking_enabled == enabled, the kernel records
+ all fork/vfork/exec information in a structure separate from the usual task
+ management. This information is used for tracking a task tree. Unlike the
+ process tree, it doesn't discard parent<->children information even
+ when the parent exits before its children and leaves them as orphan processes.
+ This way, even with the following script, task tracking information is
+ preserved and we have a chance to chase all processes in a fork bomb.
+
+ (example) # forkbomb(){ forkbomb|forkbomb & } ; forkbomb
+
+ But this information tracking adds a small overhead at fork/vfork/exec/exit.
+ Default is enabled.
+
+ /sys/kernel/mm/oom/mm_tracking_reset_interval_msecs
+
+ Because we cannot preserve all information since system boot, we need
+ to forget information. The forkbomb killer checks the system status once per
+ period. What is checked now is
+ 1. the number of processes.
+ 2. the number of kswapd runs.
+ 3. the number of alloc stalls. (memory reclaim)
+ If none of 1, 2, 3 has increased for mm_tracking_reset_interval_msecs,
+ all tracking information recorded before the previous period will be
+ removed.
+ IOW, by making mm_tracking_reset_interval_msecs larger, you can check
+ for a forkbomb over a longer period but will have more overhead. By making it
+ smaller, tracking records are removed earlier and fewer tasks will be killed
+ by the forkbomb killer (so you can avoid unnecessary kills).
+ Default is 30secs.
+
+
+
Index: mm-work2/mm/Kconfig
===================================================================
--- mm-work2.orig/mm/Kconfig
+++ mm-work2/mm/Kconfig
@@ -274,6 +274,22 @@ config HWPOISON_INJECT
depends on MEMORY_FAILURE && DEBUG_KERNEL && PROC_FS
select PROC_PAGE_MONITOR

+config FORKBOMB_KILLER
+ bool "Killing a tree of tasks when a forkbomb is found"
+ depends on EXPERIMENTAL
+ default n
+ select MM_OWNER
+ help
+ Provide a fork-bomb killer, which is triggered at OOM.
+ Normally, the OOM killer kills a memory-eating process.
+ But it kills tasks conservatively and cannot help
+ if a forkbomb is running. The admin may need to reboot the system
+ if the influence of the bomb cannot be limited by rlimits or
+ some security settings. The forkbomb killer kills a tree of processes
+ which started recently and eat a lot of memory. Please see
+ Documentation/vm/forkbomb.txt for details. If unsure, say N.
+
+
config NOMMU_INITIAL_TRIM_EXCESS
int "Turn on mmap() excess space trimming before booting"
depends on !MMU
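
As an illustration of the reset rule described in forkbomb.txt above, here is a
minimal userspace sketch of the "may we forget?" decision. It is not part of
the patch: struct sys_snapshot and may_forget_history() are hypothetical names,
and the comparisons (greater-than for the process count, inequality for the two
event counters) follow the checks visible later in patch 4/5.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the three counters the documentation lists. */
struct sys_snapshot {
	unsigned long nr_procs;     /* number of processes */
	unsigned long pageout_runs; /* number of kswapd runs */
	unsigned long alloc_stalls; /* number of allocation stalls */
};

/*
 * Return true when the system looked quiet during the last period:
 * the process count did not grow and neither event counter moved.
 */
static bool may_forget_history(const struct sys_snapshot *prev,
			       const struct sys_snapshot *cur)
{
	assert(prev && cur);

	if (cur->nr_procs > prev->nr_procs)
		return false;
	if (cur->pageout_runs != prev->pageout_runs)
		return false;
	if (cur->alloc_stalls != prev->alloc_stalls)
		return false;
	return true;
}
```

A caller would take one snapshot per period and compare it with the previous
one; only when the function returns true would old tracking records be dropped.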

2011-03-24 09:33:29

by Kamezawa Hiroyuki

Subject: [PATCH 2/5] forkbomb: mm tracking subsystem

This patch adds a subsystem for recording the history of mm_structs.
It records the relationship between mm_structs and
preserves it in a tree. A new record is added at fork()
and exec(). If all children have disappeared at exit(), the record
is removed.

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
fs/exec.c | 1
include/linux/mm_types.h | 3 +
include/linux/oom.h | 14 ++++++++
kernel/fork.c | 3 +
mm/oom_kill.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 96 insertions(+)

Index: mm-work2/include/linux/oom.h
===================================================================
--- mm-work2.orig/include/linux/oom.h
+++ mm-work2/include/linux/oom.h
@@ -72,5 +72,19 @@ extern struct task_struct *find_lock_tas
extern int sysctl_oom_dump_tasks;
extern int sysctl_oom_kill_allocating_task;
extern int sysctl_panic_on_oom;
+
+#ifdef CONFIG_FORKBOMB_KILLER
+extern void track_mm_history(struct mm_struct *new, struct mm_struct *old);
+extern void delete_mm_history(struct mm_struct *mm);
+#else
+static inline void
+track_mm_history(struct mm_struct *new, struct mm_struct *old)
+{
+}
+static inline void delete_mm_history(struct mm_struct *mm)
+{
+}
+#endif
+
#endif /* __KERNEL__*/
#endif /* _INCLUDE_LINUX_OOM_H */
Index: mm-work2/mm/oom_kill.c
===================================================================
--- mm-work2.orig/mm/oom_kill.c
+++ mm-work2/mm/oom_kill.c
@@ -761,3 +761,78 @@ void pagefault_out_of_memory(void)
if (!test_thread_flag(TIF_MEMDIE))
schedule_timeout_uninterruptible(1);
}
+
+#ifdef CONFIG_FORKBOMB_KILLER
+
+struct mm_history {
+ spinlock_t lock;
+ struct mm_struct *mm;
+ struct mm_history *parent;
+ struct list_head siblings;
+ struct list_head children;
+ /* scores */
+ unsigned long start_time;
+ unsigned long score;
+ unsigned int family;
+ int need_to_kill;
+};
+
+struct mm_history init_hist = {
+ .parent = &init_hist,
+ .lock = __SPIN_LOCK_UNLOCKED(init_hist.lock),
+ .siblings = LIST_HEAD_INIT(init_hist.siblings),
+ .children = LIST_HEAD_INIT(init_hist.children),
+};
+
+void track_mm_history(struct mm_struct *new, struct mm_struct *parent)
+{
+ struct mm_history *hist, *phist;
+
+ hist = kmalloc(sizeof(*hist), GFP_KERNEL);
+ if (!hist)
+ return;
+ spin_lock_init(&hist->lock);
+ INIT_LIST_HEAD(&hist->children);
+ hist->mm = new;
+ hist->start_time = jiffies;
+ if (parent)
+ phist = parent->history;
+ else
+ phist = NULL;
+ if (!phist)
+ phist = &init_hist;
+ new->history = hist;
+ hist->parent = phist;
+ spin_lock(&phist->lock);
+ list_add_tail(&hist->siblings, &phist->children);
+ spin_unlock(&phist->lock);
+ return;
+}
+
+void delete_mm_history(struct mm_struct *mm)
+{
+ struct mm_history *hist, *phist;
+ bool nochild;
+
+ if (!mm->history)
+ return;
+ hist = mm->history;
+ spin_lock(&hist->lock);
+ nochild = list_empty(&hist->children);
+ mm->history = NULL;
+ hist->mm = NULL;
+ spin_unlock(&hist->lock);
+ /* delete if we have no child */
+ while (nochild && hist != &init_hist) {
+ phist = hist->parent;
+ spin_lock(&phist->lock);
+ list_del(&hist->siblings);
+ /* delete parent if it's dead & no more child other than me.*/
+ nochild = (phist->mm == NULL && list_empty(&phist->children));
+ spin_unlock(&phist->lock);
+ kfree(hist);
+ hist = phist;
+ }
+}
+
+#endif
Index: mm-work2/fs/exec.c
===================================================================
--- mm-work2.orig/fs/exec.c
+++ mm-work2/fs/exec.c
@@ -802,6 +802,7 @@ static int exec_mmap(struct mm_struct *m
}
task_unlock(tsk);
arch_pick_mmap_layout(mm);
+ track_mm_history(mm, old_mm);
if (old_mm) {
up_read(&old_mm->mmap_sem);
BUG_ON(active_mm != old_mm);
Index: mm-work2/kernel/fork.c
===================================================================
--- mm-work2.orig/kernel/fork.c
+++ mm-work2/kernel/fork.c
@@ -559,6 +559,7 @@ void mmput(struct mm_struct *mm)
ksm_exit(mm);
khugepaged_exit(mm); /* must run before exit_mmap */
exit_mmap(mm);
+ delete_mm_history(mm);
set_mm_exe_file(mm, NULL);
if (!list_empty(&mm->mmlist)) {
spin_lock(&mmlist_lock);
@@ -706,6 +707,8 @@ struct mm_struct *dup_mm(struct task_str
if (mm->binfmt && !try_module_get(mm->binfmt->module))
goto free_pt;

+ track_mm_history(mm, oldmm);
+
return mm;

free_pt:
Index: mm-work2/include/linux/mm_types.h
===================================================================
--- mm-work2.orig/include/linux/mm_types.h
+++ mm-work2/include/linux/mm_types.h
@@ -317,6 +317,9 @@ struct mm_struct {
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
pgtable_t pmd_huge_pte; /* protected by page_table_lock */
#endif
+#ifdef CONFIG_FORKBOMB_KILLER
+ struct mm_history *history;
+#endif
};

/* Future-safe accessor for struct mm_struct's cpu_vm_mask. */
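
To make the pruning rule in delete_mm_history() easier to follow, here is a
simplified userspace model. It is only a sketch under assumptions: struct
hist_node, hist_track() and hist_delete() are hypothetical names, a child
counter replaces the kernel's sibling lists, an "alive" flag stands in for
hist->mm != NULL, and all locking is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for struct mm_history. */
struct hist_node {
	struct hist_node *parent;
	unsigned int nr_children; /* replaces the children list */
	bool alive;               /* models hist->mm != NULL */
};

/* Root record, playing the role of init_hist; it is never freed. */
static struct hist_node hist_root = { .parent = &hist_root, .alive = true };

/* Add a record under parent (or under the root if parent is NULL). */
static struct hist_node *hist_track(struct hist_node *parent)
{
	struct hist_node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	if (!parent)
		parent = &hist_root;
	n->parent = parent;
	n->alive = true;
	parent->nr_children++;
	return n;
}

/*
 * Mark a record dead. Free it only if it has no children, then walk up
 * and free every ancestor that is dead and childless as well.
 * Returns the number of records freed.
 */
static unsigned int hist_delete(struct hist_node *n)
{
	unsigned int freed = 0;

	assert(n && n != &hist_root);
	n->alive = false;
	while (n != &hist_root && !n->alive && n->nr_children == 0) {
		struct hist_node *p = n->parent;

		p->nr_children--;
		free(n);
		freed++;
		n = p;
	}
	return freed;
}
```

The point of the model: when a parent exits before its children, its record
stays in the tree, and the whole dead chain is released only when the last
descendant goes away.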

2011-03-24 09:34:42

by Kamezawa Hiroyuki

Subject: [PATCH 3/5] forkbomb : mm history scanning and locks

This patch adds code for scanning the mm_history tree. Later, we need
to scan all mm_history records in the children->parent direction.

This patch also adds a global lock which is required for scanning.
Because scanning isn't called frequently, it uses an rwsem with the help of
a percpu variable.

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
mm/oom_kill.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 116 insertions(+)

Index: mm-work2/mm/oom_kill.c
===================================================================
--- mm-work2.orig/mm/oom_kill.c
+++ mm-work2/mm/oom_kill.c
@@ -31,6 +31,7 @@
#include <linux/memcontrol.h>
#include <linux/mempolicy.h>
#include <linux/security.h>
+#include <linux/cpu.h>

int sysctl_panic_on_oom;
int sysctl_oom_kill_allocating_task;
@@ -764,6 +765,58 @@ void pagefault_out_of_memory(void)

#ifdef CONFIG_FORKBOMB_KILLER

+static DEFINE_PER_CPU(unsigned long, pcpu_history_lock);
+static DECLARE_RWSEM(hist_rwsem);
+static int need_global_history_lock;
+
+static void update_history_lock(void)
+{
+retry:
+ preempt_disable();
+ this_cpu_inc(pcpu_history_lock);
+ smp_rmb();
+ if (need_global_history_lock) {
+ this_cpu_dec(pcpu_history_lock);
+ preempt_enable();
+ down_read(&hist_rwsem);
+ up_read(&hist_rwsem);
+ goto retry;
+ }
+}
+
+static void update_history_unlock(void)
+{
+ this_cpu_dec(pcpu_history_lock);
+ preempt_enable();
+}
+
+static void scan_history_lock(void)
+{
+ int cpu;
+ bool loop;
+
+ down_write(&hist_rwsem);
+ need_global_history_lock++;
+ do {
+ loop = false;
+ get_online_cpus();
+ for_each_online_cpu(cpu)
+ if (per_cpu(pcpu_history_lock, cpu)) {
+ loop = true;
+ break;
+ }
+ put_online_cpus();
+ cpu_relax();
+ } while (loop);
+}
+
+static void scan_history_unlock(void)
+{
+ need_global_history_lock--;
+ up_write(&hist_rwsem);
+}
+
+
struct mm_history {
spinlock_t lock;
struct mm_struct *mm;
@@ -791,6 +844,7 @@ void track_mm_history(struct mm_struct *
hist = kmalloc(sizeof(*hist), GFP_KERNEL);
if (!hist)
return;
+ update_history_lock();
spin_lock_init(&hist->lock);
INIT_LIST_HEAD(&hist->children);
hist->mm = new;
@@ -806,6 +860,7 @@ void track_mm_history(struct mm_struct *
spin_lock(&phist->lock);
list_add_tail(&hist->siblings, &phist->children);
spin_unlock(&phist->lock);
+ update_history_unlock();
return;
}

@@ -816,6 +871,7 @@ void delete_mm_history(struct mm_struct

if (!mm->history)
return;
+ update_history_lock();
hist = mm->history;
spin_lock(&hist->lock);
nochild = list_empty(&hist->children);
@@ -833,6 +889,66 @@ void delete_mm_history(struct mm_struct
kfree(hist);
hist = phist;
}
+ update_history_unlock();
}

+/* Because we hold the global scan lock, we need no lock while scanning. */
+static struct mm_history* __first_child(struct mm_history *p)
+{
+ if (list_empty(&p->children))
+ return NULL;
+ return list_first_entry(&p->children, struct mm_history, siblings);
+}
+
+static struct mm_history* __next_sibling(struct mm_history *p)
+{
+ if (p->siblings.next == &p->parent->children)
+ return NULL;
+ return list_first_entry(&p->siblings, struct mm_history, siblings);
+}
+
+static struct mm_history *first_deepest_child(struct mm_history *p)
+{
+ struct mm_history *tmp;
+
+ do {
+ tmp = __first_child(p);
+ if (!tmp)
+ return p;
+ p = tmp;
+ } while (1);
+}
+
+static struct mm_history *mm_history_scan_start(struct mm_history *hist)
+{
+ return first_deepest_child(hist);
+}
+
+static struct mm_history *mm_history_scan_next(struct mm_history *pos)
+{
+ struct mm_history *tmp;
+
+ tmp = __next_sibling(pos);
+ if (!tmp)
+ return pos->parent;
+ pos = tmp;
+ pos = first_deepest_child(pos);
+ return pos;
+}
+
+#define for_each_mm_history_under(pos, root)\
+ for (pos = mm_history_scan_start(root);\
+ pos != root;\
+ pos = mm_history_scan_next(pos))
+
+#define for_each_mm_history_safe_under(pos, root, tmp)\
+ for (pos = mm_history_scan_start(root),\
+ tmp = mm_history_scan_next(pos);\
+ pos != root;\
+ pos = tmp, tmp = mm_history_scan_next(pos))
+
+#define for_each_mm_history(pos) for_each_mm_history_under((pos), &init_hist)
+#define for_each_mm_history_safe(pos, tmp)\
+ for_each_mm_history_safe_under((pos), &init_hist, (tmp))
+
#endif
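
The children->parent scan order used by for_each_mm_history_under() can be
sketched in userspace as follows. This is an illustrative model, not the patch
code: struct tnode and the helper names are hypothetical, and plain
first_child/next_sibling pointers replace the kernel's list_head links.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified tree node; the pointers replace list_head links. */
struct tnode {
	struct tnode *parent;
	struct tnode *first_child;
	struct tnode *next_sibling;
	int id;
};

/* Append child at the tail of parent's child list. */
static void add_child(struct tnode *parent, struct tnode *child)
{
	struct tnode **link = &parent->first_child;

	while (*link)
		link = &(*link)->next_sibling;
	child->parent = parent;
	child->next_sibling = NULL;
	*link = child;
}

/* Follow first-child links down to a leaf. */
static struct tnode *deepest_first(struct tnode *n)
{
	while (n->first_child)
		n = n->first_child;
	return n;
}

/* Next node in children-before-parent order. */
static struct tnode *scan_next(struct tnode *pos)
{
	if (!pos->next_sibling)
		return pos->parent;
	return deepest_first(pos->next_sibling);
}

/* Store visited ids (root excluded) into out; return how many were stored. */
static size_t scan_under(struct tnode *root, int *out, size_t cap)
{
	struct tnode *pos;
	size_t n = 0;

	assert(root && out);
	for (pos = deepest_first(root); pos != root; pos = scan_next(pos)) {
		if (n == cap)
			break;
		out[n++] = pos->id;
	}
	return n;
}
```

Every node is visited after all of its descendants, which is the property the
later score propagation relies on.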

2011-03-24 09:36:29

by Kamezawa Hiroyuki

Subject: [PATCH 4/5] forkbomb : periodic flushing mm history information

First, this patch adds a control knob to enable/disable mm_history
tracking.

Second, when tracking mm history for forkbomb detection, information about
processes which don't seem to be important for fork-bomb detection
is just noise.

This patch adds a knob for forgetting information with a periodic
check routine.

Every 30secs (configurable),
1. check that nr_processes doesn't increase
2. check that kswapd doesn't run
3. check that allocstall doesn't occur.

If none of these happens, clear mm_history records older than 30secs.

Note: reordering the objects in the Makefile was required because
mm_kobj's initcall should be called before oom's...

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
mm/Makefile | 4 -
mm/oom_kill.c | 144 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 139 insertions(+), 9 deletions(-)

Index: mm-work2/mm/oom_kill.c
===================================================================
--- mm-work2.orig/mm/oom_kill.c
+++ mm-work2/mm/oom_kill.c
@@ -768,6 +768,7 @@ void pagefault_out_of_memory(void)
static DEFINE_PER_CPU(unsigned long, pcpu_history_lock);
static DECLARE_RWSEM(hist_rwsem);
static int need_global_history_lock;
+static int mm_tracking_enabled = 1;

static void update_history_lock(void)
{
@@ -841,6 +842,9 @@ void track_mm_history(struct mm_struct *
{
struct mm_history *hist, *phist;

+ if (!mm_tracking_enabled)
+ return;
+
hist = kmalloc(sizeof(*hist), GFP_KERNEL);
if (!hist)
return;
@@ -864,19 +868,19 @@ void track_mm_history(struct mm_struct *
return;
}

-void delete_mm_history(struct mm_struct *mm)
+static void __delete_mm_history(struct mm_history *hist, bool check_ancestors)
{
- struct mm_history *hist, *phist;
+ struct mm_history *phist;
bool nochild;

- if (!mm->history)
+ if (!hist)
return;
- update_history_lock();
- hist = mm->history;
spin_lock(&hist->lock);
nochild = list_empty(&hist->children);
- mm->history = NULL;
- hist->mm = NULL;
+ if (hist->mm) {
+ hist->mm->history = NULL;
+ hist->mm = NULL;
+ }
spin_unlock(&hist->lock);
/* delete if we have no child */
while (nochild && hist != &init_hist) {
@@ -887,8 +891,16 @@ void delete_mm_history(struct mm_struct
nochild = (phist->mm == NULL && list_empty(&phist->children));
spin_unlock(&phist->lock);
kfree(hist);
+ if (!check_ancestors)
+ break;
hist = phist;
}
+}
+
+void delete_mm_history(struct mm_struct *mm)
+{
+ update_history_lock();
+ __delete_mm_history(mm->history, true);
update_history_unlock();
}

@@ -951,4 +963,122 @@ static struct mm_history *mm_history_sca
#define for_each_mm_history_safe(pos, tmp)\
for_each_mm_history_safe_under((pos), &init_hist, (tmp))

+static unsigned long reset_interval_jiffies = 30*HZ;
+unsigned long last_nr_procs;
+unsigned long last_pageout_run;
+unsigned long last_allocstall;
+static void reset_mm_tracking(struct work_struct *w);
+DECLARE_DELAYED_WORK(reset_mm_tracking_work, reset_mm_tracking);
+
+static void reset_mm_tracking(struct work_struct *w)
+{
+ struct mm_history *pos, *tmp;
+ unsigned long nr_procs;
+ unsigned long events[NR_VM_EVENT_ITEMS];
+ bool forget = true;
+
+ nr_procs = nr_processes();
+ if (nr_procs > last_nr_procs)
+ forget = false;
+ last_nr_procs = nr_procs;
+
+ all_vm_events(events);
+ if (last_pageout_run != events[PAGEOUTRUN])
+ forget = false;
+ last_pageout_run = events[PAGEOUTRUN];
+ if (last_allocstall != events[ALLOCSTALL])
+ forget = false;
+ last_allocstall = events[ALLOCSTALL];
+
+ if (forget) {
+ unsigned long thresh = jiffies - reset_interval_jiffies;
+ scan_history_lock();
+ for_each_mm_history_safe(pos, tmp) {
+ if (time_before(pos->start_time, thresh))
+ __delete_mm_history(pos, false);
+ }
+ scan_history_unlock();
+ }
+ if (mm_tracking_enabled)
+ schedule_delayed_work(&reset_mm_tracking_work,
+ reset_interval_jiffies);
+ return;
+}
+
+#define OOM_ATTR(_name)\
+ static struct kobj_attribute _name##_attr =\
+ __ATTR(_name, 0644, _name##_show, _name##_store)
+
+static ssize_t mm_tracker_reset_interval_msecs_show(struct kobject *obj,
+ struct kobj_attribute *attr, char *buf)
+{
+ return sprintf(buf, "%u", jiffies_to_msecs(reset_interval_jiffies));
+}
+
+static ssize_t mm_tracker_reset_interval_msecs_store(struct kobject *obj,
+ struct kobj_attribute *attr, const char *buf, size_t count)
+{
+ unsigned long msecs;
+ int err;
+
+ err = strict_strtoul(buf, 10, &msecs);
+ if (err || msecs > UINT_MAX)
+ return -EINVAL;
+
+ reset_interval_jiffies = msecs_to_jiffies(msecs);
+ return count;
+}
+OOM_ATTR(mm_tracker_reset_interval_msecs);
+
+static ssize_t mm_tracker_enable_show(struct kobject *obj,
+ struct kobj_attribute *attr, char *buf)
+{
+ if (mm_tracking_enabled)
+ return sprintf(buf, "enabled");
+ return sprintf(buf, "disabled");
+}
+
+static ssize_t mm_tracker_enable_store(struct kobject *obj,
+ struct kobj_attribute *attr, const char *buf, size_t count)
+{
+ if (!memcmp("disable", buf, min(sizeof("disable")-1, count)))
+ mm_tracking_enabled = 0;
+ else if (!memcmp("enable", buf, min(sizeof("enable")-1, count)))
+ mm_tracking_enabled = 1;
+ else
+ return -EINVAL;
+ if (mm_tracking_enabled
+ && !delayed_work_pending(&reset_mm_tracking_work))
+ schedule_delayed_work(&reset_mm_tracking_work,
+ reset_interval_jiffies);
+
+ return count;
+}
+OOM_ATTR(mm_tracker_enable);
+
+static struct attribute *oom_attrs[] = {
+ &mm_tracker_reset_interval_msecs_attr.attr,
+ &mm_tracker_enable_attr.attr,
+ NULL,
+};
+
+static struct attribute_group oom_attr_group = {
+ .attrs = oom_attrs,
+ .name = "oom",
+};
+
+static int __init init_mm_history(void)
+{
+ int err = 0;
+
+#ifdef CONFIG_SYSFS
+ err = sysfs_create_group(mm_kobj, &oom_attr_group);
+ if (err)
+ printk(KERN_ERR
+ "failed to register mm history tracking for oom\n");
+#endif
+ schedule_delayed_work(&reset_mm_tracking_work, reset_interval_jiffies);
+ return 0;
+}
+module_init(init_mm_history);
#endif
Index: mm-work2/mm/Makefile
===================================================================
--- mm-work2.orig/mm/Makefile
+++ mm-work2/mm/Makefile
@@ -7,11 +7,11 @@ mmu-$(CONFIG_MMU) := fremap.o highmem.o
mlock.o mmap.o mprotect.o mremap.o msync.o rmap.o \
vmalloc.o pagewalk.o pgtable-generic.o

-obj-y := filemap.o mempool.o oom_kill.o fadvise.o \
+obj-y := mm_init.o filemap.o mempool.o oom_kill.o fadvise.o \
maccess.o page_alloc.o page-writeback.o \
readahead.o swap.o truncate.o vmscan.o shmem.o \
prio_tree.o util.o mmzone.o vmstat.o backing-dev.o \
- page_isolation.o mm_init.o mmu_context.o percpu.o \
+ page_isolation.o mmu_context.o percpu.o \
$(mmu-y)
obj-y += init-mm.o
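
The age check in reset_mm_tracking() compares a record's start_time against
"jiffies - reset_interval_jiffies" with time_before(). The following userspace
sketch shows why that style of comparison keeps working when the tick counter
wraps around. tick_before() and record_is_stale() are hypothetical names, and
the cast relies on the usual two's-complement conversion from unsigned long to
long.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Wraparound-safe "a is earlier than b": the unsigned difference is
 * reinterpreted as signed, the same idea as the kernel's time_before().
 */
static bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/* A record is stale when it started more than interval ticks before now. */
static bool record_is_stale(unsigned long start, unsigned long now,
			    unsigned long interval)
{
	unsigned long thresh;

	/* The signed comparison is only meaningful within half the range. */
	assert(interval <= (unsigned long)LONG_MAX);
	thresh = now - interval;
	return tick_before(start, thresh);
}
```

A plain "start < thresh" would give the wrong answer right after the counter
wraps, because the threshold then becomes a very large unsigned value.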

2011-03-24 09:37:13

by Kamezawa Hiroyuki

Subject: [PATCH 5/5] forkbomb killer

A forkbomb killer implementation.

This patch implements a forkbomb killer which makes use of the mm_history
records. It calculates the badness of each tree of mm_history and kills
all live processes in the worst tree. This function assumes that
the mm_history of all not-guilty tasks has already been removed.

Tested with several known types of forkbombs and works well.

Note:
This doesn't have memory cgroup support because
1. it's difficult.
2. memory cgroup has oom_notify and oom_disable. A userland
management daemon can do a better job than the kernel.

Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
---
mm/oom_kill.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 123 insertions(+)

Index: mm-work2/mm/oom_kill.c
===================================================================
--- mm-work2.orig/mm/oom_kill.c
+++ mm-work2/mm/oom_kill.c
@@ -83,6 +83,18 @@ static bool has_intersects_mems_allowed(
}
#endif /* CONFIG_NUMA */

+#ifdef CONFIG_FORKBOMB_KILLER
+static bool fork_bomb_killer(unsigned long totalpages, struct mem_cgroup *mem,
+ const nodemask_t *nodemask);
+#else
+static bool fork_bomb_killer(unsigned long totalpages, struct mem_cgroup *mem,
+ const nodemask_t *nodemask)
+{
+ return false;
+}
+#endif
+
+
/*
* If this is a system OOM (not a memcg OOM) and the task selected to be
* killed is not already running at high (RT) priorities, speed up the
@@ -705,6 +717,10 @@ void out_of_memory(struct zonelist *zone
mpol_mask = (constraint == CONSTRAINT_MEMORY_POLICY) ? nodemask : NULL;
check_panic_on_oom(constraint, gfp_mask, order, mpol_mask);

+ if (!sysctl_oom_kill_allocating_task)
+ if (fork_bomb_killer(totalpages, NULL, mpol_mask))
+ return;
+
read_lock(&tasklist_lock);
if (sysctl_oom_kill_allocating_task &&
!oom_unkillable_task(current, NULL, nodemask) &&
@@ -963,6 +979,113 @@ static struct mm_history *mm_history_sca
#define for_each_mm_history_safe(pos, tmp)\
for_each_mm_history_safe_under((pos), &init_hist, (tmp))

+atomic_t forkbomb_killing;
+bool nobomb = false;
+
+void clear_forkbomb_killing(struct work_struct *w)
+{
+ atomic_set(&forkbomb_killing, 0);
+ nobomb = false;
+}
+DECLARE_DELAYED_WORK(fork_bomb_work, clear_forkbomb_killing);
+
+void reset_forkbomb_killing(void)
+{
+ schedule_delayed_work(&fork_bomb_work, 10*HZ);
+}
+
+static void get_badness_score(struct mm_history *pos, struct mem_cgroup *mem,
+ const nodemask_t *nodemask, unsigned long totalpages)
+{
+ struct task_struct *task;
+
+ if (!pos->mm)
+ return;
+ /* task struct is freed by RCU and we're under rcu_read_lock() */
+ task = pos->mm->owner;
+ if (task && !oom_unkillable_task(task, mem, nodemask))
+ pos->score += oom_badness(task, mem, nodemask, totalpages);
+}
+
+static void propagate_oom_info(struct mm_history *pos)
+{
+ struct mm_history *ppos;
+
+ ppos = pos->parent;
+ if (ppos == &init_hist) /* deadlink by timeout */
+ return;
+ /* +1 means that the child is a burden of the parent */
+ if (pos->mm) {
+ ppos->score += pos->score + 1;
+ ppos->family += pos->family;
+ } else {
+ ppos->score += pos->score;
+ ppos->family += pos->family;
+ }
+}
+
+static bool fork_bomb_killer(unsigned long totalpages, struct mem_cgroup *mem,
+ const nodemask_t *nodemask)
+{
+ struct mm_history *pos, *bomb;
+ unsigned int max_score;
+ struct task_struct *p;
+
+ if (nobomb || !mm_tracking_enabled)
+ return false;
+
+ if (atomic_inc_return(&forkbomb_killing) != 1)
+ return true;
+ /* reset information */
+ scan_history_lock();
+ nobomb = false;
+ pr_err("forkbomb detection running....\n");
+ for_each_mm_history(pos) {
+ pos->score = 0;
+ if (pos->mm)
+ pos->family = 1;
+ pos->need_to_kill = 0;
+ }
+ max_score = 0;
+ bomb = NULL;
+ for_each_mm_history(pos) {
+ get_badness_score(pos, mem, nodemask, totalpages);
+ propagate_oom_info(pos);
+ if (pos->score > max_score) {
+ bomb = pos;
+ max_score = pos->score;
+ }
+ }
+ if (!bomb || bomb->family < 10) {
+ scan_history_unlock();
+ nobomb = true;
+ reset_forkbomb_killing();
+ pr_err("no forkbomb found \n");
+ return false;
+ }
+
+ pr_err("Possible forkbomb. Killing _all_ doubtful tasks\n");
+ for_each_mm_history_under(pos, bomb) {
+ pos->need_to_kill = 1;
+ }
+ read_lock(&tasklist_lock);
+ for_each_process(p) {
+ if (!p->mm || oom_unkillable_task(p, mem, nodemask))
+ continue;
+ if (p->signal->oom_score_adj == -1000)
+ continue;
+ if (p->mm->history && p->mm->history->need_to_kill) {
+ pr_err("kill %d(%s)->%ld\n", task_pid_nr(p),
+ p->comm, p->mm->history->score);
+ force_sig(SIGKILL, p);
+ }
+ }
+ read_unlock(&tasklist_lock);
+ scan_history_unlock();
+ reset_forkbomb_killing();
+ return true;
+}
+
static unsigned long reset_interval_jiffies = 30*HZ;
unsigned long last_nr_procs;
unsigned long last_pageout_run;
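
The scoring pass in fork_bomb_killer() can be modelled in userspace roughly as
below. This is a sketch under assumptions, not the patch itself: struct rec and
pick_bomb() are hypothetical names, records sit in an array that is already in
children-before-parent order, a plain "badness" field stands in for
oom_badness(), and dead records start with a family count of 0 (the patch only
sets family for live records). The threshold of 10 is taken from the
"bomb->family < 10" test above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FAMILY_THRESHOLD 10 /* from "bomb->family < 10" in the patch */

struct rec {
	int parent;            /* index of parent record, -1 if none */
	bool alive;            /* models hist->mm != NULL */
	unsigned long badness; /* stand-in for oom_badness() */
	unsigned long score;
	unsigned int family;
};

/*
 * recs must list every child before its parent. Each record adds its
 * score to its parent (+1 if it is alive) along with its family count.
 * Returns the index of the highest-scoring record if its family is
 * large enough, otherwise -1.
 */
static int pick_bomb(struct rec *recs, size_t n)
{
	unsigned long max_score = 0;
	int bomb = -1;
	size_t i;

	assert(recs || n == 0);
	for (i = 0; i < n; i++) {
		recs[i].score = 0;
		recs[i].family = recs[i].alive ? 1 : 0;
	}
	for (i = 0; i < n; i++) {
		struct rec *r = &recs[i];

		if (r->alive)
			r->score += r->badness;
		if (r->parent >= 0) {
			struct rec *p = &recs[r->parent];

			p->score += r->score + (r->alive ? 1 : 0);
			p->family += r->family;
		}
		if (r->score > max_score) {
			max_score = r->score;
			bomb = (int)i;
		}
	}
	if (bomb < 0 || recs[bomb].family < FAMILY_THRESHOLD)
		return -1;
	return bomb;
}
```

Because children are processed first, a record's score is complete by the time
it is compared with the current maximum, so an exited ancestor of many live
tasks ends up with the largest score.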

2011-03-24 10:52:33

by Minchan Kim

Subject: Re: [PATCH 0/4] forkbomb killer
