2018-11-08 10:12:46

by Chanho Min

[permalink] [raw]
Subject: [PATCH] freezer: fix freeze timeout on exec

Suspend fails due to the exec family of fuctnions blocking the freezer.
This issue has been found that it is mentioned in the ancient mail thread.
The casue is that de_thread() sleeps in TASK_UNINTERRUPTIBLE waiting for all
sub-threads to die, and we have the "deadlock" if one of them is frozen.
It causes freeze timeout as bellows.

Freezing of tasks failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
setcpushares-ls D ffffffc00008ed70 0 5817 1483 0x0040000d
Call trace:
[<ffffffc00008ed70>] __switch_to+0x88/0xa0
[<ffffffc000d1c30c>] __schedule+0x1bc/0x720
[<ffffffc000d1ca90>] schedule+0x40/0xa8
[<ffffffc0001cd784>] flush_old_exec+0xdc/0x640
[<ffffffc000220360>] load_elf_binary+0x2a8/0x1090
[<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
[<ffffffc00021c584>] load_script+0x20c/0x228
[<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
[<ffffffc0001ce8e0>] do_execveat_common.isra.14+0x4f8/0x6e8
[<ffffffc0001cedd0>] compat_SyS_execve+0x38/0x48
[<ffffffc00008de30>] el0_svc_naked+0x24/0x28

To fix this, I suggest a patch by emboding the mentioned solution.
First, revive and rework cancel_freezing_and_thaw() function whitch stops the
task from sleeping in refrigirator reliably. And, The task to be killed does not
allow to freeze.

Reference:
http://lkml.iu.edu/hypermail//linux/kernel/0702.2/1300.html

Signed-off-by: Chanho Min <[email protected]>
---
include/linux/freezer.h | 7 +++++++
kernel/freezer.c | 14 ++++++++++++++
kernel/signal.c | 7 +++++++
3 files changed, 28 insertions(+)

diff --git a/include/linux/freezer.h b/include/linux/freezer.h
index 21f5aa0..1c1e3cb 100644
--- a/include/linux/freezer.h
+++ b/include/linux/freezer.h
@@ -57,6 +57,12 @@ static inline bool try_to_freeze_unsafe(void)
might_sleep();
if (likely(!freezing(current)))
return false;
+ /*
+ * we are going to call do_exit() really soon,
+ * we have a pending SIGKILL
+ */
+ if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
+ return false;
return __refrigerator(false);
}

@@ -68,6 +74,7 @@ static inline bool try_to_freeze(void)
}

extern bool freeze_task(struct task_struct *p);
+extern void cancel_freezing_thaw_task(struct task_struct *p);
extern bool set_freezable(void);

#ifdef CONFIG_CGROUP_FREEZER
diff --git a/kernel/freezer.c b/kernel/freezer.c
index b162b74..584c5c8 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -148,6 +148,20 @@ bool freeze_task(struct task_struct *p)
return true;
}

+void cancel_freezing_thaw_task(struct task_struct *p)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&freezer_lock, flags);
+ if (freezing(p)) {
+ spin_lock(&p->sighand->siglock);
+ recalc_sigpending_and_wake(p);
+ spin_unlock(&p->sighand->siglock);
+ } else if (frozen(p))
+ wake_up_process(p);
+ spin_unlock_irqrestore(&freezer_lock, flags);
+}
+
void __thaw_task(struct task_struct *p)
{
unsigned long flags;
diff --git a/kernel/signal.c b/kernel/signal.c
index 5843c54..ca9b25b 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -41,6 +41,7 @@
#include <linux/compiler.h>
#include <linux/posix-timers.h>
#include <linux/livepatch.h>
+#include <linux/freezer.h>

#define CREATE_TRACE_POINTS
#include <trace/events/signal.h>
@@ -1273,6 +1274,12 @@ int zap_other_threads(struct task_struct *p)
/* Don't bother with already dead threads */
if (t->exit_state)
continue;
+
+ /*
+ * we can check sig->group_exit_task to detect de_thread,
+ * but perhaps it doesn't hurt if the caller is do_group_exit
+ */
+ cancel_freezing_thaw_task(t);
sigaddset(&t->pending.signal, SIGKILL);
signal_wake_up(t, 1);
}
--
2.7.4



2018-11-08 10:51:15

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH] freezer: fix freeze timeout on exec

On 11/08, Chanho Min wrote:
>
> Suspend fails due to the exec family of fuctnions blocking the freezer.
> This issue has been found that it is mentioned in the ancient mail thread.
> The casue is that de_thread() sleeps in TASK_UNINTERRUPTIBLE waiting for all
> sub-threads to die, and we have the "deadlock" if one of them is frozen.
> It causes freeze timeout as bellows.
>
> Freezing of tasks failed after 20.010 seconds (1 tasks refusing to freeze, wq_busy=0):
> setcpushares-ls D ffffffc00008ed70 0 5817 1483 0x0040000d
> Call trace:
> [<ffffffc00008ed70>] __switch_to+0x88/0xa0
> [<ffffffc000d1c30c>] __schedule+0x1bc/0x720
> [<ffffffc000d1ca90>] schedule+0x40/0xa8
> [<ffffffc0001cd784>] flush_old_exec+0xdc/0x640
> [<ffffffc000220360>] load_elf_binary+0x2a8/0x1090
> [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
> [<ffffffc00021c584>] load_script+0x20c/0x228
> [<ffffffc0001ccff4>] search_binary_handler+0x9c/0x240
> [<ffffffc0001ce8e0>] do_execveat_common.isra.14+0x4f8/0x6e8
> [<ffffffc0001cedd0>] compat_SyS_execve+0x38/0x48
> [<ffffffc00008de30>] el0_svc_naked+0x24/0x28
>
> To fix this, I suggest a patch by emboding the mentioned solution.
> First, revive and rework cancel_freezing_and_thaw() function whitch stops the
> task from sleeping in refrigirator reliably. And, The task to be killed does not
> allow to freeze.

Can't we simply change de_thread() to use freezable_schedule() ?

Oleg.


2018-11-09 00:30:24

by Chanho Min

[permalink] [raw]
Subject: RE: [PATCH] freezer: fix freeze timeout on exec

> >
> > To fix this, I suggest a patch by emboding the mentioned solution.
> > First, revive and rework cancel_freezing_and_thaw() function whitch
> > stops the task from sleeping in refrigirator reliably. And, The task
> > to be killed does not allow to freeze.
>
> Can't we simply change de_thread() to use freezable_schedule() ?
>
> Oleg.

We need to change freezable_schedule_timeout() instead.
freezable_schedule also can't be frozen if sub-threads can't stop
schedule().
Furthermore, I'm not sure if it is safe to freeze it at de_thread().

diff --git a/fs/exec.c b/fs/exec.c
index 9c5ee2a..291cbd6 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -942,7 +942,7 @@ static int de_thread(struct task_struct *tsk)
while (sig->notify_count) {
__set_current_state(TASK_KILLABLE);
spin_unlock_irq(lock);
- schedule();
+ while (!freezable_schedule_timeout(HZ));
if (unlikely(__fatal_signal_pending(tsk)))
goto killed;
spin_lock_irq(lock);

Chanho


2018-11-09 07:53:03

by Chanho Min

[permalink] [raw]
Subject: RE: [PATCH] freezer: fix freeze timeout on exec

> >
> > Can't we simply change de_thread() to use freezable_schedule() ?
> >
> > Oleg.
>
> We need to change freezable_schedule_timeout() instead.
> freezable_schedule also can't be frozen if sub-threads can't stop
> schedule().
> Furthermore, I'm not sure if it is safe to freeze it at de_thread().
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 9c5ee2a..291cbd6 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -942,7 +942,7 @@ static int de_thread(struct task_struct *tsk)
> while (sig->notify_count) {
> __set_current_state(TASK_KILLABLE);
> spin_unlock_irq(lock);
> - schedule();
> + while (!freezable_schedule_timeout(HZ));
> if (unlikely(__fatal_signal_pending(tsk)))
> goto killed;
> spin_lock_irq(lock);
>
> Chanho

Sorry, I might misunderstand freezer.
Changes to freezable_schedule() works fine. It looks safe.
I'll apply patch again.

Chanho


2018-11-11 09:51:20

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] freezer: fix freeze timeout on exec

Hi Chanho,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.20-rc1 next-20181109]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Chanho-Min/freezer-fix-freeze-timeout-on-exec/20181111-171200
config: i386-randconfig-x070-201845 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All warnings (new ones prefixed by >>):

In file included from arch/x86/include/asm/atomic.h:5:0,
from include/linux/atomic.h:7,
from include/linux/crypto.h:20,
from arch/x86/kernel/asm-offsets.c:9:
include/linux/freezer.h: In function 'try_to_freeze_unsafe':
include/linux/freezer.h:64:30: error: dereferencing pointer to incomplete type 'struct signal_struct'
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^
include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> include/linux/freezer.h:64:2: note: in expansion of macro 'if'
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^~
include/linux/compiler.h:48:24: note: in expansion of macro '__branch_check__'
# define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x)))
^~~~~~~~~~~~~~~~
include/linux/freezer.h:64:6: note: in expansion of macro 'unlikely'
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^~~~~~~~
include/linux/freezer.h:64:40: error: 'SIGNAL_GROUP_EXIT' undeclared (first use in this function)
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^
include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> include/linux/freezer.h:64:2: note: in expansion of macro 'if'
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^~
include/linux/compiler.h:48:24: note: in expansion of macro '__branch_check__'
# define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x)))
^~~~~~~~~~~~~~~~
include/linux/freezer.h:64:6: note: in expansion of macro 'unlikely'
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^~~~~~~~
include/linux/freezer.h:64:40: note: each undeclared identifier is reported only once for each function it appears in
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^
include/linux/compiler.h:58:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^~~~
>> include/linux/freezer.h:64:2: note: in expansion of macro 'if'
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^~
include/linux/compiler.h:48:24: note: in expansion of macro '__branch_check__'
# define unlikely(x) (__branch_check__(x, 0, __builtin_constant_p(x)))
^~~~~~~~~~~~~~~~
include/linux/freezer.h:64:6: note: in expansion of macro 'unlikely'
if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
^~~~~~~~
make[2]: *** [arch/x86/kernel/asm-offsets.s] Error 1
make[2]: Target '__build' not remade because of errors.
make[1]: *** [prepare0] Error 2
make[1]: Target 'prepare' not remade because of errors.
make: *** [sub-make] Error 2

vim +/if +64 include/linux/freezer.h

50
51 /*
52 * DO NOT ADD ANY NEW CALLERS OF THIS FUNCTION
53 * If try_to_freeze causes a lockdep warning it means the caller may deadlock
54 */
55 static inline bool try_to_freeze_unsafe(void)
56 {
57 might_sleep();
58 if (likely(!freezing(current)))
59 return false;
60 /*
61 * we are going to call do_exit() really soon,
62 * we have a pending SIGKILL
63 */
> 64 if (unlikely(current->signal->flags & SIGNAL_GROUP_EXIT))
65 return false;
66 return __refrigerator(false);
67 }
68

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (4.52 kB)
.config.gz (26.66 kB)
Download all attachments

2018-11-11 09:53:38

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] freezer: fix freeze timeout on exec

Hi Chanho,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.20-rc1 next-20181109]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Chanho-Min/freezer-fix-freeze-timeout-on-exec/20181111-171200
config: i386-randconfig-x002-201845 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All errors (new ones prefixed by >>):

kernel/signal.c: In function 'zap_other_threads':
>> kernel/signal.c:1283:3: error: implicit declaration of function 'cancel_freezing_thaw_task' [-Werror=implicit-function-declaration]
cancel_freezing_thaw_task(t);
^~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors

vim +/cancel_freezing_thaw_task +1283 kernel/signal.c

1260
1261 /*
1262 * Nuke all other threads in the group.
1263 */
1264 int zap_other_threads(struct task_struct *p)
1265 {
1266 struct task_struct *t = p;
1267 int count = 0;
1268
1269 p->signal->group_stop_count = 0;
1270
1271 while_each_thread(p, t) {
1272 task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK);
1273 count++;
1274
1275 /* Don't bother with already dead threads */
1276 if (t->exit_state)
1277 continue;
1278
1279 /*
1280 * we can check sig->group_exit_task to detect de_thread,
1281 * but perhaps it doesn't hurt if the caller is do_group_exit
1282 */
> 1283 cancel_freezing_thaw_task(t);
1284 sigaddset(&t->pending.signal, SIGKILL);
1285 signal_wake_up(t, 1);
1286 }
1287
1288 return count;
1289 }
1290

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (1.94 kB)
.config.gz (26.39 kB)
Download all attachments