2019-07-15 15:15:53

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

From: "Eric W. Biederman" <[email protected]>

[ Upstream commit 72abe3bcf0911d69b46c1e8bdb5612675e0ac42c ]

The locking in force_sig_info is not prepared to deal with a task that
exits or execs (as sighand may change). The is not a locking problem
in force_sig as force_sig is only built to handle synchronous
exceptions.

Further the function force_sig_info changes the signal state if the
signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the
delivery of the signal. The signal SIGKILL can not be ignored and can
not be blocked and SIGNAL_UNKILLABLE won't prevent it from being
delivered.

So using force_sig rather than send_sig for SIGKILL is confusing
and pointless.

Because it won't impact the sending of the signal and and because
using force_sig is wrong, replace force_sig with send_sig.

Cc: Namjae Jeon <[email protected]>
Cc: Jeff Layton <[email protected]>
Cc: Steve French <[email protected]>
Fixes: a5c3e1c725af ("Revert "cifs: No need to send SIGKILL to demux_thread during umount"")
Fixes: e7ddee9037e7 ("cifs: disable sharing session and tcon and add new TCP sharing code")
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
fs/cifs/connect.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 8dd6637a3cbb..714a359c7c8d 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -2631,7 +2631,7 @@ cifs_put_tcp_session(struct TCP_Server_Info *server, int from_reconnect)

task = xchg(&server->tsk, NULL);
if (task)
- force_sig(SIGKILL, task);
+ send_sig(SIGKILL, task, 1);
}

static struct TCP_Server_Info *
--
2.20.1


2019-07-24 02:36:36

by ronnie sahlberg

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

On Tue, Jul 16, 2019 at 1:15 AM Sasha Levin <[email protected]> wrote:
>
> From: "Eric W. Biederman" <[email protected]>
>
> [ Upstream commit 72abe3bcf0911d69b46c1e8bdb5612675e0ac42c ]
>
> The locking in force_sig_info is not prepared to deal with a task that
> exits or execs (as sighand may change). The is not a locking problem
> in force_sig as force_sig is only built to handle synchronous
> exceptions.
>
> Further the function force_sig_info changes the signal state if the
> signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the
> delivery of the signal. The signal SIGKILL can not be ignored and can
> not be blocked and SIGNAL_UNKILLABLE won't prevent it from being
> delivered.
>
> So using force_sig rather than send_sig for SIGKILL is confusing
> and pointless.
>
> Because it won't impact the sending of the signal and and because
> using force_sig is wrong, replace force_sig with send_sig.

I think this patch broke the cifs module.
The issue is that the use count is now not updated properly and thus
it is no longer possible to
rmmod the module.


>
> Cc: Namjae Jeon <[email protected]>
> Cc: Jeff Layton <[email protected]>
> Cc: Steve French <[email protected]>
> Fixes: a5c3e1c725af ("Revert "cifs: No need to send SIGKILL to demux_thread during umount"")
> Fixes: e7ddee9037e7 ("cifs: disable sharing session and tcon and add new TCP sharing code")
> Signed-off-by: "Eric W. Biederman" <[email protected]>
> Signed-off-by: Sasha Levin <[email protected]>
> ---
> fs/cifs/connect.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index 8dd6637a3cbb..714a359c7c8d 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -2631,7 +2631,7 @@ cifs_put_tcp_session(struct TCP_Server_Info *server, int from_reconnect)
>
> task = xchg(&server->tsk, NULL);
> if (task)
> - force_sig(SIGKILL, task);
> + send_sig(SIGKILL, task, 1);
> }
>
> static struct TCP_Server_Info *
> --
> 2.20.1
>

2019-07-24 02:37:43

by Steve French

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

Very easy to see what caused the regression with this global change:

mount (which launches "cifsd" thread to read the socket)
umount (which kills the "cifsd" thread)
rmmod (rmmod now fails since "cifsd" thread is still active)

mount launches a thread to read from the socket ("cifsd")
umount is supposed to kill that thread (but with the patch
"signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of
force_sig" that no longer works). So the regression is that after
unmount you still see the "cifsd" thread, and the reason that cifsd
thread is still around is that that patch no longer force kills the
process (see line 2652 of fs/cifs/connect.c) which regresses module
removal.

- force_sig(SIGKILL, task);
+ send_sig(SIGKILL, task, 1);

The comment in the changeset indicates "The signal SIGKILL can not be
ignored" but obviously it can be ignored - at least on 5.3-rc1 it is
being ignored.

If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task)
is removed and no longer possible - how do we kill a helper process
...

On Tue, Jul 23, 2019 at 6:20 PM ronnie sahlberg
<[email protected]> wrote:
>
> On Tue, Jul 16, 2019 at 1:15 AM Sasha Levin <[email protected]> wrote:
> >
> > From: "Eric W. Biederman" <[email protected]>
> >
> > [ Upstream commit 72abe3bcf0911d69b46c1e8bdb5612675e0ac42c ]
> >
> > The locking in force_sig_info is not prepared to deal with a task that
> > exits or execs (as sighand may change). The is not a locking problem
> > in force_sig as force_sig is only built to handle synchronous
> > exceptions.
> >
> > Further the function force_sig_info changes the signal state if the
> > signal is ignored, or blocked or if SIGNAL_UNKILLABLE will prevent the
> > delivery of the signal. The signal SIGKILL can not be ignored and can
> > not be blocked and SIGNAL_UNKILLABLE won't prevent it from being
> > delivered.
> >
> > So using force_sig rather than send_sig for SIGKILL is confusing
> > and pointless.
> >
> > Because it won't impact the sending of the signal and and because
> > using force_sig is wrong, replace force_sig with send_sig.
>
> I think this patch broke the cifs module.
> The issue is that the use count is now not updated properly and thus
> it is no longer possible to
> rmmod the module.
>
>
> >
> > Cc: Namjae Jeon <[email protected]>
> > Cc: Jeff Layton <[email protected]>
> > Cc: Steve French <[email protected]>
> > Fixes: a5c3e1c725af ("Revert "cifs: No need to send SIGKILL to demux_thread during umount"")
> > Fixes: e7ddee9037e7 ("cifs: disable sharing session and tcon and add new TCP sharing code")
> > Signed-off-by: "Eric W. Biederman" <[email protected]>
> > Signed-off-by: Sasha Levin <[email protected]>
> > ---
> > fs/cifs/connect.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > index 8dd6637a3cbb..714a359c7c8d 100644
> > --- a/fs/cifs/connect.c
> > +++ b/fs/cifs/connect.c
> > @@ -2631,7 +2631,7 @@ cifs_put_tcp_session(struct TCP_Server_Info *server, int from_reconnect)
> >
> > task = xchg(&server->tsk, NULL);
> > if (task)
> > - force_sig(SIGKILL, task);
> > + send_sig(SIGKILL, task, 1);
> > }
> >
> > static struct TCP_Server_Info *
> > --
> > 2.20.1
> >



--
Thanks,

Steve

2019-07-24 02:38:48

by Steve French

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

I did some additional testing and it looks like the "allow_signal"
change may be safe enough

# git diff -a
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index a4830ced0f98..a15a6e738eb5 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
mempool_resize(cifs_req_poolp, length + cifs_min_rcv);

set_freezable();
+ allow_signal(SIGKILL);
while (server->tcpStatus != CifsExiting) {
if (try_to_freeze())
continue;

See below:
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# insmod ./cifs.ko
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# mount -t cifs
//localhost/scratch /mnt -o username=sfrench
Password for sfrench@//localhost/scratch: ************
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ps -A | grep cifsd
5176 ? 00:00:00 cifsd
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# kill -9 5176
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ls /mnt
0444 dir0750 dir0754 newfile
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# umount /mnt
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ps -A | grep cifsd
root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# rmmod cifs

On Tue, Jul 23, 2019 at 9:19 PM Steve French <[email protected]> wrote:
>
> Pavel noticed I missed a line from the attempt to do a similar patch
> to Eric's suggestion
> (it still didn't work though - although "allow_signal" does albeit is
> possibly dangerous as user space can kill cifsd)
>
> # git diff -a
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index a4830ced0f98..8758dff18c15 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
> struct task_struct *task_to_wake = NULL;
> struct mid_q_entry *mids[MAX_COMPOUND];
> char *bufs[MAX_COMPOUND];
> + sigset_t mask, oldmask;
>
> current->flags |= PF_MEMALLOC;
> cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
> @@ -1113,6 +1114,9 @@ cifs_demultiplex_thread(void *p)
> mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>
> set_freezable();
> + sigfillset(&mask);
> + sigdelset(&mask, SIGKILL);
> + sigprocmask(SIG_BLOCK, &mask, &oldmask);
> while (server->tcpStatus != CifsExiting) {
> if (try_to_freeze())
> continue;
>
> On Tue, Jul 23, 2019 at 9:02 PM Steve French <[email protected]> wrote:
> >
> > On Tue, Jul 23, 2019 at 8:32 PM Eric W. Biederman <[email protected]> wrote:
> > >
> > > Steve French <[email protected]> writes:
> > >
> > > > Very easy to see what caused the regression with this global change:
> > > >
> > > > mount (which launches "cifsd" thread to read the socket)
> > > > umount (which kills the "cifsd" thread)
> > > > rmmod (rmmod now fails since "cifsd" thread is still active)
> > > >
> > > > mount launches a thread to read from the socket ("cifsd")
> > > > umount is supposed to kill that thread (but with the patch
> > > > "signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of
> > > > force_sig" that no longer works). So the regression is that after
> > > > unmount you still see the "cifsd" thread, and the reason that cifsd
> > > > thread is still around is that that patch no longer force kills the
> > > > process (see line 2652 of fs/cifs/connect.c) which regresses module
> > > > removal.
> > > >
> > > > - force_sig(SIGKILL, task);
> > > > + send_sig(SIGKILL, task, 1);
> > > >
> > > > The comment in the changeset indicates "The signal SIGKILL can not be
> > > > ignored" but obviously it can be ignored - at least on 5.3-rc1 it is
> > > > being ignored.
> > > >
> > > > If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task)
> > > > is removed and no longer possible - how do we kill a helper process
> > > > ...
> > >
> > > I think I see what is happening. It looks like as well as misuinsg
> > > force_sig, cifs is also violating the invariant that keeps SIGKILL out
> > > of the blocked signal set.
> > >
> > > For that force_sig will act differently. I did not consider it because
> > > that is never supposed to happen.
> > >
> > > Can someone test this code below and confirm the issue goes away?
> > >
> > > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> > > index 5d6d44bfe10a..2a782ebc7b65 100644
> > > --- a/fs/cifs/transport.c
> > > +++ b/fs/cifs/transport.c
> > > @@ -347,6 +347,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
> > > */
> > >
> > > sigfillset(&mask);
> > > + sigdelset(&mask, SIGKILL);
> > > sigprocmask(SIG_BLOCK, &mask, &oldmask);
> > >
> > > /* Generate a rfc1002 marker for SMB2+ */
> > >
> > >
> > > Eric
> >
> > I just tried your suggestion and it didn't work. I also tried doing
> > a similar thing on the thread we are trying to kills ("cifsd" - ie
> > which is blocked in the function cifs_demultiplex_thread waiting to
> > read from the socket)
> > # git diff -a
> > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > index a4830ced0f98..b73062520a17 100644
> > --- a/fs/cifs/connect.c
> > +++ b/fs/cifs/connect.c
> > @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
> > struct task_struct *task_to_wake = NULL;
> > struct mid_q_entry *mids[MAX_COMPOUND];
> > char *bufs[MAX_COMPOUND];
> > + sigset_t mask;
> >
> > current->flags |= PF_MEMALLOC;
> > cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
> > @@ -1113,6 +1114,8 @@ cifs_demultiplex_thread(void *p)
> > mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
> >
> > set_freezable();
> > + sigfillset(&mask);
> > + sigdelset(&mask, SIGKILL);
> > while (server->tcpStatus != CifsExiting) {
> > if (try_to_freeze())
> > continue;
> >
> >
> > That also didn't work. The only thing I have been able to find
> > which worked was:
> >
> > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > index a4830ced0f98..e74f04163fc9 100644
> > --- a/fs/cifs/connect.c
> > +++ b/fs/cifs/connect.c
> > @@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
> > mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
> >
> > set_freezable();
> > + allow_signal(SIGKILL);
> > while (server->tcpStatus != CifsExiting) {
> > if (try_to_freeze())
> > continue;
> >
> >
> > That fixes the problem ... but ... as Ronnie and others have noted it
> > would allow a userspace process to make the mount unusable (all you
> > would have to do would be to do a kill -9 of the "cifsd" process from
> > some userspace process like bash and the mount would be unusable - so
> > this sounds dangerous.
> >
> > Is there an alternative that, in the process doing the unmount in
> > kernel, would allow us to do the equivalent of:
> > "allow_signal(SIGKILL, <the id of the cifsd process>"
> > In otherwords, to minimize the risk of some userspace process killing
> > cifsd, could we delay enabling allow_signal(SIGKILL) till the unmount
> > begins by doing it for a different process (have the unmount process
> > enable signals for the cifsd process). Otherwise is there a way to
> > force kill a process from the kernel as we used to do - without
> > running the risk of a user space process killing cifsd (which is bad).
> >
> > --
> > Thanks,
> >
> > Steve
>
>
>
> --
> Thanks,
>
> Steve



--
Thanks,

Steve

2019-07-24 02:39:05

by Steve French

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

On Tue, Jul 23, 2019 at 8:32 PM Eric W. Biederman <[email protected]> wrote:
>
> Steve French <[email protected]> writes:
>
> > Very easy to see what caused the regression with this global change:
> >
> > mount (which launches "cifsd" thread to read the socket)
> > umount (which kills the "cifsd" thread)
> > rmmod (rmmod now fails since "cifsd" thread is still active)
> >
> > mount launches a thread to read from the socket ("cifsd")
> > umount is supposed to kill that thread (but with the patch
> > "signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of
> > force_sig" that no longer works). So the regression is that after
> > unmount you still see the "cifsd" thread, and the reason that cifsd
> > thread is still around is that that patch no longer force kills the
> > process (see line 2652 of fs/cifs/connect.c) which regresses module
> > removal.
> >
> > - force_sig(SIGKILL, task);
> > + send_sig(SIGKILL, task, 1);
> >
> > The comment in the changeset indicates "The signal SIGKILL can not be
> > ignored" but obviously it can be ignored - at least on 5.3-rc1 it is
> > being ignored.
> >
> > If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task)
> > is removed and no longer possible - how do we kill a helper process
> > ...
>
> I think I see what is happening. It looks like as well as misuinsg
> force_sig, cifs is also violating the invariant that keeps SIGKILL out
> of the blocked signal set.
>
> For that force_sig will act differently. I did not consider it because
> that is never supposed to happen.
>
> Can someone test this code below and confirm the issue goes away?
>
> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> index 5d6d44bfe10a..2a782ebc7b65 100644
> --- a/fs/cifs/transport.c
> +++ b/fs/cifs/transport.c
> @@ -347,6 +347,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
> */
>
> sigfillset(&mask);
> + sigdelset(&mask, SIGKILL);
> sigprocmask(SIG_BLOCK, &mask, &oldmask);
>
> /* Generate a rfc1002 marker for SMB2+ */
>
>
> Eric

I just tried your suggestion and it didn't work. I also tried doing
a similar thing on the thread we are trying to kills ("cifsd" - ie
which is blocked in the function cifs_demultiplex_thread waiting to
read from the socket)
# git diff -a
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index a4830ced0f98..b73062520a17 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
struct task_struct *task_to_wake = NULL;
struct mid_q_entry *mids[MAX_COMPOUND];
char *bufs[MAX_COMPOUND];
+ sigset_t mask;

current->flags |= PF_MEMALLOC;
cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
@@ -1113,6 +1114,8 @@ cifs_demultiplex_thread(void *p)
mempool_resize(cifs_req_poolp, length + cifs_min_rcv);

set_freezable();
+ sigfillset(&mask);
+ sigdelset(&mask, SIGKILL);
while (server->tcpStatus != CifsExiting) {
if (try_to_freeze())
continue;


That also didn't work. The only thing I have been able to find
which worked was:

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index a4830ced0f98..e74f04163fc9 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
mempool_resize(cifs_req_poolp, length + cifs_min_rcv);

set_freezable();
+ allow_signal(SIGKILL);
while (server->tcpStatus != CifsExiting) {
if (try_to_freeze())
continue;


That fixes the problem ... but ... as Ronnie and others have noted it
would allow a userspace process to make the mount unusable (all you
would have to do would be to do a kill -9 of the "cifsd" process from
some userspace process like bash and the mount would be unusable - so
this sounds dangerous.

Is there an alternative that, in the process doing the unmount in
kernel, would allow us to do the equivalent of:
"allow_signal(SIGKILL, <the id of the cifsd process>"
In otherwords, to minimize the risk of some userspace process killing
cifsd, could we delay enabling allow_signal(SIGKILL) till the unmount
begins by doing it for a different process (have the unmount process
enable signals for the cifsd process). Otherwise is there a way to
force kill a process from the kernel as we used to do - without
running the risk of a user space process killing cifsd (which is bad).

--
Thanks,

Steve

2019-07-24 02:39:55

by Steve French

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

Pavel noticed I missed a line from the attempt to do a similar patch
to Eric's suggestion
(it still didn't work though - although "allow_signal" does albeit is
possibly dangerous as user space can kill cifsd)

# git diff -a
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index a4830ced0f98..8758dff18c15 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
struct task_struct *task_to_wake = NULL;
struct mid_q_entry *mids[MAX_COMPOUND];
char *bufs[MAX_COMPOUND];
+ sigset_t mask, oldmask;

current->flags |= PF_MEMALLOC;
cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
@@ -1113,6 +1114,9 @@ cifs_demultiplex_thread(void *p)
mempool_resize(cifs_req_poolp, length + cifs_min_rcv);

set_freezable();
+ sigfillset(&mask);
+ sigdelset(&mask, SIGKILL);
+ sigprocmask(SIG_BLOCK, &mask, &oldmask);
while (server->tcpStatus != CifsExiting) {
if (try_to_freeze())
continue;

On Tue, Jul 23, 2019 at 9:02 PM Steve French <[email protected]> wrote:
>
> On Tue, Jul 23, 2019 at 8:32 PM Eric W. Biederman <[email protected]> wrote:
> >
> > Steve French <[email protected]> writes:
> >
> > > Very easy to see what caused the regression with this global change:
> > >
> > > mount (which launches "cifsd" thread to read the socket)
> > > umount (which kills the "cifsd" thread)
> > > rmmod (rmmod now fails since "cifsd" thread is still active)
> > >
> > > mount launches a thread to read from the socket ("cifsd")
> > > umount is supposed to kill that thread (but with the patch
> > > "signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of
> > > force_sig" that no longer works). So the regression is that after
> > > unmount you still see the "cifsd" thread, and the reason that cifsd
> > > thread is still around is that that patch no longer force kills the
> > > process (see line 2652 of fs/cifs/connect.c) which regresses module
> > > removal.
> > >
> > > - force_sig(SIGKILL, task);
> > > + send_sig(SIGKILL, task, 1);
> > >
> > > The comment in the changeset indicates "The signal SIGKILL can not be
> > > ignored" but obviously it can be ignored - at least on 5.3-rc1 it is
> > > being ignored.
> > >
> > > If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task)
> > > is removed and no longer possible - how do we kill a helper process
> > > ...
> >
> > I think I see what is happening. It looks like as well as misuinsg
> > force_sig, cifs is also violating the invariant that keeps SIGKILL out
> > of the blocked signal set.
> >
> > For that force_sig will act differently. I did not consider it because
> > that is never supposed to happen.
> >
> > Can someone test this code below and confirm the issue goes away?
> >
> > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> > index 5d6d44bfe10a..2a782ebc7b65 100644
> > --- a/fs/cifs/transport.c
> > +++ b/fs/cifs/transport.c
> > @@ -347,6 +347,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
> > */
> >
> > sigfillset(&mask);
> > + sigdelset(&mask, SIGKILL);
> > sigprocmask(SIG_BLOCK, &mask, &oldmask);
> >
> > /* Generate a rfc1002 marker for SMB2+ */
> >
> >
> > Eric
>
> I just tried your suggestion and it didn't work. I also tried doing
> a similar thing on the thread we are trying to kills ("cifsd" - ie
> which is blocked in the function cifs_demultiplex_thread waiting to
> read from the socket)
> # git diff -a
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index a4830ced0f98..b73062520a17 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
> struct task_struct *task_to_wake = NULL;
> struct mid_q_entry *mids[MAX_COMPOUND];
> char *bufs[MAX_COMPOUND];
> + sigset_t mask;
>
> current->flags |= PF_MEMALLOC;
> cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
> @@ -1113,6 +1114,8 @@ cifs_demultiplex_thread(void *p)
> mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>
> set_freezable();
> + sigfillset(&mask);
> + sigdelset(&mask, SIGKILL);
> while (server->tcpStatus != CifsExiting) {
> if (try_to_freeze())
> continue;
>
>
> That also didn't work. The only thing I have been able to find
> which worked was:
>
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index a4830ced0f98..e74f04163fc9 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
> mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>
> set_freezable();
> + allow_signal(SIGKILL);
> while (server->tcpStatus != CifsExiting) {
> if (try_to_freeze())
> continue;
>
>
> That fixes the problem ... but ... as Ronnie and others have noted it
> would allow a userspace process to make the mount unusable (all you
> would have to do would be to do a kill -9 of the "cifsd" process from
> some userspace process like bash and the mount would be unusable - so
> this sounds dangerous.
>
> Is there an alternative that, in the process doing the unmount in
> kernel, would allow us to do the equivalent of:
> "allow_signal(SIGKILL, <the id of the cifsd process>"
> In otherwords, to minimize the risk of some userspace process killing
> cifsd, could we delay enabling allow_signal(SIGKILL) till the unmount
> begins by doing it for a different process (have the unmount process
> enable signals for the cifsd process). Otherwise is there a way to
> force kill a process from the kernel as we used to do - without
> running the risk of a user space process killing cifsd (which is bad).
>
> --
> Thanks,
>
> Steve



--
Thanks,

Steve

2019-07-24 03:23:59

by Steve French

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

Patch attached - tests out ok.


On Tue, Jul 23, 2019 at 9:28 PM Steve French <[email protected]> wrote:
>
> I did some additional testing and it looks like the "allow_signal"
> change may be safe enough
>
> # git diff -a
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index a4830ced0f98..a15a6e738eb5 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
> mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>
> set_freezable();
> + allow_signal(SIGKILL);
> while (server->tcpStatus != CifsExiting) {
> if (try_to_freeze())
> continue;
>
> See below:
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# insmod ./cifs.ko
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# mount -t cifs
> //localhost/scratch /mnt -o username=sfrench
> Password for sfrench@//localhost/scratch: ************
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ps -A | grep cifsd
> 5176 ? 00:00:00 cifsd
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# kill -9 5176
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ls /mnt
> 0444 dir0750 dir0754 newfile
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# umount /mnt
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ps -A | grep cifsd
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# rmmod cifs
>
> On Tue, Jul 23, 2019 at 9:19 PM Steve French <[email protected]> wrote:
> >
> > Pavel noticed I missed a line from the attempt to do a similar patch
> > to Eric's suggestion
> > (it still didn't work though - although "allow_signal" does albeit is
> > possibly dangerous as user space can kill cifsd)
> >
> > # git diff -a
> > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > index a4830ced0f98..8758dff18c15 100644
> > --- a/fs/cifs/connect.c
> > +++ b/fs/cifs/connect.c
> > @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
> > struct task_struct *task_to_wake = NULL;
> > struct mid_q_entry *mids[MAX_COMPOUND];
> > char *bufs[MAX_COMPOUND];
> > + sigset_t mask, oldmask;
> >
> > current->flags |= PF_MEMALLOC;
> > cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
> > @@ -1113,6 +1114,9 @@ cifs_demultiplex_thread(void *p)
> > mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
> >
> > set_freezable();
> > + sigfillset(&mask);
> > + sigdelset(&mask, SIGKILL);
> > + sigprocmask(SIG_BLOCK, &mask, &oldmask);
> > while (server->tcpStatus != CifsExiting) {
> > if (try_to_freeze())
> > continue;
> >
> > On Tue, Jul 23, 2019 at 9:02 PM Steve French <[email protected]> wrote:
> > >
> > > On Tue, Jul 23, 2019 at 8:32 PM Eric W. Biederman <[email protected]> wrote:
> > > >
> > > > Steve French <[email protected]> writes:
> > > >
> > > > > Very easy to see what caused the regression with this global change:
> > > > >
> > > > > mount (which launches "cifsd" thread to read the socket)
> > > > > umount (which kills the "cifsd" thread)
> > > > > rmmod (rmmod now fails since "cifsd" thread is still active)
> > > > >
> > > > > mount launches a thread to read from the socket ("cifsd")
> > > > > umount is supposed to kill that thread (but with the patch
> > > > > "signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of
> > > > > force_sig" that no longer works). So the regression is that after
> > > > > unmount you still see the "cifsd" thread, and the reason that cifsd
> > > > > thread is still around is that that patch no longer force kills the
> > > > > process (see line 2652 of fs/cifs/connect.c) which regresses module
> > > > > removal.
> > > > >
> > > > > - force_sig(SIGKILL, task);
> > > > > + send_sig(SIGKILL, task, 1);
> > > > >
> > > > > The comment in the changeset indicates "The signal SIGKILL can not be
> > > > > ignored" but obviously it can be ignored - at least on 5.3-rc1 it is
> > > > > being ignored.
> > > > >
> > > > > If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task)
> > > > > is removed and no longer possible - how do we kill a helper process
> > > > > ...
> > > >
> > > > I think I see what is happening. It looks like as well as misuinsg
> > > > force_sig, cifs is also violating the invariant that keeps SIGKILL out
> > > > of the blocked signal set.
> > > >
> > > > For that force_sig will act differently. I did not consider it because
> > > > that is never supposed to happen.
> > > >
> > > > Can someone test this code below and confirm the issue goes away?
> > > >
> > > > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> > > > index 5d6d44bfe10a..2a782ebc7b65 100644
> > > > --- a/fs/cifs/transport.c
> > > > +++ b/fs/cifs/transport.c
> > > > @@ -347,6 +347,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
> > > > */
> > > >
> > > > sigfillset(&mask);
> > > > + sigdelset(&mask, SIGKILL);
> > > > sigprocmask(SIG_BLOCK, &mask, &oldmask);
> > > >
> > > > /* Generate a rfc1002 marker for SMB2+ */
> > > >
> > > >
> > > > Eric
> > >
> > > I just tried your suggestion and it didn't work. I also tried doing
> > > a similar thing on the thread we are trying to kills ("cifsd" - ie
> > > which is blocked in the function cifs_demultiplex_thread waiting to
> > > read from the socket)
> > > # git diff -a
> > > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > > index a4830ced0f98..b73062520a17 100644
> > > --- a/fs/cifs/connect.c
> > > +++ b/fs/cifs/connect.c
> > > @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
> > > struct task_struct *task_to_wake = NULL;
> > > struct mid_q_entry *mids[MAX_COMPOUND];
> > > char *bufs[MAX_COMPOUND];
> > > + sigset_t mask;
> > >
> > > current->flags |= PF_MEMALLOC;
> > > cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
> > > @@ -1113,6 +1114,8 @@ cifs_demultiplex_thread(void *p)
> > > mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
> > >
> > > set_freezable();
> > > + sigfillset(&mask);
> > > + sigdelset(&mask, SIGKILL);
> > > while (server->tcpStatus != CifsExiting) {
> > > if (try_to_freeze())
> > > continue;
> > >
> > >
> > > That also didn't work. The only thing I have been able to find
> > > which worked was:
> > >
> > > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> > > index a4830ced0f98..e74f04163fc9 100644
> > > --- a/fs/cifs/connect.c
> > > +++ b/fs/cifs/connect.c
> > > @@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
> > > mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
> > >
> > > set_freezable();
> > > + allow_signal(SIGKILL);
> > > while (server->tcpStatus != CifsExiting) {
> > > if (try_to_freeze())
> > > continue;
> > >
> > >
> > > That fixes the problem ... but ... as Ronnie and others have noted it
> > > would allow a userspace process to make the mount unusable (all you
> > > would have to do would be to do a kill -9 of the "cifsd" process from
> > > some userspace process like bash and the mount would be unusable - so
> > > this sounds dangerous.
> > >
> > > Is there an alternative that, in the process doing the unmount in
> > > kernel, would allow us to do the equivalent of:
> > > "allow_signal(SIGKILL, <the id of the cifsd process>"
> > > In otherwords, to minimize the risk of some userspace process killing
> > > cifsd, could we delay enabling allow_signal(SIGKILL) till the unmount
> > > begins by doing it for a different process (have the unmount process
> > > enable signals for the cifsd process). Otherwise is there a way to
> > > force kill a process from the kernel as we used to do - without
> > > running the risk of a user space process killing cifsd (which is bad).
> > >
> > > --
> > > Thanks,
> > >
> > > Steve
> >
> >
> >
> > --
> > Thanks,
> >
> > Steve
>
>
>
> --
> Thanks,
>
> Steve



--
Thanks,

Steve


Attachments:
0001-cifs-fix-rmmod-regression-in-cifs.ko-caused-by-force.patch (1.08 kB)

2019-07-24 03:36:39

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

Steve French <[email protected]> writes:

> I did some additional testing and it looks like the "allow_signal"
> change may be safe enough
>
> # git diff -a
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index a4830ced0f98..a15a6e738eb5 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
> mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>
> set_freezable();
> + allow_signal(SIGKILL);
> while (server->tcpStatus != CifsExiting) {
> if (try_to_freeze())
> continue;
>
> See below:
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# insmod ./cifs.ko
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# mount -t cifs
> //localhost/scratch /mnt -o username=sfrench
> Password for sfrench@//localhost/scratch: ************
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ps -A | grep cifsd
> 5176 ? 00:00:00 cifsd
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# kill -9 5176
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ls /mnt
> 0444 dir0750 dir0754 newfile
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# umount /mnt
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# ps -A | grep cifsd
> root@smf-Thinkpad-P51:~/cifs-2.6/fs/cifs# rmmod cifs

Yes. I just discovered that kthreadd calls a function named
ignore_signals that set all signals to SIG_IGN. Which becomes
the default for all kernel threads. So something like allow_signal
to change the signal handler is necessary. The blocking of SIGKILL is
also concerning but apparently that is not the issue here.

Ideally I think cifs should use kthread_stop, instead of signals for
this purpose. The logic is convoluted enough that reading through the
cifs code quickly I don't see how sending SIGKILL to the daemon causes
it to stop.

Eric


> On Tue, Jul 23, 2019 at 9:19 PM Steve French <[email protected]> wrote:
>>
>> Pavel noticed I missed a line from the attempt to do a similar patch
>> to Eric's suggestion
>> (it still didn't work though - although "allow_signal" does albeit is
>> possibly dangerous as user space can kill cifsd)
>>
>> # git diff -a
>> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
>> index a4830ced0f98..8758dff18c15 100644
>> --- a/fs/cifs/connect.c
>> +++ b/fs/cifs/connect.c
>> @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
>> struct task_struct *task_to_wake = NULL;
>> struct mid_q_entry *mids[MAX_COMPOUND];
>> char *bufs[MAX_COMPOUND];
>> + sigset_t mask, oldmask;
>>
>> current->flags |= PF_MEMALLOC;
>> cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
>> @@ -1113,6 +1114,9 @@ cifs_demultiplex_thread(void *p)
>> mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>>
>> set_freezable();
>> + sigfillset(&mask);
>> + sigdelset(&mask, SIGKILL);
>> + sigprocmask(SIG_BLOCK, &mask, &oldmask);
>> while (server->tcpStatus != CifsExiting) {
>> if (try_to_freeze())
>> continue;
>>
>> On Tue, Jul 23, 2019 at 9:02 PM Steve French <[email protected]> wrote:
>> >
>> > On Tue, Jul 23, 2019 at 8:32 PM Eric W. Biederman <[email protected]> wrote:
>> > >
>> > > Steve French <[email protected]> writes:
>> > >
>> > > > Very easy to see what caused the regression with this global change:
>> > > >
>> > > > mount (which launches "cifsd" thread to read the socket)
>> > > > umount (which kills the "cifsd" thread)
>> > > > rmmod (rmmod now fails since "cifsd" thread is still active)
>> > > >
>> > > > mount launches a thread to read from the socket ("cifsd")
>> > > > umount is supposed to kill that thread (but with the patch
>> > > > "signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of
>> > > > force_sig" that no longer works). So the regression is that after
>> > > > unmount you still see the "cifsd" thread, and the reason that cifsd
>> > > > thread is still around is that that patch no longer force kills the
>> > > > process (see line 2652 of fs/cifs/connect.c) which regresses module
>> > > > removal.
>> > > >
>> > > > - force_sig(SIGKILL, task);
>> > > > + send_sig(SIGKILL, task, 1);
>> > > >
>> > > > The comment in the changeset indicates "The signal SIGKILL can not be
>> > > > ignored" but obviously it can be ignored - at least on 5.3-rc1 it is
>> > > > being ignored.
>> > > >
>> > > > If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task)
>> > > > is removed and no longer possible - how do we kill a helper process
>> > > > ...
>> > >
>> > > I think I see what is happening. It looks like as well as misuinsg
>> > > force_sig, cifs is also violating the invariant that keeps SIGKILL out
>> > > of the blocked signal set.
>> > >
>> > > For that force_sig will act differently. I did not consider it because
>> > > that is never supposed to happen.
>> > >
>> > > Can someone test this code below and confirm the issue goes away?
>> > >
>> > > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
>> > > index 5d6d44bfe10a..2a782ebc7b65 100644
>> > > --- a/fs/cifs/transport.c
>> > > +++ b/fs/cifs/transport.c
>> > > @@ -347,6 +347,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
>> > > */
>> > >
>> > > sigfillset(&mask);
>> > > + sigdelset(&mask, SIGKILL);
>> > > sigprocmask(SIG_BLOCK, &mask, &oldmask);
>> > >
>> > > /* Generate a rfc1002 marker for SMB2+ */
>> > >
>> > >
>> > > Eric
>> >
>> > I just tried your suggestion and it didn't work. I also tried doing
>> > a similar thing on the thread we are trying to kills ("cifsd" - ie
>> > which is blocked in the function cifs_demultiplex_thread waiting to
>> > read from the socket)
>> > # git diff -a
>> > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
>> > index a4830ced0f98..b73062520a17 100644
>> > --- a/fs/cifs/connect.c
>> > +++ b/fs/cifs/connect.c
>> > @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p)
>> > struct task_struct *task_to_wake = NULL;
>> > struct mid_q_entry *mids[MAX_COMPOUND];
>> > char *bufs[MAX_COMPOUND];
>> > + sigset_t mask;
>> >
>> > current->flags |= PF_MEMALLOC;
>> > cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current));
>> > @@ -1113,6 +1114,8 @@ cifs_demultiplex_thread(void *p)
>> > mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>> >
>> > set_freezable();
>> > + sigfillset(&mask);
>> > + sigdelset(&mask, SIGKILL);
>> > while (server->tcpStatus != CifsExiting) {
>> > if (try_to_freeze())
>> > continue;
>> >
>> >
>> > That also didn't work. The only thing I have been able to find
>> > which worked was:
>> >
>> > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
>> > index a4830ced0f98..e74f04163fc9 100644
>> > --- a/fs/cifs/connect.c
>> > +++ b/fs/cifs/connect.c
>> > @@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p)
>> > mempool_resize(cifs_req_poolp, length + cifs_min_rcv);
>> >
>> > set_freezable();
>> > + allow_signal(SIGKILL);
>> > while (server->tcpStatus != CifsExiting) {
>> > if (try_to_freeze())
>> > continue;
>> >
>> >
>> > That fixes the problem ... but ... as Ronnie and others have noted it
>> > would allow a userspace process to make the mount unusable (all you
>> > would have to do would be to do a kill -9 of the "cifsd" process from
>> > some userspace process like bash and the mount would be unusable - so
>> > this sounds dangerous.
>> >
>> > Is there an alternative that, in the process doing the unmount in
>> > kernel, would allow us to do the equivalent of:
>> > "allow_signal(SIGKILL, <the id of the cifsd process>"
>> > In otherwords, to minimize the risk of some userspace process killing
>> > cifsd, could we delay enabling allow_signal(SIGKILL) till the unmount
>> > begins by doing it for a different process (have the unmount process
>> > enable signals for the cifsd process). Otherwise is there a way to
>> > force kill a process from the kernel as we used to do - without
>> > running the risk of a user space process killing cifsd (which is bad).
>> >
>> > --
>> > Thanks,
>> >
>> > Steve
>>
>>
>>
>> --
>> Thanks,
>>
>> Steve

2019-07-24 04:47:48

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.2 039/249] signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of force_sig

Steve French <[email protected]> writes:

> Very easy to see what caused the regression with this global change:
>
> mount (which launches "cifsd" thread to read the socket)
> umount (which kills the "cifsd" thread)
> rmmod (rmmod now fails since "cifsd" thread is still active)
>
> mount launches a thread to read from the socket ("cifsd")
> umount is supposed to kill that thread (but with the patch
> "signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of
> force_sig" that no longer works). So the regression is that after
> unmount you still see the "cifsd" thread, and the reason that cifsd
> thread is still around is that that patch no longer force kills the
> process (see line 2652 of fs/cifs/connect.c) which regresses module
> removal.
>
> - force_sig(SIGKILL, task);
> + send_sig(SIGKILL, task, 1);
>
> The comment in the changeset indicates "The signal SIGKILL can not be
> ignored" but obviously it can be ignored - at least on 5.3-rc1 it is
> being ignored.
>
> If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task)
> is removed and no longer possible - how do we kill a helper process
> ...

I think I see what is happening. It looks like as well as misuinsg
force_sig, cifs is also violating the invariant that keeps SIGKILL out
of the blocked signal set.

For that force_sig will act differently. I did not consider it because
that is never supposed to happen.

Can someone test this code below and confirm the issue goes away?

diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 5d6d44bfe10a..2a782ebc7b65 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -347,6 +347,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
*/

sigfillset(&mask);
+ sigdelset(&mask, SIGKILL);
sigprocmask(SIG_BLOCK, &mask, &oldmask);

/* Generate a rfc1002 marker for SMB2+ */


Eric