2019-01-18 09:30:35

by Li RongQing

Subject: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open

A race window remains after commit b027e2298bd588
("tty: fix data race between tty_init_dev and flush of buf").
We hit a crash when a receive_buf call arrives before tty
initialization completes in n_tty_open, at which point
tty->driver_data may still be NULL.

    CPU0                          CPU1
    ----                          ----
  n_tty_open
    tty_init_dev
      tty_ldisc_unlock
        schedule
                                  flush_to_ldisc
                                    receive_buf
                                      tty_port_default_receive_buf
                                        tty_ldisc_receive_buf
                                          n_tty_receive_buf_common
                                            __receive_buf
                                              uart_flush_chars
                                                uart_start
                                                /*tty->driver_data is NULL*/
    tty->ops->open
    /*init tty->driver_data*/

This could be fixed by holding the ldisc semaphore in tty_init_dev
until driver_data is fully initialized after tty->ops->open(), but
that would take the lock in one function and release it in another,
which is hard to maintain. So fix this race simply by checking
tty->driver_data on receive and returning if tty->driver_data
is NULL.

Signed-off-by: Wang Li <[email protected]>
Signed-off-by: Zhang Yu <[email protected]>
Signed-off-by: Li RongQing <[email protected]>
---
V4: add version information
V3: do not use the ldisc semaphore; only check tty->driver_data for NULL
V2: fix build error by exporting tty_ldisc_unlock with EXPORT_SYMBOL
V1: extend the ldisc lock until tty->driver_data is initialized

drivers/tty/tty_port.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index 044c3cbdcfa4..86d0bec38322 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port *port,
 	if (!tty)
 		return 0;
 
+	if (!tty->driver_data)
+		return 0;
+
 	disc = tty_ldisc_ref(tty);
 	if (!disc)
 		return 0;
--
2.16.2
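
The effect of the proposed guard can be sketched in plain userspace C. This is only a single-threaded model of the ordering problem described in the patch, not kernel code: `fake_tty`, `fake_state`, `fake_open` and `fake_receive_buf` are hypothetical names invented for illustration, and the real race involves two CPUs rather than two sequential calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct uart_state and struct tty_struct. */
struct fake_state {
	int tx_started;
};

struct fake_tty {
	void *driver_data;	/* NULL until the driver's open() has run */
};

/* Models tty->ops->open(): the point where driver_data becomes valid. */
static void fake_open(struct fake_tty *tty, struct fake_state *state)
{
	assert(state != NULL);
	tty->driver_data = state;
}

/*
 * Models the receive path with the guard from the patch.
 * Returns the number of bytes consumed.
 */
static int fake_receive_buf(struct fake_tty *tty, size_t count)
{
	struct fake_state *state;

	if (!tty)
		return 0;

	if (!tty->driver_data)	/* open() not finished: consume nothing */
		return 0;

	state = tty->driver_data;
	state->tx_started = 1;	/* would dereference NULL without the guard */
	return (int)count;
}
```

Calling `fake_receive_buf()` before `fake_open()` consumes nothing and touches no driver state; after `fake_open()` the same call proceeds normally.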



2019-01-18 12:53:31

by Gaurav Kohli

Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open



On 1/18/2019 2:57 PM, Li RongQing wrote:
> There still is a race window after the commit b027e2298bd588
> ("tty: fix data race between tty_init_dev and flush of buf"),
> and we encountered this crash issue if receive_buf call comes
> before tty initialization completes in n_tty_open and
> tty->driver_data may be NULL.
>
> CPU0 CPU1
> ---- ----
> n_tty_open
> tty_init_dev
> tty_ldisc_unlock
> schedule
> flush_to_ldisc
> receive_buf
> tty_port_default_receive_buf
> tty_ldisc_receive_buf
> n_tty_receive_buf_common
> __receive_buf
> uart_flush_chars
> uart_start
> /*tty->driver_data is NULL*/
> tty->ops->open
> /*init tty->driver_data*/
>
> it can be fixed by extending ldisc semaphore lock in tty_init_dev
> to driver_data initialized completely after tty->ops->open(), but
> this will lead to put lock on one function and unlock in some other
> function, and hard to maintain, so fix this race only by checking
> tty->driver_data when receiving, and return if tty->driver_data
> is NULL
>
> Signed-off-by: Wang Li <[email protected]>
> Signed-off-by: Zhang Yu <[email protected]>
> Signed-off-by: Li RongQing <[email protected]>
> ---
> V4: add version information
> V3: not used ldisc semaphore lock, only checking tty->driver_data with NULL
> V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> V1: extend ldisc lock to protect that tty->driver_data is inited
>
> drivers/tty/tty_port.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
> index 044c3cbdcfa4..86d0bec38322 100644
> --- a/drivers/tty/tty_port.c
> +++ b/drivers/tty/tty_port.c
> @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port *port,
> if (!tty)
> return 0;
>
> + if (!tty->driver_data)
> + return 0;
> +
> disc = tty_ldisc_ref(tty);
> if (!disc)
> return 0;
>
Acked-by: Gaurav Kohli <[email protected]>

It looks good to me compared with the previous approach, but let's let
the maintainer decide.

Regards
Gaurav
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2019-01-30 10:16:59

by Li RongQing

Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open



> -----Original Message-----
> From: Kohli, Gaurav [mailto:[email protected]]
> Sent: January 18, 2019 20:51
> To: Li,Rongqing <[email protected]>; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
>
>
>
> On 1/18/2019 2:57 PM, Li RongQing wrote:
> > There still is a race window after the commit b027e2298bd588
> > ("tty: fix data race between tty_init_dev and flush of buf"), and we
> > encountered this crash issue if receive_buf call comes before tty
> > initialization completes in n_tty_open and
> > tty->driver_data may be NULL.
> >
> > CPU0 CPU1
> > ---- ----
> > n_tty_open
> > tty_init_dev
> > tty_ldisc_unlock
> > schedule flush_to_ldisc
> > receive_buf
> > tty_port_default_receive_buf
> > tty_ldisc_receive_buf
> > n_tty_receive_buf_common
> > __receive_buf
> > uart_flush_chars
> > uart_start
> > /*tty->driver_data is NULL*/
> > tty->ops->open
> > /*init tty->driver_data*/
> >
> > it can be fixed by extending ldisc semaphore lock in tty_init_dev to
> > driver_data initialized completely after tty->ops->open(), but this
> > will lead to put lock on one function and unlock in some other
> > function, and hard to maintain, so fix this race only by checking
> > tty->driver_data when receiving, and return if tty->driver_data
> > is NULL
> >
> > Signed-off-by: Wang Li <[email protected]>
> > Signed-off-by: Zhang Yu <[email protected]>
> > Signed-off-by: Li RongQing <[email protected]>
> > ---
> > V4: add version information
> > V3: not used ldisc semaphore lock, only checking tty->driver_data with
> > NULL
> > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > V1: extend ldisc lock to protect that tty->driver_data is inited
> >
> > drivers/tty/tty_port.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c index
> > 044c3cbdcfa4..86d0bec38322 100644
> > --- a/drivers/tty/tty_port.c
> > +++ b/drivers/tty/tty_port.c
> > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port
> *port,
> > if (!tty)
> > return 0;
> >
> > + if (!tty->driver_data)
> > + return 0;
> > +
> > disc = tty_ldisc_ref(tty);
> > if (!disc)
> > return 0;
> >
> Acked-by: Gaurav Kohli <[email protected]>
>
> It looks good to me w.r.t previous approach, but Let's Maintainer decide once.
>

Thanks for your review. This one is simple and safe, and it is already in use
as a live patch on our online systems.

-RongQing


> Regards
> Gaurav
> --
> Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
> is a member of the Code Aurora Forum, a Linux Foundation Collaborative
> Project.

2019-01-30 10:20:44

by Greg Kroah-Hartman

Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open

On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> There still is a race window after the commit b027e2298bd588
> ("tty: fix data race between tty_init_dev and flush of buf"),
> and we encountered this crash issue if receive_buf call comes
> before tty initialization completes in n_tty_open and
> tty->driver_data may be NULL.
>
> CPU0 CPU1
> ---- ----
> n_tty_open
> tty_init_dev
> tty_ldisc_unlock
> schedule
> flush_to_ldisc
> receive_buf
> tty_port_default_receive_buf
> tty_ldisc_receive_buf
> n_tty_receive_buf_common
> __receive_buf
> uart_flush_chars
> uart_start
> /*tty->driver_data is NULL*/
> tty->ops->open
> /*init tty->driver_data*/
>
> it can be fixed by extending ldisc semaphore lock in tty_init_dev
> to driver_data initialized completely after tty->ops->open(), but
> this will lead to put lock on one function and unlock in some other
> function, and hard to maintain, so fix this race only by checking
> tty->driver_data when receiving, and return if tty->driver_data
> is NULL
>
> Signed-off-by: Wang Li <[email protected]>
> Signed-off-by: Zhang Yu <[email protected]>
> Signed-off-by: Li RongQing <[email protected]>
> ---
> V4: add version information
> V3: not used ldisc semaphore lock, only checking tty->driver_data with NULL
> V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> V1: extend ldisc lock to protect that tty->driver_data is inited
>
> drivers/tty/tty_port.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
> index 044c3cbdcfa4..86d0bec38322 100644
> --- a/drivers/tty/tty_port.c
> +++ b/drivers/tty/tty_port.c
> @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port *port,
> if (!tty)
> return 0;
>
> + if (!tty->driver_data)
> + return 0;
> +

How is this working? What is setting driver_data to NULL to "stop" this
race?

There's no requirement that a tty driver set this field to NULL when it
is "done" with the tty device, so I think you are just getting lucky in
that your specific driver happens to be doing this.

What driver are you testing this against?

thanks,

greg k-h

2019-01-30 12:51:00

by Li RongQing

Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open



> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Greg KH
> Sent: January 30, 2019 18:19
> To: Li,Rongqing <[email protected]>
> Cc: [email protected]; [email protected]; [email protected]
> Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
>
> On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > There still is a race window after the commit b027e2298bd588
> > ("tty: fix data race between tty_init_dev and flush of buf"), and we
> > encountered this crash issue if receive_buf call comes before tty
> > initialization completes in n_tty_open and
> > tty->driver_data may be NULL.
> >
> > CPU0 CPU1
> > ---- ----
> > n_tty_open
> > tty_init_dev
> > tty_ldisc_unlock
> > schedule flush_to_ldisc
> > receive_buf
> > tty_port_default_receive_buf
> > tty_ldisc_receive_buf
> > n_tty_receive_buf_common
> > __receive_buf
> > uart_flush_chars
> > uart_start
> > /*tty->driver_data is NULL*/
> > tty->ops->open
> > /*init tty->driver_data*/
> >
> > it can be fixed by extending ldisc semaphore lock in tty_init_dev to
> > driver_data initialized completely after tty->ops->open(), but this
> > will lead to put lock on one function and unlock in some other
> > function, and hard to maintain, so fix this race only by checking
> > tty->driver_data when receiving, and return if tty->driver_data
> > is NULL
> >
> > Signed-off-by: Wang Li <[email protected]>
> > Signed-off-by: Zhang Yu <[email protected]>
> > Signed-off-by: Li RongQing <[email protected]>
> > ---
> > V4: add version information
> > V3: not used ldisc semaphore lock, only checking tty->driver_data with
> > NULL
> > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > V1: extend ldisc lock to protect that tty->driver_data is inited
> >
> > drivers/tty/tty_port.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c index
> > 044c3cbdcfa4..86d0bec38322 100644
> > --- a/drivers/tty/tty_port.c
> > +++ b/drivers/tty/tty_port.c
> > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port
> *port,
> > if (!tty)
> > return 0;
> >
> > + if (!tty->driver_data)
> > + return 0;
> > +
>
> How is this working? What is setting driver_data to NULL to "stop" this race?
>


If tty->driver_data is NULL, tty_port_default_receive_buf returns early and never
reaches uart_start, which dereferences tty->driver_data and panics when it runs
before tty_open has finished. So the check fixes the system panic.

> There's no requirement that a tty driver set this field to NULL when it is "done"
> with the tty device, so I think you are just getting lucky in that your specific
> driver happens to be doing this.
>

While tty_open is running, the tty is allocated with kzalloc in tty_init_dev, which is
called by tty_open_by_driver, so the tty is zero-initialized.

> What driver are you testing this against?
>

8250

Thanks

-RongQing

> thanks,
>
> greg k-h
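
The zero-initialization argument in the reply above can be illustrated with a small userspace analogy, assuming `calloc()` as a stand-in for `kzalloc()`. The struct and function names (`fake_tty`, `fake_alloc_tty`, `fake_open`) are hypothetical and carry only the one field that matters here. It relies on an all-zero bit pattern reading back as a NULL pointer, which holds on the platforms Linux supports.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much reduced stand-in for struct tty_struct. */
struct fake_tty {
	int index;
	void *driver_data;
};

/* calloc() plays the role of kzalloc(): the memory comes back zero-filled. */
static struct fake_tty *fake_alloc_tty(void)
{
	return calloc(1, sizeof(struct fake_tty));
}

/* Models the driver's open(), the only place that sets driver_data. */
static void fake_open(struct fake_tty *tty, void *state)
{
	assert(state != NULL);
	tty->driver_data = state;
}
```

Between `fake_alloc_tty()` and `fake_open()`, `driver_data` reads as NULL without any driver having stored NULL explicitly. This says nothing about the field's value once a driver is done with the tty.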

2019-01-30 13:18:56

by Greg Kroah-Hartman

Subject: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open

On Wed, Jan 30, 2019 at 12:48:42PM +0000, Li,Rongqing wrote:
>
>
> > -----Original Message-----
> > From: [email protected]
> > [mailto:[email protected]] On Behalf Of Greg KH
> > Sent: January 30, 2019 18:19
> > To: Li,Rongqing <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected]
> > Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
> >
> > On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > > There still is a race window after the commit b027e2298bd588
> > > ("tty: fix data race between tty_init_dev and flush of buf"), and we
> > > encountered this crash issue if receive_buf call comes before tty
> > > initialization completes in n_tty_open and
> > > tty->driver_data may be NULL.
> > >
> > > CPU0 CPU1
> > > ---- ----
> > > n_tty_open
> > > tty_init_dev
> > > tty_ldisc_unlock
> > > schedule flush_to_ldisc
> > > receive_buf
> > > tty_port_default_receive_buf
> > > tty_ldisc_receive_buf
> > > n_tty_receive_buf_common
> > > __receive_buf
> > > uart_flush_chars
> > > uart_start
> > > /*tty->driver_data is NULL*/
> > > tty->ops->open
> > > /*init tty->driver_data*/
> > >
> > > it can be fixed by extending ldisc semaphore lock in tty_init_dev to
> > > driver_data initialized completely after tty->ops->open(), but this
> > > will lead to put lock on one function and unlock in some other
> > > function, and hard to maintain, so fix this race only by checking
> > > tty->driver_data when receiving, and return if tty->driver_data
> > > is NULL
> > >
> > > Signed-off-by: Wang Li <[email protected]>
> > > Signed-off-by: Zhang Yu <[email protected]>
> > > Signed-off-by: Li RongQing <[email protected]>
> > > ---
> > > V4: add version information
> > > V3: not used ldisc semaphore lock, only checking tty->driver_data with
> > > NULL
> > > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > > V1: extend ldisc lock to protect that tty->driver_data is inited
> > >
> > > drivers/tty/tty_port.c | 3 +++
> > > 1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c index
> > > 044c3cbdcfa4..86d0bec38322 100644
> > > --- a/drivers/tty/tty_port.c
> > > +++ b/drivers/tty/tty_port.c
> > > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct tty_port
> > *port,
> > > if (!tty)
> > > return 0;
> > >
> > > + if (!tty->driver_data)
> > > + return 0;
> > > +
> >
> > How is this working? What is setting driver_data to NULL to "stop" this race?
> >
>
>
> if tty->driver_data is NULL and return, tty_port_default_receive_buf will not step to
> uart_start which access tty->driver_data and trigger panic before tty_open, so it can
> fix the system panic
>
> > There's no requirement that a tty driver set this field to NULL when it is "done"
> > with the tty device, so I think you are just getting lucky in that your specific
> > driver happens to be doing this.
> >
>
> when tty_open is running, tty is allocated by kzalloc in tty_init_dev which called
> by tty_open_by_driver, tty is inited to 0
>
> > What driver are you testing this against?
> >
>
> 8250

Ok, as this is specific to the uart core, how about this patch instead:

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 5c01bb6d1c24..b56a6250df3f 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -130,6 +130,9 @@ static void uart_start(struct tty_struct *tty)
 	struct uart_port *port;
 	unsigned long flags;
 
+	if (!state)
+		return;
+
 	port = uart_port_lock(state, flags);
 	__uart_start(tty);
 	uart_port_unlock(port, flags);
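
The idea behind this alternative, checking inside the driver callback rather than in the tty core, can be sketched in userspace C as follows. This is a simplified model under stated assumptions: `fake_tty`, `fake_uart_state` and `fake_uart_start` are hypothetical names, and the port locking of the real `uart_start()` is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct uart_state and struct tty_struct. */
struct fake_uart_state {
	int start_calls;
};

struct fake_tty {
	void *driver_data;
};

/*
 * Models uart_start() with the check from the patch above: the driver,
 * which knows what it stores in driver_data, bails out while it is unset.
 */
static void fake_uart_start(struct fake_tty *tty)
{
	struct fake_uart_state *state;

	assert(tty != NULL);
	state = tty->driver_data;

	if (!state)
		return;

	state->start_calls++;
}
```

With `driver_data` still NULL the call is a no-op; once the driver has stored its state there, the call reaches the transmit path.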

2019-01-31 02:17:33

by Li RongQing

Subject: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open



> -----Original Message-----
> From: Greg KH [mailto:[email protected]]
> Sent: January 30, 2019 21:17
> To: Li,Rongqing <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
>
> On Wed, Jan 30, 2019 at 12:48:42PM +0000, Li,Rongqing wrote:
> >
> >
> > > -----Original Message-----
> > > From: [email protected]
> > > [mailto:[email protected]] On Behalf Of Greg KH
> > > Sent: January 30, 2019 18:19
> > > To: Li,Rongqing <[email protected]>
> > > Cc: [email protected]; [email protected];
> > > [email protected]
> > > Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and
> > > tty_open
> > >
> > > On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > > > There still is a race window after the commit b027e2298bd588
> > > > ("tty: fix data race between tty_init_dev and flush of buf"), and
> > > > we encountered this crash issue if receive_buf call comes before
> > > > tty initialization completes in n_tty_open and
> > > > tty->driver_data may be NULL.
> > > >
> > > > CPU0 CPU1
> > > > ---- ----
> > > > n_tty_open
> > > > tty_init_dev
> > > > tty_ldisc_unlock
> > > > schedule flush_to_ldisc
> > > > receive_buf
> > > > tty_port_default_receive_buf
> > > > tty_ldisc_receive_buf
> > > > n_tty_receive_buf_common
> > > > __receive_buf
> > > > uart_flush_chars
> > > > uart_start
> > > > /*tty->driver_data is NULL*/
> > > > tty->ops->open
> > > > /*init tty->driver_data*/
> > > >
> > > > it can be fixed by extending ldisc semaphore lock in tty_init_dev
> > > > to driver_data initialized completely after tty->ops->open(), but
> > > > this will lead to put lock on one function and unlock in some
> > > > other function, and hard to maintain, so fix this race only by
> > > > checking
> > > > tty->driver_data when receiving, and return if tty->driver_data
> > > > is NULL
> > > >
> > > > Signed-off-by: Wang Li <[email protected]>
> > > > Signed-off-by: Zhang Yu <[email protected]>
> > > > Signed-off-by: Li RongQing <[email protected]>
> > > > ---
> > > > V4: add version information
> > > > V3: not used ldisc semaphore lock, only checking tty->driver_data
> > > > with NULL
> > > > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > > > V1: extend ldisc lock to protect that tty->driver_data is inited
> > > >
> > > > drivers/tty/tty_port.c | 3 +++
> > > > 1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c index
> > > > 044c3cbdcfa4..86d0bec38322 100644
> > > > --- a/drivers/tty/tty_port.c
> > > > +++ b/drivers/tty/tty_port.c
> > > > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct
> > > > tty_port
> > > *port,
> > > > if (!tty)
> > > > return 0;
> > > >
> > > > + if (!tty->driver_data)
> > > > + return 0;
> > > > +
> > >
> > > How is this working? What is setting driver_data to NULL to "stop" this
> race?
> > >
> >
> >
> > if tty->driver_data is NULL and return, tty_port_default_receive_buf
> > will not step to uart_start which access tty->driver_data and trigger
> > panic before tty_open, so it can fix the system panic
> >
> > > There's no requirement that a tty driver set this field to NULL when it is
> "done"
> > > with the tty device, so I think you are just getting lucky in that
> > > your specific driver happens to be doing this.
> > >
> >
> > when tty_open is running, tty is allocated by kzalloc in tty_init_dev
> > which called by tty_open_by_driver, tty is inited to 0
> >
> > > What driver are you testing this against?
> > >
> >
> > 8250
>
> Ok, as this is specific to the uart core, how about this patch instead:
>
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 5c01bb6d1c24..b56a6250df3f 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -130,6 +130,9 @@ static void uart_start(struct tty_struct *tty)
> struct uart_port *port;
> unsigned long flags;
>
> + if (!state)
> + return;
> +
> port = uart_port_lock(state, flags);
> __uart_start(tty);
> uart_port_unlock(port, flags);


If the check moves into uart_start, I am afraid it may not fully fix this issue,
since n_tty_receive_buf_common may call n_tty_check_throttle/
tty_unthrottle_safe, which may also use tty->driver_data.

If the tty is not fully opened, I see no gain in stepping into more functions.

thanks

-RongQing

2019-01-31 06:53:13

by Greg Kroah-Hartman

Subject: Re: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open

On Thu, Jan 31, 2019 at 02:15:35AM +0000, Li,Rongqing wrote:
>
>
> > -----Original Message-----
> > From: Greg KH [mailto:[email protected]]
> > Sent: January 30, 2019 21:17
> > To: Li,Rongqing <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected];
> > [email protected]
> > Subject: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open
> >
> > On Wed, Jan 30, 2019 at 12:48:42PM +0000, Li,Rongqing wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: [email protected]
> > > > [mailto:[email protected]] On Behalf Of Greg KH
> > > > Sent: January 30, 2019 18:19
> > > > To: Li,Rongqing <[email protected]>
> > > > Cc: [email protected]; [email protected];
> > > > [email protected]
> > > > Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and
> > > > tty_open
> > > >
> > > > On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > > > > There still is a race window after the commit b027e2298bd588
> > > > > ("tty: fix data race between tty_init_dev and flush of buf"), and
> > > > > we encountered this crash issue if receive_buf call comes before
> > > > > tty initialization completes in n_tty_open and
> > > > > tty->driver_data may be NULL.
> > > > >
> > > > > CPU0 CPU1
> > > > > ---- ----
> > > > > n_tty_open
> > > > > tty_init_dev
> > > > > tty_ldisc_unlock
> > > > > schedule flush_to_ldisc
> > > > > receive_buf
> > > > > tty_port_default_receive_buf
> > > > > tty_ldisc_receive_buf
> > > > > n_tty_receive_buf_common
> > > > > __receive_buf
> > > > > uart_flush_chars
> > > > > uart_start
> > > > > /*tty->driver_data is NULL*/
> > > > > tty->ops->open
> > > > > /*init tty->driver_data*/
> > > > >
> > > > > it can be fixed by extending ldisc semaphore lock in tty_init_dev
> > > > > to driver_data initialized completely after tty->ops->open(), but
> > > > > this will lead to put lock on one function and unlock in some
> > > > > other function, and hard to maintain, so fix this race only by
> > > > > checking
> > > > > tty->driver_data when receiving, and return if tty->driver_data
> > > > > is NULL
> > > > >
> > > > > Signed-off-by: Wang Li <[email protected]>
> > > > > Signed-off-by: Zhang Yu <[email protected]>
> > > > > Signed-off-by: Li RongQing <[email protected]>
> > > > > ---
> > > > > V4: add version information
> > > > > V3: not used ldisc semaphore lock, only checking tty->driver_data
> > > > > with NULL
> > > > > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > > > > V1: extend ldisc lock to protect that tty->driver_data is inited
> > > > >
> > > > > drivers/tty/tty_port.c | 3 +++
> > > > > 1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c index
> > > > > 044c3cbdcfa4..86d0bec38322 100644
> > > > > --- a/drivers/tty/tty_port.c
> > > > > +++ b/drivers/tty/tty_port.c
> > > > > @@ -31,6 +31,9 @@ static int tty_port_default_receive_buf(struct
> > > > > tty_port
> > > > *port,
> > > > > if (!tty)
> > > > > return 0;
> > > > >
> > > > > + if (!tty->driver_data)
> > > > > + return 0;
> > > > > +
> > > >
> > > > How is this working? What is setting driver_data to NULL to "stop" this
> > race?
> > > >
> > >
> > >
> > > if tty->driver_data is NULL and return, tty_port_default_receive_buf
> > > will not step to uart_start which access tty->driver_data and trigger
> > > panic before tty_open, so it can fix the system panic
> > >
> > > > There's no requirement that a tty driver set this field to NULL when it is
> > "done"
> > > > with the tty device, so I think you are just getting lucky in that
> > > > your specific driver happens to be doing this.
> > > >
> > >
> > > when tty_open is running, tty is allocated by kzalloc in tty_init_dev
> > > which called by tty_open_by_driver, tty is inited to 0
> > >
> > > > What driver are you testing this against?
> > > >
> > >
> > > 8250
> >
> > Ok, as this is specific to the uart core, how about this patch instead:
> >
> > diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> > index 5c01bb6d1c24..b56a6250df3f 100644
> > --- a/drivers/tty/serial/serial_core.c
> > +++ b/drivers/tty/serial/serial_core.c
> > @@ -130,6 +130,9 @@ static void uart_start(struct tty_struct *tty)
> > struct uart_port *port;
> > unsigned long flags;
> >
> > + if (!state)
> > + return;
> > +
> > port = uart_port_lock(state, flags);
> > __uart_start(tty);
> > uart_port_unlock(port, flags);
>
>
> If move the check into uart_start, i am afraid that it maybe not fully fix this issue,
> Since n_tty_receive_buf_common maybe call n_tty_check_throttle/
> tty_unthrottle_safe which maybe use the tty->driver_data
>
> if tty is not fully opened, I think no gain to step into more function

But as I said, the tty core has no knowledge of the "driver_data"
field. It does not know if a driver really is even using that field, so
it means nothing to the tty core, so it can not check it. Your specific
tty driver does happen to use it, so it can check it.

If you also need to check this in unthrottle, how about this patch too?
Does the combination of these two patches solve the problem for your
systems?

thanks,

greg k-h


diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 5c01bb6d1c24..e33d4c181123 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -727,6 +727,9 @@ static void uart_unthrottle(struct tty_struct *tty)
 	upstat_t mask = UPSTAT_SYNC_FIFO;
 	struct uart_port *port;
 
+	if (!state)
+		return;
+
 	port = uart_port_ref(state);
 	if (!port)
 		return;
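
The layering argument in this message can be sketched with a toy model in userspace C. Everything here is hypothetical and invented for illustration (`fake_tty`, `fake_tty_ops`, `core_unthrottle`, and the two drivers "A" and "B"). The core dispatches through an ops table and never inspects `driver_data`; driver A stores state there and guards against NULL, while driver B never uses the field at all.

```c
#include <assert.h>
#include <stddef.h>

struct fake_tty;

/* Hypothetical, reduced analogue of struct tty_operations. */
struct fake_tty_ops {
	void (*unthrottle)(struct fake_tty *tty);
};

struct fake_tty {
	const struct fake_tty_ops *ops;
	void *driver_data;	/* opaque to the core, owned by the driver */
	int unthrottled;	/* used only by driver B below */
};

/* "Core": dispatches to the driver without looking at driver_data. */
static void core_unthrottle(struct fake_tty *tty)
{
	assert(tty != NULL && tty->ops != NULL);
	if (tty->ops->unthrottle)
		tty->ops->unthrottle(tty);
}

/* Driver A keeps its state in driver_data, so it does the NULL check. */
struct a_state {
	int unthrottle_calls;
};

static void a_unthrottle(struct fake_tty *tty)
{
	struct a_state *state = tty->driver_data;

	if (!state)
		return;
	state->unthrottle_calls++;
}

/* Driver B never sets driver_data and still works correctly. */
static void b_unthrottle(struct fake_tty *tty)
{
	tty->unthrottled = 1;
}

static const struct fake_tty_ops a_ops = { .unthrottle = a_unthrottle };
static const struct fake_tty_ops b_ops = { .unthrottle = b_unthrottle };
```

In this model, a NULL check placed in `core_unthrottle()` would silently disable driver B, whose `driver_data` is always NULL, which is why the check sits in driver A's callback instead.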

2019-01-31 07:42:44

by Li RongQing

Subject: Re: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and tty_open



> -----Original Message-----
> From: Greg KH [mailto:[email protected]]
> Sent: January 31, 2019 14:52
> To: Li,Rongqing <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and
> tty_open
>
> On Thu, Jan 31, 2019 at 02:15:35AM +0000, Li,Rongqing wrote:
> >
> >
> > > -----Original Message-----
> > > From: Greg KH [mailto:[email protected]]
> > > Sent: January 30, 2019 21:17
> > > To: Li,Rongqing <[email protected]>
> > > Cc: [email protected]; [email protected];
> > > [email protected]; [email protected]
> > > Subject: Re: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and
> > > tty_open
> > >
> > > On Wed, Jan 30, 2019 at 12:48:42PM +0000, Li,Rongqing wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: [email protected]
> > > > > [mailto:[email protected]] On Behalf Of Greg KH
> > > > > Sent: January 30, 2019 18:19
> > > > > To: Li,Rongqing <[email protected]>
> > > > > Cc: [email protected]; [email protected];
> > > > > [email protected]
> > > > > Subject: Re: [PATCH][v4] tty: fix race between flush_to_ldisc and
> > > > > tty_open
> > > > >
> > > > > On Fri, Jan 18, 2019 at 05:27:17PM +0800, Li RongQing wrote:
> > > > > > There still is a race window after the commit b027e2298bd588
> > > > > > ("tty: fix data race between tty_init_dev and flush of buf"),
> > > > > > and we encountered this crash issue if receive_buf call comes
> > > > > > before tty initialization completes in n_tty_open and
> > > > > > tty->driver_data may be NULL.
> > > > > >
> > > > > > CPU0 CPU1
> > > > > > ---- ----
> > > > > > n_tty_open
> > > > > > tty_init_dev
> > > > > > tty_ldisc_unlock
> > > > > > > schedule flush_to_ldisc
> > > > > > receive_buf
> > > > > > tty_port_default_receive_buf
> > > > > > tty_ldisc_receive_buf
> > > > > > n_tty_receive_buf_common
> > > > > > __receive_buf
> > > > > > uart_flush_chars
> > > > > > uart_start
> > > > > > /*tty->driver_data is NULL*/
> > > > > > tty->ops->open
> > > > > > /*init tty->driver_data*/
> > > > > >
> > > > > > it can be fixed by extending ldisc semaphore lock in
> > > > > > tty_init_dev to driver_data initialized completely after
> > > > > > tty->ops->open(), but this will lead to put lock on one
> > > > > > function and unlock in some other function, and hard to
> > > > > > maintain, so fix this race only by checking
> > > > > > > tty->driver_data when receiving, and return if
> > > > > > > tty->driver_data is NULL
> > > > > >
> > > > > > Signed-off-by: Wang Li <[email protected]>
> > > > > > Signed-off-by: Zhang Yu <[email protected]>
> > > > > > Signed-off-by: Li RongQing <[email protected]>
> > > > > > ---
> > > > > > V4: add version information
> > > > > > V3: not used ldisc semaphore lock, only checking
> > > > > > tty->driver_data with NULL
> > > > > > V2: fix building error by EXPORT_SYMBOL tty_ldisc_unlock
> > > > > > V1: extend ldisc lock to protect that tty->driver_data is
> > > > > > inited
> > > > > >
> > > > > > drivers/tty/tty_port.c | 3 +++
> > > > > > 1 file changed, 3 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
> > > > > > index
> > > > > > 044c3cbdcfa4..86d0bec38322 100644
> > > > > > --- a/drivers/tty/tty_port.c
> > > > > > +++ b/drivers/tty/tty_port.c
> > > > > > @@ -31,6 +31,9 @@ static int
> > > > > > tty_port_default_receive_buf(struct
> > > > > > tty_port
> > > > > *port,
> > > > > > if (!tty)
> > > > > > return 0;
> > > > > >
> > > > > > + if (!tty->driver_data)
> > > > > > + return 0;
> > > > > > +
> > > > >
> > > > > How is this working? What is setting driver_data to NULL to
> > > > > "stop" this
> > > race?
> > > > >
> > > >
> > > >
> > > > if tty->driver_data is NULL and return,
> > > > tty_port_default_receive_buf will not step to uart_start which
> > > > access tty->driver_data and trigger panic before tty_open, so it
> > > > can fix the system panic
> > > >
> > > > > There's no requirement that a tty driver set this field to NULL
> > > > > when it is
> > > "done"
> > > > > with the tty device, so I think you are just getting lucky in
> > > > > that your specific driver happens to be doing this.
> > > > >
> > > >
> > > > when tty_open is running, tty is allocated by kzalloc in
> > > > tty_init_dev which called by tty_open_by_driver, tty is inited to
> > > > 0
> > > >
> > > > > What driver are you testing this against?
> > > > >
> > > >
> > > > 8250
> > >
> > > Ok, as this is specific to the uart core, how about this patch instead:
> > >
> > > diff --git a/drivers/tty/serial/serial_core.c
> > > b/drivers/tty/serial/serial_core.c
> > > index 5c01bb6d1c24..b56a6250df3f 100644
> > > --- a/drivers/tty/serial/serial_core.c
> > > +++ b/drivers/tty/serial/serial_core.c
> > > @@ -130,6 +130,9 @@ static void uart_start(struct tty_struct *tty)
> > > struct uart_port *port;
> > > unsigned long flags;
> > >
> > > + if (!state)
> > > + return;
> > > +
> > > port = uart_port_lock(state, flags);
> > > __uart_start(tty);
> > > uart_port_unlock(port, flags);
> >
> >
> > If move the check into uart_start, i am afraid that it maybe not fully
> > fix this issue, Since n_tty_receive_buf_common maybe call
> > n_tty_check_throttle/ tty_unthrottle_safe which maybe use the
> > tty->driver_data
> >
> > if tty is not fully opened, I think no gain to step into more function
>
> But as I said, the tty core has no knowledge of the "driver_data", field. It
> does not know if a driver really is even using that field, so it means nothing to
> the tty core, so it can not check it. Your specific tty driver does happen to use
> it, so it can check it.
>
> If you also need to check this in unthrottle, how about this patch too?
> Does the combination of these two patches solve the problem for your
> systems?
>
> thanks,
>
> greg k-h
>

Thanks for your explanation, I see now.
Your suggestion should work; I will send V5 based on it.

-RongQing

>
> diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> index 5c01bb6d1c24..e33d4c181123 100644
> --- a/drivers/tty/serial/serial_core.c
> +++ b/drivers/tty/serial/serial_core.c
> @@ -727,6 +727,9 @@ static void uart_unthrottle(struct tty_struct *tty)
> upstat_t mask = UPSTAT_SYNC_FIFO;
> struct uart_port *port;
>
> + if (!state)
> + return;
> +
> port = uart_port_ref(state);
> if (!port)
> return;