2018-01-03 13:49:04

by Gaurav Kohli

[permalink] [raw]
Subject: [PATCH] tty: fix data race in n_tty_receive_buf_common

There can be a race, if receive_buf call comes before
tty initialization completes in n_tty_open and tty->disc_data
may be NULL.

CPU0 cpu1
---- ----
000|n_tty_receive_buf_common() n_tty_open()
-001|n_tty_receive_buf2() tty_ldisc_open.isra.3()
-002|tty_ldisc_receive_buf(inline) tty_ldisc_setup()

If tty->disc_data is NULL, then return from flush_to_ldisc

Signed-off-by: Gaurav Kohli <[email protected]>

diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index 25d7368..5d49183 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -35,6 +35,9 @@ static int tty_port_default_receive_buf(struct tty_port *port,
if (!disc)
return 0;

+ if (!tty->disc_data)
+ return 0;
+
ret = tty_ldisc_receive_buf(disc, p, (char *)f, count);

tty_ldisc_deref(disc);
--
1.9.1


2018-01-03 19:38:31

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

On Wed, 3 Jan 2018 19:18:52 +0530
Gaurav Kohli <[email protected]> wrote:

> There can be a race, if receive_buf call comes before
> tty initialization completes in n_tty_open and tty->disc_data
> may be NULL.


This makes no sense. If the race exists then the check you do isn't good
enough because the ldsic dsta isn't valid even after the initial
assignment of tty->disc_data.

More to the point no ldisc receive method should ever be getting called
during the ldisc open. Likewise we must avoid hitting the window of the
old one closing (potentialyl stale disc_data from the old ldisc)

Any change to the ldisc is supposed to occur under tty->ldisc_sem and the
code does an ldisc_ref before invoking the ldisc method.

The only cases I can see where we set an ldisc are

1. tty_set_ldisc

This correctly takes an ldisc_ref so cannot run in parallel with
tty_port_default_receive_buf.

2. tty_init_dev when we set up a new tty

At that point the tty is not supposed to be receiving data and sure
enough we don't call tty->ops->open until it has finished the ldisc set
up, nor do we tty_init_dev a port that is already running.

So given we don't activate the port until tty->ops->open() calls
tty_port_open calls the port activation routine I don't see a bug.

3. tty_release()

Here we take the locks in tty_ldisc_release so that is ok

4. hangup

Again we take the ldisc lock



So unless your driver is stuffing bytes into the tty either before it's
been told too or after it's been told to shut up I don't see a bug. And
if you driver is doing either of those then it's broken. Even then
port->tty ought to be NULL.

What does a full (all CPU) trace of the bug look like and what tty driver
are you using when you capture the trace ?

Alan

2018-01-04 11:10:11

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

> > What does a full (all CPU) trace of the bug look like and what tty driver
> > are you using when you capture the trace ?

Which tty driver ? serial/msm_serial.c ?

> We are using tty for console logging,
>     |    tty = 0xFFFFFFFF477AC880 -> (
>     |      magic = 21505,
>     |      kref = (refcount = (counter = 2)),
>     |      dev = 0xFFFFFFFFEDE3DA80,
>     |      driver = 0xFFFFFFFFEDE2A480,
>     |      ops = 0xFFFFFF9F26F7D0D0,
>     |      index = 0,
>     |      ldisc_sem = (count = 1, wait_lock = (raw_lock = (owner = 0,
> next = 0), magic = 3735899821, own
>     |      termiox = 0x0,
>     |      name = "ttyMSM0",
>     |      pgrp = 0x0,

Ok no what I need to see is a trace of what each CPU is doing at the
point you detect the problem. That way we can see what the path that
races is.

> We have seen this issue on 4.9 and also one thing i have observed,
> before tty is getting reinit in tty_init_dev(),

When yo stop the DMA is it instantaneous or does it cause a final
interrupt after you return from stop_rx ?

To me it still looks like data is being queued after the port is told to
stop but that's not a certainty.

> there is console service exited before it in all the dumps.
>  35206.969644:   <2> init: Service 'console' (pid 7440) exited with
> status 130
>  35206.969690:   <2> init: Sending signal 9 to service 'console' (pid
> 7440) process group...
>  35206.970857:   <2> init: kill(7440, 9) failed: No such process.
>
> So how can we stop request of receive buff, if we already have tty_port
> and tty is getting reinitialized in midway like above
> case?

Is the port your console device. If you use a different port as a console
device does the problem go away - that could be a very important detail
as the hangup behaviour for the two is quite different.

Alan

2018-01-04 13:46:53

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common


> Which tty driver ? serial/msm_serial.c ?

We are using our internal driver, msm_geni_serial.c
>
> Ok no what I need to see is a trace of what each CPU is doing at the
> point you detect the problem. That way we can see what the path that
> races is.
Below is stack trace running by init in our case on one core
-006|n_tty_open(
    |    tty = 0xFFFFFFFF477AC880 -> (
    |      disc_data = 0xFFFFFF80197AD000,

    |      port = 0xFFFFFFFFEDE40000))
    |  ldata = 0xFFFFFF80197AD000

    |  trace_printk_fmt = 0xFFFFFF9F275125F8
-007|tty_ldisc_open.isra.3(
    |    tty = 0xFFFFFFFF477AC880)
-008|tty_ldisc_setup(

-009|tty_init_dev(
    |    driver = 0xFFFFFFFFEDE2A480,
    |    idx = 0)

-010|tty_open_by_driver(inline)
-010|tty_open(

Core 2:
-000|n_tty_receive_buf_common(
    |    tty = 0xFFFFFFFF477AC880,

    |  ?)
    |  ldata_=_0x0
    |  __func__ = (110, 95, 116, 116, 121, 95, 114, 101, 99, 101, 105,
118, 101, 95, 98, 117, 102, 95, 99, 111, 109, 109, 111, 110, 0)
    |  __u = (__val = 7079195495121566464, __c = (0))
    |  c = 127
    |  ldata = 0xFFFFFFFFF40DF97C

    |  c = 0
    |  ldata = 0xFFFFFF9F26F46000

-001|n_tty_receive_buf2(
    |    tty = 0xFFFFFFFF477AC880,

-002|tty_ldisc_receive_buf(inline)
-002|receive_buf(inline)
-002|flush_to_ldisc(

Please let me know in case some other trace required
>> We have seen this issue on 4.9 and also one thing i have observed,
>> before tty is getting reinit in tty_init_dev(),
> When yo stop the DMA is it instantaneous or does it cause a final
> interrupt after you return from stop_rx ?
>
> To me it still looks like data is being queued after the port is told to
> stop but that's not a certainty.
This geni is based on FIFO.
I have also put ftraces and from that we can see open is not able to
finish but there is request of flushing:
         kworker/-15514   2.... 35206.979226: workqueue_execute_start:
work struct 0xffffffffede40008: function flush_to_ldisc

         kworker/-15514   2.... 35206.979237: bprint:
n_tty_receive_buf_common: <Debug>n_tty_receive_buf_common
tty=0xffffffff477ac880 ldata=(nil)

                   init-1       4.... 35206.979751:
bprint:               n_tty_open: <Debug>n_tty_open
tty=0xffffffff477ac880 ldata=0xffffff80197ad000
>> there is console service exited before it in all the dumps.
>>  35206.969644:   <2> init: Service 'console' (pid 7440) exited with
>> status 130
>>  35206.969690:   <2> init: Sending signal 9 to service 'console' (pid
>> 7440) process group...
>>  35206.970857:   <2> init: kill(7440, 9) failed: No such process.
>>
>> So how can we stop request of receive buff, if we already have tty_port
>> and tty is getting reinitialized in midway like above
>> case?
> Is the port your console device. If you use a different port as a console
> device does the problem go away - that could be a very important detail
> as the hangup behaviour for the two is quite different.

Yes , we are using ttymsm0 as console device, this is the only port we
are using.

So it seems here, we are getting flush request when init reinitialize
the tty for same port. Please let me know
if some other debug logs are required.

Regards
Gaurav

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2018-01-04 14:37:37

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

On Thu, 4 Jan 2018 19:16:46 +0530
"Kohli, Gaurav" <[email protected]> wrote:

> > Which tty driver ? serial/msm_serial.c ?
>
> We are using our internal driver, msm_geni_serial.c

Can you make that code available otherwise it's impossible to see what
the problem might be.

> >
> > Ok no what I need to see is a trace of what each CPU is doing at the
> > point you detect the problem. That way we can see what the path that
> > races is.
> Below is stack trace running by init in our case on one core
> -006|n_tty_open(
>     |    tty = 0xFFFFFFFF477AC880 -> (
>     |      disc_data = 0xFFFFFF80197AD000,
>
>     |      port = 0xFFFFFFFFEDE40000))
>     |  ldata = 0xFFFFFF80197AD000
>
>     |  trace_printk_fmt = 0xFFFFFF9F275125F8
> -007|tty_ldisc_open.isra.3(
>     |    tty = 0xFFFFFFFF477AC880)
> -008|tty_ldisc_setup(
>
> -009|tty_init_dev(
>     |    driver = 0xFFFFFFFFEDE2A480,
>     |    idx = 0)
>
> -010|tty_open_by_driver(inline)
> -010|tty_open(

So core 1 is opening the tty from user space and that's a normal looking
trace for an open of a port that was closed

>
> Core 2:
> -000|n_tty_receive_buf_common(
>     |    tty = 0xFFFFFFFF477AC880,
>
>     |  ?)
>     |  ldata_=_0x0
>     |  __func__ = (110, 95, 116, 116, 121, 95, 114, 101, 99, 101, 105,
> 118, 101, 95, 98, 117, 102, 95, 99, 111, 109, 109, 111, 110, 0)
>     |  __u = (__val = 7079195495121566464, __c = (0))
>     |  c = 127
>     |  ldata = 0xFFFFFFFFF40DF97C
>
>     |  c = 0
>     |  ldata = 0xFFFFFF9F26F46000
>
> -001|n_tty_receive_buf2(
>     |    tty = 0xFFFFFFFF477AC880,
>
> -002|tty_ldisc_receive_buf(inline)
> -002|receive_buf(inline)
> -002|flush_to_ldisc(

This is probably the important bit. As you say we are doing a flush to
ldisc for a port even though it is not open.

That's starting to make more sense. Becausee your driver is the console
tty_port_shutdown doesn't stop everything (so console printk still
works), and that means you can receive data and we have a window on
reopening a tty that is only in use as a console where port->tty is valid
but ldisc is not.

I wonder what Jiri thinks but my first thougt is that tty_init_dev in
fact needs to do

tty_ldisc_lock(tty, 5 * HZ);
tty_ldisc_setup(tty);
tty_ldisc_unlock(tty)

with the relevant error handling so that the flush_to_ldisc waits and
either hits 'no ldisc' or 'ldisc valid'

Alan

2018-01-05 07:45:51

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

Hi Alan,
>>>
>>> Can you make that code available otherwise it's impossible to see
>>> what the problem might be.
>
>
 https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/tty/serial?h=msm-4.9
 As discussed , there not seems a problem as we are getting print
request even when port seems to closed.


tty_ldisc_lock(tty, 5 * HZ);
 tty_ldisc_setup(tty);
 tty_ldisc_unlock(tty)

But in above lock,  there is a chance when flush_to_ldisc will occur
first and acquired a lock in
tty_ldisc_ref itself.
So this may fail, I am not much sure here, Please correct me, If i am
missing something here.
> So can not we simply return from flush_to_ldisc ,when we know
> disc_data is not valid like
> we are doing for tty and ldisc already?
>
> if (tty->disc_data == NULL) {
>                 tty_ldisc_deref(disc);
>                 return;
>         }
>
>
>
>


Regards
Gaurav
-- Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation
Collaborative Project.

2018-01-05 13:36:32

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

On Fri, 5 Jan 2018 13:15:45 +0530
"Kohli, Gaurav" <[email protected]> wrote:

> Hi Alan,
> >>>
> >>> Can you make that code available otherwise it's impossible to see
> >>> what the problem might be.
> >
> >
>  https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/tty/serial?h=msm-4.9
>  As discussed , there not seems a problem as we are getting print
> request even when port seems to closed.
>
>
> tty_ldisc_lock(tty, 5 * HZ);
>  tty_ldisc_setup(tty);
>  tty_ldisc_unlock(tty)
>
> But in above lock,  there is a chance when flush_to_ldisc will occur
> first and acquired a lock in
> tty_ldisc_ref itself.

Which is fine.

If the flush_to_ldisc gets there first then it will find there is a NULL
ldisc and do nothing. When it finishes the tty_init_dev will run and will
be protected from a further re-entry.

If the init_dev gets there first it will complete the init before the
flush_to_ldisc is permitted to proceed.

In other words we restore the intended invariant that ldisc's do not get
entered while their setup routine is running.

Alan

2018-01-05 13:56:46

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common



On 1/5/2018 7:06 PM, Alan Cox wrote:
> On Fri, 5 Jan 2018 13:15:45 +0530
> "Kohli, Gaurav" <[email protected]> wrote:
>
>> Hi Alan,
>>>>> Can you make that code available otherwise it's impossible to see
>>>>> what the problem might be.
>>>
>>  https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree/drivers/tty/serial?h=msm-4.9
>>  As discussed , there not seems a problem as we are getting print
>> request even when port seems to closed.
>>
>>
>> tty_ldisc_lock(tty, 5 * HZ);
>>  tty_ldisc_setup(tty);
>>  tty_ldisc_unlock(tty)
>>
>> But in above lock,  there is a chance when flush_to_ldisc will occur
>> first and acquired a lock in
>> tty_ldisc_ref itself.
> Which is fine.
>
> If the flush_to_ldisc gets there first then it will find there is a NULL
> ldisc and do nothing. When it finishes the tty_init_dev will run and will
> be protected from a further re-entry.
>
> If the init_dev gets there first it will complete the init before the
> flush_to_ldisc is permitted to proceed.
>
> In other words we restore the intended invariant that ldisc's do not get
> entered while their setup routine is running.
>
>

But in above case , there we can hit another race, if we have a sequence
like this
tty_init_dev->alloc_tty_struct -> tty_ldisc_init -> this will initialize
ldisc ,
but at this moment disc_data is still NULL

And if flush_to_ldisc comes in between, it will take ldisc reference and
proceeds receive buffer.


Regards
Gaurav


--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2018-01-05 14:15:51

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

> But in above case , there we can hit another race, if we have a sequence
> like this
> tty_init_dev->alloc_tty_struct -> tty_ldisc_init -> this will initialize
> ldisc ,
> but at this moment disc_data is still NULL
>
> And if flush_to_ldisc comes in between, it will take ldisc reference and
> proceeds receive buffer.

So you need to move the lock up one line to protect the assignment to
tty->port->itty. We can do that.

At that point your flush_to_ldisc should see either port->itty = NULL or a
valid initialized ldisc.

Alan

2018-01-05 20:15:06

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

Hi Alan,


On 1/5/2018 7:45 PM, Alan Cox wrote:
>> But in above case , there we can hit another race, if we have a sequence
>> like this
>> tty_init_dev->alloc_tty_struct -> tty_ldisc_init -> this will initialize
>> ldisc ,
>> but at this moment disc_data is still NULL
>>
>> And if flush_to_ldisc comes in between, it will take ldisc reference and
>> proceeds receive buffer.
> So you need to move the lock up one line to protect the assignment to
> tty->port->itty. We can do that.
>
> At that point your flush_to_ldisc should see either port->itty = NULL or a
> valid initialized ldisc.
>
>

Yes , with little modification this should work.

+retval =  tty_ldisc_lock(tty, 5 * HZ);
+if (retval)
+         goto err_release_lock;
tty->port->itty = tty;
/*
* Structures all installed ... call the ldisc open routines.
* If we fail here just call release_tty to clean up.  No need
* to decrement the use counts, as release_tty doesn't care.
*/
retval = tty_ldisc_setup(tty, tty->link);
        if (retval)
            goto err_release_tty;
tty_ldisc_unlock(tty);
err_release_tty:
tty_unlock(tty);
tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
    retval, idx);
    release_tty(tty, idx);
    return ERR_PTR(retval);

+err_release_lock;
+tty_ldisc_unlock(tty);
+return ERR_PTR(retval);

Please let me know if above modification seems fine , then we can upload
this and try reproduction of Bug.

Regards
Gaurav

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2018-01-05 20:24:44

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

Hi Alan,

Sorry correcting the typo here:
+retval =  tty_ldisc_lock(tty, 5 * HZ);
+if (retval)
+         goto err_release_lock;
tty->port->itty = tty;
/*
* Structures all installed ... call the ldisc open routines.
* If we fail here just call release_tty to clean up.  No need
* to decrement the use counts, as release_tty doesn't care.
*/
retval = tty_ldisc_setup(tty, tty->link);
        if (retval)
            goto err_release_tty;
tty_ldisc_unlock(tty);
err_release_tty:
tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
    retval, idx);
+err_release_lock;
+tty_unlock(tty);
+release_tty(tty, idx);
+tty_ldisc_unlock(tty);
+return ERR_PTR(retval);

On 1/6/2018 1:44 AM, Kohli, Gaurav wrote:
> Hi Alan,
>
>
> On 1/5/2018 7:45 PM, Alan Cox wrote:
>>> But in above case , there we can hit another race, if we have a
>>> sequence
>>> like this
>>> tty_init_dev->alloc_tty_struct -> tty_ldisc_init -> this will
>>> initialize
>>> ldisc ,
>>> but at this moment disc_data is still NULL
>>>
>>> And if flush_to_ldisc comes in between, it will take ldisc reference
>>> and
>>> proceeds receive buffer.
>> So you need to move the lock up one line to protect the assignment to
>> tty->port->itty. We can do that.
>>
>> At that point your flush_to_ldisc should see either port->itty = NULL
>> or a
>> valid initialized ldisc.
>>
>>
>
> Yes , with little modification this should work.
>
> +retval =  tty_ldisc_lock(tty, 5 * HZ);
> +if (retval)
> +         goto err_release_lock;
> tty->port->itty = tty;
> /*
> * Structures all installed ... call the ldisc open routines.
> * If we fail here just call release_tty to clean up.  No need
> * to decrement the use counts, as release_tty doesn't care.
> */
> retval = tty_ldisc_setup(tty, tty->link);
>         if (retval)
>             goto err_release_tty;
> tty_ldisc_unlock(tty);
> err_release_tty:
> tty_unlock(tty);
> tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
>     retval, idx);
>     release_tty(tty, idx);
>     return ERR_PTR(retval);
>
> +err_release_lock;
> +tty_ldisc_unlock(tty);
> +return ERR_PTR(retval);
>
> Please let me know if above modification seems fine , then we can
> upload this and try reproduction of Bug.
>
> Regards
> Gaurav
>

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2018-01-05 20:28:21

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

HI Alan,

Sorry correcting the typo here:


On 1/6/2018 1:44 AM, Kohli, Gaurav wrote:
> Hi Alan,
>
>
> On 1/5/2018 7:45 PM, Alan Cox wrote:
>>> But in above case , there we can hit another race, if we have a
>>> sequence
>>> like this
>>> tty_init_dev->alloc_tty_struct -> tty_ldisc_init -> this will
>>> initialize
>>> ldisc ,
>>> but at this moment disc_data is still NULL
>>>
>>> And if flush_to_ldisc comes in between, it will take ldisc reference
>>> and
>>> proceeds receive buffer.
>> So you need to move the lock up one line to protect the assignment to
>> tty->port->itty. We can do that.
>>
>> At that point your flush_to_ldisc should see either port->itty = NULL
>> or a
>> valid initialized ldisc.
>>
>>
>
> Yes , with little modification this should work.
>
> +retval =  tty_ldisc_lock(tty, 5 * HZ);
> +if (retval)
> +         goto err_release_lock;
> tty->port->itty = tty;
> /*
> * Structures all installed ... call the ldisc open routines.
> * If we fail here just call release_tty to clean up.  No need
> * to decrement the use counts, as release_tty doesn't care.
> */
> retval = tty_ldisc_setup(tty, tty->link);
>         if (retval)
>             goto err_release_tty;
> tty_ldisc_unlock(tty);
> err_release_tty:
> tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
>     retval, idx);
> +err_release_lock;
    +tty_unlock(tty);
    +release_tty(tty, idx);
> +tty_ldisc_unlock(tty);
> +return ERR_PTR(retval);
>
> Please let me know if above modification seems fine , then we can
> upload this and try reproduction of Bug.
>
> Regards
> Gaurav
>

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2018-01-05 21:07:55

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

On Sat, 6 Jan 2018 01:54:36 +0530
"Kohli, Gaurav" <[email protected]> wrote:

> Hi Alan,
>
> Sorry correcting the typo here:
> +retval =  tty_ldisc_lock(tty, 5 * HZ);
> +if (retval)
> +         goto err_release_lock;
> tty->port->itty = tty;
> /*
> * Structures all installed ... call the ldisc open routines.
> * If we fail here just call release_tty to clean up.  No need
> * to decrement the use counts, as release_tty doesn't care.
> */
> retval = tty_ldisc_setup(tty, tty->link);
>         if (retval)
>             goto err_release_tty;
> tty_ldisc_unlock(tty);
> err_release_tty:
> tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
>     retval, idx);
> +err_release_lock;
> +tty_unlock(tty);
> +release_tty(tty, idx);
> +tty_ldisc_unlock(tty);
> +return ERR_PTR(retval);

Thanks - can you give that a testing since for some reason you seem to be
the only system able to hit this and confirm that it's now working
properly. Then I'll push it upstream

And thanks for doing all the debug work to find this and identify what
was going on.

Alan

2018-01-06 07:50:58

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common



On 1/6/2018 2:35 AM, Alan Cox wrote:
> On Sat, 6 Jan 2018 01:54:36 +0530
> "Kohli, Gaurav" <[email protected]> wrote:
>
>> Hi Alan,
>>
>> Sorry correcting the typo here:
>> +retval =  tty_ldisc_lock(tty, 5 * HZ);
>> +if (retval)
>> +         goto err_release_lock;
>> tty->port->itty = tty;
>> /*
>> * Structures all installed ... call the ldisc open routines.
>> * If we fail here just call release_tty to clean up.  No need
>> * to decrement the use counts, as release_tty doesn't care.
>> */
>> retval = tty_ldisc_setup(tty, tty->link);
>>         if (retval)
>>             goto err_release_tty;
>> tty_ldisc_unlock(tty);
>> err_release_tty:
>> tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
>>     retval, idx);
>> +err_release_lock;
>> +tty_unlock(tty);
>> +release_tty(tty, idx);
>> +tty_ldisc_unlock(tty);
>> +return ERR_PTR(retval);
> Thanks - can you give that a testing since for some reason you seem to be
> the only system able to hit this and confirm that it's now working
> properly. Then I'll push it upstream
>
> And thanks for doing all the debug work to find this and identify what
> was going on.
>
> Alan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


Thanks Alan for your support, yes we will try to reproduce and get back
to you.
Ideally it take 2-3 days for issue reproduction, but yes it is
consistently reproducible.

Regards
Gaurav

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2018-01-17 13:25:57

by Gaurav Kohli

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

On 1/6/2018 1:20 PM, Kohli, Gaurav wrote:

> On 1/6/2018 2:35 AM, Alan Cox wrote:
>> On Sat, 6 Jan 2018 01:54:36 +0530
>> "Kohli, Gaurav" <[email protected]> wrote:
>>> Hi Alan,
>>> Sorry correcting the typo here:
>>> +retval =  tty_ldisc_lock(tty, 5 * HZ);
>>> +if (retval)
>>> +         goto err_release_lock;
>>> tty->port->itty = tty;
>>> /*
>>> * Structures all installed ... call the ldisc open routines.
>>> * If we fail here just call release_tty to clean up.  No need
>>> * to decrement the use counts, as release_tty doesn't care.
>>> */
>>> retval = tty_ldisc_setup(tty, tty->link);
>>>           if (retval)
>>>               goto err_release_tty;
>>> tty_ldisc_unlock(tty);
>>> err_release_tty:
>>> tty_info_ratelimited(tty, "ldisc open failed (%d), clearing slot %d\n",
>>>       retval, idx);
>>> +err_release_lock;
>>> +tty_unlock(tty);
>>> +release_tty(tty, idx);
>>> +tty_ldisc_unlock(tty);
>>> +return ERR_PTR(retval);
>> Thanks - can you give that a testing since for some reason you seem to be
>> the only system able to hit this and confirm that it's now working
>> properly. Then I'll push it upstream
>> And thanks for doing all the debug work to find this and identify what
>> was going on.
>> Alan
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
>> the body of a message to [email protected]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Thanks Alan for your support, yes we will try to reproduce and get back
> to you.
> Ideally it take 2-3 days for issue reproduction, but yes it is
> consistently reproducible.
> Regards
> Gaurav

Hi Alan,
Sorry for the delayed response, as i was waiting for test results.
I have uploaded the new patch v1 as per your suggestions and result
looks good.

Thanks
Gaurav
>
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

2018-01-20 18:53:52

by Alan Cox

[permalink] [raw]
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common

> Sorry for the delayed response, as i was waiting for test results.
> I have uploaded the new patch v1 as per your suggestions and result
> looks good.

Thanks for all the testing. This looks good to me too.

Alan