There seems to be a race condition in tty drivers and I could see on
many boot cycles a NULL pointer dereference as tty_init_dev() tries to
do 'tty->port->itty = tty' even though tty->port is NULL.
'tty->port' will be set by the driver and if the driver has not yet done
it before we open the tty device we can get to this situation. By adding
some extra debug prints, I noticed that:
6.650130: uart_add_one_port
6.663849: register_console
6.664846: tty_open
6.674391: tty_init_dev
6.675456: tty_port_link_device
uart_add_one_port() registers the console, as soon as it registers, the
userspace tries to use it and that leads to tty_open() but
uart_add_one_port() has not yet done tty_port_link_device() and so
tty->port is not yet configured when control reaches tty_init_dev().
So, add one retry and use tty_init_dev_retry().
Signed-off-by: Sudip Mukherjee <[email protected]>
---
v1: had some hardcoded numbers which were difficult to understand.
https://lore.kernel.org/lkml/[email protected]/
I know this is not a proper fix; a proper fix would add locking around
the port setup. But that would be too intrusive, and adding a retry was
the safer option.
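For illustration only, the ordering in the trace above can be modelled in
user space. This is a minimal sketch, not kernel code: every model_* name
is hypothetical, and it assumes that the open path takes the port from a
per-driver table which only the link step fills in.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_PORTS 4

/* Hypothetical stand-in for a tty port. */
struct model_port {
	int linked;
};

/* Hypothetical stand-in for a tty driver and its port table. */
struct model_driver {
	struct model_port *ports[MODEL_NR_PORTS];
	int console_registered;	/* non-zero once opens can arrive */
};

/* Models the link step: publish the port for a given index. */
static void model_link_port(struct model_driver *drv,
			    struct model_port *port, unsigned int index)
{
	assert(index < MODEL_NR_PORTS);
	drv->ports[index] = port;
	port->linked = 1;
}

/* Models console registration: from here on, opens may arrive. */
static void model_register_console(struct model_driver *drv)
{
	drv->console_registered = 1;
}

/*
 * Models the open path.
 * Returns 0 on success, -1 if the console is not registered yet, and
 * -2 if the port table entry is still NULL (the case in which the
 * report describes a NULL pointer dereference).
 */
static int model_open(const struct model_driver *drv, unsigned int index)
{
	assert(index < MODEL_NR_PORTS);
	if (!drv->console_registered)
		return -1;
	if (drv->ports[index] == NULL)
		return -2;
	return 0;
}
```

Calling model_register_console() and then model_open() before
model_link_port() hits the -2 case, which corresponds to the window
between register_console and tty_port_link_device in the trace.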
drivers/tty/tty_io.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 95d7abeca254..f71a11895230 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1945,6 +1945,7 @@ EXPORT_SYMBOL_GPL(tty_kopen);
/**
* tty_open_by_driver - open a tty device
* @device: dev_t of device to open
+ * @retry: number of times to retry if tty_init_dev_retry fails
* @filp: file pointer to tty
*
* Performs the driver lookup, checks for a reopen, or otherwise
@@ -1957,7 +1958,7 @@ EXPORT_SYMBOL_GPL(tty_kopen);
* - concurrent tty driver removal w/ lookup
* - concurrent tty removal from driver table
*/
-static struct tty_struct *tty_open_by_driver(dev_t device,
+static struct tty_struct *tty_open_by_driver(dev_t device, int retry,
struct file *filp)
{
struct tty_struct *tty;
@@ -2001,7 +2002,7 @@ static struct tty_struct *tty_open_by_driver(dev_t device,
tty = ERR_PTR(retval);
}
} else { /* Returns with the tty_lock held for now */
- tty = tty_init_dev(driver, index);
+ tty = tty_init_dev_retry(driver, index, retry);
mutex_unlock(&tty_mutex);
}
out:
@@ -2036,7 +2037,7 @@ static struct tty_struct *tty_open_by_driver(dev_t device,
static int tty_open(struct inode *inode, struct file *filp)
{
struct tty_struct *tty;
- int noctty, retval;
+ int noctty, retval, retry = 1;
dev_t device = inode->i_rdev;
unsigned saved_flags = filp->f_flags;
@@ -2049,7 +2050,7 @@ static int tty_open(struct inode *inode, struct file *filp)
tty = tty_open_current_tty(device, filp);
if (!tty)
- tty = tty_open_by_driver(device, filp);
+ tty = tty_open_by_driver(device, retry--, filp);
if (IS_ERR(tty)) {
tty_free_file(filp);
--
2.11.0
On Thu, Nov 21, 2019 at 03:22:39PM +0000, Sudip Mukherjee wrote:
> There seems to be a race condition in tty drivers and I could see on
> many boot cycles a NULL pointer dereference as tty_init_dev() tries to
> do 'tty->port->itty = tty' even though tty->port is NULL.
> 'tty->port' will be set by the driver and if the driver has not yet done
> it before we open the tty device we can get to this situation. By adding
> some extra debug prints, I noticed that:
>
> 6.650130: uart_add_one_port
> 6.663849: register_console
> 6.664846: tty_open
> 6.674391: tty_init_dev
> 6.675456: tty_port_link_device
>
> uart_add_one_port() registers the console, as soon as it registers, the
> userspace tries to use it and that leads to tty_open() but
> uart_add_one_port() has not yet done tty_port_link_device() and so
> tty->port is not yet configured when control reaches tty_init_dev().
Shouldn't we do tty_port_link_device() before uart_add_one_port() to
remove that race? Once you register the console, yes, tty_open() can
happen, so the driver had better be ready to go at that point in time.
This feels like it should be fixed by the caller, not in the tty core.
Any reason that can not happen?
thanks,
greg k-h
Hi Greg,
On Thu, Nov 21, 2019 at 05:41:38PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Nov 21, 2019 at 03:22:39PM +0000, Sudip Mukherjee wrote:
> > There seems to be a race condition in tty drivers and I could see on
> > many boot cycles a NULL pointer dereference as tty_init_dev() tries to
> > do 'tty->port->itty = tty' even though tty->port is NULL.
> > 'tty->port' will be set by the driver and if the driver has not yet done
> > it before we open the tty device we can get to this situation. By adding
> > some extra debug prints, I noticed that:
> >
> > 6.650130: uart_add_one_port
> > 6.663849: register_console
> > 6.664846: tty_open
> > 6.674391: tty_init_dev
> > 6.675456: tty_port_link_device
> >
> > uart_add_one_port() registers the console, as soon as it registers, the
> > userspace tries to use it and that leads to tty_open() but
> > uart_add_one_port() has not yet done tty_port_link_device() and so
> > tty->port is not yet configured when control reaches tty_init_dev().
>
> Shouldn't we do tty_port_link_device() before uart_add_one_port() to
> remove that race? Once you register the console, yes, tty_open() can
> happen, so the driver had better be ready to go at that point in time.
>
But tty_port_link_device() is done by uart_add_one_port() itself.
After registering the console uart_add_one_port() will call
tty_port_register_device_attr_serdev() and tty_port_link_device() is
called from this. That's still tty core.
> This feels like it should be fixed by the caller, not in the tty core.
> Any reason that can not happen?
tty_port_register_device_attr_serdev() is part of tty core. Or is my
above understanding wrong?
--
Regards
Sudip
On 21. 11. 19, 22:01, Sudip Mukherjee wrote:
> Hi Greg,
>
> On Thu, Nov 21, 2019 at 05:41:38PM +0100, Greg Kroah-Hartman wrote:
>> On Thu, Nov 21, 2019 at 03:22:39PM +0000, Sudip Mukherjee wrote:
>>> There seems to be a race condition in tty drivers and I could see on
>>> many boot cycles a NULL pointer dereference as tty_init_dev() tries to
>>> do 'tty->port->itty = tty' even though tty->port is NULL.
>>> 'tty->port' will be set by the driver and if the driver has not yet done
>>> it before we open the tty device we can get to this situation. By adding
>>> some extra debug prints, I noticed that:
>>>
>>> 6.650130: uart_add_one_port
>>> 6.663849: register_console
>>> 6.664846: tty_open
>>> 6.674391: tty_init_dev
>>> 6.675456: tty_port_link_device
>>>
>>> uart_add_one_port() registers the console, as soon as it registers, the
>>> userspace tries to use it and that leads to tty_open() but
>>> uart_add_one_port() has not yet done tty_port_link_device() and so
>>> tty->port is not yet configured when control reaches tty_init_dev().
>>
>> Shouldn't we do tty_port_link_device() before uart_add_one_port() to
>> remove that race? Once you register the console, yes, tty_open() can
>> happen, so the driver had better be ready to go at that point in time.
>>
>
> But tty_port_link_device() is done by uart_add_one_port() itself.
> After registering the console uart_add_one_port() will call
> tty_port_register_device_attr_serdev() and tty_port_link_device() is
> called from this. That's still tty core.
Interferences of console vs tty code are ugly. Does it help to simply
put tty_port_link_device to uart_add_one_port before uart_configure_port?
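The effect of that reordering can be sketched with a small user-space
model. It is only an illustration under stated assumptions: the model_*
names are hypothetical, and it assumes, as the trace suggests, that the
console is registered from within uart_configure_port() and that an open
can arrive at that very moment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state shared between the setup path and the opener. */
struct model_state {
	const void *linked_port;	/* non-NULL once the port is linked */
	int open_saw_null;		/* set if an open saw no port */
};

/* Simulates an open() arriving the instant the console is registered. */
static void model_console_registered(struct model_state *st)
{
	if (st->linked_port == NULL)
		st->open_saw_null = 1;
}

/* Models the link step. */
static void model_link(struct model_state *st, const void *port)
{
	st->linked_port = port;
}

/*
 * Models the port setup in both orders.
 * link_first == 0: the order reported in the trace (register, then link).
 * link_first != 0: the suggested order (link, then register).
 * Returns 1 if the simulated open observed an unlinked port, else 0.
 */
static int model_add_one_port(struct model_state *st, const void *port,
			      int link_first)
{
	assert(port != NULL);
	st->linked_port = NULL;
	st->open_saw_null = 0;

	if (link_first)
		model_link(st, port);
	model_console_registered(st);	/* stands in for the configure step */
	if (!link_first)
		model_link(st, port);

	return st->open_saw_null;
}
```

With link_first set to 0 the simulated open sees no port; with it set
to 1 the port is already published when the console becomes visible.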
thanks,
--
js
suse labs
On 22. 11. 19, 10:05, Jiri Slaby wrote:
> On 21. 11. 19, 22:01, Sudip Mukherjee wrote:
>> Hi Greg,
>>
>> On Thu, Nov 21, 2019 at 05:41:38PM +0100, Greg Kroah-Hartman wrote:
>>> On Thu, Nov 21, 2019 at 03:22:39PM +0000, Sudip Mukherjee wrote:
>>>> There seems to be a race condition in tty drivers and I could see on
>>>> many boot cycles a NULL pointer dereference as tty_init_dev() tries to
>>>> do 'tty->port->itty = tty' even though tty->port is NULL.
>>>> 'tty->port' will be set by the driver and if the driver has not yet done
>>>> it before we open the tty device we can get to this situation. By adding
>>>> some extra debug prints, I noticed that:
>>>>
>>>> 6.650130: uart_add_one_port
>>>> 6.663849: register_console
>>>> 6.664846: tty_open
>>>> 6.674391: tty_init_dev
>>>> 6.675456: tty_port_link_device
>>>>
>>>> uart_add_one_port() registers the console, as soon as it registers, the
>>>> userspace tries to use it and that leads to tty_open() but
>>>> uart_add_one_port() has not yet done tty_port_link_device() and so
>>>> tty->port is not yet configured when control reaches tty_init_dev().
>>>
>>> Shouldn't we do tty_port_link_device() before uart_add_one_port() to
>>> remove that race? Once you register the console, yes, tty_open() can
>>> happen, so the driver had better be ready to go at that point in time.
>>>
>>
>> But tty_port_link_device() is done by uart_add_one_port() itself.
>> After registering the console uart_add_one_port() will call
>> tty_port_register_device_attr_serdev() and tty_port_link_device() is
>> called from this. That's still tty core.
>
> Interferences of console vs tty code are ugly. Does it help to simply
> put tty_port_link_device to uart_add_one_port before uart_configure_port?
Alternatively, you can try setting tty_port in uart_install by:
tty->port = &state->port.
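A rough user-space sketch of that alternative follows. The model_* types
and functions are hypothetical and only mirror the idea: if the install
hook sets the port itself, the later init step no longer depends on the
driver's port table having been filled in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the port, the uart state and the tty. */
struct model_port {
	int id;
};

struct model_state {
	struct model_port port;
};

struct model_tty {
	struct model_port *port;
};

/* Models the suggested install hook: point the tty at the state's port. */
static void model_install(struct model_tty *tty, struct model_state *state)
{
	assert(tty != NULL && state != NULL);
	tty->port = &state->port;
}

/*
 * Models the init step: fall back to the table entry only when the
 * install hook did not set a port.
 * Returns 0 if a port is available afterwards, -1 otherwise.
 */
static int model_init_dev(struct model_tty *tty,
			  struct model_port *table_entry)
{
	if (tty->port == NULL)
		tty->port = table_entry;
	return tty->port != NULL ? 0 : -1;
}
```

Passing a NULL table entry to model_init_dev() fails for a tty that
skipped model_install() and succeeds for one that went through it.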
BTW do you see the warning from tty_init_dev:
"driver does not set tty->port. This will crash the kernel later. Fix
the driver!\n"
? Maybe not given console is registered already, but crashes.
> thanks,
--
js
suse labs
On Fri, Nov 22, 2019 at 10:11:26AM +0100, Jiri Slaby wrote:
> On 22. 11. 19, 10:05, Jiri Slaby wrote:
> > On 21. 11. 19, 22:01, Sudip Mukherjee wrote:
> >> Hi Greg,
> >>
> >> On Thu, Nov 21, 2019 at 05:41:38PM +0100, Greg Kroah-Hartman wrote:
> >>> On Thu, Nov 21, 2019 at 03:22:39PM +0000, Sudip Mukherjee wrote:
> >>>> There seems to be a race condition in tty drivers and I could see on
> >>>> many boot cycles a NULL pointer dereference as tty_init_dev() tries to
<snip>
> >
> > Interferences of console vs tty code are ugly. Does it help to simply
> > put tty_port_link_device to uart_add_one_port before uart_configure_port?
>
> Alternatively, you can try setting tty_port in uart_install by:
> tty->port = &state->port.
I have not tried these yet. Will try.
>
> BTW do you see the warning from tty_init_dev:
> "driver does not set tty->port. This will crash the kernel later. Fix
> the driver!\n"
> ? Maybe not given console is registered already, but crashes.
Yes, I do see the warning, but I have always assumed that the warning is
because the console is opened as soon as it registers and so uart_add_one_port()
does not get the chance to link it. Is it not so?
--
Regards
Sudip
On 24. 11. 19, 1:02, Sudip Mukherjee wrote:
>> BTW do you see the warning from tty_init_dev:
>> "driver does not set tty->port. This will crash the kernel later. Fix
>> the driver!\n"
>> ? Maybe not given console is registered already, but crashes.
>
> Yes, I do see the warning, but I have always assumed that the warning is
> because the console is opened as soon as it registers and so uart_add_one_port()
> does not get the chance to link it. Is it not so?
Yes it is, I was just curious...
thanks,
--
js
suse labs
Hi Jiri,
On Fri, Nov 22, 2019 at 10:05:09AM +0100, Jiri Slaby wrote:
> On 21. 11. 19, 22:01, Sudip Mukherjee wrote:
> > Hi Greg,
> >
> > On Thu, Nov 21, 2019 at 05:41:38PM +0100, Greg Kroah-Hartman wrote:
> >> On Thu, Nov 21, 2019 at 03:22:39PM +0000, Sudip Mukherjee wrote:
> >>> There seems to be a race condition in tty drivers and I could see on
> >>> many boot cycles a NULL pointer dereference as tty_init_dev() tries to
> >>> do 'tty->port->itty = tty' even though tty->port is NULL.
<snip>
> >>>
> >>> uart_add_one_port() registers the console, as soon as it registers, the
> >>> userspace tries to use it and that leads to tty_open() but
> >>> uart_add_one_port() has not yet done tty_port_link_device() and so
> >>> tty->port is not yet configured when control reaches tty_init_dev().
> >>
> >> Shouldn't we do tty_port_link_device() before uart_add_one_port() to
> >> remove that race? Once you register the console, yes, tty_open() can
> >> happen, so the driver had better be ready to go at that point in time.
> >>
> >
> > But tty_port_link_device() is done by uart_add_one_port() itself.
> > After registering the console uart_add_one_port() will call
> > tty_port_register_device_attr_serdev() and tty_port_link_device() is
> > > called from this. That's still tty core.
>
> Interferences of console vs tty code are ugly. Does it help to simply
> put tty_port_link_device to uart_add_one_port before uart_configure_port?
Sorry for the late response, I got busy with an out-of-tree driver.
Putting tty_port_link_device() before uart_configure_port() fixes the
problem. Please check the attached patch, which completely fixes it.
Do you want me to send a proper patch for it or do you want me to look
into it further?
--
Regards
Sudip
On Tue, Dec 10, 2019 at 11:41:47AM +0000, Sudip Mukherjee wrote:
> Hi Jiri,
>
> On Fri, Nov 22, 2019 at 10:05:09AM +0100, Jiri Slaby wrote:
> > On 21. 11. 19, 22:01, Sudip Mukherjee wrote:
> > > Hi Greg,
> > >
> > > On Thu, Nov 21, 2019 at 05:41:38PM +0100, Greg Kroah-Hartman wrote:
> > >> On Thu, Nov 21, 2019 at 03:22:39PM +0000, Sudip Mukherjee wrote:
> > >>> There seems to be a race condition in tty drivers and I could see on
> > >>> many boot cycles a NULL pointer dereference as tty_init_dev() tries to
> > >>> do 'tty->port->itty = tty' even though tty->port is NULL.
> <snip>
> > >>>
> > >>> uart_add_one_port() registers the console, as soon as it registers, the
> > >>> userspace tries to use it and that leads to tty_open() but
> > >>> uart_add_one_port() has not yet done tty_port_link_device() and so
> > >>> tty->port is not yet configured when control reaches tty_init_dev().
> > >>
> > >> Shouldn't we do tty_port_link_device() before uart_add_one_port() to
> > >> remove that race? Once you register the console, yes, tty_open() can
> > >> happen, so the driver had better be ready to go at that point in time.
> > >>
> > >
> > > But tty_port_link_device() is done by uart_add_one_port() itself.
> > > After registering the console uart_add_one_port() will call
> > > tty_port_register_device_attr_serdev() and tty_port_link_device() is
> > > > called from this. That's still tty core.
> >
> > Interferences of console vs tty code are ugly. Does it help to simply
> > put tty_port_link_device to uart_add_one_port before uart_configure_port?
>
> Sorry for the late response, I got busy with an out-of-tree driver.
>
> Putting tty_port_link_device() before uart_configure_port() fixes the
> problem. Please check the attached patch, which completely fixes it.
> Do you want me to send a proper patch for it or do you want me to look
> into it further?
This looks a lot more sane to me, can you resend it in proper format so
that I can apply it?
thanks,
greg k-h