Date: Thu, 4 Jan 2018 14:37:16 +0000
From: Alan Cox
To: "Kohli, Gaurav"
Cc: jslaby@suse.com, gregkh@linuxfoundation.org, mikey@neuling.org,
    linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org
Subject: Re: [PATCH] tty: fix data race in n_tty_receive_buf_common
Message-ID: <20180104143716.5b09b1c7@alans-desktop>
Organization: Intel Corporation

On Thu, 4 Jan 2018 19:16:46 +0530
"Kohli, Gaurav" wrote:

> > Which tty driver ? serial/msm_serial.c ?
>
> We are using our internal driver, msm_geni_serial.c

Can you make that code available? Otherwise it's impossible to see what
the problem might be.

> >
> > Ok no what I need to see is a trace of what each CPU is doing at the
> > point you detect the problem. That way we can see what the path that
> > races is.
> Below is stack trace running by init in our case on one core
> -006|n_tty_open(
>     |    tty = 0xFFFFFFFF477AC880 -> (
>     |      disc_data = 0xFFFFFF80197AD000,
>
>     |      port = 0xFFFFFFFFEDE40000))
>     |  ldata = 0xFFFFFF80197AD000
>
>     |  trace_printk_fmt = 0xFFFFFF9F275125F8
> -007|tty_ldisc_open.isra.3(
>     |    tty = 0xFFFFFFFF477AC880)
> -008|tty_ldisc_setup(
>
> -009|tty_init_dev(
>     |    driver = 0xFFFFFFFFEDE2A480,
>     |    idx = 0)
>
> -010|tty_open_by_driver(inline)
> -010|tty_open(

So core 1 is opening the tty from user space and that's a normal-looking
trace for an open of a port that was closed.

>
> Core 2:
> -000|n_tty_receive_buf_common(
>     |    tty = 0xFFFFFFFF477AC880,
>
>     |  ?)
>     |  ldata_=_0x0
>     |  __func__ = (110, 95, 116, 116, 121, 95, 114, 101, 99, 101, 105,
> 118, 101, 95, 98, 117, 102, 95, 99, 111, 109, 109, 111, 110, 0)
>     |  __u = (__val = 7079195495121566464, __c = (0))
>     |  c = 127
>     |  ldata = 0xFFFFFFFFF40DF97C
>
>     |  c = 0
>     |  ldata = 0xFFFFFF9F26F46000
>
> -001|n_tty_receive_buf2(
>     |    tty = 0xFFFFFFFF477AC880,
>
> -002|tty_ldisc_receive_buf(inline)
> -002|receive_buf(inline)
> -002|flush_to_ldisc(

This is probably the important bit. As you say we are doing a flush to
ldisc for a port even though it is not open. That's starting to make
more sense.

Because your driver is the console, tty_port_shutdown doesn't stop
everything (so console printk still works), and that means you can
receive data and we have a window on reopening a tty that is only in use
as a console where port->tty is valid but ldisc is not.

I wonder what Jiri thinks but my first thought is that tty_init_dev in
fact needs to do

	tty_ldisc_lock(tty, 5 * HZ);
	tty_ldisc_setup(tty);
	tty_ldisc_unlock(tty)

with the relevant error handling so that the flush_to_ldisc waits and
either hits 'no ldisc' or 'ldisc valid'

Alan
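
To make the window concrete, here is a small single-threaded userspace
model of the interleaving discussed above. It is only a sketch under
stated assumptions: model_tty, model_flush and
model_open_with_racing_flush are made-up names, not kernel APIs, and the
"lock" is a plain flag standing in for whatever tty_ldisc_lock() would
hold. The open is split into two steps and a receive is attempted in
between, which is the point where core 2 was caught.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only; none of these names exist in the kernel. */
enum rx_result {
    RX_NO_LDISC,    /* receive path finds nothing to deliver to */
    RX_DELIVERED,   /* ldisc fully open, data handed over */
    RX_BLOCKED,     /* lock held; real code would sleep until unlock */
    RX_NULL_DEREF   /* ldisc visible but its private data is NULL */
};

struct model_tty {
    bool port_tty_valid;  /* stands in for port->tty being set */
    bool ldisc_attached;  /* stands in for tty->ldisc being set */
    void *disc_data;      /* filled in by the ldisc open routine */
    bool ldisc_locked;    /* stands in for the ldisc lock being held */
};

static int dummy_ldata;   /* placeholder for the n_tty private data */

/* What a flush worker would observe if it ran at this instant. */
enum rx_result model_flush(const struct model_tty *t)
{
    assert(t != NULL);
    if (t->ldisc_locked)
        return RX_BLOCKED;
    if (!t->port_tty_valid || !t->ldisc_attached)
        return RX_NO_LDISC;
    if (t->disc_data == NULL)
        return RX_NULL_DEREF;
    return RX_DELIVERED;
}

/*
 * Open path with a receive attempted halfway through.
 * Returns what that racing receive observed.
 */
enum rx_result model_open_with_racing_flush(struct model_tty *t,
                                            bool hold_lock)
{
    enum rx_result seen;

    assert(t != NULL);
    if (hold_lock)
        t->ldisc_locked = true;

    /* Step 1: the tty becomes reachable from the port. */
    t->port_tty_valid = true;
    t->ldisc_attached = true;

    /* Racing receive arrives inside the window. */
    seen = model_flush(t);

    /* Step 2: the ldisc open routine sets up its private data. */
    t->disc_data = &dummy_ldata;

    if (hold_lock)
        t->ldisc_locked = false;
    return seen;
}
```

Without the lock the racing receive in this model reports RX_NULL_DEREF;
with it the receive reports RX_BLOCKED and a later attempt is delivered
normally. The real code would sleep rather than return a status, and
would need the error handling mentioned above.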