Date: Fri, 7 Apr 2017 09:24:59 +0800
From: Wang YanQing
To: Michael Neuling
Cc: Al Viro, johan Hovold, Peter Hurley, Alexander Popov, Rob Herring,
    Mikulas Patocka, Dmitry Vyukov, benh, LKML
Subject: Re: tty crash in tty_ldisc_receive_buf()
Message-ID: <20170407012459.GA3431@udknight>
In-Reply-To: <1491462281.2815.47.camel@neuling.org>

On Thu, Apr 06, 2017 at 05:04:41PM +1000, Michael Neuling wrote:
> Hi all,
>
> We are seeing the following crash (in linux-next but has been around since at
> least v4.10).
>
> [  417.514499] Unable to handle kernel paging request for data at address 0x00002260
> [  417.515361] Faulting instruction address: 0xc0000000006fad80
> cpu 0x15: Vector: 300 (Data Access) at [c00000799411f890]
>     pc: c0000000006fad80: n_tty_receive_buf_common+0xc0/0xbd0
>     lr: c0000000006fad5c: n_tty_receive_buf_common+0x9c/0xbd0
>     sp: c00000799411fb10
>    msr: 900000000280b033
>    dar: 2260
>  dsisr: 40000000
>   current = 0xc0000079675d1e00
>   paca    = 0xc00000000fb0d200  softe: 0  irq_happened: 0x01
>     pid   = 5, comm = kworker/u56:0
> Linux version 4.11.0-rc5-next-20170405 (mikey@bml86) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #2 SMP Thu Apr 6 00:36:46 CDT 2017
> enter ? for help
> [c00000799411fbe0] c0000000006ff968 tty_ldisc_receive_buf+0x48/0xe0
> [c00000799411fc10] c0000000007009d8 tty_port_default_receive_buf+0x68/0xe0
> [c00000799411fc50] c0000000006ffce4 flush_to_ldisc+0x114/0x130
> [c00000799411fca0] c00000000010a0fc process_one_work+0x1ec/0x580
> [c00000799411fd30] c00000000010a528 worker_thread+0x98/0x5d0
> [c00000799411fdc0] c00000000011343c kthread+0x16c/0x1b0
> [c00000799411fe30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74
>
> It seems the null ptr deref is in n_tty_receive_buf_common() where we do:
>
> 	size_t tail = smp_load_acquire(&ldata->read_tail);
>
> ldata is NULL.
>
> We see this usually on boot but can also see it if we kill a getty attached to
> tty (which is then respawned by systemd). It seems like we are flushing data to
> a tty at the same time as it's being torn down and restarted.
>
> I did try the below patch which avoids the crash but locks up one of the CPUs. I
> guess the data never gets flushed if we say nothing is processed.
>
> This is on powerpc but has also been reported by parisc.
>
> I'm not at all familiar with the tty layer and looking at the locks, mutexes,
> semaphores and reference counting in there scares the hell out of me.
>
> If anyone has an idea, I'm happy to try a patch.
>
> Regards,
> Mikey
>
> diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
> index bdf0e6e899..99dd757aa4 100644
> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c
> @@ -1673,6 +1673,10 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
>  
>  	down_read(&tty->termios_rwsem);
>  
> +	/* This probably shouldn't happen, but return 0 data processed */
> +	if (!ldata)
> +		return 0;
> +
>  	while (1) {
>  		/*
>  		 * When PARMRK is set, each input char may take up to 3 chars

Maybe your patch should look like:

+	/* This probably shouldn't happen, but return 0 data processed */
+	if (!ldata) {
+		up_read(&tty->termios_rwsem);
+		return 0;
+	}

Or maybe the patch below would work:

@@ -1668,11 +1668,12 @@ static int
 n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 			 char *fp, int count, int flow)
 {
-	struct n_tty_data *ldata = tty->disc_data;
+	struct n_tty_data *ldata;
 	int room, n, rcvd = 0, overflow;
 
 	down_read(&tty->termios_rwsem);
+	ldata = tty->disc_data;