Date: Fri, 1 Feb 2019 15:26:42 +0100
From: "gregkh@linuxfoundation.org"
To: Maninder Singh
Cc: "peter@hurleysoftware.com", "jslaby@suse.com", "keun-o.park@darkmatter.ae", "linux-kernel@vger.kernel.org", AMIT SAHRAWAT, Vaneet Narang, Rohit Thapliyal, Ayush Mittal
Subject: Re: race between flush_to_ldisc and pty_cleanup
Message-ID: <20190201142642.GB3211@kroah.com>
References:
 <20190201133326epcms5p506416bc4ae22f600ee705f146ca1a599@epcms5p5>
In-Reply-To: <20190201133326epcms5p506416bc4ae22f600ee705f146ca1a599@epcms5p5>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 01, 2019 at 07:03:26PM +0530, Maninder Singh wrote:
> Hi,
>
>
> There is some race condition between tty_port_put and flush_to_ldisc
> which lead to use after free case:
> (Kernel 4.1)
>
> [1403.5130] Unable to handle kernel paging request at virtual address 6b6b6b83
> ...
> ...
> ...
>
> [1403.5132] [] (ldsem_down_read_trylock) from [] (tty_ldisc_ref+0x24/0x60)
> [1403.5132] [] (tty_ldisc_ref) from [] (flush_to_ldisc+0x6c/0x21c)
> [1403.5132] r5:dbcd4a84 r4:00000000
> [1403.5132] [] (flush_to_ldisc) from [] (process_one_work+0x214/0x570)
> [1403.5132] r10:00000000 r9:ddab0000 r8:e3d6e000 r7:00000000 r6:e453f740 r5:cb37b780
> [1403.5132] r4:dbcd4a84
> [1403.5132] [] (process_one_work) from [] (worker_thread+0x60/0x580)
> [1403.5132] r10:e453f740 r9:ddab0000 r8:e453f764 r7:00000088 r6:e453f740 r5:cb37b798
> [1403.5132] r4:cb37b780
> [1403.5132] [] (worker_thread) from [] (kthread+0xec/0x104)
> [1403.5132] r10:00000000 r9:00000000 r8:00000000 r7:c004a274 r6:cb37b780 r5:d8a3fc80
> [1403.5132] r4:00000000
> [1403.5132] [] (kthread) from [] (ret_from_fork+0x14/0x3c)
>
>
> for checking further we entered some debug prints and added delay in flush_to_ldisc to reproduce
> and seems there is some issue with workqueue implementation of TTY:
>
> bool tty_buffer_cancel_work(struct tty_port *port)
> {
>         bool ret;
>         ret = cancel_work_sync(&port->buf.work); // Check return value of cancel_work_sync
>         pr_emerg("Work cancelled is 0x%x %pS %d\n", (unsigned int)&port->buf.work, (void *)_RET_IP_, ret);
>         return ret;
> }
>
> static void flush_to_ldisc(struct work_struct *work)
> {
>         ...
>         mdelay(100); // Added Delay to reproduce race
>
>         if (flag_work_cancel) {
>                 pr_emerg("scheduled work after stopping work %x\n", (unsigned int)work);
>
>         ....
> }
>
> static void pty_cleanup(struct tty_struct *tty)
> {
>         ...
>         flag_work_cancel = 1;
>         ...
> }
>
>
> [1403.4158]Work cancelled is dbcd4a84 tty_port_destroy+0x1c/0x6c 0 // Since return is 0 so no work is pending
>
> [1403.5129] scheduled work after stopping work dbcd4a84 // Still same work is scheduled after cancelled
> [1403.5130] Unable to handle kernel paging request at virtual address 6b6b6b83 // Kernel OOPs occured because of use after free

Ok, after my initial "use a newer kernel" comment, this really does look
strange.  There have also been a lot of workqueue fixes and rework since
4.1, and that might be the thing that fixes this issue here.

However, are you sure you are not just calling flush_to_ldisc() directly
through some codepath somehow?  If you look at the stack in the
pr_emerg() message, where did it come from?  From the same workqueue
that you already stopped?

Testing on a newer kernel would be great, if possible.

thanks,

greg k-h
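
Illustration of the question above, as a single-threaded userspace toy
model and not kernel code.  Every model_* name is hypothetical.  It only
shows why a cancel that returns 0 (nothing pending at that moment) does
not by itself rule out a later run of the same work item: anything that
queues the work again after the cancel, or calls the handler directly,
still reaches the callback.  Whether that is what happened in the
reported oops is an assumption, not a diagnosis.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace toy model; every model_* name is hypothetical, not a kernel API. */

struct model_work {
        bool pending;                           /* queued, not yet run */
        void (*fn)(struct model_work *work);    /* handler to invoke */
};

struct model_port {
        struct model_work work;
        bool destroyed;                 /* set once teardown has happened */
        int runs_after_destroy;         /* handler runs seen after teardown */
};

struct model_result {
        bool cancel_ret;                /* what the cancel call reported */
        int runs_after_destroy;
};

#define model_container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

/* Queue the work unless it is already pending. */
static bool model_queue_work(struct model_work *work)
{
        if (work->pending)
                return false;
        work->pending = true;
        return true;
}

/*
 * Clear a pending work item and report whether one was pending.
 * There is no concurrency in this model, so nothing has to be waited for.
 */
static bool model_cancel_work(struct model_work *work)
{
        bool was_pending = work->pending;

        work->pending = false;
        return was_pending;
}

/* Stand-in for the worker thread: run the handler if it is pending. */
static void model_run_pending(struct model_work *work)
{
        if (!work->pending)
                return;
        work->pending = false;
        work->fn(work);
}

/* Stand-in for the flush handler: count runs that happen after teardown. */
static void model_flush(struct model_work *work)
{
        struct model_port *port =
                model_container_of(work, struct model_port, work);

        if (port->destroyed)
                port->runs_after_destroy++;
}

/* Normal run, cancel, teardown, then optionally a late re-queue. */
struct model_result model_teardown(bool late_queue)
{
        struct model_port port = {
                .work = { .pending = false, .fn = model_flush },
                .destroyed = false,
                .runs_after_destroy = 0,
        };
        struct model_result res;

        model_queue_work(&port.work);
        model_run_pending(&port.work);          /* ordinary flush */

        res.cancel_ret = model_cancel_work(&port.work); /* nothing pending */
        port.destroyed = true;

        if (late_queue)
                model_queue_work(&port.work);   /* producer still active */
        model_run_pending(&port.work);

        res.runs_after_destroy = port.runs_after_destroy;
        return res;
}
```

In this model, model_teardown(false) reports no handler run after
teardown, while model_teardown(true) reports one, and the cancel result
is false in both cases.  On the real kernel, a dump_stack() call next to
the pr_emerg() in flush_to_ldisc() is one way to see whether the late
call arrived through process_one_work() or through a direct call.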