Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp1520562ybl; Tue, 13 Aug 2019 14:11:49 -0700 (PDT) X-Google-Smtp-Source: APXvYqzSwcD+CeWR4G9pDhe6GXn9m2J0hLLBNOnFlD8vyeMAHwa9dzyz4Bf030cXuuKlouavKUQm X-Received: by 2002:a63:e20a:: with SMTP id q10mr35316188pgh.24.1565730709572; Tue, 13 Aug 2019 14:11:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1565730709; cv=none; d=google.com; s=arc-20160816; b=O2Vpd/UW8uISJ43HGegspO8UpPvOfgfl84a+eHj4nyOHezn2YN2GA7qgqTS8Aj4lJl MTYbe5UIPj3gWLB6sKitJ82/udj+eBWlQjacgx/WMBuRUoDSmRz9kCqqjRmH2oDmSyuA c6/JOD0WO/oIexKAzeBYDPVQWgOlclK6RWdFXdvJmtQqppKzkq9AXczm/ATFFHTJZ1qQ UgqPcUavXF5hiRuSZLJQYsawyFvtTTQ6LtRGKZ7GFzMCKAW/x4U8GTJqBOqAObJcZdPl htjzvNLBEsB1NvzuBeUcVNL96WWiuzKuce78Qy7KXm5apiUQzFDRsTL52EvDRMiRnQd8 h1Tg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:references:cc:to:from:subject:dkim-signature; bh=GSZgbf3HuPjyFe9/pIr+g9FiHr/A7muyNamY/S+NyE0=; b=0WteRrcgeoE1c4goHht2kqfr/3dsTX+megInYNUz/OAksiUEnhvJxS1iMDcwcWKoOD IVd2QTek8C8/4EDn3dTqYBwt5vEb4fhmZTue2lsSVSV/sctnKEaVY8JeyvGg/zNUjytY at6NPbUThiUKzJwk06xSjnZRMPLMgl/eNMHwo/7JjQih4Ib/CVSCvDdToUPiLGGeJMpV ikb14isuR4bz8tCojfCplc8ieGpXKKLEHT7npSoHtqw0q8k0X9m2v/3cJ1gxahH3VEde KbmRaEmVGXA/OEhvXEPHtsbLVyKZ3JAtnvo2jN1P6Wp9rbFpBeP3bVwvk15bxFfFJ4Ej R6Uw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@geanix.com header.s=first header.b=dY3N7FwQ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=geanix.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h25si67251993pfn.56.2019.08.13.14.11.32; Tue, 13 Aug 2019 14:11:49 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@geanix.com header.s=first header.b=dY3N7FwQ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=geanix.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726585AbfHMVKt (ORCPT + 99 others); Tue, 13 Aug 2019 17:10:49 -0400 Received: from first.geanix.com ([116.203.34.67]:53280 "EHLO first.geanix.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725923AbfHMVKt (ORCPT ); Tue, 13 Aug 2019 17:10:49 -0400 Received: from [192.168.8.20] (unknown [85.184.140.241]) by first.geanix.com (Postfix) with ESMTPSA id 85D234B441; Tue, 13 Aug 2019 21:09:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=geanix.com; s=first; t=1565730578; bh=HnRrkMnK2x5biGgdnBcO7bDH82LVKWeShQafsKrCa4g=; h=Subject:From:To:Cc:References:Date:In-Reply-To; b=dY3N7FwQTee1xMDY+q1fZ2CG38Sn4Agm+s9vQmv/0bGtxEsgAA7AM67QUxURna11R NFLVnt1lCyaweKFLTmnVtpIL27j+hL7Y6/2mzx3+fdQ7EBQGFnpKThYZt7Hv8l1V0a eTx1Z3aHEMsIBkbYMpVSNSnZ61hmQO5piKoJuYNTBqGSLYuS2Ct7rr0X+w/cxb9wlk dwEeOrKYLehlUplyByLxB/QJjRSLo1jGQ3K2ISmPTjLbX032/aRisIQDY4/eDws98K dU3ED80VktW+/RY+/3Gqp7QYtu6BrkTf4u92+Jf3lw0mgUpKzv3h7rmQuGw0C4Z+FF 8eINEf/JkVGSQ== Subject: Re: [BUG] n_gsm: possible recursive locking detected From: =?UTF-8?Q?Martin_Hundeb=c3=b8ll?= To: Greg Kroah-Hartman Cc: linux-kernel , Jiri Slaby , Dirkjan Bussink References: <20190725112651.GA12309@kroah.com> <79b38084-8867-0972-ca6c-155e5dbb0d90@geanix.com> Message-ID: Date: Tue, 13 Aug 2019 23:10:17 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <79b38084-8867-0972-ca6c-155e5dbb0d90@geanix.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US-large Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.1 required=4.0 tests=ALL_TRUSTED,BAYES_00, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,URIBL_BLOCKED autolearn=disabled version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on 1ffa6606a633 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 12/08/2019 22.59, Martin Hundebøll wrote: > > > On 25/07/2019 13.26, Greg Kroah-Hartman wrote: >> On Wed, Jul 17, 2019 at 11:40:02AM +0200, Martin Hundebøll wrote: >>> Hi, >>> >>> The GSM0710 line discipline driver triggers a lockdep warning when >>> disabling >>> the ldisc while holding a multiplexed virtual tty open: >>> >>> ============================================ >>> WARNING: possible recursive locking detected >>> 5.2.0-00114-gdab52e30156b #6 Not tainted >>> -------------------------------------------- >>> cmux/263 is trying to acquire lock: >>> e1e23b18 (&tty->legacy_mutex){+.+.}, at: __tty_hangup.part.0+0x58/0x27c >>> >>> but task is already holding lock: >>> d6eddf48 (&tty->legacy_mutex){+.+.}, at: tty_set_ldisc+0x3c/0x1bc >>> >>> other info that might help us debug this: >>>   Possible unsafe locking scenario: >>> >>>         CPU0 >>>         ---- >>>    lock(&tty->legacy_mutex); >>>    lock(&tty->legacy_mutex); >>> >>>   *** DEADLOCK *** >>> >>>   May be due to missing lock nesting notation >>> >>> 3 locks held by cmux/263: >>>   #0: d6eddf48 (&tty->legacy_mutex){+.+.}, at: tty_set_ldisc+0x3c/0x1bc >>>   #1: f28bead9 (&tty->ldisc_sem){++++}, at: tty_ldisc_lock+0x50/0x74 >>>   #2: e5d20e4f (&gsm->mutex){+.+.}, at: gsm_cleanup_mux+0x9c/0x15c >>> >>> stack backtrace: >>> CPU: 0 PID: 263 Comm: cmux Not tainted 5.2.0-00114-gdab52e30156b #6 >>> Hardware name: Freescale i.MX6 Ultralite (Device Tree) >>> [] (unwind_backtrace) from [] (show_stack+0x10/0x14) >>> [] (show_stack) from [] (dump_stack+0xd4/0x108) >>> [] (dump_stack) from [] >>> (__lock_acquire+0x6ec/0x1e84) >>> [] (__lock_acquire) from [] >>> (lock_acquire+0xcc/0x204) >>> [] (lock_acquire) from [] (__mutex_lock+0x64/0x90c) >>> [] (__mutex_lock) from [] >>> (mutex_lock_nested+0x1c/0x24) >>> [] (mutex_lock_nested) from [] >>> (__tty_hangup.part.0+0x58/0x27c) >>> [] (__tty_hangup.part.0) from [] >>> (gsm_cleanup_mux+0xe8/0x15c) >>> [] (gsm_cleanup_mux) from [] (gsmld_close+0x48/0x90) >>> [] (gsmld_close) from [] (tty_set_ldisc+0xb8/0x1bc) >>> [] (tty_set_ldisc) from [] (tty_ioctl+0x640/0xcb0) >>> [] (tty_ioctl) from [] (do_vfs_ioctl+0x41c/0xa5c) >>> [] (do_vfs_ioctl) from [] (ksys_ioctl+0x34/0x60) >>> [] (ksys_ioctl) from [] (ret_fast_syscall+0x0/0x28) >>> Exception stack(0xc8ce1fa8 to 0xc8ce1ff0) >>> 1fa0:                   00438000 00000000 00000003 00005423 beb6cc04 >>> beb6cc04 >>> 1fc0: 00438000 00000000 00000000 00000036 00000000 00000000 00438000 >>> beb6ccd4 >>> 1fe0: 00438048 beb6cbfc 00427684 b6f58b88 >>> >>> >>> Steps to reproduce using the attached cmux util: >>> >>> root@iwg26:~# ./cmux & >>> [1] 254 >>> SERIAL_PORT = /dev/ttymxc0 >>> AT+IFC=2: Ie5   +CFUN: 1    +CPIN: READY    Call Ready  AT+IFC=2,2   OK >>> AT+GMM  : AT+GMM   Quectel_M95    OK >>> AT      : AT   OK >>> AT+IPR=1: AT+IPR=115200&w   OK >>> AT+CMUX=: AT+CMUX=0,0,5,512,10,3,30,10,2   OK >>> Line dicipline set >>> >>> root@iwg26:~# cat /dev/gsmtty1 & >>> [2] 262 >>> root@iwg26:~# kill %1 >>> [   74.517449] ============================================ >>> [   74.522769] WARNING: possible recursive locking detected >>> [   74.528094] 5.2.0-00114-gdab52e30156b #6 Not tainted >>> [   74.533065] -------------------------------------------- >>> <...> >>> >>> >>> This has supposedly been fixed before in 4d9b109060f6 ("tty: Prevent >>> deadlock in n_gsm driver"), but the fix was undone in be7065725590 >>> ("TTY/n_gsm: Removing the wrong tty_unlock/lock() in >>> gsm_dlci_release()") >> >> Do you have a patch that can resolve this given you have a test case? > > No, sorry. > > I can try to cook a patch, but chances are I will break locking for > someone else. Hints are welcome. The recursive locking happens when restoring the initial line discipline: ioctl(serial_fd, TIOCSETD, &ldisc_save); If I remove the restore, and let close() do the ldisc tear down I get a circular lockdep warning: [ 56.254258] ====================================================== [ 56.260447] WARNING: possible circular locking dependency detected [ 56.266641] 5.2.0-00118-g1fd58e20e5b0 #30 Not tainted [ 56.271701] ------------------------------------------------------ [ 56.277890] cmux/271 is trying to acquire lock: [ 56.282436] 8215283a (&tty->legacy_mutex){+.+.}, at: __tty_hangup.part.0+0x58/0x27c [ 56.290128] [ 56.290128] but task is already holding lock: [ 56.295970] e9e2b842 (&gsm->mutex){+.+.}, at: gsm_cleanup_mux+0x9c/0x15c [ 56.302699] [ 56.302699] which lock already depends on the new lock. [ 56.302699] [ 56.310884] [ 56.310884] the existing dependency chain (in reverse order) is: [ 56.318372] [ 56.318372] -> #2 (&gsm->mutex){+.+.}: [ 56.323624] mutex_lock_nested+0x1c/0x24 [ 56.328079] gsm_cleanup_mux+0x9c/0x15c [ 56.332448] gsmld_ioctl+0x418/0x4e8 [ 56.336554] tty_ioctl+0x96c/0xcb0 [ 56.340492] do_vfs_ioctl+0x41c/0xa5c [ 56.344685] ksys_ioctl+0x34/0x60 [ 56.348535] ret_fast_syscall+0x0/0x28 [ 56.352815] 0xbe97cc04 [ 56.355791] [ 56.355791] -> #1 (&tty->ldisc_sem){++++}: [ 56.361388] tty_ldisc_lock+0x50/0x74 [ 56.365581] tty_init_dev+0x88/0x1c4 [ 56.369687] tty_open+0x1c8/0x430 [ 56.373536] chrdev_open+0xa8/0x19c [ 56.377560] do_dentry_open+0x118/0x3c4 [ 56.381928] path_openat+0x2fc/0x1190 [ 56.386123] do_filp_open+0x68/0xd4 [ 56.390146] do_sys_open+0x164/0x220 [ 56.394257] kernel_init_freeable+0x328/0x3e4 [ 56.399146] kernel_init+0x8/0x110 [ 56.403078] ret_from_fork+0x14/0x20 [ 56.407183] 0x0 [ 56.409548] [ 56.409548] -> #0 (&tty->legacy_mutex){+.+.}: [ 56.415402] __mutex_lock+0x64/0x90c [ 56.419508] mutex_lock_nested+0x1c/0x24 [ 56.423961] __tty_hangup.part.0+0x58/0x27c [ 56.428676] gsm_cleanup_mux+0xe8/0x15c [ 56.433043] gsmld_close+0x48/0x90 [ 56.436979] tty_ldisc_kill+0x2c/0x6c [ 56.441173] tty_ldisc_release+0x88/0x194 [ 56.445715] tty_release_struct+0x14/0x44 [ 56.450254] tty_release+0x36c/0x43c [ 56.454365] __fput+0x94/0x1e8 Doing the hangup asynchronously fixes both cases for me (tested on 5.2): diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index d30525946892..36a3eb4ad4c5 100644 --- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -1716,7 +1716,7 @@ static void gsm_dlci_release(struct gsm_dlci *dlci) gsm_destroy_network(dlci); mutex_unlock(&dlci->mutex); - tty_vhangup(tty); + tty_hangup(tty); tty_port_tty_set(&dlci->port, NULL); tty_kref_put(tty); What am I missing ? // Martin