Date: Thu, 7 Mar 2019 18:34:03 -0800
From: Brian Norris
To: Ganapathi Bhat
Cc: linux-kernel@vger.kernel.org, Amitkumar Karwar, Nishant Sarmukadam, Ganapathi Bhat, Xinming Hu, linux-wireless@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [4.20 PATCH] Revert "mwifiex: restructure rx_reorder_tbl_lock usage"
Message-ID: <20190308023401.GA121759@google.com>
References: <20181130175957.167031-1-briannorris@chromium.org>
In-Reply-To: <20181130175957.167031-1-briannorris@chromium.org>

Hi again Ganapathi,

By the way, I was a little curious about what went wrong here, so I dug
in a little further:

On Fri, Nov 30, 2018 at 09:59:57AM -0800, Brian Norris wrote:
> This reverts commit 5188d5453bc9380ccd4ae1086138dd485d13aef2, because it
> introduced lock recursion:
>
> BUG: spinlock recursion on CPU#2, kworker/u13:1/395
> lock: 0xffffffc0e28a47f0, .magic: dead4ead, .owner: kworker/u13:1/395, .owner_cpu: 2
> CPU: 2 PID: 395 Comm: kworker/u13:1 Not tainted 4.20.0-rc4+ #2
> Hardware name: Google Kevin (DT)
> Workqueue: MWIFIEX_RX_WORK_QUEUE mwifiex_rx_work_queue [mwifiex]
> Call trace:
> dump_backtrace+0x0/0x140
> show_stack+0x20/0x28
> dump_stack+0x84/0xa4
> spin_bug+0x98/0xa4
> do_raw_spin_lock+0x5c/0xdc
> _raw_spin_lock_irqsave+0x38/0x48
> mwifiex_flush_data+0x2c/0xa4 [mwifiex]
> call_timer_fn+0xcc/0x1c4
> run_timer_softirq+0x264/0x4f0
> __do_softirq+0x1a8/0x35c
> do_softirq+0x54/0x64
> netif_rx_ni+0xe8/0x120
> mwifiex_recv_packet+0xfc/0x10c [mwifiex]
> mwifiex_process_rx_packet+0x1d4/0x238 [mwifiex]
> mwifiex_11n_dispatch_pkt+0x190/0x1ac [mwifiex]
> mwifiex_11n_rx_reorder_pkt+0x28c/0x354 [mwifiex]

TL;DR: the problem was right here ^^^
where you started running mwifiex_11n_dispatch_pkt() (via
mwifiex_11n_scan_and_dispatch()) while holding a spinlock.

When you do that, you eventually call netif_rx_ni(), which specifically
defers to softirq contexts. Then, if you happen to have your flush timer
expiring just before that, you end up in mwifiex_flush_data(), which
also needs that spinlock.

There are a few possible ways to handle this:

(a) prevent processing softirqs in that context; e.g., with
    local_bh_disable(). This seems somewhat of a hack.
    (Side note: I think most of the locks in this driver really could be
    spin_lock_bh(), not spin_lock_irqsave() -- we don't really care
    about hardirq context for 99% of these locks.)
(b) restructure so that packet processing (e.g., netif_rx_ni()) is done
    outside of the spinlock.

It's actually not that hard to do (b). You can just queue your skb's up
in a temporary sk_buff_head list and process them all at once after
you've finished processing the reorder table. I have a local patch to do
this, and I might send it your way if I can give it a bit more testing.

Brian

> mwifiex_process_sta_rx_packet+0x204/0x26c [mwifiex]
> mwifiex_handle_rx_packet+0x15c/0x16c [mwifiex]
> mwifiex_rx_work_queue+0x104/0x134 [mwifiex]
> worker_thread+0x4cc/0x72c
> kthread+0x134/0x13c
> ret_from_fork+0x10/0x18
>
> This was clearly not tested well at all. I simply performed 'wget' in a
> loop and it fell over within a few seconds.
>
> Fixes: 5188d5453bc9 ("mwifiex: restructure rx_reorder_tbl_lock usage")
> Cc:
> Cc: Ganapathi Bhat
> Signed-off-by: Brian Norris
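
For illustration, the shape of option (b) above can be sketched in plain
userspace C. This is only an analogy under stated assumptions, not the
mwifiex code and not the local patch mentioned in the reply: the toy lock
merely detects recursive acquisition (it is not a real spinlock), and every
name in it (toy_lock, pkt_list, reorder_tbl, flush_timer, deliver,
dispatch_locked, dispatch_deferred) is hypothetical. The pkt_list type stands
in for a temporary sk_buff_head, and flush_timer() stands in for a timer
callback that needs the same lock as the dispatch path.

```c
#include <assert.h>
#include <stddef.h>

#define WIN_SIZE 8

/* Toy lock (hypothetical): single-threaded, only detects recursion. */
struct toy_lock { int held; };

static int toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return -1;          /* recursive acquisition, as in the BUG splat */
    l->held = 1;
    return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = 0;
}

struct pkt { int seq; struct pkt *next; };
struct pkt_list { struct pkt *head, *tail; };  /* stand-in for sk_buff_head */

struct reorder_tbl {
    struct toy_lock lock;           /* stand-in for the reorder table lock */
    struct pkt *slot[WIN_SIZE];
    int delivered;                  /* packets handed to the "stack" */
    int flushes;                    /* successful flush_timer() runs */
};

static void pkt_list_add_tail(struct pkt_list *l, struct pkt *p)
{
    p->next = NULL;
    if (l->tail)
        l->tail->next = p;
    else
        l->head = p;
    l->tail = p;
}

/* Stand-in for the flush timer: it takes the table lock itself. */
static int flush_timer(struct reorder_tbl *t)
{
    if (toy_lock_acquire(&t->lock))
        return -1;
    t->flushes++;
    toy_lock_release(&t->lock);
    return 0;
}

/* Stand-in for delivery to the stack, which may run the flush timer. */
static int deliver(struct reorder_tbl *t, struct pkt *p)
{
    (void)p;
    t->delivered++;
    return flush_timer(t);
}

/* Problematic shape: deliver while still holding the table lock. */
static int dispatch_locked(struct reorder_tbl *t)
{
    int i, err = 0;

    if (toy_lock_acquire(&t->lock))
        return -1;
    for (i = 0; i < WIN_SIZE; i++) {
        if (!t->slot[i])
            continue;
        if (deliver(t, t->slot[i]))
            err = -1;               /* flush_timer() hit the held lock */
        t->slot[i] = NULL;
    }
    toy_lock_release(&t->lock);
    return err;
}

/* Option (b) shape: collect under the lock, deliver after dropping it. */
static int dispatch_deferred(struct reorder_tbl *t)
{
    struct pkt_list ready = { NULL, NULL };
    struct pkt *p, *next;
    int i, err = 0;

    if (toy_lock_acquire(&t->lock))
        return -1;
    for (i = 0; i < WIN_SIZE; i++) {
        if (!t->slot[i])
            continue;
        pkt_list_add_tail(&ready, t->slot[i]);
        t->slot[i] = NULL;
    }
    toy_lock_release(&t->lock);

    for (p = ready.head; p; p = next) {
        next = p->next;             /* read before delivery consumes p */
        if (deliver(t, p))
            err = -1;
    }
    return err;
}
```

In this toy model, dispatch_locked() reports an error because flush_timer()
finds the lock already held, while dispatch_deferred() does not, since the
lock has been released before anything is delivered. In a real driver the
temporary list would presumably be an sk_buff_head on the stack, and ordering
between concurrent dispatchers would still need separate care.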