Date: Tue, 26 Apr 2022 09:18:28 -0700
Subject: Re: [PATCH] ath10k: skip ath10k_halt during suspend for driver state RESTARTING
To: Abhishek Kumar
CC: Wen Gong, "David S. Miller", Jakub Kicinski, Paolo Abeni
References: <20220425021442.1.I650b809482e1af8d0156ed88b5dc2677a0711d46@changeid>
From: Jeff Johnson
In-Reply-To: <20220425021442.1.I650b809482e1af8d0156ed88b5dc2677a0711d46@changeid>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/24/2022 7:15 PM, Abhishek Kumar wrote:
> Double free crash is observed when FW recovery(caused by wmi
> timeout/crash) is followed by immediate suspend event. The FW recovery
> is triggered by ath10k_core_restart() which calls driver clean up via
> ath10k_halt(). When the suspend event occurs between the FW recovery,
> the restart worker thread is put into frozen state until suspend completes.
> The suspend event triggers ath10k_stop() which again triggers ath10k_halt()
> The double invocation of ath10k_halt() causes ath10k_htt_rx_free() to be
> called twice(Note: ath10k_htt_rx_alloc was not called by restart worker
> thread because of its frozen state), causing the crash.
>
> To fix this, during the suspend flow, skip call to ath10k_halt() in
> ath10k_stop() when the current driver state is ATH10K_STATE_RESTARTING.
> Also, for driver state ATH10K_STATE_RESTARTING, call
> ath10k_wait_for_suspend() in ath10k_stop(). This is because call to
> ath10k_wait_for_suspend() is skipped later in
> [ath10k_halt() > ath10k_core_stop()] for the driver state
> ATH10K_STATE_RESTARTING.
>
> The frozen restart worker thread will be cancelled during resume when the
> device comes out of suspend.
>
> Below is the crash stack for reference:
>
> [ 428.469167] ------------[ cut here ]------------
> [ 428.469180] kernel BUG at mm/slub.c:4150!
> [ 428.469193] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 428.469219] Workqueue: events_unbound async_run_entry_fn
> [ 428.469230] RIP: 0010:kfree+0x319/0x31b
> [ 428.469241] RSP: 0018:ffffa1fac015fc30 EFLAGS: 00010246
> [ 428.469247] RAX: ffffedb10419d108 RBX: ffff8c05262b0000
> [ 428.469252] RDX: ffff8c04a8c07000 RSI: 0000000000000000
> [ 428.469256] RBP: ffffa1fac015fc78 R08: 0000000000000000
> [ 428.469276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 428.469285] Call Trace:
> [ 428.469295] ? dma_free_attrs+0x5f/0x7d
> [ 428.469320] ath10k_core_stop+0x5b/0x6f
> [ 428.469336] ath10k_halt+0x126/0x177
> [ 428.469352] ath10k_stop+0x41/0x7e
> [ 428.469387] drv_stop+0x88/0x10e
> [ 428.469410] __ieee80211_suspend+0x297/0x411
> [ 428.469441] rdev_suspend+0x6e/0xd0
> [ 428.469462] wiphy_suspend+0xb1/0x105
> [ 428.469483] ? name_show+0x2d/0x2d
> [ 428.469490] dpm_run_callback+0x8c/0x126
> [ 428.469511] ? name_show+0x2d/0x2d
> [ 428.469517] __device_suspend+0x2e7/0x41b
> [ 428.469523] async_suspend+0x1f/0x93
> [ 428.469529] async_run_entry_fn+0x3d/0xd1
> [ 428.469535] process_one_work+0x1b1/0x329
> [ 428.469541] worker_thread+0x213/0x372
> [ 428.469547] kthread+0x150/0x15f
> [ 428.469552] ? pr_cont_work+0x58/0x58
> [ 428.469558] ? kthread_blkcg+0x31/0x31
>
> Signed-off-by: Abhishek Kumar
> Co-developed-by: Wen Gong
> Signed-off-by: Wen Gong
> ---
>
>  drivers/net/wireless/ath/ath10k/mac.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
> index d804e19a742a..57ba27c46371 100644
> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -5345,8 +5345,22 @@ static void ath10k_stop(struct ieee80211_hw *hw)
>  
>  	mutex_lock(&ar->conf_mutex);
>  	if (ar->state != ATH10K_STATE_OFF) {
> -		if (!ar->hw_rfkill_on)
> -			ath10k_halt(ar);
> +		if (!ar->hw_rfkill_on) {
> +			/* If the current driver state is RESTARTING but not yet
> +			 * fully RESTARTED because of incoming suspend event,
> +			 * then ath10k_halt is already called via
> +			 * ath10k_core_restart and should not be called here.
> +			 */
> +			if (ar->state != ATH10K_STATE_RESTARTING)
> +				ath10k_halt(ar);
> +			else
> +				/* Suspending here, because when in RESTARTING
> +				 * state, ath10k_core_stop skips
> +				 * ath10k_wait_for_suspend.
> +				 */
> +				ath10k_wait_for_suspend(ar,
> +							WMI_PDEV_SUSPEND_AND_DISABLE_INTR);
> +		}
>  		ar->state = ATH10K_STATE_OFF;
>  	}
>  	mutex_unlock(&ar->conf_mutex);
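
For illustration only, the sequence described in the commit message can be modelled in user space. This is a rough sketch, not driver code: every mock_* name below is a hypothetical stand-in, the hw_rfkill_on check and all locking are left out, and mock_halt() is assumed to free the rx buffers unconditionally, as the crash stack suggests ath10k_halt() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver states; only the names echo ath10k. */
enum mock_state { MOCK_STATE_OFF, MOCK_STATE_ON, MOCK_STATE_RESTARTING };

struct mock_ar {
	enum mock_state state;
	bool suspended;		/* models the effect of ath10k_wait_for_suspend() */
	int rx_free_calls;	/* number of times the rx buffers were freed */
};

/* Models ath10k_halt(): assumed to free the rx buffers on every call. */
static void mock_halt(struct mock_ar *ar)
{
	ar->rx_free_calls++;
}

/* Models the firmware recovery path up to the point where the worker freezes. */
static void mock_core_restart(struct mock_ar *ar)
{
	ar->state = MOCK_STATE_RESTARTING;
	mock_halt(ar);
	/* The worker is frozen here, so the rx buffers are never re-allocated. */
}

/* Models ath10k_stop(); 'guarded' selects the behaviour proposed in the patch. */
static void mock_stop(struct mock_ar *ar, bool guarded)
{
	if (ar->state == MOCK_STATE_OFF)
		return;

	if (!guarded || ar->state != MOCK_STATE_RESTARTING)
		mock_halt(ar);
	else
		ar->suspended = true;	/* halt already ran; only suspend the target */

	ar->state = MOCK_STATE_OFF;
}

/* Runs a restart followed by a suspend and returns the number of rx frees. */
static int mock_suspend_during_restart(bool guarded)
{
	struct mock_ar ar = { .state = MOCK_STATE_ON };

	mock_core_restart(&ar);
	mock_stop(&ar, guarded);
	assert(ar.state == MOCK_STATE_OFF);
	return ar.rx_free_calls;
}
```

In this model, mock_suspend_during_restart(false) reports two frees, which corresponds to the double free in the trace, while mock_suspend_during_restart(true) reports one.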