Subject: Re: ax200, fw crashes, and sdata-in-driver
From: Ben Greear
To: Johannes Berg, "linux-wireless@vger.kernel.org"
References: <024147cd-75ed-bacb-fb7a-e97104886751@candelatech.com>
In-Reply-To: <024147cd-75ed-bacb-fb7a-e97104886751@candelatech.com>
Organization: Candela Technologies
Date: Tue, 22 Sep 2020 13:57:56 -0700
X-Mailing-List: linux-wireless@vger.kernel.org

On 7/30/20 5:58 AM, Ben Greear wrote:
> On 7/30/20 5:30 AM, Johannes Berg wrote:
>> Hi,
>>
>>> I larded up my 5.4 kernel with KASAN and lockdep, and ran some tests. This is with my
>>> patch that keeps from busy-spinning forever (see previous ignored patch).
>>
>> Right, sorry, hadn't gotten to patches in a while.
>>
>>> After a few restarts and FW crashes, the ax200 could not recover firmware. There
>>> were lots of sdata-in-driver errors, and then KASAN hit a use-after-free issue
>>> related to ax200 accessing sta object that was previously deleted.
>>>
>>> Now, I think I know why:
>>>
>>> In the ieee80211_handle_reconfig_failure(struct ieee80211_local *local)
>>> method, it will clear the SDATA_IN_DRIVER flag, and according to comments,
>>> this is run when firmware cannot be recovered. But, just because FW is
>>> dead does not mean that the driver itself has cleaned up its state.
>>>
>>> So question is, should ax200 (and all drivers) be responsible for cleaning
>>> up all state when FW cannot be recovered, or should instead mac80211 do cleanup
>>> in this case by, among other things, not clearing that flag (and probably
>>> not doing the ctx->driver_present = false; config as well)?
>>
>> I think it should be the driver. It's not clear _why_ the driver failed,
>> after all. If the firmware is still alive and just rejected something
>> then perhaps rolling things back will work. But if the firmware just
>> died again, that will just cause even more trouble.
>
> The current code clears state without actually notifying the driver, so it
> is causing mac80211 to be out of sync with the driver. I can't see how that
> is a good idea. This is root cause of the issue that causes the busy-spin
> related to sdata-in-driver / EIO as far as I can tell.

Hello,

As far as I can tell, no work has gone into the driver(s) to resolve the
use-after-free issue.

So, maybe worth considering my earlier patch to clean everything up in mac80211
instead of depending on the drivers to get this correctly cleaned up in all cases?

I'll repost it, freshly rebased against latest linus tree....

Thanks,
Ben
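
Illustration of the out-of-sync problem discussed above, as a minimal sketch only. This is a toy model in plain C, not mac80211 or iwlwifi code: every name in it (`toy_driver`, `toy_iface`, `toy_fail_clear_flag_only`, `toy_fail_with_teardown`, `toy_stale_count`) is hypothetical, and the `in_driver` field merely stands in for the SDATA_IN_DRIVER flag. It assumes the only state that matters is a per-interface flag on the stack side and a count of interfaces on the driver side, and it compares clearing the flag silently with clearing it after an explicit teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver-side state: how many interfaces it still tracks. */
struct toy_driver {
    int ifaces_known;
};

/* Hypothetical stack-side state: stand-in for the SDATA_IN_DRIVER flag. */
struct toy_iface {
    bool in_driver;
};

/* Normal path: the stack hands an interface to the driver and marks it. */
static void toy_add_iface(struct toy_driver *drv, struct toy_iface *ifc)
{
    drv->ifaces_known++;
    ifc->in_driver = true;
}

/* Failure path A: the stack clears its flags without telling the driver. */
static void toy_fail_clear_flag_only(struct toy_iface *ifaces, size_t n)
{
    for (size_t i = 0; i < n; i++)
        ifaces[i].in_driver = false;
}

/* Failure path B: the stack removes each interface from the driver first. */
static void toy_fail_with_teardown(struct toy_driver *drv,
                                   struct toy_iface *ifaces, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ifaces[i].in_driver) {
            drv->ifaces_known--;        /* driver drops its state */
            ifaces[i].in_driver = false;
        }
    }
}

/* Interfaces the driver still tracks but the stack believes are gone. */
static int toy_stale_count(const struct toy_driver *drv,
                           const struct toy_iface *ifaces, size_t n)
{
    int flagged = 0;

    for (size_t i = 0; i < n; i++)
        if (ifaces[i].in_driver)
            flagged++;
    return drv->ifaces_known - flagged;
}

/* Runs both failure paths on two interfaces and checks the outcome. */
static void toy_demo(void)
{
    struct toy_driver drv = { 0 };
    struct toy_iface ifaces[2] = { { false }, { false } };

    toy_add_iface(&drv, &ifaces[0]);
    toy_add_iface(&drv, &ifaces[1]);
    toy_fail_clear_flag_only(ifaces, 2);
    assert(toy_stale_count(&drv, ifaces, 2) == 2);  /* out of sync */

    drv.ifaces_known = 0;                           /* start over */
    toy_add_iface(&drv, &ifaces[0]);
    toy_add_iface(&drv, &ifaces[1]);
    toy_fail_with_teardown(&drv, ifaces, 2);
    assert(toy_stale_count(&drv, ifaces, 2) == 0);  /* in sync */
}
```

In this model, calling `toy_demo()` from a `main()` exercises both paths: path A leaves two interfaces that the driver still tracks while the stack considers them removed, which is the kind of leftover state a later access could trip over; path B leaves none. Whether the real teardown belongs in mac80211 or in each driver is exactly the open question in this thread, and the sketch does not settle it.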