Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp2536395pxj; Mon, 14 Jun 2021 00:39:24 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyCYkUVYntYtp95tHTlSIMwOxiKEnIytNtda9KgveZbtx+6NApc3tdO3g/GESwtNygnUqda X-Received: by 2002:aa7:cb45:: with SMTP id w5mr15669723edt.170.1623656363798; Mon, 14 Jun 2021 00:39:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1623656363; cv=none; d=google.com; s=arc-20160816; b=BkMHIaXsyn5xF8N7TAsTM/8B0FE0qfq/Gg9jFtqg8SJR3VhzFSR6Jp3D7uYEi9c+b7 2DgtUK4aFXZxEJA01KvGT6QuBNGB4acyu1I7z40lDORzXWvGe7OHb9rKDdiVc+T5nAfq WQAGCRvAUOz0SC5/wOWsosQuBFBbOra/F0c3307NprNgm9XVblzwmAj0jFuqxObuMct2 eNyMKLfOpZFbLm6pkxR0WNZjXKlU53AIO0qX7xJ1e9OvCJD9u/ujKyqvo7TV/0znWgyq bo/SIth4g7HvaK/2tRaA0GQedpnWnkU85UYQkAZSaDTlT2QqQjulLwjdt2NRdt/zfDdE WLbg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-language:content-transfer-encoding :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dmarc-filter:sender:dkim-signature; bh=qpgcKRTavkio4Ka0FRWV7DpBkMfRvza49DuvO6fLFdo=; b=erTPz/9OYiMaXjqxFEmDx95ry39piyLeoE47Ngq809JPNakKBrtWjC5cQQf3zJNv3e alWYjRpeSSelgPThfZ0B++zPBXafvQBMX4iivBU7BSQ8cjoI4f1rpjXOMajFSNgscGJX OkoPMgMiCXAB2RFVjLzyTZhI3TU5Y5LwREMbo3HrLReDJcv9bpYqNg/DyrqCEzbCtLE9 49Rv6pzXy63mJqbLFB7kFuJ5SmttqqLMwsZ+HW0lY9kB941JcNeYI5Q4ogXXi1NiTc+G hrsGkOsrqFWbDo7Coc99fn3AxwxPq0jwkIDNUT68QNbJhnRDzoXnACb6XVqPm4WYqgLZ l/qA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@mg.codeaurora.org header.s=smtp header.b=vgChnwY8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id m20si11499354ejl.598.2021.06.14.00.39.00; Mon, 14 Jun 2021 00:39:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@mg.codeaurora.org header.s=smtp header.b=vgChnwY8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232536AbhFNHjY (ORCPT + 99 others); Mon, 14 Jun 2021 03:39:24 -0400 Received: from so254-9.mailgun.net ([198.61.254.9]:46732 "EHLO so254-9.mailgun.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232493AbhFNHjW (ORCPT ); Mon, 14 Jun 2021 03:39:22 -0400 DKIM-Signature: a=rsa-sha256; v=1; c=relaxed/relaxed; d=mg.codeaurora.org; q=dns/txt; s=smtp; t=1623656240; h=Content-Transfer-Encoding: Content-Type: In-Reply-To: MIME-Version: Date: Message-ID: From: References: Cc: To: Subject: Sender; bh=qpgcKRTavkio4Ka0FRWV7DpBkMfRvza49DuvO6fLFdo=; b=vgChnwY8DBClsl5I9TCpx7uHD3QUniWz2JY4TfKHHKCDfymn9MWlBmD6v+y4AMcAib3UWeyg 7FsJ82oDj1swTTKaIuqW8H3nfkKEtpYxR6ymMkmEnZARMHMrwQIyfEB+jvXxgBaYqrqGJZJl w63kC8OW+qdhCRsa+W9zcfDCC7U= X-Mailgun-Sending-Ip: 198.61.254.9 X-Mailgun-Sid: WyI0MWYwYSIsICJsaW51eC1rZXJuZWxAdmdlci5rZXJuZWwub3JnIiwgImJlOWU0YSJd Received: from smtp.codeaurora.org (ec2-35-166-182-171.us-west-2.compute.amazonaws.com [35.166.182.171]) by smtp-out-n04.prod.us-east-1.postgun.com with SMTP id 60c7072fe27c0cc77fb43399 (version=TLS1.2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256); Mon, 14 Jun 2021 07:37:19 GMT Sender: sanm=codeaurora.org@mg.codeaurora.org Received: by smtp.codeaurora.org (Postfix, from userid 1001) id 00702C433F1; Mon, 14 Jun 2021 07:37:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-caf-mail-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.4 required=2.0 tests=ALL_TRUSTED,BAYES_00, NICE_REPLY_A,SPF_FAIL,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from [192.168.0.108] (unknown [124.123.160.213]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: sanm) by smtp.codeaurora.org (Postfix) with ESMTPSA id 7FBB4C43460; Mon, 14 Jun 2021 07:37:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 7FBB4C43460 Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: aws-us-west-2-caf-mail-1.web.codeaurora.org; spf=fail smtp.mailfrom=sanm@codeaurora.org Subject: Re: [BUG] usb: dwc3: Kernel NULL pointer dereference in dwc3_remove() To: Jack Pham , Felipe Balbi Cc: Alexandru Elisei , Greg Kroah-Hartman , p.zabel@pengutronix.de, linux-usb@vger.kernel.org, Linux Kernel Mailing List , arm-mail-list , linux-arm-msm@vger.kernel.org References: <87r1hjcvf6.fsf@kernel.org> <70be179c-d36b-de6f-6efc-2888055b1312@arm.com> <8272121c-ac8a-1565-a047-e3a16dcf13b0@arm.com> <877djbc8xq.fsf@kernel.org> <20210603173632.GA25299@jackp-linux.qualcomm.com> <87mts6avnn.fsf@kernel.org> <20210607180023.GA23045@jackp-linux.qualcomm.com> <87sg1q1129.fsf@kernel.org> <20210610153346.GA26872@jackp-linux.qualcomm.com> From: Sandeep Maheswaram Message-ID: Date: Mon, 14 Jun 2021 13:07:11 +0530 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: <20210610153346.GA26872@jackp-linux.qualcomm.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 6/10/2021 9:03 PM, Jack Pham wrote: > On Thu, Jun 10, 2021 at 01:11:42PM +0300, Felipe Balbi wrote: >> Jack Pham writes: >>> On Fri, Jun 04, 2021 at 11:20:12AM +0300, Felipe Balbi wrote: >>>> Jack Pham writes: >>>>>>>>>> Alexandru Elisei writes: >>>>>>>>>>> I've been able to bisect the panic and the offending commit is 568262bf5492 ("usb: >>>>>>>>>>> dwc3: core: Add shutdown callback for dwc3"). I can provide more diagnostic >>>>>>>>>>> information if needed and I can help test the fix. >>>>>>>>>> if you simply revert that commit in HEAD, does the problem really go >>>>>>>>>> away? >>>>>>>>> Kernel built from commit 324c92e5e0ee, which is the kernel tip today, the panic is >>>>>>>>> there. Reverting the offending commit, 568262bf5492, makes the panic disappear. >>>>>>>> Want to send a revert so I can take it now? >>>>>>> I can send a revert, but Felipe was asking Sandeep (the commit author) for a fix, >>>>>>> so I'll leave it up to Felipe to decide how to proceed. >>>>>> I'm okay with a revert. Feel free to add my Acked-by: Felipe Balbi >>>>>> or it. >>>>>> >>>>>> Sandeep, please send a new version that doesn't encounter the same >>>>>> issue. Make sure to test by reloading the driver in a tight loop for >>>>>> several iterations. >>>>> This would probably be tricky to test on other "glue" drivers as the >>>>> problem appears to be specific only to dwc3_of_simple. It looks like >>>>> both dwc3_of_simple and the dwc3 core now (due to 568262bf5492) each >>>>> implement respective .shutdown callbacks. The latter is simply a wrapper >>>>> around dwc3_remove(). And from the panic call stack above we see that >>>>> dwc3_of_simple_shutdown() calls of_platform_depopulate() which will >>>>> again call dwc3_remove() resulting in the double remove. >>>>> >>>>> So would an alternative approach be to protect against dwc3_remove() >>>>> getting called multiple times? IMO it'd be a bit messy to have to add >>>> no, I don't think so. That sounds like a workaround. We should be able >>>> to guarantee that ->remove() doesn't get called twice using the driver >>>> model properly. >>> Completely fair. So then having a .shutdown callback that directly calls >>> dwc3_remove() is probably not the right thing to do as it completely >>> bypasses the driver model so if and when the driver core does later >>> release the device from the driver that's how we end up with the double >>> remove. >> yeah, I would agree with that. >> >>>>> additional checks there to know if it had already been called. So maybe >>>>> avoid it altogether--should dwc3_of_simple_shutdown() just skip calling >>>>> of_platform_depopulate()? >>>> I don't know what the idiomatic is nowadays, but at least early on, we >>>> had to call depopulate. >>> So any suggestions on how to fix the original issue Sandeep was trying >>> to fix with 568262bf5492? Maybe implement .shutdown in dwc3_qcom and have >>> it follow what dwc3_of_simple does with of_platform_depopulate()? But >>> then wouldn't other "glues" want/need to follow suit? >> I think we can implement shutdown in core, but we need to careful with >> it. Instead of just blindly calling remove, let's extract the common >> parts to another internal function that both remove and shutdown >> call. debugfs removal should not be part of that generic method :-) > Hi Sandeep, > > Upon re-reading your description in 568262bf5492 it sounds like the > original intention of your patch is basically to quiesce the HW so that > it doesn't continue to run after SMMU/IOMMU is disabled right? > > If that is the case, couldn't we simply call only dwc3_core_exit_mode() > assuming there is no other requirement to do any other cleanup/teardown > (PHYs, clocks, resets, runtime PM, et al)? This function should do the > bare minimum of stopping the controller in whatever mode (host or > peripheral) it is currently operating in. Yes that was the intention. I will call only dwc3_core_exit_mode() and check. Is there any way we can do from dwc3 qcom glue driver to avoid problems for other glue drivers? >> Anything in that generic method should, probably, be idempotent. > Yes we'll need to ensure that dwc3_core_exit_mode() can be called > multiple times without additional side effects. At first glance this > probably means setting dwc->xhci and dwc->gadget to NULL from > dwc3_host_exit() and dwc3_gadget_exit(), respectively. Ok. Is there any way to test this ? > > Thanks, > Jack