Message-ID: <80ef7a87-6adf-27bc-43ae-05ae5680e418@arm.com>
Date: Thu, 17 Aug 2023 12:43:55 +0530
Subject: Re: [PATCH v2 1/2] coresight: trbe: Fix TRBE potential sleep in atomic context
To: Suzuki K Poulose, hejunhao3@huawei.com
Cc: coresight@lists.linaro.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, jonathan.cameron@huawei.com,
    mike.leach@linaro.org, linuxarm@huawei.com, yangyicong@huawei.com,
    prime.zeng@hisilicon.com
References: <20230814093813.19152-1-hejunhao3@huawei.com> <20230816141008.535450-1-suzuki.poulose@arm.com>
From: Anshuman Khandual
In-Reply-To: <20230816141008.535450-1-suzuki.poulose@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Junhao,

On 8/16/23 19:40, Suzuki K Poulose wrote:
> From: Junhao He
>
> smp_call_function_single() will allocate an IPI interrupt vector to
> the target processor and send a function call request to the interrupt
> vector. After the target processor receives the IPI interrupt, it will
> execute arm_trbe_remove_coresight_cpu() call request in the interrupt
> handler.
>
> According to the device_unregister() stack information, if other process
> is using the device, the down_write() may sleep, and trigger deadlocks
> or unexpected errors.
>
>   arm_trbe_remove_coresight_cpu
>     coresight_unregister
>       device_unregister
>         device_del
>           kobject_del
>             __kobject_del
>               sysfs_remove_dir
>                 kernfs_remove
>                   down_write ---------> it may sleep

But how did you actually detect this problem? Does it show up as a
warning when lockdep debugging is enabled, or did it really happen
during a real workload followed by a TRBE module unload? The problem
seems plausible (and needs fixing); I am just wondering how it was
triggered.

>
> Add a helper arm_trbe_disable_cpu() to disable TRBE percpu irq and reset
> per TRBE.
> Simply call arm_trbe_remove_coresight_cpu() directly without using the
> smp_call_function_single(), which is the same as registering the TRBE
> coresight device.
>
> Fixes: 3fbf7f011f24 ("coresight: sink: Add TRBE driver")
> Signed-off-by: Junhao He
> Link: https://lore.kernel.org/r/20230814093813.19152-2-hejunhao3@huawei.com
> [ Remove duplicate cpumask checks during removal ]
> Signed-off-by: Suzuki K Poulose
> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 33 +++++++++++---------
>  1 file changed, 18 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 7720619909d6..025f70adee47 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -1225,6 +1225,17 @@ static void arm_trbe_enable_cpu(void *info)
>  	enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
>  }
>
> +static void arm_trbe_disable_cpu(void *info)
> +{
> +	struct trbe_drvdata *drvdata = info;
> +	struct trbe_cpudata *cpudata = this_cpu_ptr(drvdata->cpudata);
> +
> +	disable_percpu_irq(drvdata->irq);
> +	trbe_reset_local(cpudata);
> +	cpudata->drvdata = NULL;
> +}
> +
> +
>  static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>  {
>  	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> @@ -1326,18 +1337,12 @@ static void arm_trbe_probe_cpu(void *info)
>  	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>  }
>
> -static void arm_trbe_remove_coresight_cpu(void *info)
> +static void arm_trbe_remove_coresight_cpu(struct trbe_drvdata *drvdata, int cpu)
>  {
> -	int cpu = smp_processor_id();
> -	struct trbe_drvdata *drvdata = info;
> -	struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
>  	struct coresight_device *trbe_csdev = coresight_get_percpu_sink(cpu);
>
> -	disable_percpu_irq(drvdata->irq);
> -	trbe_reset_local(cpudata);
>  	if (trbe_csdev) {
>  		coresight_unregister(trbe_csdev);
> -		cpudata->drvdata = NULL;
>  		coresight_set_percpu_sink(cpu, NULL);
>  	}
>  }
> @@ -1366,8 +1371,10 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
>  {
>  	int cpu;
>
> -	for_each_cpu(cpu, &drvdata->supported_cpus)
> -		smp_call_function_single(cpu, arm_trbe_remove_coresight_cpu, drvdata, 1);
> +	for_each_cpu(cpu, &drvdata->supported_cpus) {
> +		smp_call_function_single(cpu, arm_trbe_disable_cpu, drvdata, 1);
> +		arm_trbe_remove_coresight_cpu(drvdata, cpu);
> +	}
>  	free_percpu(drvdata->cpudata);
>  	return 0;
>  }
> @@ -1406,12 +1413,8 @@ static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node)
>  {
>  	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
>
> -	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> -		struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
> -
> -		disable_percpu_irq(drvdata->irq);
> -		trbe_reset_local(cpudata);
> -	}
> +	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> +		arm_trbe_disable_cpu(drvdata);

This hunk seems unrelated to the fix, other than finding another user
for arm_trbe_disable_cpu(). The problem is that arm_trbe_disable_cpu()
resets cpudata->drvdata, which might not get re-initialized in
arm_trbe_cpu_startup(), because a per-CPU sink will still be associated
with the CPU, as coresight_get_percpu_sink() will confirm. It might be
better to drop this change and keep everything limited to reworking the
SMP IPI callback in arm_trbe_remove_coresight().

>  	return 0;
>  }
>
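
To make the concern about the teardown hunk concrete, here is a small
user-space model of one CPU going offline and coming back online. It is
only a sketch, not driver code: every struct and function name below is
a hypothetical stand-in, and it assumes (as described above) that the
startup path re-probes a CPU only when no per-CPU sink is registered and
otherwise just re-enables the interrupt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; not the kernel definitions. */
struct trbe_drvdata_model {
	int irq;
};

struct trbe_cpudata_model {
	struct trbe_drvdata_model *drvdata;	/* back-pointer cleared by the new helper */
	bool has_percpu_sink;			/* models coresight_get_percpu_sink(cpu) != NULL */
	bool irq_enabled;
};

/* Models probe + register: assumed to be the only place the back-pointer is set. */
static void model_probe_and_register(struct trbe_cpudata_model *c,
				     struct trbe_drvdata_model *d)
{
	c->drvdata = d;
	c->has_percpu_sink = true;
	c->irq_enabled = true;
}

/* Models arm_trbe_disable_cpu() from the patch: disable the IRQ, clear drvdata. */
static void model_disable_cpu(struct trbe_cpudata_model *c)
{
	c->irq_enabled = false;
	c->drvdata = NULL;
}

/* Models hotplug startup under the stated assumption: no re-probe if a sink exists. */
static void model_cpu_startup(struct trbe_cpudata_model *c,
			      struct trbe_drvdata_model *d)
{
	if (!c->has_percpu_sink)
		model_probe_and_register(c, d);
	else
		c->irq_enabled = true;
}

/* Offline then online one CPU; returns true if the back-pointer survived. */
bool model_hotplug_cycle_keeps_drvdata(void)
{
	struct trbe_drvdata_model drv = { .irq = 1 };
	struct trbe_cpudata_model cpu = { 0 };

	model_probe_and_register(&cpu, &drv);
	model_disable_cpu(&cpu);	/* teardown with the patch applied */
	model_cpu_startup(&cpu, &drv);	/* sink still registered, so no re-probe */

	assert(cpu.irq_enabled);	/* the IRQ does come back */
	return cpu.drvdata == &drv;	/* but the back-pointer does not */
}
```

Under these assumptions model_hotplug_cycle_keeps_drvdata() returns
false: the interrupt is enabled again while the back-pointer stays NULL.
That is the state the teardown hunk could leave behind.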