From: Oliver Hartkopp
To: Marc Kleine-Budde
Cc: "Ziyang Xuan (William)", davem@davemloft.net, kuba@kernel.org, linux-can@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net] can: isotp: isotp_rcv_cf(): fix so->rx race problem
Date: Wed, 9 Feb 2022 08:54:09 +0100
In-Reply-To: <20220207081123.sdmczptqffwr64al@pengutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Marc,

On 07.02.22 09:11, Marc Kleine-Budde wrote:
> On 28.01.2022 15:48:05, Oliver Hartkopp wrote:
>> Hello Marc, hello William,
>>
>> On 28.01.22 09:46, Marc Kleine-Budde wrote:
>>> On 28.01.2022 09:32:40, Oliver Hartkopp wrote:
>>>>
>>>>
>>>> On 28.01.22 09:07, Marc Kleine-Budde wrote:
>>>>> On 28.01.2022 08:56:19, Oliver Hartkopp wrote:
>>>>>> I've seen the frame processing sometimes freezes for one second when
>>>>>> stressing the isotp_rcv() from multiple sources. This finally freezes
>>>>>> the entire softirq which is either not good and not needed as we only
>>>>>> need to fix this race for stress tests - and not for real world usage
>>>>>> that does not create this case.
>>>>>
>>>>> Hmmm, this doesn't sound good. Can you test with LOCKDEP enabled?
>>
>>
>>>> #
>>>> # Lock Debugging (spinlocks, mutexes, etc...)
>>>> #
>>>> CONFIG_LOCK_DEBUGGING_SUPPORT=y
>>>> # CONFIG_PROVE_LOCKING is not set
>>> CONFIG_PROVE_LOCKING=y
>>
>> Now enabled even more locking (seen relevant kernel config at the end).
>>
>> It turns out that there is no visible difference when using spin_lock() or
>> spin_trylock().
>>
>> I only got some of these kernel log entries
>>
>> Jan 28 11:13:14 silver kernel: [ 2396.323211] perf: interrupt took too long
>> (2549 > 2500), lowering kernel.perf_event_max_sample_rate to 78250
>> Jan 28 11:25:49 silver kernel: [ 3151.172773] perf: interrupt took too long
>> (3188 > 3186), lowering kernel.perf_event_max_sample_rate to 62500
>> Jan 28 11:45:24 silver kernel: [ 4325.583328] perf: interrupt took too long
>> (4009 > 3985), lowering kernel.perf_event_max_sample_rate to 49750
>> Jan 28 12:15:46 silver kernel: [ 6148.238246] perf: interrupt took too long
>> (5021 > 5011), lowering kernel.perf_event_max_sample_rate to 39750
>> Jan 28 13:01:45 silver kernel: [ 8907.303715] perf: interrupt took too long
>> (6285 > 6276), lowering kernel.perf_event_max_sample_rate to 31750
>>
>> But I get these sporadically anyway. No other LOCKDEP splat.
>>
>> At least the issue reported by William should be fixed now - but I'm still
>> unclear whether spin_lock() or spin_trylock() is the best approach here in
>> the NET_RX softirq?!?
>
> With the !spin_trylock() -> return you are saying if something
> concurrent happens, drop it. This doesn't sound correct.

Yes, I had the same feeling and did some extensive load tests using both
variants.

It turned out that the standard spin_lock() works very well to fix the issue.

Thanks for taking it for upstream here:
https://lore.kernel.org/linux-can/20220209074818.3ylfz4zmuhit7orc@pengutronix.de/T/#t

Best regards,
Oliver