Received: by 2002:ab2:1149:0:b0:1f3:1f8c:d0c6 with SMTP id z9csp2989417lqz; Wed, 3 Apr 2024 14:48:17 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCXaoBXLdUCb1cBQhHQI0/gmBkpLLmdPdpuMU5Anu3l84tNH8K1b0zoFES85JY5fuU0VIW8ZEzxPHZ5bIZoimUjpjF32uc7i8cjoC+wUaw== X-Google-Smtp-Source: AGHT+IHiNCmbD4bdYU2kJs8uz9+dpYqnQtuGhgkySA+zuR2HqcO9Jr7UQem5ZtdakNahtbF7xzFp X-Received: by 2002:a17:906:13d7:b0:a4e:1f41:6f76 with SMTP id g23-20020a17090613d700b00a4e1f416f76mr297617ejc.38.1712180897612; Wed, 03 Apr 2024 14:48:17 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1712180897; cv=pass; d=google.com; s=arc-20160816; b=KLbBPnUoi5NdmQYp3fLkoThNTVMN0aYBVaniybixqYs6wEku0+L6DIYuzHlLN6O5y0 eWiO12okvwk/B/aHXhOMJdzAXfMo59hpM1FG7HU//o13lpzDCVMWKMSTPdrd6Ath1O5I nPe4oQU9Ww6cgp9yT2BFFytGcU8NpiD2PMMwGTjcwfCSaMWsTH1UJyTewDexF3BhYD7x XuxSVN3ZNqrbs2mpVKaWh+6R1o3YOAS6oJnvyYTg/23WPMdxcUXdvY1XGThsohdHUly6 6LfJjUfcTrj0nCgPfy00Opliy4nkIGRtSfznD89Wd4FVbJodBaGlKpASGcgzdlZFDsur dmKg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:list-unsubscribe:list-subscribe:list-id:precedence :user-agent:content-transfer-encoding:references:in-reply-to:date:cc :to:from:subject:message-id:dkim-signature; bh=8Iq1khzAZoET1maVM8hhBQfYggvOuTEROT8y1oovSPA=; fh=njY/r5laSUHOp14NlB+Sdj4fVX8oV2v9i5i+PZ3WeNg=; b=LKinaJXj61pJ9ug8Zj6zg6vZl7rTwW/+I3DjXYl8m0/OGuHCQ9EDxIC6HQhHg6qoNK UWfLgP+IyGbP2CCoQni0qiQlneB/3Lri/8ok6nDzHD+xHPTAfq6tTgKFE3KHozBB9cfZ uDBWVlvrnOFjzt5st7WKF/F1tnKsqMcnuxHeJZP3dlmsVBkmhPIEyAgljiv7izkgU3Jg Of4EyZzZCrbRHv9bpg0+b10Nk/6qlIWBBf0OFu1B31gKvWKmozrLyDkZArqLpBzQqmKe oFVz3zE7o4xievJUCaCJbSO5X/0+b1hxT23UC36QY5xFnr2P8rKX/fZQafOtxRIeKJ3w /Xsw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@bigler.one header.s=key2 header.b="fd/dU32A"; arc=pass (i=1 spf=pass spfdomain=bigler.one dkim=pass dkdomain=bigler.one); spf=pass (google.com: domain of linux-kernel+bounces-130656-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-130656-linux.lists.archive=gmail.com@vger.kernel.org" Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id r1-20020a170906a20100b00a474a649c8csi6979861ejy.689.2024.04.03.14.48.17 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Apr 2024 14:48:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-130656-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@bigler.one header.s=key2 header.b="fd/dU32A"; arc=pass (i=1 spf=pass spfdomain=bigler.one dkim=pass dkdomain=bigler.one); spf=pass (google.com: domain of linux-kernel+bounces-130656-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-130656-linux.lists.archive=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 8556E1F23B07 for ; Wed, 3 Apr 2024 21:47:28 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0CD87156969; Wed, 3 Apr 2024 21:47:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bigler.one header.i=@bigler.one header.b="fd/dU32A" Received: from relay.yourmailgateway.de (relay.yourmailgateway.de [188.68.63.162]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A8A9815688B; Wed, 3 Apr 2024 21:47:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=188.68.63.162 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712180831; cv=none; b=fBQoZRAhDyEfYkxBX2vgtQEloNQXPzoUgOJjO41Pa3iB8dK5ePOLMH98woi18lglS9rnULORKmMrWHlsaxilvyWH+mV/sYD5h5bixLVy4EwPce+b4y4GyMRfaztHO+356MWnc2TPUn7sOPnlD0JQdQ3nMTjo2/Z1ZHH+9tOXch4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712180831; c=relaxed/simple; bh=d8DVzRRWiBOBWQHAvqPLGMyCILuV6QJ2ebbK4tgaVFs=; h=Message-ID:Subject:From:To:Cc:Date:In-Reply-To:References: Content-Type:MIME-Version; b=mlkQfALEv12lB6l59YtCKG6X+n2UH77BEEMx1lXmOf4a47RQf6OOeG1H4ws7NyhzN7kJ1pPPMEAarXfWHUf+JvVxDxCdUl501PQRFk4m6vNijUC/T8Xs8GN0lKM24RlYkalnQ84ZKYRKT9nXDPhYoWNSLkd/tYHkVmyybKl5YLM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=bigler.one; spf=pass smtp.mailfrom=bigler.one; dkim=pass (2048-bit key) header.d=bigler.one header.i=@bigler.one header.b=fd/dU32A; arc=none smtp.client-ip=188.68.63.162 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=bigler.one Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bigler.one Received: from mors-relay-8201.netcup.net (localhost [127.0.0.1]) by mors-relay-8201.netcup.net (Postfix) with ESMTPS id 4V8yph0DXyz3x1P; Wed, 3 Apr 2024 23:40:28 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=bigler.one; s=key2; t=1712180428; bh=d8DVzRRWiBOBWQHAvqPLGMyCILuV6QJ2ebbK4tgaVFs=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=fd/dU32AHFSSJK3tHzN8bIz+NYjrXgQ3VuBzIirf4efeESgj5xCtB27wEjpgqTsGv Kiu6VUMfYmBGEuSXLn8fIbPAMKvKBAuKed2P2Qxb419T3jSh3wmxkfEFEZ+3bMOMuQ j/UudeavNFIwKUJSIVqoO7YJppeAOT4Cs3OUhs8PkyCdlPShEOrt7iB21r9nesJKbq /LE8+yJBZKiuZBGDkm/VADGHfbfABI3ev37vzdk0bb59NHNKLYLA621/3F+ibdADnY PQHdt9jZgSMHYYRNuK8EnJAdVc3ZeiBYA4d9/rLVHzAXj9+Re8aTfyxsOOU9eALi/z BLdAq2L6VBr0g== Received: from policy02-mors.netcup.net (unknown [46.38.225.35]) by mors-relay-8201.netcup.net (Postfix) with ESMTPS id 4V8ypg6b7tz3x0p; Wed, 3 Apr 2024 23:40:27 +0200 (CEST) Received: from mx2fc6.netcup.net (unknown [10.243.12.53]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by policy02-mors.netcup.net (Postfix) with ESMTPS id 4V8ypd1DkHz8sWg; Wed, 3 Apr 2024 23:40:24 +0200 (CEST) Received: from [192.168.1.24] (xdsl-188-155-38-242.adslplus.ch [188.155.38.242]) by mx2fc6.netcup.net (Postfix) with ESMTPSA id C318440555; Wed, 3 Apr 2024 23:40:19 +0200 (CEST) Authentication-Results: mx2fc6; spf=pass (sender IP is 188.155.38.242) smtp.mailfrom=benjamin@bigler.one smtp.helo=[192.168.1.24] Received-SPF: pass (mx2fc6: connection is authenticated) Message-ID: Subject: Re: [PATCH net-next v2 0/9] Add support for OPEN Alliance 10BASE-T1x MACPHY Serial Interface From: Benjamin Bigler To: Parthiban.Veerasooran@microchip.com Cc: netdev@vger.kernel.org, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Horatiu.Vultur@microchip.com, Woojung.Huh@microchip.com, Nicolas.Ferre@microchip.com, UNGLinuxDriver@microchip.com, Thorsten.Kummermehr@microchip.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, robh+dt@kernel.org, krzysztof.kozlowski+dt@linaro.org, conor+dt@kernel.org, corbet@lwn.net, Steen.Hegelund@microchip.com, rdunlap@infradead.org, horms@kernel.org, casper.casan@gmail.com, andrew@lunn.ch Date: Wed, 03 Apr 2024 23:40:19 +0200 In-Reply-To: <0596fce8-223b-494e-907e-f13d75f211cd@microchip.com> References: <20231023154649.45931-1-Parthiban.Veerasooran@microchip.com> <0596fce8-223b-494e-907e-f13d75f211cd@microchip.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable User-Agent: Evolution 3.52.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-PPP-Message-ID: <171218042052.27882.8421913173689188987@mx2fc6.netcup.net> X-Rspamd-Queue-Id: C318440555 X-Rspamd-Server: rspamd-worker-8404 X-NC-CID: +iN5Y7Xep7LirHRdaEEmR9GEdlgjy4blmLL0Fh9Nb2ZIv2A= Hi Parthiban, Sorry for the late answer, I was quite busy the last few days. On Mon, 2024-03-25 at 13:24 +0000, Parthiban.Veerasooran@microchip.com wrot= e: > Hi Benjamin Bigler, >=20 > Thank you for your testing and feedback. It would be really helpful to= =20 > bring the driver to a good shape. We really appreciate your efforts on th= is. >=20 > On 24/03/24 5:25 pm, Benjamin Bigler wrote: > > [Some people who received this message don't often get email from benja= min@bigler.one. Learn why this is important at https://aka.ms/LearnAboutSen= derIdentification ] > >=20 > > EXTERNAL EMAIL: Do not click links or open attachments unless you know = the content is safe > >=20 > > Hi Parthiban > >=20 > > I hope I send this in the right context as it is not related to just on= e patch or > > some specific code. > >=20 > > I conducted UDP load testing using three i.MX8MM boards in conjunction = with the > > LAN8651. The setup involved one board functioning as a server, which is= just > > echoing back received data, while the remaining two boards acted as cli= ents, > > sending UDP packets of different sizes in various bursts to the server. > > Due to hardware constraints, the SPI bus speed was limited to 15 MHz, w= hich might > > have influenced the results. > >=20 > > During the tests I experienced some issues: > >=20 > > - The boards just start receiving after first sending something (ping a= nother board). > > Some measurements showed that the irq stays asserted after init. Thi= s makes sense > > as far as I understand the chapter 7.7 of the specification, the irq= is deasserted > > on reception of the first data header following CSn being asserted. = As a workaround > > I trigger the thread at the end of oa_tc6_init. > It looks like the IRQ is asserted on RESET completion and expects a data > chunk from host to deassert the IRQ. I used to test the driver in RPI 4 > using iperf3. For some reason I never faced this issue, may be when the > network device is being registered there might be some packet=20 > transmission which leads to deliver a data chunk so that the IRQ is > deasserted. Thanks for the workaround. I think that would be the=20 > solution to solve this issue. Adding the below lines in the end of the= =20 > function oa_tc6_init() will trigger the oa_tc6_spi_thread_handler() to= =20 > perform an empty data chunk transfer which will deassert the IRQ before= =20 > starting the actual data transfer. I have ipv6 disabled and use static ipv4 addresses. That could be the reaso= n why on my side no packet is sent. >=20 > /* oa_tc6_sw_reset_macphy() function resets and clears the MAC-PHY reset > * complete status. IRQ is also asserted on reset completion and it is > * remain asserted until MAC-PHY receives a data chunk. So performing an > * empty data chunk transmission will deassert the IRQ. Refer section > * 7.7 and 9.2.8.8 in the OPEN Alliance specification for more details. > */ > tc6->int_flag =3D true; > wake_up_interruptible(&tc6->spi_wq); Perfect, thats the same I added and also works on my side. > >=20 > > - If there is a lot of traffic, the receive buffer overflow error spams= the log. > >=20 > > - If there is a lot of traffic, I got various kernel panics in oa_tc6_u= pdate_rx_skb. > > Mostly because more data to rx_skb is added than allocated and somet= imes because > > rx_skb is null in oa_tc6_update_rx_skb or oa_tc6_prcs_rx_frame_end. = Some debugging > > with a logic analyzer showed that the chip is not behave correctly. = There is more > > bytes between start_valid and end_valid than there should be. Also t= here > > seems to be 2 end_valid without a start_valid between. What is commo= n is that the incorrect > > frame starts in a chunk where end_valid and start_valid is set. > > In my opinion its a problem in the chip (maybe related to the errata= in the next point) > > but the driver should be resilent and just drop the packet and not c= ause a kernel panic. > Usually I run into this issue "receive buffer overflow" when I run RPI 4 > with default cpu governor setting which is "ondemand". In this case,=20 > even though if I set SPI clock speed as 15 MHz the RPI 4 core clock is > clocking down when it is idle which leads delivering half of the > configured SPI clock speed around 5.9 MHz. So the systems like RPI 4=20 > need performance mode enabled to get the proper clock speed for SPI.=20 > Refer below link for more details. >=20 > https://github.com/raspberrypi/linux/issues/3381#issuecomment-1144723750 >=20 > I used to enable performance mode using the below command. >=20 > echo performance | sudo tee=20 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor > /dev/null >=20 > So please ensure the SPI clock speed using a logic analyzer to get the > maximum throughput without receive buffer overflow. >=20 > Of course, I agree that the driver should not crash in case of receive > buffer overflow. By referring your investigations, I understand that the > buffers in the MAC-PHY is being continuously overwritten again and again > as the host is very slow to read the data from the MAC-PHY buffers > through SPI which alters the descriptors. There might be two reasons why > we run into this situation. > 1. The host is busy doing something else and delays to initiate SPI even > though SPI clock speed is 15 MHz. > 2. The SPI clock speed is less than 15 MHz. Sorry there is a missunderstanding between us. The receive buffer overflow = is not causing any=C2=A0harm except filling the log. In my setup I get in one day = about 35000 entries. I am not sure=C2=A0if its appropriate to log these errors. The SPI Frequency is at 14.8 MHz. If I just have 2 boards connected, I am n= ot able to reproduce this. Only with 3 boards when 2 boards sends multiple big ethe= rnet frames (1512 byte per Frame) to one, I get these log entries.=C2=A0 The latency seems to be quite low, from IRQ to start reading first frame it= takes always less than 500us. Also the boards are just running the udp test. >=20 > I use the below iperf3 setup for my testing and never faced the driver > crash issue even though faced "receive buffer overflow" error when I run > RPI 4 with "ondemand" default mode. >=20 > Node 0 - Raspberry Pi 4 with LAN8650 MAC-PHY > $ iperf3 -s > Node 1 - Raspberry Pi 4 with EVB-LAN8670-USB USB Stick > $ iperf3 -c 192.168.5.100 -u -b 10M -i 1 -t 0 >=20 > and vice versa. >=20 > I never faced "receive buffer overflow" error when I run RPI 4 with > "performance" mode enabled and even though all the cores are stressed > using the below command, >=20 > $ yes >/dev/null & yes >/dev/null & yes >/dev/null & yes >/dev/null & >=20 > Can you share more details about your testing setup and applications you > use, so that I will try to reproduce the issue in my setup to debug the > driver? I use a internal tool which does some stress tests using udp. Unfortunately= , I am not allowed to publish it, but a colleague works on a rust implementat= ion, which we can publish, but its not fully ready yet. On one board the tool is running in server mode. It just echoes back the re= ceived data. On the 2 other boards the tool is running in client mode. It sends va= rious sized udp-packets in different bursts and then checks if it receives the sa= me data in the same order. The crashes only happens when ZARFE is not set (with Rev B0). When the cras= h happens, I see on the logic analyzer that there are more bytes than mtu + h= eaders between the frame where start_valid is set and the frame where end_valid is= set. Then this happens: [ 437.155673] skbuff: skb_over_panic: text:ffff80007a8c2bd8 len:1600 put:6= 4 head:ffff00000de28080 data:ffff00000de280c0 tail:0x680 end:0x640 dev:eth1 [ 437.168987] kernel BUG at net/core/skbuff.c:192! [ 437.173612] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SM= P [ 437.180407] Modules linked in: ppp_async crc_ccitt ppp_generic slhc lan8= 65x oa_tc6 bec_infoo(O) tpm_tis_spi tpm_tis_core spi_imx imx_sdma [ 437.196016] CPU: 1 PID: 455 Comm: oa-tc6-spi-thre Tainted: G O= 6.6.11- gce336e2c2bc3-dirty #1 [ 437.205853] Hardware name: Toradex Verdin iMX8M Mini on FUMU (DT) [ 437.212820] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE= =3D--) [ 437.219790] pc : skb_panic+0x58/0x5c [ 437.223376] lr : skb_panic+0x58/0x5c [ 437.226959] sp : ffff80008362bd90 [ 437.230278] x29: ffff80008362bda0 x28: 0000000000000000 x27: ffff0000010= 66878 [ 437.237426] x26: 000000000000001e x25: 00000000000007f8 x24: ffff0000010= cea80 [ 437.244571] x23: 00000000f0f0f0f1 x22: 000000000000001f x21: 00000000000= 00000 [ 437.251720] x20: ffff0000010ceaa8 x19: 000000003f20003f x18: fffffffffff= fffff [ 437.258867] x17: ffff7ffffded9000 x16: ffff800080008000 x15: 073a0764076= e0765 [ 437.266015] x14: 0720073007380736 x13: ffff8000823d1f58 x12: 00000000000= 00534 [ 437.273162] x11: 00000000000001bc x10: ffff800082429f58 x9 : ffff8000823= d1f58 [ 437.280310] x8 : 00000000ffffefff x7 : ffff800082429f58 x6 : 00000000000= 00000 [ 437.287455] x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 00000000000= 00000 [ 437.294606] x2 : 0000000000000000 x1 : ffff000001223b00 x0 : 00000000000= 00087 [ 437.301753] Call trace: [ 437.304203] skb_panic+0x58/0x5c [ 437.307436] skb_find_text+0x0/0xf0 [ 437.310933] oa_tc6_spi_thread_handler+0x438/0x880 [oa_tc6] [ 437.316523] kthread+0x118/0x11c [ 437.319758] ret_from_fork+0x10/0x20 [ 437.323343] Code: f90007e9 b940b908 f90003e8 97ca3c34 (d4210000) [ 437.329446] ---[ end trace 0000000000000000 ]--- Sometimes there are 2 end_valid after eachother without a start_valid betwe= en. Then this happens: [ 469.737297] Unable to handle kernel NULL pointer dereference at virtual = address 0000000000000074 [ 469.746137] Mem abort info: [ 469.748950] ESR =3D 0x0000000096000004 [ 469.752709] EC =3D 0x25: DABT (current EL), IL =3D 32 bits [ 469.758036] SET =3D 0, FnV =3D 0 [ 469.761098] EA =3D 0, S1PTW =3D 0 [ 469.764252] FSC =3D 0x04: level 0 translation fault [ 469.769144] Data abort info: [ 469.772033] ISV =3D 0, ISS =3D 0x00000004, ISS2 =3D 0x00000000 [ 469.777529] CM =3D 0, WnR =3D 0, TnD =3D 0, TagAccess =3D 0 [ 469.782594] GCS =3D 0, Overlay =3D 0, DirtyBit =3D 0, Xs =3D 0 [ 469.787921] user pgtable: 4k pages, 48-bit VAs, pgdp=3D0000000043c32000 [ 469.794377] [0000000000000074] pgd=3D0000000000000000, p4d=3D00000000000= 00000 [ 469.801184] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 469.807459] Modules linked in: ppp_async crc_ccitt ppp_generic slhc lan8= 65x oa_tc6 bec_infoo(O) tpm_tis_spi tpm_tis_core spi_imx imx_sdma [ 469.823064] CPU: 2 PID: 456 Comm: oa-tc6-spi-thre Tainted: G O= 6.6.11- g350ed394a6ca-dirty #1 [ 469.832903] Hardware name: Toradex Verdin iMX8M Mini on FUMU (DT) [ 469.839871] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE= =3D--) [ 469.846841] pc : skb_put+0xc/0x6c [ 469.850169] lr : oa_tc6_spi_thread_handler+0x438/0x880 [oa_tc6] [ 469.856106] sp : ffff80008376bdb0 [ 469.859424] x29: ffff80008376bdb0 x28: 0000000000000000 x27: ffff0000019= 4c080 [ 469.866573] x26: 0000000000000000 x25: 0000000000000000 x24: ffff0000010= 95c80 [ 469.873720] x23: 00000000f0f0f0f1 x22: 000000000000001f x21: 00000000000= 00000 [ 469.880870] x20: ffff000001095ca8 x19: 000000003f20003f x18: 00000000000= 00000 [ 469.888023] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000000= 00000 [ 469.895174] x14: 0000031acf8b86d8 x13: 0000000000000000 x12: 00000000000= 00000 [ 469.902321] x11: 0000000000000002 x10: 0000000000000a60 x9 : ffff8000837= 6b970 [ 469.909467] x8 : ffff00007fb6e580 x7 : 000000000194b080 x6 : 00000000000= 00000 [ 469.916616] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000000= 0fc80 [ 469.923765] x2 : 0000000000000001 x1 : 0000000000000040 x0 : 00000000000= 00000 [ 469.930915] Call trace: [ 469.933365] skb_put+0xc/0x6c [ 469.936342] oa_tc6_spi_thread_handler+0x438/0x880 [oa_tc6] [ 469.941929] kthread+0x118/0x11c [ 469.945166] ret_from_fork+0x10/0x20 [ 469.948752] Code: d65f03c0 d503233f a9bf7bfd 910003fd (b9407406) [ 469.954854] ---[ end trace 0000000000000000 ]--- If interested I can try to get a recording with the logic analyzer and send= it to you. By the way in the other answer you attached a screenshot of the logic analy= zer and you have a very nice HLA for oa_tc6. Are they open-source or are there any plan= s to publish them? > >=20 > > - Sometimes the chip stops working. It always asserts the irq but there= is no data (rca=3D0) > > and also exst is not active. I found out that there is an errata (DS= 80001075) point s3 > > that explains this. I set the ZARFE bit in CONFIG0. This also fixes = the point above. > > The driver now works since about 2.5 weeks with various load with ju= st one loss of frame > > error where I had to reboot the system after about 4 days. > It is good to hear that the driver works fine with the above changes. As= =20 > mentioned in the errata, this continuous interrupt issue is a known > issue with LAN8651 Rev.B0. Switching to LAN8651 Rev.B1 will solve this > issue and no need of any workaround. Setting ZARFE bit in the CONFIG0 > will solve the continuous interrupt issue but don't know how the above > "receive buffer overflow" issue also solved. I think it is a good idea= =20 > to test with LAN8651 Rev.B1 without setting ZARFE bit once. It would be= =20 > interesting to see the result. I am always using LAN8651 Rev.B1 for my= =20 > testing. Unfortunately I just have LAN8651 Rev. B0 Chips. Are you sure that the Rev = B1 has the issue fixed? The errata here says that B1 is affected too: https://ww1.microchip.com/downloads/aemDocuments/documents/AIS/ProductDocum= ents/Errata/LAN8650-1-Errata-80001075.pdf >=20 > I should be able to reproduce the "receive buffer overflow" issue and=20 > consequently kernel crash in my setup with LAN8651 Rev.B1 so that I can= =20 > investigate the issue further. As I am not able to reproduce in my RPI= =20 > 4, I need your support for the tests and applications you used in your= =20 > setup. > >=20 > > Is there a reason why you removed the netdev watchdog which was active = in v2? > When the timeout occurs, there is no further action except increasing > tx_errors. Not seeing this except USB-to-Ethernet which can be removed > unexpectedly. But this is SPI interface which will not be removed > unexpectedly as it is a platform device. That's why we removed this. >=20 > Best regards, > Parthiban V > >=20 > > Thanks, > > Benjamin Bigler > >=20 >=20 Thanks, Benjamin Bigler