Subject: Re: Kernel Panic in skb_release_data using genet
To: Maxime Ripard, Florian Fainelli
Cc: Doug Berger, bcm-kernel-feedback-list@broadcom.com, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, Nicolas Saenz Julienne
References: <20210524130147.7xv6ih2e3apu2zvu@gilmour> <20210524151329.5ummh4dfui6syme3@gilmour> <1482eff4-c5f4-66d9-237c-55a096ae2eb4@gmail.com> <6caa98e7-28ba-520c-f0cc-ee1219305c17@gmail.com> <20210528163219.x6yn44aimvdxlp6j@gilmour>
From: Florian Fainelli
Message-ID: <77d412b4-cdd6-ea86-d7fd-adb3af8970d9@gmail.com>
Date: Fri, 28 May 2021 09:48:14 -0700
In-Reply-To: <20210528163219.x6yn44aimvdxlp6j@gilmour>
X-Mailing-List: linux-kernel@vger.kernel.org

On 5/28/21 9:32 AM, Maxime Ripard wrote:
> hi Florian,
>
> On Fri, May 28, 2021 at 09:21:27AM -0700, Florian Fainelli wrote:
>> On 5/24/21 8:37 AM, Florian Fainelli wrote:
>>>
>>>
>>> On 5/24/2021 8:13 AM, Maxime Ripard wrote:
>>>> Hi Florian,
>>>>
>>>> On Mon, May 24, 2021 at 07:49:25AM -0700, Florian Fainelli wrote:
>>>>> Hi Maxime,
>>>>>
>>>>> On 5/24/2021 6:01 AM, Maxime Ripard wrote:
>>>>>> Hi Doug, Florian,
>>>>>>
>>>>>> I've been running a RaspberryPi4 with a mainline kernel for a while,
>>>>>> booting from NFS. Every once in a while (I'd say ~20-30% of all boots),
>>>>>> I'm getting a kernel panic around the time init is started.
>>>>>>
>>>>>> I was debugging a kernel based on drm-misc-next-2021-05-17 today with
>>>>>> KASAN enabled and got this, which looks related:
>>>>>
>>>>> Is there a known good version that could be used for bisection or you
>>>>> just started to do this test and you have no reference point?
>>>>
>>>> I've had this issue for over a year and never (I think?) got a good
>>>> version, so while it might be a regression, it's not a recent one.
>>>
>>> OK, this helps and does not really help.
>>>
>>>>
>>>>> How stable in terms of clocking is the configuration that you are using?
>>>>> I could try to fire up a similar test on a Pi4 at home, or use one of
>>>>> our 72112 systems which is the closest we have to a Pi4 and see if that
>>>>> happens there as well.
>>>>
>>>> I'm not really sure about the clocking. Is there any clock you want to
>>>> look at in particular?
>>>
>>> ARM, DDR, AXI, anything that could cause some memory corruption to occur
>>> essentially. GENET clocks are fairly fixed, you have a 250MHz clock and
>>> a 125MHz clock feeding the data path.
>>>
>>>>
>>>> My setup is fairly simple: the firmware and kernel are loaded over TFTP
>>>> and the rootfs is mounted over NFS, and the crash always occur around
>>>> init start, so I guess when it actually starts to transmit a decent
>>>> amount of data?
>>>
>>> Do you reproduce this problem with KASAN disabled, do you eventually
>>> have a crash pointing back to the same location?
>>>
>>> I have a suspicion that this is all Pi4 specific because we regularly
>>> run the GENET driver through various kernel versions (4.9, 5.4 and 5.10
>>> and mainline) and did not run into that.
>>
>> I have not had time to get a set-up to reproduce what you are seeing,
>> could you share your .config meanwhile? Thanks
>
> Sorry, I didn't have the time to check how the clock were behaving.
>
> You'll find attached my config.txt file and .config
>
> I'm booting the board entirely from TFTP (which might introduce some
> issues in the "handoff" from the bootloader to the kernel), you'll find
> some guide there:
>
> https://www.raspberrypi.org/documentation/hardware/raspberrypi/bootmodes/net_tutorial.md

That is also how I boot my Pi4 at home, and I suspect you are right, if
the VPU does not shut down GENET's DMA, and leaves buffer addresses in
the on-chip descriptors that point to an address space that is managed
totally differently by Linux, then we can have a serious problem and
create some memory corruption when the ring is being reclaimed. I will
run a few experiments to test that theory and there may be a solution
using the SW_INIT reset controller to have a big reset of the controller
before handing it over to the Linux driver.
-- 
Florian
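
The handoff theory in the reply can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: it is not GENET driver code, and every name in it (`Descriptor`, `ToyDmaRing`, `bootloader_handoff`, `reclaim`, `hw_reset`, the addresses and ring size) is hypothetical. It shows why descriptors left populated by an earlier boot stage would make a ring reclaim touch buffers the new owner never allocated, and why a full reset before handoff would avoid that.

```python
# Toy model only: not GENET driver code. All names and values are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    addr: int = 0    # buffer address held in the (modelled) on-chip descriptor
    length: int = 0  # buffer length


@dataclass
class ToyDmaRing:
    descs: list = field(default_factory=lambda: [Descriptor() for _ in range(4)])
    dma_enabled: bool = False

    def hw_reset(self):
        """Stand-in for a full controller reset: stop DMA, clear descriptors."""
        self.dma_enabled = False
        for d in self.descs:
            d.addr = 0
            d.length = 0


def bootloader_handoff():
    """Model an earlier boot stage that leaves DMA on and descriptors filled."""
    ring = ToyDmaRing()
    ring.dma_enabled = True
    for i, d in enumerate(ring.descs):
        d.addr = 0x3000_0000 + i * 0x800  # addresses the OS knows nothing about
        d.length = 0x800
    return ring


def reclaim(ring, owned_by_os):
    """Return descriptor addresses a reclaim would touch that the OS never mapped."""
    return [d.addr for d in ring.descs if d.addr and d.addr not in owned_by_os]
```

In this model, calling `reclaim(bootloader_handoff(), set())` reports all four stale addresses, while calling `hw_reset()` on the ring first leaves nothing stale to reclaim. Whether the real hardware behaves this way is exactly what the experiments mentioned in the reply would need to confirm.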