Received: by 2002:a05:6a10:8c0a:0:0:0:0 with SMTP id go10csp2639328pxb; Thu, 11 Feb 2021 19:04:47 -0800 (PST) X-Google-Smtp-Source: ABdhPJyAlx8hioI7PRCOCXyD+bV+E9+yUxyVmasU7PmurVhfEVq7JahLJBIHlR5jsy43dU90kBu/ X-Received: by 2002:a05:6402:35c2:: with SMTP id z2mr1222684edc.34.1613099087250; Thu, 11 Feb 2021 19:04:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1613099087; cv=none; d=google.com; s=arc-20160816; b=QebMqsAAE8ioyYHe8+TwvGMxp11wZuDaD7eUaUoquNGSCsNrhg11ihQ7sqEo5FG5Sl QEN5uYzazoj4IZgmsS7VxSe3xBev6RlD6MsOAL8V0JL1Iq/aK4S0hE96AHtwfpvAm5Gd 7B1/cShW2Zek6wwiHphGj8azS4ah5d0CfDJAXyLDtlvPP2sb4PQ+xwaCkWv4m8gpJRxZ vCT2YmIgq3bhIuQ7Rt4JzUFOrVR2tx0b6qIB8Pmj9YCrcS3+fQxuB8qxqa9SPZcy9ZKZ D0amYiIgqPghf2pUm1oAx2dzXCZSe1p0IR4Rv2cX5CFDNeyFCtwz8/dgUZHHQNGfY6TP MJ2w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=2YjH7EUH7e5dFZyK95JFbg4Mk0a2h8e9VHpnb/uN+f4=; b=0PD9Q+xONatRj+4pzhAhzinHy3toUAEZUG1oAx4aWEWsfECn3RSVH5sYzhyabE6Qp9 DhAbbRMeivvnZhdI0u4xzoD/Iep5H1sL1nEyZhvq9G1R9wst4MYw5HwESgqkPnWRXw45 RUezS3xbcVRD9Yb2+/j9rs9xLlaVFyp1C14GaTHXY/ISKLzCkQhE6mT9Bs1SxQ+IX7Wl 1rlxsVAwx+naGvVc4z/grGWpjs5AsMKhkWmY5xir3Hze9Clhtm7unN8Ih8InT5kxIJnp ci24sTQdKGXNiPbilRCoKBidteZAeaqvVGkWt/nND8QlRDW+u2dTbWyahejx0hujPu5u kw+g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=hrrZIRrC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id hp6si5255444ejc.532.2021.02.11.19.04.24; Thu, 11 Feb 2021 19:04:47 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=hrrZIRrC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229829AbhBLDDC (ORCPT + 99 others); Thu, 11 Feb 2021 22:03:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60110 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229675AbhBLDBW (ORCPT ); Thu, 11 Feb 2021 22:01:22 -0500 Received: from mail-yb1-xb34.google.com (mail-yb1-xb34.google.com [IPv6:2607:f8b0:4864:20::b34]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 735FAC06178A for ; Thu, 11 Feb 2021 19:00:31 -0800 (PST) Received: by mail-yb1-xb34.google.com with SMTP id q12so5590899ybm.8 for ; Thu, 11 Feb 2021 19:00:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=2YjH7EUH7e5dFZyK95JFbg4Mk0a2h8e9VHpnb/uN+f4=; b=hrrZIRrCawC05z8ys0aGjWa1eOqkwmku4UHqTFr13ZSIA6fT+G2IvzKpKYh9K5dqwn NIABQJjxAd28M3i8JvfkPjLBLQwPlGHAgBCTRoMjjkw7ZVxU+l7U68uX4Xm4dneE1KzF Vthvf9SBlYUJa8qd4KE1Yplts1yOFoAUi8F+g0z31m5ckKC/LzJo5EcW4YIv6eCp3k51 nX1yU/6un0qh6l4Um5ikuPa98QpzhY0sgOgYLAseeWfyQ/ejxSsSWx4zVon9MqL/ofF1 Mlb+0a3lG5bWPnL31fTMQz52NB2wT3ptd77Zg2UC4PQDql2P4piYPqTCvzV9Li53IoTp C7Bw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=2YjH7EUH7e5dFZyK95JFbg4Mk0a2h8e9VHpnb/uN+f4=; b=AySqRoZQldO4G36AuTkHeyZCZLo1yKUTV1CWkkKlf7urEa+oDun9+cy7/sBKfybKDG Xn4B19gTtQ4vypRNHntDKDg4Lrz9sefBeO5tAGZAfPBQbEs8qBj7dnhSNat3hijjAerF +e+qE6cgoUyIXf3cJi+dmcsrXOffRWCrCFnrRuHpJIiX2CXI7Hz20lvAyd1etKHpG77F rfnx7/plnLmhP143DFhb5JsRoGCja+hqdu5DAMi/hVvIuKgCn0TlzCEUHrhbwkOLHCIX tEKc0A5t1hKWNoeS18WcBXHCQXTrxc8zZGjAe8vZpAIR+GFNjSCP52BNIc+63A2erd6I /XFg== X-Gm-Message-State: AOAM533TngNk18ephR64jg505qoL6t+Mbbnip/ss4ZbnKoqcHYfgXXeD VoejGqS387wncg1as8R2FXz9DusxePgeKeMBRauNfA== X-Received: by 2002:a25:cc89:: with SMTP id l131mr1299415ybf.346.1613098830159; Thu, 11 Feb 2021 19:00:30 -0800 (PST) MIME-Version: 1.0 References: <20210205222644.2357303-1-saravanak@google.com> In-Reply-To: From: Saravana Kannan Date: Thu, 11 Feb 2021 18:59:54 -0800 Message-ID: Subject: Re: [PATCH v4 0/8] Make fw_devlink=on more forgiving To: Geert Uytterhoeven Cc: Jonathan Corbet , Greg Kroah-Hartman , "Rafael J. Wysocki" , Kevin Hilman , Ulf Hansson , Len Brown , Len Brown , Pavel Machek , Michael Turquette , Stephen Boyd , Rob Herring , Frank Rowand , Marc Zyngier , Thomas Gleixner , "open list:DOCUMENTATION" , Linux Kernel Mailing List , Linux PM list , linux-clk , "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS" , ACPI Devel Maling List , Marek Szyprowski , Android Kernel Team , Linux-Renesas Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Feb 11, 2021 at 5:00 AM Geert Uytterhoeven wrote: > > Hi Saravana, > > On Fri, Feb 5, 2021 at 11:26 PM Saravana Kannan wrote: > > There are a lot of devices/drivers where they never have a struct device > > created for them or the driver initializes the hardware without ever > > binding to the struct device. > > > > This series is intended to avoid any boot regressions due to such > > devices/drivers when fw_devlink=on and also address the handling of > > optional suppliers. > > > > Patch 1 and 2 addresses the issue of firmware nodes that look like > > they'll have struct devices created for them, but will never actually > > have struct devices added for them. For example, DT nodes with a > > compatible property that don't have devices added for them. > > > > Patch 3 and 4 allow for handling optional DT bindings. > > > > Patch 5 sets up a generic API to handle drivers that never bind with > > their devices. > > > > Patch 6 through 8 update different frameworks to use the new API. > > > > Thanks, > > Saravana > > > > Saravana Kannan (8): > > driver core: fw_devlink: Detect supplier devices that will never be > > added > > of: property: Don't add links to absent suppliers > > driver core: Add fw_devlink.strict kernel param > > of: property: Add fw_devlink support for optional properties > > driver core: fw_devlink: Handle suppliers that don't use driver core > > irqdomain: Mark fwnodes when their irqdomain is added/removed > > PM: domains: Mark fwnodes when their powerdomain is added/removed > > clk: Mark fwnodes when their clock provider is added/removed > > Thanks for your series, which is now part of driver-core-next. > I gave driver-core-next + [1] a try on various Renesas boards. Thanks! > Test results are below. > In general, the result looks much better than before. Ah, good to hear this. > [1] - https://lore.kernel.org/lkml/20210210114435.122242-1-tudor.ambarus@microchip.com/ > > 1. R-Car Gen2 (Koelsch), R-Car Gen3 (Salvator-X(S), Ebisu). > > - Commit 2dfc564bda4a31bc ("soc: renesas: rcar-sysc: Mark device > node OF_POPULATED after init") is no longer needed (but already > queued for v5.12 anyway) Rob doesn't like the proliferation of OF_POPULATED and we don't need it anymore, so maybe work it out with him? It's a balance between some wasted memory (struct device(s)) vs not proliferating OF_POPULATED. > - Some devices are reprobed, despite their drivers returning > a real error code, and not -EPROBE_DEFER: Sorry, it's not obvious from the logs below where "reprobing" is happening. Can you give more pointers please? Also, thinking more about this, the only way I could see this happen is: 1. Device fails with error that's not -EPROBE_DEFER 2. It somehow gets added to a device link (with AUTOPROBE_CONSUMER flag) where it's a consumer. 3. The supplier probes and the device gets added to the deferred probe list again. But I can't see how this sequence can happen. Device links are created only when a device is added. And is the supplier isn't added yet, the consumer wouldn't have probed in the first place. Other than "annoying waste of time" is this causing any other problems? > renesas_wdt e6020000.watchdog: Watchdog blacklisted on r8a7791 ES1.* > (rwdt_probe() returns -ENODEV) > > sh-pfc e6060000.pinctrl: pin GP_7_23 already requested by > ee090000.pci; cannot claim for e6590000.usb > sh-pfc e6060000.pinctrl: pin-247 (e6590000.usb) status -22 > sh-pfc e6060000.pinctrl: could not request pin 247 > (GP_7_23) from group usb0 on device sh-pfc > renesas_usbhs e6590000.usb: Error applying setting, > reverse things back > renesas_usbhs: probe of e6590000.usb failed with error -22 > > rcar-pcie fe000000.pcie: host bridge /soc/pcie@fe000000 ranges: > rcar-pcie fe000000.pcie: IO > 0x00fe100000..0x00fe1fffff -> 0x0000000000 > rcar-pcie fe000000.pcie: MEM > 0x00fe200000..0x00fe3fffff -> 0x00fe200000 > rcar-pcie fe000000.pcie: MEM > 0x0030000000..0x0037ffffff -> 0x0030000000 > rcar-pcie fe000000.pcie: MEM > 0x0038000000..0x003fffffff -> 0x0038000000 > rcar-pcie fe000000.pcie: IB MEM > 0x0040000000..0x00bfffffff -> 0x0040000000 > rcar-pcie fe000000.pcie: IB MEM > 0x0200000000..0x02ffffffff -> 0x0200000000 > rcar-pcie fe000000.pcie: PCIe link down > (rcar_pcie_probe() returns -ENODEV) > > xhci-hcd ee000000.usb: xHCI Host Controller > xhci-hcd ee000000.usb: new USB bus registered, assigned bus number 7 > xhci-hcd ee000000.usb: Direct firmware load for > r8a779x_usb3_v3.dlmem failed with error -2 > xhci-hcd ee000000.usb: can't setup: -2 > xhci-hcd ee000000.usb: USB bus 7 deregistered > xhci-hcd: probe of ee000000.usb failed with error -2 > > - The PCI reprobing leads to a memory leak, for which I've sent a fix > "[PATCH] PCI: Fix memory leak in pci_register_io_range()" > https://lore.kernel.org/linux-pci/20210202100332.829047-1-geert+renesas@glider.be/ Wrt PCI reprobing, 1. Is this PCI never expected to probe, but it's being reattempted despite the NOT EPROBE_DEFER error? Or 2. The PCI was deferred probe when it should have probed and then when it's finally reattemped and it could succeed, we are hitting this mem leak issue? I'm basically trying to distinguish between "this stuff should never be retried" vs "this/it's suppliers got probe deferred with fw_devlink=on vs but didn't get probe deferred with fw_devlink=permissive and that's causing issues" > - I2C on R-Car Gen3 does not seem to use DMA, according to > /sys/kernel/debug/dmaengine/summary: > > -dma4chan0 | e66d8000.i2c:tx > -dma4chan1 | e66d8000.i2c:rx > -dma5chan0 | e6510000.i2c:tx I think I need more context on the problem before I can try to fix it. I'm also very unfamiliar with that file. With fw_devlink=permissive, I2C was using DMA? If so, the next step is to see if the I2C relative probe order with DMA is getting changed and if so, why. > - Disabling CONFIG_IPMMU_VMSA (IOMMU) now works, good! > > ignoring dependency for device, assuming no driver > > - Disabling CONFIG_RCAR_DMAC works for most devices, except for > sound: > > -rcar_sound ec500000.sound: probed > > ALSA device list: > - #0: rcar-sound > + No soundcards found. > > # cat /sys/kernel/debug/devices_deferred > 2-0010 > sound > ec500000.sound > > platform e6510000.i2c: Linked as a sync state only > consumer to ec500000.sound > platform ec500000.sound: Linked as a consumer to e6060000.pinctrl > platform ec500000.sound: Linked as a consumer to > e6150000.clock-controller > i2c 2-0010: Linked as a consumer to ec500000.sound > platform ec500000.sound: Linked as a consumer to 2-004f > cs2000-cp 2-004f: revision - C1 > i2c-rcar e6510000.i2c: probed > i2c-rcar e6510000.i2c: Dropping the link to ec500000.sound > i2c 2-0010: probe deferral - supplier ec500000.sound not ready > > With CONFIG_RCAR_DMAC=y, ec500000.sound is probed quite early. I saw your other reply, so I'll ignore this sound/DMA issue. > > arch/arm64/boot/dts/renesas/r8a77951-salvator-xs.dts > > ak4613: codec@10 { > clocks = <&rcar_sound 3>; > > port { > ak4613_endpoint: endpoint { > remote-endpoint = <&rsnd_endpoint0>; > }; > }; > }; > > sound_card: sound { > dais = <&rsnd_port0 /* ak4613 */ > &rsnd_port1 /* HDMI0 */ > &rsnd_port2>; /* HDMI1 */ > }; > > rcar_sound: sound@ec500000 { > ports { > rsnd_port0: port@0 { > rsnd_endpoint0: endpoint { > remote-endpoint = > <&ak4613_endpoint>; > } > } > } > }; > > > 2. SH/R-Mobile AG5 (kzm9g), APE6 (ape6evm), A1 (armadillo800-eva) > > - "PATCH] soc: renesas: rmobile-sysc: Set OF_POPULATED and absorb > reset handling" is no longer needed > https://lore.kernel.org/linux-arm-kernel/20210205133319.1921108-1-geert+renesas@glider.be/ Good to see more evidence that this series is fixing things at a more generic level. > - On R-Mobile A1, I get a BUG and a memory leak: > > BUG: spinlock bad magic on CPU#0, swapper/1 > lock: lcdc0_device+0x10c/0x308, .magic: 00000000, .owner: > /-1, .owner_cpu: 0 > CPU: 0 PID: 1 Comm: swapper Not tainted > 5.11.0-rc5-armadillo-00032-gf0a85c26907e #266 > Hardware name: Generic R8A7740 (Flattened Device Tree) > [] (unwind_backtrace) from [] > (show_stack+0x10/0x14) > [] (show_stack) from [] > (do_raw_spin_lock+0x20/0x94) > [] (do_raw_spin_lock) from [] > (dev_pm_get_subsys_data+0x30/0xa0) > [] (dev_pm_get_subsys_data) from [] > (genpd_add_device+0x34/0x1c0) > [] (genpd_add_device) from [] > (of_genpd_add_device+0x34/0x4c) > [] (of_genpd_add_device) from [] > (board_staging_register_device+0xf8/0x118) > [] (board_staging_register_device) from > [] (board_staging_register_devices+0x24/0x28) > [] (board_staging_register_devices) from > [] (runtime_board_check+0x2c/0x40) > [] (runtime_board_check) from [] > (do_one_initcall+0xe0/0x278) > [] (do_one_initcall) from [] > (kernel_init_freeable+0x174/0x1c0) > [] (kernel_init_freeable) from [] > (kernel_init+0x8/0x118) > [] (kernel_init) from [] > (ret_from_fork+0x14/0x38) > Exception stack(0xc19c9fb0 to 0xc19c9ff8) > 9fa0: 00000000 > 00000000 00000000 00000000 > 9fc0: 00000000 00000000 00000000 00000000 00000000 > 00000000 00000000 00000000 > 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > > unreferenced object 0xc4134e00 (size 512): > comm "swapper", pid 1, jiffies 4294937296 (age 3541.930s) > hex dump (first 32 bytes): > 00 4e 13 c4 00 4e 13 c4 ff ff ff 7f ff ff ff 7f > .N...N.......... > ff ff ff 7f 02 00 00 00 00 5f 13 c4 1c 4e 13 c4 > ........._...N.. > backtrace: > [] dev_pm_qos_constraints_allocate+0x10/0xcc > [] dev_pm_qos_add_notifier+0x6c/0xd0 > [] genpd_add_device+0x178/0x1c0 > [<95067303>] of_genpd_add_device+0x34/0x4c > [] board_staging_register_device+0xf8/0x118 > [<01bd495a>] board_staging_register_devices+0x24/0x28 > [] runtime_board_check+0x2c/0x40 > [<65aed679>] do_one_initcall+0xe0/0x278 > [<97e3f4f7>] kernel_init_freeable+0x174/0x1c0 > [<63c8fed0>] kernel_init+0x8/0x118 > [] ret_from_fork+0x14/0x38 > [<00000000>] 0x0 Hmm... I looked at this in bits and pieces throughout the day. At least spent an hour looking at this. This doesn't make a lot of sense to me. I don't even touch anything in this code path AFAICT. Are modules/kernel mixed up somehow? I need more info before I can help. Does reverting my pm domain change make any difference (assume it boots this far without it). > > 3. RZ/A1 and RZ/A2: No issues. Great! -Saravana