Received: by 2002:a05:6a10:af89:0:0:0:0 with SMTP id iu9csp728230pxb; Fri, 14 Jan 2022 15:09:02 -0800 (PST) X-Google-Smtp-Source: ABdhPJzAF2OJBsX+Jp52tEPhrEw8+94EVAQN++4bemPWjgxO9QW9Qa3C+whFc97LF1StY8opplIP X-Received: by 2002:a17:902:a404:b0:148:c0e0:423f with SMTP id p4-20020a170902a40400b00148c0e0423fmr11906774plq.90.1642201742581; Fri, 14 Jan 2022 15:09:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1642201742; cv=none; d=google.com; s=arc-20160816; b=Hgt+A911h9VbsJpy/qbCgJMeq/noKUO/zt/6IH7+qXtGl3Tda3tRnCk8HeS29RZjYy Mw87ZQySmLr+NT9ys9ngUbhRN7pk1LqgrjM50d7wtkTm7v036q+k2DHJRF8tzgaCT+LL IVUprOj2rY/VNWHvNARaxOJCXnFJj1kqoyraHGWvP09XbDxmFLdBRKT2rZt3PSRkXSKR IlXmhjeKuNPMWL3yfIFeNt6l6b+n41dk0Yiq5hayFHCqkctu4G7N3Yjkp67uABK7Ujwz ITXhC/UhKiZiz91tcMQODH9QqfQ8uxwdl+T8WoL14OetUahhcNgTUc8NZntB9vicZ0Yg fSGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:user-agent:from :references:in-reply-to:mime-version:dkim-signature; bh=qzWKW58EHMRYfJEMWdh6ZI9TALsJ363+iOZxWHKT5Dg=; b=HAiun1qsNRNgcSpbIXSKddxk2t3tDuqYnOp7bosNUiloKg6dLmLfbzIyqFvz73e2/a Tuxko4VV9zPBgVx1GrVlZ9LgnHw3CjpG19/Ktz/6P8WMLJg0NDrgYF8pqm1QG/s1vnM1 ohcn5eBSYiRCEMT5Y4KO4ELvbqChtJkYSbk22EeHLFfdLls4qS2biEdltKHymvJKOVaU AVsRDNPtaIBpbJeRUAEqEEzZI/tkC5AsB7/uMQxS95lvilyR582F/vm8YsQKvH/vkt65 /WH2RAxviXq53WL4uk3bCjYLFvhNr7fhWFYdz4Srx5Dq8Ljby1G/+j7QixLqFvXJBdUm yLoQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=m+tA33y4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id c8si1765642pga.231.2022.01.14.15.08.50; Fri, 14 Jan 2022 15:09:02 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=m+tA33y4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229848AbiANVa4 (ORCPT + 99 others); Fri, 14 Jan 2022 16:30:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229837AbiANVaz (ORCPT ); Fri, 14 Jan 2022 16:30:55 -0500 Received: from mail-oi1-x234.google.com (mail-oi1-x234.google.com [IPv6:2607:f8b0:4864:20::234]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 034AFC061574 for ; Fri, 14 Jan 2022 13:30:55 -0800 (PST) Received: by mail-oi1-x234.google.com with SMTP id y14so14016029oia.9 for ; Fri, 14 Jan 2022 13:30:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:in-reply-to:references:from:user-agent:date:message-id :subject:to:cc; bh=qzWKW58EHMRYfJEMWdh6ZI9TALsJ363+iOZxWHKT5Dg=; b=m+tA33y4EKPBF5RGvo85oy9HWU3abLrzxeiKVsLD9tVTojyWnpMfPiGManmCHq+SJa jUCYY4SnECxcwj0qpyLRNSTBNh11tWxEcT5GvTDzcAsjz8ZAvy9YRLFECkOSyGWKolcO smb+cvls8PdAvykD60pN2/GL9MWc0BTWbKWjw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:in-reply-to:references:from :user-agent:date:message-id:subject:to:cc; bh=qzWKW58EHMRYfJEMWdh6ZI9TALsJ363+iOZxWHKT5Dg=; b=R7BTUKKXKql7T1U9ks/krG8kfMXLdTC1jq85GMhQaPDNJQWC54LaUtLFjCBQ+PvT9C FOvCAHn+UQPWcd0bUaN1En4GUk9AjtN6beF2Aqsvp0L+0kh+koJ9f6sm/MgzABu4z/uV W2g1fer4/lkBdCkceDNgIB4AtvH4zyYyCzjrguN685IAsRj5M1pZ9Qb3ohNmf+ieoD0R iaqNyjvjs+q3tnQ3NxyY2gUNy5AqRYO0IcSfOZF9XvzQShfVGjhypxYtYKfe7ZSCm07i ywFMFk+lndvG5RJ+bZ1Jk/HM1feWM9EUGf72VSaBS76nsVgcJumooc5GkJrL2+Uk9td7 B5xg== X-Gm-Message-State: AOAM531p3GPQl3k3h9VE1zZI9QrVy2LV4aaJX2OduockjQbsbtMRlWIJ EghCwuLdGilO41ay9jJZugDk8V5Cv3b90JrQUwWc/w== X-Received: by 2002:aca:a953:: with SMTP id s80mr14644271oie.164.1642195854341; Fri, 14 Jan 2022 13:30:54 -0800 (PST) Received: from 753933720722 named unknown by gmailapi.google.com with HTTPREST; Fri, 14 Jan 2022 15:30:53 -0600 MIME-Version: 1.0 In-Reply-To: <69a10908622512c60790f97942731a8ab989b727.camel@mediatek.com> References: <20220106214556.2461363-1-swboyd@chromium.org> <20220106214556.2461363-26-swboyd@chromium.org> <1a3b368eb891ca55c33265397cffab0b9f128737.camel@mediatek.com> <69a10908622512c60790f97942731a8ab989b727.camel@mediatek.com> From: Stephen Boyd User-Agent: alot/0.10 Date: Fri, 14 Jan 2022 15:30:53 -0600 Message-ID: Subject: Re: [PATCH v5 25/32] iommu/mtk: Migrate to aggregate driver To: Yong Wu Cc: Krzysztof Kozlowski , Greg Kroah-Hartman , Douglas Anderson , linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, dri-devel@lists.freedesktop.org, freedreno@lists.freedesktop.org, Joerg Roedel , Will Deacon , Daniel Vetter , "Rafael J. Wysocki" , Rob Clark , Russell King , Saravana Kannan , linux-mediatek@lists.infradead.org, iommu@lists.linux-foundation.org, youlin.pei@mediatek.com Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Quoting Yong Wu (2022-01-14 01:06:31) > On Wed, 2022-01-12 at 20:25 -0800, Stephen Boyd wrote: > > > > > > [ 2.654526] ------------[ cut here ]------------ > > > [ 2.655558] refcount_t: addition on 0; use-after-free. > > > > > > After this patch, the aggregate_driver flow looks ok. But our > > > driver > > > still aborts like this: > > > > > > [ 2.721316] Unable to handle kernel NULL pointer dereference at > > > virtual address 0000000000000000 > > > ... > > > [ 2.731658] pc : > > > mtk_smi_larb_config_port_gen2_general+0xa4/0x138 > > > [ 2.732434] lr : mtk_smi_larb_resume+0x54/0x98 > > > ... > > > [ 2.742457] Call trace: > > > [ 2.742768] mtk_smi_larb_config_port_gen2_general+0xa4/0x138 > > > [ 2.743496] pm_generic_runtime_resume+0x2c/0x48 > > > [ 2.744090] __genpd_runtime_resume+0x30/0xa8 > > > [ 2.744648] genpd_runtime_resume+0x94/0x2c8 > > > [ 2.745191] __rpm_callback+0x44/0x150 > > > [ 2.745669] rpm_callback+0x6c/0x78 > > > [ 2.746114] rpm_resume+0x314/0x558 > > > [ 2.746559] __pm_runtime_resume+0x3c/0x88 > > > [ 2.747080] pm_runtime_get_suppliers+0x7c/0x110 > > > [ 2.747668] __driver_probe_device+0x4c/0xe8 > > > [ 2.748212] driver_probe_device+0x44/0x130 > > > [ 2.748745] __device_attach_driver+0x98/0xd0 > > > [ 2.749300] bus_for_each_drv+0x68/0xd0 > > > [ 2.749787] __device_attach+0xec/0x148 > > > [ 2.750277] device_attach+0x14/0x20 > > > [ 2.750733] bus_rescan_devices_helper+0x50/0x90 > > > [ 2.751319] bus_for_each_dev+0x7c/0xd8 > > > [ 2.751806] bus_rescan_devices+0x20/0x30 > > > [ 2.752315] __component_add+0x7c/0xa0 > > > [ 2.752795] component_add+0x14/0x20 > > > [ 2.753253] mtk_smi_larb_probe+0xe0/0x120 > > > > > > This is because the device runtime_resume is called before the bind > > > operation(In our case this detailed function is mtk_smi_larb_bind). > > > The issue doesn't happen without this patchset. I'm not sure the > > > right > > > sequence. If we should fix in mediatek driver, the patch could be: > > > > Oh, the runtime PM is moved around with these patches. The aggregate > > device is runtime PM enabled before the probe is called, > > In our case, the component device may probe before the aggregate > device. thus the component device runtime PM has already been enabled > when aggregate device probe. This is always the case. The component device always probes before the aggregate device because the component device registers with the component framework. But the component device can decide to enable runtime PM during driver probe or during component bind. > > > and there are > > supplier links made to each component, so each component is runtime > > resumed before the aggregate probe function is called. > > Yes. This is the current flow. Got it. > > > It means that all > > the component drivers need to have their resources ready to power on > > before their component_bind() callback is made. > > Sorry, I don't understand here well. In this case, The component > drivers prepare the resource for power on in the component_bind since > the resource comes from the aggregate driver. Thus, we expect the > component_bind run before the runtime resume callback. What resource comes from the aggregate device that the component device manages in runtime PM? > > Another solution is moving the component's pm_runtime_enable into the > component_bind(It's mtk_smi_larb_bind here), then the runtime callback > is called after component_bind in which the resource for power on is > ready. This sounds more correct to me. I'm not an expert on runtime PM though as I always have to read the code to remember how it works. if the device isn't ready for runtime PM until the component bind function is called then runtime PM shouldn't be enabled on the component device.