Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp1103759rwl; Wed, 29 Mar 2023 12:45:46 -0700 (PDT) X-Google-Smtp-Source: AKy350ZuKemRVEworbKOQWLRAWJiXx8JLw0u0zOVbksme67WuAnMaY55tFhwbdKJZUShuu8GXzal X-Received: by 2002:a17:907:3f81:b0:93d:bae1:ca9e with SMTP id hr1-20020a1709073f8100b0093dbae1ca9emr26879250ejc.25.1680119146471; Wed, 29 Mar 2023 12:45:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680119146; cv=none; d=google.com; s=arc-20160816; b=cAFjT83sj3Z8vuDMNIzefsRA1CE0n5VwvqU7/x7gUPC+wygotqH9ZtOBEJB5nw+t4W R01fzdzRNkCriAj6YVeBGzze6w7OOTYmlqDjjAFhm+vE35a1BpP8Kpj5SlY9akJUlyzL FEN0EF8xRXzfgrqvFbUGeAZ9S8wzugNJ72El8SpNDvEmGuecuL3dL8MXn8ixJ/fNYKtJ tFewRyi6fBDFbyyTxyHlXDO7v2Eopg/8nltZHuCkS3GDQWcWIY0M6oOo2gNN2R3U9pRI XtKCUE+mxrBJZLMA/9qO85PBOfs9nwaczDykZ9mDp2soSmn7mF44uQcW1FAKTi2015Dp fgCA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:feedback-id:dkim-signature:dkim-signature; bh=7fJNXQMgupYfTfzXjUT/VSxxmReLzs9mghv5xKeEkEw=; b=ZPcHxw2ZThCf5j1LnCGcEthfrv5WSPyFpoaU8ctyVVIqVGYKpq9JnqST5bLlRd1lYh Gmj/szOfRMoeaJPHbhy3PJe4FfRrfglKnoXLkt6YtIEQpK/Jpg5z4RbBiSv8zNoHGpuB N1mV14fyLB6lOvdraGZ/corRoq9Bsyy26Az9SVcKqFI3BqRFyisdvL/n4+f0x5pCO4Ld 5AEkwsZ+29too4GBEwzaFybdcx7NgGkywS2tJ5tsYz41YobBVoCcrrN7ajnCRi270ytD pjiTpL1ubAyR7WCzCHF3f3Ph3VWTyrzzd0la0FmY5ZSx5QgQ5FxSrtPs/0rIrbfkLCB2 vz/w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@cerno.tech header.s=fm3 header.b=YMrEz4K5; dkim=pass header.i=@messagingengine.com header.s=fm2 header.b=JWvn+CUv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=cerno.tech Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id oz31-20020a1709077d9f00b0093defbd6280si17401146ejc.1031.2023.03.29.12.45.20; Wed, 29 Mar 2023 12:45:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@cerno.tech header.s=fm3 header.b=YMrEz4K5; dkim=pass header.i=@messagingengine.com header.s=fm2 header.b=JWvn+CUv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=cerno.tech Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229453AbjC2Tob (ORCPT + 99 others); Wed, 29 Mar 2023 15:44:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53290 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230335AbjC2ToS (ORCPT ); Wed, 29 Mar 2023 15:44:18 -0400 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 73C1F55A3; Wed, 29 Mar 2023 12:43:48 -0700 (PDT) Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.west.internal (Postfix) with ESMTP id 287C4320082A; Wed, 29 Mar 2023 15:43:23 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Wed, 29 Mar 2023 15:43:24 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cerno.tech; h=cc :cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to; s=fm3; t= 1680119002; x=1680205402; bh=7fJNXQMgupYfTfzXjUT/VSxxmReLzs9mghv 5xKeEkEw=; b=YMrEz4K5HIombsOqi/BTS1U0U4a3C24Tq0rTiALpR3++DDywfut dNjr+s/KXnf+PikPClXtjnLLVzDd9jIK1ZRCBie3TgpEMdqznPRRuDPmlezlBwsK peKslDAwg2DPt1EQRxW/L3N3PhUMnmEj+2pAfK9YriS0NRY1zSnKocqVRv3eddTt xh5upaamzA7PW9MDpV2vB+UHoPtu5u4EfYD9fNC45nu7Lj+YC3u5zny72gAy9/br QcSeZUK7pvHuaW8rpiwP6IF2DwrNkwojY+Zkkqtbh8eUj3I3nMdUEsz2W8kiW27Z hoTyuMFY+d8cZtsW60B+RmaYZwzkLqQALcw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:sender:subject:subject:to:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t= 1680119002; x=1680205402; bh=7fJNXQMgupYfTfzXjUT/VSxxmReLzs9mghv 5xKeEkEw=; b=JWvn+CUvHHVcyJJTBhbShimHD82B8geTFc9Nk45mijBxErPVsC3 NstcM5fy+2prNS3WY8fVkFSw7kkNAhLLq9QF9fDX4qoC+dOzUO98g+BMlHXooL95 f7gbLBcmFu0HF9weDBybkEjUbWw4STdgmLoy6bOyo5vOko11mxXQ8O9s69oAWuSx mKfP4842tuxR2IFSteFWQNfVBtmDqOnqGOAm1IwHX35Kv33QRGMn5IjJdqI7L2xW S0hmUbmarn3QfPOK5q+BYu77axj5PI2OWBRVQnh/mOMGQ0dKaG70BN+ePdx45Qmy FGrUzaopbsDVpxgZLTrHnpsq8GJr9u4Qlcg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrvdehiedgudegvdcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpeffhffvvefukfhfgggtugfgjgesthhqredttddtvdenucfhrhhomhepofgr gihimhgvucftihhprghrugcuoehmrgigihhmvgestggvrhhnohdrthgvtghhqeenucggtf frrghtthgvrhhnpeeitdeuffevieeufedtuddvffffffegfffgkeeihfelleektdelhfev hfduudfhgfenucffohhmrghinhepkhgvrhhnvghlrdhorhhgnecuvehluhhsthgvrhfuih iivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepmhgrgihimhgvsegtvghrnhhordht vggthh X-ME-Proxy: Feedback-ID: i8771445c:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 29 Mar 2023 15:43:21 -0400 (EDT) Date: Wed, 29 Mar 2023 21:43:20 +0200 From: Maxime Ripard To: David Gow Cc: Matti Vaittinen , Greg Kroah-Hartman , Matti Vaittinen , "Rafael J. Wysocki" , Brendan Higgins , Andy Shevchenko , Heikki Krogerus , linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, kunit-dev@googlegroups.com, Stephen Boyd , Jonathan Cameron , linux-iio@vger.kernel.org Subject: Re: [PATCH v5 1/8] drivers: kunit: Generic helpers for test device creation Message-ID: <20230329194320.who2gobizgffbzqz@penduick> References: <20230323101216.w56kz3rudlj23vab@houat> <8a03a6fb-39b9-cd17-cc10-ece71111357d@gmail.com> <20230323122925.kqdnomr7i46qnyo4@houat> <590189b3-42d9-ab12-fccd-37338595cb6f@gmail.com> <20230323163639.xtwpid2uunwnzai4@houat> <20230324123157.bbwvfq4gsxnlnfwb@houat> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable In-Reply-To: X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,RCVD_IN_MSPIKE_H2, SPF_HELO_PASS,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi David, On Sat, Mar 25, 2023 at 01:40:01PM +0800, David Gow wrote: > On Fri, 24 Mar 2023 at 20:32, Maxime Ripard wrote: > > > > On Fri, Mar 24, 2023 at 08:11:52AM +0200, Matti Vaittinen wrote: > > > On 3/23/23 18:36, Maxime Ripard wrote: > > > > On Thu, Mar 23, 2023 at 03:02:03PM +0200, Matti Vaittinen wrote: > > > > > On 3/23/23 14:29, Maxime Ripard wrote: > > > > > > On Thu, Mar 23, 2023 at 02:16:52PM +0200, Matti Vaittinen wrote: > > > > > > > > > > > > This is the description of what was happening: > > > > > > https://lore.kernel.org/dri-devel/20221117165311.vovrc7usy4efiy= tl@houat/ > > > > > > > > > > Thanks Maxime. Do I read this correcty. The devm_ unwinding not b= eing done > > > > > when root_device_register() is used is not because root_device_un= register() > > > > > would not trigger the unwinding - but rather because DRM code on = top of this > > > > > device keeps the refcount increased? > > > > > > > > There's a difference of behaviour between a root_device and any dev= ice > > > > with a bus: the root_device will only release the devm resources wh= en > > > > it's freed (in device_release), but a bus device will also do it in > > > > device_del (through bus_remove_device() -> device_release_driver() = -> > > > > device_release_driver_internal() -> __device_release_driver() -> > > > > device_unbind_cleanup(), which are skipped (in multiple places) if > > > > there's no bus and no driver attached to the device). > > > > > > > > It does affect DRM, but I'm pretty sure it will affect any framework > > > > that deals with device hotplugging by deferring the framework struc= ture > > > > until the last (userspace) user closes its file descriptor. So I'd > > > > assume that v4l2 and cec at least are also affected, and most likely > > > > others. > > > > > > Thanks for the explanation and patience :) > > > > > > > > > > > > If this is the case, then it sounds like a DRM specific issue to = me. > > > > > > > > I mean, I guess. One could also argue that it's because IIO doesn't > > > > properly deal with hotplugging. > > > > > > I must say I haven't been testing the IIO registration API. I've only= tested > > > the helper API which is not backed up by any "IIO device". (This is f= ine for > > > the helper because it must by design be cleaned-up only after the > > > IIO-deregistration). > > > > > > After your explanation here, I am not convinced IIO wouldn't see the = same > > > issue if I was testing the devm_iio_device_alloc() & co. > > > > It depends really. The issue DRM is trying to solve is that, when a > > device is gone, some application might still have an open FD and could > > still poke into the kernel, while all the resources would have been > > free'd if it was using devm. > > > > So everything is kept around until the last fd is closed, so you still > > have a reference to the device (even though it's been removed from its > > bus) until that time. > > > > It could be possible that IIO just doesn't handle that case at all. I > > guess most of the devices aren't hotpluggable, and there's not much to > > interact with from a userspace PoV iirc, so it might be why. > > > > > > I'm not sure how that helps. Those are > > > > common helpers which should accommodate every framework, > > > > > > Ok. Fair enough. Besides, if the root-device was sufficient - then I = would > > > actually not see the need for a helper. People could in that case dir= ectly > > > use the root_device_register(). So, if helpers are provided they shou= ld be > > > backed up by a device with a bus then. > > > > > > > and your second > > > > patch breaks the kunit tests for DRM anyway. > > > > > > Oh, I must have made an error there. It was supposed to be just a > > > refactoring with no functional changes. Sorry about that. Anyways, th= at > > > patch can be forgotten as Greg opposes using the platform devices in = generic > > > helpers. > > > > > > > > Whether it is a feature or bug is beyond my knowledge. Still, I w= ould > > > > > not say using the root_device_[un]register() in generic code is n= ot > > > > > feasible - unless all other subsytems have similar refcount handl= ing. > > > > > > > > > > Sure thing using root_device_register() root_device_unregister() = in DRM does > > > > > not work as such. This, however, does not mean the generic kunit = helpers > > > > > should use platform_devices to force unwinding? > > > > > > > > platform_devices were a quick way to get a device that would have a= bus > > > > and a driver bound to fall into the right patch above. We probably > > > > shouldn't use platform_devices and a kunit_device sounds like the b= est > > > > idea, but the test linked in the original mail I pointed you to sho= uld > > > > work with whatever we come up with. It works with multiple (platfor= m, > > > > PCI, USB, etc) buses, so the mock we create should behave like their > > > > real world equivalents. > > > > > > Thanks for the patience and the explanation. Now I understand a gener= ic test > > > device needs to sit on a bus. > > > > > > As I said, in my very specific IIO related test the test device does = not > > > need a bus. Hence I'll drop the 'generic helpers' from this series. > > > > So, I went around and created a bunch of kunit tests that shows the > > problem without DRM being involved at all. > > > > It does three things: > > > > - It registers a device, attaches a devm action, unregisters the device > > and then checks that the action has ran. > > > > - It registers a device, gets a reference to it, attaches a devm > > action, puts back the reference, unregisters the device and then > > checks that the action has ran. > > > > - It registers a device, gets a reference to it, attaches a devm action > > that will put back the reference, unregisters the device and then > > checks that the action has ran. > > > > And in three cases: first with a root_device, then platform_device, then > > a platform_device that has been bound to a driver. > > > > Once you've applied that patch, you can run it using: > > > > ./tools/testing/kunit/kunit.py run --kunitconfig=3Ddrivers/base/test/ d= evm-inconsistencies > > > > You'll see that only the last case passes all the tests, even though the > > code itself is exactly the same. > > >=20 > This illustrates the problem very nicely, thanks. >=20 > I played around a bit with this, and I'm definitely leaning towards > this being a bug, rather than intentional behaviour. There's actually > an explicit call to devres_release_all() in device_release() which > seems to suggest that this should work: > /* > * Some platform devices are driven without driver attached > * and managed resources may have been acquired. Make sure > * all resources are released. > * > * Drivers still can add resources into device after device > * is deleted but alive, so release devres here to avoid > * possible memory leak. > */ >=20 > My "solution" is just to call devres_release_all() in device_del() as > well, which fixes your tests (and the drm-test-managed one when ported > to use root_device_register() or my kunit_device_register() API[1]). >=20 > --8<-- > diff --git a/drivers/base/core.c b/drivers/base/core.c > index 6878dfcbf0d6..adfec6185014 100644 > --- a/drivers/base/core.c > +++ b/drivers/base/core.c > @@ -3778,6 +3778,17 @@ void device_del(struct device *dev) > device_platform_notify_remove(dev); > device_links_purge(dev); >=20 > + /* > + * If a device does not have a driver attached, we need to clean = up any > + * managed resources. We do this in device_release(), but it's ne= ver > + * called (and we leak the device) if a managed resource holds a > + * reference to the device. So release all managed resources here, > + * like we do in driver_detach(). We still need to do so again in > + * device_release() in case someone adds a new resource after this > + * point, though. > + */ > + devres_release_all(dev); > + > bus_notify(dev, BUS_NOTIFY_REMOVED_DEVICE); > kobject_uevent(&dev->kobj, KOBJ_REMOVE); > glue_dir =3D get_glue_dir(dev); >=20 > -->8-- >=20 > It doesn't _seem_ to break anything else, and I've managed to convince > myself that it's probably the correct fix. Yeah, as an outsider, I came to the same conclusion last time... > (Albeit, as someone with a limited knowledge of this part of the code, > who also hasn't had quite enough sleep, so take that with some salt.) =2E.. but as someone that also have a limited knowledge of that part of the code, I certainly wasn't sure it was the proper thing to do :) > Still, I'd agree with Greg that it'd be great to have versions of your > tests upstream before making any such radical changes. I just submitted them. Thanks! Maxime