Date: Mon, 4 Oct 2021 16:20:55 +0300
From: Heikki Krogerus
To: Kent Gibson
Cc: Greg Kroah-Hartman, Andy Shevchenko, "Rafael J. Wysocki",
 linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org,
 Bartosz Golaszewski
Subject: Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup
References: <20211004093416.GA2513199@sol> <20211004121942.GA3343713@sol>
 <20211004124701.GA3418302@sol>
In-Reply-To: <20211004124701.GA3418302@sol>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > Hi,
> > > > >
> > > > > I'm seeing a refcount underflow when I unload the gpio-mockup module on
> > > > > Linux v5.15-rc4 (and going back to v5.15-rc1):
> > > > >
> > > > > # modprobe gpio-mockup gpio_mockup_ranges=-1,4,-1,10
> > > > > # rmmod gpio-mockup
> > > > > ------------[ cut here ]------------
> > > > > refcount_t: underflow; use-after-free.
> > > > > WARNING: CPU: 0 PID: 103 at lib/refcount.c:28 refcount_warn_saturate+0xd1/0x120
> > > > > Modules linked in: gpio_mockup(-)
> > > > > CPU: 0 PID: 103 Comm: rmmod Not tainted 5.15.0-rc4 #1
> > > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > > EIP: refcount_warn_saturate+0xd1/0x120
> > > > > Code: e8 a2 b0 3b 00 0f 0b eb 83 80 3d db 2a 8c c1 00 0f 85 76 ff ff ff c7 04 24 88 85 78 c1 b1 01 88 0d db 2a 8c c1 e8 7d b0 3b 00 <0f> 0b e9 5b ff ff ff 80 3d d9 2a 8c c1 00 0f 85 4e ff ff ff c7 04
> > > > > EAX: 00000026 EBX: c250b100 ECX: f5fe8c28 EDX: 00000000
> > > > > ESI: c244860c EDI: c250b100 EBP: c245be84 ESP: c245be80
> > > > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000296
> > > > > CR0: 80050033 CR2: b7e3c3e1 CR3: 024ba000 CR4: 00000690
> > > > > Call Trace:
> > > > >  kobject_put+0xdc/0xf0
> > > > >  software_node_notify_remove+0xa8/0xc0
> > > > >  device_del+0x15a/0x3e0
> > > > >  ? kfree_const+0xf/0x30
> > > > >  ? kobject_put+0xa6/0xf0
> > > > >  ? module_remove_driver+0x73/0xa0
> > > > >  platform_device_del.part.0+0xf/0x80
> > > > >  platform_device_unregister+0x19/0x40
> > > > >  gpio_mockup_unregister_pdevs+0x13/0x1b [gpio_mockup]
> > > > >  gpio_mockup_exit+0x1c/0x68c [gpio_mockup]
> > > > >  __ia32_sys_delete_module+0x137/0x1e0
> > > > >  ? task_work_run+0x61/0x90
> > > > >  ? exit_to_user_mode_prepare+0x1b5/0x1c0
> > > > >  __do_fast_syscall_32+0x50/0xc0
> > > > >  do_fast_syscall_32+0x32/0x70
> > > > >  do_SYSENTER_32+0x15/0x20
> > > > >  entry_SYSENTER_32+0x98/0xe7
> > > > > EIP: 0xb7eda549
> > > > > Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
> > > > > EAX: ffffffda EBX: 0045a19c ECX: 00000800 EDX: 0045a160
> > > > > ESI: fffffffe EDI: 0045a160 EBP: bff19d08 ESP: bff19cc8
> > > > > DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00000202
> > > > > ---[ end trace 3d71387f54bc2d06 ]---
> > > > >
> > > > > I suspect this is related to the recent changes to swnode.c or
> > > > > platform.c, as gpio-mockup hasn't changed, but haven't had the
> > > > > chance to debug further.
> > > >
> > > > Any chance you can run 'git bisect' for this?
> > > >
> > >
> > > That results in:
> > >
> > > bd1e336aa8535a99f339e2d66a611984262221ce is the first bad commit
> > > commit bd1e336aa8535a99f339e2d66a611984262221ce
> > > Author: Heikki Krogerus
> > > Date: Tue Aug 17 13:24:49 2021 +0300
> > >
> > >     driver core: platform: Remove platform_device_add_properties()
> >
> > Can you test does this patch help:
> > https://lore.kernel.org/all/20210930121246.22833-3-heikki.krogerus@linux.intel.com/
> >
>
> You sure that is the patch you have in mind? It only removes dead code,
> so I don't see how that would help. And it isn't quite dead either -
> drivers/pci/quirks.c is still using device_add_properties(), so it won't
> build.

Right, so can you test with the whole series that patch is part of?

> Looking at the offending patch, it effectively replaces a call to
> device_add_properties() with one to
> device_create_managed_software_node(), and those two functions appear
> quite different - at least at first glance.
> Is that correct?

The only real difference between the two functions is that
device_create_managed_software_node() marks the software node it
creates (which it does in exactly the same way as
device_add_properties()) as "managed" with a specific flag. That means
that when the device is removed, so is the software node. It happens
when device_del() calls device_platform_notify_remove(), which then
calls software_node_notify_remove().

The problem is that after that step, device_del() then calls
device_remove_properties() unconditionally, which also attempts to
remove the software node. So you end up doing the same thing twice.

So the code in the patch that we're interested in, and that I would
like you to test, is this:

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 938cfcd1674eb..152a611a7e9ca 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3583,7 +3583,6 @@ void device_del(struct device *dev)
 	device_pm_remove(dev);
 	driver_deferred_probe_del(dev);
 	device_platform_notify_remove(dev);
-	device_remove_properties(dev);
 	device_links_purge(dev);
 
 	if (dev->bus)

thanks,

-- 
heikki
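
For illustration only, the double removal described above can be modelled in a few lines of userspace C. This is a rough sketch, not kernel code: struct toy_swnode and every toy_* function are hypothetical names made up for this example, and the real kobject/refcount_t machinery is reduced to a plain integer. It only shows why releasing a managed node in the notify step and then releasing it again unconditionally drops one reference too many, and why removing the second call avoids that.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a software node; NOT the kernel's data structure. */
struct toy_swnode {
	int refcount;	/* references currently held on the node */
	bool managed;	/* set when the node was created as "managed" */
};

static int toy_underflows;	/* puts attempted on an already released node */

/* Drop one reference; record an underflow instead of going negative. */
static void toy_put(struct toy_swnode *node)
{
	if (node->refcount <= 0) {
		toy_underflows++;
		fprintf(stderr, "toy refcount: underflow\n");
		return;
	}
	node->refcount--;
}

/* Models the notify-remove step: a managed node is released here. */
static void toy_notify_remove(struct toy_swnode *node)
{
	if (node->managed)
		toy_put(node);
}

/* Models the unconditional second removal that the patch drops. */
static void toy_remove_properties(struct toy_swnode *node)
{
	toy_put(node);
}

/* Simulate one device removal; returns the number of underflows seen. */
static int toy_device_del(bool apply_fix)
{
	struct toy_swnode node = { .refcount = 1, .managed = true };

	toy_underflows = 0;
	toy_notify_remove(&node);
	if (!apply_fix)
		toy_remove_properties(&node);
	return toy_underflows;
}
```

In this model, toy_device_del(false) hits the underflow path once, because the single reference is dropped twice, while toy_device_del(true) releases the node exactly once.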