Date: Mon, 20 Jul 2015 11:36:24 +0100
From: Russell King - ARM Linux
To: Viresh Kumar
Cc: Rafael Wysocki, linaro-kernel@lists.linaro.org, linux-pm@vger.kernel.org, open list
Subject: Re: [PATCH] cpufreq: Avoid double addition/removal of sysfs links
Message-ID: <20150720103623.GQ7557@n2100.arm.linux.org.uk>
References: <20150718163149.GP7557@n2100.arm.linux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 20, 2015 at 03:17:10PM +0530, Viresh Kumar wrote:
> Consider a dual core (0/1) system with two CPUs:
> - sharing clock/voltage rails and hence cpufreq-policy
> - CPU1 is offline while the cpufreq driver is registered
>
> - cpufreq_add_dev() is called from subsys callback for CPU0 and we
>   create the policy for the CPUs and create link for CPU1.
> - cpufreq_add_dev() is called from subsys callback for CPU1, we find
>   that the cpu is offline and we try to create a sysfs link for CPU1.
> - This results in double addition of the sysfs link and we get this:
>
> WARNING: CPU: 0 PID: 1 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x60/0x7c()
> sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.2.0-rc2+ #1704
> Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> Backtrace:
> [] (dump_backtrace) from [] (show_stack+0x18/0x1c)
>  r6:c01a1f30 r5:0000001f r4:00000000 r3:00000000
> [] (show_stack) from [] (dump_stack+0x7c/0x98)
> [] (dump_stack) from [] (warn_slowpath_common+0x80/0xbc)
>  r4:d74abbd0 r3:d74c0000
> [] (warn_slowpath_common) from [] (warn_slowpath_fmt+0x38/0x40)
>  r8:ffffffef r7:00000000 r6:d75a8960 r5:c0993280 r4:d6b4d000
> [] (warn_slowpath_fmt) from [] (sysfs_warn_dup+0x60/0x7c)
>  r3:d6b4dfe7 r2:c0930750
> [] (sysfs_warn_dup) from [] (sysfs_do_create_link_sd+0xb8/0xc0)
>  r6:d75a8960 r5:c0993280 r4:d00aba20
> [] (sysfs_do_create_link_sd) from [] (sysfs_create_link+0x2c/0x3c)
>  r10:00000001 r8:c14db3c8 r7:d7b89010 r6:c0ae7c60 r5:d7b89010 r4:d00d1200
> [] (sysfs_create_link) from [] (add_cpu_dev_symlink+0x34/0x5c)
> [] (add_cpu_dev_symlink) from [] (cpufreq_add_dev+0x674/0x794)
>  r5:00000001 r4:00000000
> [] (cpufreq_add_dev) from [] (subsys_interface_register+0x8c/0xd0)
>  r10:00000003 r9:d7bb01f0 r8:c14db3c8 r7:00106738 r6:c0ae7c60 r5:c0acbd08
>  r4:c0ae7e20
> [] (subsys_interface_register) from [] (cpufreq_register_driver+0x104/0x1f4)
>
> The check for offline-cpu in cpufreq_add_dev() is present to ensure that
> link gets added for the CPUs, that weren't physically present earlier
> and we missed the case where a CPU is offline while registering the
> driver.
>
> Fix this by keeping track of CPUs for which link is already created, and
> avoiding duplicate sysfs entries.

Why do we try to create the symlink for CPU devices which we haven't
"detected" yet (iow, we haven't had cpufreq_add_dev() called for)?
Surely we are guaranteed to have cpufreq_add_dev() called for every CPU
which exists in sysfs? So why not _only_ create the sysfs symlinks when
cpufreq_add_dev() is notified that a CPU subsys interface is present?

Sure, if the policy changes, we need to do maintenance on these symlinks,
but I see only one path down into cpufreq_add_dev_symlink(), which is:

cpufreq_add_dev()
 -> cpufreq_add_dev_interface()
  -> cpufreq_add_dev_symlink()

In other words, only when we see a new CPU interface appear, not when
the policy changes.

If the set of related CPUs is policy independent, why is this information
carried in the cpufreq_policy struct?

If it is policy dependent, then I see no code which handles the effect of
a policy change where the policy->related_cpus is different.

To me, that sounds like a rather huge design hole.

Things get worse. Reading drivers/base/cpu.c, CPU interface nodes are
only ever created - they're created for the set of _possible_ CPUs in the
system, not those which are possible and present, and there is no
unregister_cpu() API, only a register_cpu() API.

So, cpufreq_remove_dev() won't be called for CPUs which were present and
are no longer present. This appears to be a misunderstanding of CPU
hotplug...

So, cpufreq_remove_dev() will only get called when you call
subsys_interface_unregister(), not when the CPU present mask changes.
I suspect that the code in cpufreq_remove_dev() dealing with "offline"
CPUs even works...

I'd recommend reading Documentation/cpu-hotplug.txt:

| cpu_present_mask: Bitmap of CPUs currently present in the system. Not all
| of them may be online. When physical hotplug is processed by the relevant
| subsystem (e.g ACPI) can change and new bit either be added or removed
| from the map depending on the event is hot-add/hot-remove. There are
| currently no locking rules as of now. Typical usage is to init topology
| during boot, at which time hotplug is disabled.
|
| You really dont need to manipulate any of the system cpu maps.
| They should be read-only for most use. When setting up per-cpu resources
| almost always use cpu_possible_mask/for_each_possible_cpu() to iterate.

In other words, I think your usage of cpu_present_mask in this code is
buggy in itself.

Please rethink the design of this code - I think your original change is
mis-designed.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.