>On Mon, Apr 30, 2012 at 11:51:49AM -0400, Greg KH wrote:
>> On Mon, Apr 30, 2012 at 11:50:18AM -0400, Greg KH wrote:
>> > On Mon, Apr 30, 2012 at 11:36:23AM -0400, Konrad Rzeszutek Wilk wrote:
>> > > Hey Greg,
>> > >
>> > > Hoping you can help with some guidance on how to fix this.
>> > >
>> > > The issue is with CPU hotplug is that when a CPU goes up
>> > > it calls 'arch_register_cpu' which eventually calls
>> > > register_cpu. That function does these two things:
>> > >
>> > > 251 error = device_register(&cpu->dev);
>> > > 252 if (!error && cpu->hotpluggable)
>> > > 253 register_cpu_control(cpu);
>> > >
>> > > and the device_register creates a nice little SysFS directory:
>> > >
>> > > /sys/devices/system/cpu/cpu2/ which at line 251 has the 'add' attribute
>> > > but no 'online' attribute. udev then tries to echo 1 to the 'online'
>> > > and it we get:
>> > > udevd-work[2421]: error opening ATTR{/sys/devices/system/cpu/cpu2/online} for writing: No such file or directory
>> > >
>> > > Line 253 creates said 'online' and at that time udev [or the system admin]
>> > > can write 1 to 'online' and the CPU goes up.
>> > >
>> > > So .. any thoughts? Is there some way to inhibit from uevent being sent
>> > > until line 253 has run?
>> >
>> > Yes.
>>
>> Oh, I imagine you want to know _how_ to do it too, right? (sorry, I
>> couldn't resist...)
>
>Heh.
>>
>> Make this a default attribute of the cpu device, and then it will be
>> created by the driver core before the uevent is sent to userspace.
>> That's what you are supposed to do in the first place, adding files "by
>> hand" is wrong, for this very reason.
>
>OK, will prep up a patch shortly.
Hello Konrad,
Is there a posted/accepted patch or idea was dropped?
On Fri, May 10, 2013 at 06:34:34PM +0200, Igor Mammedov wrote:
> >On Mon, Apr 30, 2012 at 11:51:49AM -0400, Greg KH wrote:
> >> On Mon, Apr 30, 2012 at 11:50:18AM -0400, Greg KH wrote:
> >> > On Mon, Apr 30, 2012 at 11:36:23AM -0400, Konrad Rzeszutek Wilk wrote:
> >> > > Hey Greg,
> >> > >
> >> > > Hoping you can help with some guidance on how to fix this.
> >> > >
> >> > > The issue is with CPU hotplug is that when a CPU goes up
> >> > > it calls 'arch_register_cpu' which eventually calls
> >> > > register_cpu. That function does these two things:
> >> > >
> >> > > 251 error = device_register(&cpu->dev);
> >> > > 252 if (!error && cpu->hotpluggable)
> >> > > 253 register_cpu_control(cpu);
> >> > >
> >> > > and the device_register creates a nice little SysFS directory:
> >> > >
> >> > > /sys/devices/system/cpu/cpu2/ which at line 251 has the 'add' attribute
> >> > > but no 'online' attribute. udev then tries to echo 1 to the 'online'
> >> > > and it we get:
> >> > > udevd-work[2421]: error opening ATTR{/sys/devices/system/cpu/cpu2/online} for writing: No such file or directory
> >> > >
> >> > > Line 253 creates said 'online' and at that time udev [or the system admin]
> >> > > can write 1 to 'online' and the CPU goes up.
> >> > >
> >> > > So .. any thoughts? Is there some way to inhibit from uevent being sent
> >> > > until line 253 has run?
> >> >
> >> > Yes.
> >>
> >> Oh, I imagine you want to know _how_ to do it too, right? (sorry, I
> >> couldn't resist...)
> >
> >Heh.
> >>
> >> Make this a default attribute of the cpu device, and then it will be
> >> created by the driver core before the uevent is sent to userspace.
> >> That's what you are supposed to do in the first place, adding files "by
> >> hand" is wrong, for this very reason.
> >
> >OK, will prep up a patch shortly.
>
> Hello Konrad,
>
> Is there a posted/accepted patch or idea was dropped?
Hey Igor,
CC-ing here Chuck here who is looking at that. My recollection (and Chuck please
correct me if I am wrong) is that the idea Greg outlined won't work very well.
The reason is that making 'online' an attribute of 'struct dev' means that:
1) A lot of SysFS attributes that have no notion of online/offline (say ISA bus)
will now have.
2) The default action item (so function) to do something based on writting/reading
from 'online' will have to be overridden by the driver using it. Which means
another race - we can create an SysFS attribute but the default points to something
that does nothing.
Chuck, does that sound right?
>
>
On Mon, May 13, 2013 at 09:31:28AM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, May 10, 2013 at 06:34:34PM +0200, Igor Mammedov wrote:
> > >On Mon, Apr 30, 2012 at 11:51:49AM -0400, Greg KH wrote:
> > >> On Mon, Apr 30, 2012 at 11:50:18AM -0400, Greg KH wrote:
> > >> > On Mon, Apr 30, 2012 at 11:36:23AM -0400, Konrad Rzeszutek Wilk wrote:
> > >> > > Hey Greg,
> > >> > >
> > >> > > Hoping you can help with some guidance on how to fix this.
> > >> > >
> > >> > > The issue is with CPU hotplug is that when a CPU goes up
> > >> > > it calls 'arch_register_cpu' which eventually calls
> > >> > > register_cpu. That function does these two things:
> > >> > >
> > >> > > 251 error = device_register(&cpu->dev);
> > >> > > 252 if (!error && cpu->hotpluggable)
> > >> > > 253 register_cpu_control(cpu);
> > >> > >
> > >> > > and the device_register creates a nice little SysFS directory:
> > >> > >
> > >> > > /sys/devices/system/cpu/cpu2/ which at line 251 has the 'add' attribute
> > >> > > but no 'online' attribute. udev then tries to echo 1 to the 'online'
> > >> > > and it we get:
> > >> > > udevd-work[2421]: error opening ATTR{/sys/devices/system/cpu/cpu2/online} for writing: No such file or directory
> > >> > >
> > >> > > Line 253 creates said 'online' and at that time udev [or the system admin]
> > >> > > can write 1 to 'online' and the CPU goes up.
> > >> > >
> > >> > > So .. any thoughts? Is there some way to inhibit from uevent being sent
> > >> > > until line 253 has run?
> > >> >
> > >> > Yes.
> > >>
> > >> Oh, I imagine you want to know _how_ to do it too, right? (sorry, I
> > >> couldn't resist...)
> > >
> > >Heh.
> > >>
> > >> Make this a default attribute of the cpu device, and then it will be
> > >> created by the driver core before the uevent is sent to userspace.
> > >> That's what you are supposed to do in the first place, adding files "by
> > >> hand" is wrong, for this very reason.
> > >
> > >OK, will prep up a patch shortly.
> >
> > Hello Konrad,
> >
> > Is there a posted/accepted patch or idea was dropped?
>
> Hey Igor,
>
> CC-ing here Chuck here who is looking at that. My recollection (and Chuck please
> correct me if I am wrong) is that the idea Greg outlined won't work very well.
> The reason is that making 'online' an attribute of 'struct dev' means that:
> 1) A lot of SysFS attributes that have no notion of online/offline (say ISA bus)
> will now have.
Then only create that attribute for busses that ask for it to be enabled.
> 2) The default action item (so function) to do something based on
> writting/reading from 'online' will have to be overridden by the
> driver using it. Which means another race - we can create an SysFS
> attribute but the default points to something that does nothing.
I don't understand this at all, how will there be a race exactly?
greg k-h
"crash_notes" is dinamically created with device_create_file() but
isn't anywhere deleted.
Define "crash_notes" statically via attribute groups so that
device_register would create it automatically and file would be
destroyed when CPU is destroyed.
Signed-off-by: Igor Mammedov <[email protected]>
---
drivers/base/cpu.c | 21 +++++++++++++++++----
1 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index fb10728..02e4d73 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -132,8 +132,24 @@ static ssize_t show_crash_notes(struct device *dev, struct device_attribute *att
return rc;
}
static DEVICE_ATTR(crash_notes, 0400, show_crash_notes, NULL);
+
+static struct attribute *crash_note_cpu_attrs[] = {
+ &dev_attr_crash_notes.attr,
+ NULL
+};
+
+static struct attribute_group crash_note_cpu_attr_group = {
+ .attrs = crash_note_cpu_attrs,
+};
#endif
+static const struct attribute_group *common_cpu_attr_groups[] = {
+#ifdef CONFIG_KEXEC
+ &crash_note_cpu_attr_group,
+#endif
+ NULL
+};
+
/*
* Print cpu online, possible, present, and system maps
*/
@@ -248,6 +264,7 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
#ifdef CONFIG_ARCH_HAS_CPU_AUTOPROBE
cpu->dev.bus->uevent = arch_cpu_uevent;
#endif
+ cpu->dev.groups = common_cpu_attr_groups;
error = device_register(&cpu->dev);
if (!error && cpu->hotpluggable)
register_cpu_control(cpu);
@@ -256,10 +273,6 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
if (!error)
register_cpu_under_node(num, cpu_to_node(num));
-#ifdef CONFIG_KEXEC
- if (!error)
- error = device_create_file(&cpu->dev, &dev_attr_crash_notes);
-#endif
return error;
}
--
1.7.1
Here is a crude attempt fix race the way suggested by Greg,
probably done wrong but hopefully in the right direction.
1. move "crash_notes" to static attributes to guarantee that it's
destroyed with CPU on unregister.
2. fixes race between hotplugged CPU and onlining it via udev, described here
https://lkml.org/lkml/2012/4/30/193
Igor Mammedov (2):
cpu: fix crash_notes leak
cpu: make sure that cpu/online file created before KOBJ_ADD is
emitted
drivers/base/cpu.c | 55 +++++++++++++++++++++++++++++++++++----------------
1 files changed, 38 insertions(+), 17 deletions(-)
Fixes race between udev and hotplugged CPU registration, described at
https://lkml.org/lkml/2012/4/30/198
by defining "online" attribute statically so that device_add() would
create it before notifying udev about new CPU.
Signed-off-by: Igor Mammedov <[email protected]>
---
drivers/base/cpu.c | 34 +++++++++++++++++++++-------------
1 files changed, 21 insertions(+), 13 deletions(-)
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 02e4d73..6578030 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -64,18 +64,21 @@ static ssize_t __ref store_online(struct device *dev,
}
static DEVICE_ATTR(online, 0644, show_online, store_online);
-static void __cpuinit register_cpu_control(struct cpu *cpu)
-{
- device_create_file(&cpu->dev, &dev_attr_online);
-}
+static struct attribute *hotplug_cpu_attrs[] = {
+ &dev_attr_online.attr,
+ NULL
+};
+
+static struct attribute_group hotplug_cpu_attr_group = {
+ .attrs = hotplug_cpu_attrs,
+};
+
void unregister_cpu(struct cpu *cpu)
{
int logical_cpu = cpu->dev.id;
unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
- device_remove_file(&cpu->dev, &dev_attr_online);
-
device_unregister(&cpu->dev);
per_cpu(cpu_sys_devices, logical_cpu) = NULL;
return;
@@ -101,11 +104,6 @@ static ssize_t cpu_release_store(struct device *dev,
static DEVICE_ATTR(probe, S_IWUSR, NULL, cpu_probe_store);
static DEVICE_ATTR(release, S_IWUSR, NULL, cpu_release_store);
#endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
-
-#else /* ... !CONFIG_HOTPLUG_CPU */
-static inline void register_cpu_control(struct cpu *cpu)
-{
-}
#endif /* CONFIG_HOTPLUG_CPU */
#ifdef CONFIG_KEXEC
@@ -150,6 +148,16 @@ static const struct attribute_group *common_cpu_attr_groups[] = {
NULL
};
+static const struct attribute_group *hotplugable_cpu_attr_groups[] = {
+#ifdef CONFIG_KEXEC
+ &crash_note_cpu_attr_group,
+#endif
+#ifdef CONFIG_HOTPLUG_CPU
+ &hotplug_cpu_attr_group,
+#endif
+ NULL
+};
+
/*
* Print cpu online, possible, present, and system maps
*/
@@ -265,9 +273,9 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
cpu->dev.bus->uevent = arch_cpu_uevent;
#endif
cpu->dev.groups = common_cpu_attr_groups;
+ if (cpu->hotpluggable)
+ cpu->dev.groups = hotplugable_cpu_attr_groups;
error = device_register(&cpu->dev);
- if (!error && cpu->hotpluggable)
- register_cpu_control(cpu);
if (!error)
per_cpu(cpu_sys_devices, num) = &cpu->dev;
if (!error)
--
1.7.1
On Tue, May 14, 2013 at 12:05:31AM +0200, Igor Mammedov wrote:
> "crash_notes" is dinamically created with device_create_file() but
^^^^^^^^^^^-dynamically
> isn't anywhere deleted.
> Define "crash_notes" statically via attribute groups so that
> device_register would create it automatically and file would be
> destroyed when CPU is destroyed.
I had to modify it a bit to work with v3.10-rc1 (It is missing
the dev_attr_crash_notes_size), but otherwise it worked nicely.
>
> Signed-off-by: Igor Mammedov <[email protected]>
> ---
> drivers/base/cpu.c | 21 +++++++++++++++++----
> 1 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index fb10728..02e4d73 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -132,8 +132,24 @@ static ssize_t show_crash_notes(struct device *dev, struct device_attribute *att
> return rc;
> }
> static DEVICE_ATTR(crash_notes, 0400, show_crash_notes, NULL);
> +
> +static struct attribute *crash_note_cpu_attrs[] = {
> + &dev_attr_crash_notes.attr,
> + NULL
> +};
> +
> +static struct attribute_group crash_note_cpu_attr_group = {
> + .attrs = crash_note_cpu_attrs,
> +};
> #endif
>
> +static const struct attribute_group *common_cpu_attr_groups[] = {
> +#ifdef CONFIG_KEXEC
> + &crash_note_cpu_attr_group,
> +#endif
> + NULL
> +};
> +
> /*
> * Print cpu online, possible, present, and system maps
> */
> @@ -248,6 +264,7 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
> #ifdef CONFIG_ARCH_HAS_CPU_AUTOPROBE
> cpu->dev.bus->uevent = arch_cpu_uevent;
> #endif
> + cpu->dev.groups = common_cpu_attr_groups;
> error = device_register(&cpu->dev);
> if (!error && cpu->hotpluggable)
> register_cpu_control(cpu);
> @@ -256,10 +273,6 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
> if (!error)
> register_cpu_under_node(num, cpu_to_node(num));
>
> -#ifdef CONFIG_KEXEC
> - if (!error)
> - error = device_create_file(&cpu->dev, &dev_attr_crash_notes);
> -#endif
> return error;
> }
>
> --
> 1.7.1
>
On Tue, May 14, 2013 at 12:05:32AM +0200, Igor Mammedov wrote:
> Fixes race between udev and hotplugged CPU registration, described at
> https://lkml.org/lkml/2012/4/30/198
It would be better if you copy-n-pasted the contents of that URL in here.
> by defining "online" attribute statically so that device_add() would
> create it before notifying udev about new CPU.
>
> Signed-off-by: Igor Mammedov <[email protected]>
> ---
> drivers/base/cpu.c | 34 +++++++++++++++++++++-------------
> 1 files changed, 21 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 02e4d73..6578030 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -64,18 +64,21 @@ static ssize_t __ref store_online(struct device *dev,
> }
> static DEVICE_ATTR(online, 0644, show_online, store_online);
>
> -static void __cpuinit register_cpu_control(struct cpu *cpu)
> -{
> - device_create_file(&cpu->dev, &dev_attr_online);
> -}
> +static struct attribute *hotplug_cpu_attrs[] = {
> + &dev_attr_online.attr,
> + NULL
> +};
> +
> +static struct attribute_group hotplug_cpu_attr_group = {
> + .attrs = hotplug_cpu_attrs,
> +};
> +
> void unregister_cpu(struct cpu *cpu)
> {
> int logical_cpu = cpu->dev.id;
>
> unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
>
> - device_remove_file(&cpu->dev, &dev_attr_online);
> -
> device_unregister(&cpu->dev);
> per_cpu(cpu_sys_devices, logical_cpu) = NULL;
> return;
> @@ -101,11 +104,6 @@ static ssize_t cpu_release_store(struct device *dev,
> static DEVICE_ATTR(probe, S_IWUSR, NULL, cpu_probe_store);
> static DEVICE_ATTR(release, S_IWUSR, NULL, cpu_release_store);
> #endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
> -
> -#else /* ... !CONFIG_HOTPLUG_CPU */
> -static inline void register_cpu_control(struct cpu *cpu)
> -{
> -}
> #endif /* CONFIG_HOTPLUG_CPU */
>
> #ifdef CONFIG_KEXEC
> @@ -150,6 +148,16 @@ static const struct attribute_group *common_cpu_attr_groups[] = {
> NULL
> };
>
> +static const struct attribute_group *hotplugable_cpu_attr_groups[] = {
> +#ifdef CONFIG_KEXEC
> + &crash_note_cpu_attr_group,
> +#endif
> +#ifdef CONFIG_HOTPLUG_CPU
> + &hotplug_cpu_attr_group,
> +#endif
> + NULL
> +};
> +
> /*
> * Print cpu online, possible, present, and system maps
> */
> @@ -265,9 +273,9 @@ int __cpuinit register_cpu(struct cpu *cpu, int num)
> cpu->dev.bus->uevent = arch_cpu_uevent;
> #endif
> cpu->dev.groups = common_cpu_attr_groups;
> + if (cpu->hotpluggable)
> + cpu->dev.groups = hotplugable_cpu_attr_groups;
> error = device_register(&cpu->dev);
> - if (!error && cpu->hotpluggable)
> - register_cpu_control(cpu);
> if (!error)
> per_cpu(cpu_sys_devices, num) = &cpu->dev;
> if (!error)
> --
> 1.7.1
>
On Tue, May 14, 2013 at 12:05:30AM +0200, Igor Mammedov wrote:
> Here is a crude attempt fix race the way suggested by Greg,
> probably done wrong but hopefully in the right direction.
Weird, I thought I had tried that at first but got tons of kobject
warnings and such. But I think I tried to add it to kset instead of
the one you did.
It fixes it for me so
Tested-by: Konrad Rzeszutek Wilk <[email protected]>
and also (thought the git commit descriptions need a bit of work,
but that is expected as an RFC patch):
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
>
> 1. move "crash_notes" to static attributes to guarantee that it's
> destroyed with CPU on unregister.
>
> 2. fixes race between hotplugged CPU and onlining it via udev, described here
> https://lkml.org/lkml/2012/4/30/193
>
> Igor Mammedov (2):
> cpu: fix crash_notes leak
> cpu: make sure that cpu/online file created before KOBJ_ADD is
> emitted
>
> drivers/base/cpu.c | 55 +++++++++++++++++++++++++++++++++++----------------
> 1 files changed, 38 insertions(+), 17 deletions(-)
>
On Tue, May 14, 2013 at 12:05:30AM +0200, Igor Mammedov wrote:
> Here is a crude attempt fix race the way suggested by Greg,
> probably done wrong but hopefully in the right direction.
>
> 1. move "crash_notes" to static attributes to guarantee that it's
> destroyed with CPU on unregister.
>
> 2. fixes race between hotplugged CPU and onlining it via udev, described here
> https://lkml.org/lkml/2012/4/30/193
Looks right to me, nice job.
greg k-h