2008-10-17 08:57:47

by Lai Jiangshan

[permalink] [raw]
Subject: [PATCH] x86/proc: fix /proc/cpuinfo cpu offline bug


In my test, I found that if a cpu has been offline,
the next cpus may not be shown in the /proc/cpuinfo.

trivially reproduce this bug:

1) add these lines in the end of show_cpuinfo()
if (m->size - m->count - 20 > 0)
seq_printf(m, "%*s", (int)(m->size - m->count - 20), "show bug\n");

2) rebuilt kernel & reboot
3) offline cpu#1
4) cat /proc/cpuinfo
cpu#2 and cpu#3 .... cannot be shown in /proc/cpuinfo

Signed-off-by: Lai Jiangshan <[email protected]>
---
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index a26c480..01b1244 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -160,14 +160,16 @@ static void *c_start(struct seq_file *m, loff_t *pos)
{
if (*pos == 0) /* just in case, cpu 0 is not the first */
*pos = first_cpu(cpu_online_map);
- if ((*pos) < nr_cpu_ids && cpu_online(*pos))
+ else
+ *pos = next_cpu_nr(*pos - 1, cpu_online_map);
+ if ((*pos) < nr_cpu_ids)
return &cpu_data(*pos);
return NULL;
}

static void *c_next(struct seq_file *m, void *v, loff_t *pos)
{
- *pos = next_cpu(*pos, cpu_online_map);
+ (*pos)++;
return c_start(m, pos);
}




2008-10-17 11:38:40

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: [PATCH] x86/proc: fix /proc/cpuinfo cpu offline bug

On Fri, Oct 17, 2008 at 04:55:25PM +0800, Lai Jiangshan wrote:
> In my test, I found that if a cpu has been offline,
> the next cpus may not be shown in the /proc/cpuinfo.
>
> trivially reproduce this bug:
>
> 1) add these lines in the end of show_cpuinfo()
> if (m->size - m->count - 20 > 0)
> seq_printf(m, "%*s", (int)(m->size - m->count - 20), "show bug\n");

What is it?

Can you just show wrong /proc/cpuinfo ?

Can someone with at least 4-way box please do so?

> 3) offline cpu#1
> 4) cat /proc/cpuinfo
> cpu#2 and cpu#3 .... cannot be shown in /proc/cpuinfo

> --- a/arch/x86/kernel/cpu/proc.c
> +++ b/arch/x86/kernel/cpu/proc.c
> @@ -160,14 +160,16 @@ static void *c_start(struct seq_file *m, loff_t *pos)
> {
> if (*pos == 0) /* just in case, cpu 0 is not the first */
> *pos = first_cpu(cpu_online_map);
> - if ((*pos) < nr_cpu_ids && cpu_online(*pos))
> + else
> + *pos = next_cpu_nr(*pos - 1, cpu_online_map);
> + if ((*pos) < nr_cpu_ids)
> return &cpu_data(*pos);
> return NULL;
> }
>
> static void *c_next(struct seq_file *m, void *v, loff_t *pos)
> {
> - *pos = next_cpu(*pos, cpu_online_map);
> + (*pos)++;
> return c_start(m, pos);
> }

2008-10-17 12:45:31

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH] x86/proc: fix /proc/cpuinfo cpu offline bug

On Fri, Oct 17, 2008 at 7:41 PM, Alexey Dobriyan <[email protected]> wrote:
> On Fri, Oct 17, 2008 at 04:55:25PM +0800, Lai Jiangshan wrote:
>> In my test, I found that if a cpu has been offline,
>> the next cpus may not be shown in the /proc/cpuinfo.
>>
>> trivially reproduce this bug:
>>
>> 1) add these lines in the end of show_cpuinfo()
>> if (m->size - m->count - 20 > 0)
>> seq_printf(m, "%*s", (int)(m->size - m->count - 20), "show bug\n");
>
> What is it?
>
> Can you just show wrong /proc/cpuinfo ?
>
> Can someone with at least 4-way box please do so?

this is not wrong /proc/cpuinfo, this is an enlarged /proc/cpuinfo
this trivial example just use "show bug\n" as the enlarged content.

if you boot your machine with enough cpu, your /proc/cpuinfo
have been enlarged too, please use this:

#!/bin/sh
nr_cpus=16 # i386
for ((i=1; i<nr_cpus; i++))
do
echo 0 > /sys/devices/system/cpu/cpu$i/online
cpus=$(grep processor /proc/cpuinfo | wc -l)
(( cpus == nr_cpus -1 )) || break; # bug eccur
echo 1 > /sys/devices/system/cpu/cpu$i/online
done
cat /proc/cpuinfo # it shows this bug.

Lai

>
>> 3) offline cpu#1
>> 4) cat /proc/cpuinfo
>> cpu#2 and cpu#3 .... cannot be shown in /proc/cpuinfo
>
>> --- a/arch/x86/kernel/cpu/proc.c
>> +++ b/arch/x86/kernel/cpu/proc.c
>> @@ -160,14 +160,16 @@ static void *c_start(struct seq_file *m, loff_t *pos)
>> {
>> if (*pos == 0) /* just in case, cpu 0 is not the first */
>> *pos = first_cpu(cpu_online_map);
>> - if ((*pos) < nr_cpu_ids && cpu_online(*pos))
>> + else
>> + *pos = next_cpu_nr(*pos - 1, cpu_online_map);
>> + if ((*pos) < nr_cpu_ids)
>> return &cpu_data(*pos);
>> return NULL;
>> }
>>
>> static void *c_next(struct seq_file *m, void *v, loff_t *pos)
>> {
>> - *pos = next_cpu(*pos, cpu_online_map);
>> + (*pos)++;
>> return c_start(m, pos);
>> }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2008-10-22 04:43:30

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH] x86/proc: fix /proc/cpuinfo cpu offline bug


Hi, Ingo and Alexey,

I have changed my changelog. please review again.

Thanks, Lai.

Lai Jiangshan wrote:
> On Fri, Oct 17, 2008 at 7:41 PM, Alexey Dobriyan <[email protected]> wrote:
>> On Fri, Oct 17, 2008 at 04:55:25PM +0800, Lai Jiangshan wrote:
>>> In my test, I found that if a cpu has been offline,
>>> the next cpus may not be shown in the /proc/cpuinfo.
>>>
>>> trivially reproduce this bug:
>>>
>>> 1) add these lines in the end of show_cpuinfo()
>>> if (m->size - m->count - 20 > 0)
>>> seq_printf(m, "%*s", (int)(m->size - m->count - 20), "show bug\n");
>> What is it?
>>
>> Can you just show wrong /proc/cpuinfo ?
>>
>> Can someone with at least 4-way box please do so?
>
> this is not wrong /proc/cpuinfo, this is an enlarged /proc/cpuinfo
> this trivial example just use "show bug\n" as the enlarged content.
>
> if you boot your machine with enough cpu, your /proc/cpuinfo
> have been enlarged too, please use this:
>
> #!/bin/sh
> nr_cpus=16 # i386
> for ((i=1; i<nr_cpus; i++))
> do
> echo 0 > /sys/devices/system/cpu/cpu$i/online
> cpus=$(grep processor /proc/cpuinfo | wc -l)
> (( cpus == nr_cpus -1 )) || break; # bug eccur
> echo 1 > /sys/devices/system/cpu/cpu$i/online
> done
> cat /proc/cpuinfo # it shows this bug.
>
> Lai
>
>>> 3) offline cpu#1
>>> 4) cat /proc/cpuinfo
>>> cpu#2 and cpu#3 .... cannot be shown in /proc/cpuinfo
>>> --- a/arch/x86/kernel/cpu/proc.c
>>> +++ b/arch/x86/kernel/cpu/proc.c
>>> @@ -160,14 +160,16 @@ static void *c_start(struct seq_file *m, loff_t *pos)
>>> {
>>> if (*pos == 0) /* just in case, cpu 0 is not the first */
>>> *pos = first_cpu(cpu_online_map);
>>> - if ((*pos) < nr_cpu_ids && cpu_online(*pos))
>>> + else
>>> + *pos = next_cpu_nr(*pos - 1, cpu_online_map);
>>> + if ((*pos) < nr_cpu_ids)
>>> return &cpu_data(*pos);
>>> return NULL;
>>> }
>>>
>>> static void *c_next(struct seq_file *m, void *v, loff_t *pos)
>>> {
>>> - *pos = next_cpu(*pos, cpu_online_map);
>>> + (*pos)++;
>>> return c_start(m, pos);
>>> }
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
>>
>
>
>

2008-10-22 04:44:49

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH] x86/proc: fix /proc/cpuinfo cpu offline bug

Lai Jiangshan wrote:
> Hi, Ingo and Alexey,
>
> I have changed my changelog. please review again.
>
> Thanks, Lai.
>


In my test, I found that if a cpu has been offline,
the next cpus may not be shown in the /proc/cpuinfo.

if one read() cannot consume the whole /proc/cpuinfo,
c_start() will be called again in the next read() calls.
And *pos has been increased by 1 by the caller(seq_read()).
if this time the cpu#*pos is offline, c_start() will return
NULL, and the next cpus can not be shown.

this fix use next_cpu_nr(*pos - 1, cpu_online_map) to
search the next unshown cpu.

the most easy way to reproduce this bug:
1) offline cpu#1 (cpu#0 is online)
2) dd ibs=2 if=/proc/cpuinfo
the result is that only cpu#0 is shown.
cpu#2 and cpu#3 .... cannot be shown in /proc/cpuinfo
it's bug.

Signed-off-by: Lai Jiangshan <[email protected]>
---
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index a26c480..01b1244 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -160,14 +160,16 @@ static void *c_start(struct seq_file *m, loff_t *pos)
{
if (*pos == 0) /* just in case, cpu 0 is not the first */
*pos = first_cpu(cpu_online_map);
- if ((*pos) < nr_cpu_ids && cpu_online(*pos))
+ else
+ *pos = next_cpu_nr(*pos - 1, cpu_online_map);
+ if ((*pos) < nr_cpu_ids)
return &cpu_data(*pos);
return NULL;
}

static void *c_next(struct seq_file *m, void *v, loff_t *pos)
{
- *pos = next_cpu(*pos, cpu_online_map);
+ (*pos)++;
return c_start(m, pos);
}