Christoph Lameter wrote:
> On Fri, 30 May 2008, Eric Dumazet wrote:
>
>>> +static DEFINE_PER_CPU(UNIT_TYPE, area[UNITS]);
>>>
>> area[] is not guaranteed to be aligned on anything but 4 bytes.
>>
>> If someone then needs to call cpu_alloc(8, GFP_KERNEL, 8), it might get a
>> non-aligned result.
>>
>> Either you should add an __attribute__((__aligned__(PAGE_SIZE))),
>> or take into account the real address of area[] in cpu_alloc() to avoid
>> wasting up to PAGE_SIZE bytes per cpu.
>
> I think cacheline aligning should be sufficient. People should not
> allocate large page aligned objects here.

I'm a bit confused. Why is DEFINE_PER_CPU_SHARED_ALIGNED() conditioned on
ifdef MODULE?

#ifdef MODULE
#define SHARED_ALIGNED_SECTION ".data.percpu"
#else
#define SHARED_ALIGNED_SECTION ".data.percpu.shared_aligned"
#endif

#define DEFINE_PER_CPU_SHARED_ALIGNED(type, name) \
	__attribute__((__section__(SHARED_ALIGNED_SECTION))) \
	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name \
	____cacheline_aligned_in_smp

Thanks,
Mike
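
On the alignment point Eric raised in the quoted text above: aligning only the offset inside area[] gives an aligned address only if the base of area[] is at least as aligned as the request. What follows is a minimal user-space sketch of that arithmetic, not the cpu_alloc() code from the patch. The helper names (align_up, offset_only_alloc, address_aware_alloc) and the sample base address 0x1004 are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be a power of two). */
uintptr_t align_up(uintptr_t x, uintptr_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical allocator model that aligns only the offset into the area
 * and ignores how the area's base address is aligned.
 */
uintptr_t offset_only_alloc(uintptr_t base, uintptr_t next_free,
			    uintptr_t align)
{
	return base + align_up(next_free, align);
}

/*
 * Alternative model that aligns the final address instead, so the real
 * base address is taken into account.
 */
uintptr_t address_aware_alloc(uintptr_t base, uintptr_t next_free,
			      uintptr_t align)
{
	return align_up(base + next_free, align);
}

/* Usage: a base that is only 4-byte aligned, and an 8-byte aligned request. */
void alignment_demo(void)
{
	/* Offset-only alignment returns a misaligned address. */
	assert(offset_only_alloc(0x1004, 3, 8) % 8 != 0);
	/* Aligning the final address returns an aligned one. */
	assert(address_aware_alloc(0x1004, 3, 8) % 8 == 0);
}
```

The other option mentioned in the quoted text, giving area[] itself a large enough alignment attribute, makes the offset-only variant correct as well, because the base then contributes nothing to the low address bits.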
Mike Travis wrote:
> Christoph Lameter wrote:
>> On Fri, 30 May 2008, Eric Dumazet wrote:
>>
>>>> +static DEFINE_PER_CPU(UNIT_TYPE, area[UNITS]);
>>>>
>>> area[] is not guaranteed to be aligned on anything but 4 bytes.
>>>
>>> If someone then needs to call cpu_alloc(8, GFP_KERNEL, 8), it might get a
>>> non-aligned result.
>>>
>>> Either you should add an __attribute__((__aligned__(PAGE_SIZE))),
>>> or take into account the real address of area[] in cpu_alloc() to avoid
>>> wasting up to PAGE_SIZE bytes per cpu.
>> I think cacheline aligning should be sufficient. People should not
>> allocate large page aligned objects here.
>
> I'm a bit confused. Why is DEFINE_PER_CPU_SHARED_ALIGNED() conditioned on
> ifdef MODULE?
>
> #ifdef MODULE
> #define SHARED_ALIGNED_SECTION ".data.percpu"
> #else
> #define SHARED_ALIGNED_SECTION ".data.percpu.shared_aligned"
> #endif
>
> #define DEFINE_PER_CPU_SHARED_ALIGNED(type, name) \
> __attribute__((__section__(SHARED_ALIGNED_SECTION))) \
> PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name \
> ____cacheline_aligned_in_smp
>
> Thanks,
> Mike
>
>
Because we had crashes when loading the oprofile module, back when a previous
version of oprofile used a DEFINE_PER_CPU_SHARED_ALIGNED variable.

The module loader only takes into account the special section ".data.percpu"
and ignores ".data.percpu.shared_aligned".

I therefore submitted two patches:

1) commit 8b8b498836942c0c855333d357d121c0adeefbd9

    oprofile: don't request cache line alignment for cpu_buffer

    Alignment was previously requested because cpu_buffer was an [NR_CPUS]
    array, to avoid cache line sharing between CPUs.

    After commit 608dfddd845da5ab6accef70154c8910529699f7 (oprofile: change
    cpu_buffer from array to per_cpu variable), we don't need to force an
    alignment anymore since cpu_buffer sits in the per_cpu zone.

    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Mike Travis <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

2) and commit 44c81433e8b05dbc85985d939046f10f95901184

    per_cpu: fix DEFINE_PER_CPU_SHARED_ALIGNED for modules

    The current module loader looks up the ".data.percpu" ELF section to
    perform per_cpu relocation. But DEFINE_PER_CPU_SHARED_ALIGNED() uses
    another section (".data.percpu.shared_aligned"), currently only handled
    in vmlinux.lds, not by the module loader.

    To correct this problem, instead of adding logic to the module loader,
    or using a module.lds file at build time for all arches to group
    ".data.percpu.shared_aligned" into ".data.percpu", just use
    ".data.percpu" for modules.

    Alignment requirements are correctly handled by ld and the module loader.

    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Rusty Russell <[email protected]>
    Cc: Fenghua Yu <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
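
The failure mode described above, a loader that looks for one exact section name and therefore never sees a differently named one, can be sketched in user space. This is a toy model under stated assumptions, not the kernel's module loader: struct toy_section, toy_find_section() and toy_loader_demo() are hypothetical names, and a real loader works on ELF section headers rather than an array of strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified section header: only the name. */
struct toy_section {
	const char *name;
};

/*
 * Return the index of the first section whose name matches exactly,
 * or -1 if there is none. An exact match means that
 * ".data.percpu.shared_aligned" is NOT found when asking for ".data.percpu".
 */
int toy_find_section(const struct toy_section *secs, size_t n,
		     const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(secs[i].name, name) == 0)
			return (int)i;
	return -1;
}

/* Usage: a module built before and after the section name change. */
void toy_loader_demo(void)
{
	const struct toy_section before[] = {
		{ ".text" }, { ".data.percpu.shared_aligned" },
	};
	const struct toy_section after[] = {
		{ ".text" }, { ".data.percpu" },
	};

	/* Per-cpu data in the dedicated section is invisible to the lookup. */
	assert(toy_find_section(before, 2, ".data.percpu") == -1);
	/* With plain ".data.percpu" the lookup succeeds. */
	assert(toy_find_section(after, 2, ".data.percpu") == 1);
}
```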
On Thursday 05 June 2008 01:11:00 Eric Dumazet wrote:
> Mike Travis wrote:
> > I'm a bit confused. Why is DEFINE_PER_CPU_SHARED_ALIGNED() conditioned
> > on ifdef MODULE?
> Because we had crashes when loading oprofile module, when a previous
> version of oprofile used to use DEFINE_PER_CPU_SHARED_ALIGNED variable
>
> module loader only takes into account the special section ".data.percpu"
> and ignores ".data.percpu.shared_aligned"
>
> I therefore submitted two patches :
Put one way, putting page-aligned per-cpu data in a separate section is a
space-saving hack: one which is not really required for modules because of
the low frequency of such variables. Put another way, not respecting
the .data.percpu.shared_aligned section in modules is a bug.
But a comment would probably be nice!
Cheers,
Rusty.
On Wed, 4 Jun 2008, Mike Travis wrote:
> I'm a bit confused. Why is DEFINE_PER_CPU_SHARED_ALIGNED() conditioned on
> ifdef MODULE?
>
> #ifdef MODULE
> #define SHARED_ALIGNED_SECTION ".data.percpu"
> #else
> #define SHARED_ALIGNED_SECTION ".data.percpu.shared_aligned"
> #endif
>
> #define DEFINE_PER_CPU_SHARED_ALIGNED(type, name) \
> __attribute__((__section__(SHARED_ALIGNED_SECTION))) \
> PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name \
> ____cacheline_aligned_in_smp
Looks wrong to me. There can be shared objects even without modules.
Christoph Lameter wrote:
> On Wed, 4 Jun 2008, Mike Travis wrote:
>
>> I'm a bit confused. Why is DEFINE_PER_CPU_SHARED_ALIGNED() conditioned on
>> ifdef MODULE?
>>
>> #ifdef MODULE
>> #define SHARED_ALIGNED_SECTION ".data.percpu"
>> #else
>> #define SHARED_ALIGNED_SECTION ".data.percpu.shared_aligned"
>> #endif
>>
>> #define DEFINE_PER_CPU_SHARED_ALIGNED(type, name) \
>> __attribute__((__section__(SHARED_ALIGNED_SECTION))) \
>> PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name \
>> ____cacheline_aligned_in_smp
>
> Looks wrong to me. There can be shared objects even without modules.
>
>
Well, MODULE is not CONFIG_MODULES :)

If compiling an object that is going to be statically linked into the kernel,
MODULE is not defined, so we have shared objects.

When compiling a module, we cannot *yet* use the .data.percpu.shared_aligned
section, since the module loader won't handle this section.

The alternative is to change module linking for all arches to merge
.data.percpu{*} subsections correctly, or to tell the module loader to take
into account all .data.percpu sections.

AFAIK no module uses DEFINE_PER_CPU_SHARED_ALIGNED() yet...
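
To make the MODULE versus CONFIG_MODULES distinction concrete, here is a small user-space sketch. It models a built-in object in a kernel configured with module support, so CONFIG_MODULES is defined and MODULE is not. The #define of CONFIG_MODULES and the helper shared_aligned_section() are assumptions made for this illustration. In a real build those symbols come from the kernel configuration and the build system, not from the source file.

```c
#include <assert.h>

/*
 * CONFIG_MODULES: the kernel supports loadable modules at all.
 * MODULE:         this particular object is being compiled as a module.
 * Modeled here: a built-in object in a module-capable kernel.
 */
#define CONFIG_MODULES 1
/* MODULE is deliberately left undefined. */

#ifdef MODULE
#define SHARED_ALIGNED_SECTION ".data.percpu"
#else
#define SHARED_ALIGNED_SECTION ".data.percpu.shared_aligned"
#endif

/* Compile-time check that the built-in branch was selected. */
static_assert(sizeof(SHARED_ALIGNED_SECTION) ==
	      sizeof(".data.percpu.shared_aligned"),
	      "built-in objects keep the dedicated section");

/* Hypothetical helper that exposes the selected section name. */
const char *shared_aligned_section(void)
{
	return SHARED_ALIGNED_SECTION;
}
```

Defining MODULE before the #ifdef would select ".data.percpu" instead and trip the static_assert, which is the module case discussed above.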
On Tue, 10 Jun 2008, Eric Dumazet wrote:
> Well, MODULE is not CONFIG_MODULES :)
>
> If compiling an object that is going to be statically linked to kernel, MODULE
> is not defined, so we have shared objects.
>
> When compiling a module, we cannot *yet* use .data.percpu.shared_aligned
> section, since module loader won't handle this section.
>
> Alternative is to change modules linking for all arches to merge
> .data.percpu{*} subsections correctly, or tell module loader to take into
> account all .data.percpu sections.
>
> AFAIK no module uses DEFINE_PER_CPU_SHARED_ALIGNED() yet...
Ahhh. Makes sense. Add a comment to explain this?
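
One possible shape for such a comment is sketched below, with the wording taken from the explanations earlier in this thread. This is an illustration, not a submitted patch. So that it builds outside the kernel, PER_CPU_ATTRIBUTES and ____cacheline_aligned_in_smp are replaced by user-space stand-ins (an empty macro and a fixed 64-byte alignment, both assumptions), and demo_counter and shared_aligned_demo() are hypothetical. It assumes GCC or Clang on an ELF target such as Linux.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for kernel macros (assumptions, not kernel code). */
#define PER_CPU_ATTRIBUTES
#define ____cacheline_aligned_in_smp __attribute__((__aligned__(64)))

/*
 * MODULE is defined only when this object is built as a module. It is
 * not the same as CONFIG_MODULES.
 *
 * Built-in objects put shared-aligned per-cpu data in the dedicated
 * ".data.percpu.shared_aligned" section, which vmlinux.lds handles.
 * Modules cannot use that section yet: the module loader only looks at
 * ".data.percpu". Modules therefore use ".data.percpu", and ld and the
 * module loader still honour the alignment requirement.
 */
#ifdef MODULE
#define SHARED_ALIGNED_SECTION ".data.percpu"
#else
#define SHARED_ALIGNED_SECTION ".data.percpu.shared_aligned"
#endif

#define DEFINE_PER_CPU_SHARED_ALIGNED(type, name)			\
	__attribute__((__section__(SHARED_ALIGNED_SECTION)))		\
	PER_CPU_ATTRIBUTES __typeof__(type) per_cpu__##name		\
	____cacheline_aligned_in_smp

/* Hypothetical variable, only here to show the macro in use. */
DEFINE_PER_CPU_SHARED_ALIGNED(int, demo_counter);

/* The requested alignment holds whichever section name was chosen. */
void shared_aligned_demo(void)
{
	assert((uintptr_t)&per_cpu__demo_counter % 64 == 0);
}
```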