LinuxLists.cc - replace "memset(...,0,PAGE_SIZE)" calls with "clear

2006-12-29 06:20:37

Subject: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

is there some reason there are so many calls of the form

memset(addr, 0, PAGE_SIZE)

rather than the apparently equivalent invocation of

clear_page(addr)

the majority of architectures appear to define the clear_page() macro
in their include/<arch>/page.h header file, but not entirely
identically, and in some cases that definition is conditional, as with
i386:

=============================================================
#ifdef CONFIG_X86_USE_3DNOW
...
#define clear_page(page) mmx_clear_page((void *)(page))
...
#else
...
#define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
...
#endif
============================================================

should it perhaps be part of the CodingStyle doc to use the
clear_page() macro rather than an explicit call to memset()? (and
should all architectures be required to define that macro?)

rday

2006-12-30 20:50:49

by Denys Vlasenko

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Friday 29 December 2006 07:16, Robert P. J. Day wrote:
>
> is there some reason there are so many calls of the form
>
> memset(addr, 0, PAGE_SIZE)
>
> rather than the apparently equivalent invocation of
>
> clear_page(addr)
>
> the majority of architectures appear to define the clear_page() macro
> in their include/<arch>/page.h header file, but not entirely
> identically, and in some cases that definition is conditional, as with
> i386:
>
> =============================================================
> #ifdef CONFIG_X86_USE_3DNOW
> ...
> #define clear_page(page) mmx_clear_page((void *)(page))
> ...
> #else
> ...
> #define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
> ...
> #endif
> ============================================================
>
> should it perhaps be part of the CodingStyle doc to use the
> clear_page() macro rather than an explicit call to memset()? (and
> should all architectures be required to define that macro?)

clear_page assumes that given address is page aligned, I think.
It may fail if you feed it with misaligned region's address.
--
vda

2006-12-30 22:14:33

by Robert P. J. Day

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Sat, 30 Dec 2006, Denis Vlasenko wrote:

> On Friday 29 December 2006 07:16, Robert P. J. Day wrote:
> >
> > is there some reason there are so many calls of the form
> >
> > memset(addr, 0, PAGE_SIZE)
> >
> > rather than the apparently equivalent invocation of
> >
> > clear_page(addr)
> >
> > the majority of architectures appear to define the clear_page()
> > macro in their include/<arch>/page.h header file, but not entirely
> > identically, and in some cases that definition is conditional, as
> > with i386:
> >
> > =============================================================
> > #ifdef CONFIG_X86_USE_3DNOW
> > ...
> > #define clear_page(page) mmx_clear_page((void *)(page))
> > ...
> > #else
> > ...
> > #define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
> > ...
> > #endif
> > ============================================================
> >
> > should it perhaps be part of the CodingStyle doc to use the
> > clear_page() macro rather than an explicit call to memset()?
> > (and should all architectures be required to define that macro?)
>
> clear_page assumes that given address is page aligned, I think. It
> may fail if you feed it with misaligned region's address.

i don't see how that can be true, given that most of the definitions
of the clear_page() macro are simply invocations of memset(). see for
yourself:

$ grep -r "#define clear_page" include

my only point here was that lots of code seems to be calling memset()
when it would be clearer to invoke clear_page(). but there's still
something a bit curious happening here. i'll poke around a bit more
before i ask, though.

rday

2006-12-30 22:42:14

by Denys Vlasenko

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Saturday 30 December 2006 23:08, Robert P. J. Day wrote:
> >
> > clear_page assumes that given address is page aligned, I think. It
> > may fail if you feed it with misaligned region's address.
>
> i don't see how that can be true, given that most of the definitions
> of the clear_page() macro are simply invocations of memset(). see for
> yourself:
>
> $ grep -r "#define clear_page" include
>
> my only point here was that lots of code seems to be calling memset()
> when it would be clearer to invoke clear_page(). but there's still
> something a bit curious happening here. i'll poke around a bit more
> before i ask, though.

There are MMX implementations of clear_page().

I was experimenting with SSE[2] clear_page() which uses
non-temporal stores. That one requires 16 byte alignment.

BTW, it worked ~300% faster than memset. But Andi Kleen
insists that cache eviction caused by NT stores will make it
slower in macrobenchmark.

Apart from fairly extensive set of microbechmarks
I tested kernel compiles (i.e. "real world load")
and they are FASTER too, not slower, but Andi
is fairly entrenched in his opinion ;)
I gave up.
--
vda

2006-12-30 22:45:54

by Arjan van de Ven

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

> i don't see how that can be true, given that most of the definitions
> of the clear_page() macro are simply invocations of memset(). see for
> yourself:

*MOST*. Not all.
For example an SSE version will at least assume 16 byte alignment, etc
etc.

clear_page() is supposed to be for full real pages only... for example
it allows the architecture to optimize for alignment, cache aliasing etc
etc. (and if there are cpus that get a "clear an entire page"
instruction.... there has been hardware like that in the past, even on
x86, just it's no longer sold afaik)

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org

2006-12-30 23:09:14

by Robert P. J. Day

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Sat, 30 Dec 2006, Arjan van de Ven wrote:

> rday wrote:

> > ... most of the definitions of the clear_page() macro are simply
> > invocations of memset(). see for yourself:

> *MOST*. Not all.

i did notice that. while the majority of the architectures simply
define clear_page() as a macro calling memset(ptr, 0, PAGE_SIZE), the
rest will implement it in assembler code for whatever reason. (i'm
assuming that *every* architecture *must* define clear_page() one way
or the other, is that correct? that would seem fairly obvious, but
i just want to make sure i'm not missing anything obvious.)

> clear_page() is supposed to be for full real pages only... for
> example it allows the architecture to optimize for alignment, cache
> aliasing etc etc.

fair enough. *technically*, not every call of the form
"memset(ptr,0,PAGE_SIZE)" necessarily represents an address that's on
a page boundary. but, *realistically*, i'm guessing most of them do.
just grabbing a random example from some grep output:

arch/sh/mm/init.c:
...
/* clear the zero-page */
memset(empty_zero_page, 0, PAGE_SIZE);
...

my only point here is that, given that every architecture needs to
supply some kind of definition of a "clear_page()" routine, one would
think that *lots* of those memset() calls could reasonably be
rewritten as a clear_page() call for improved readibility, no?

and there are a *lot* of memset() calls like that.

rday

2006-12-31 13:39:31

by folkert

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

> > i don't see how that can be true, given that most of the definitions
> > of the clear_page() macro are simply invocations of memset(). see for
> > yourself:
> *MOST*. Not all.
> For example an SSE version will at least assume 16 byte alignment, etc
> etc.

What about an if (adress & 15) { memset } else { sse stuff }
or is that too obvious? :-)

> clear_page() is supposed to be for full real pages only... for example
> it allows the architecture to optimize for alignment, cache aliasing etc
> etc. (and if there are cpus that get a "clear an entire page"
> instruction.... there has been hardware like that in the past, even on
> x86, just it's no longer sold afaik)

Folkert van Heusden

--
http://www.biglumber.com <- site where one can exchange PGP key signatures
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com

2006-12-31 13:45:43

by Arjan van de Ven

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Sun, 2006-12-31 at 14:39 +0100, Folkert van Heusden wrote:
> > > i don't see how that can be true, given that most of the definitions
> > > of the clear_page() macro are simply invocations of memset(). see for
> > > yourself:
> > *MOST*. Not all.
> > For example an SSE version will at least assume 16 byte alignment, etc
> > etc.
>
> What about an if (adress & 15) { memset } else { sse stuff }
> or is that too obvious? :-)

it's only one example. clear_page() working only on a full page is a
nice restriction that allows the implementation to be optimized (again
the x86 hardware that had a hardware page zeroer comes to mind, the hw
is only 4 years old or so... and future hw may have it again)

clear_page() is more restricted than memset(). And that's good, it
allows for a more focused implementation. Otherwise there'd be no reason
to HAVE a clear_page(), if it just was a memset wrapper entirely.

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org

2006-12-31 16:44:39

by Robert P. J. Day

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Sun, 31 Dec 2006, Arjan van de Ven wrote:

> On Sun, 2006-12-31 at 14:39 +0100, Folkert van Heusden wrote:
> > > > i don't see how that can be true, given that most of the definitions
> > > > of the clear_page() macro are simply invocations of memset(). see for
> > > > yourself:
> > > *MOST*. Not all.
> > > For example an SSE version will at least assume 16 byte alignment, etc
> > > etc.
> >
> > What about an if (adress & 15) { memset } else { sse stuff }
> > or is that too obvious? :-)
>
> it's only one example. clear_page() working only on a full page is a
> nice restriction that allows the implementation to be optimized
> (again the x86 hardware that had a hardware page zeroer comes to
> mind, the hw is only 4 years old or so... and future hw may have it
> again)
>
> clear_page() is more restricted than memset(). And that's good, it
> allows for a more focused implementation. Otherwise there'd be no
> reason to HAVE a clear_page(), if it just was a memset wrapper
> entirely.

(just one more note about this, then i'll stop dragging it out. i
didn't mean to get this long-winded about it.)

arjan, you and i actually agree on this. i fully accept that the idea
of a "clear_page()" call might or should have extra semantics,
compared to the more simple and direct "memset(...,0,PAGE_SIZE)" call
(such as alignment requirements, for example). my observation is
simply that this is not what is currently happening.

consider, for example, how many calls there are to clear_page() in the
drivers directory:

$ grep -rw clear_page drivers

not that many. now how many calls are there of the memset variety?

$ grep -Er "memset(.*0, ?PAGE_SIZE)" drivers

i can't believe that at least *some* of those memset() calls couldn't
be re-written as clear_page() calls. and that's just for the
drivers/ directory.

sure, clear_page() might have extra semantics. but if that's the
case, and those semantics happen to be in play, i'm suggesting that
not only *can* one use clear_page() at that point, one *should* use
it.

put another way, if a given situation is appropriate for a call to
clear_page(), then that's what should be used. because if one sees
instead a call to an equivalent memset(), that might suggest that
there's something *preventing* the use of clear_page() -- that it's
not appropriate. and, really, there's no need to be unnecessarily
confusing.

this is just another example of the basic kernel infrastructure
defining lots of useful features (ARRAY_SIZE, etc.) and lots of
programmers not using them for one reason or another. in this
situation with clear_page(), the semantics of that call should be
defined clearly and, when the situation arises, i think that call
should be used unless there's a clear reason *not* to. it just makes
the code easier to read.

and on that note, i'll let this one go. others are free to follow
up.

rday

2006-12-31 17:43:26

by Arjan van de Ven

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

> arjan, you and i actually agree on this. i fully accept that the idea
> of a "clear_page()" call might or should have extra semantics,
> compared to the more simple and direct "memset(...,0,PAGE_SIZE)" call
> (such as alignment requirements, for example). my observation is
> simply that this is not what is currently happening.

that's fair
>
> consider, for example, how many calls there are to clear_page() in the
> drivers directory:
>
> $ grep -rw clear_page drivers
>
> not that many.

the biggest user of clear_page and such is the pagefault code path in
practice.

> i can't believe that at least *some* of those memset() calls couldn't
> be re-written as clear_page() calls. and that's just for the
> drivers/ directory.

yes I can believe that ....
>
> sure, clear_page() might have extra semantics. but if that's the
> case, and those semantics happen to be in play, i'm suggesting that
> not only *can* one use clear_page() at that point, one *should* use
> it.
>
> put another way, if a given situation is appropriate for a call to
> clear_page(), then that's what should be used.

.... however there is potentially a bigger thing possible.
These places that zero a whole full page may have just allocated it
(that's an assumption on my side), and if that's the case, maybe those
places instead should use the zeroing version of the allocator instead
(which internally uses clear_page() ).

So... yes I fully agree with you that it's worth looking at the
memset( , PAGE_SIZE) users. If they are page aligned, yes absolutely
make it a clear_page(), I think that's a very good idea. However also
please check if they've been very recently allocated in that code, and
if maybe the zeroing allocators are better suited there..
(or maybe there's even double zeroing going on.. that's be a nice gain)

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org

2006-12-31 18:40:57

by Paul Mundt

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Sat, Dec 30, 2006 at 06:04:14PM -0500, Robert P. J. Day wrote:
> fair enough. *technically*, not every call of the form
> "memset(ptr,0,PAGE_SIZE)" necessarily represents an address that's on
> a page boundary. but, *realistically*, i'm guessing most of them do.
> just grabbing a random example from some grep output:
>
> arch/sh/mm/init.c:
> ...
> /* clear the zero-page */
> memset(empty_zero_page, 0, PAGE_SIZE);
> ...
>
The problem with random grepping is that it doesn't give you any context.
clear_page() isn't available in this case since we have a couple of
different ways of implementing it, and the optimal approach is selected
later on. There are also additional assumptions regarding alignment that
don't allow clear_page() to be used directly as replacement for the
memset() callsites (as has already been pointed out for some of the other
architectures). While the empty_zero_page in this case sits on a full page
boundary, others do not.

You might find some places in drivers that do this where you might be
able to optimize things slightly with a clear_page() (or copy_page() in
the memcpy() case), but it's going to need a lot of manual auditing
rather than a find and replace. Any sort of wins you get out of this
would be marginal at best, anyways.

The more interesting case would be page clustering/bulk page clearing
with offload engines, and there's certainly room to build on the SGI
patches for this.

2006-12-31 19:09:24

by Robert P. J. Day

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Mon, 1 Jan 2007, Paul Mundt wrote:

> On Sat, Dec 30, 2006 at 06:04:14PM -0500, Robert P. J. Day wrote:
> > fair enough. *technically*, not every call of the form
> > "memset(ptr,0,PAGE_SIZE)" necessarily represents an address that's on
> > a page boundary. but, *realistically*, i'm guessing most of them do.
> > just grabbing a random example from some grep output:
> >
> > arch/sh/mm/init.c:
> > ...
> > /* clear the zero-page */
> > memset(empty_zero_page, 0, PAGE_SIZE);
> > ...
> >
> The problem with random grepping is that it doesn't give you any
> context. clear_page() isn't available in this case since we have a
> couple of different ways of implementing it, and the optimal
> approach is selected later on. There are also additional assumptions
> regarding alignment that don't allow clear_page() to be used
> directly as replacement for the memset() callsites (as has already
> been pointed out for some of the other architectures). While the
> empty_zero_page in this case sits on a full page boundary, others do
> not.
>
> You might find some places in drivers that do this where you might
> be able to optimize things slightly with a clear_page() (or
> copy_page() in the memcpy() case), but it's going to need a lot of
> manual auditing rather than a find and replace. Any sort of wins you
> get out of this would be marginal at best, anyways.
>
> The more interesting case would be page clustering/bulk page
> clearing with offload engines, and there's certainly room to build
> on the SGI patches for this.

your point is well taken -- i wasn't trying to suggest that a blind
cut-and-replace would be appropriate, only that there were an awful
lot of places where it wasn't clear that that kind of replacement
*wasn't* appropriate. or perhaps even recommended. (doing that kind
of search in the drivers/ directory would perhaps be more meaningful
than in the arch/ directory. just my luck i picked a bad example.)

clearly, that kind of replacement might require manual intervention in
a lot of cases, no question. as with other examples i've brought up
here, i'm just looking at this from a relatively newbie perspective,
where i'm perusing the code and, in this case, got to thinking, "gee,
given that every architecture defines a clear_page() macro, i wonder
why all these people keep calling memset()." that's all.

kind of like how, given that include/linux/gfp.h defines the macro
__get_dma_pages(), so many people persist in calling
__get_free_pages() with a GFP_DMA setting. that sort of thing. :-)

rday

2007-01-01 01:59:48

by folkert

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

> > regarding alignment that don't allow clear_page() to be used
> > copy_page() in the memcpy() case), but it's going to need a lot of

Maybe these optimalisations should be in the coding style docs?

Folkert van Heusden

--
Ever wonder what is out there? Any alien races? Then please support
the seti@home project: setiathome.ssl.berkeley.edu
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, http://www.vanheusden.com

2007-01-01 08:38:52

by Robert P. J. Day

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Mon, 1 Jan 2007, Folkert van Heusden wrote:

> > > regarding alignment that don't allow clear_page() to be used
> > > copy_page() in the memcpy() case), but it's going to need a lot of
>
> Maybe these optimalisations should be in the coding style docs?

i was thinking of submitting the following as a new "chapter" for the
doc -- it would address *some* of these issues:

Chapter XX: Page-related memory management

The following functions and macro definitions are available via
include/linux/gfp.h for page-based memory management:

struct page *alloc_pages(gfp_mask, order);
unsigned long __get_free_pages(gfp_mask, order);
unsigned long get_zeroed_page(gfp_mask);
void __free_pages(struct page *page, order);
void free_pages(unsigned long addr, order);

#define alloc_page(gfp_mask) alloc_pages(gfp_mask, 0)
#define __get_free_page(gfp_mask) __get_free_pages((gfp_mask),0)
#define __free_page(page) __free_pages((page), 0)
#define free_page(addr) free_pages((addr),0)

#define __get_dma_pages(gfp_mask, order) \
__get_free_pages((gfp_mask) | GFP_DMA,(order))

Given the above, some basic suggestions for page-based memory management:

(a) If you need to allocate or free a single page, use the single page
version of the routine/macro, rather than calling the multi-page
version with an order value of zero, such as:

alloc_pages(gfp_mask, 0); /* no */
alloc_page(gfp_mask); /* better */

(b) If you need to allocate a single zeroed page by logical address,
use get_zeroed_page(), rather than __get_free_page() followed
by a call to memset() to clear that page.

(c) If you need to specifically allocate some DMA pages, use the
__get_dma_pages() macro, as in:

__get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */
__get_dma_pages(GFP_KERNEL, order) /* better */

(d) If you need to clear (zero) a page, be aware that every
architecture defines a clear_page() routine, either as a macro
in include/<arch>/page.h or as an assembler routine.

You should check if it's appropriate to use that routine/macro,
rather than making an explicit call to memset(...,0, PAGE_SIZE).

2007-01-01 08:43:36

by Paul Mundt

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Mon, Jan 01, 2007 at 02:59:32AM +0100, Folkert van Heusden wrote:
> > > regarding alignment that don't allow clear_page() to be used
> > > copy_page() in the memcpy() case), but it's going to need a lot of
>
> Maybe these optimalisations should be in the coding style docs?
>
For what purpose? CodingStyle is not about documenting usage constraints
for every minor part of the kernel. If someone intends to use an API,
it's up to them to figure out the semantics for doing so. Let's not
confuse common sense with style.

2007-01-01 10:14:25

by Arjan van de Ven

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

> Given the above, some basic suggestions for page-based memory management:
>
> (a) If you need to allocate or free a single page, use the single page
> version of the routine/macro, rather than calling the multi-page
> version with an order value of zero, such as:
>
> alloc_pages(gfp_mask, 0); /* no */
> alloc_page(gfp_mask); /* better */
>
> (b) If you need to allocate a single zeroed page by logical address,
> use get_zeroed_page(), rather than __get_free_page() followed
> by a call to memset() to clear that page.

both look good... I'd be in favor of this.
Maybe also add a part about using GFP_KERNEL whenever possible, GFP_NOFS
from filesystem writeout code and GFP_NOIO from block writeout code
(and never doing in_interrupt()?GFP_ATOMIC:GFP_KERNEL !)

>
> (c) If you need to specifically allocate some DMA pages, use the
> __get_dma_pages() macro, as in:
>
> __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */
> __get_dma_pages(GFP_KERNEL, order) /* better */

this.. does not really. GFP_DMA is an ancient artifact from the ISA
days. Better to describe the dma mapping interface (well give a pointer
to the doc that already exists about that), that one is REALLY for
allocating dma pages in this century.

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org

2007-01-01 10:33:41

by Robert P. J. Day

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Mon, 1 Jan 2007, Arjan van de Ven wrote:

> > Given the above, some basic suggestions for page-based memory management:
> >
> > (a) If you need to allocate or free a single page, use the single page
> > version of the routine/macro, rather than calling the multi-page
> > version with an order value of zero, such as:
> >
> > alloc_pages(gfp_mask, 0); /* no */
> > alloc_page(gfp_mask); /* better */
> >
> > (b) If you need to allocate a single zeroed page by logical address,
> > use get_zeroed_page(), rather than __get_free_page() followed
> > by a call to memset() to clear that page.
>
> both look good... I'd be in favor of this. Maybe also add a part
> about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem
> writeout code and GFP_NOIO from block writeout code (and never doing
> in_interrupt()?GFP_ATOMIC:GFP_KERNEL !)

it strikes me that that latter part is starting to go beyond the scope
of simple coding style aesthetics and getting into actual coding
distinctions. would that really be appropriate for the CodingStyle
doc? i'm just asking.

> > (c) If you need to specifically allocate some DMA pages, use the
> > __get_dma_pages() macro, as in:
> >
> > __get_free_pages(GFP_KERNEL|GFP_DMA, order) /* no */
> > __get_dma_pages(GFP_KERNEL, order) /* better */
>
> this.. does not really. GFP_DMA is an ancient artifact from the ISA
> days. Better to describe the dma mapping interface (well give a
> pointer to the doc that already exists about that), that one is
> REALLY for allocating dma pages in this century.

ok, i was just trying to make the calls consistent based on what i
could see in the current source code. i'm still reviewing the
material on DMA -- feel free to suggest better wording.

rday

p.s. what DMA doc are you referring to above? DMA-mapping.txt?

2007-01-01 17:22:56

by Randy Dunlap

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Mon, 1 Jan 2007 17:42:31 +0900 Paul Mundt wrote:

> On Mon, Jan 01, 2007 at 02:59:32AM +0100, Folkert van Heusden wrote:
> > > > regarding alignment that don't allow clear_page() to be used
> > > > copy_page() in the memcpy() case), but it's going to need a lot of
> >
> > Maybe these optimalisations should be in the coding style docs?
> >
> For what purpose? CodingStyle is not about documenting usage constraints
> for every minor part of the kernel. If someone intends to use an API,
> it's up to them to figure out the semantics for doing so. Let's not
> confuse common sense with style.
> -

I agree, these aren't CodingStyle material. They could make sense
in some other doc, either MM/VM-related or more general.

I've often wanted a somewhat introductory doc about Linux environmental
assumptions (memory model, pointers/longs, data cleared to 0,
many other items that I don't have on my fingertips).

---
~Randy

2007-01-01 19:06:10

by Dave Jones

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Mon, Jan 01, 2007 at 05:27:10AM -0500, Robert P. J. Day wrote:

> > both look good... I'd be in favor of this. Maybe also add a part
> > about using GFP_KERNEL whenever possible, GFP_NOFS from filesystem
> > writeout code and GFP_NOIO from block writeout code (and never doing
> > in_interrupt()?GFP_ATOMIC:GFP_KERNEL !)
>
> it strikes me that that latter part is starting to go beyond the scope
> of simple coding style aesthetics and getting into actual coding
> distinctions. would that really be appropriate for the CodingStyle
> doc? i'm just asking.

Adding a separate 'good practices' doc wouldn't be a bad idea.

Dave

--
http://www.codemonkey.org.uk

2007-01-03 06:23:47

by dean gaudet

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Sat, 30 Dec 2006, Denis Vlasenko wrote:

> I was experimenting with SSE[2] clear_page() which uses
> non-temporal stores. That one requires 16 byte alignment.
>
> BTW, it worked ~300% faster than memset. But Andi Kleen
> insists that cache eviction caused by NT stores will make it
> slower in macrobenchmark.
>
> Apart from fairly extensive set of microbechmarks
> I tested kernel compiles (i.e. "real world load")
> and they are FASTER too, not slower, but Andi
> is fairly entrenched in his opinion ;)
> I gave up.

you know, with the kernel zeroing pages through the 1:1 phys mapping, and
userland accessing pages through a different mapping... it seems that
frequently virtual address bits 12..14 will differ between user and
kernel.

on K8 this results in a virtual alias conflict which costs *70 cycles* per
cache line. (K8 L1 DC uses virtual bits 12..14 as part of the index.)
this is larger than the cost for L1 miss L2 hit...

this wouldn't happen with movnt... but then we get into the handwaving
arguments about timing of accesses to the freshly zeroed page. too bad
there's no "evict from L1 to L2" operation -- that would avoid the virtual
alias problem.

there's an event (75h unit mask 02h) to measure virtual alias conflicts...
i've always wondered if there are workloads which trigger this behaviour.
it can happy on copy to/from user as well.

-dean

2007-01-03 13:25:39

by Robert P. J. Day

[permalink] [raw]

Subject: Re: replace "memset(...,0,PAGE_SIZE)" calls with "clear_page()"?

On Sun, 31 Dec 2006, Arjan van de Ven wrote:

> So... yes I fully agree with you that it's worth looking at the
> memset( , PAGE_SIZE) users. If they are page aligned, yes absolutely
> make it a clear_page(), I think that's a very good idea. However
> also please check if they've been very recently allocated in that
> code, and if maybe the zeroing allocators are better suited there..
> (or maybe there's even double zeroing going on.. that's be a nice
> gain)

there's certainly some cleanup/speedup that could be done regarding
these numerous "memset(...,0,PAGE_SIZE) calls.

first, there the obvious 1:1 replacement with a call to
"clear_page()" ***if that's appropriate***.

second, there's some possible simplification, given snippets like
this one from arch/sparc/mm/sun4c.c

pte = (pte_t *)__get_free_page(GFP_KERNEL|__GFP_REPEAT);
if (pte)
memset(pte, 0, PAGE_SIZE);

which seems to be an obvious candidate for replacement with:

pte = get_zeroed_page(GFP_KERNEL|__GFP_REPEAT)

no?

finally, there is certainly some "double zeroing" going on, as with
this snippet from drivers/atm/eni.c:

...
eni_dev->rx_map = (struct atm_vcc **) get_zeroed_page(GFP_KERNEL);
^^^^^^^^^^^^^^^
if (!eni_dev->rx_map) {
printk(KERN_ERR DEV_LABEL "(itf %d): couldn't get free page\n",
dev->number);
free_page((unsigned long) eni_dev->free_list);
return -ENOMEM;
}
memset(eni_dev->rx_map,0,PAGE_SIZE); // redundant, no?
...

so, yes, there does appear to be room for cleanup/speedup.

rday