From: Andi Kleen <[email protected]>
glibc calloc() has an optimization to not explicitly memset()
very large calloc allocations that just came from mmap(),
because they are known to be zero.
This could result in the perf memcpy benchmark reading only from
the zero page, which gives unrealistic results.
Always call memset explicitly on the source area to avoid this problem.
Cc: [email protected]
Cc: [email protected]
v2: Actually memset the right area and also fix the NULL check before.
Signed-off-by: Andi Kleen <[email protected]>
---
tools/perf/bench/mem-memcpy.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
index 93c83e3..0887b46 100644
--- a/tools/perf/bench/mem-memcpy.c
+++ b/tools/perf/bench/mem-memcpy.c
@@ -115,8 +115,10 @@ static void alloc_mem(void **dst, void **src, size_t length)
 		die("memory allocation failed - maybe length is too large?\n");
 
 	*src = zalloc(length);
-	if (!src)
+	if (!*src)
 		die("memory allocation failed - maybe length is too large?\n");
+	/* Make sure to always replace the zero pages even if MMAP_THRESH is crossed */
+	memset(*src, 0, length);
 }
 
 static u64 do_memcpy_cycle(memcpy_t fn, size_t len, bool prefault)
--
1.8.1.4
At Thu, 18 Jul 2013 15:43:18 -0700,
Andi Kleen wrote:
>
> From: Andi Kleen <[email protected]>
>
> glibc calloc() has an optimization to not explicitly memset()
> very large calloc allocations that just came from mmap(),
> because they are known to be zero.
>
> This could result in the perf memcpy benchmark reading only from
> the zero page, which gives unrealistic results.
>
> Always call memset explicitly on the source area to avoid this problem.
>
> Cc: [email protected]
> Cc: [email protected]
> v2: Actually memset the right area and also fix the NULL check before.
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> tools/perf/bench/mem-memcpy.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
> index 93c83e3..0887b46 100644
> --- a/tools/perf/bench/mem-memcpy.c
> +++ b/tools/perf/bench/mem-memcpy.c
> @@ -115,8 +115,10 @@ static void alloc_mem(void **dst, void **src, size_t length)
> die("memory allocation failed - maybe length is too large?\n");
>
> *src = zalloc(length);
> - if (!src)
> + if (!*src)
In the latest mem-memcpy.c, this if (!src) has already been fixed to
if (!*src), so this hunk makes the patch fail to apply.
Thanks,
Hitoshi
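
For readers following the NULL check discussion, here is a minimal sketch of
why if (!src) cannot catch a failed allocation while if (!*src) can. The
helper name alloc_out() and its return convention are hypothetical and are
not taken from mem-memcpy.c; the sketch assumes the caller passes a valid
address.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical out-parameter allocator, for illustration only.
 * Returns 0 on success and -1 if the allocation failed.
 */
static int alloc_out(void **out, size_t length)
{
	*out = calloc(1, length);

	/*
	 * "out" is the address of the caller's pointer variable, so it is
	 * non-NULL whenever the caller passes &ptr. Testing "!out" would
	 * therefore never fire. The result of calloc() is stored in "*out",
	 * and that is the value that has to be checked.
	 */
	if (!*out)
		return -1;

	return 0;
}
```

A caller would write something like: void *p; if (alloc_out(&p, len)) and
then handle the failure.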
> > diff --git a/tools/perf/bench/mem-memcpy.c b/tools/perf/bench/mem-memcpy.c
> > index 93c83e3..0887b46 100644
> > --- a/tools/perf/bench/mem-memcpy.c
> > +++ b/tools/perf/bench/mem-memcpy.c
> > @@ -115,8 +115,10 @@ static void alloc_mem(void **dst, void **src, size_t length)
> > die("memory allocation failed - maybe length is too large?\n");
> >
> > *src = zalloc(length);
> > - if (!src)
> > + if (!*src)
>
> In the latest mem-memcpy.c, this if (!src) has already been fixed to
> if (!*src), so this hunk makes the patch fail to apply.
I can undo it, and repost, but the patch would still conflict.
Whoever applies it just has to resolve the trivial conflict.
-Andi
--
[email protected] -- Speaking for myself only.
Em Sat, Jul 20, 2013 at 12:57:27AM +0900, Hitoshi Mitake escreveu:
> At Thu, 18 Jul 2013 15:43:18 -0700, Andi Kleen wrote:
> > glibc calloc() has an optimization to not explicitly memset()
> > very large calloc allocations that just came from mmap(),
> > because they are known to be zero.
> > This could result in the perf memcpy benchmark reading only from
> > the zero page, which gives unrealistic results.
> > Always call memset explicitly on the source area to avoid this problem.
> > +++ b/tools/perf/bench/mem-memcpy.c
> > @@ -115,8 +115,10 @@ static void alloc_mem(void **dst, void **src, size_t length)
> > *src = zalloc(length);
> > - if (!src)
> > + if (!*src)
> In the latest mem-memcpy.c, this if (!src) has already been fixed to
> if (!*src), so this hunk makes the patch fail to apply.
I fixed this up, please take a look at:
https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=a198996c7afae0097c67a61851f19863e59697b2
https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/core
- Arnaldo
On Mon, Jul 22, 2013 at 01:00:45PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Sat, Jul 20, 2013 at 12:57:27AM +0900, Hitoshi Mitake escreveu:
> > At Thu, 18 Jul 2013 15:43:18 -0700, Andi Kleen wrote:
> > > glibc calloc() has an optimization to not explicitly memset()
> > > very large calloc allocations that just came from mmap(),
> > > because they are known to be zero.
>
> > > This could result in the perf memcpy benchmark reading only from
> > > the zero page, which gives unrealistic results.
>
> > > Always call memset explicitly on the source area to avoid this problem.
>
> > > +++ b/tools/perf/bench/mem-memcpy.c
> > > @@ -115,8 +115,10 @@ static void alloc_mem(void **dst, void **src, size_t length)
> > > *src = zalloc(length);
> > > - if (!src)
> > > + if (!*src)
>
> > In the latest mem-memcpy.c, this if (!src) has already been fixed to
> > if (!*src), so this hunk makes the patch fail to apply.
>
> I fixed this up, please take a look at:
>
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=a198996c7afae0097c67a61851f19863e59697b2
>
> https://git.kernel.org/cgit/linux/kernel/git/acme/linux.git/log/?h=perf/core
Looks good. Thanks.
-Andi
--
[email protected] -- Speaking for myself only.
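
As an illustration of the idea discussed in this thread, the sketch below
allocates a zero-filled buffer and then writes to every page of it, so that a
later benchmark read does not hit pages that were never written. It is not
the perf code: the helper name alloc_written_zeroed() is hypothetical, and
the 4096-byte fallback page size is an assumption. The patch itself uses a
plain memset() after zalloc(). The sketch stores through a volatile pointer
instead, because when calloc() and a zeroing memset() sit in the same
function, a compiler is allowed to treat the memset() as redundant.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: return a zero-filled buffer
 * in which every page has been written at least once.
 */
static void *alloc_written_zeroed(size_t length)
{
	long page = sysconf(_SC_PAGESIZE);
	/* Assumption: fall back to 4096 if the page size is unavailable. */
	size_t step = page > 0 ? (size_t)page : 4096;
	volatile unsigned char *p;
	size_t off;
	void *buf = calloc(1, length);

	if (!buf)
		return NULL;

	/*
	 * Store one zero byte per page-sized stride. The volatile qualifier
	 * keeps the compiler from dropping the stores even though the memory
	 * already reads as zero.
	 */
	p = buf;
	for (off = 0; off < length; off += step)
		p[off] = 0;

	/* The buffer may not be page aligned, so also write the last byte. */
	if (length)
		p[length - 1] = 0;

	return buf;
}
```

A benchmark would call this for the source area in place of a bare calloc(),
and release the buffer with free() afterwards.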