2012-08-13 23:44:17

by Markus F.X.J. Oberhumer

[permalink] [raw]
Subject: [GIT PULL] Update LZO compression

Hi all,

as suggested on the mailing list I have converted the updated LZO
code into git, so please pull my "lzo-update" branch from

git://github.com/markus-oberhumer/linux.git lzo-update

You can browse the branch at

https://github.com/markus-oberhumer/linux/compare/lzo-update

I'd ask some official kernel maintainer for review and to push this into
linux-next so that it will hopefully land in the 3.7 release.

Share and enjoy,
Markus

Signed-off-by: Markus F.X.J. Oberhumer <[email protected]>

On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
> Hi all,
>
> I finally have prepared a small package that updates the LZO version
> in the Linux kernel. Please get it from:
>
> http://www.oberhumer.com/opensource/lzo/download/Testing/linux-kernel-lzo-20120716.tar.gz
>
> As stated in the README this version is significantly faster (typically more
> than 2 times faster!) than the current version, has been thoroughly tested on
> x86_64/i386/powerpc platforms and is intended to get included into the
> official Linux 3.6 or 3.7 release.
>
> I encourage all compression users to test and benchmark this new version,
> and I also would ask some official LZO maintainer to convert the updated
> source files into a GIT commit and possibly push it to Linus or linux-next.
>
> Share and enjoy,
> Markus
>
> Signed-off-by: Markus F.X.J. Oberhumer" <[email protected]>

--
Markus Oberhumer, <[email protected]>, http://www.oberhumer.com/


2012-08-14 03:15:48

by Andi Kleen

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
> Hi all,
>
> as suggested on the mailing list I have converted the updated LZO
> code into git, so please pull my "lzo-update" branch from
>
> git://github.com/markus-oberhumer/linux.git lzo-update
>
> You can browse the branch at
>
> https://github.com/markus-oberhumer/linux/compare/lzo-update

Looks ok to me from a quick look.

Since kernel lzo is security relevant, I assume the new version
has been fuzz tested?

I couldn't tell from the github view, but I assume you follow
standard coding style.

-Andi

2012-08-14 10:10:44

by Markus F.X.J. Oberhumer

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On 2012-08-14 05:15, Andi Kleen wrote:
> On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
>> Hi all,
>>
>> as suggested on the mailing list I have converted the updated LZO
>> code into git, so please pull my "lzo-update" branch from
>>
>> git://github.com/markus-oberhumer/linux.git lzo-update
>>
>> You can browse the branch at
>>
>> https://github.com/markus-oberhumer/linux/compare/lzo-update
>
> Looks ok to me from a quick look.
>
> Since kernel lzo is security relevant, I assume the new version
> has been fuzz tested?

Of course!

> I couldn't tell from the github view, but I assume you follow
> standard coding style.

I've tried my best to follow the style guidelines.

Cheers,
Markus

>
> -Andi
>

--
Markus Oberhumer, <[email protected]>, http://www.oberhumer.com/

2012-08-14 12:41:12

by Johannes Stezenbach

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
> On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
> >
> > As stated in the README this version is significantly faster (typically more
> > than 2 times faster!) than the current version, has been thoroughly tested on
> > x86_64/i386/powerpc platforms and is intended to get included into the
> > official Linux 3.6 or 3.7 release.
> >
> > I encourage all compression users to test and benchmark this new version,
> > and I also would ask some official LZO maintainer to convert the updated
> > source files into a GIT commit and possibly push it to Linus or linux-next.

Sorry for not reporting earlier, but I didn't have time to do real
benchmarks, just a quick test on ARM926EJ-S using barebox,
and found in the new version decompression is slower:
http://lists.infradead.org/pipermail/barebox/2012-July/008268.html

BTW, do you have userspace code matching the old and new
lzo versions? It would be easier to benchmark.

Unfortunately I cannot claim high confidence in my benchmark results
due to missing time to do it properly, it would be useful if
someone else could do some benchmarks on ARM before merging this.


Johannes

2012-08-15 12:02:55

by Markus F.X.J. Oberhumer

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

Hi Johannes,

On 2012-08-14 14:39, Johannes Stezenbach wrote:
> On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
>> On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
>>>
>>> As stated in the README this version is significantly faster (typically more
>>> than 2 times faster!) than the current version, has been thoroughly tested on
>>> x86_64/i386/powerpc platforms and is intended to get included into the
>>> official Linux 3.6 or 3.7 release.
>>>
>>> I encourage all compression users to test and benchmark this new version,
>>> and I also would ask some official LZO maintainer to convert the updated
>>> source files into a GIT commit and possibly push it to Linus or linux-next.
>
> Sorry for not reporting earlier, but I didn't have time to do real
> benchmarks, just a quick test on ARM926EJ-S using barebox,
> and found in the new version decompression is slower:
> http://lists.infradead.org/pipermail/barebox/2012-July/008268.html

I can only guess, but maybe your ARM cpu does not have an efficient
implementation of {get,put}_unaligned().

Could you please try the following patch and test if you can see
any significant speed difference?

Thanks,
Markus


diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h
index ddc8db5..efc5714 100644
--- a/lib/lzo/lzodefs.h
+++ b/lib/lzo/lzodefs.h
@@ -12,8 +12,15 @@
*/


+#if defined(__arm__)
+#define COPY4(dst, src) \
+ (dst)[0] = (src)[0]; (dst)[1] = (src)[1]; \
+ (dst)[2] = (src)[2]; (dst)[3] = (src)[3]
+#endif
+#ifndef COPY4
#define COPY4(dst, src) \
put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
+#endif
#if defined(__x86_64__)
#define COPY8(dst, src) \
put_unaligned(get_unaligned((const u64 *)(src)), (u64 *)(dst))


>
> BTW, do you have userspace code matching the old and new
> lzo versions? It would be easier to benchmark.
>
> Unfortunately I cannot claim high confidence in my benchmark results
> due to missing time to do it properly, it would be useful if
> someone else could do some benchmarks on ARM before merging this.
>
>
> Johannes

--
Markus Oberhumer, <[email protected]>, http://www.oberhumer.com/

2012-08-15 14:45:55

by Johannes Stezenbach

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

Hi Marcus,

On Wed, Aug 15, 2012 at 02:02:43PM +0200, Markus F.X.J. Oberhumer wrote:
> On 2012-08-14 14:39, Johannes Stezenbach wrote:
> > On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
> >> On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
> >>>
> >>> As stated in the README this version is significantly faster (typically more
> >>> than 2 times faster!) than the current version, has been thoroughly tested on
> >>> x86_64/i386/powerpc platforms and is intended to get included into the
> >>> official Linux 3.6 or 3.7 release.
> >>>
> >>> I encourage all compression users to test and benchmark this new version,
> >>> and I also would ask some official LZO maintainer to convert the updated
> >>> source files into a GIT commit and possibly push it to Linus or linux-next.
> >
> > Sorry for not reporting earlier, but I didn't have time to do real
> > benchmarks, just a quick test on ARM926EJ-S using barebox,
> > and found in the new version decompression is slower:
> > http://lists.infradead.org/pipermail/barebox/2012-July/008268.html
>
> I can only guess, but maybe your ARM cpu does not have an efficient
> implementation of {get,put}_unaligned().

Yes, ARMv5 cannot do unaligned access. ARMv6+ could, but
I think the Linux kernel normally traps it for debug,
all ARM seem to use generic {get,put}_unaligned() implementation
which use byte access and shift.

> Could you please try the following patch and test if you can see
> any significant speed difference?

It isn't. I made the attached quick hack userspace code
using ARM kernel headers and barebox unlzop code.
(new == your new code, old == linux-3.5 git, test == new + your suggested change)
(sorry I had no time to clean it up)

I compressed a Linux Image with lzop (lzop <arch/arm/boot/Image >lzoimage)
and timed uncompression:

# time ./unlzopold <lzoimage >/dev/null
real 0m 0.29s
user 0m 0.19s
sys 0m 0.10s
# time ./unlzopold <lzoimage >/dev/null
real 0m 0.29s
user 0m 0.20s
sys 0m 0.09s
# time ./unlzopnew <lzoimage >/dev/null
real 0m 0.41s
user 0m 0.30s
sys 0m 0.10s
# time ./unlzopnew <lzoimage >/dev/null
real 0m 0.40s
user 0m 0.30s
sys 0m 0.10s
# time ./unlzopnew <lzoimage >/dev/null
real 0m 0.40s
user 0m 0.29s
sys 0m 0.11s
# time ./unlzoptest <lzoimage >/dev/null
real 0m 0.39s
user 0m 0.28s
sys 0m 0.11s
# time ./unlzoptest <lzoimage >/dev/null
real 0m 0.39s
user 0m 0.27s
sys 0m 0.11s
# time ./unlzoptest <lzoimage >/dev/null
real 0m 0.39s
user 0m 0.27s
sys 0m 0.11s

FWIW I also checked the sha1sum to confirm the Image uncompressed OK.


Johannes


Attachments:
(No filename) (2.59 kB)
lzo-bench.tar.gz (52.95 kB)
Download all attachments

2012-08-16 06:27:56

by Markus F.X.J. Oberhumer

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On 2012-08-15 16:45, Johannes Stezenbach wrote:
> On Wed, Aug 15, 2012 at 02:02:43PM +0200, Markus F.X.J. Oberhumer wrote:
>> On 2012-08-14 14:39, Johannes Stezenbach wrote:
>>> On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
>>>> On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
>>>>>
>>>>> As stated in the README this version is significantly faster (typically more
>>>>> than 2 times faster!) than the current version, has been thoroughly tested on
>>>>> x86_64/i386/powerpc platforms and is intended to get included into the
>>>>> official Linux 3.6 or 3.7 release.
>>>>>
>>>>> I encourage all compression users to test and benchmark this new version,
>>>>> and I also would ask some official LZO maintainer to convert the updated
>>>>> source files into a GIT commit and possibly push it to Linus or linux-next.
>>>
>>> Sorry for not reporting earlier, but I didn't have time to do real
>>> benchmarks, just a quick test on ARM926EJ-S using barebox,
>>> and found in the new version decompression is slower:
>>> http://lists.infradead.org/pipermail/barebox/2012-July/008268.html
>>
>> I can only guess, but maybe your ARM cpu does not have an efficient
>> implementation of {get,put}_unaligned().
>
> Yes, ARMv5 cannot do unaligned access. ARMv6+ could, but
> I think the Linux kernel normally traps it for debug,
> all ARM seem to use generic {get,put}_unaligned() implementation
> which use byte access and shift.

Hmm - I could imagine that we're wasting a lot of possible speed gain
by not exploiting that feature on ARMv6+.

>> Could you please try the following patch and test if you can see
>> any significant speed difference?
>
> It isn't. I made the attached quick hack userspace code
> using ARM kernel headers and barebox unlzop code.
> (new == your new code, old == linux-3.5 git, test == new + your suggested change)
> (sorry I had no time to clean it up)

My suggested COPY4 replacement probably has a lot of load stalls - maybe some
ARM expert could have a look and suggest a more efficient implementation.

In any case, I still would like to see the new code in linux-next because
of the huge improvements on other modern CPUs.

Cheers,
Markus

>
> I compressed a Linux Image with lzop (lzop <arch/arm/boot/Image >lzoimage)
> and timed uncompression:
>
> # time ./unlzopold <lzoimage >/dev/null
> real 0m 0.29s
> user 0m 0.19s
> sys 0m 0.10s
> # time ./unlzopold <lzoimage >/dev/null
> real 0m 0.29s
> user 0m 0.20s
> sys 0m 0.09s
> # time ./unlzopnew <lzoimage >/dev/null
> real 0m 0.41s
> user 0m 0.30s
> sys 0m 0.10s
> # time ./unlzopnew <lzoimage >/dev/null
> real 0m 0.40s
> user 0m 0.30s
> sys 0m 0.10s
> # time ./unlzopnew <lzoimage >/dev/null
> real 0m 0.40s
> user 0m 0.29s
> sys 0m 0.11s
> # time ./unlzoptest <lzoimage >/dev/null
> real 0m 0.39s
> user 0m 0.28s
> sys 0m 0.11s
> # time ./unlzoptest <lzoimage >/dev/null
> real 0m 0.39s
> user 0m 0.27s
> sys 0m 0.11s
> # time ./unlzoptest <lzoimage >/dev/null
> real 0m 0.39s
> user 0m 0.27s
> sys 0m 0.11s
>
> FWIW I also checked the sha1sum to confirm the Image uncompressed OK.
>
>
> Johannes

--
Markus Oberhumer, <[email protected]>, http://www.oberhumer.com/

2012-08-16 15:07:05

by Johannes Stezenbach

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Thu, Aug 16, 2012 at 08:27:15AM +0200, Markus F.X.J. Oberhumer wrote:
> On 2012-08-15 16:45, Johannes Stezenbach wrote:
> >
> > I made the attached quick hack userspace code
> > using ARM kernel headers and barebox unlzop code.
> > (new == your new code, old == linux-3.5 git, test == new + your suggested change)
> > (sorry I had no time to clean it up)
>
> My suggested COPY4 replacement probably has a lot of load stalls - maybe some
> ARM expert could have a look and suggest a more efficient implementation.
>
> In any case, I still would like to see the new code in linux-next because
> of the huge improvements on other modern CPUs.

Well, ~2x speedup on x86 is certainly a good achievement, but there
are more ARM based devices than there are PCs, and I guess many
embedded devices use lzo compressed kernels and file systems
while I'm not convinced many PCs rely on lzo in the kernel.

I know everyone's either busy or on vacation, but it would
be so cool if someone could test on a more modern ARM core,
with the userspace test code I posted it should be easy to do.


Johannes

2012-08-16 15:22:06

by Jeff Garzik

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On 08/16/2012 02:27 AM, Markus F.X.J. Oberhumer wrote:
> On 2012-08-15 16:45, Johannes Stezenbach wrote:
>> On Wed, Aug 15, 2012 at 02:02:43PM +0200, Markus F.X.J. Oberhumer wrote:
>>> On 2012-08-14 14:39, Johannes Stezenbach wrote:
>>>> On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
>>>>> On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
>>>>>>
>>>>>> As stated in the README this version is significantly faster (typically more
>>>>>> than 2 times faster!) than the current version, has been thoroughly tested on
>>>>>> x86_64/i386/powerpc platforms and is intended to get included into the
>>>>>> official Linux 3.6 or 3.7 release.
>>>>>>
>>>>>> I encourage all compression users to test and benchmark this new version,
>>>>>> and I also would ask some official LZO maintainer to convert the updated
>>>>>> source files into a GIT commit and possibly push it to Linus or linux-next.
>>>>
>>>> Sorry for not reporting earlier, but I didn't have time to do real
>>>> benchmarks, just a quick test on ARM926EJ-S using barebox,
>>>> and found in the new version decompression is slower:
>>>> http://lists.infradead.org/pipermail/barebox/2012-July/008268.html
>>>
>>> I can only guess, but maybe your ARM cpu does not have an efficient
>>> implementation of {get,put}_unaligned().
>>
>> Yes, ARMv5 cannot do unaligned access. ARMv6+ could, but
>> I think the Linux kernel normally traps it for debug,
>> all ARM seem to use generic {get,put}_unaligned() implementation
>> which use byte access and shift.
>
> Hmm - I could imagine that we're wasting a lot of possible speed gain
> by not exploiting that feature on ARMv6+.

Or you could just realize that unaligned accesses are slow in the best
case, and are simply not supported on some processors.

If you think a little bit, I bet you could come up with a solution that
operates at cacheline-aligned granularity, something that would be _even
faster_ than simply fixing the code to do aligned accesses.

Jeff



2012-08-16 16:20:59

by Andi Kleen

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

> If you think a little bit, I bet you could come up with a solution that
> operates at cacheline-aligned granularity, something that would be _even
> faster_ than simply fixing the code to do aligned accesses.

Cache aligned compression is unlikely to compress anything at all.
Compression algorithms are usually by definition unaligned.

-Andi

2012-08-16 16:48:56

by Jeff Garzik

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On 08/16/2012 12:20 PM, Andi Kleen wrote:
>> If you think a little bit, I bet you could come up with a solution that
>> operates at cacheline-aligned granularity, something that would be _even
>> faster_ than simply fixing the code to do aligned accesses.
>
> Cache aligned compression is unlikely to compress anything at all.
> Compression algorithms are usually by definition unaligned.

Sure it's a bitstream, but that does not imply the impossibility of
reading data in in an word-aligned manner.

Maybe cache-aligned is ambitious, because of resultant code bloat, but
machine-int-aligned is doable and reasonable.

Jeff



2012-08-16 17:23:17

by Johannes Stezenbach

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Thu, Aug 16, 2012 at 12:48:49PM -0400, Jeff Garzik wrote:
> On 08/16/2012 12:20 PM, Andi Kleen wrote:
> >>If you think a little bit, I bet you could come up with a solution that
> >>operates at cacheline-aligned granularity, something that would be _even
> >>faster_ than simply fixing the code to do aligned accesses.
> >
> >Cache aligned compression is unlikely to compress anything at all.
> >Compression algorithms are usually by definition unaligned.
>
> Sure it's a bitstream, but that does not imply the impossibility of
> reading data in in an word-aligned manner.
>
> Maybe cache-aligned is ambitious, because of resultant code bloat,
> but machine-int-aligned is doable and reasonable.

Well, I for one would be content if the old and new lzo versions
could be merged based on CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

(Assuming that the slowdown on ARM is due to unaligned access,
since the old version also uses get/put_unaligned, is the new
version actually using more unaligned accesses?)


Johannes

2012-08-16 17:25:55

by Roman Mamedov

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Thu, 16 Aug 2012 17:06:47 +0200
Johannes Stezenbach <[email protected]> wrote:

> Well, ~2x speedup on x86 is certainly a good achievement, but there
> are more ARM based devices than there are PCs, and I guess many
> embedded devices use lzo compressed kernels and file systems
> while I'm not convinced many PCs rely on lzo in the kernel.

Keep in mind that a major user of LZO is the BTRFS filesystem, and I believe it
is used much more often on larger machines than on ARM, in fact it had problems
operating on ARM at all, until quite recently.

> I know everyone's either busy or on vacation, but it would
> be so cool if someone could test on a more modern ARM core,
> with the userspace test code I posted it should be easy to do.

I have locked the Allwinner A10 CPU in my Mele A2000 to 60 MHz using cpufreq-set,
and ran your test. rnd.lzo is a 9 MB file from /dev/urandom compressed with lzo.
There doesn't seem to be a significant difference between all three variants.

# time for i in {1..20}; do old/unlzop < rnd.lzo >/dev/null ; done

real 0m11.353s
user 0m3.060s
sys 0m8.170s
# time for i in {1..20}; do new/unlzop < rnd.lzo >/dev/null ; done

real 0m11.416s
user 0m3.030s
sys 0m8.200s
# time for i in {1..20}; do test/unlzop < rnd.lzo >/dev/null ; done

real 0m11.310s
user 0m3.100s
sys 0m8.150s

--
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."


Attachments:
signature.asc (198.00 B)

2012-08-16 17:52:40

by Andi Kleen

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

> I have locked the Allwinner A10 CPU in my Mele A2000 to 60 MHz using cpufreq-set,
> and ran your test. rnd.lzo is a 9 MB file from /dev/urandom compressed with lzo.
> There doesn't seem to be a significant difference between all three variants.

I found that in compression benchmarks it depends a lot on the data
compressed.

urandom (which should be essentially incompressible) will be handled
by different code paths in the compressor than other more compressible data.
It becomes a complicated memcpy then.

But then there are IO benchmarks which also only do zeroes, which
also gives an unrealistic picture.

Usually it's best to use some corpus with different data types, from very
compressible to less so; and look at the aggregate.

For my snappy work I usually had at least large executables (medium) and some
pdfs (already compressed; low) and then uncompressed source code tars (high
compression)

-Andi

2012-08-16 18:19:00

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Thu, Aug 16, 2012 at 7:52 PM, Andi Kleen <[email protected]> wrote:
>> I have locked the Allwinner A10 CPU in my Mele A2000 to 60 MHz using cpufreq-set,
>> and ran your test. rnd.lzo is a 9 MB file from /dev/urandom compressed with lzo.
>> There doesn't seem to be a significant difference between all three variants.
>
> I found that in compression benchmarks it depends a lot on the data
> compressed.
>
> urandom (which should be essentially incompressible) will be handled
> by different code paths in the compressor than other more compressible data.
> It becomes a complicated memcpy then.

In addition, locking the CPU to 60 MHz may improve the memory access penalty,
as a cache miss may cost much less cycles.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2012-08-16 22:18:07

by Andi Kleen

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Thu, Aug 16, 2012 at 11:55:06AM -0700, james northrup wrote:
> looks like ARM results are inconclusive from a lot of folks without
> bandwidth to do a write-up, what about just plain STAGING status for ARM so
> the android tweakers can beat on it for a while?

Staging only really works for new drivers, not for updating existing
library functions like this.

I suppose you could keep both and have the architecture select with a
CONFIG.

-Andi

2012-08-17 01:23:10

by Mitch Harder

[permalink] [raw]
Subject: Re: [GIT PULL] Update LZO compression

On Thu, Aug 16, 2012 at 5:17 PM, Andi Kleen <[email protected]> wrote:
> On Thu, Aug 16, 2012 at 11:55:06AM -0700, james northrup wrote:
>> looks like ARM results are inconclusive from a lot of folks without
>> bandwidth to do a write-up, what about just plain STAGING status for ARM so
>> the android tweakers can beat on it for a while?
>
> Staging only really works for new drivers, not for updating existing
> library functions like this.
>
> I suppose you could keep both and have the architecture select with a
> CONFIG.
>

I've been doing some rough benchmarking with the updated LZO in btrfs.

My tests primarily consist of timing some typical copying, git
manipulating, and running rsync using a set of kernel git sources.
Git sources are typically about 50% pack files which won't compress
very well, with the remainder being mostly highly compressible source
files.

Of course, any underlying speed improvement attributable only to LZO
is not shown by test like this. But I thought it would be interesting
to see the impact in some typical real-world btrfs operations.

I was seeing between 3-9% improvement in speed with the new LZO.

Copying several directories of git sources showed the most
improvement, ~9%. Typical git operations, such as a git checkout or
git status where only showing 3-5% improvement, which is close to the
noise level of my tests. Running multiple rsync processes showed a 5%
improvement.

With only 10 trials (5 with each LZO), I can't say I would
statistically hang my hat on these numbers.

Given all the other stuff that is going on in my rough benchmarks, a
3-9% improvement from a single change is probably pretty good.