2003-07-01 15:31:10

by Jörn Engel

[permalink] [raw]
Subject: [PATCH RFC] 2.5.73 zlib #1 memmove

[ I currently have some mail problems that might take a while to
resolve. wohnheim.fh-wedel.de has a new dns-entry and the old one
has a max lifetime on one week. :(

I'm trying to resolve that problem but please keep linux-kernel on
CC:. That way I can at least read the archives until the problems
are history. ]

Joakim found a performance issue in the zlib and after some back and
forth, it appears that memmove() should be the correct solution.

The code in question copies a string from the LZ77 sliding window into
the output buffer. The string is 3-258 bytes long with a tendency
towards small strings. The zlib uses this code to do the copy:

*q++ = *r++; c--;
*q++ = *r++; c--;
do {
*q++ = *r++;
} while (--c);

The first two lines are loop unrolling. Apart from being ugly, this
should also be slower than memmove(), so I propose this patch.

J?rn

--
Fantasy is more important than knowlegde. Knowlegde is limited,
while fantasy embraces the whole world.
-- Albert Einstein

--- linux-2.5.73/lib/zlib_inflate/inffast.c~zlib_memcpy 2003-06-30 03:51:54.000000000 +0200
+++ linux-2.5.73/lib/zlib_inflate/inffast.c 2003-06-30 04:22:32.000000000 +0200
@@ -20,6 +20,14 @@
#define GRABBITS(j) {while(k<(j)){b|=((uLong)NEXTBYTE)<<k;k+=8;}}
#define UNGRAB {c=z->avail_in-n;c=(k>>3)<c?k>>3:c;n+=c;p-=c;k-=c<<3;}

+static inline void memmove_update(Byte **dest, Byte **src, size_t *n)
+{
+ memmove(*dest, *src, *n);
+ *dest += *n;
+ *src += *n;
+ *n = 0;
+}
+
/* Called with number of bytes left to write in window at least 258
(the maximum string length) and number of input bytes available
at least ten. The ten bytes are six bytes for the longest length/
@@ -100,30 +108,18 @@
if (c > e)
{
c -= e; /* wrapped copy */
- do {
- *q++ = *r++;
- } while (--e);
+ memmove_update(&q, &r, &e);
r = s->window;
- do {
- *q++ = *r++;
- } while (--c);
+ memmove_update(&q, &r, &c);
}
else /* normal copy */
{
- *q++ = *r++; c--;
- *q++ = *r++; c--;
- do {
- *q++ = *r++;
- } while (--c);
+ memmove_update(&q, &r, &c);
}
}
else /* normal copy */
{
- *q++ = *r++; c--;
- *q++ = *r++; c--;
- do {
- *q++ = *r++;
- } while (--c);
+ memmove_update(&q, &r, &c);
}
break;
}


2003-07-01 16:02:16

by Jörn Engel

[permalink] [raw]
Subject: [PATCH RFC] 2.5.73 zlib #2 codefold

This patch folds three calls to memmove_update into one. This is the
same structure that was in the 1.1.3 version of the zlib as well. The
change towards 1.1.4 was mixed with a real bugfix, so it slipped
through my brain.

J?rn

--
When you close your hand, you own nothing. When you open it up, you
own the whole world.
-- Li Mu Bai in Tiger & Dragon

--- linux-2.5.73/lib/zlib_inflate/inffast.c~zlib_codefold 2003-06-30 03:57:31.000000000 +0200
+++ linux-2.5.73/lib/zlib_inflate/inffast.c 2003-06-30 04:11:37.000000000 +0200
@@ -99,28 +99,18 @@
/* do the copy */
m -= c;
r = q - d;
- if (r < s->window) /* wrap if needed */
- {
- do {
+ if (r < s->window) { /* wrap if needed */
+ while (r < s->window) {
r += s->end - s->window; /* force pointer in window */
- } while (r < s->window); /* covers invalid distances */
+ } /* covers invalid distances */
e = s->end - r;
- if (c > e)
- {
+ if (c > e) {
c -= e; /* wrapped copy */
memmove_update(&q, &r, &e);
r = s->window;
- memmove_update(&q, &r, &c);
- }
- else /* normal copy */
- {
- memmove_update(&q, &r, &c);
}
}
- else /* normal copy */
- {
- memmove_update(&q, &r, &c);
- }
+ memmove_update(&q, &r, &c);
break;
}
else if ((e & 64) == 0)

2003-07-02 08:31:44

by Joakim Tjernlund

[permalink] [raw]
Subject: RE: [PATCH RFC] 2.5.73 zlib #2 codefold

> This patch folds three calls to memmove_update into one. This is the
> same structure that was in the 1.1.3 version of the zlib as well. The
> change towards 1.1.4 was mixed with a real bugfix, so it slipped
> through my brain.
>
> J?rn
[SNIP]

Looks fine to me.

Here is another one in gen_bitlen():
Replace:
for (bits = 0; bits <= MAX_BITS; bits++) s->bl_count[bits] = 0;
with:
memset(&s->bl_count[0], 0, MAX_BITS * sizeof(s->bl_count[0]));

Also the following could should be replaced(in defutil.h):
/* ===========================================================================
* Reverse the first len bits of a code, using straightforward code (a faster
* method would use a table)
* IN assertion: 1 <= len <= 15
*/
static inline unsigned bi_reverse(unsigned code, /* the value to invert */
int len) /* its bit length */
{
register unsigned res = 0;
do {
res |= code & 1;
code >>= 1, res <<= 1;
} while (--len > 0);
return res >> 1;
}

Anybody have a table version handy?

Joakim


2003-07-02 09:12:44

by Etienne Lorrain

[permalink] [raw]
Subject: Re: [PATCH RFC] 2.5.73 zlib #1 memmove

Hi,

I do not know if you are interrested, but I already did
a lot work on zlib/gzlib in Gujin:
http://gujin.sourceforge.net/

There is a near complete rewrite of zlib, even removing the
32Kb window stuff. Unlike zlib, it is only licenced as GPL, and only
tested on i386, in fact 32 bit processors. An unusual CRC32 function
is also available, optimised to i386 instructions but without a table
to reduce the data cache page misses and code size.

Have a look at gzcopy.c:
gzcopy -t infile.gz -> test the content of the file (-t0 for quite).

The kind of testing done is in Makefile:
testgzlib: gzcopy
find /mnt/cdrom/ -name "*.tgz" -o -name "*.gz" \
-exec ./gzcopy -t {} \; 2>&1 | tee log

I did it on multi CDROMs collection like ftp.cdrom.com , it checks
all files CRC32 - the only failure were unreadable files on the device
and file not compressed by GZIP (GZIP can decompress .Z and .ZIP files)
even if someone changed their name to *.gz

My aim there was code size reduction (4-5 Kbytes of code total).
It does compile and work with GCC aliasing optimisation.

Cheers,
Etienne.

___________________________________________________________
Do You Yahoo!? -- Une adresse @yahoo.fr gratuite et en fran?ais !
Yahoo! Mail : http://fr.mail.yahoo.com

2003-07-02 16:45:05

by Jörn Engel

[permalink] [raw]
Subject: Re: [PATCH RFC] 2.5.73 zlib #2 codefold

On Wed, 2 July 2003 10:46:05 +0200, Joakim Tjernlund wrote:
>
> > This patch folds three calls to memmove_update into one. This is the
> > same structure that was in the 1.1.3 version of the zlib as well. The
> > change towards 1.1.4 was mixed with a real bugfix, so it slipped
> > through my brain.
> >
> [SNIP]
>
> Looks fine to me.
>
> Here is another one in gen_bitlen():
> Replace:
> for (bits = 0; bits <= MAX_BITS; bits++) s->bl_count[bits] = 0;
> with:
> memset(&s->bl_count[0], 0, MAX_BITS * sizeof(s->bl_count[0]));
>
> Also the following could should be replaced(in defutil.h):
> /* ===========================================================================
> * Reverse the first len bits of a code, using straightforward code (a faster
> * method would use a table)
> * IN assertion: 1 <= len <= 15
> */
> static inline unsigned bi_reverse(unsigned code, /* the value to invert */
> int len) /* its bit length */
> {
> register unsigned res = 0;
> do {
> res |= code & 1;
> code >>= 1, res <<= 1;
> } while (--len > 0);
> return res >> 1;
> }
>
> Anybody have a table version handy?

Onto my unwritten todo list with them. The next lazy afternoon will
come for sure.

Etiennes code sounds promising as well. Will have a closer look one
of those afternoons. If it does fit the description, it might be a
good alternative for those platforms it happens to run on.

On a whole, I think it is better to leave most changes out of
mainline until 2.7 is opened. At least, unless someone comes up with
an extensive test suite for correctness, throughput and interactivity.
Volunteers? ;)

J?rn

--
Data dominates. If you've chosen the right data structures and organized
things well, the algorithms will almost always be self-evident. Data
structures, not algorithms, are central to programming.
-- Rob Pike