2001-10-15 13:12:10

by Cristiano Paris

Subject: libz, libbz2, ramfs and cramfs

Hi everyone.

I'm interested in developing a file system which could take features from
ramfs and cramfs so I have a couple of questions which possibly Linus
would answer to.

First, what is the current status of these modules ? Are new features
currently being developed ?

Second, quoting from the jffs2's TODO list :

- fix zlib. It's ugly as hell and there are at least three copies in the
kernel tree

What is the current status of zlib ?

Third, is there any project which tries to implement bzip2 algorithm
inside the kernel ? Does it give better compression ratios on 1-page-long
data ?

Thanks :-)

Cristiano


2001-10-15 22:53:54

by H. Peter Anvin

Subject: Re: libz, libbz2, ramfs and cramfs

Followup to: <Pine.LNX.4.33.0110151436040.755-100000@lisa.rhpk.springfield.inwind.it>
By author: Cristiano Paris <[email protected]>
In newsgroup: linux.dev.kernel
>
> Third, is there any project which tries to implement bzip2 algorithm
> inside the kernel ? Does it give better compression ratios on 1-page-long
> data ?
>

No, in fact, it has been measured to be somewhere between the same and
significantly worse, even with the 32K chunk size that zisofs uses.
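The comparison can be repeated from user space. What follows is only a rough sketch using Python's stdlib `zlib` and `bz2` modules, not the in-kernel code; the helper name `chunked_sizes` and the sample input are made up for illustration, and real file-system contents should be substituted before drawing any conclusion.

```python
import bz2
import zlib

def chunked_sizes(data, chunk_size):
    """Return total (zlib, bz2) sizes when every chunk is compressed on its own."""
    z_total = 0
    b_total = 0
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        z_total += len(zlib.compress(chunk, 9))
        b_total += len(bz2.compress(chunk, 9))
    return z_total, b_total

# Hypothetical sample input; replace with real file contents for meaningful numbers.
sample = b"".join(
    b"line %d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(4000)
)

for size in (4096, 32768):  # one page, and the 32K chunk size mentioned for zisofs
    z, b = chunked_sizes(sample, size)
    print("%6d-byte chunks: zlib=%d bz2=%d" % (size, z, b))
```

Each chunk is compressed independently, which is what a page- or block-oriented file system has to do, so per-stream overhead counts once per chunk.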

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-10-16 04:35:37

by Keith Owens

Subject: Re: libz, libbz2, ramfs and cramfs

On Mon, 15 Oct 2001 15:06:42 +0200 (CEST),
Cristiano Paris <[email protected]> wrote:
>I'm interested in developing a file system which could take features from
>ramfs and cramfs so I have a couple of questions which possibly Linus
>would answer to.
>Second, quoting from the jffs2's TODO list :
>
>- fix zlib. It's ugly as hell and there are at least three copies in the
>kernel tree

The -ac tree is moving to a single copy of zlib, in fs/inflate_fs. It
is currently used by cramfs and zisofs. jffs2 in the -ac tree still
uses its own copy of zlib and should be converted.

2001-10-16 07:36:32

by Matt D. Robinson

Subject: Re: libz, libbz2, ramfs and cramfs

Keith Owens wrote:
> On Mon, 15 Oct 2001 15:06:42 +0200 (CEST),
> Cristiano Paris <[email protected]> wrote:
> >I'm interested in developing a file system which could take features from
> >ramfs and cramfs so I have a couple of questions which possibly Linus
> >would answer to.
> >Second, quoting from the jffs2's TODO list :
> >
> >- fix zlib. It's ugly as hell and there are at least three copies in the
> >kernel tree
>
> The -ac tree is moving to a single copy of zlib, in fs/inflate_fs. It
> is currently used by cramfs and zisofs. jffs2 in the -ac tree still
> uses its own copy of zlib and should be converted.

Any plans to fix this for the Linus tree? Also, why place this in fs?
Shouldn't this be around for PPP along with other things that
can use it (like LKCD)?

--Matt

2001-10-16 17:41:31

by David Woodhouse

Subject: Re: libz, libbz2, ramfs and cramfs


[email protected] said:
> The -ac tree is moving to a single copy of zlib, in fs/inflate_fs.
> It is currently used by cramfs and zisofs. jffs2 in the -ac tree
> still uses its own copy of zlib and should be converted.

AFAIK the new zlib doesn't (yet) do compression, so JFFS2 can't use it.

--
dwmw2


2001-10-17 08:31:32

by H. Peter Anvin

Subject: Re: libz, libbz2, ramfs and cramfs

Followup to: <[email protected]>
By author: "Matt D. Robinson" <[email protected]>
In newsgroup: linux.dev.kernel
> >
> > The -ac tree is moving to a single copy of zlib, in fs/inflate_fs. It
> > is currently used by cramfs and zisofs. jffs2 in the -ac tree still
> > uses its own copy of zlib and should be converted.
>
> Any plans to fix this for the Linus tree? Also, why place this in fs?
> Shouldn't this be around for PPP along with other things that
> can use it (like LKCD)?
>

PPP uses a nonstandard deviant of zlib, or *so I've been told*, so
that one is out.

The reason it's in fs is because I wasn't feeling sure that the memory
management as implemented is adequate for non-fs-related
applications. I might change that, though, but I wanted to move
somewhat slowly.

Memory management in zlib is nontrivial. If you port the user-space
zlib the "obvious" way to kernel space, you get memory management that
is completely unacceptable to a filesystem application -- too easy to
get random errors due to memory allocation failures.

A major problem is that the module name "deflate" is used by PPP,
despite it being a nonstandard format...

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-10-17 08:45:20

by David Woodhouse

Subject: Re: libz, libbz2, ramfs and cramfs



[email protected] said:
> PPP uses a nonstandard deviant of zlib, or *so I've been told*, so
> that one is out.

OOI, jffs2 used the PPP version of zlib, and mkfs.jffs2 uses the
normal libz. No problems have been encountered so far.

--
dwmw2


2001-10-18 03:11:24

by Paul Mackerras

Subject: Re: libz, libbz2, ramfs and cramfs

H. Peter Anvin writes:

> PPP uses a nonstandard deviant of zlib, or *so I've been told*, so
> that one is out.

PPP uses a variant of zlib with some extensions. I believe that I
didn't break zlib for normal use when I added the extensions but I
would have to check that to be 100% sure. The PPP zlib.c is based on
zlib-1.0.4, which is no longer the most recent version.

I think it would be possible to make PPP use the standard zlib but
with decreased performance. It's a long time since I looked at that
stuff though.

> A major problem is that the module name "deflate" is used by PPP,
> despite it being a nonstandard format...

No, the module name is "ppp_deflate".

Paul.

2001-10-18 03:25:47

by H. Peter Anvin

Subject: Re: libz, libbz2, ramfs and cramfs

Paul Mackerras wrote:

>
> PPP uses a variant of zlib with some extensions. I believe that I
> didn't break zlib for normal use when I added the extensions but I
> would have to check that to be 100% sure. The PPP zlib.c is based on
> zlib-1.0.4, which is no longer the most recent version.
>


What kind of extensions?


> I think it would be possible to make PPP use the standard zlib but
> with decreased performance. It's a long time since I looked at that
> stuff though.
>
> > A major problem is that the module name "deflate" is used by PPP,
> > despite it being a nonstandard format...
>
> No, the module name is "ppp_deflate".

Oh. Well, then ...

-hpa


2001-10-18 07:16:27

by Paul Mackerras

Subject: Re: libz, libbz2, ramfs and cramfs

H. Peter Anvin writes:

> What kind of extensions?

/me pages that bit of his brain in...

I added a Z_PACKET_FLUSH value for the `flush' parameter. For deflate
it is like Z_SYNC_FLUSH except that it omits the 00 00 ff ff bytes
that Z_SYNC_FLUSH will put on the end of the output. For inflate it
checks that we are at a packet boundary once we have consumed all the
input, i.e. that we have seen the "000" block type code (meaning a
"stored" block) and we are waiting for the 2-byte length.
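The deflate side of this can be approximated with an unmodified zlib. The sketch below uses Python's stdlib `zlib` module and is not the PPP code: it does a normal Z_SYNC_FLUSH, strips the trailing 00 00 ff ff by hand, and puts those bytes back before inflating. The helper names `compress_packet` and `decompress_packet` are made up for illustration, and the inflate-side state check has no counterpart here.

```python
import zlib

SYNC_TRAILER = b"\x00\x00\xff\xff"  # tail of the empty stored block from Z_SYNC_FLUSH

def compress_packet(comp, payload):
    """Compress one packet and drop the sync-flush trailer before sending."""
    out = comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH)
    if not out.endswith(SYNC_TRAILER):
        raise ValueError("expected sync-flush trailer at end of output")
    return out[:-len(SYNC_TRAILER)]

def decompress_packet(decomp, wire):
    """Re-append the trailer so a stock inflater sees a complete stored block."""
    return decomp.decompress(wire + SYNC_TRAILER)

# Raw deflate streams (negative wbits): no zlib header or checksum.
comp = zlib.compressobj(9, zlib.DEFLATED, -15)
decomp = zlib.decompressobj(-15)

packets = [b"GET /index.html HTTP/1.0\r\n\r\n", b"GET /about.html HTTP/1.0\r\n\r\n"]
for p in packets:
    wire = compress_packet(comp, p)
    assert decompress_packet(decomp, wire) == p
    print(len(p), "->", len(wire))
```

One compressor and one decompressor are kept for the whole session, so later packets can refer back to earlier ones; only the four constant bytes per packet are saved on the wire.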

I added a deflateOutputPending routine which returns the number of
bytes of data that the compressor has pending to give to you.

I added an inflateIncomp routine which takes uncompressed data and
adds it to the decompressor history. The reason this is needed is
that if PPP-deflate goes to compress a packet and it expands, it sends
the packet uncompressed instead. The receiver still needs to add the
packet data to the history though, since the packet data has been
processed by the compressor on the sending end.
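The history update can be imitated with a stock inflater by wrapping the raw packet in a deflate "stored" block and feeding that in. This is only a sketch of the idea in Python's stdlib `zlib`, not how inflateIncomp is implemented; `stored_block`, `send` and `receive` are hypothetical helpers, and the approach assumes the stream is byte-aligned between packets, which a sync flush on the sending side ensures.

```python
import struct
import zlib

def stored_block(raw):
    """Wrap raw bytes in a non-final, byte-aligned deflate stored block."""
    if len(raw) > 0xFFFF:
        raise ValueError("a stored block holds at most 65535 bytes")
    # Header byte 0x00: BFINAL=0, BTYPE=00; then LEN and its one's complement.
    return b"\x00" + struct.pack("<HH", len(raw), len(raw) ^ 0xFFFF) + raw

def send(comp, packet):
    """Hypothetical sender: returns (is_compressed, bytes_on_wire)."""
    out = comp.compress(packet) + comp.flush(zlib.Z_SYNC_FLUSH)
    if len(out) >= len(packet):
        return False, packet  # expanded: send raw, but comp has already seen it
    return True, out

def receive(decomp, is_compressed, wire):
    """Hypothetical receiver: every packet ends up in the inflate history."""
    if is_compressed:
        return decomp.decompress(wire)
    return decomp.decompress(stored_block(wire))

comp = zlib.compressobj(9, zlib.DEFLATED, -15)
decomp = zlib.decompressobj(-15)

noisy = bytes(range(256))  # no repeated substrings, so deflate cannot shrink it
packets = [b"ping pong " * 20, noisy, noisy]
flags = []
for p in packets:
    is_compressed, wire = send(comp, p)
    flags.append(is_compressed)
    assert receive(decomp, is_compressed, wire) == p
print(flags)
```

The third packet repeats the second, so the sender's compressor emits a back-reference into data the receiver only ever saw uncompressed. Decoding it succeeds only because the receiver added that packet to its history.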

I added a check so that it is legal to set strm->next_out to NULL and
the de/compressor will just discard its output data. This is useful
on the sending side for PPP-deflate because there are situations where
the transmitted data has to be added to the compressor's history but
may not be transmitted in compressed form.

I also made various other minor changes so it would all compile
happily combined together into one file and in the kernel environment.

None of these changes affect its behaviour if you use it in the normal
way, i.e. if you don't use Z_PACKET_FLUSH and don't set strm->next_out
to NULL.

Most of these things are optimizations to reduce time and memory usage
for PPP-deflate. The one thing that I don't think could be done with
the stock zlib is the check for the decompressor state that
Z_PACKET_FLUSH on inflate() provides. That is not *strictly*
necessary since it is just a check, but it does give us some chance of
detecting if we receive a corrupted compressed packet that still has
the correct FCS.

Paul.

2001-10-18 12:15:55

by Paul Mackerras

Subject: Re: libz, libbz2, ramfs and cramfs

I wrote:

> I added a deflateOutputPending routine which returns the number of
> bytes of data that the compressor has pending to give to you.

I just checked and in fact ppp_deflate doesn't use this.

> I added a check so that it is legal to set strm->next_out to NULL and
> the de/compressor will just discard its output data. This is useful
> on the sending side for PPP-deflate because there are situations where
> the transmitted data has to be added to the compressor's history but
> may not be transmitted in compressed form.

ppp_deflate doesn't use next_out = NULL in this case, but it does use
next_out = NULL when we are compressing a packet and the compressed
packet turns out to be larger than the uncompressed. With deflate
there is a limit on how much larger the compressed packet would be, so
it would be possible to give it a small extra buffer on the stack
instead of using next_out = NULL.
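The size of that expansion can be observed from user space. The snippet below is only an empirical sketch with Python's stdlib `zlib`, not a statement of zlib's formal bound; the 1500-byte packet size and the helper name `sync_flushed_size` are assumptions chosen for illustration.

```python
import random
import zlib

def sync_flushed_size(payload):
    """Size of one raw-deflate packet ended with Z_SYNC_FLUSH."""
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    return len(comp.compress(payload) + comp.flush(zlib.Z_SYNC_FLUSH))

# Seeded pseudo-random bytes stand in for an incompressible packet.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(1500))

overhead = sync_flushed_size(payload) - len(payload)
print("expansion on incompressible input:", overhead, "bytes")
```

When deflate cannot shrink the input it falls back to stored blocks, so the growth is a few header bytes per block plus the flush marker rather than anything proportional to the packet.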

If we were going to standardize on a newer zlib in the kernel, I could
change ppp_deflate to cope with that without too much pain, I think.
The main thing I would want to add is a way to check what state the
decompressor is in at the end of each packet - we want
strm->state->blocks->mode == LENS at that point, which is not
something that can be checked using the existing zlib interface.

Paul.

2001-10-18 17:34:02

by H. Peter Anvin

Subject: Re: libz, libbz2, ramfs and cramfs

Followup to: <[email protected]>
By author: Paul Mackerras <[email protected]>
In newsgroup: linux.dev.kernel
>
> ppp_deflate doesn't use next_out = NULL in this case, but it does use
> next_out = NULL when we are compressing a packet and the compressed
> packet turns out to be larger than the uncompressed. With deflate
> there is a limit on how much larger the compressed packet would be, so
> it would be possible to give it a small extra buffer on the stack
> instead of using next_out = NULL.
>

Discarding data is an important operation -- in zisofs that can happen
on a page lock conflict; I just use a dummy page for that.

> If we were going to standardize on a newer zlib in the kernel, I could
> change ppp_deflate to cope with that without too much pain, I think.
> The main thing I would want to add is a way to check what state the
> decompressor is in at the end of each packet - we want
> strm->state->blocks->mode == LENS at that point, which is not
> something that can be checked using the existing zlib interface.

The big issue is memory management. I *think* the memory policy I
implemented in inflate_fs would work for PPP: you have to provide a
memory area of sufficient size, about 40K, at the time you open a
stream. In the case of PPP this could probably just be vmalloc()'d at
the time the interface is created. In the fs case, this is not
acceptable; rather, the filesystem in question has to maintain a
preallocation of memory and mutex it properly.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>