This small patchset adds support for compressing initial ramdisks
in the LZO (Lempel-Ziv-Oberhumer) format. It has been tested on x86
and x86_64. The patches apply to current git mainline.
In a pair of real-world examples, an Eee 901 boots ~120 ms faster.
A high-end desktop only saves an inconsequential 35 ms.
Specs:
* LZO data decompresses in 55-60% of the time needed by gzip,
when using the 'fast' decompressor.
* The decompressor is very small, < 2 kB on x86.
* An LZO-compressed initramfs takes 7-10% more disk space than gzip.
Test results:
initramfs.cpio: 14174 kB
initramfs.gz: 4657 kB
initramfs.lzo: 5007 kB
               Eee 901            Core i7 920
               SSD 30 MB/s        HDD 60 MB/s
               gz      lzo        gz      lzo
  disk read   152      163        76       82
  unpacking   247      113        95       54
  -------------------------------------------
  total       399      276       171      136
  net gain         123                 35
All values are milliseconds. Disk read times are estimates.
I have a patch that adds support for LZO-compressed kernels as well,
but decided not to include it now since it's only implemented for x86.
Cheers,
Andreas
This patch adds an LZO decompressor tweaked to be faster than
the 'safe' decompressor already in the kernel.
On x86_64, it runs in roughly 80% of the time needed by the safe
decompressor.
This function is inherently insecure and can cause buffer overruns.
It is only intended for decompressing implicitly trusted data, such
as an initramfs and the kernel itself.
As such, the function is neither exported nor declared in a header.
Signed-off-by: Andreas Robinson <[email protected]>
---
lib/lzo/lzo1x_decompress_fast.c | 206 +++++++++++++++++++++++++++++++++++++++
1 files changed, 206 insertions(+), 0 deletions(-)
create mode 100644 lib/lzo/lzo1x_decompress_fast.c
diff --git a/lib/lzo/lzo1x_decompress_fast.c b/lib/lzo/lzo1x_decompress_fast.c
new file mode 100644
index 0000000..4f7908f
--- /dev/null
+++ b/lib/lzo/lzo1x_decompress_fast.c
@@ -0,0 +1,206 @@
+/*
+ * LZO1X Decompressor from MiniLZO
+ *
+ * Copyright (C) 1996-2005 Markus F.X.J. Oberhumer <[email protected]>
+ *
+ * The full LZO package can be found at:
+ * http://www.oberhumer.com/opensource/lzo/
+ *
+ * Changed for kernel use by:
+ * Nitin Gupta <[email protected]>
+ * Richard Purdie <[email protected]>
+ *
+ * Added 'fast' decompressor:
+ * Andreas Robinson <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/lzo.h>
+#include <asm/byteorder.h>
+#include <asm/unaligned.h>
+
+#define M2_MAX_OFFSET 0x0800
+#define COPY4(dst, src) \
+ put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
+
+/* Faster lzo1x decompression.
+ *
+ * This function is inherently insecure and can cause buffer overruns.
+ * It is not intended for general use and should never be exported or
+ * passed untrusted data.
+ *
+ * The function can also overrun the destination buffer by up to 3 bytes
+ * for performance reasons. Be sure to size the output buffer accordingly.
+ *
+ * Implementation notes - comparisons to the 'safe' decompressor:
+ *
+ * - Removed bounds checking.
+ * - Made copying 32-bit whenever possible.
+ * This is what causes 3-byte overruns of the destination buffer.
+ * - Added branch prediction hints.
+ * The likely/unlikely choices were made based on performance testing
+ * with a 5MB compressed initramfs.
+ */
+int lzo1x_decompress_fast(const unsigned char *in, size_t in_len,
+ unsigned char *out, size_t *out_len)
+{
+ const unsigned char * const ip_end = in + in_len;
+ const unsigned char *ip = in, *m_pos;
+ unsigned char *op = out;
+ long t;
+
+ *out_len = 0;
+
+ if (*ip > 17) {
+ t = *ip++ - 17;
+ if (t < 4)
+ goto match_next;
+ do {
+ *op++ = *ip++;
+ } while (--t > 0);
+ goto first_literal_run;
+ }
+
+	while (ip < ip_end) {
+ t = *ip++;
+ if (t >= 16)
+ goto match;
+ if (t == 0) {
+ while (*ip == 0) {
+ t += 255;
+ ip++;
+ }
+ t += 15 + *ip++;
+ }
+
+ t += 3;
+ while (t > 0) {
+ COPY4(op, ip);
+ op += 4;
+ ip += 4;
+ t -= 4;
+ }
+ op += t;
+ ip += t;
+
+first_literal_run:
+ t = *ip++;
+ if (t >= 16)
+ goto match;
+ m_pos = op - (1 + M2_MAX_OFFSET);
+ m_pos -= t >> 2;
+ m_pos -= *ip++ << 2;
+
+ *op++ = *m_pos++;
+ *op++ = *m_pos++;
+ *op++ = *m_pos;
+
+ goto match_done;
+
+ do {
+match:
+ if (likely(t >= 64)) {
+ unsigned char u = t;
+ m_pos = op - 1;
+ m_pos -= (t >> 2) & 7;
+ m_pos -= *ip++ << 3;
+ t = (t >> 5) - 1;
+ /* 0 <= t <= 6 */
+
+ *op++ = *m_pos++;
+ *op++ = *m_pos++;
+ do {
+ *op++ = *m_pos++;
+ } while (--t > 0);
+
+ t = u & 3;
+ if (t == 0)
+ break;
+ goto match_next;
+
+ } else if (t >= 32) {
+ t &= 31;
+ if (t == 0) {
+ while (unlikely(*ip == 0)) {
+ t += 255;
+ ip++;
+ }
+ t += 31 + *ip++;
+ }
+ m_pos = op - 1;
+ m_pos -= get_unaligned_le16(ip) >> 2;
+ ip += 2;
+
+ } else if (t >= 16) {
+ m_pos = op;
+ m_pos -= (t & 8) << 11;
+
+ t &= 7;
+ if (t == 0) {
+ while (unlikely(*ip == 0)) {
+ t += 255;
+ ip++;
+ }
+ t += 7 + *ip++;
+ }
+ m_pos -= get_unaligned_le16(ip) >> 2;
+ ip += 2;
+ if (m_pos == op)
+ goto eof_found;
+ m_pos -= 0x4000;
+
+ } else {
+ m_pos = op - 1;
+ m_pos -= t >> 2;
+ m_pos -= *ip++ << 2;
+
+ *op++ = *m_pos++;
+ *op++ = *m_pos;
+ goto match_done;
+ }
+
+ if (t >= 2 * 4 - (3 - 1) && (op - m_pos) >= 4) {
+ COPY4(op, m_pos);
+ op += 4;
+ m_pos += 4;
+ t -= 4 - (3 - 1);
+ do {
+ COPY4(op, m_pos);
+ op += 4;
+ m_pos += 4;
+ t -= 4;
+ } while (t >= 4);
+ while (t > 0) {
+ *op++ = *m_pos++;
+ t--;
+ }
+ } else {
+ /* copy_match */
+ *op++ = *m_pos++;
+ *op++ = *m_pos++;
+ do {
+ *op++ = *m_pos++;
+ } while (--t > 0);
+ }
+match_done:
+ t = ip[-2] & 3;
+ if (t == 0)
+ break;
+match_next: /* 1 <= t <= 3 */
+ COPY4(op, ip);
+ op += t;
+ ip += t;
+
+ t = *ip++;
+ } while (ip < ip_end);
+ }
+
+ *out_len = op - out;
+ return LZO_E_EOF_NOT_FOUND;
+
+eof_found:
+ *out_len = op - out;
+ return (ip == ip_end ? LZO_E_OK :
+ (ip < ip_end ? LZO_E_INPUT_NOT_CONSUMED : LZO_E_INPUT_OVERRUN));
+}
+
--
1.5.6.3
Add support for loading initial ramdisks packed with lzop
using the lzo1x compression method.
Signed-off-by: Andreas Robinson <[email protected]>
---
include/linux/decompress/unlzo.h | 11 ++
include/linux/lzo.h | 4 +
lib/Makefile | 1 +
lib/decompress.c | 5 +
lib/decompress_unlzo.c | 363 ++++++++++++++++++++++++++++++++++++++
scripts/gen_initramfs_list.sh | 1 +
usr/Kconfig | 30 +++-
usr/Makefile | 6 +-
usr/initramfs_data.lzo.S | 29 +++
9 files changed, 443 insertions(+), 7 deletions(-)
create mode 100644 include/linux/decompress/unlzo.h
create mode 100644 lib/decompress_unlzo.c
create mode 100644 usr/initramfs_data.lzo.S
diff --git a/include/linux/decompress/unlzo.h b/include/linux/decompress/unlzo.h
new file mode 100644
index 0000000..b780ce2
--- /dev/null
+++ b/include/linux/decompress/unlzo.h
@@ -0,0 +1,11 @@
+#ifndef DECOMPRESS_UNLZO_H
+#define DECOMPRESS_UNLZO_H
+
+int unlzo(unsigned char *in, int in_len,
+ int(*fill)(void*, unsigned int),
+ int(*flush)(void*, unsigned int),
+ unsigned char *output, int *pos,
+ void(*error)(char *x)
+ );
+
+#endif
diff --git a/include/linux/lzo.h b/include/linux/lzo.h
index d793497..896f1dc 100644
--- a/include/linux/lzo.h
+++ b/include/linux/lzo.h
@@ -40,5 +40,9 @@ int lzo1x_decompress_safe(const unsigned char *src, size_t src_len,
#define LZO_E_EOF_NOT_FOUND (-7)
#define LZO_E_INPUT_NOT_CONSUMED (-8)
#define LZO_E_NOT_YET_IMPLEMENTED (-9)
+/* Used in decompress_unlzo.c */
+#define LZO_E_INVALID_FORMAT (-10)
+#define LZO_E_INVALID_PARAM (-11)
+#define LZO_E_CORRUPTED (-12)
#endif
diff --git a/lib/Makefile b/lib/Makefile
index 051a33a..6a81b0d 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -65,6 +65,7 @@ obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
obj-$(CONFIG_LZO_COMPRESS) += lzo/
obj-$(CONFIG_LZO_DECOMPRESS) += lzo/
+lib-$(CONFIG_DECOMPRESS_LZO) += decompress_unlzo.o
lib-$(CONFIG_DECOMPRESS_GZIP) += decompress_inflate.o
lib-$(CONFIG_DECOMPRESS_BZIP2) += decompress_bunzip2.o
lib-$(CONFIG_DECOMPRESS_LZMA) += decompress_unlzma.o
diff --git a/lib/decompress.c b/lib/decompress.c
index d2842f5..aee1231 100644
--- a/lib/decompress.c
+++ b/lib/decompress.c
@@ -9,10 +9,14 @@
#include <linux/decompress/bunzip2.h>
#include <linux/decompress/unlzma.h>
#include <linux/decompress/inflate.h>
+#include <linux/decompress/unlzo.h>
#include <linux/types.h>
#include <linux/string.h>
+#ifndef CONFIG_DECOMPRESS_LZO
+# define unlzo NULL
+#endif
#ifndef CONFIG_DECOMPRESS_GZIP
# define gunzip NULL
#endif
@@ -28,6 +32,7 @@ static const struct compress_format {
const char *name;
decompress_fn decompressor;
} compressed_formats[] = {
+ { {0x89, 0x4c}, "lzo", unlzo },
{ {037, 0213}, "gzip", gunzip },
{ {037, 0236}, "gzip", gunzip },
{ {0x42, 0x5a}, "bzip2", bunzip2 },
diff --git a/lib/decompress_unlzo.c b/lib/decompress_unlzo.c
new file mode 100644
index 0000000..ad2bda6
--- /dev/null
+++ b/lib/decompress_unlzo.c
@@ -0,0 +1,363 @@
+/* Simple LZO file format decompressor for Linux.
+ *
+ * Copyright (C) 2009 Andreas Robinson
+ *
+ * Derived from the LZO package
+ *
+ * Copyright (C) 1996-2008 Markus F.X.J. Oberhumer <[email protected]>
+ *
+ * This program is licensed under the terms of the Linux GPL2. See COPYING.
+ *
+ * LZO files can be generated with the lzop utility, available along with
+ * the full LZO library at http://www.oberhumer.com/opensource/lzo/
+ */
+
+#include <linux/types.h>
+#include <linux/lzo.h>
+#include <linux/zutil.h>
+#include <linux/decompress/mm.h>
+#include "lzo/lzo1x_decompress_fast.c"
+
+/* LZO file format related */
+
+/* Header flags. */
+#define F_ADLER32_D 0x00000001L
+#define F_ADLER32_C 0x00000002L
+#define F_STDIN 0x00000004L
+#define F_STDOUT 0x00000008L
+#define F_NAME_DEFAULT 0x00000010L
+#define F_DOSISH 0x00000020L
+#define F_H_EXTRA_FIELD 0x00000040L
+#define F_H_GMTDIFF 0x00000080L
+#define F_CRC32_D 0x00000100L
+#define F_CRC32_C 0x00000200L
+#define F_MULTIPART 0x00000400L
+#define F_H_FILTER 0x00000800L
+#define F_H_CRC32 0x00001000L
+#define F_H_PATH 0x00002000L
+
+#define F_MASK 0x00003FFFL
+#define F_OS_MASK 0xff000000L
+#define F_CS_MASK 0x00f00000L
+
+/* These bits must be zero. */
+#define F_RESERVED ((F_MASK | F_OS_MASK | F_CS_MASK) ^ 0xffffffffL)
+
+#define LZO_BLOCK_SIZE (256*1024l)
+
+/* Supported compression methods */
+#define M_LZO1X_1 1
+#define M_LZO1X_1_15 2
+#define M_LZO1X_999 3
+
+#if 0
+#define DPRINTK(fmt, args...) printk(KERN_DEBUG "%s " fmt, __func__, ## args)
+#else
+#define DPRINTK(fmt, ...)
+#endif
+
+#define THROW(errno) \
+ do { \
+ DPRINTK("line %d: throws %s\n", \
+ __LINE__, #errno); \
+ ret = (errno); \
+ goto error_exit; \
+ } while (0)
+
+#define READ8(src, dst) \
+ do { \
+ if (!read_field((src), (dst), 1)) \
+ THROW(LZO_E_INPUT_OVERRUN); \
+ } while (0)
+
+#define READ16(src, dst) \
+ do { \
+ if (!read_field((src), (dst), 2)) \
+ THROW(LZO_E_INPUT_OVERRUN); \
+ } while (0)
+
+#define READ32(src, dst) \
+ do { \
+ if (!read_field((src), (dst), 4)) \
+ THROW(LZO_E_INPUT_OVERRUN); \
+ } while (0)
+
+struct lzo_source {
+
+ u8 *buf; /* Start */
+ u8 *buf_end; /* End */
+ u8 *p; /* Read position */
+ size_t len; /* Length */
+ u32 chksum; /* Running adler32 checksum */
+ u32 flags; /* LZO file flags */
+};
+
+/* read_field - Read LZO header field (big endian).
+ * src: source data buffer
+ * dst: destination
+ * len: field length in bytes (1 - 4)
+ */
+static int INIT read_field(struct lzo_source *src, u32 *dst, int len)
+{
+ int i;
+ u32 b = 0;
+
+ if (src->p + len > src->buf_end)
+ return 0;
+
+ src->chksum = zlib_adler32(src->chksum, src->p, len);
+
+ if (dst) {
+ for (i = 0; i < len; i++)
+ b = (b << 8) | *(src->p)++;
+ *dst = b;
+ } else
+ src->p += len;
+
+ return len;
+}
+
+static int INIT unpack_block(struct lzo_source *src, u32 c_len, u32 c_chk,
+ u8 *dst, u32 d_len, u32 d_chk, int skip_checksum)
+{
+ int ret = LZO_E_OK;
+ size_t out_len;
+
+ if (!skip_checksum && (src->flags & F_ADLER32_C)) {
+ u32 chk = zlib_adler32(1, src->p, c_len);
+ if (chk != c_chk)
+ THROW(LZO_E_CORRUPTED);
+ }
+
+ out_len = d_len;
+ ret = lzo1x_decompress_fast(src->p, c_len, dst, &out_len);
+ if (out_len != d_len)
+ THROW(LZO_E_CORRUPTED);
+ if (ret != LZO_E_OK)
+ goto error_exit;
+
+ if (!skip_checksum && (src->flags & F_ADLER32_D)) {
+ u32 chk = zlib_adler32(1, dst, d_len);
+ if (chk != d_chk)
+ THROW(LZO_E_CORRUPTED);
+ }
+
+ src->p += c_len;
+
+error_exit:
+ return ret;
+}
+
+static int INIT read_block_header(struct lzo_source *src,
+ u32 *c_len, u32 *c_chk,
+ u32 *d_len, u32 *d_chk)
+{
+ int ret = LZO_E_OK;
+
+ /* Read block sizes */
+
+ READ32(src, d_len);
+
+ if (*d_len == 0)
+ return ret; /* EOF */
+ if (*d_len == 0xffffffffUL)
+ THROW(LZO_E_NOT_YET_IMPLEMENTED); /* split file */
+ if (*d_len > LZO_BLOCK_SIZE)
+ THROW(LZO_E_INVALID_PARAM);
+
+ READ32(src, c_len);
+
+ if (src->p + *c_len > src->buf_end)
+ THROW(LZO_E_INPUT_OVERRUN);
+
+ /* Read checksums */
+
+ if (src->flags & F_ADLER32_D)
+ READ32(src, d_chk);
+
+ if (src->flags & F_ADLER32_C) {
+		if (*c_len < *d_len)
+			READ32(src, c_chk);
+		else if (src->flags & F_ADLER32_D)
+			*c_chk = *d_chk;
+ else
+ THROW(LZO_E_INVALID_PARAM);
+ }
+error_exit:
+ return ret;
+}
+
+static const unsigned char lzop_magic[9] =
+ {0x89, 0x4c, 0x5a, 0x4f, 0x00, 0x0d, 0x0a, 0x1a, 0x0a};
+
+/* read_header - read the lzo file header */
+static int INIT read_header(struct lzo_source *src)
+{
+ int ret = LZO_E_OK;
+
+ int i;
+ u32 v, len;
+ u32 method, level;
+ u32 chk, checksum;
+
+ src->flags = 0;
+
+ /* Check magic number */
+
+ for (i = 0; i < 9; i++) {
+ READ8(src, &v);
+ if (v != lzop_magic[i])
+ THROW(LZO_E_INVALID_FORMAT);
+ }
+
+ src->chksum = 1;
+
+ /* Check supported versions */
+
+ READ16(src, &v); /* File format version */
+ if (v < 0x0940)
+ THROW(LZO_E_NOT_YET_IMPLEMENTED);
+
+ READ16(src, NULL); /* ignored: lib_version */
+ READ16(src, &v); /* LZO lib version needed to extract */
+
+ if (v > 0x1020 || v < 0x0900)
+ THROW(LZO_E_NOT_YET_IMPLEMENTED);
+
+ /* Check compression method and level */
+
+ READ8(src, &method);
+ READ8(src, &level);
+
+	if (((method != M_LZO1X_1) && (method != M_LZO1X_1_15) &&
+	    (method != M_LZO1X_999)) || (level > 9))
+ THROW(LZO_E_NOT_YET_IMPLEMENTED);
+
+ /* Check flags */
+
+ READ32(src, &(src->flags));
+
+ if (src->flags & (F_H_FILTER | F_MULTIPART | F_RESERVED |
+ F_H_CRC32 | F_CRC32_C | F_CRC32_D))
+ THROW(LZO_E_NOT_YET_IMPLEMENTED);
+
+ if ((src->flags & F_ADLER32_D) == 0)
+ THROW(LZO_E_INVALID_PARAM); /* Decompressed checksum required */
+
+ /* Skip uninteresting fields */
+
+ READ32(src, NULL); /* mode */
+ READ32(src, NULL); /* mtime_low */
+ READ32(src, NULL); /* mtime_high */
+
+ /* Skip original file name */
+
+ READ8(src, &len);
+ for (i = 0; i < len; i++)
+ READ8(src, NULL);
+
+ /* Test header checksum */
+
+ chk = src->chksum;
+ READ32(src, &checksum);
+ if (checksum != chk)
+ THROW(LZO_E_CORRUPTED);
+
+ /* Skip extra field */
+
+ if (src->flags & F_H_EXTRA_FIELD) {
+ src->chksum = 1;
+ READ32(src, &len);
+ for (i = 0; i < len; i++)
+ READ8(src, NULL);
+
+ chk = src->chksum;
+ READ32(src, &checksum);
+ if (checksum != chk)
+ THROW(LZO_E_CORRUPTED);
+ }
+
+error_exit:
+ return ret;
+}
+
+int INIT unlzo(unsigned char *in, int in_len,
+ int(*fill)(void*, unsigned int),
+ int(*flush)(void*, unsigned int),
+ unsigned char *out, int *pos,
+ void(*error)(char *x))
+{
+ int ret = LZO_E_OK;
+ struct lzo_source src;
+
+ u8 *buf = NULL;
+ size_t buf_len;
+
+ src.buf = in;
+ src.buf_end = in + in_len;
+ src.len = in_len;
+ src.p = src.buf;
+
+ if (in_len == 0 || fill != NULL) {
+ error("lzo: requested feature not implemented");
+ THROW(LZO_E_NOT_YET_IMPLEMENTED);
+ }
+
+ ret = read_header(&src);
+ if (ret != LZO_E_OK) {
+ error("lzo: bad data header");
+ goto error_exit;
+ }
+
+ if (flush) {
+ buf_len = LZO_BLOCK_SIZE;
+ buf = malloc(buf_len);
+ if (!buf) {
+ error("lzo: out of memory");
+ THROW(LZO_E_OUT_OF_MEMORY);
+ }
+ out = buf;
+ } else if (!out)
+ THROW(LZO_E_ERROR);
+ /* else assume caller has a large enough buffer. */
+
+ while (src.p < src.buf_end && ret == LZO_E_OK) {
+ u32 c_len, c_chk, d_len, d_chk;
+
+ ret = read_block_header(&src, &c_len, &c_chk, &d_len, &d_chk);
+ if (ret != LZO_E_OK) {
+ error("lzo: bad block header");
+ goto error_exit;
+ }
+ if (d_len == 0)
+ break;
+ if (d_len > LZO_BLOCK_SIZE) {
+ error("lzo: invalid block size");
+ THROW(LZO_E_INVALID_PARAM);
+ }
+
+ ret = unpack_block(&src, c_len, c_chk, out, d_len, d_chk, 1);
+ if (ret != LZO_E_OK) {
+ error("lzo: decompression error");
+ goto error_exit;
+ }
+
+ if (flush) {
+ if (flush(buf, d_len) != d_len) {
+ error("lzo: write error");
+ THROW(LZO_E_ERROR);
+ }
+ } else {
+ out += d_len;
+ }
+ }
+
+error_exit:
+ if (pos)
+ *pos = src.p - src.buf;
+ if (buf)
+ free(buf);
+ return (ret == LZO_E_OK) ? 0 : -1;
+}
+
+#define decompress unlzo
diff --git a/scripts/gen_initramfs_list.sh b/scripts/gen_initramfs_list.sh
index 3eea8f1..f575b87 100644
--- a/scripts/gen_initramfs_list.sh
+++ b/scripts/gen_initramfs_list.sh
@@ -239,6 +239,7 @@ case "$arg" in
output_file="$1"
cpio_list="$(mktemp ${TMPDIR:-/tmp}/cpiolist.XXXXXX)"
output=${cpio_list}
+ echo "$output_file" | grep -q "\.lzo$" && compr="lzop -9 -f"
echo "$output_file" | grep -q "\.gz$" && compr="gzip -9 -f"
echo "$output_file" | grep -q "\.bz2$" && compr="bzip2 -9 -f"
echo "$output_file" | grep -q "\.lzma$" && compr="lzma -9 -f"
diff --git a/usr/Kconfig b/usr/Kconfig
index 588c588..547630d 100644
--- a/usr/Kconfig
+++ b/usr/Kconfig
@@ -45,6 +45,15 @@ config INITRAMFS_ROOT_GID
If you are not sure, leave it set to "0".
+config RD_LZO
+ bool "Support initial ramdisks compressed using lzop" if EMBEDDED
+ default !EMBEDDED
+ depends on BLK_DEV_INITRD
+ select DECOMPRESS_LZO
+ help
+ Support loading of a lzo encoded initial ramdisk or cpio buffer.
+ If unsure, say N.
+
config RD_GZIP
bool "Support initial ramdisks compressed using gzip" if EMBEDDED
default y
@@ -106,20 +115,29 @@ config INITRAMFS_COMPRESSION_NONE
both the cpio image and the unpacked filesystem image will
be present in memory simultaneously
+config INITRAMFS_COMPRESSION_LZO
+ bool "LZO"
+ depends on RD_LZO
+ help
+ Lempel Ziv Oberhumer compression. Its compression ratio is
+ the poorest among the four choices; maximum compression yields
+ roughly 7-10% larger initramfs compared to gzip. However,
+ decompression time is only 55 - 60% of that of gzip.
+
config INITRAMFS_COMPRESSION_GZIP
bool "Gzip"
depends on RD_GZIP
help
The old and tried gzip compression. Its compression ratio is
- the poorest among the 3 choices; however its speed (both
- compression and decompression) is the fastest.
+ worse than that of bzip2 and lzma; however compression and
+ decompression are faster.
config INITRAMFS_COMPRESSION_BZIP2
bool "Bzip2"
depends on RD_BZIP2
help
Its compression ratio and speed is intermediate.
- Decompression speed is slowest among the three. The initramfs
+ Decompression speed is slowest among the four. The initramfs
size is about 10% smaller with bzip2, in comparison to gzip.
Bzip2 uses a large amount of memory. For modern kernels you
will need at least 8MB RAM or more for booting.
@@ -129,9 +147,9 @@ config INITRAMFS_COMPRESSION_LZMA
depends on RD_LZMA
help
The most recent compression algorithm.
- Its ratio is best, decompression speed is between the other
- two. Compression is slowest. The initramfs size is about 33%
- smaller with LZMA in comparison to gzip.
+	  Its ratio is best; decompression speed is between that of gzip
+	  and bzip2. Compression is slowest. The initramfs size is about
+	  33% smaller with LZMA in comparison to gzip.
endchoice
diff --git a/usr/Makefile b/usr/Makefile
index b84894b..4a887d2 100644
--- a/usr/Makefile
+++ b/usr/Makefile
@@ -9,6 +9,9 @@ PHONY += klibcdirs
# No compression
suffix_$(CONFIG_INITRAMFS_COMPRESSION_NONE) =
+# Lzo
+suffix_$(CONFIG_INITRAMFS_COMPRESSION_LZO) = .lzo
+
# Gzip, but no bzip2
suffix_$(CONFIG_INITRAMFS_COMPRESSION_GZIP) = .gz
@@ -48,7 +51,8 @@ endif
quiet_cmd_initfs = GEN $@
cmd_initfs = $(initramfs) -o $@ $(ramfs-args) $(ramfs-input)
-targets := initramfs_data.cpio.gz initramfs_data.cpio.bz2 initramfs_data.cpio.lzma initramfs_data.cpio
+targets := initramfs_data.cpio.lzo initramfs_data.cpio.gz \
+ initramfs_data.cpio.bz2 initramfs_data.cpio.lzma initramfs_data.cpio
# do not try to update files included in initramfs
$(deps_initramfs): ;
diff --git a/usr/initramfs_data.lzo.S b/usr/initramfs_data.lzo.S
new file mode 100644
index 0000000..5921190
--- /dev/null
+++ b/usr/initramfs_data.lzo.S
@@ -0,0 +1,29 @@
+/*
+ initramfs_data includes the compressed binary that is the
+ filesystem used for early user space.
+ Note: Older versions of "as" (prior to binutils 2.11.90.0.23
+   released on 2001-07-14) did not support .incbin.
+ If you are forced to use older binutils than that then the
+ following trick can be applied to create the resulting binary:
+
+
+ ld -m elf_i386 --format binary --oformat elf32-i386 -r \
+ -T initramfs_data.scr initramfs_data.cpio.gz -o initramfs_data.o
+ ld -m elf_i386 -r -o built-in.o initramfs_data.o
+
+ initramfs_data.scr looks like this:
+SECTIONS
+{
+ .init.ramfs : { *(.data) }
+}
+
+   The above example is for i386 - the parameters vary between architectures.
+   If needed, look up LDFLAGS_BLOB in an older version of the
+ arch/$(ARCH)/Makefile to see the flags used before .incbin was introduced.
+
+ Using .incbin has the advantage over ld that the correct flags are set
+ in the ELF header, as required by certain architectures.
+*/
+
+.section .init.ramfs,"a"
+.incbin "usr/initramfs_data.cpio.lzo"
--
1.5.6.3
Andreas Robinson wrote:
> This patch adds an LZO decompressor tweaked to be faster than
> the 'safe' decompressor already in the kernel.
>
> On x86_64, it runs in roughly 80% of the time needed by the safe
> decompressor.
>
> This function is inherently insecure and can cause buffer overruns.
> It is only intended for decompressing implicitly trusted data, such
> as an initramfs and the kernel itself.
>
> As such, the function is neither exported nor declared in a header.
>
OK, I'm more than a bit nervous about that, especially since we're
trying to make the decompression functions more generic.
Furthermore, is there a specific reason you didn't implement this for
the kernel itself as well as for the initramfs? I would really strongly
prefer that the two compression sets didn't diverge.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
On Wed, 2009-04-01 at 09:12 -0700, H. Peter Anvin wrote:
> Andreas Robinson wrote:
> > This patch adds an LZO decompressor tweaked to be faster than
> > the 'safe' decompressor already in the kernel.
> >
> > On x86_64, it runs in roughly 80% of the time needed by the safe
> > decompressor.
> >
> > This function is inherently insecure and can cause buffer overruns.
> > It is only intended for decompressing implicitly trusted data, such
> > as an initramfs and the kernel itself.
> >
> > As such, the function is neither exported nor declared in a header.
> >
>
> OK, I'm more than a bit nervous about that, especially since we're
> trying to make the decompression functions more generic.
Perhaps the system can default to the safe decompressor for normal use
and choose the fast one if STATIC is defined or when system_state ==
SYSTEM_BOOTING?
> Furthermore, is there a specific reason you didn't implement this for
> the kernel itself as well as for the initramfs? I would really strongly
> prefer that the two compression sets didn't diverge.
There is a patch, but I wanted to be sure that I had not missed
anything before submitting it, and also to have a look at possibly
supporting more architectures. But I'll post it shortly.
Signed-off-by: Andreas Robinson <[email protected]>
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 34bc3a8..3d1c2a6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -40,6 +40,7 @@ config X86
select HAVE_GENERIC_DMA_COHERENT if X86_32
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select USER_STACKTRACE_SUPPORT
+ select HAVE_KERNEL_LZO
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_BZIP2
select HAVE_KERNEL_LZMA
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 3ca4c19..561790b 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -4,7 +4,7 @@
# create a compressed vmlinux image from the original vmlinux
#
-targets := vmlinux vmlinux.bin vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma head_$(BITS).o misc.o piggy.o
+targets := vmlinux vmlinux.bin vmlinux.bin.lzo vmlinux.bin.gz vmlinux.bin.bz2 vmlinux.bin.lzma head_$(BITS).o misc.o piggy.o
KBUILD_CFLAGS := -m$(BITS) -D__KERNEL__ $(LINUX_INCLUDE) -O2
KBUILD_CFLAGS += -fno-strict-aliasing -fPIC
@@ -45,6 +45,8 @@ $(obj)/vmlinux.bin.all: $(vmlinux.bin.all-y) FORCE
ifeq ($(CONFIG_X86_32),y)
ifdef CONFIG_RELOCATABLE
+$(obj)/vmlinux.bin.lzo: $(obj)/vmlinux.bin.all FORCE
+ $(call if_changed,lzop)
$(obj)/vmlinux.bin.gz: $(obj)/vmlinux.bin.all FORCE
$(call if_changed,gzip)
$(obj)/vmlinux.bin.bz2: $(obj)/vmlinux.bin.all FORCE
@@ -52,6 +54,8 @@ $(obj)/vmlinux.bin.bz2: $(obj)/vmlinux.bin.all FORCE
$(obj)/vmlinux.bin.lzma: $(obj)/vmlinux.bin.all FORCE
$(call if_changed,lzma)
else
+$(obj)/vmlinux.bin.lzo: $(obj)/vmlinux.bin FORCE
+ $(call if_changed,lzop)
$(obj)/vmlinux.bin.gz: $(obj)/vmlinux.bin FORCE
$(call if_changed,gzip)
$(obj)/vmlinux.bin.bz2: $(obj)/vmlinux.bin FORCE
@@ -63,6 +67,8 @@ LDFLAGS_piggy.o := -r --format binary --oformat elf32-i386 -T
else
+$(obj)/vmlinux.bin.lzo: $(obj)/vmlinux.bin FORCE
+ $(call if_changed,lzop)
$(obj)/vmlinux.bin.gz: $(obj)/vmlinux.bin FORCE
$(call if_changed,gzip)
$(obj)/vmlinux.bin.bz2: $(obj)/vmlinux.bin FORCE
@@ -73,6 +79,7 @@ $(obj)/vmlinux.bin.lzma: $(obj)/vmlinux.bin FORCE
LDFLAGS_piggy.o := -r --format binary --oformat elf64-x86-64 -T
endif
+suffix_$(CONFIG_KERNEL_LZO) = lzo
suffix_$(CONFIG_KERNEL_GZIP) = gz
suffix_$(CONFIG_KERNEL_BZIP2) = bz2
suffix_$(CONFIG_KERNEL_LZMA) = lzma
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index e45be73..a0fd406 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -150,6 +150,10 @@ static char *vidmem;
static int vidport;
static int lines, cols;
+#ifdef CONFIG_KERNEL_LZO
+#include "../../../../lib/decompress_unlzo.c"
+#endif
+
#ifdef CONFIG_KERNEL_GZIP
#include "../../../../lib/decompress_inflate.c"
#endif
diff --git a/init/Kconfig b/init/Kconfig
index 14c483d..c59e55d 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -101,6 +101,9 @@ config LOCALVERSION_AUTO
which is done within the script "scripts/setlocalversion".)
+config HAVE_KERNEL_LZO
+ bool
+
config HAVE_KERNEL_GZIP
bool
@@ -113,7 +116,7 @@ config HAVE_KERNEL_LZMA
choice
prompt "Kernel compression mode"
default KERNEL_GZIP
- depends on HAVE_KERNEL_GZIP || HAVE_KERNEL_BZIP2 || HAVE_KERNEL_LZMA
+ depends on HAVE_KERNEL_LZO || HAVE_KERNEL_GZIP || HAVE_KERNEL_BZIP2 || HAVE_KERNEL_LZMA
help
The linux kernel is a kind of self-extracting executable.
Several compression algorithms are available, which differ
@@ -132,20 +135,29 @@ choice
If in doubt, select 'gzip'
+config KERNEL_LZO
+ bool "LZO"
+ depends on HAVE_KERNEL_LZO
+ help
+ Lempel Ziv Oberhumer compression. Its compression ratio is
+ the poorest among the four choices; maximum compression yields
+ roughly 7% larger kernels compared to gzip. However, decompression
+ time is only 55 - 60% of that of gzip.
+
config KERNEL_GZIP
bool "Gzip"
depends on HAVE_KERNEL_GZIP
help
The old and tried gzip compression. Its compression ratio is
- the poorest among the 3 choices; however its speed (both
- compression and decompression) is the fastest.
+ worse than that of bzip2 and lzma; however compression and
+ decompression are faster.
config KERNEL_BZIP2
bool "Bzip2"
depends on HAVE_KERNEL_BZIP2
help
Its compression ratio and speed is intermediate.
- Decompression speed is slowest among the three. The kernel
+ Decompression speed is slowest among the four. The kernel
size is about 10% smaller with bzip2, in comparison to gzip.
Bzip2 uses a large amount of memory. For modern kernels you
will need at least 8MB RAM or more for booting.
@@ -155,9 +167,9 @@ config KERNEL_LZMA
depends on HAVE_KERNEL_LZMA
help
The most recent compression algorithm.
- Its ratio is best, decompression speed is between the other
- two. Compression is slowest. The kernel size is about 33%
- smaller with LZMA in comparison to gzip.
+	  Its ratio is best; decompression speed is between that of gzip
+	  and bzip2. Compression is slowest. The kernel size is about 33%
+	  smaller with LZMA compared to gzip.
endchoice
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 9796195..f29037b 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -200,3 +200,11 @@ cmd_bzip2 = (bzip2 -9 < $< && $(size_append) $<) > $@ || (rm -f $@ ; false)
quiet_cmd_lzma = LZMA $@
cmd_lzma = (lzma -9 -c $< && $(size_append) $<) >$@ || (rm -f $@ ; false)
+
+# Lzo
+# ---------------------------------------------------------------------------
+
+quiet_cmd_lzop = LZOP $@
+cmd_lzop = (lzop -9 -c $< && $(size_append) $<) >$@ || (rm -f $@ ; false)
+
+
--
1.5.6.3
Andreas Robinson wrote:
>
> Perhaps the system can default to the safe decompressor for normal use
> and choose the fast one if STATIC is defined or when system_state ==
> SYSTEM_BOOTING?
>
Do we really need two pieces of code? What if we have memory corruption
during early boot - it seems we'd want to at least try to catch that
with an error message rather than just crashing.
-hpa
On Wed, 2009-04-01 at 13:55 -0700, H. Peter Anvin wrote:
> Andreas Robinson wrote:
> >
> > Perhaps the system can default to the safe decompressor for normal use
> > and choose the fast one if STATIC is defined or when system_state ==
> > SYSTEM_BOOTING?
> >
>
> Do we really need two pieces of code?
To get the higher speed offered by the fast function, yes.
Merging the two with some macro magic and then compiling twice with
different macro definitions could work, though. That might hurt the
readability of the code a bit. Or help. :-) I'll look into it.
Anyway, I assume it is maintainability rather than size you're concerned
about here?
The duplicate function adds just 1.7 KB to the kernel and you throw it
out once the kernel has finished booting.
OTOH, the safe version is far from useless.
I estimate (but haven't tested yet) that you would lose about 40 ms in
the Eee test case. That is, the boot-time savings are reduced from 123
to perhaps 85 ms which is still acceptable. It is certainly much less
complicated than the alternatives, so if that's what you would prefer I
can go that way.
> What if we have memory corruption
> during early boot - it seems we'd want to at least try to catch that
> with an error message rather than just crashing.
This is very easy to do, so consider it done.
The decompressor is essentially a glorified memcpy that can copy the
same data to several locations. You only have to check whether the write
pointer has passed the end of the output buffer to see if something went
wrong.
Andreas Robinson wrote:
> Anyway, I assume it is maintainability rather than size you're concerned
> about here?
Right, of course.
> OTOH, the safe version is far from useless.
>
> I estimate (but haven't tested yet) that you would lose about 40 ms in
> the Eee test case. That is, the boot-time savings are reduced from 123
> to perhaps 85 ms which is still acceptable. It is certainly much less
> complicated than the alternatives, so if that's what you would prefer I
> can go that way.
I think if the cost is 40 ms once during boot on a slow platform, it's
worth unifying the two codebases. I am *not* saying that I don't think
boot performance matters -- far from it -- but I think this is
probably worth the reliability and maintainability advantages of having
a single piece of code if at all possible.
Of course, if you can figure out how to avoid that and still have the
code clean, then that's another matter.
[Cc: Arjan, fast boot evangelizer. ;)]
-hpa
H. Peter Anvin wrote:
> Andreas Robinson wrote:
>> Anyway, I assume it is maintainability rather than size you're concerned
>> about here?
>
> Right, of course.
>
>> OTOH, the safe version is far from useless.
>> I estimate (but haven't tested yet) that you would lose about 40 ms in
>> the Eee test case. That is, the boot-time savings are reduced from 123
>> to perhaps 85 ms which is still acceptable. It is certainly much less
>> complicated than the alternatives, so if that's what you would prefer I
>> can go that way.
>
> I think if the cost is 40 ms once during boot on a slow platform, it's
> worth unifying the two codebases. I am *not* saying that I don't think
> boot performance matters -- far from it -- but I think this is
> probably worth the reliability and maintainability advantages of having
> a single piece of code if at all possible.
>
> Of course, if you can figure out how to avoid that and still have the
> code clean, then that's another matter.
>
> [Cc: Arjan, fast boot evangelizer. ;)]
as long as LZO is optional.... and it's documented somewhere to not use
it if you want fast speed I'm totally fine.
Hi.
On Wed, 2009-04-01 at 16:11 -0700, Arjan van de Ven wrote:
> H. Peter Anvin wrote:
> > Andreas Robinson wrote:
> >> Anyway, I assume it is maintainability rather than size you're concerned
> >> about here?
> >
> > Right, of course.
> >
> >> OTOH, the safe version is far from useless.
> >> I estimate (but haven't tested yet) that you would lose about 40 ms in
> >> the Eee test case. That is, the boot-time savings are reduced from 123
> >> to perhaps 85 ms which is still acceptable. It is certainly much less
> >> complicated than the alternatives, so if that's what you would prefer I
> >> can go that way.
> >
> > I think if the cost is 40 ms once during boot on a slow platform, it's
> > worth unifying the two codebases. I am *not* saying that I don't think
> > boot performance matters -- far from it -- but I think this is
> > probably worth the reliability and maintainability advantages of having
> > a single piece of code if at all possible.
> >
> > Of course, if you can figure out how to avoid that and still have the
> > code clean, then that's another matter.
> >
> > [Cc: Arjan, fast boot evangelizer. ;)]
>
> as long as LZO is optional.... and it's documented somewhere to not use
> it if you want fast speed I'm totally fine.
Sorry to jump in with a tangential issue, but I just noticed the thread
and it reminded me of an issue :)
Should the lzo code used via cryptoapi (I believe it's the same stuff)
be SMP safe? I've tried to use it with TuxOnIce and get image corruption
(test kernel is 64-bit). The same code works fine if I tell it to use
LZF (which comes with TuxOnIce), no compression or, IIRC, run
single-threaded.
Regards,
Nigel
Arjan van de Ven wrote:
>>
>> [Cc: Arjan, fast boot evangelizer. ;)]
>
> as long as LZO is optional.... and it's documented somewhere to not use
> it if you want fast speed I'm totally fine.
LZO is the fastest (and least dense) decompression method on the
spectrum, so I figured you'd care about it. It's faster than gzip, for
example. (Of course, if you're loading-time-bound you might be better
off using LZMA.)
-hpa
On Wed, 2009-04-01 at 15:42 -0700, H. Peter Anvin wrote:
> I think if the cost is 40 ms once during boot on a slow platform, it's
> worth unifying the two codebases. I am *not* saying that I don't think
> boot performance matters -- far from it -- but I think this is
> probably worth the reliability and maintainability advantages of having
> a single piece of code if at all possible.
>
> Of course, if you can figure out how to avoid that and still have the
> code clean, then that's another matter.
Alrighty. I will merge the two functions and if that turns out ugly, go
with the safe one, and submit a new patchset. I'll be right back. :)
Cheers,
Andreas
On Thu, 2009-04-02 at 10:40 +1100, Nigel Cunningham wrote:
> Sorry to jump in with a tangential issue, but I just noticed the
> thread
> and it reminded me of an issue :)
>
> Should the lzo code used via cryptoapi (I believe it's the same stuff)
> be SMP safe? I've tried to use it with TuxOnIce and get image corruption
> (test kernel is 64-bit). The same code works fine if I tell it to use
> LZF (which comes with TuxOnIce), no compression or, IIRC, run
> single-threaded.
Do you compress or decompress data through the crypto API?
Neither function modifies the input data and there are no static or
global variables in use, so there shouldn't be a problem there.
They do however modify the destination length argument.
But most importantly, each compressor thread needs a private work buffer
allocated through lzo_init() in crypto/lzo.c. So, if you use the crypto
API to compress and share the crypto_tfm struct among threads, that
would explain your breakage.
Cheers,
Andreas
>>>>> "Andreas" == Andreas Robinson <[email protected]> writes:
Andreas> On Wed, 2009-04-01 at 09:12 -0700, H. Peter Anvin wrote:
>> Andreas Robinson wrote:
>> > This patch adds an LZO decompressor tweaked to be faster than
>> > the 'safe' decompressor already in the kernel.
>> >
>> > On x86_64, it runs in roughly 80% of the time needed by the safe
>> > decompressor.
>> >
>> > This function is inherently insecure and can cause buffer overruns.
>> > It is only intended for decompressing implicitly trusted data, such
>> > as an initramfs and the kernel itself.
>> >
>> > As such, the function is neither exported nor declared in a header.
>> >
>>
>> OK, I'm more than a bit nervous about that, especially since we're
>> trying to make the decompression functions more generic.
Andreas> Perhaps the system can default to the safe decompressor for
Andreas> normal use and choose the fast one if STATIC is defined or
Andreas> when system_state == SYSTEM_BOOTING?
So how do you prove that data is trusted? What happens on buffer
overflow? I don't think that a 20% speedup on decompression, with a
possibility of borking the boot completely is worth it. Or are you
suggesting that people pre-test their initramfs images with this
compressor before deciding to boot from it?
Reliable booting is better than random crashes in my book.
John
Hi.
On Thu, 2009-04-02 at 14:30 +0200, Andreas Robinson wrote:
> On Thu, 2009-04-02 at 10:40 +1100, Nigel Cunningham wrote:
> > Sorry to jump in with a tangential issue, but I just noticed the
> > thread
> > and it reminded me of an issue :)
> >
> > Should the lzo code used via cryptoapi (I believe it's the same stuff)
> > be SMP safe? I've tried to use it with TuxOnIce and get image corruption
> > (test kernel is 64-bit). The same code works fine if I tell it to use
> > LZF (which comes with TuxOnIce), no compression or, IIRC, run
> > single-threaded.
>
> Do you compress or decompress data through the crypto API?
>
> Neither function modifies the input data and there are no static or
> global variables in use, so there shouldn't be a problem there.
>
> They do however modify the destination length argument.
>
> But most importantly, each compressor thread needs a private work buffer
> allocated through lzo_init() in crypto/lzo.c. So, if you use the crypto
> API to compress and share the crypto_tfm struct among threads, that
> would explain your breakage.
I use cryptoapi, and have per-cpu struct crypto_comps and per-cpu
buffers for output. As I said, it works fine if I use LZF (all I'm doing
is changing the name of the algorithm passed to crypto_alloc_comp). Here
are the relevant data structures, the initialisation routine and the
routines for reading and writing one page.
Regards,
Nigel
struct cpu_context {
	u8 *page_buffer;
	struct crypto_comp *transform;
	unsigned int len;
	char *buffer_start;
};

static DEFINE_PER_CPU(struct cpu_context, contexts);

static int toi_compress_crypto_prepare(void)
{
	int cpu;

	if (!*toi_compressor_name) {
		printk(KERN_INFO "TuxOnIce: Compression enabled but no "
				"compressor name set.\n");
		return 1;
	}

	for_each_online_cpu(cpu) {
		struct cpu_context *this = &per_cpu(contexts, cpu);

		this->transform = crypto_alloc_comp(toi_compressor_name, 0, 0);
		if (IS_ERR(this->transform)) {
			printk(KERN_INFO "TuxOnIce: Failed to initialise the "
					"%s compression transform.\n",
					toi_compressor_name);
			this->transform = NULL;
			return 1;
		}

		this->page_buffer =
			(char *) toi_get_zeroed_page(16, TOI_ATOMIC_GFP);
		if (!this->page_buffer) {
			printk(KERN_ERR
				"Failed to allocate a page buffer for TuxOnIce "
				"encryption driver.\n");
			return -ENOMEM;
		}
	}

	return 0;
}

static int toi_compress_write_page(unsigned long index,
		struct page *buffer_page, unsigned int buf_size)
{
	int ret, cpu = smp_processor_id();
	struct cpu_context *ctx = &per_cpu(contexts, cpu);

	if (!ctx->transform)
		return next_driver->write_page(index, buffer_page, buf_size);

	ctx->buffer_start = kmap(buffer_page);
	ctx->len = buf_size;

	ret = crypto_comp_compress(ctx->transform,
			ctx->buffer_start, buf_size,
			ctx->page_buffer, &ctx->len);

	kunmap(buffer_page);

	if (ret) {
		printk(KERN_INFO "Compression failed.\n");
		return ret;
	}

	mutex_lock(&stats_lock);
	toi_compress_bytes_in += buf_size;
	toi_compress_bytes_out += ctx->len;
	mutex_unlock(&stats_lock);

	if (ctx->len < buf_size) /* some compression */
		return next_driver->write_page(index,
				virt_to_page(ctx->page_buffer),
				ctx->len);
	else
		return next_driver->write_page(index, buffer_page, buf_size);
}

static int toi_compress_read_page(unsigned long *index,
		struct page *buffer_page, unsigned int *buf_size)
{
	int ret, cpu = smp_processor_id();
	unsigned int len;
	unsigned int outlen = PAGE_SIZE;
	char *buffer_start;
	struct cpu_context *ctx = &per_cpu(contexts, cpu);

	if (!ctx->transform)
		return next_driver->read_page(index, buffer_page, buf_size);

	/*
	 * All our reads must be synchronous - we can't decompress
	 * data that hasn't been read yet.
	 */
	*buf_size = PAGE_SIZE;

	ret = next_driver->read_page(index, buffer_page, &len);

	/* Error or uncompressed data */
	if (ret || len == PAGE_SIZE)
		return ret;

	buffer_start = kmap(buffer_page);
	memcpy(ctx->page_buffer, buffer_start, len);

	ret = crypto_comp_decompress(ctx->transform,
			ctx->page_buffer, len,
			buffer_start, &outlen);

	if (ret)
		abort_hibernate(TOI_FAILED_IO,
			"Compress_read returned %d.\n", ret);
	else if (outlen != PAGE_SIZE) {
		abort_hibernate(TOI_FAILED_IO,
			"Decompression yielded %d bytes instead of %ld.\n",
			outlen, PAGE_SIZE);
		printk(KERN_ERR "Decompression yielded %d bytes instead of "
			"%ld.\n", outlen, PAGE_SIZE);
		ret = -EIO;
		*buf_size = outlen;
	}

	kunmap(buffer_page);
	return ret;
}
On Thu, 2009-04-02 at 10:30 -0400, John Stoffel wrote:
> >>>>> "Andreas" == Andreas Robinson <[email protected]> writes:
>
> Andreas> On Wed, 2009-04-01 at 09:12 -0700, H. Peter Anvin wrote:
> >> OK, I'm more than a bit nervous about that, especially since we're
> >> trying to make the decompression functions more generic.
>
> Andreas> Perhaps the system can default to the safe decompressor for
> Andreas> normal use and choose the fast one if STATIC is defined or
> Andreas> when system_state == SYSTEM_BOOTING?
>
> So how do you prove that data is trusted?
The kernel and initramfs images are implicitly trusted, but that is not
unique to this implementation. None of the decompressors check the data
e.g. by comparing checksums, AFAICT.
> What happens on buffer overflow?
The overflow is detected and an error is returned to the caller ... or
the kernel just committed suicide by overwriting itself. *cough*.
OK, I will put back the checks that actually prevent output overruns.
It will still be faster, but can no longer crash the system directly.
There would remain a 3-byte overflow at the end of the output buffer, by
design. However, it can be managed so that it never becomes a problem.
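For illustration, "managing" the residual overrun could simply mean overallocating the destination (a user-space sketch; the 3-byte figure comes from the description above, and alloc_lzo_output is a made-up helper name):

```c
#include <stdlib.h>

/* The fast decompressor may write up to 3 bytes past the decompressed
 * size by design, so give the output buffer that much slack. */
#define LZO_OUTPUT_SLACK 3

/* Hypothetical helper: allocate an output buffer that tolerates the
 * documented 3-byte overrun, keeping the overwritten bytes inside
 * owned memory. */
static unsigned char *alloc_lzo_output(size_t decompressed_size)
{
	return malloc(decompressed_size + LZO_OUTPUT_SLACK);
}
```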
> I don't think that a 20% speedup on decompression, with a
> possibility of borking the boot completely is worth it. Or are you
> suggesting that people pre-test their initramfs images with this
> compressor before deciding to boot from it?
The compressor (lzop) has been around for a long while, so it's
probably OK.
In case I broke the decompressor the best I can offer is more testing.
In its present form it unpacks a 5 MB initramfs without errors. Does it
need to be tested with more data? If so, do you have any
suggestions?
Thanks,
Andreas
On Fri, 2009-04-03 at 07:59 +1100, Nigel Cunningham wrote:
> static int toi_compress_write_page(unsigned long index,
> 		struct page *buffer_page, unsigned int buf_size)
> {
> 	int ret, cpu = smp_processor_id();
> 	struct cpu_context *ctx = &per_cpu(contexts, cpu);
>
> 	if (!ctx->transform)
> 		return next_driver->write_page(index, buffer_page, buf_size);
>
> 	ctx->buffer_start = kmap(buffer_page);
>
> 	ctx->len = buf_size;
The LZO compressor can produce more bytes than it consumes but here the
output buffer is the same size as the input.
This macro in linux/lzo.h defines how big the buffer needs to be:
#define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3)
If there are multiple threads perhaps they clobber each other's output
buffers?
> 	ret = crypto_comp_compress(ctx->transform,
> 			ctx->buffer_start, buf_size,
> 			ctx->page_buffer, &ctx->len);
>
> 	kunmap(buffer_page);
>
> 	if (ret) {
> 		printk(KERN_INFO "Compression failed.\n");
> 		return ret;
> 	}
>
> 	mutex_lock(&stats_lock);
> 	toi_compress_bytes_in += buf_size;
> 	toi_compress_bytes_out += ctx->len;
> 	mutex_unlock(&stats_lock);
>
> 	if (ctx->len < buf_size) /* some compression */
> 		return next_driver->write_page(index,
> 				virt_to_page(ctx->page_buffer),
> 				ctx->len);
> 	else
> 		return next_driver->write_page(index, buffer_page, buf_size);
> }
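To make the buffer sizing concrete, here is a small sketch using that macro (the macro itself is quoted from linux/lzo.h; the allocation helper and its name are hypothetical, shown as a plain user-space analogue):

```c
#include <stdlib.h>

/* Worst-case compressed size for x input bytes, from linux/lzo.h:
 * even incompressible input can grow by x/16 + 67 bytes. */
#define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3)

/* Hypothetical helper: size the compression output buffer so that
 * even an incompressible page fits without clobbering memory. */
static void *alloc_compress_buffer(size_t in_size)
{
	return malloc(lzo1x_worst_compress(in_size));
}
```

With a 4096-byte page this allocates 4419 bytes rather than 4096, which is exactly the headroom the discussion above says the page-sized buffer lacks.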
Hi.
On Fri, 2009-04-03 at 12:54 +0200, Andreas Robinson wrote:
> The LZO compressor can produce more bytes than it consumes but here the
> output buffer is the same size as the input.
> This macro in linux/lzo.h defines how big the buffer needs to be:
> #define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3)
Okay. Am I right in thinking (from staring at the code) that the
compression algo just assumes it has an output buffer big enough? (I
don't see it checking out_len, only writing to it). If that's the case,
I guess I need to (ideally) persuade the cryptoapi guys to extend the
api so you can find out how big an output buffer is needed for a
particular compression algorithm - or learn how they've already done
that (though it doesn't look like it to me).
> If there are multiple threads perhaps they clobber each other's output
> buffers?
Nope. The output buffers you see here are fed to the next part of the
pipeline (the block I/O code), which combines them (under a mutex) into
a stream of |index|size|data|index|size|data... so that we don't have to
worry at all about which processor compressed (or decompresses data
later). As I said earlier, it's worked fine with LZF - or no compression
- for years. It's just LZO that causes me problems.
Thanks!
Nigel
On Fri, 2009-04-03 at 22:48 +1100, Nigel Cunningham wrote:
> Okay. Am I right in thinking (from staring at the code) that the
> compression algo just assumes it has an output buffer big enough? (I
> don't see it checking out_len, only writing to it).
I came to that conclusion too. And it is not just LZO that needs a
bigger buffer. Non-compressed blocks in deflate streams occupy 4 bytes
more than the original, according to RFC 1951 section 3.2.4.
> If that's the case,
> I guess I need to (ideally) persuade the cryptoapi guys to extend the
> api so you can find out how big an output buffer is needed for a
> particular compression algorithm - or learn how they've already done
> that (though it doesn't look like it to me).
I cannot see anything to that effect either.
> > If there are multiple threads perhaps they clobber each other's output
> > buffers?
>
> Nope. The output buffers you see here are fed to the next part of the
> pipeline (the block I/O code), which combines them (under a mutex) into
> a stream of |index|size|data|index|size|data... so that we don't have to
> worry at all about which processor compressed (or decompresses data
> later). As I said earlier, it's worked fine with LZF - or no compression
> - for years. It's just LZO that causes me problems.
>
> Thanks!
>
> Nigel
>
I'm glad I was able to help!
Cheers,
Andreas
Andreas Robinson wrote:
>
> The kernel and initramfs images are implicitly trusted, but that is not
> unique to this implementation. None of the decompressors check the data
> e.g. by comparing checksums, AFAICT.
>
That's not true. At least the gzip decompressor definitely checks the
resulting CRC32. Being a CRC32, it's a test against corruption, not
malicious injection, but if you have malicious injection problems this
can't help you anyway.
However, corruption problems can and do happen during boot, and it's
really important that we get some kind of useful notification.
-hpa
Hi.
On Fri, 2009-04-03 at 14:53 +0200, Andreas Robinson wrote:
> On Fri, 2009-04-03 at 22:48 +1100, Nigel Cunningham wrote:
>
> > Okay. Am I right in thinking (from staring at the code) that the
> > compression algo just assumes it has an output buffer big enough? (I
> > don't see it checking out_len, only writing to it).
>
> I came to that conclusion too. And it is not just LZO that needs a
> bigger buffer. Non-compressed blocks in deflate streams occupy 4 bytes
> more than the original, according to RFC 1951 section 3.2.4.
>
> > If that's the case,
> > I guess I need to (ideally) persuade the cryptoapi guys to extend the
> > api so you can find out how big an output buffer is needed for a
> > particular compression algorithm - or learn how they've already done
> > that (though it doesn't look like it to me).
>
> I cannot see anything to that effect either.
>
> > > If there are multiple threads perhaps they clobber each other's output
> > > buffers?
> >
> > Nope. The output buffers you see here are fed to the next part of the
> > pipeline (the block I/O code), which combines them (under a mutex) into
> > a stream of |index|size|data|index|size|data... so that we don't have to
> > worry at all about which processor compressed (or decompresses data
> > later). As I said earlier, it's worked fine with LZF - or no compression
> > - for years. It's just LZO that causes me problems.
> >
> > Thanks!
> >
> > Nigel
> >
> I'm glad I was able to help!
Vmalloc'ing a 2 * PAGE_SIZE buffer seems to have done the trick - I've
done a couple of cycles with no problems and slightly better throughput
than LZF. A couple of tests in a row of just compressing data using
first LZO then LZF gave 260MB/s vs 230MB/s throughput respectively.
Doing real writes slows things down so that the difference is only about
10MB/s (I only have a 53MB/s SATA HDD), but that's still better than a
poke in the eye!
Thanks!
Nigel
Hi,
after enabling checksumming (adler32) and adding an overrun check, based
on your suggestions and objections, the performance advantage of LZO on
the Eee 901 dropped to 67 ms with the fast, and 54 ms with the safe
decompressor. Apparently almost half of the speed boost came from
skipping the checksumming.
At this point I don't think this idea is worth pursuing further, given
that LZO ends up so close to gzip in performance after all.
But I want to thank you for discussing this topic and taking my patch
submission seriously.
I hope I did not waste too much of your time and that I will be able to
contribute something more useful next time.
Thanks,
Andreas