2019-05-17 18:07:54

by Roberto Sassu

[permalink] [raw]
Subject: [PATCH v3 0/2] initramfs: add support for xattrs in the initial ram disk

This patch set aims at solving the following use case: appraise files from
the initial ram disk. To do that, IMA checks the signature/hash from the
security.ima xattr. Unfortunately, this use case cannot be implemented
currently, as the CPIO format does not support xattrs.

This proposal consists in marshaling pathnames and xattrs in a file called
.xattr-list. They are unmarshaled by the CPIO parser after all files have
been extracted, or before the next ram disk is processed.

The difference from v1 (https://lkml.org/lkml/2018/11/22/1182) is that all
xattrs are stored in a single file and not per file (solves the file name
limitation issue, as it is not necessary to add a suffix to files
containing xattrs).

The difference with another proposal
(https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
included in an image without changing the image format, as opposed to
defining a new one. As seen from the discussion, if a new format has to be
defined, it should fix the issues of the existing format, which requires
more time.

To fulfill both requirements, adding support for xattrs in a short time and
defining a new image format properly, this patch set takes an incremental
approach: it introduces a parser of xattrs that can be used either if
xattrs are in a regular file or directly added to the image (this patch set
reuses patch 9/15 of the existing proposal); in addition, it introduces a
wrapper of the xattr parser, to read xattrs from a file.

The changes introduced by this patch set don't cause any compatibility
issue: kernels without the xattr parser simply extracts .xattr-list and
don't unmarshal xattrs; kernels with the xattr parser don't unmarshal
xattrs if .xattr-list is not found in the image.

From the kernel space perspective, backporting this functionality to older
kernels should be very easy. It is sufficient to add two calls to the new
function do_readxattrs(). From the user space perspective, no change is
required for the use case. A new dracut module (module-setup.sh) will
execute:

getfattr --absolute-names -d -h -R -e hex -m security.ima \
<file list> | xattr.awk -b > ${initdir}/.xattr-list

where xattr.awk is the script that marshals xattrs (see patch 3/3). The
same can be done with the initramfs-tools ram disk generator.

Changelog

v2:
- replace ksys_lsetxattr() with kern_path() and vfs_setxattr()
(suggested by Jann Horn)
- replace ksys_open()/ksys_read()/ksys_close() with
filp_open()/kernel_read()/fput()
(suggested by Jann Horn)
- use path variable instead of name_buf in do_readxattrs()
- set last byte of str to 0 in do_readxattrs()
- call do_readxattrs() in do_name() before replacing an existing
.xattr-list
- pass pathname to do_setxattrs()


Mimi Zohar (1):
initramfs: set extended attributes

Roberto Sassu (1):
initramfs: introduce do_readxattrs()

init/initramfs.c | 170 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 168 insertions(+), 2 deletions(-)

--
2.17.1


2019-05-17 18:34:52

by Roberto Sassu

[permalink] [raw]
Subject: [PATCH v3 1/2] initramfs: set extended attributes

From: Mimi Zohar <[email protected]>

This patch adds xattrs to a file, with name and value taken from a supplied
buffer. The data format is:

<xattr #N data len (ASCII, 8 chars)><xattr #N name>\0<xattr #N value>

[kamensky: fixed restoring of xattrs for symbolic links by using
sys_lsetxattr() instead of sys_setxattr()]

[sassu: removed state management, kept only do_setxattrs(), replaced
sys_lsetxattr() with vfs_setxattr(), added check for
xattr_entry_size, added check for hdr->c_size, replaced strlen()
with strnlen(); moved do_setxattrs() before do_name()]

Signed-off-by: Mimi Zohar <[email protected]>
Signed-off-by: Victor Kamensky <[email protected]>
Signed-off-by: Taras Kondratiuk <[email protected]>
Signed-off-by: Roberto Sassu <[email protected]>
---
init/initramfs.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 69 insertions(+), 2 deletions(-)

diff --git a/init/initramfs.c b/init/initramfs.c
index 435a428c2af1..0c6dd1d5d3f6 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -10,6 +10,8 @@
#include <linux/syscalls.h>
#include <linux/utime.h>
#include <linux/file.h>
+#include <linux/namei.h>
+#include <linux/xattr.h>

static ssize_t __init xwrite(int fd, const char *p, size_t count)
{
@@ -146,7 +148,8 @@ static __initdata time64_t mtime;

static __initdata unsigned long ino, major, minor, nlink;
static __initdata umode_t mode;
-static __initdata unsigned long body_len, name_len;
+static __initdata u32 name_len, xattr_len;
+static __initdata u64 body_len;
static __initdata uid_t uid;
static __initdata gid_t gid;
static __initdata unsigned rdev;
@@ -218,7 +221,7 @@ static void __init read_into(char *buf, unsigned size, enum state next)
}
}

-static __initdata char *header_buf, *symlink_buf, *name_buf;
+static __initdata char *header_buf, *symlink_buf, *name_buf, *xattr_buf;

static int __init do_start(void)
{
@@ -315,6 +318,70 @@ static int __init maybe_link(void)
return 0;
}

+struct xattr_hdr {
+ char c_size[8]; /* total size including c_size field */
+ char c_data[]; /* <name>\0<value> */
+};
+
+static int __init __maybe_unused do_setxattrs(char *pathname)
+{
+ char *buf = xattr_buf;
+ char *bufend = buf + xattr_len;
+ struct xattr_hdr *hdr;
+ char str[sizeof(hdr->c_size) + 1];
+ struct path path;
+
+ if (!xattr_len)
+ return 0;
+
+ str[sizeof(hdr->c_size)] = 0;
+
+ while (buf < bufend) {
+ char *xattr_name, *xattr_value;
+ unsigned long xattr_entry_size;
+ unsigned long xattr_name_size, xattr_value_size;
+ int ret;
+
+ if (buf + sizeof(hdr->c_size) > bufend) {
+ error("malformed xattrs");
+ break;
+ }
+
+ hdr = (struct xattr_hdr *)buf;
+ memcpy(str, hdr->c_size, sizeof(hdr->c_size));
+ ret = kstrtoul(str, 16, &xattr_entry_size);
+ buf += xattr_entry_size;
+ if (ret || buf > bufend || !xattr_entry_size) {
+ error("malformed xattrs");
+ break;
+ }
+
+ xattr_name = hdr->c_data;
+ xattr_name_size = strnlen(xattr_name,
+ xattr_entry_size - sizeof(hdr->c_size));
+ if (xattr_name_size == xattr_entry_size - sizeof(hdr->c_size)) {
+ error("malformed xattrs");
+ break;
+ }
+
+ xattr_value = xattr_name + xattr_name_size + 1;
+ xattr_value_size = buf - xattr_value;
+
+ ret = kern_path(pathname, 0, &path);
+ if (!ret) {
+ ret = vfs_setxattr(path.dentry, xattr_name, xattr_value,
+ xattr_value_size, 0);
+
+ path_put(&path);
+ }
+
+ pr_debug("%s: %s size: %lu val: %s (ret: %d)\n", pathname,
+ xattr_name, xattr_value_size, xattr_value, ret);
+ }
+
+ return 0;
+}
+
static __initdata int wfd;

static int __init do_name(void)
--
2.17.1

2019-05-17 18:34:55

by Roberto Sassu

[permalink] [raw]
Subject: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

This patch adds support for an alternative method to add xattrs to files in
the rootfs filesystem. Instead of extracting them directly from the ram
disk image, they are extracted from a regular file called .xattr-list, that
can be added by any ram disk generator available today. The file format is:

<file #N data len (ASCII, 10 chars)><file #N path>\0
<xattr #N data len (ASCII, 8 chars)><xattr #N name>\0<xattr #N value>

.xattr-list can be generated by executing:

$ getfattr --absolute-names -d -h -R -e hex -m - \
<file list> | xattr.awk -b > ${initdir}/.xattr-list

where the content of the xattr.awk script is:

#! /usr/bin/awk -f
{
if (!length($0)) {
printf("%.10x%s\0", len, file);
for (x in xattr) {
printf("%.8x%s\0", xattr_len[x], x);
for (i = 0; i < length(xattr[x]) / 2; i++) {
printf("%c", strtonum("0x"substr(xattr[x], i * 2 + 1, 2)));
}
}
i = 0;
delete xattr;
delete xattr_len;
next;
};
if (i == 0) {
file=$3;
len=length(file) + 8 + 1;
}
if (i > 0) {
split($0, a, "=");
xattr[a[1]]=substr(a[2], 3);
xattr_len[a[1]]=length(a[1]) + 1 + 8 + length(xattr[a[1]]) / 2;
len+=xattr_len[a[1]];
};
i++;
}

Signed-off-by: Roberto Sassu <[email protected]>
---
init/initramfs.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 99 insertions(+)

diff --git a/init/initramfs.c b/init/initramfs.c
index 0c6dd1d5d3f6..6ec018c6279a 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -13,6 +13,8 @@
#include <linux/namei.h>
#include <linux/xattr.h>

+#define XATTR_LIST_FILENAME ".xattr-list"
+
static ssize_t __init xwrite(int fd, const char *p, size_t count)
{
ssize_t out = 0;
@@ -382,6 +384,97 @@ static int __init __maybe_unused do_setxattrs(char *pathname)
return 0;
}

+struct path_hdr {
+ char p_size[10]; /* total size including p_size field */
+ char p_data[]; /* <path>\0<xattrs> */
+};
+
+static int __init do_readxattrs(void)
+{
+ struct path_hdr hdr;
+ char *path = NULL;
+ char str[sizeof(hdr.p_size) + 1];
+ unsigned long file_entry_size;
+ size_t size, path_size, total_size;
+ struct kstat st;
+ struct file *file;
+ loff_t pos;
+ int ret;
+
+ ret = vfs_lstat(XATTR_LIST_FILENAME, &st);
+ if (ret < 0)
+ return ret;
+
+ total_size = st.size;
+
+ file = filp_open(XATTR_LIST_FILENAME, O_RDONLY, 0);
+ if (IS_ERR(file))
+ return PTR_ERR(file);
+
+ pos = file->f_pos;
+
+ while (total_size) {
+ size = kernel_read(file, (char *)&hdr, sizeof(hdr), &pos);
+ if (size != sizeof(hdr)) {
+ ret = -EIO;
+ goto out;
+ }
+
+ total_size -= size;
+
+ str[sizeof(hdr.p_size)] = 0;
+ memcpy(str, hdr.p_size, sizeof(hdr.p_size));
+ ret = kstrtoul(str, 16, &file_entry_size);
+ if (ret < 0)
+ goto out;
+
+ file_entry_size -= sizeof(sizeof(hdr.p_size));
+ if (file_entry_size > total_size) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ path = vmalloc(file_entry_size);
+ if (!path) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ size = kernel_read(file, path, file_entry_size, &pos);
+ if (size != file_entry_size) {
+ ret = -EIO;
+ goto out_free;
+ }
+
+ total_size -= size;
+
+ path_size = strnlen(path, file_entry_size);
+ if (path_size == file_entry_size) {
+ ret = -EINVAL;
+ goto out_free;
+ }
+
+ xattr_buf = path + path_size + 1;
+ xattr_len = file_entry_size - path_size - 1;
+
+ ret = do_setxattrs(path);
+ vfree(path);
+ path = NULL;
+
+ if (ret < 0)
+ break;
+ }
+out_free:
+ vfree(path);
+out:
+ fput(file);
+
+ if (ret < 0)
+ error("Unable to parse xattrs");
+
+ return ret;
+}
+
static __initdata int wfd;

static int __init do_name(void)
@@ -391,6 +484,11 @@ static int __init do_name(void)
if (strcmp(collected, "TRAILER!!!") == 0) {
free_hash();
return 0;
+ } else if (strcmp(collected, XATTR_LIST_FILENAME) == 0) {
+ struct kstat st;
+
+ if (!vfs_lstat(collected, &st))
+ do_readxattrs();
}
clean_path(collected, mode);
if (S_ISREG(mode)) {
@@ -562,6 +660,7 @@ static char * __init unpack_to_rootfs(char *buf, unsigned long len)
buf += my_inptr;
len -= my_inptr;
}
+ do_readxattrs();
dir_utime();
kfree(name_buf);
kfree(symlink_buf);
--
2.17.1

2019-05-17 21:13:52

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On May 17, 2019 9:55:19 AM PDT, Roberto Sassu <[email protected]> wrote:
>This patch adds support for an alternative method to add xattrs to
>files in
>the rootfs filesystem. Instead of extracting them directly from the ram
>disk image, they are extracted from a regular file called .xattr-list,
>that
>can be added by any ram disk generator available today. The file format
>is:
>
><file #N data len (ASCII, 10 chars)><file #N path>\0
><xattr #N data len (ASCII, 8 chars)><xattr #N name>\0<xattr #N value>
>
>.xattr-list can be generated by executing:
>
>$ getfattr --absolute-names -d -h -R -e hex -m - \
> <file list> | xattr.awk -b > ${initdir}/.xattr-list
>
>where the content of the xattr.awk script is:
>
>#! /usr/bin/awk -f
>{
> if (!length($0)) {
> printf("%.10x%s\0", len, file);
> for (x in xattr) {
> printf("%.8x%s\0", xattr_len[x], x);
> for (i = 0; i < length(xattr[x]) / 2; i++) {
> printf("%c", strtonum("0x"substr(xattr[x], i * 2 + 1, 2)));
> }
> }
> i = 0;
> delete xattr;
> delete xattr_len;
> next;
> };
> if (i == 0) {
> file=$3;
> len=length(file) + 8 + 1;
> }
> if (i > 0) {
> split($0, a, "=");
> xattr[a[1]]=substr(a[2], 3);
> xattr_len[a[1]]=length(a[1]) + 1 + 8 + length(xattr[a[1]]) / 2;
> len+=xattr_len[a[1]];
> };
> i++;
>}
>
>Signed-off-by: Roberto Sassu <[email protected]>
>---
> init/initramfs.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 99 insertions(+)
>
>diff --git a/init/initramfs.c b/init/initramfs.c
>index 0c6dd1d5d3f6..6ec018c6279a 100644
>--- a/init/initramfs.c
>+++ b/init/initramfs.c
>@@ -13,6 +13,8 @@
> #include <linux/namei.h>
> #include <linux/xattr.h>
>
>+#define XATTR_LIST_FILENAME ".xattr-list"
>+
> static ssize_t __init xwrite(int fd, const char *p, size_t count)
> {
> ssize_t out = 0;
>@@ -382,6 +384,97 @@ static int __init __maybe_unused do_setxattrs(char
>*pathname)
> return 0;
> }
>
>+struct path_hdr {
>+ char p_size[10]; /* total size including p_size field */
>+ char p_data[]; /* <path>\0<xattrs> */
>+};
>+
>+static int __init do_readxattrs(void)
>+{
>+ struct path_hdr hdr;
>+ char *path = NULL;
>+ char str[sizeof(hdr.p_size) + 1];
>+ unsigned long file_entry_size;
>+ size_t size, path_size, total_size;
>+ struct kstat st;
>+ struct file *file;
>+ loff_t pos;
>+ int ret;
>+
>+ ret = vfs_lstat(XATTR_LIST_FILENAME, &st);
>+ if (ret < 0)
>+ return ret;
>+
>+ total_size = st.size;
>+
>+ file = filp_open(XATTR_LIST_FILENAME, O_RDONLY, 0);
>+ if (IS_ERR(file))
>+ return PTR_ERR(file);
>+
>+ pos = file->f_pos;
>+
>+ while (total_size) {
>+ size = kernel_read(file, (char *)&hdr, sizeof(hdr), &pos);
>+ if (size != sizeof(hdr)) {
>+ ret = -EIO;
>+ goto out;
>+ }
>+
>+ total_size -= size;
>+
>+ str[sizeof(hdr.p_size)] = 0;
>+ memcpy(str, hdr.p_size, sizeof(hdr.p_size));
>+ ret = kstrtoul(str, 16, &file_entry_size);
>+ if (ret < 0)
>+ goto out;
>+
>+ file_entry_size -= sizeof(sizeof(hdr.p_size));
>+ if (file_entry_size > total_size) {
>+ ret = -EINVAL;
>+ goto out;
>+ }
>+
>+ path = vmalloc(file_entry_size);
>+ if (!path) {
>+ ret = -ENOMEM;
>+ goto out;
>+ }
>+
>+ size = kernel_read(file, path, file_entry_size, &pos);
>+ if (size != file_entry_size) {
>+ ret = -EIO;
>+ goto out_free;
>+ }
>+
>+ total_size -= size;
>+
>+ path_size = strnlen(path, file_entry_size);
>+ if (path_size == file_entry_size) {
>+ ret = -EINVAL;
>+ goto out_free;
>+ }
>+
>+ xattr_buf = path + path_size + 1;
>+ xattr_len = file_entry_size - path_size - 1;
>+
>+ ret = do_setxattrs(path);
>+ vfree(path);
>+ path = NULL;
>+
>+ if (ret < 0)
>+ break;
>+ }
>+out_free:
>+ vfree(path);
>+out:
>+ fput(file);
>+
>+ if (ret < 0)
>+ error("Unable to parse xattrs");
>+
>+ return ret;
>+}
>+
> static __initdata int wfd;
>
> static int __init do_name(void)
>@@ -391,6 +484,11 @@ static int __init do_name(void)
> if (strcmp(collected, "TRAILER!!!") == 0) {
> free_hash();
> return 0;
>+ } else if (strcmp(collected, XATTR_LIST_FILENAME) == 0) {
>+ struct kstat st;
>+
>+ if (!vfs_lstat(collected, &st))
>+ do_readxattrs();
> }
> clean_path(collected, mode);
> if (S_ISREG(mode)) {
>@@ -562,6 +660,7 @@ static char * __init unpack_to_rootfs(char *buf,
>unsigned long len)
> buf += my_inptr;
> len -= my_inptr;
> }
>+ do_readxattrs();
> dir_utime();
> kfree(name_buf);
> kfree(symlink_buf);

Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.

A side benefit is that the format can be simpler as there is no need to encode the filename.

A technically cleaner solution still, but which would need archiver modifications, would be to encode the xattrs as an optionally nameless file (just an empty string) with a new file mode value, immediately following the original file. The advantage there is that the archiver itself could support xattrs and other extended metadata (which has been requested elsewhere); the disadvantage obviously is that that it requires new support in the archiver. However, at least it ought to be simpler since it is still a higher protocol level than the cpio archive itself.

There's already one special case in cpio, which is the "!!!TRAILER!!!" filename; although I don't think it is part of the formal spec, to the extent there is one, I would expect that in practice it is always encoded with a mode of 0, which incidentally could be used to unbreak the case where such a filename actually exists. So one way to support such extended metadata would be to set mode to 0 and use the filename to encode the type of metadata. I wonder how existing GNU or BSD cpio (the BSD one is better maintained these days) would deal with reading such a file; it would at least not be a regression if it just read it still, possibly with warnings. It could also be possible to use bits 17:16 in the mode, which are traditionally always zero (mode_t being 16 bits), but I believe are present in most or all of the cpio formats for historical reasons. It might be accepted better by existing implementations to use one of these high bits combined with S_IFREG, I dont know.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2019-05-17 21:44:56

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/17/19 1:18 PM, [email protected] wrote:
>
> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
>
> A side benefit is that the format can be simpler as there is no need to encode the filename.
>
> A technically cleaner solution still, but which would need archiver modifications, would be to encode the xattrs as an optionally nameless file (just an empty string) with a new file mode value, immediately following the original file. The advantage there is that the archiver itself could support xattrs and other extended metadata (which has been requested elsewhere); the disadvantage obviously is that that it requires new support in the archiver. However, at least it ought to be simpler since it is still a higher protocol level than the cpio archive itself.
>
> There's already one special case in cpio, which is the "!!!TRAILER!!!" filename; although I don't think it is part of the formal spec, to the extent there is one, I would expect that in practice it is always encoded with a mode of 0, which incidentally could be used to unbreak the case where such a filename actually exists. So one way to support such extended metadata would be to set mode to 0 and use the filename to encode the type of metadata. I wonder how existing GNU or BSD cpio (the BSD one is better maintained these days) would deal with reading such a file; it would at least not be a regression if it just read it still, possibly with warnings. It could also be possible to use bits 17:16 in the mode, which are traditionally always zero (mode_t being 16 bits), but I believe are present in most or all of the cpio formats for historical reasons. It might be accepted better by existing implementations to use one of these high bits combined with S_IFREG, I dont know.
>

Correction: it's just !!!TRAILER!!!.

I tested with GNU cpio, BSD cpio, scpio and pax.

With a mode of 0:

- GNU cpio errors, but extracts all the other files.
- BSD cpio extracts them as regular files.
- scpio and pax abort.

With a mode of 0x18000 (bit 16 + S_IFREG), all of them happily extracted
the data as regular files.

-hpa

2019-05-17 22:40:42

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
>
> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
This version of the patch was changed from the previous one exactly to deal with this case --
it allows for the bootloader to load multiple initramfs archives, each
with its own .xattr-list file, and to have that work properly.
Could you elaborate on the issue that you see?

2019-05-17 22:55:26

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On Fri, May 17, 2019 at 05:02:20PM -0400, Arvind Sankar wrote:
> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
> >
> > Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
> This version of the patch was changed from the previous one exactly to deal with this case --
> it allows for the bootloader to load multiple initramfs archives, each
> with its own .xattr-list file, and to have that work properly.
> Could you elaborate on the issue that you see?
Roberto, are you missing a changelog entry for v2->v3 change?

2019-05-17 22:55:33

by Rob Landley

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/17/19 3:18 PM, [email protected] wrote:
> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
>
> A side benefit is that the format can be simpler as there is no need to encode the filename.
>
> A technically cleaner solution still, but which would need archiver modifications, would be to encode the xattrs as an optionally nameless file (just an empty string) with a new file mode value, immediately following the original file. The advantage there is that the archiver itself could support xattrs and other extended metadata (which has been requested elsewhere); the disadvantage obviously is that that it requires new support in the archiver. However, at least it ought to be simpler since it is still a higher protocol level than the cpio archive itself.
>
> There's already one special case in cpio, which is the "!!!TRAILER!!!" filename; although I don't think it is part of the formal spec, to the extent there is one, I would expect that in practice it is always encoded with a mode of 0, which incidentally could be used to unbreak the case where such a filename actually exists. So one way to support such extended metadata would be to set mode to 0 and use the filename to encode the type of metadata. I wonder how existing GNU or BSD cpio (the BSD one is better maintained these days) would deal with reading such a file; it would at least not be a regression if it just read it still, possibly with warnings. It could also be possible to use bits 17:16 in the mode, which are traditionally always zero (mode_t being 16 bits), but I believe are present in most or all of the cpio formats for historical reasons. It might be accepted better by existing implementations to use one of these high bits combined with S_IFREG, I dont know.
>

I'll happily modify toybox cpio to understand xattrs (compress and decompress),
the android guys do a lot with xattrs already. I tapped out of _this_ discussion
from disgust with the proposed encoding.

Rob

2019-05-17 22:57:38

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/17/19 2:02 PM, Arvind Sankar wrote:
> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
>>
>> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
> This version of the patch was changed from the previous one exactly to deal with this case --
> it allows for the bootloader to load multiple initramfs archives, each
> with its own .xattr-list file, and to have that work properly.
> Could you elaborate on the issue that you see?
>

Well, for one thing, how do you define "cpio archive", each with its own
.xattr-list file? Second, that would seem to depend on the ordering, no,
in which case you depend critically on .xattr-list file following the
files, which most archivers won't do.

Either way it seems cleaner to have this per file; especially if/as it
can be done without actually mucking up the format.

I need to run, but I'll post a more detailed explanation of what I did
in a little bit.

-hpa

2019-05-17 23:12:45

by Arvind Sankar

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On Fri, May 17, 2019 at 02:47:31PM -0700, H. Peter Anvin wrote:
> On 5/17/19 2:02 PM, Arvind Sankar wrote:
> > On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
> >>
> >> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
> > This version of the patch was changed from the previous one exactly to deal with this case --
> > it allows for the bootloader to load multiple initramfs archives, each
> > with its own .xattr-list file, and to have that work properly.
> > Could you elaborate on the issue that you see?
> >
>
> Well, for one thing, how do you define "cpio archive", each with its own
> .xattr-list file? Second, that would seem to depend on the ordering, no,
> in which case you depend critically on .xattr-list file following the
> files, which most archivers won't do.
>
> Either way it seems cleaner to have this per file; especially if/as it
> can be done without actually mucking up the format.
>
> I need to run, but I'll post a more detailed explanation of what I did
> in a little bit.
>
> -hpa
>
Not sure what you mean by how do I define it? Each cpio archive will
contain its own .xattr-list file with signatures for the files within
it, that was the idea.

You need to review the code more closely I think -- it does not depend
on the .xattr-list file following the files to which it applies.

The code first extracts .xattr-list as though it was a regular file. If
a later dupe shows up (presumably from a second archive, although the
patch will actually allow a second one in the same archive), it will
then process the existing .xattr-list file and apply the attributes
listed within it. It then will proceed to read the second one and
overwrite the first one with it (this is the normal behaviour in the
kernel cpio parser). At the end once all the archives have been
extracted, if there is an .xattr-list file in the rootfs it will be
parsed (it would've been the last one encountered, which hasn't been
parsed yet, just extracted).

Regarding the idea to use the high 16 bits of the mode field in
the header that's another possibility. It would just require additional
support in the program that actually creates the archive though, which
the current patch doesn't.

2019-05-17 23:26:05

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

Hi Roberto,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.1 next-20190517]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Roberto-Sassu/initramfs-set-extended-attributes/20190518-055846
config: sparc64-allyesconfig (attached as .config)
compiler: sparc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=sparc64

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>

All error/warnings (new ones prefixed by >>):

init/initramfs.c: In function 'do_readxattrs':
>> init/initramfs.c:437:10: error: implicit declaration of function 'vmalloc'; did you mean 'kvmalloc'? [-Werror=implicit-function-declaration]
path = vmalloc(file_entry_size);
^~~~~~~
kvmalloc
>> init/initramfs.c:437:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
path = vmalloc(file_entry_size);
^
>> init/initramfs.c:461:3: error: implicit declaration of function 'vfree'; did you mean 'kvfree'? [-Werror=implicit-function-declaration]
vfree(path);
^~~~~
kvfree
cc1: some warnings being treated as errors

vim +437 init/initramfs.c

391
392 static int __init do_readxattrs(void)
393 {
394 struct path_hdr hdr;
395 char *path = NULL;
396 char str[sizeof(hdr.p_size) + 1];
397 unsigned long file_entry_size;
398 size_t size, path_size, total_size;
399 struct kstat st;
400 struct file *file;
401 loff_t pos;
402 int ret;
403
404 ret = vfs_lstat(XATTR_LIST_FILENAME, &st);
405 if (ret < 0)
406 return ret;
407
408 total_size = st.size;
409
410 file = filp_open(XATTR_LIST_FILENAME, O_RDONLY, 0);
411 if (IS_ERR(file))
412 return PTR_ERR(file);
413
414 pos = file->f_pos;
415
416 while (total_size) {
417 size = kernel_read(file, (char *)&hdr, sizeof(hdr), &pos);
418 if (size != sizeof(hdr)) {
419 ret = -EIO;
420 goto out;
421 }
422
423 total_size -= size;
424
425 str[sizeof(hdr.p_size)] = 0;
426 memcpy(str, hdr.p_size, sizeof(hdr.p_size));
427 ret = kstrtoul(str, 16, &file_entry_size);
428 if (ret < 0)
429 goto out;
430
431 file_entry_size -= sizeof(sizeof(hdr.p_size));
432 if (file_entry_size > total_size) {
433 ret = -EINVAL;
434 goto out;
435 }
436
> 437 path = vmalloc(file_entry_size);
438 if (!path) {
439 ret = -ENOMEM;
440 goto out;
441 }
442
443 size = kernel_read(file, path, file_entry_size, &pos);
444 if (size != file_entry_size) {
445 ret = -EIO;
446 goto out_free;
447 }
448
449 total_size -= size;
450
451 path_size = strnlen(path, file_entry_size);
452 if (path_size == file_entry_size) {
453 ret = -EINVAL;
454 goto out_free;
455 }
456
457 xattr_buf = path + path_size + 1;
458 xattr_len = file_entry_size - path_size - 1;
459
460 ret = do_setxattrs(path);
> 461 vfree(path);
462 path = NULL;
463
464 if (ret < 0)
465 break;
466 }
467 out_free:
468 vfree(path);
469 out:
470 fput(file);
471
472 if (ret < 0)
473 error("Unable to parse xattrs");
474
475 return ret;
476 }
477

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (3.92 kB)
.config.gz (56.05 kB)
Download all attachments

2019-05-18 02:32:33

by Rob Landley

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/17/19 4:41 PM, H. Peter Anvin wrote:
> On 5/17/19 1:18 PM, [email protected] wrote:
>>
>> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
>>
>> A side benefit is that the format can be simpler as there is no need to encode the filename.
>>
>> A technically cleaner solution still, but which would need archiver modifications, would be to encode the xattrs as an optionally nameless file (just an empty string) with a new file mode value, immediately following the original file. The advantage there is that the archiver itself could support xattrs and other extended metadata (which has been requested elsewhere); the disadvantage obviously is that that it requires new support in the archiver. However, at least it ought to be simpler since it is still a higher protocol level than the cpio archive itself.
>>
>> There's already one special case in cpio, which is the "!!!TRAILER!!!" filename; although I don't think it is part of the formal spec, to the extent there is one, I would expect that in practice it is always encoded with a mode of 0, which incidentally could be used to unbreak the case where such a filename actually exists. So one way to support such extended metadata would be to set mode to 0 and use the filename to encode the type of metadata. I wonder how existing GNU or BSD cpio (the BSD one is better maintained these days) would deal with reading such a file; it would at least not be a regression if it just read it still, possibly with warnings. It could also be possible to use bits 17:16 in the mode, which are traditionally always zero (mode_t being 16 bits), but I believe are present in most or all of the cpio formats for historical reasons. It might be accepted better by existing implementations to use one of these high bits combined with S_IFREG, I dont know.
>
>
> Correction: it's just !!!TRAILER!!!.

We documented it as "TRAILER!!!" without leading !!!, and that its purpose is to
flush hardlinks:

https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt

That's what toybox cpio has been producing. Kernel consumes it just fine. Just
checked busybox cpio and that's what they're producing as well...

Rob

2019-05-18 03:00:12

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

Hi Roberto,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.1 next-20190517]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Roberto-Sassu/initramfs-set-extended-attributes/20190518-055846
config: m68k-allyesconfig (attached as .config)
compiler: m68k-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=m68k

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>

All error/warnings (new ones prefixed by >>):

init/initramfs.c: In function 'do_readxattrs':
init/initramfs.c:437:10: error: implicit declaration of function 'vmalloc'; did you mean 'kvmalloc'? [-Werror=implicit-function-declaration]
path = vmalloc(file_entry_size);
^~~~~~~
kvmalloc
init/initramfs.c:437:8: warning: assignment makes pointer from integer without a cast [-Wint-conversion]
path = vmalloc(file_entry_size);
^
init/initramfs.c:461:3: error: implicit declaration of function 'vfree'; did you mean 'kvfree'? [-Werror=implicit-function-declaration]
vfree(path);
^~~~~
kvfree
In file included from include/asm-generic/io.h:891:0,
from arch/m68k/include/asm/io.h:11,
from include/linux/kexec.h:19,
from init/initramfs.c:694:
include/linux/vmalloc.h: At top level:
>> include/linux/vmalloc.h:77:14: error: conflicting types for 'vmalloc'
extern void *vmalloc(unsigned long size);
^~~~~~~
init/initramfs.c:437:10: note: previous implicit declaration of 'vmalloc' was here
path = vmalloc(file_entry_size);
^~~~~~~
In file included from include/asm-generic/io.h:891:0,
from arch/m68k/include/asm/io.h:11,
from include/linux/kexec.h:19,
from init/initramfs.c:694:
>> include/linux/vmalloc.h:102:13: warning: conflicting types for 'vfree'
extern void vfree(const void *addr);
^~~~~
init/initramfs.c:461:3: note: previous implicit declaration of 'vfree' was here
vfree(path);
^~~~~
cc1: some warnings being treated as errors

vim +/vmalloc +77 include/linux/vmalloc.h

db64fe02 Nick Piggin 2008-10-18 76
^1da177e Linus Torvalds 2005-04-16 @77 extern void *vmalloc(unsigned long size);
e1ca7788 Dave Young 2010-10-26 78 extern void *vzalloc(unsigned long size);
83342314 Nick Piggin 2006-06-23 79 extern void *vmalloc_user(unsigned long size);
930fc45a Christoph Lameter 2005-10-29 80 extern void *vmalloc_node(unsigned long size, int node);
e1ca7788 Dave Young 2010-10-26 81 extern void *vzalloc_node(unsigned long size, int node);
^1da177e Linus Torvalds 2005-04-16 82 extern void *vmalloc_exec(unsigned long size);
^1da177e Linus Torvalds 2005-04-16 83 extern void *vmalloc_32(unsigned long size);
83342314 Nick Piggin 2006-06-23 84 extern void *vmalloc_32_user(unsigned long size);
dd0fc66f Al Viro 2005-10-07 85 extern void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot);
d0a21265 David Rientjes 2011-01-13 86 extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
d0a21265 David Rientjes 2011-01-13 87 unsigned long start, unsigned long end, gfp_t gfp_mask,
cb9e3c29 Andrey Ryabinin 2015-02-13 88 pgprot_t prot, unsigned long vm_flags, int node,
cb9e3c29 Andrey Ryabinin 2015-02-13 89 const void *caller);
1f5307b1 Michal Hocko 2017-05-08 90 #ifndef CONFIG_MMU
a7c3e901 Michal Hocko 2017-05-08 91 extern void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags);
8594a21c Michal Hocko 2017-05-12 92 static inline void *__vmalloc_node_flags_caller(unsigned long size, int node,
8594a21c Michal Hocko 2017-05-12 93 gfp_t flags, void *caller)
1f5307b1 Michal Hocko 2017-05-08 94 {
8594a21c Michal Hocko 2017-05-12 95 return __vmalloc_node_flags(size, node, flags);
1f5307b1 Michal Hocko 2017-05-08 96 }
8594a21c Michal Hocko 2017-05-12 97 #else
8594a21c Michal Hocko 2017-05-12 98 extern void *__vmalloc_node_flags_caller(unsigned long size,
8594a21c Michal Hocko 2017-05-12 99 int node, gfp_t flags, void *caller);
1f5307b1 Michal Hocko 2017-05-08 100 #endif
cb9e3c29 Andrey Ryabinin 2015-02-13 101
b3bdda02 Christoph Lameter 2008-02-04 @102 extern void vfree(const void *addr);
bf22e37a Andrey Ryabinin 2016-12-12 103 extern void vfree_atomic(const void *addr);
^1da177e Linus Torvalds 2005-04-16 104

:::::: The code at line 77 was first introduced by commit
:::::: 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 Linux-2.6.12-rc2

:::::: TO: Linus Torvalds <[email protected]>
:::::: CC: Linus Torvalds <[email protected]>

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (5.35 kB)
.config.gz (48.69 kB)
Download all attachments

2019-05-18 06:35:50

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

Hi Roberto,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.1 next-20190517]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Roberto-Sassu/initramfs-set-extended-attributes/20190518-055846
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <[email protected]>


sparse warnings: (new ones prefixed by >>)

init/initramfs.c:24:45: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected char const [noderef] <asn:1> *buf @@ got f] <asn:1> *buf @@
init/initramfs.c:24:45: sparse: expected char const [noderef] <asn:1> *buf
init/initramfs.c:24:45: sparse: got char const *p
init/initramfs.c:115:36: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got n:1> *filename @@
init/initramfs.c:115:36: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:115:36: sparse: got char *filename
init/initramfs.c:303:24: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *name @@ got n:1> *name @@
init/initramfs.c:303:24: sparse: expected char const [noderef] <asn:1> *name
init/initramfs.c:303:24: sparse: got char *path
init/initramfs.c:305:36: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *pathname @@ got n:1> *pathname @@
init/initramfs.c:305:36: sparse: expected char const [noderef] <asn:1> *pathname
init/initramfs.c:305:36: sparse: got char *path
init/initramfs.c:307:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *pathname @@ got n:1> *pathname @@
init/initramfs.c:307:37: sparse: expected char const [noderef] <asn:1> *pathname
init/initramfs.c:307:37: sparse: got char *path
init/initramfs.c:317:43: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *oldname @@ got n:1> *oldname @@
init/initramfs.c:317:43: sparse: expected char const [noderef] <asn:1> *oldname
init/initramfs.c:317:43: sparse: got char *old
init/initramfs.c:317:48: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected char const [noderef] <asn:1> *newname @@ got char char const [noderef] <asn:1> *newname @@
init/initramfs.c:317:48: sparse: expected char const [noderef] <asn:1> *newname
init/initramfs.c:317:48: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:404:25: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *name @@ got n:1> *name @@
init/initramfs.c:404:25: sparse: expected char const [noderef] <asn:1> *name
init/initramfs.c:404:25: sparse: got char *
>> init/initramfs.c:490:32: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *name @@ got char char const [noderef] <asn:1> *name @@
init/initramfs.c:490:32: sparse: expected char const [noderef] <asn:1> *name
init/initramfs.c:490:32: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:500:41: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got char char const [noderef] <asn:1> *filename @@
init/initramfs.c:500:41: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:500:41: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:512:28: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *pathname @@ got char char const [noderef] <asn:1> *pathname @@
init/initramfs.c:512:28: sparse: expected char const [noderef] <asn:1> *pathname
init/initramfs.c:512:28: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:513:28: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got char char const [noderef] <asn:1> *filename @@
init/initramfs.c:513:28: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:513:28: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:514:28: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got char char const [noderef] <asn:1> *filename @@
init/initramfs.c:514:28: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:514:28: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:519:36: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got char char const [noderef] <asn:1> *filename @@
init/initramfs.c:519:36: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:519:36: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:520:36: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got char char const [noderef] <asn:1> *filename @@
init/initramfs.c:520:36: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:520:36: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:521:36: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got char char const [noderef] <asn:1> *filename @@
init/initramfs.c:521:36: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:521:36: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:552:32: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *oldname @@ got n:1> *oldname @@
init/initramfs.c:552:32: sparse: expected char const [noderef] <asn:1> *oldname
init/initramfs.c:552:32: sparse: got char *
init/initramfs.c:552:53: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected char const [noderef] <asn:1> *newname @@ got char char const [noderef] <asn:1> *newname @@
init/initramfs.c:552:53: sparse: expected char const [noderef] <asn:1> *newname
init/initramfs.c:552:53: sparse: got char *static [toplevel] [assigned] collected
init/initramfs.c:553:21: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected char const [noderef] <asn:1> *filename @@ got char char const [noderef] <asn:1> *filename @@
init/initramfs.c:553:21: sparse: expected char const [noderef] <asn:1> *filename
init/initramfs.c:553:21: sparse: got char *static [toplevel] [assigned] collected

vim +490 init/initramfs.c

310
311 static int __init maybe_link(void)
312 {
313 if (nlink >= 2) {
314 char *old = find_link(major, minor, ino, mode, collected);
315 if (old) {
316 clean_path(collected, 0);
> 317 return (ksys_link(old, collected) < 0) ? -1 : 1;
318 }
319 }
320 return 0;
321 }
322
323 struct xattr_hdr {
324 char c_size[8]; /* total size including c_size field */
325 char c_data[]; /* <name>\0<value> */
326 };
327
328 static int __init __maybe_unused do_setxattrs(char *pathname)
329 {
330 char *buf = xattr_buf;
331 char *bufend = buf + xattr_len;
332 struct xattr_hdr *hdr;
333 char str[sizeof(hdr->c_size) + 1];
334 struct path path;
335
336 if (!xattr_len)
337 return 0;
338
339 str[sizeof(hdr->c_size)] = 0;
340
341 while (buf < bufend) {
342 char *xattr_name, *xattr_value;
343 unsigned long xattr_entry_size;
344 unsigned long xattr_name_size, xattr_value_size;
345 int ret;
346
347 if (buf + sizeof(hdr->c_size) > bufend) {
348 error("malformed xattrs");
349 break;
350 }
351
352 hdr = (struct xattr_hdr *)buf;
353 memcpy(str, hdr->c_size, sizeof(hdr->c_size));
354 ret = kstrtoul(str, 16, &xattr_entry_size);
355 buf += xattr_entry_size;
356 if (ret || buf > bufend || !xattr_entry_size) {
357 error("malformed xattrs");
358 break;
359 }
360
361 xattr_name = hdr->c_data;
362 xattr_name_size = strnlen(xattr_name,
363 xattr_entry_size - sizeof(hdr->c_size));
364 if (xattr_name_size == xattr_entry_size - sizeof(hdr->c_size)) {
365 error("malformed xattrs");
366 break;
367 }
368
369 xattr_value = xattr_name + xattr_name_size + 1;
370 xattr_value_size = buf - xattr_value;
371
372 ret = kern_path(pathname, 0, &path);
373 if (!ret) {
374 ret = vfs_setxattr(path.dentry, xattr_name, xattr_value,
375 xattr_value_size, 0);
376
377 path_put(&path);
378 }
379
380 pr_debug("%s: %s size: %lu val: %s (ret: %d)\n", pathname,
381 xattr_name, xattr_value_size, xattr_value, ret);
382 }
383
384 return 0;
385 }
386
387 struct path_hdr {
388 char p_size[10]; /* total size including p_size field */
389 char p_data[]; /* <path>\0<xattrs> */
390 };
391
392 static int __init do_readxattrs(void)
393 {
394 struct path_hdr hdr;
395 char *path = NULL;
396 char str[sizeof(hdr.p_size) + 1];
397 unsigned long file_entry_size;
398 size_t size, path_size, total_size;
399 struct kstat st;
400 struct file *file;
401 loff_t pos;
402 int ret;
403
404 ret = vfs_lstat(XATTR_LIST_FILENAME, &st);
405 if (ret < 0)
406 return ret;
407
408 total_size = st.size;
409
410 file = filp_open(XATTR_LIST_FILENAME, O_RDONLY, 0);
411 if (IS_ERR(file))
412 return PTR_ERR(file);
413
414 pos = file->f_pos;
415
416 while (total_size) {
417 size = kernel_read(file, (char *)&hdr, sizeof(hdr), &pos);
418 if (size != sizeof(hdr)) {
419 ret = -EIO;
420 goto out;
421 }
422
423 total_size -= size;
424
425 str[sizeof(hdr.p_size)] = 0;
426 memcpy(str, hdr.p_size, sizeof(hdr.p_size));
427 ret = kstrtoul(str, 16, &file_entry_size);
428 if (ret < 0)
429 goto out;
430
431 file_entry_size -= sizeof(sizeof(hdr.p_size));
432 if (file_entry_size > total_size) {
433 ret = -EINVAL;
434 goto out;
435 }
436
437 path = vmalloc(file_entry_size);
438 if (!path) {
439 ret = -ENOMEM;
440 goto out;
441 }
442
443 size = kernel_read(file, path, file_entry_size, &pos);
444 if (size != file_entry_size) {
445 ret = -EIO;
446 goto out_free;
447 }
448
449 total_size -= size;
450
451 path_size = strnlen(path, file_entry_size);
452 if (path_size == file_entry_size) {
453 ret = -EINVAL;
454 goto out_free;
455 }
456
457 xattr_buf = path + path_size + 1;
458 xattr_len = file_entry_size - path_size - 1;
459
460 ret = do_setxattrs(path);
461 vfree(path);
462 path = NULL;
463
464 if (ret < 0)
465 break;
466 }
467 out_free:
468 vfree(path);
469 out:
470 fput(file);
471
472 if (ret < 0)
473 error("Unable to parse xattrs");
474
475 return ret;
476 }
477
478 static __initdata int wfd;
479
480 static int __init do_name(void)
481 {
482 state = SkipIt;
483 next_state = Reset;
484 if (strcmp(collected, "TRAILER!!!") == 0) {
485 free_hash();
486 return 0;
487 } else if (strcmp(collected, XATTR_LIST_FILENAME) == 0) {
488 struct kstat st;
489
> 490 if (!vfs_lstat(collected, &st))
491 do_readxattrs();
492 }
493 clean_path(collected, mode);
494 if (S_ISREG(mode)) {
495 int ml = maybe_link();
496 if (ml >= 0) {
497 int openflags = O_WRONLY|O_CREAT;
498 if (ml != 1)
499 openflags |= O_TRUNC;
500 wfd = ksys_open(collected, openflags, mode);
501
502 if (wfd >= 0) {
503 ksys_fchown(wfd, uid, gid);
504 ksys_fchmod(wfd, mode);
505 if (body_len)
506 ksys_ftruncate(wfd, body_len);
507 vcollected = kstrdup(collected, GFP_KERNEL);
508 state = CopyFile;
509 }
510 }
511 } else if (S_ISDIR(mode)) {
512 ksys_mkdir(collected, mode);
513 ksys_chown(collected, uid, gid);
514 ksys_chmod(collected, mode);
515 dir_add(collected, mtime);
516 } else if (S_ISBLK(mode) || S_ISCHR(mode) ||
517 S_ISFIFO(mode) || S_ISSOCK(mode)) {
518 if (maybe_link() == 0) {
519 ksys_mknod(collected, mode, rdev);
520 ksys_chown(collected, uid, gid);
521 ksys_chmod(collected, mode);
522 do_utime(collected, mtime);
523 }
524 }
525 return 0;
526 }
527

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation

2019-05-20 09:10:42

by Roberto Sassu

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/17/2019 11:10 PM, Arvind Sankar wrote:
> On Fri, May 17, 2019 at 05:02:20PM -0400, Arvind Sankar wrote:
>> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
>>>
>>> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
>> This version of the patch was changed from the previous one exactly to deal with this case --
>> it allows for the bootloader to load multiple initramfs archives, each
>> with its own .xattr-list file, and to have that work properly.
>> Could you elaborate on the issue that you see?
> Roberto, are you missing a changelog entry for v2->v3 change?

The changelog for v1->v2 is missing.

Thanks

Roberto

--
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

2019-05-20 10:13:59

by Roberto Sassu

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/17/2019 10:18 PM, [email protected] wrote:
> On May 17, 2019 9:55:19 AM PDT, Roberto Sassu <[email protected]> wrote:
>> This patch adds support for an alternative method to add xattrs to
>> files in
>> the rootfs filesystem. Instead of extracting them directly from the ram
>> disk image, they are extracted from a regular file called .xattr-list,
>> that
>> can be added by any ram disk generator available today. The file format
>> is:
>>
>> <file #N data len (ASCII, 10 chars)><file #N path>\0
>> <xattr #N data len (ASCII, 8 chars)><xattr #N name>\0<xattr #N value>
>>
>> .xattr-list can be generated by executing:
>>
>> $ getfattr --absolute-names -d -h -R -e hex -m - \
>> <file list> | xattr.awk -b > ${initdir}/.xattr-list
>>
>> where the content of the xattr.awk script is:
>>
>> #! /usr/bin/awk -f
>> {
>> if (!length($0)) {
>> printf("%.10x%s\0", len, file);
>> for (x in xattr) {
>> printf("%.8x%s\0", xattr_len[x], x);
>> for (i = 0; i < length(xattr[x]) / 2; i++) {
>> printf("%c", strtonum("0x"substr(xattr[x], i * 2 + 1, 2)));
>> }
>> }
>> i = 0;
>> delete xattr;
>> delete xattr_len;
>> next;
>> };
>> if (i == 0) {
>> file=$3;
>> len=length(file) + 8 + 1;
>> }
>> if (i > 0) {
>> split($0, a, "=");
>> xattr[a[1]]=substr(a[2], 3);
>> xattr_len[a[1]]=length(a[1]) + 1 + 8 + length(xattr[a[1]]) / 2;
>> len+=xattr_len[a[1]];
>> };
>> i++;
>> }
>>
>> Signed-off-by: Roberto Sassu <[email protected]>
>> ---
>> init/initramfs.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 99 insertions(+)
>>
>> diff --git a/init/initramfs.c b/init/initramfs.c
>> index 0c6dd1d5d3f6..6ec018c6279a 100644
>> --- a/init/initramfs.c
>> +++ b/init/initramfs.c
>> @@ -13,6 +13,8 @@
>> #include <linux/namei.h>
>> #include <linux/xattr.h>
>>
>> +#define XATTR_LIST_FILENAME ".xattr-list"
>> +
>> static ssize_t __init xwrite(int fd, const char *p, size_t count)
>> {
>> ssize_t out = 0;
>> @@ -382,6 +384,97 @@ static int __init __maybe_unused do_setxattrs(char
>> *pathname)
>> return 0;
>> }
>>
>> +struct path_hdr {
>> + char p_size[10]; /* total size including p_size field */
>> + char p_data[]; /* <path>\0<xattrs> */
>> +};
>> +
>> +static int __init do_readxattrs(void)
>> +{
>> + struct path_hdr hdr;
>> + char *path = NULL;
>> + char str[sizeof(hdr.p_size) + 1];
>> + unsigned long file_entry_size;
>> + size_t size, path_size, total_size;
>> + struct kstat st;
>> + struct file *file;
>> + loff_t pos;
>> + int ret;
>> +
>> + ret = vfs_lstat(XATTR_LIST_FILENAME, &st);
>> + if (ret < 0)
>> + return ret;
>> +
>> + total_size = st.size;
>> +
>> + file = filp_open(XATTR_LIST_FILENAME, O_RDONLY, 0);
>> + if (IS_ERR(file))
>> + return PTR_ERR(file);
>> +
>> + pos = file->f_pos;
>> +
>> + while (total_size) {
>> + size = kernel_read(file, (char *)&hdr, sizeof(hdr), &pos);
>> + if (size != sizeof(hdr)) {
>> + ret = -EIO;
>> + goto out;
>> + }
>> +
>> + total_size -= size;
>> +
>> + str[sizeof(hdr.p_size)] = 0;
>> + memcpy(str, hdr.p_size, sizeof(hdr.p_size));
>> + ret = kstrtoul(str, 16, &file_entry_size);
>> + if (ret < 0)
>> + goto out;
>> +
>> + file_entry_size -= sizeof(sizeof(hdr.p_size));
>> + if (file_entry_size > total_size) {
>> + ret = -EINVAL;
>> + goto out;
>> + }
>> +
>> + path = vmalloc(file_entry_size);
>> + if (!path) {
>> + ret = -ENOMEM;
>> + goto out;
>> + }
>> +
>> + size = kernel_read(file, path, file_entry_size, &pos);
>> + if (size != file_entry_size) {
>> + ret = -EIO;
>> + goto out_free;
>> + }
>> +
>> + total_size -= size;
>> +
>> + path_size = strnlen(path, file_entry_size);
>> + if (path_size == file_entry_size) {
>> + ret = -EINVAL;
>> + goto out_free;
>> + }
>> +
>> + xattr_buf = path + path_size + 1;
>> + xattr_len = file_entry_size - path_size - 1;
>> +
>> + ret = do_setxattrs(path);
>> + vfree(path);
>> + path = NULL;
>> +
>> + if (ret < 0)
>> + break;
>> + }
>> +out_free:
>> + vfree(path);
>> +out:
>> + fput(file);
>> +
>> + if (ret < 0)
>> + error("Unable to parse xattrs");
>> +
>> + return ret;
>> +}
>> +
>> static __initdata int wfd;
>>
>> static int __init do_name(void)
>> @@ -391,6 +484,11 @@ static int __init do_name(void)
>> if (strcmp(collected, "TRAILER!!!") == 0) {
>> free_hash();
>> return 0;
>> + } else if (strcmp(collected, XATTR_LIST_FILENAME) == 0) {
>> + struct kstat st;
>> +
>> + if (!vfs_lstat(collected, &st))
>> + do_readxattrs();
>> }
>> clean_path(collected, mode);
>> if (S_ISREG(mode)) {
>> @@ -562,6 +660,7 @@ static char * __init unpack_to_rootfs(char *buf,
>> unsigned long len)
>> buf += my_inptr;
>> len -= my_inptr;
>> }
>> + do_readxattrs();
>> dir_utime();
>> kfree(name_buf);
>> kfree(symlink_buf);
>
> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.

Version 1 of the patch set worked exactly in this way. However, Rob
pointed out that this would be a problem if file names plus the suffix
exceed 255 characters.

Roberto

--
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

2019-05-20 11:35:10

by Roberto Sassu

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/18/2019 12:17 AM, Arvind Sankar wrote:
> On Fri, May 17, 2019 at 02:47:31PM -0700, H. Peter Anvin wrote:
>> On 5/17/19 2:02 PM, Arvind Sankar wrote:
>>> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
>>>>
>>>> Ok... I just realized this does not work for a modular initramfs, composed at load time from multiple files, which is a very real problem. Should be easy enough to deal with: instead of one large file, use one companion file per source file, perhaps something like filename..xattrs (suggesting double dots to make it less likely to conflict with a "real" file.) No leading dot, as it makes it more likely that archivers will sort them before the file proper.
>>> This version of the patch was changed from the previous one exactly to deal with this case --
>>> it allows for the bootloader to load multiple initramfs archives, each
>>> with its own .xattr-list file, and to have that work properly.
>>> Could you elaborate on the issue that you see?
>>>
>>
>> Well, for one thing, how do you define "cpio archive", each with its own
>> .xattr-list file? Second, that would seem to depend on the ordering, no,
>> in which case you depend critically on .xattr-list file following the
>> files, which most archivers won't do.
>>
>> Either way it seems cleaner to have this per file; especially if/as it
>> can be done without actually mucking up the format.
>>
>> I need to run, but I'll post a more detailed explanation of what I did
>> in a little bit.
>>
>> -hpa
>>
> Not sure what you mean by how do I define it? Each cpio archive will
> contain its own .xattr-list file with signatures for the files within
> it, that was the idea.
>
> You need to review the code more closely I think -- it does not depend
> on the .xattr-list file following the files to which it applies.
>
> The code first extracts .xattr-list as though it was a regular file. If
> a later dupe shows up (presumably from a second archive, although the
> patch will actually allow a second one in the same archive), it will
> then process the existing .xattr-list file and apply the attributes
> listed within it. It then will proceed to read the second one and
> overwrite the first one with it (this is the normal behaviour in the
> kernel cpio parser). At the end once all the archives have been
> extracted, if there is an .xattr-list file in the rootfs it will be
> parsed (it would've been the last one encountered, which hasn't been
> parsed yet, just extracted).
>
> Regarding the idea to use the high 16 bits of the mode field in
> the header that's another possibility. It would just require additional
> support in the program that actually creates the archive though, which
> the current patch doesn't.

Yes, for adding signatures for a subset of files, no changes to the ram
disk generator are necessary. Everything is done by a custom module. To
support a generic use case, it would be necessary to modify the
generator to execute getfattr and the awk script after files have been
placed in the temporary directory.

If I understood the new proposal correctly, it would be task for cpio to
read file metadata after the content and create a new record for each
file with mode 0x18000, type of metadata encoded in the file name and
metadata as file content. I don't know how easy it would be to modify
cpio. Probably the amount of changes would be reasonable.

The kernel will behave in a similar way. It will call do_readxattrs() in
do_copy() for each file. Since the only difference between the current
and the new proposal would be two additional calls to do_readxattrs() in
do_name() and unpack_to_rootfs(), maybe we could support both.

Roberto

--
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

2019-05-22 16:22:00

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On May 17, 2019 7:16:04 PM PDT, Rob Landley <[email protected]> wrote:
>On 5/17/19 4:41 PM, H. Peter Anvin wrote:
>> On 5/17/19 1:18 PM, [email protected] wrote:
>>>
>>> Ok... I just realized this does not work for a modular initramfs,
>composed at load time from multiple files, which is a very real
>problem. Should be easy enough to deal with: instead of one large file,
>use one companion file per source file, perhaps something like
>filename..xattrs (suggesting double dots to make it less likely to
>conflict with a "real" file.) No leading dot, as it makes it more
>likely that archivers will sort them before the file proper.
>>>
>>> A side benefit is that the format can be simpler as there is no need
>to encode the filename.
>>>
>>> A technically cleaner solution still, but which would need archiver
>modifications, would be to encode the xattrs as an optionally nameless
>file (just an empty string) with a new file mode value, immediately
>following the original file. The advantage there is that the archiver
>itself could support xattrs and other extended metadata (which has been
>requested elsewhere); the disadvantage obviously is that that it
>requires new support in the archiver. However, at least it ought to be
>simpler since it is still a higher protocol level than the cpio archive
>itself.
>>>
>>> There's already one special case in cpio, which is the
>"!!!TRAILER!!!" filename; although I don't think it is part of the
>formal spec, to the extent there is one, I would expect that in
>practice it is always encoded with a mode of 0, which incidentally
>could be used to unbreak the case where such a filename actually
>exists. So one way to support such extended metadata would be to set
>mode to 0 and use the filename to encode the type of metadata. I wonder
>how existing GNU or BSD cpio (the BSD one is better maintained these
>days) would deal with reading such a file; it would at least not be a
>regression if it just read it still, possibly with warnings. It could
>also be possible to use bits 17:16 in the mode, which are traditionally
>always zero (mode_t being 16 bits), but I believe are present in most
>or all of the cpio formats for historical reasons. It might be accepted
>better by existing implementations to use one of these high bits
>combined with S_IFREG, I dont know.
>>
>>
>> Correction: it's just !!!TRAILER!!!.
>
>We documented it as "TRAILER!!!" without leading !!!, and that its
>purpose is to
>flush hardlinks:
>
>https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt
>
>That's what toybox cpio has been producing. Kernel consumes it just
>fine. Just
>checked busybox cpio and that's what they're producing as well...
>
>Rob

Yes, TRAILER!!! is correct. Somehow I managed to get it wrong twice.
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2019-05-22 17:28:32

by Roberto Sassu

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On 5/22/2019 6:17 PM, [email protected] wrote:
> On May 20, 2019 2:39:46 AM PDT, Roberto Sassu <[email protected]> wrote:
>> On 5/18/2019 12:17 AM, Arvind Sankar wrote:
>>> On Fri, May 17, 2019 at 02:47:31PM -0700, H. Peter Anvin wrote:
>>>> On 5/17/19 2:02 PM, Arvind Sankar wrote:
>>>>> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
>>>>>>
>>>>>> Ok... I just realized this does not work for a modular initramfs,
>> composed at load time from multiple files, which is a very real
>> problem. Should be easy enough to deal with: instead of one large file,
>> use one companion file per source file, perhaps something like
>> filename..xattrs (suggesting double dots to make it less likely to
>> conflict with a "real" file.) No leading dot, as it makes it more
>> likely that archivers will sort them before the file proper.
>>>>> This version of the patch was changed from the previous one exactly
>> to deal with this case --
>>>>> it allows for the bootloader to load multiple initramfs archives,
>> each
>>>>> with its own .xattr-list file, and to have that work properly.
>>>>> Could you elaborate on the issue that you see?
>>>>>
>>>>
>>>> Well, for one thing, how do you define "cpio archive", each with its
>> own
>>>> .xattr-list file? Second, that would seem to depend on the ordering,
>> no,
>>>> in which case you depend critically on .xattr-list file following
>> the
>>>> files, which most archivers won't do.
>>>>
>>>> Either way it seems cleaner to have this per file; especially if/as
>> it
>>>> can be done without actually mucking up the format.
>>>>
>>>> I need to run, but I'll post a more detailed explanation of what I
>> did
>>>> in a little bit.
>>>>
>>>> -hpa
>>>>
>>> Not sure what you mean by how do I define it? Each cpio archive will
>>> contain its own .xattr-list file with signatures for the files within
>>> it, that was the idea.
>>>
>>> You need to review the code more closely I think -- it does not
>> depend
>>> on the .xattr-list file following the files to which it applies.
>>>
>>> The code first extracts .xattr-list as though it was a regular file.
>> If
>>> a later dupe shows up (presumably from a second archive, although the
>>> patch will actually allow a second one in the same archive), it will
>>> then process the existing .xattr-list file and apply the attributes
>>> listed within it. It then will proceed to read the second one and
>>> overwrite the first one with it (this is the normal behaviour in the
>>> kernel cpio parser). At the end once all the archives have been
>>> extracted, if there is an .xattr-list file in the rootfs it will be
>>> parsed (it would've been the last one encountered, which hasn't been
>>> parsed yet, just extracted).
>>>
>>> Regarding the idea to use the high 16 bits of the mode field in
>>> the header that's another possibility. It would just require
>> additional
>>> support in the program that actually creates the archive though,
>> which
>>> the current patch doesn't.
>>
>> Yes, for adding signatures for a subset of files, no changes to the ram
>> disk generator are necessary. Everything is done by a custom module. To
>> support a generic use case, it would be necessary to modify the
>> generator to execute getfattr and the awk script after files have been
>> placed in the temporary directory.
>>
>> If I understood the new proposal correctly, it would be task for cpio
>> to
>> read file metadata after the content and create a new record for each
>> file with mode 0x18000, type of metadata encoded in the file name and
>> metadata as file content. I don't know how easy it would be to modify
>> cpio. Probably the amount of changes would be reasonable.
>>
>> The kernel will behave in a similar way. It will call do_readxattrs()
>> in
>> do_copy() for each file. Since the only difference between the current
>> and the new proposal would be two additional calls to do_readxattrs()
>> in
>> do_name() and unpack_to_rootfs(), maybe we could support both.
>>
>> Roberto
>
> The nice thing with explicit metadata is that it doesn't have to contain the filename per se, and each file is self-contained. There is a reason why each cpio header starts with the magic number: each cpio record is formally independent and can be processed in isolation. The TRAILER!!! thing is a huge wart in the format, although in practice TRAILER!!! always has a mode of 0 and so can be distinguished from an actual file.
>
> The use of mode 0x18000 for metadata allows for optional backwards compatibility for extraction; for encoding this can be handled with very simple postprocessing.
>
> So my suggestion would be to have mode 0x18000 indicate extended file metadata, with the filename of the form:
>
> optional_filename!XXXXX!
>
> ... where XXXXX indicates the type of metadata (e.g. !XATTR!). The optional_filename prefix allows an unaware decoder to extract to a well-defined name; simple postprocessing would be able to either remove (for size) or add (for compatibility) this prefix. It would be an error for this prefix, if present, to not match the name of the previous file.

Actually, I defined '..metadata..' as special name to indicate that the
file contains metadata. Then, the content of the file is a set of:

struct metadata_hdr {
char c_size[8]; /* total size including c_size field */
char c_version; /* header version */
char c_type; /* metadata type */
char c_metadata[]; /* metadata */
} __packed;

init/initramfs.c now has a specific parser for c_type. Currently, I
implemented a parser for xattrs, which expects data in the format:

<xattr #N name>\0<xattr #N value>

I checked if it is possible to use bit 17:16 to identify files with
metadata, but both the cpio and the kernel use unsigned short.

I already modified gen_init_cpio and cpio. I modify at run-time the list
of files to be included in the image by adding a temporary file, that
each time is set with the xattrs of the previously processed file.

The output of cpio -t looks like:

--
.
..metadata..
bin
..metadata..
dev
..metadata..
dev/console
..metadata..
--

Would it be ok? If you prefer that I add the format to the file name or
you/anyone has a comment about this proposal, please let me know so that
I make the changes before sending a new version of the patch set.

Thanks

Roberto

--
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

2019-05-22 17:38:55

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

On May 20, 2019 2:39:46 AM PDT, Roberto Sassu <[email protected]> wrote:
>On 5/18/2019 12:17 AM, Arvind Sankar wrote:
>> On Fri, May 17, 2019 at 02:47:31PM -0700, H. Peter Anvin wrote:
>>> On 5/17/19 2:02 PM, Arvind Sankar wrote:
>>>> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
>>>>>
>>>>> Ok... I just realized this does not work for a modular initramfs,
>composed at load time from multiple files, which is a very real
>problem. Should be easy enough to deal with: instead of one large file,
>use one companion file per source file, perhaps something like
>filename..xattrs (suggesting double dots to make it less likely to
>conflict with a "real" file.) No leading dot, as it makes it more
>likely that archivers will sort them before the file proper.
>>>> This version of the patch was changed from the previous one exactly
>to deal with this case --
>>>> it allows for the bootloader to load multiple initramfs archives,
>each
>>>> with its own .xattr-list file, and to have that work properly.
>>>> Could you elaborate on the issue that you see?
>>>>
>>>
>>> Well, for one thing, how do you define "cpio archive", each with its
>own
>>> .xattr-list file? Second, that would seem to depend on the ordering,
>no,
>>> in which case you depend critically on .xattr-list file following
>the
>>> files, which most archivers won't do.
>>>
>>> Either way it seems cleaner to have this per file; especially if/as
>it
>>> can be done without actually mucking up the format.
>>>
>>> I need to run, but I'll post a more detailed explanation of what I
>did
>>> in a little bit.
>>>
>>> -hpa
>>>
>> Not sure what you mean by how do I define it? Each cpio archive will
>> contain its own .xattr-list file with signatures for the files within
>> it, that was the idea.
>>
>> You need to review the code more closely I think -- it does not
>depend
>> on the .xattr-list file following the files to which it applies.
>>
>> The code first extracts .xattr-list as though it was a regular file.
>If
>> a later dupe shows up (presumably from a second archive, although the
>> patch will actually allow a second one in the same archive), it will
>> then process the existing .xattr-list file and apply the attributes
>> listed within it. It then will proceed to read the second one and
>> overwrite the first one with it (this is the normal behaviour in the
>> kernel cpio parser). At the end once all the archives have been
>> extracted, if there is an .xattr-list file in the rootfs it will be
>> parsed (it would've been the last one encountered, which hasn't been
>> parsed yet, just extracted).
>>
>> Regarding the idea to use the high 16 bits of the mode field in
>> the header that's another possibility. It would just require
>additional
>> support in the program that actually creates the archive though,
>which
>> the current patch doesn't.
>
>Yes, for adding signatures for a subset of files, no changes to the ram
>disk generator are necessary. Everything is done by a custom module. To
>support a generic use case, it would be necessary to modify the
>generator to execute getfattr and the awk script after files have been
>placed in the temporary directory.
>
>If I understood the new proposal correctly, it would be task for cpio
>to
>read file metadata after the content and create a new record for each
>file with mode 0x18000, type of metadata encoded in the file name and
>metadata as file content. I don't know how easy it would be to modify
>cpio. Probably the amount of changes would be reasonable.
>
>The kernel will behave in a similar way. It will call do_readxattrs()
>in
>do_copy() for each file. Since the only difference between the current
>and the new proposal would be two additional calls to do_readxattrs()
>in
>do_name() and unpack_to_rootfs(), maybe we could support both.
>
>Roberto

The nice thing with explicit metadata is that it doesn't have to contain the filename per se, and each file is self-contained. There is a reason why each cpio header starts with the magic number: each cpio record is formally independent and can be processed in isolation. The TRAILER!!! thing is a huge wart in the format, although in practice TRAILER!!! always has a mode of 0 and so can be distinguished from an actual file.

The use of mode 0x18000 for metadata allows for optional backwards compatibility for extraction; for encoding this can be handled with very simple postprocessing.

So my suggestion would be to have mode 0x18000 indicate extended file metadata, with the filename of the form:

optional_filename!XXXXX!

... where XXXXX indicates the type of metadata (e.g. !XATTR!). The optional_filename prefix allows an unaware decoder to extract to a well-defined name; simple postprocessing would be able to either remove (for size) or add (for compatibility) this prefix. It would be an error for this prefix, if present, to not match the name of the previous file.

I do agree that the delayed processing of an .xattr-list as you describe ought to work even with a modular initramfs.


--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

2019-05-22 19:53:51

by Rob Landley

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()



On 5/22/19 11:17 AM, [email protected] wrote:
> On May 20, 2019 2:39:46 AM PDT, Roberto Sassu <[email protected]> wrote:
>> On 5/18/2019 12:17 AM, Arvind Sankar wrote:
>>> On Fri, May 17, 2019 at 02:47:31PM -0700, H. Peter Anvin wrote:
>>>> On 5/17/19 2:02 PM, Arvind Sankar wrote:
>>>>> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
>>>>>>
>>>>>> Ok... I just realized this does not work for a modular initramfs,
>> composed at load time from multiple files, which is a very real
>> problem. Should be easy enough to deal with: instead of one large file,
>> use one companion file per source file, perhaps something like
>> filename..xattrs (suggesting double dots to make it less likely to
>> conflict with a "real" file.) No leading dot, as it makes it more
>> likely that archivers will sort them before the file proper.
>>>>> This version of the patch was changed from the previous one exactly
>> to deal with this case --
>>>>> it allows for the bootloader to load multiple initramfs archives,
>> each
>>>>> with its own .xattr-list file, and to have that work properly.
>>>>> Could you elaborate on the issue that you see?
>>>>>
>>>>
>>>> Well, for one thing, how do you define "cpio archive", each with its
>> own
>>>> .xattr-list file? Second, that would seem to depend on the ordering,
>> no,
>>>> in which case you depend critically on .xattr-list file following
>> the
>>>> files, which most archivers won't do.
>>>>
>>>> Either way it seems cleaner to have this per file; especially if/as
>> it
>>>> can be done without actually mucking up the format.
>>>>
>>>> I need to run, but I'll post a more detailed explanation of what I
>> did
>>>> in a little bit.
>>>>
>>>> -hpa
>>>>
>>> Not sure what you mean by how do I define it? Each cpio archive will
>>> contain its own .xattr-list file with signatures for the files within
>>> it, that was the idea.
>>>
>>> You need to review the code more closely I think -- it does not
>> depend
>>> on the .xattr-list file following the files to which it applies.
>>>
>>> The code first extracts .xattr-list as though it was a regular file.
>> If
>>> a later dupe shows up (presumably from a second archive, although the
>>> patch will actually allow a second one in the same archive), it will
>>> then process the existing .xattr-list file and apply the attributes
>>> listed within it. It then will proceed to read the second one and
>>> overwrite the first one with it (this is the normal behaviour in the
>>> kernel cpio parser). At the end once all the archives have been
>>> extracted, if there is an .xattr-list file in the rootfs it will be
>>> parsed (it would've been the last one encountered, which hasn't been
>>> parsed yet, just extracted).
>>>
>>> Regarding the idea to use the high 16 bits of the mode field in
>>> the header that's another possibility. It would just require
>> additional
>>> support in the program that actually creates the archive though,
>> which
>>> the current patch doesn't.
>>
>> Yes, for adding signatures for a subset of files, no changes to the ram
>> disk generator are necessary. Everything is done by a custom module. To
>> support a generic use case, it would be necessary to modify the
>> generator to execute getfattr and the awk script after files have been
>> placed in the temporary directory.
>>
>> If I understood the new proposal correctly, it would be task for cpio
>> to
>> read file metadata after the content and create a new record for each
>> file with mode 0x18000, type of metadata encoded in the file name and
>> metadata as file content. I don't know how easy it would be to modify
>> cpio. Probably the amount of changes would be reasonable.

I could make toybox cpio do it in a weekend, and could probably throw a patch at
usr/gen_init_cpio.c while I'm at it. I prototyped something like that a couple
years ago, it's not hard.

The real question is scripts/gen_initramfs_list.sh and the text format it
produces. We can currently generate cpio files with different ownership and
permissions than the host system can represent (when not building as root, on a
filesystem that may not support xattrs or would get unhappy about conflicting
selinux annotations). We work around it by having the metadata represented
textually in the initramfs_list file gen_initramfs_list.sh produces and
gen_init_cpio.c consumes.

xattrs are a terrible idea the Macintosh invented so Finder could remember where
you moved a file's icon in its folder without having to modify the file, and
then things like OS/2 copied it and Windows picked it up from there and went "Of
course, this is a security mechanism!" and... sigh.

This is "data that is not data", it's metadata of unbounded size. It seems like
it should go in gen_initramfs_list.sh but as what, keyword=value pairs that
might have embedded newlines in them? A base64 encoding? Something else?

>> The kernel will behave in a similar way. It will call do_readxattrs()
>> in
>> do_copy() for each file. Since the only difference between the current
>> and the new proposal would be two additional calls to do_readxattrs()
>> in
>> do_name() and unpack_to_rootfs(), maybe we could support both.
>>
>> Roberto
>
> The nice thing with explicit metadata is that it doesn't have to contain the filename per se, and each file is self-contained. There is a reason why each cpio header starts with the magic number: each cpio record is formally independent and can be processed in isolation. The TRAILER!!! thing is a huge wart in the format, although in practice TRAILER!!! always has a mode of 0 and so can be distinguished from an actual file.

Not adding the requirement that the cpio.gz must be generated as root from a
filesystem with the same users and selinux rules as the target system would be nice.

> The use of mode 0x18000 for metadata allows for optional backwards compatibility for extraction; for encoding this can be handled with very simple postprocessing.

The representation within the cpio file was never a huge deal to me. 0x18000
sounds fine for that.

> So my suggestion would be to have mode 0x18000 indicate extended file metadata, with the filename of the form:
>
> optional_filename!XXXXX!
>
> ... where XXXXX indicates the type of metadata (e.g. !XATTR!). The optional_filename prefix allows an unaware decoder to extract to a well-defined name; simple postprocessing would be able to either remove (for size) or add (for compatibility) this prefix. It would be an error for this prefix, if present, to not match the name of the previous file.

I'd suggest METADATA!!! to look like TRAILER!!!. (METADATA!!!XXXXX! if you
really think a keyword=value pair store is _not_ universal and we're going to
invent entire new _categories_ of this side channel nonsense.)

And extracting conflicting filenames is presumably already covered, it either
replaces or the new one fails to create the file and the extractor moves on.
(You need a working error recovery path that skips the right amount of data so
you can handle the next file properly, but you should have that anyway.)

Rob

2019-05-22 20:29:47

by Taras Kondratiuk

[permalink] [raw]
Subject: Re: [PATCH v3 2/2] initramfs: introduce do_readxattrs()

Quoting Rob Landley (2019-05-22 12:26:43)
>
>
> On 5/22/19 11:17 AM, [email protected] wrote:
> > On May 20, 2019 2:39:46 AM PDT, Roberto Sassu <[email protected]> wrote:
> >> On 5/18/2019 12:17 AM, Arvind Sankar wrote:
> >>> On Fri, May 17, 2019 at 02:47:31PM -0700, H. Peter Anvin wrote:
> >>>> On 5/17/19 2:02 PM, Arvind Sankar wrote:
> >>>>> On Fri, May 17, 2019 at 01:18:11PM -0700, [email protected] wrote:
> >>>>>>
> >>>>>> Ok... I just realized this does not work for a modular initramfs,
> >> composed at load time from multiple files, which is a very real
> >> problem. Should be easy enough to deal with: instead of one large file,
> >> use one companion file per source file, perhaps something like
> >> filename..xattrs (suggesting double dots to make it less likely to
> >> conflict with a "real" file.) No leading dot, as it makes it more
> >> likely that archivers will sort them before the file proper.
> >>>>> This version of the patch was changed from the previous one exactly
> >> to deal with this case --
> >>>>> it allows for the bootloader to load multiple initramfs archives,
> >> each
> >>>>> with its own .xattr-list file, and to have that work properly.
> >>>>> Could you elaborate on the issue that you see?
> >>>>>
> >>>>
> >>>> Well, for one thing, how do you define "cpio archive", each with its
> >> own
> >>>> .xattr-list file? Second, that would seem to depend on the ordering,
> >> no,
> >>>> in which case you depend critically on .xattr-list file following
> >> the
> >>>> files, which most archivers won't do.
> >>>>
> >>>> Either way it seems cleaner to have this per file; especially if/as
> >> it
> >>>> can be done without actually mucking up the format.
> >>>>
> >>>> I need to run, but I'll post a more detailed explanation of what I
> >> did
> >>>> in a little bit.
> >>>>
> >>>> -hpa
> >>>>
> >>> Not sure what you mean by how do I define it? Each cpio archive will
> >>> contain its own .xattr-list file with signatures for the files within
> >>> it, that was the idea.
> >>>
> >>> You need to review the code more closely I think -- it does not
> >> depend
> >>> on the .xattr-list file following the files to which it applies.
> >>>
> >>> The code first extracts .xattr-list as though it was a regular file.
> >> If
> >>> a later dupe shows up (presumably from a second archive, although the
> >>> patch will actually allow a second one in the same archive), it will
> >>> then process the existing .xattr-list file and apply the attributes
> >>> listed within it. It then will proceed to read the second one and
> >>> overwrite the first one with it (this is the normal behaviour in the
> >>> kernel cpio parser). At the end once all the archives have been
> >>> extracted, if there is an .xattr-list file in the rootfs it will be
> >>> parsed (it would've been the last one encountered, which hasn't been
> >>> parsed yet, just extracted).
> >>>
> >>> Regarding the idea to use the high 16 bits of the mode field in
> >>> the header that's another possibility. It would just require
> >> additional
> >>> support in the program that actually creates the archive though,
> >> which
> >>> the current patch doesn't.
> >>
> >> Yes, for adding signatures for a subset of files, no changes to the ram
> >> disk generator are necessary. Everything is done by a custom module. To
> >> support a generic use case, it would be necessary to modify the
> >> generator to execute getfattr and the awk script after files have been
> >> placed in the temporary directory.
> >>
> >> If I understood the new proposal correctly, it would be task for cpio
> >> to
> >> read file metadata after the content and create a new record for each
> >> file with mode 0x18000, type of metadata encoded in the file name and
> >> metadata as file content. I don't know how easy it would be to modify
> >> cpio. Probably the amount of changes would be reasonable.
>
> I could make toybox cpio do it in a weekend, and could probably throw a patch at
> usr/gen_init_cpio.c while I'm at it. I prototyped something like that a couple
> years ago, it's not hard.
>
> The real question is scripts/gen_initramfs_list.sh and the text format it
> produces. We can currently generate cpio files with different ownership and
> permissions than the host system can represent (when not building as root, on a
> filesystem that may not support xattrs or would get unhappy about conflicting
> selinux annotations). We work around it by having the metadata represented
> textually in the initramfs_list file gen_initramfs_list.sh produces and
> gen_init_cpio.c consumes.
>
> xattrs are a terrible idea the Macintosh invented so Finder could remember where
> you moved a file's icon in its folder without having to modify the file, and
> then things like OS/2 copied it and Windows picked it up from there and went "Of
> course, this is a security mechanism!" and... sigh.
>
> This is "data that is not data", it's metadata of unbounded size. It seems like
> it should go in gen_initramfs_list.sh but as what, keyword=value pairs that
> might have embedded newlines in them? A base64 encoding? Something else?

I the previous try to add xattrs to cpio I've used hex encoding in
gen_initramfs_list.sh:
https://lkml.org/lkml/2018/1/24/851 - gen_init_cpio: set extended attributes for newcx format
https://lkml.org/lkml/2018/1/24/852 - gen_initramfs_list.sh: add -x option to enable newcx format