2012-10-30 21:10:33

by Arvid Brodin

[permalink] [raw]
Subject: fs/proc/base.c: text md5sums; tgid vs tid; and INF vs ONE?

Hi,

Below is a patch that adds a file /proc/PID/text_md5sum which when read returns the md5
checksum of a process' text segment. (This would be used e.g. to make sure a process'
code hasn't been tampered with.)

However, I have a few questions:

* What's the difference between the tgid_base_stuff and tid_base_stuff arrays? (One for
processes and one for the process' threads? I haven't been able to find any info about
this so I'm guessing.)

* When should I use the INF ("read") vs the ONE ("show") macro?

* Any other comments about the code?

Thanks!

---

diff --git a/fs/proc/Kconfig b/fs/proc/Kconfig
index 15af622..317ad92 100644
--- a/fs/proc/Kconfig
+++ b/fs/proc/Kconfig
@@ -67,3 +67,13 @@ config PROC_PAGE_MONITOR
/proc/pid/smaps, /proc/pid/clear_refs, /proc/pid/pagemap,
/proc/kpagecount, and /proc/kpageflags. Disabling these
interfaces will reduce the size of the kernel by approximately 4kb.
+
+config PROC_TEXT_MD5SUM
+ bool "/proc/<pid>/text_md5sum support"
+ depends on PROC_FS
+ select CRYPTO
+ select CRYPTO_MD5
+ help
+ Read /proc/<pid>/text_md5sum to get the kernel to perform an MD5
+ checksum over the process' text segment and print the result. Can be
+ used to make sure a process' code has not been tampered with.
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 91da78c..14a825f 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -85,6 +85,10 @@
#include <linux/fs_struct.h>
#include <linux/slab.h>
#include <linux/flex_array.h>
+#ifdef CONFIG_PROC_TEXT_MD5SUM
+#include <linux/scatterlist.h>
+#include <linux/crypto.h>
+#endif
#ifdef CONFIG_HARDWALL
#include <asm/hardwall.h>
#endif
@@ -2526,6 +2530,107 @@ static int proc_pid_personality(struct seq_file *m, struct
pid_namespace *ns,
return err;
}

+#ifdef CONFIG_PROC_TEXT_MD5SUM
+#define MD5_DIGEST_SIZE 16
+static int proc_get_text_md5sum(struct seq_file *m, struct pid_namespace *ns,
+ struct pid *pid, struct task_struct *task)
+{
+ int retval;
+ int text_size;
+ int nr_pages, nr_pages_mapped;
+ int i;
+ struct page **pages;
+ struct scatterlist *sgl, *sg;
+ u8 result[MD5_DIGEST_SIZE + 2];
+ struct crypto_hash *tfm;
+ struct hash_desc desc;
+
+ retval = 0;
+
+ if (!task->mm)
+ return -EINVAL;
+
+ text_size = task->mm->end_code - task->mm->start_code;
+ nr_pages = (text_size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+
+
+ /**** User page code ****/
+
+ pages = kmalloc(nr_pages * sizeof(*pages), GFP_KERNEL);
+ if (!pages) {
+ retval = -ENOMEM;
+ goto err_pages;
+ }
+
+ down_read(&task->mm->mmap_sem);
+ nr_pages_mapped = get_user_pages(current, task->mm,
+ task->mm->start_code, nr_pages, 0, 0, pages, NULL);
+ up_read(&task->mm->mmap_sem);
+ if (nr_pages_mapped < 0) {
+ retval = nr_pages_mapped;
+ goto err_mapped;
+ }
+ if (nr_pages_mapped < nr_pages) {
+ retval = -EBUSY; /* Weird error code for this,
+ but couldn't find any better */
+ goto err_not_all_pages;
+ }
+
+
+ /**** Scatterlist code ****/
+
+ sgl = kmalloc(nr_pages_mapped * sizeof(*sgl), GFP_KERNEL);
+ if (!sgl) {
+ retval = -ENOMEM;
+ goto err_sg;
+ }
+
+ sg_init_table(sgl, nr_pages_mapped);
+ for_each_sg(sgl, sg, nr_pages_mapped, i)
+ sg_set_page(sg, pages[i], (i < nr_pages_mapped) ? PAGE_SIZE :
+ text_size & ~PAGE_MASK, 0);
+
+
+ /**** Crypto code ****/
+
+ tfm = crypto_alloc_hash("md5", 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(tfm)) {
+ retval = -ENOMEM;
+ goto err_crypto;
+ }
+
+ desc.tfm = tfm;
+ desc.flags = CRYPTO_TFM_REQ_MAY_SLEEP;
+
+ memset(result, 0, MD5_DIGEST_SIZE + 2);
+ retval = crypto_hash_digest(&desc, sgl, text_size, result);
+ if (retval)
+ goto err_digest;
+
+ for (i = 0; i < MD5_DIGEST_SIZE; i++)
+ seq_printf(m, "%02x", result[i]);
+ seq_printf(m, "\n");
+
+
+err_digest:
+ crypto_free_hash(tfm);
+
+err_crypto:
+ kfree(sgl);
+
+err_sg:
+err_not_all_pages:
+ for (i = 0; i < nr_pages_mapped; i++)
+ put_page(pages[i]);
+
+err_mapped:
+ kfree(pages);
+
+err_pages:
+ return retval;
+}
+#endif /* CONFIG_PROC_TEXT_MD5SUM */
+
/*
* Thread groups
*/
@@ -2621,6 +2726,9 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("gid_map", S_IRUGO|S_IWUSR, proc_gid_map_operations),
REG("projid_map", S_IRUGO|S_IWUSR, proc_projid_map_operations),
#endif
+#ifdef CONFIG_PROC_TEXT_MD5SUM
+ ONE("text_md5sum", S_IRUGO, proc_get_text_md5sum),
+#endif
};

static int proc_tgid_base_readdir(struct file * filp,


--
Arvid Brodin | Consultant (Linux)
XDIN AB | Knarrarnäsgatan 7 | SE-164 40 Kista | Sweden | xdin.com????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?


2012-10-30 21:22:22

by Al Viro

[permalink] [raw]
Subject: Re: fs/proc/base.c: text md5sums; tgid vs tid; and INF vs ONE?

On Tue, Oct 30, 2012 at 09:02:33PM +0000, Arvid Brodin wrote:

> +config PROC_TEXT_MD5SUM
> + bool "/proc/<pid>/text_md5sum support"
> + depends on PROC_FS
> + select CRYPTO
> + select CRYPTO_MD5
> + help
> + Read /proc/<pid>/text_md5sum to get the kernel to perform an MD5
> + checksum over the process' text segment and print the result. Can be
> + used to make sure a process' code has not been tampered with.

Sorry, but this is pointless. Any attacker capable of modifying the code
will be just as capable of modifying pointers to functions in data segment.
IOW, you are not making sure of anything useful.

2012-10-30 21:24:14

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: fs/proc/base.c: text md5sums; tgid vs tid; and INF vs ONE?

On Tue, Oct 30, 2012 at 09:02:33PM +0000, Arvid Brodin wrote:
> Hi,
>
> Below is a patch that adds a file /proc/PID/text_md5sum which when read returns the md5
> checksum of a process' text segment. (This would be used e.g. to make sure a process'
> code hasn't been tampered with.)
>
> However, I have a few questions:
>
> * What's the difference between the tgid_base_stuff and tid_base_stuff arrays? (One for
> processes and one for the process' threads? I haven't been able to find any info about
> this so I'm guessing.)
>
> * When should I use the INF ("read") vs the ONE ("show") macro?
>
> * Any other comments about the code?
>
> Thanks!

I don't think this increments security by any means. start/end-code are rather
informative fields which are set when program being started, so one can ptrace
it, alloc new exec area, put evil code there, tuneup cs:ip and restore original
program contents, you won't even notice that.

2012-10-30 22:45:47

by Eric W. Biederman

[permalink] [raw]
Subject: Re: fs/proc/base.c: text md5sums; tgid vs tid; and INF vs ONE?

Arvid Brodin <[email protected]> writes:

> Hi,
>
> Below is a patch that adds a file /proc/PID/text_md5sum which when read returns the md5
> checksum of a process' text segment. (This would be used e.g. to make sure a process'
> code hasn't been tampered with.)
>
> However, I have a few questions:
>
> * What's the difference between the tgid_base_stuff and tid_base_stuff arrays? (One for
> processes and one for the process' threads? I haven't been able to find any info about
> this so I'm guessing.)

Yes. One for thread groups and one for threads.

> * When should I use the INF ("read") vs the ONE ("show") macro?

proc_read depends on the caller to allocate a 4k buffer, instead of
sizing the buffer based upon the size of the text being written. Which
makes proc_read an error prone and ultimately deprecated way of handling
things.

Using some variant on seq_file is preferred for new files.

> * Any other comments about the code?

There are known successful attacks against md5 so using md5 for
something new and security related is a bad idea.

Userspace can just as easily compute a security hash itself you don't
need kernel support.


I recommend you checkout the code in security/ima/ looks like it can
already do what you are trying to do.

Eric

2012-11-01 20:29:28

by Arvid Brodin

[permalink] [raw]
Subject: Re: fs/proc/base.c: text md5sums; tgid vs tid; and INF vs ONE?

On 2012-10-30 23:45, Eric W. Biederman wrote:
> Arvid Brodin <[email protected]> writes:
>> Hi,
>> []

Thanks for the info!

>> * Any other comments about the code?
>
> There are known successful attacks against md5 so using md5 for
> something new and security related is a bad idea.
>
> Userspace can just as easily compute a security hash itself you don't
> need kernel support.

How would I do this on running code? Does the kernel export the process memory somewhere
so that I can pipe it to md5sum?


> I recommend you checkout the code in security/ima/ looks like it can
> already do what you are trying to do.

Ah. I did actually check this out a year and a half ago or something like that. Seems like
it has gotten a bit more capable since then with the new patches (immutable executables
etc)! I'll take a look at it again to see if it might fit our needs. Thanks!

>
> Eric


--
Arvid Brodin | Consultant (Linux)
XDIN AB | Knarrarnäsgatan 7 | SE-164 40 Kista | Sweden | xdin.com????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2012-11-01 20:30:19

by Arvid Brodin

[permalink] [raw]
Subject: Re: fs/proc/base.c: text md5sums; tgid vs tid; and INF vs ONE?

On 2012-10-30 22:22, Al Viro wrote:
> On Tue, Oct 30, 2012 at 09:02:33PM +0000, Arvid Brodin wrote:
>
>> +config PROC_TEXT_MD5SUM
>> + bool "/proc/<pid>/text_md5sum support"
>> + depends on PROC_FS
>> + select CRYPTO
>> + select CRYPTO_MD5
>> + help
>> + Read /proc/<pid>/text_md5sum to get the kernel to perform an MD5
>> + checksum over the process' text segment and print the result. Can be
>> + used to make sure a process' code has not been tampered with.
>
> Sorry, but this is pointless. Any attacker capable of modifying the code
> will be just as capable of modifying pointers to functions in data segment.
> IOW, you are not making sure of anything useful.

On 2012-10-30 22:23, Cyrill Gorcunov wrote:
> I don't think this increments security by any means. start/end-code are rather
> informative fields which are set when program being started, so one can ptrace
> it, alloc new exec area, put evil code there, tuneup cs:ip and restore original
> program contents, you won't even notice that.

You are both correct of course. Actually, I was kind of sloppy when I wrote the
Kconfig help text. The following more accurately describes the intended use. Would
this make the patch more acceptable?

+config PROC_TEXT_MD5SUM
+ bool "/proc/<pid>/text_md5sum support"
+ depends on PROC_FS
+ select CRYPTO
+ select CRYPTO_MD5
+ help
+ Read /proc/<pid>/text_md5sum to get the kernel to perform an MD5
+ checksum over the process' text segment and print the result. This
+ can detect some cases where the system RAM has been disturbed by
+ e.g. EMC or cosmic radiation (on systems where ECC is not available).
+ It might also detect some accidental or malicious modifications of
+ executables, where the perpetrator has not bothered to cover up the
+ tracks.


--
Arvid Brodin | Consultant (Linux)
XDIN AB | Knarrarn?sgatan 7 | SE-164 40 Kista | Sweden | xdin.com-

2012-11-02 23:00:23

by Arvid Brodin

[permalink] [raw]
Subject: Re: fs/proc/base.c: text md5sums; tgid vs tid; and INF vs ONE?

On 2012-11-01 21:29, Arvid Brodin wrote:
> On 2012-10-30 23:45, Eric W. Biederman wrote:
>> I recommend you checkout the code in security/ima/ looks like it can
>> already do what you are trying to do.
>
> Ah. I did actually check this out a year and a half ago or something like that. Seems like
> it has gotten a bit more capable since then with the new patches (immutable executables
> etc)! I'll take a look at it again to see if it might fit our needs. Thanks!
>

It looks like IMA only checks files when they are accessed; I cannot find any mechanism
for checking code that is already executing. So it won't help us, unfortunately.

--
Arvid Brodin | Consultant (Linux)
XDIN AB | Knarrarnäsgatan 7 | SE-164 40 Kista | Sweden | xdin.com????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?