2008-11-16 14:48:10

by Bernhard Walle

[permalink] [raw]
Subject: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

This patch series turns CONFIG_STRICT_DEVMEM in a sysctl
dev.mem.restricted.

While the restricted /dev/mem is useful in most scenarios, it is not
when doing live debugging. The crash utility
(http://people.redhat.com/~anderson) needs access to /dev/mem.

As distributor (at least for "enterprise" distributions) you need both:
The protection in the general case and the ability to do live debugging.
The patch doesn't make the kernel more insecure: Without SELinux or
AppArmor, it has always been possible to circumvent that /dev/mem
restriction. With it, you can also prevent the (super) user from doing
"sysctl dev.mem.restricted=1".

This patch series differs in two ways from the original submission:

- The patch that removes CONFIG_STRICT_DEVMEM has been added.
- The binary sysctl is removed, now it's only a /proc/sys sysctl.

While the original submission of CONFIG_STRICT_DEVMEM mentions that the
option has been in RHEL and Fedora for 4 years without problems, that's
only a half of the story. The truth is that at least RHEL has /dev/crash
exactly to circumvent that /dev/mem restriction. Don't tell me that this
is better than having that sysctl entry. ;-)

The patch has been tested on i386. There should be no difference to
x86_64.


Signed-off-by: Bernhard Walle <[email protected]>


2008-11-16 14:48:27

by Bernhard Walle

[permalink] [raw]
Subject: [PATCH 1/3] Unify devmem_is_allowed across architectures

This patch unifies devmem_is_allowed() across different architectures.
It removes the devmem_is_allowed 1 defines from m32r and frv and
adds a generic fallback implementation to <asm-generic/page.h>.

Architecutes can define their version and define __HAVE_ARCH_RANGE_IS_ALLOWED.


Signed-off-by: Bernhard Walle <[email protected]>
---
arch/x86/include/asm/page.h | 1 +
include/asm-frv/page.h | 2 --
include/asm-generic/page.h | 4 ++++
include/asm-m32r/page.h | 2 --
4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index e9873a2..b768401 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -63,6 +63,7 @@ extern void map_devmem(unsigned long pfn, unsigned long size,
pgprot_t vma_prot);
extern void unmap_devmem(unsigned long pfn, unsigned long size,
pgprot_t vma_prot);
+#define __HAVE_ARCH_RANGE_IS_ALLOWED 1

extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;
diff --git a/include/asm-frv/page.h b/include/asm-frv/page.h
index bd9c220..bc2c835 100644
--- a/include/asm-frv/page.h
+++ b/include/asm-frv/page.h
@@ -40,8 +40,6 @@ typedef struct page *pgtable_t;
#define __pgprot(x) ((pgprot_t) { (x) } )
#define PTE_MASK PAGE_MASK

-#define devmem_is_allowed(pfn) 1
-
#define __pa(vaddr) virt_to_phys((void *) (unsigned long) (vaddr))
#define __va(paddr) phys_to_virt((unsigned long) (paddr))

diff --git a/include/asm-generic/page.h b/include/asm-generic/page.h
index 14db733..cde4ceb 100644
--- a/include/asm-generic/page.h
+++ b/include/asm-generic/page.h
@@ -19,6 +19,10 @@ static __inline__ __attribute_const__ int get_order(unsigned long size)
return order;
}

+#ifndef __HAVE_ARCH_RANGE_IS_ALLOWED
+#define devmem_is_allowed(pfn) 1
+#endif
+
#endif /* __ASSEMBLY__ */

#endif /* _ASM_GENERIC_PAGE_H */
diff --git a/include/asm-m32r/page.h b/include/asm-m32r/page.h
index c933308..89d6b10 100644
--- a/include/asm-m32r/page.h
+++ b/include/asm-m32r/page.h
@@ -79,8 +79,6 @@ typedef struct page *pgtable_t;
#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | VM_EXEC | \
VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC )

-#define devmem_is_allowed(x) 1
-
#include <asm-generic/memory_model.h>
#include <asm-generic/page.h>

--
1.6.0.4

2008-11-16 14:48:41

by Bernhard Walle

[permalink] [raw]
Subject: [PATCH 3/3] Remove CONFIG_STRICT_DEVMEM

Since the behaviour of /dev/mem can now be controlled via sysctl, we don't need
CONFIG_STRICT_DEVMEM any more. With SELinux or Apparmor, the sysctl can be
prohibited to be turned on. Without SELinux or Apparmor, you can circumvent
the restriction anyways by loading a kernel module that installs a kretprobe
that just ignores the check and always returns true.

The increase of code size is neglecatble and the code becomes more readable
with less CONFIG options and #ifdef's.


Signed-off-by: Bernhard Walle <[email protected]>
---
arch/x86/Kconfig.debug | 17 -----------------
arch/x86/configs/i386_defconfig | 1 -
arch/x86/configs/x86_64_defconfig | 1 -
arch/x86/include/asm/page.h | 4 ----
drivers/char/mem.c | 7 +------
5 files changed, 1 insertions(+), 29 deletions(-)

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 2a3dfbd..28b7c26 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -5,23 +5,6 @@ config TRACE_IRQFLAGS_SUPPORT

source "lib/Kconfig.debug"

-config STRICT_DEVMEM
- bool "Filter access to /dev/mem"
- help
- If this option is disabled, you allow userspace (root) access to all
- of memory, including kernel and userspace memory. Accidental
- access to this is obviously disastrous, but specific access can
- be used by people debugging the kernel. Note that with PAT support
- enabled, even in this case there are restrictions on /dev/mem
- use due to the cache aliasing requirements.
-
- If this option is switched on, the /dev/mem file only allows
- userspace access to PCI space and the BIOS code and data regions.
- This is sufficient for dosemu and X and all common users of
- /dev/mem.
-
- If in doubt, say Y.
-
config X86_VERBOSE_BOOTUP
bool "Enable verbose x86 bootup info messages"
default y
diff --git a/arch/x86/configs/i386_defconfig b/arch/x86/configs/i386_defconfig
index 13b8c86..93e8696 100644
--- a/arch/x86/configs/i386_defconfig
+++ b/arch/x86/configs/i386_defconfig
@@ -2090,7 +2090,6 @@ CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
-# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
diff --git a/arch/x86/configs/x86_64_defconfig b/arch/x86/configs/x86_64_defconfig
index f0a03d7..8b162ea 100644
--- a/arch/x86/configs/x86_64_defconfig
+++ b/arch/x86/configs/x86_64_defconfig
@@ -2059,7 +2059,6 @@ CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
# CONFIG_SAMPLES is not set
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
-# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index e5fe778..90dfcf2 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -66,11 +66,7 @@ extern void unmap_devmem(unsigned long pfn, unsigned long size,
#define __HAVE_ARCH_RANGE_IS_ALLOWED 1


-#ifdef CONFIG_STRICT_DEVMEM
extern int devmem_restricted;
-#else
-#define devmem_restricted 0
-#endif

extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 43b70b8..b4bbf80 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -37,8 +37,6 @@
#endif


-#ifdef CONFIG_STRICT_DEVMEM
-
int devmem_restricted = 1;

#ifdef CONFIG_SYSCTL
@@ -74,9 +72,6 @@ struct ctl_table dev_sysctl_table[] = {

#endif

-#endif /* CONFIG_STRICT_DEVMEM */
-
-
/*
* Architectures vary in how they handle caching for addresses
* outside of main memory.
@@ -1034,7 +1029,7 @@ static int __init chr_dev_init(void)
MKDEV(MEM_MAJOR, devlist[i].minor), NULL,
devlist[i].name);

-#if defined(CONFIG_SYSCTL) && defined(CONFIG_STRICT_DEVMEM)
+#if defined(CONFIG_SYSCTL)
/*
* since there is no unload function, we don't have to deregister that
* the whole lifetime of the kernel and can ignore the return value
--
1.6.0.4

2008-11-16 14:48:59

by Bernhard Walle

[permalink] [raw]
Subject: [PATCH 2/3] Add dev.mem.restricted sysctl

When CONFIG_STRICT_DEVMEM is set, live debugging is not possible with the
crash utility (see http://people.redhat.com/~anderson). For distributors
who ship a generic kernel it's difficult: Disabling CONFIG_STRICT_DEVMEM
is possible, but in general the protection provided by CONFIG_STRICT_DEVMEM
is useful. However, live debugging should be still neceessary.

This patch now adds a dev.mem.restricted sysctl that defaults to 0 (off).
When set to 1 (on), /dev/mem access is unrestricted and crash can be used.

>From a security point of view the sysctl should be no problem. It's already
possible to circumvent that restriction if you have root access by loading
a kernel module that installs a kretprobe that returns 1 for the check function.

I thought of a command line parameter first, but we already have lots of
command line parameters and rebooting the machine is more difficult than
just setting a sysctl to 1. It may be possible to implement setting that
variable in the tools automatically, but that's out of the scope of that
patch for the kernel and should also not be discussed on LKML.


Signed-off-by: Bernhard Walle <[email protected]>
---
arch/x86/include/asm/page.h | 7 +++++
arch/x86/mm/pat.c | 9 +-----
drivers/char/mem.c | 60 ++++++++++++++++++++++++++++++++++++++-----
3 files changed, 61 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index b768401..e5fe778 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -65,6 +65,13 @@ extern void unmap_devmem(unsigned long pfn, unsigned long size,
pgprot_t vma_prot);
#define __HAVE_ARCH_RANGE_IS_ALLOWED 1

+
+#ifdef CONFIG_STRICT_DEVMEM
+extern int devmem_restricted;
+#else
+#define devmem_restricted 0
+#endif
+
extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index eb1bf00..c80ced4 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -26,6 +26,7 @@
#include <asm/msr.h>
#include <asm/pat.h>
#include <asm/io.h>
+#include <asm/page.h>

#ifdef CONFIG_X86_PAT
int __read_mostly pat_enabled = 1;
@@ -474,13 +475,6 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
return vma_prot;
}

-#ifdef CONFIG_STRICT_DEVMEM
-/* This check is done in drivers/char/mem.c in case of STRICT_DEVMEM*/
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
-{
- return 1;
-}
-#else
/* This check is needed to avoid cache aliasing when PAT is enabled */
static inline int range_is_allowed(unsigned long pfn, unsigned long size)
{
@@ -503,7 +497,6 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
}
return 1;
}
-#endif /* CONFIG_STRICT_DEVMEM */

int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
unsigned long size, pgprot_t *vma_prot)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 6431f69..43b70b8 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -27,6 +27,7 @@
#include <linux/splice.h>
#include <linux/pfn.h>
#include <linux/smp_lock.h>
+#include <linux/sysctl.h>

#include <asm/uaccess.h>
#include <asm/io.h>
@@ -35,6 +36,47 @@
# include <linux/efi.h>
#endif

+
+#ifdef CONFIG_STRICT_DEVMEM
+
+int devmem_restricted = 1;
+
+#ifdef CONFIG_SYSCTL
+struct ctl_table dev_mem_restricted_sysctl_table[] = {
+ {
+ .procname = "restricted",
+ .data = &devmem_restricted,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec,
+ },
+ { .ctl_name = 0 }
+};
+
+struct ctl_table dev_mem_sysctl_table[] = {
+ {
+ .procname = "mem",
+ .mode = 0555,
+ .child = dev_mem_restricted_sysctl_table
+ },
+ { .ctl_name = 0 }
+};
+
+struct ctl_table dev_sysctl_table[] = {
+ {
+ .ctl_name = CTL_DEV,
+ .procname = "dev",
+ .mode = 0555,
+ .child = dev_mem_sysctl_table
+ },
+ { .ctl_name = 0 }
+};
+
+#endif
+
+#endif /* CONFIG_STRICT_DEVMEM */
+
+
/*
* Architectures vary in how they handle caching for addresses
* outside of main memory.
@@ -80,13 +122,15 @@ static inline int valid_mmap_phys_addr_range(unsigned long pfn, size_t size)
}
#endif

-#ifdef CONFIG_STRICT_DEVMEM
static inline int range_is_allowed(unsigned long pfn, unsigned long size)
{
u64 from = ((u64)pfn) << PAGE_SHIFT;
u64 to = from + size;
u64 cursor = from;

+ if (!devmem_restricted)
+ return 1;
+
while (cursor < to) {
if (!devmem_is_allowed(pfn)) {
printk(KERN_INFO
@@ -99,12 +143,6 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
}
return 1;
}
-#else
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
-{
- return 1;
-}
-#endif

void __attribute__((weak)) unxlate_dev_mem_ptr(unsigned long phys, void *addr)
{
@@ -996,6 +1034,14 @@ static int __init chr_dev_init(void)
MKDEV(MEM_MAJOR, devlist[i].minor), NULL,
devlist[i].name);

+#if defined(CONFIG_SYSCTL) && defined(CONFIG_STRICT_DEVMEM)
+ /*
+ * since there is no unload function, we don't have to deregister that
+ * the whole lifetime of the kernel and can ignore the return value
+ */
+ register_sysctl_table(dev_sysctl_table);
+#endif
+
return 0;
}

--
1.6.0.4

2008-11-16 15:08:31

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH 2/3] Add dev.mem.restricted sysctl

On Sun, Nov 16, 2008 at 03:47:47PM +0100, Bernhard Walle wrote:
> When CONFIG_STRICT_DEVMEM is set, live debugging is not possible with the
> crash utility (see http://people.redhat.com/~anderson). For distributors
> who ship a generic kernel it's difficult: Disabling CONFIG_STRICT_DEVMEM
> is possible, but in general the protection provided by CONFIG_STRICT_DEVMEM
> is useful. However, live debugging should be still neceessary.
>
> This patch now adds a dev.mem.restricted sysctl that defaults to 0 (off).
> When set to 1 (on), /dev/mem access is unrestricted and crash can be used.

I like your approach. I use /dev/mem a lot for debugging purposes, and
even sometimes to recover important data from memory after a crash has
occurred. Having the ability to inspect memory again by setting a sysctl
seems very appealing to me.

> >From a security point of view the sysctl should be no problem. It's already
> possible to circumvent that restriction if you have root access by loading
> a kernel module that installs a kretprobe that returns 1 for the check function.

I agree.

> I thought of a command line parameter first, but we already have lots of
> command line parameters and rebooting the machine is more difficult than
> just setting a sysctl to 1. It may be possible to implement setting that
> variable in the tools automatically, but that's out of the scope of that
> patch for the kernel and should also not be discussed on LKML.

Rebooting the machine voids any ability to debug or recover data, so the
sysctl is the way to go IMHO.

Regards,
Willy

2008-11-16 15:08:45

by Alan

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

> The protection in the general case and the ability to do live debugging.

What protection. You've completely failed to explain or provide a single
example of any protection provided by the STRICT_DEVMEM code.

> only a half of the story. The truth is that at least RHEL has /dev/crash
> exactly to circumvent that /dev/mem restriction. Don't tell me that this
> is better than having that sysctl entry. ;-)

Which nicely illustrates what a waste of time the whole thing is.

Please do the decent thing and just turn the crap concerned off. The fact
other vendors get it wrong doesn't mean you need to copy. Even better
submit a patch to remove this rubbish from the kernel completely.

Your patch is still adding bells and whistles to a useless turd. In fact
this patch is worse. Without this patch the turd can be disabled and left
out, with your patch everyone now has to compile in said turd pile.

NAK this changeset.

Alan

2008-11-16 15:09:23

by Alan

[permalink] [raw]
Subject: Re: [PATCH 3/3] Remove CONFIG_STRICT_DEVMEM

> -#ifdef CONFIG_STRICT_DEVMEM
> -
> int devmem_restricted = 1;
>
> -#if defined(CONFIG_SYSCTL) && defined(CONFIG_STRICT_DEVMEM)
> +#if defined(CONFIG_SYSCTL)
> /*
> * since there is no unload function, we don't have to deregister that
> * the whole lifetime of the kernel and can ignore the return value

NAK - this adds a pile of memory wasting complete crap to every kernel
including embedded systems. At least before you could turn it off now you
can't.

2008-11-16 15:20:23

by Bernhard Walle

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

* Alan Cox [2008-11-16 15:07]:
>
> > The protection in the general case and the ability to do live debugging.
>
> What protection. You've completely failed to explain or provide a single
> example of any protection provided by the STRICT_DEVMEM code.

I don't need to explain what protection STRICT_DEVMEM provides, just
because I didn't submit STRICT_DEVMEM. However:

Author: Arjan van de Ven <[email protected]>
Date: Thu Apr 24 23:40:47 2008 +0200

The X server needs access to /dev/mem for the PCI space, but it doesn't need
access to memory; both the file permissions and SELinux permissions of /dev/mem
just make X effectively super-super powerful. With the exception of the
BIOS area, there's just no valid app that uses /dev/mem on actual memory.
Other popular users of /dev/mem are rootkits and the like.
(note: mmap access of memory via /dev/mem was already not allowed since
a really long time)

So without that patch, a distributor needs to have two kernels: One
with SELinux and with /dev/mem protection and one without it. If I
remember correctly, you can turn off SELinux on the boot command line.
But I never used it. At least I don't see -selinux and -noselinux
kernels in Redhat.

However, with my patch you can make everything configurable. With
SELinux or Apparmor you can also protect the user from writing that
sysctl. Or from loading kernel modules that circumvent that protection.



Regards,
Bernhard

2008-11-16 15:22:00

by Bernhard Walle

[permalink] [raw]
Subject: Re: [PATCH 3/3] Remove CONFIG_STRICT_DEVMEM

* Alan Cox [2008-11-16 15:09]:
>
> > -#ifdef CONFIG_STRICT_DEVMEM
> > -
> > int devmem_restricted = 1;
> >
> > -#if defined(CONFIG_SYSCTL) && defined(CONFIG_STRICT_DEVMEM)
> > +#if defined(CONFIG_SYSCTL)
> > /*
> > * since there is no unload function, we don't have to deregister that
> > * the whole lifetime of the kernel and can ignore the return value
>
> NAK - this adds a pile of memory wasting complete crap to every kernel
> including embedded systems. At least before you could turn it off now you
> can't.

For embedded users, do they use CONFIG_SYSCTL at all? If not, then we
can just allow /dev/mem access when CONFIG_SYSCTL is turned on. That
should make embedded users happy.

But: I don't want to get that patch in. I just want to have the sysctl,
if just that patch doesn't make it, well, ok for me. I just think
CONFIG_STRICT_DEVMEM is superfluous when the runtime configuration
option is present.


Regards,
Bernhard

2008-11-16 15:39:19

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

On Sun, 16 Nov 2008 15:47:45 +0100
Bernhard Walle <[email protected]> wrote:

> While the original submission of CONFIG_STRICT_DEVMEM mentions that
> the option has been in RHEL and Fedora for 4 years without problems,
> that's only a half of the story. The truth is that at least RHEL
> has /dev/crash exactly to circumvent that /dev/mem restriction. Don't
> tell me that this is better than having that sysctl entry. ;-)

I assume /dev/crash is read only


but your series still makes absolutely no sense to me...
really. Nak.

You either want this at compile time or you don't want it at all.


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-11-16 15:45:59

by Alan

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

> I don't need to explain what protection STRICT_DEVMEM provides, just
> because I didn't submit STRICT_DEVMEM. However:

Which also doesn't explain what this turd is for.

> Author: Arjan van de Ven <[email protected]>
> Date: Thu Apr 24 23:40:47 2008 +0200
>
> The X server needs access to /dev/mem for the PCI space, but it doesn't need
> access to memory; both the file permissions and SELinux permissions of /dev/mem
> just make X effectively super-super powerful. With the exception of the
> BIOS area, there's just no valid app that uses /dev/mem on actual memory.

Note that this statement directly conflicts with your debugging statement
you need it switchable, and directly conflicts with the Red Hat crash
memory access. So you are trying to support something with a changelog
that demonstrates that what you are trying to make configurable is
completely broken anyway

The functionality provided by STRICT_DEVMEM is the same with it on or off
- absolutely *nothing*.

> So without that patch, a distributor needs to have two kernels: One
> with SELinux and with /dev/mem protection and one without it. If I
> remember correctly, you can turn off SELinux on the boot command line.

You can turn it off at boot time, but if you intend not to use it then it
is better (and measurably so with microbenchmark tools) to compile
without it. Red Hat doesn't do the two kernels as the maintenance cost
exceeds the gain for customers. Note however that compiling it out really
does compile it *out* and the overhead is gone totally for the many
embedded and other devices that don't use it.

> But I never used it. At least I don't see -selinux and -noselinux
> kernels in Redhat.

It is Red Hat, two words and a trademark (sorry but our legal people
insist we remind people who get it wrong).

Also I don't actually care what Red Hat ship. The fact Red Hat (or any
vendor) ship something doesn't make it a good idea.

> However, with my patch you can make everything configurable. With
> SELinux or Apparmor you can also protect the user from writing that
> sysctl. Or from loading kernel modules that circumvent that protection.

With your patch I get crap in the kernel I don't need. In every kernel
including those on memory tight devices like wireless routers that don't
need it.

You are turd polishing, and what is needed is a shovel.

Even if you want to turd polish there are cleaner solutions. A process
with CAP_SYS_RAWIO can cheerfully bypass any restriction you try and
place so you could rip out all the sysctl crap and just say that
the /dev/mem restriction doesn't apply to a CAP_SYS_RAWIO process. It's
still a turd but the change is a lot cleaner and one line long with no
userspace semantics, paths and the like.

There are proper ways to deal with X, modern video cards and modern
security models. They involve using framebuffer mappings off the PCI
device node itself and DRI.

Alan

2008-11-16 15:47:18

by Alan

[permalink] [raw]
Subject: Re: [PATCH 3/3] Remove CONFIG_STRICT_DEVMEM

> if just that patch doesn't make it, well, ok for me. I just think
> CONFIG_STRICT_DEVMEM is superfluous when the runtime configuration
> option is present.

It should always be "N" because the only other option means "add
pointless code of no value whatsoever"

Which means strict_devmem really belongs in the bitbucket as a bad idea

Alan

2008-11-16 15:56:21

by Bernhard Walle

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

* Arjan van de Ven [2008-11-16 07:39]:
>
> On Sun, 16 Nov 2008 15:47:45 +0100
> Bernhard Walle <[email protected]> wrote:
>
> > While the original submission of CONFIG_STRICT_DEVMEM mentions that
> > the option has been in RHEL and Fedora for 4 years without problems,
> > that's only a half of the story. The truth is that at least RHEL
> > has /dev/crash exactly to circumvent that /dev/mem restriction. Don't
> > tell me that this is better than having that sysctl entry. ;-)
>
> I assume /dev/crash is read only

I don't know. But if that matters, why can't we make /dev/mem
write-only for certain areas and read-only for the rest ...?

> You either want this at compile time or you don't want it at all.

Why? You don't write something about my arguments (as Alan does, even
though I disagree), you only write that you "nak" it.



Regards,
Bernhard

2008-11-16 16:03:19

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

On Sun, 16 Nov 2008 15:45:41 +0000
Alan Cox <[email protected]> wrote:
ules that circumvent that
> > protection.
>
> With your patch I get crap in the kernel I don't need. In every kernel
> including those on memory tight devices like wireless routers that
> don't need it.
>
> You are turd polishing, and what is needed is a shovel.
>
> Even if you want to turd polish there are cleaner solutions. A process
> with CAP_SYS_RAWIO can cheerfully bypass any restriction you try and
> place

because it can load kernel modules?
or because it can bypass the iommu?

the point of the /dev/mem restrictions is to not allow things you know
you don't need, while still allowing X to function where it can access
the crap it does. Now in Bernhard's case he DOES need them, so he
shouldn't use the restrictions.

> There are proper ways to deal with X, modern video cards and modern
> security models. They involve using framebuffer mappings off the PCI
> device node itself and DRI.
>
when X has this for all hw that matters /dev/mem could go away for the
people who then no longer have any need for it.


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-11-16 16:06:16

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

On Sun, 16 Nov 2008 16:56:09 +0100
Bernhard Walle <[email protected]> wrote:

> > > While the original submission of CONFIG_STRICT_DEVMEM mentions
> > > that the option has been in RHEL and Fedora for 4 years without
> > > problems, that's only a half of the story. The truth is that at
> > > least RHEL has /dev/crash exactly to circumvent that /dev/mem
> > > restriction. Don't tell me that this is better than having that
> > > sysctl entry. ;-)
> >
> > I assume /dev/crash is read only
>
> I don't know. But if that matters, why can't we make /dev/mem
> write-only for certain areas and read-only for the rest ...?
>
> > You either want this at compile time or you don't want it at all.
>
> Why? You don't write something about my arguments (as Alan does, even
> though I disagree), you only write that you "nak" it.

It's very simple: if you want the strict form you really want the
strict form, not some half "oh but it's one line to turn off" form.


(Note: your usecase is in trouble in general already with PAT used by
modern video drivers: the requirement is that all mappings of the same
physical page are mapped with the same cachability semantics. mmaping
random parts of physical real memory via /dev/mem makes that a way too
complex issue and will likely turn the crash utility in what it's name
says ;-)

--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-11-16 16:09:47

by Alan

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

> because it can load kernel modules?
> or because it can bypass the iommu?

It has iopl, firmware loading, ioperm, raw disk I/O, mknod, module loading
etc etc..

> the point of the /dev/mem restrictions is to not allow things you know
> you don't need, while still allowing X to function where it can access
> the crap it does. Now in Bernhard's case he DOES need them, so he
> shouldn't use the restrictions.

I know what the point is, but it doesn't actually implement any
meaningful restriction to achieve that result, so it is worthless junk.

> > There are proper ways to deal with X, modern video cards and modern
> > security models. They involve using framebuffer mappings off the PCI
> > device node itself and DRI.
> >
> when X has this for all hw that matters /dev/mem could go away for the
> people who then no longer have any need for it.

Why should it go away ? It's a matter of file permissions and security
rules as to who can access it. Trying to make it go away is just more
fake-security crap.

Alan

2008-11-16 16:11:52

by Bernhard Walle

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

* Alan Cox [2008-11-16 15:45]:
>
> > I don't need to explain what protection STRICT_DEVMEM provides, just
> > because I didn't submit STRICT_DEVMEM. However:
>
> Which also doesn't explain what this turd is for.

Could we discuss more factual without words like "turd", please?

> > Author: Arjan van de Ven <[email protected]>
> > Date: Thu Apr 24 23:40:47 2008 +0200
> >
> > The X server needs access to /dev/mem for the PCI space, but it doesn't need
> > access to memory; both the file permissions and SELinux permissions of /dev/mem
> > just make X effectively super-super powerful. With the exception of the
> > BIOS area, there's just no valid app that uses /dev/mem on actual memory.
>
> Note that this statement directly conflicts with your debugging statement
> you need it switchable, and directly conflicts with the Red Hat crash
> memory access. So you are trying to support something with a changelog
> that demonstrates that what you are trying to make configurable is
> completely broken anyway
>
> The functionality provided by STRICT_DEVMEM is the same with it on or off
> - absolutely *nothing*.

Even with SELinux?

> You can turn it off at boot time, but if you intend not to use it then it
> is better (and measurably so with microbenchmark tools) to compile
> without it. Red Hat doesn't do the two kernels as the maintenance cost
> exceeds the gain for customers.

And that's the same argument why SUSE does not ship a -devmem and a
-nodevmem kernel. When you turn it off, you lose the protection that is
there if you use SELinux / Apparmor. If you turn it on, you cannot use
crash.

Note however that compiling it out really
> does compile it *out* and the overhead is gone totally for the many
> embedded and other devices that don't use it.

For SELinux, yes.

> > But I never used it. At least I don't see -selinux and -noselinux
> > kernels in Redhat.
>
> It is Red Hat, two words and a trademark (sorry but our legal people
> insist we remind people who get it wrong).

Oh well, I don't tell it our legal people if you would write SuSE
instead of SUSE. Even not S.u.S.E. ;-)

> > However, with my patch you can make everything configurable. With
> > SELinux or Apparmor you can also protect the user from writing that
> > sysctl. Or from loading kernel modules that circumvent that protection.
>
> With your patch I get crap in the kernel I don't need. In every kernel
> including those on memory tight devices like wireless routers that don't
> need it.

Well, if you let the CONFIG option there and only add the sysfs entry
it does not. Even most embedded stuff is not x86.

> Even if you want to turd polish there are cleaner solutions. A process
> with CAP_SYS_RAWIO can cheerfully bypass any restriction you try and
> place so you could rip out all the sysctl crap and just say that
> the /dev/mem restriction doesn't apply to a CAP_SYS_RAWIO process.

According to Arjan that does not work.


Regards,
Bernhard

2008-11-16 16:20:09

by Bernhard Walle

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

* Arjan van de Ven [2008-11-16 08:03]:

>
> the point of the /dev/mem restrictions is to not allow things you know
> you don't need, while still allowing X to function where it can access
> the crap it does. Now in Bernhard's case he DOES need them, so he
> shouldn't use the restrictions.

Right. But shipping two kernel images is a bit too much to turn a
restriction on or off. Get away from that "recompile your kernel".

But I get more and more convinced that we really want to just turn that
configuration option off. I'm no expert in security, and at some point,
I just have to believe what people write.


Regards,
Bernhard

2008-11-16 17:03:31

by Alan

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

> > Which also doesn't explain what this turd is for.
>
> Could we discuss more factual without words like "turd", please?

What else would you have me call it. It's junk.

> > The functionality provided by STRICT_DEVMEM is the same with it on or off
> > - absolutely *nothing*.
>
> Even with SELinux?

As far as I can tell yes, and certainly with vendor shipped rulesets.

> > does compile it *out* and the overhead is gone totally for the many
> > embedded and other devices that don't use it.
>
> For SELinux, yes.

And for strict devmem but only before your patch.

> Oh well, I don't tell it our legal people if you would write SuSE
> instead of SUSE. Even not S.u.S.E. ;-)

Hehe ;)

> Well, if you let the CONFIG option there and only add the sysfs entry
> it does not. Even most embedded stuff is not x86.

These days a large amount of embedded is x86 and this is growing.

> > Even if you want to turd polish there are cleaner solutions. A process
> > with CAP_SYS_RAWIO can cheerfully bypass any restriction you try and
> > place so you could rip out all the sysctl crap and just say that
> > the /dev/mem restriction doesn't apply to a CAP_SYS_RAWIO process.
>
> According to Arjan that does not work.

According to me CONFIG_STRICT_DEVMEM does not work either. Looking at the
X code most of it now tries really hard to use the right device files and
accesses. The window that is left is primarily the old ISA VGA hole and
even that could better be solved with a /dev/vgamem or similar that X
could use rather than /dev/mem.

Alan

2008-11-16 20:36:34

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Turn CONFIG_STRICT_DEVMEM in sysctl dev.mem.restricted

Arjan van de Ven wrote:
>
> I assume /dev/crash is read only
>
> but your series still makes absolutely no sense to me...
> really. Nak.
>
> You either want this at compile time or you don't want it at all.
>

You could certainly envision this as a boottime option, or even a
*one-way* settable runtime option (kind of like the BSD securelevel
idea.) The question is to which extent even that makes any sense.

-hpa

2008-11-17 13:58:12

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 3/3] Remove CONFIG_STRICT_DEVMEM

Bernhard Walle <[email protected]> wrote:

> For embedded users, do they use CONFIG_SYSCTL at all?

Yes. It still allows you to configure various things about the kernel.

David