2008-11-16 00:03:55

by Bernhard Walle

[permalink] [raw]
Subject: [RFC] Add dev.mem.restricted sysctl

While the first patch is only a small cleanup, the second patch adds a
new sysctl dev.mem.restricted. See that patch for a description why I
think that sysctl is needed.

Please give feedback and review. Thanks!


Signed-off-by: Bernhard Walle <[email protected]>


2008-11-16 00:04:19

by Bernhard Walle

[permalink] [raw]
Subject: [PATCH 1/2] Unify devmem_is_allowed across architectures

This patch unifies devmem_is_allowed() across different architectures.
It removes the devmem_is_allowed 1 defines from m32r and frv and
adds a generic fallback implementation to <asm-generic/page.h>.

Architecutes can define their version and define __HAVE_ARCH_RANGE_IS_ALLOWED.


Signed-off-by: Bernhard Walle <[email protected]>
---
arch/x86/include/asm/page.h | 1 +
include/asm-frv/page.h | 2 --
include/asm-generic/page.h | 4 ++++
include/asm-m32r/page.h | 2 --
4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index e9873a2..b768401 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -63,6 +63,7 @@ extern void map_devmem(unsigned long pfn, unsigned long size,
pgprot_t vma_prot);
extern void unmap_devmem(unsigned long pfn, unsigned long size,
pgprot_t vma_prot);
+#define __HAVE_ARCH_RANGE_IS_ALLOWED 1

extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;
diff --git a/include/asm-frv/page.h b/include/asm-frv/page.h
index bd9c220..bc2c835 100644
--- a/include/asm-frv/page.h
+++ b/include/asm-frv/page.h
@@ -40,8 +40,6 @@ typedef struct page *pgtable_t;
#define __pgprot(x) ((pgprot_t) { (x) } )
#define PTE_MASK PAGE_MASK

-#define devmem_is_allowed(pfn) 1
-
#define __pa(vaddr) virt_to_phys((void *) (unsigned long) (vaddr))
#define __va(paddr) phys_to_virt((unsigned long) (paddr))

diff --git a/include/asm-generic/page.h b/include/asm-generic/page.h
index 14db733..cde4ceb 100644
--- a/include/asm-generic/page.h
+++ b/include/asm-generic/page.h
@@ -19,6 +19,10 @@ static __inline__ __attribute_const__ int get_order(unsigned long size)
return order;
}

+#ifndef __HAVE_ARCH_RANGE_IS_ALLOWED
+#define devmem_is_allowed(pfn) 1
+#endif
+
#endif /* __ASSEMBLY__ */

#endif /* _ASM_GENERIC_PAGE_H */
diff --git a/include/asm-m32r/page.h b/include/asm-m32r/page.h
index c933308..89d6b10 100644
--- a/include/asm-m32r/page.h
+++ b/include/asm-m32r/page.h
@@ -79,8 +79,6 @@ typedef struct page *pgtable_t;
#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | VM_EXEC | \
VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC )

-#define devmem_is_allowed(x) 1
-
#include <asm-generic/memory_model.h>
#include <asm-generic/page.h>

--
1.6.0.4

2008-11-16 00:04:35

by Bernhard Walle

[permalink] [raw]
Subject: [PATCH 2/2] Add dev.mem.restricted sysctl

When CONFIG_STRICT_DEVMEM is set, live debugging is not possible with the
crash utility (see http://people.redhat.com/~anderson). For distributors
who ship a generic kernel it's difficult: Disabling CONFIG_STRICT_DEVMEM
is possible, but in general the protection provided by CONFIG_STRICT_DEVMEM
is useful. However, live debugging should be still neceessary.

This patch now adds a dev.mem.restricted sysctl that defaults to 0 (off).
When set to 1 (on), /dev/mem access is unrestricted and crash can be used.

>From a security point of view the sysctl should be no problem. It's already
possible to circumvent that restriction if you have root access by loading
a kernel module that installs a kretprobe that returns 1 for the check function.

I thought of a command line parameter first, but we already have lots of
command line parameters and rebooting the machine is more difficult than
just setting a sysctl to 1. It may be possible to implement setting that
variable in the tools automatically, but that's out of the scope of that
patch for the kernel and should also not be discussed on LKML.


Signed-off-by: Bernhard Walle <[email protected]>
---
arch/x86/include/asm/page.h | 7 +++++
arch/x86/mm/pat.c | 9 +------
drivers/char/mem.c | 57 +++++++++++++++++++++++++++++++++++++-----
include/linux/sysctl.h | 6 ++++
kernel/sysctl_check.c | 7 +++++
5 files changed, 71 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index b768401..e5fe778 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -65,6 +65,13 @@ extern void unmap_devmem(unsigned long pfn, unsigned long size,
pgprot_t vma_prot);
#define __HAVE_ARCH_RANGE_IS_ALLOWED 1

+
+#ifdef CONFIG_STRICT_DEVMEM
+extern int devmem_restricted;
+#else
+#define devmem_restricted 0
+#endif
+
extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index eb1bf00..c80ced4 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -26,6 +26,7 @@
#include <asm/msr.h>
#include <asm/pat.h>
#include <asm/io.h>
+#include <asm/page.h>

#ifdef CONFIG_X86_PAT
int __read_mostly pat_enabled = 1;
@@ -474,13 +475,6 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
return vma_prot;
}

-#ifdef CONFIG_STRICT_DEVMEM
-/* This check is done in drivers/char/mem.c in case of STRICT_DEVMEM*/
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
-{
- return 1;
-}
-#else
/* This check is needed to avoid cache aliasing when PAT is enabled */
static inline int range_is_allowed(unsigned long pfn, unsigned long size)
{
@@ -503,7 +497,6 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
}
return 1;
}
-#endif /* CONFIG_STRICT_DEVMEM */

int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn,
unsigned long size, pgprot_t *vma_prot)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 6431f69..562438e 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -27,6 +27,7 @@
#include <linux/splice.h>
#include <linux/pfn.h>
#include <linux/smp_lock.h>
+#include <linux/sysctl.h>

#include <asm/uaccess.h>
#include <asm/io.h>
@@ -35,6 +36,48 @@
# include <linux/efi.h>
#endif

+
+#ifdef CONFIG_STRICT_DEVMEM
+
+int devmem_restricted = 1;
+
+#ifdef CONFIG_SYSCTL
+struct ctl_table dev_mem_restricted_sysctl_table[] = {
+ {
+ .ctl_name = DEV_MEM_RESTRICTED,
+ .procname = "restricted",
+ .data = &devmem_restricted,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec,
+ },
+ { .ctl_name = 0 }
+};
+
+struct ctl_table dev_mem_sysctl_table[] = {
+ {
+ .ctl_name = DEV_MEM,
+ .procname = "mem",
+ .mode = 0555,
+ .child = dev_mem_restricted_sysctl_table
+ },
+ { .ctl_name = 0 }
+};
+
+struct ctl_table dev_sysctl_table[] = {
+ {
+ .ctl_name = CTL_DEV,
+ .procname = "dev",
+ .mode = 0555,
+ .child = dev_mem_sysctl_table
+ },
+ { .ctl_name = 0 }
+};
+#endif
+
+#endif /* CONFIG_STRICT_DEVMEM */
+
+
/*
* Architectures vary in how they handle caching for addresses
* outside of main memory.
@@ -80,13 +123,15 @@ static inline int valid_mmap_phys_addr_range(unsigned long pfn, size_t size)
}
#endif

-#ifdef CONFIG_STRICT_DEVMEM
static inline int range_is_allowed(unsigned long pfn, unsigned long size)
{
u64 from = ((u64)pfn) << PAGE_SHIFT;
u64 to = from + size;
u64 cursor = from;

+ if (!devmem_restricted)
+ return 1;
+
while (cursor < to) {
if (!devmem_is_allowed(pfn)) {
printk(KERN_INFO
@@ -99,12 +144,6 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
}
return 1;
}
-#else
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
-{
- return 1;
-}
-#endif

void __attribute__((weak)) unxlate_dev_mem_ptr(unsigned long phys, void *addr)
{
@@ -996,6 +1035,10 @@ static int __init chr_dev_init(void)
MKDEV(MEM_MAJOR, devlist[i].minor), NULL,
devlist[i].name);

+#if defined(CONFIG_SYSCTL) && defined(CONFIG_STRICT_DEVMEM)
+ register_sysctl_table(dev_sysctl_table);
+#endif
+
return 0;
}

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 39d471d..45ab794 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -858,6 +858,7 @@ enum {
DEV_MAC_HID=5,
DEV_SCSI=6,
DEV_IPMI=7,
+ DEV_MEM=8,
};

/* /proc/sys/dev/cdrom */
@@ -875,6 +876,11 @@ enum {
DEV_PARPORT_DEFAULT=-3
};

+/* /proc/sys/dev/mem */
+enum {
+ DEV_MEM_RESTRICTED=1,
+};
+
/* /proc/sys/dev/raid */
enum {
DEV_RAID_SPEED_LIMIT_MIN=1,
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index c35da23..2cbdd52 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -870,6 +870,12 @@ static const struct trans_ctl_table trans_parport_table[] = {
{}
};

+
+static const struct trans_ctl_table trans_mem_table[] = {
+ { DEV_MEM_RESTRICTED, "restricted" },
+ {}
+};
+
static const struct trans_ctl_table trans_dev_table[] = {
{ DEV_CDROM, "cdrom", trans_cdrom_table },
/* DEV_HWMON unused */
@@ -878,6 +884,7 @@ static const struct trans_ctl_table trans_dev_table[] = {
{ DEV_MAC_HID, "mac_hid", trans_mac_hid_files },
{ DEV_SCSI, "scsi", trans_scsi_table },
{ DEV_IPMI, "ipmi", trans_ipmi_table },
+ { DEV_MEM, "mem", trans_mem_table },
{}
};

--
1.6.0.4

2008-11-16 00:07:46

by Alan

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

On Sun, 16 Nov 2008 01:03:43 +0100
Bernhard Walle <[email protected]> wrote:

> When CONFIG_STRICT_DEVMEM is set, live debugging is not possible with the
> crash utility (see http://people.redhat.com/~anderson). For distributors
> who ship a generic kernel it's difficult: Disabling CONFIG_STRICT_DEVMEM
> is possible, but in general the protection provided by CONFIG_STRICT_DEVMEM
> is useful. However, live debugging should be still neceessary.

Why not just turn strictmem off - as you've correctly demonstrated its
completely useless and always was.

A switchable configurable piece of turd is still at the end of the day a
piece of turd.

2008-11-16 00:09:25

by Bernhard Walle

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

* Alan Cox [2008-11-16 00:07]:
>
> On Sun, 16 Nov 2008 01:03:43 +0100
> Bernhard Walle <[email protected]> wrote:
>
> > When CONFIG_STRICT_DEVMEM is set, live debugging is not possible with the
> > crash utility (see http://people.redhat.com/~anderson). For distributors
> > who ship a generic kernel it's difficult: Disabling CONFIG_STRICT_DEVMEM
> > is possible, but in general the protection provided by CONFIG_STRICT_DEVMEM
> > is useful. However, live debugging should be still neceessary.
>
> Why not just turn strictmem off - as you've correctly demonstrated its
> completely useless and always was.
>
> A switchable configurable piece of turd is still at the end of the day a
> piece of turd.

Well, I think that option makes sense to protect the system
from processes that *accidentally* read/write to the wrong memory
location (like the X server that always runs as root).


Regards,
Bernhard

2008-11-16 00:11:19

by Bernhard Walle

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

* Bernhard Walle [2008-11-16 01:03]:
>
> +
> +static const struct trans_ctl_table trans_mem_table[] = {
> + { DEV_MEM_RESTRICTED, "restricted" },
> + {}
> +};

Noticed that we either need to put everything into CONFIG_STRICT_DEVMEM
or nothing. But I want to head opinions if the concept can be accepted
before I repost.



Regards,
Bernhard

2008-11-16 00:12:40

by Alan

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

> Well, I think that option makes sense to protect the system
> from processes that *accidentally* read/write to the wrong memory
> location (like the X server that always runs as root).

Older X drivers very carefully only map the memory they need. They also
unmap and remap as well as using iopl/ioperm ranges to avoid such
accidents. Newer ones are all DMA command chains and stuff so it won't
make one iota of difference.

2008-11-16 00:30:11

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

On Sun, Nov 16, 2008 at 01:03:43AM +0100, Bernhard Walle wrote:
> --- a/include/linux/sysctl.h
> +++ b/include/linux/sysctl.h
> @@ -858,6 +858,7 @@ enum {
> DEV_MAC_HID=5,
> DEV_SCSI=6,
> DEV_IPMI=7,
> + DEV_MEM=8,
> };
>
> /* /proc/sys/dev/cdrom */
> @@ -875,6 +876,11 @@ enum {
> DEV_PARPORT_DEFAULT=-3
> };
>
> +/* /proc/sys/dev/mem */
> +enum {
> + DEV_MEM_RESTRICTED=1,
> +};

No new binary sysctls please.

2008-11-16 03:48:36

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

On Sun, 16 Nov 2008 01:03:43 +0100
Bernhard Walle <[email protected]> wrote:

> When CONFIG_STRICT_DEVMEM is set, live debugging is not possible with
> the crash utility (see http://people.redhat.com/~anderson). For
> distributors who ship a generic kernel it's difficult: Disabling
> CONFIG_STRICT_DEVMEM is possible, but in general the protection
> provided by CONFIG_STRICT_DEVMEM is useful. However, live debugging
> should be still neceessary.
>
> This patch now adds a dev.mem.restricted sysctl that defaults to 0
> (off). When set to 1 (on), /dev/mem access is unrestricted and crash
> can be used.
>


sounds like a really bad idea to me.
If you want to use /dev/mem like this, don't enable the config option
to restrict it. Really.


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-11-16 09:28:55

by Bernhard Walle

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

* Alexey Dobriyan [2008-11-16 03:33]:
>
> On Sun, Nov 16, 2008 at 01:03:43AM +0100, Bernhard Walle wrote:
> > --- a/include/linux/sysctl.h
> > +++ b/include/linux/sysctl.h
> > @@ -858,6 +858,7 @@ enum {
> > DEV_MAC_HID=5,
> > DEV_SCSI=6,
> > DEV_IPMI=7,
> > + DEV_MEM=8,
> > };
> >
> > /* /proc/sys/dev/cdrom */
> > @@ -875,6 +876,11 @@ enum {
> > DEV_PARPORT_DEFAULT=-3
> > };
> >
> > +/* /proc/sys/dev/mem */
> > +enum {
> > + DEV_MEM_RESTRICTED=1,
> > +};
>
> No new binary sysctls please.

Why is that sysctl binary? It's just an integer.


Regards,
Bernhard

2008-11-16 09:30:25

by Bernhard Walle

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

* Arjan van de Ven [2008-11-15 19:49]:
>
> On Sun, 16 Nov 2008 01:03:43 +0100
> Bernhard Walle <[email protected]> wrote:
>
> > When CONFIG_STRICT_DEVMEM is set, live debugging is not possible with
> > the crash utility (see http://people.redhat.com/~anderson). For
> > distributors who ship a generic kernel it's difficult: Disabling
> > CONFIG_STRICT_DEVMEM is possible, but in general the protection
> > provided by CONFIG_STRICT_DEVMEM is useful. However, live debugging
> > should be still neceessary.
> >
> > This patch now adds a dev.mem.restricted sysctl that defaults to 0
> > (off). When set to 1 (on), /dev/mem access is unrestricted and crash
> > can be used.
>
> sounds like a really bad idea to me.
> If you want to use /dev/mem like this, don't enable the config option
> to restrict it. Really.

So, what's that restriction really for? Maybe it would make sense to
remove that CONFIG option entirely and turn it into the systl? Just an
idea.


Regards,
Bernhard

2008-11-16 09:35:47

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

On Sun, 16 Nov 2008 10:30:05 +0100
Bernhard Walle <[email protected]> wrote:

> * Arjan van de Ven [2008-11-15 19:49]:
> >
> > On Sun, 16 Nov 2008 01:03:43 +0100
> > Bernhard Walle <[email protected]> wrote:
> >
> > > When CONFIG_STRICT_DEVMEM is set, live debugging is not possible
> > > with the crash utility (see http://people.redhat.com/~anderson).
> > > For distributors who ship a generic kernel it's difficult:
> > > Disabling CONFIG_STRICT_DEVMEM is possible, but in general the
> > > protection provided by CONFIG_STRICT_DEVMEM is useful. However,
> > > live debugging should be still neceessary.
> > >
> > > This patch now adds a dev.mem.restricted sysctl that defaults to 0
> > > (off). When set to 1 (on), /dev/mem access is unrestricted and
> > > crash can be used.
> >
> > sounds like a really bad idea to me.
> > If you want to use /dev/mem like this, don't enable the config
> > option to restrict it. Really.
>
> So, what's that restriction really for?

It's for not allowing something that you don't need, if you
know you don't need it. Least privilege and all that.
(so that selinux can give X access to devices but not to all of memory.
I know Alan mentioned DMA but that's also a choice, a lot of the systems
on the market for the last year or two have an IOMMU that you can use
etc)

But if you do need it, then you do need it, simple as that.

(that leave the entire fun question of cache attributes but thats' a
whole different animal)


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-11-16 10:09:56

by Alexey Dobriyan

[permalink] [raw]
Subject: Re: [PATCH 2/2] Add dev.mem.restricted sysctl

On Sun, Nov 16, 2008 at 10:28:43AM +0100, Bernhard Walle wrote:
> * Alexey Dobriyan [2008-11-16 03:33]:
> >
> > On Sun, Nov 16, 2008 at 01:03:43AM +0100, Bernhard Walle wrote:
> > > --- a/include/linux/sysctl.h
> > > +++ b/include/linux/sysctl.h
> > > @@ -858,6 +858,7 @@ enum {
> > > DEV_MAC_HID=5,
> > > DEV_SCSI=6,
> > > DEV_IPMI=7,
> > > + DEV_MEM=8,
> > > };
> > >
> > > /* /proc/sys/dev/cdrom */
> > > @@ -875,6 +876,11 @@ enum {
> > > DEV_PARPORT_DEFAULT=-3
> > > };
> > >
> > > +/* /proc/sys/dev/mem */
> > > +enum {
> > > + DEV_MEM_RESTRICTED=1,
> > > +};
> >
> > No new binary sysctls please.
>
> Why is that sysctl binary? It's just an integer.

It's a name for sysctls accessible via sysctl(2).

Please, if you're adding sysctl, supply only .procname and make it accessible
only via /proc/sys/.