2015-11-20 17:32:21

by Dan Williams

Subject: [RFC PATCH] restrict /dev/mem to idle io memory ranges

This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
semantics by default. If userspace really believes it is safe to access
the memory region it can also perform the extra step of disabling an
active driver. This protects device address ranges with read side
effects and otherwise directs userspace to use the driver.

Persistent memory presents a large "mistake surface" to /dev/mem as now
accidental writes can corrupt a filesystem.

Cc: Kees Cook <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
 arch/arm/Kconfig.debug       | 14 --------------
 arch/arm64/Kconfig.debug     | 14 --------------
 arch/powerpc/Kconfig.debug   | 12 ------------
 arch/s390/Kconfig.debug      | 12 ------------
 arch/tile/Kconfig            |  3 ---
 arch/unicore32/Kconfig.debug | 14 --------------
 arch/x86/Kconfig.debug       | 17 -----------------
 kernel/resource.c            |  3 +++
 lib/Kconfig.debug            | 36 ++++++++++++++++++++++++++++++++++++
 9 files changed, 39 insertions(+), 86 deletions(-)

diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 259c0ca9c99a..e356357d86bb 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -15,20 +15,6 @@ config ARM_PTDUMP
 	  kernel.
 	  If in doubt, say "N"

-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	depends on MMU
-	---help---
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to memory mapped peripherals.
-
-	  If in doubt, say Y.
-
 # RMK wants arm kernels compiled with frame pointers or stack unwinding.
 # If you know what you are doing and are willing to live without stack
 # traces, you can get a slightly smaller kernel by setting this option to
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
index 04fb73b973f1..e13c4bf84d9e 100644
--- a/arch/arm64/Kconfig.debug
+++ b/arch/arm64/Kconfig.debug
@@ -14,20 +14,6 @@ config ARM64_PTDUMP
 	  kernel.
 	  If in doubt, say "N"

-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	depends on MMU
-	help
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to memory mapped peripherals.
-
-	  If in doubt, say Y.
-
 config PID_IN_CONTEXTIDR
 	bool "Write the current PID to the CONTEXTIDR register"
 	help
diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 3a510f4a6b68..a0e44a9c456f 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -335,18 +335,6 @@ config PPC_EARLY_DEBUG_CPM_ADDR
 	  platform probing is done, all platforms selected must
 	  share the same address.

-config STRICT_DEVMEM
-	def_bool y
-	prompt "Filter access to /dev/mem"
-	help
-	  This option restricts access to /dev/mem. If this option is
-	  disabled, you allow userspace access to all memory, including
-	  kernel and userspace memory. Accidental memory access is likely
-	  to be disastrous.
-	  Memory access is required for experts who want to debug the kernel.
-
-	  If you are unsure, say Y.
-
 config FAIL_IOMMU
 	bool "Fault-injection capability for IOMMU"
 	depends on FAULT_INJECTION
diff --git a/arch/s390/Kconfig.debug b/arch/s390/Kconfig.debug
index c56878e1245f..26c5d5beb4be 100644
--- a/arch/s390/Kconfig.debug
+++ b/arch/s390/Kconfig.debug
@@ -5,18 +5,6 @@ config TRACE_IRQFLAGS_SUPPORT

 source "lib/Kconfig.debug"

-config STRICT_DEVMEM
-	def_bool y
-	prompt "Filter access to /dev/mem"
-	---help---
-	  This option restricts access to /dev/mem. If this option is
-	  disabled, you allow userspace access to all memory, including
-	  kernel and userspace memory. Accidental memory access is likely
-	  to be disastrous.
-	  Memory access is required for experts who want to debug the kernel.
-
-	  If you are unsure, say Y.
-
 config S390_PTDUMP
 	bool "Export kernel pagetable layout to userspace via debugfs"
 	depends on DEBUG_KERNEL
diff --git a/arch/tile/Kconfig b/arch/tile/Kconfig
index 106c21bd7f44..7b2d40db11fa 100644
--- a/arch/tile/Kconfig
+++ b/arch/tile/Kconfig
@@ -116,9 +116,6 @@ config ARCH_DISCONTIGMEM_DEFAULT
 config TRACE_IRQFLAGS_SUPPORT
 	def_bool y

-config STRICT_DEVMEM
-	def_bool y
-
 # SMP is required for Tilera Linux.
 config SMP
 	def_bool y
diff --git a/arch/unicore32/Kconfig.debug b/arch/unicore32/Kconfig.debug
index 1a3626239843..f075bbe1d46f 100644
--- a/arch/unicore32/Kconfig.debug
+++ b/arch/unicore32/Kconfig.debug
@@ -2,20 +2,6 @@ menu "Kernel hacking"

 source "lib/Kconfig.debug"

-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	depends on MMU
-	---help---
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to memory mapped peripherals.
-
-	  If in doubt, say Y.
-
 config EARLY_PRINTK
 	def_bool DEBUG_OCD
 	help
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 137dfa96aa14..1116452fcfc2 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -5,23 +5,6 @@ config TRACE_IRQFLAGS_SUPPORT

 source "lib/Kconfig.debug"

-config STRICT_DEVMEM
-	bool "Filter access to /dev/mem"
-	---help---
-	  If this option is disabled, you allow userspace (root) access to all
-	  of memory, including kernel and userspace memory. Accidental
-	  access to this is obviously disastrous, but specific access can
-	  be used by people debugging the kernel. Note that with PAT support
-	  enabled, even in this case there are restrictions on /dev/mem
-	  use due to the cache aliasing requirements.
-
-	  If this option is switched on, the /dev/mem file only allows
-	  userspace access to PCI space and the BIOS code and data regions.
-	  This is sufficient for dosemu and X and all common users of
-	  /dev/mem.
-
-	  If in doubt, say Y.
-
 config X86_VERBOSE_BOOTUP
 	bool "Enable verbose x86 bootup info messages"
 	default y
diff --git a/kernel/resource.c b/kernel/resource.c
index f150dbbe6f62..03a8b09f68a8 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1498,6 +1498,9 @@ int iomem_is_exclusive(u64 addr)
 			break;
 		if (p->end < addr)
 			continue;
+		if (IS_ENABLED(CONFIG_IO_STRICT_DEVMEM)
+				&& p->flags & IORESOURCE_BUSY)
+			break;
 		if (p->flags & IORESOURCE_BUSY &&
 		     p->flags & IORESOURCE_EXCLUSIVE) {
 			err = 1;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8c15b29d5adc..a188d7757e26 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1853,3 +1853,39 @@ source "samples/Kconfig"

 source "lib/Kconfig.kgdb"

+config STRICT_DEVMEM
+	bool "Filter access to /dev/mem"
+	depends on MMU
+	default y if TILE || PPC || S390
+	---help---
+	  If this option is disabled, you allow userspace (root) access to all
+	  of memory, including kernel and userspace memory. Accidental
+	  access to this is obviously disastrous, but specific access can
+	  be used by people debugging the kernel. Note that with PAT support
+	  enabled, even in this case there are restrictions on /dev/mem
+	  use due to the cache aliasing requirements.
+
+	  If this option is switched on, the /dev/mem file only allows
+	  userspace access to PCI space and the BIOS code and data regions.
+	  This is sufficient for dosemu and X and all common users of
+	  /dev/mem.
+
+	  If in doubt, say Y.
+
+config IO_STRICT_DEVMEM
+	bool "Filter I/O access to /dev/mem"
+	depends on STRICT_DEVMEM
+	---help---
+	  If this option is disabled, you allow userspace (root) access
+	  to all io memory regardless of whether a driver is actively
+	  using that range. Accidental access to this is obviously
+	  disastrous, but specific access can be used by people
+	  debugging the kernel.
+
+	  If this option is switched on, the /dev/mem file only allows
+	  userspace access to *idle* io memory ranges (any non "System
+	  RAM" range listed in /proc/iomem). This may break
+	  traditional users of /dev/mem if the driver using a given
+	  range cannot be disabled.
+
+	  If in doubt, say N.
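
The effect of the kernel/resource.c hunk can be modeled outside the kernel.
What follows is a minimal sketch, not the kernel code: struct model_res, the
MODEL_* flag bits and model_is_exclusive() are hypothetical names, and the
resource tree is flattened to an array sorted by start address. The sketch
assumes the intended result is that a busy range is reported as exclusive
when IO_STRICT_DEVMEM is enabled; note that the hunk as posted breaks out of
the loop without setting err.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits for this model only; not the kernel's values. */
#define MODEL_BUSY      (1u << 0)
#define MODEL_EXCLUSIVE (1u << 1)

#define MODEL_PAGE_SIZE 4096u

struct model_res {
	uint64_t start;   /* first byte of the range */
	uint64_t end;     /* last byte of the range, inclusive */
	unsigned flags;   /* MODEL_BUSY and/or MODEL_EXCLUSIVE */
};

/*
 * Return true if the page containing addr overlaps a range that /dev/mem
 * should refuse. res[] must be sorted by start address. io_strict stands
 * in for IS_ENABLED(CONFIG_IO_STRICT_DEVMEM).
 */
bool model_is_exclusive(const struct model_res *res, size_t n,
			uint64_t addr, bool io_strict)
{
	addr &= ~(uint64_t)(MODEL_PAGE_SIZE - 1);

	for (size_t i = 0; i < n; i++) {
		if (res[i].start >= addr + MODEL_PAGE_SIZE)
			break;          /* every later range lies past the page */
		if (res[i].end < addr)
			continue;       /* range ends before the page */
		if (!(res[i].flags & MODEL_BUSY))
			continue;       /* idle range: always allowed */
		if (io_strict || (res[i].flags & MODEL_EXCLUSIVE))
			return true;    /* busy is enough when io_strict is set */
	}
	return false;
}
```

With io_strict false, only ranges that are both busy and exclusive are
refused, which matches the existing check. With io_strict true, any busy
range is refused, and an idle range stays accessible either way.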


2015-11-20 20:02:53

by Arnd Bergmann

Subject: Re: [RFC PATCH] restrict /dev/mem to idle io memory ranges

On Friday 20 November 2015 09:31:33 Dan Williams wrote:
> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
> semantics by default. If userspace really believes it is safe to access
> the memory region it can also perform the extra step of disabling an
> active driver. This protects device address ranges with read side
> effects and otherwise directs userspace to use the driver.
>
> Persistent memory presents a large "mistake surface" to /dev/mem as now
> accidental writes can corrupt a filesystem.
>
> Cc: Kees Cook <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> Cc: Will Deacon <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Martin Schwidefsky <[email protected]>
> Cc: Heiko Carstens <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
>

I like the idea.

Maybe split the change up into two patches, where the first one
just does the trivial move of the Kconfig option, and the second
one that changes behavior is small?

There is also a question of whether we actually need two options
or if we can safely make the existing option stricter.

Arnd

2015-11-20 20:07:12

by Kees Cook

Subject: Re: [RFC PATCH] restrict /dev/mem to idle io memory ranges

On Fri, Nov 20, 2015 at 12:00 PM, Arnd Bergmann <[email protected]> wrote:
> On Friday 20 November 2015 09:31:33 Dan Williams wrote:
>> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
>> semantics by default. If userspace really believes it is safe to access
>> the memory region it can also perform the extra step of disabling an
>> active driver. This protects device address ranges with read side
>> effects and otherwise directs userspace to use the driver.
>>
>> Persistent memory presents a large "mistake surface" to /dev/mem as now
>> accidental writes can corrupt a filesystem.
>>
>> Cc: Kees Cook <[email protected]>
>> Cc: Russell King <[email protected]>
>> Cc: Catalin Marinas <[email protected]>
>> Cc: Will Deacon <[email protected]>
>> Cc: Benjamin Herrenschmidt <[email protected]>
>> Cc: Martin Schwidefsky <[email protected]>
>> Cc: Heiko Carstens <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Ingo Molnar <[email protected]>
>> Cc: "H. Peter Anvin" <[email protected]>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Greg Kroah-Hartman <[email protected]>
>> Signed-off-by: Dan Williams <[email protected]>
>>
>
> I like the idea.

Yes please! I was always surprised that IORESOURCE_BUSY was allowed
under STRICT_DEVMEM.

> Maybe split the change up into two patches, where the first one
> just does the trivial move of the Kconfig option, and the second
> one that changes behavior is small?

Agreed: consolidate the per-arch Kconfigs first.

> There is also a question of whether we actually need two options
> or if we can safely make the existing option stricter.

Right -- what actually breaks if we add _BUSY to getting blocked?

-Kees

--
Kees Cook
Chrome OS Security

2015-11-20 20:12:32

by Russell King - ARM Linux

Subject: Re: [RFC PATCH] restrict /dev/mem to idle io memory ranges

On Fri, Nov 20, 2015 at 09:31:33AM -0800, Dan Williams wrote:
> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
> semantics by default. If userspace really believes it is safe to access
> the memory region it can also perform the extra step of disabling an
> active driver. This protects device address ranges with read side
> effects and otherwise directs userspace to use the driver.

I'm happy with this as long as we retain the option to disable this
new behaviour.

The reason being, when developing a driver, it is _very_ useful to
be able to poke around in the device's (and system memory) address
spaces with tools like devmem2 to work out what's going on when
things go wrong.

To put it another way, I think it's a good idea to disable access to
these regions on production systems, but for driver development, we
want to retain the ability to poke around in physical address space
in any way we so desire.
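
For reference, the kind of access a devmem2-style tool performs can be
sketched as below. This is an illustrative sketch, not devmem2's source:
split_phys_addr() and read_phys32() are hypothetical helpers, and the path
argument would normally be "/dev/mem". On a kernel that filters the
requested range, the mmap() call is expected to fail, in which case
read_phys32() returns -1 with errno set.

```c
#define _FILE_OFFSET_BITS 64

#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Split a physical address into a page-aligned base and an in-page offset. */
void split_phys_addr(uint64_t phys, uint64_t page_size,
		     uint64_t *base, uint64_t *offset)
{
	*base = phys & ~(page_size - 1);
	*offset = phys - *base;
}

/*
 * Read one 32-bit word at physical address phys through the file at path.
 * Returns 0 on success, -1 with errno set on failure.
 */
int read_phys32(const char *path, uint64_t phys, uint32_t *out)
{
	uint64_t page = (uint64_t)sysconf(_SC_PAGESIZE);
	uint64_t base, off;
	void *map;
	int fd;

	if (phys & 3) {                 /* keep the word inside one page */
		errno = EINVAL;
		return -1;
	}
	split_phys_addr(phys, page, &base, &off);

	fd = open(path, O_RDONLY | O_SYNC);
	if (fd < 0)
		return -1;

	map = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, (off_t)base);
	if (map == MAP_FAILED) {
		int saved = errno;      /* close() must not clobber the cause */
		close(fd);
		errno = saved;
		return -1;
	}
	close(fd);                      /* the mapping outlives the descriptor */

	*out = *(const volatile uint32_t *)((const char *)map + off);
	munmap(map, page);
	return 0;
}
```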

--
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.

2015-11-20 20:26:16

by Dan Williams

Subject: Re: [RFC PATCH] restrict /dev/mem to idle io memory ranges

On Fri, Nov 20, 2015 at 12:12 PM, Russell King - ARM Linux
<[email protected]> wrote:
> On Fri, Nov 20, 2015 at 09:31:33AM -0800, Dan Williams wrote:
>> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
>> semantics by default. If userspace really believes it is safe to access
>> the memory region it can also perform the extra step of disabling an
>> active driver. This protects device address ranges with read side
>> effects and otherwise directs userspace to use the driver.
>
> I'm happy with this as long as we retain the option to disable this
> new behaviour.
>
> The reason being, when developing a driver, it is _very_ useful to
> be able to poke around in the device's (and system memory) address
> spaces with tools like devmem2 to work out what's going on when
> things go wrong.
>
> To put it another way, I think it's a good idea to disable access to
> these regions on production systems, but for driver development, we
> want to retain the ability to poke around in physical address space
> in any way we so desire.
>

Sounds ok to me, but I do think it's a good idea to default it to the
same value as STRICT_DEVMEM. Perhaps:

bool "Filter I/O access to /dev/mem" if EXPERT
default STRICT_DEVMEM

When this is in, do we even need IORESOURCE_EXCLUSIVE? It's barely used.

2015-11-20 20:45:06

by Kees Cook

Subject: Re: [RFC PATCH] restrict /dev/mem to idle io memory ranges

On Fri, Nov 20, 2015 at 12:26 PM, Dan Williams <[email protected]> wrote:
> On Fri, Nov 20, 2015 at 12:12 PM, Russell King - ARM Linux
> <[email protected]> wrote:
>> On Fri, Nov 20, 2015 at 09:31:33AM -0800, Dan Williams wrote:
>>> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
>>> semantics by default. If userspace really believes it is safe to access
>>> the memory region it can also perform the extra step of disabling an
>>> active driver. This protects device address ranges with read side
>>> effects and otherwise directs userspace to use the driver.
>>
>> I'm happy with this as long as we retain the option to disable this
>> new behaviour.
>>
>> The reason being, when developing a driver, it is _very_ useful to
>> be able to poke around in the device's (and system memory) address
>> spaces with tools like devmem2 to work out what's going on when
>> things go wrong.
>>
>> To put it another way, I think it's a good idea to disable access to
>> these regions on production systems, but for driver development, we
>> want to retain the ability to poke around in physical address space
>> in any way we so desire.
>>
>
> Sounds ok to me, but I do think it's a good idea to default it to the
> same value as STRICT_DEVMEM. Perhaps:
>
> bool "Filter I/O access to /dev/mem" if EXPERT
> default STRICT_DEVMEM
>
> When this is in, do we even need IORESOURCE_EXCLUSIVE? It's barely used.

Let's leave it for now to give us the debugging granularity Russell
mentioned. If it turns out it's never used, we can drop it in the
future.

-Kees

--
Kees Cook
Chrome OS Security

2015-11-23 09:38:18

by Ingo Molnar

Subject: Re: [RFC PATCH] restrict /dev/mem to idle io memory ranges


* Dan Williams <[email protected]> wrote:

> On Fri, Nov 20, 2015 at 12:12 PM, Russell King - ARM Linux
> <[email protected]> wrote:
> > On Fri, Nov 20, 2015 at 09:31:33AM -0800, Dan Williams wrote:
> >> This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
> >> semantics by default. If userspace really believes it is safe to access
> >> the memory region it can also perform the extra step of disabling an
> >> active driver. This protects device address ranges with read side
> >> effects and otherwise directs userspace to use the driver.
> >
> > I'm happy with this as long as we retain the option to disable this
> > new behaviour.
> >
> > The reason being, when developing a driver, it is _very_ useful to
> > be able to poke around in the device's (and system memory) address
> > spaces with tools like devmem2 to work out what's going on when
> > things go wrong.
> >
> > To put it another way, I think it's a good idea to disable access to
> > these regions on production systems, but for driver development, we
> > want to retain the ability to poke around in physical address space
> > in any way we so desire.
> >
>
> Sounds ok to me, but I do think it's a good idea to default it to the
> same value as STRICT_DEVMEM. Perhaps:
>
> bool "Filter I/O access to /dev/mem" if EXPERT
> default STRICT_DEVMEM

Agreed, STRICT_DEVMEM=y should grandfather in this new (and very sensible)
restriction.

Thanks,

Ingo