2018-05-11 01:14:33

by Hamish Martin

[permalink] [raw]
Subject: [PATCH 0/2] Prevent kernel oops on UIO device remove with open fds

If a UIO device is removed while a user space app has an open file
descriptor to that device's /dev/uio* file, a kernel oops can occur when
the file descriptor is ultimately closed. The oops is triggered by
dereferencing either the uio_listener struct's 'dev' pointer, or at the
next level, when dereferencing a stale 'info' pointer on that dev.

To resolve this we now increment the reference count for the uio_device
and prevent the destruction of the uio_device until all open file
descriptors are closed.
A further consequence of resolving the stale pointers to the uio_device
is that it exposes an issue with the independent life cycles of the
uio_device and its related uio_info. The uio_info struct is allocated by
the user of the uio subsystem and connected to a uio_device at
registration time (see __uio_register_device()). When the device
corresponding to that uio_info is unregistered and the memory for the
uio_info is freed, the uio_device is left with a stale pointer which may
still be used in the file system ops (uio_poll(), uio_read(), etc)
To resolve this second issue, we now lock alteration or access of the
'info' member of the uio_device struct and alter the accessing code to
handle that pointer being null.

This patch series contains two patches. The first is a cosmetic change
to the return paths from uio_write to facilitate easier review of the
second patch. The second patch contains the change to prevent destruction
of the uio_device while file descriptors to it remain open and the
additional locking to prevent the use of a stale 'info' pointer.

This patch series is heavily based on the work done by Brian Russell
(formerly of Brocade) in May 2015. His last version of an attempt to fix
this is seen here:
https://patchwork.kernel.org/patch/6057431/
My new addition is the locking around use of the info pointer. It is my
proposed solution to Brian's last comment about what he had left
unfinished:
"It needs a bit more work. uio_info needs to live as long as the
corresponding uio_device. Since they seem to always be 1:1,
uio_info could embedded within uio_device (but then all the users
of uio need changed) or uio_info could be a refcounted object."

For further info here is an example of the kernel oops this patch series
is trying to resolve. Output is from a 4.17.0-rc1 kernel with a test app
opening a uio device and doing select with the fd, then removing the UIO
device while the select is still happening:

Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d63
Faulting instruction address: 0xc000000000605c98
Oops: Kernel access of bad area, sig: 11 [#1]
BE PREEMPT SMP NR_CPUS=8 CoreNet Generic
Modules linked in:
CPU: 6 PID: 282 Comm: uio_tester Not tainted 4.17.0-rc1-at1+ #8
NIP: c000000000605c98 LR: c000000000211d8c CTR: c000000000605c60
REGS: c0000000f02f3480 TRAP: 0300 Not tainted (4.17.0-rc1-at1+)
MSR: 000000008202b000 <VEC,CE,EE,FP,ME> CR: 24284448 XER: 20000000
DEAR: 6b6b6b6b6b6b6d63 ESR: 0000000000000000 SOFTE: 0
GPR00: c000000000211d8c c0000000f02f3700 c000000000dc7d00 c0000000ef365bc0
GPR04: c0000000f02f3800 0000000000000000 c0000000ef4b9b58 0000000000000006
GPR08: c000000000605c60 6b6b6b6b6b6b6b6b 0000000000000000 0000000000000016
GPR12: 0000000042284448 c00000003fffd440 0000000000000004 0000000000000003
GPR16: 0000000000000008 0000000000000000 00000000ef365bc0 0000000000000000
GPR20: c0000000f02f3c00 0000000000000000 0000000000000000 c0000000ef365bc0
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR28: c0000000f02f3800 c0000000ef365bc0 c0000000f02c2c68 c0000000efd01d20
NIP [c000000000605c98] .uio_poll+0x38/0xe0
LR [c000000000211d8c] .do_select+0x39c/0x7a0
Call Trace:
[c0000000f02f3700] [c0000000f02f3790] 0xc0000000f02f3790 (unreliable)
[c0000000f02f3790] [c000000000211d8c] .do_select+0x39c/0x7a0
[c0000000f02f3b90] [c000000000212eac] .core_sys_select+0x22c/0x320
[c0000000f02f3d70] [c000000000213098] .__se_sys_select+0xf8/0x170
[c0000000f02f3e30] [c000000000000674] system_call+0x58/0x64
Instruction dump:
f8010010 fba1ffe8 fbc1fff0 fbe1fff8 f821ff71 7c7d1b78 7c9c2378 60000000
60000000 ebdd00c0 ebfe0000 e93f0038 <e92901f8> 2fa90000 40de0030 3c60ffff
---[ end trace 8badf75b83f45856 ]---

Hamish Martin (2):
uio: Reduce return paths from uio_write()
uio: Prevent device destruction while fds are open

drivers/uio/uio.c | 121 ++++++++++++++++++++++++++++++++-------------
include/linux/uio_driver.h | 3 +-
2 files changed, 90 insertions(+), 34 deletions(-)

--
2.16.2



2018-05-11 01:14:01

by Hamish Martin

[permalink] [raw]
Subject: [PATCH 2/2] uio: Prevent device destruction while fds are open

Prevent destruction of a uio_device while user space apps hold open
file descriptors to that device. Further, access to the 'info' member
of the struct uio_device is protected by spinlock. This is to ensure
stale pointers to data not under control of the UIO subsystem are not
dereferenced.

Signed-off-by: Hamish Martin <[email protected]>
Reviewed-by: Chris Packham <[email protected]>
---
drivers/uio/uio.c | 98 ++++++++++++++++++++++++++++++++++------------
include/linux/uio_driver.h | 3 +-
2 files changed, 74 insertions(+), 27 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index dd44df17004d..5473e77c85be 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -270,7 +270,7 @@ static int uio_dev_add_attributes(struct uio_device *idev)
if (!map_found) {
map_found = 1;
idev->map_dir = kobject_create_and_add("maps",
- &idev->dev->kobj);
+ &idev->dev.kobj);
if (!idev->map_dir) {
ret = -ENOMEM;
goto err_map;
@@ -299,7 +299,7 @@ static int uio_dev_add_attributes(struct uio_device *idev)
if (!portio_found) {
portio_found = 1;
idev->portio_dir = kobject_create_and_add("portio",
- &idev->dev->kobj);
+ &idev->dev.kobj);
if (!idev->portio_dir) {
ret = -ENOMEM;
goto err_portio;
@@ -342,7 +342,7 @@ static int uio_dev_add_attributes(struct uio_device *idev)
kobject_put(&map->kobj);
}
kobject_put(idev->map_dir);
- dev_err(idev->dev, "error creating sysfs files (%d)\n", ret);
+ dev_err(&idev->dev, "error creating sysfs files (%d)\n", ret);
return ret;
}

@@ -379,7 +379,7 @@ static int uio_get_minor(struct uio_device *idev)
idev->minor = retval;
retval = 0;
} else if (retval == -ENOSPC) {
- dev_err(idev->dev, "too many uio devices\n");
+ dev_err(&idev->dev, "too many uio devices\n");
retval = -EINVAL;
}
mutex_unlock(&minor_lock);
@@ -433,6 +433,7 @@ static int uio_open(struct inode *inode, struct file *filep)
struct uio_device *idev;
struct uio_listener *listener;
int ret = 0;
+ unsigned long flags;

mutex_lock(&minor_lock);
idev = idr_find(&uio_idr, iminor(inode));
@@ -442,9 +443,11 @@ static int uio_open(struct inode *inode, struct file *filep)
goto out;
}

+ get_device(&idev->dev);
+
if (!try_module_get(idev->owner)) {
ret = -ENODEV;
- goto out;
+ goto err_module_get;
}

listener = kmalloc(sizeof(*listener), GFP_KERNEL);
@@ -457,11 +460,13 @@ static int uio_open(struct inode *inode, struct file *filep)
listener->event_count = atomic_read(&idev->event);
filep->private_data = listener;

- if (idev->info->open) {
+ spin_lock_irqsave(&idev->info_lock, flags);
+ if (idev->info && idev->info->open)
ret = idev->info->open(idev->info, inode);
- if (ret)
- goto err_infoopen;
- }
+ spin_unlock_irqrestore(&idev->info_lock, flags);
+ if (ret)
+ goto err_infoopen;
+
return 0;

err_infoopen:
@@ -470,6 +475,9 @@ static int uio_open(struct inode *inode, struct file *filep)
err_alloc_listener:
module_put(idev->owner);

+err_module_get:
+ put_device(&idev->dev);
+
out:
return ret;
}
@@ -487,12 +495,16 @@ static int uio_release(struct inode *inode, struct file *filep)
int ret = 0;
struct uio_listener *listener = filep->private_data;
struct uio_device *idev = listener->dev;
+ unsigned long flags;

- if (idev->info->release)
+ spin_lock_irqsave(&idev->info_lock, flags);
+ if (idev->info && idev->info->release)
ret = idev->info->release(idev->info, inode);
+ spin_unlock_irqrestore(&idev->info_lock, flags);

module_put(idev->owner);
kfree(listener);
+ put_device(&idev->dev);
return ret;
}

@@ -500,9 +512,16 @@ static __poll_t uio_poll(struct file *filep, poll_table *wait)
{
struct uio_listener *listener = filep->private_data;
struct uio_device *idev = listener->dev;
+ __poll_t ret = 0;
+ unsigned long flags;

- if (!idev->info->irq)
- return -EIO;
+ spin_lock_irqsave(&idev->info_lock, flags);
+ if (!idev->info || !idev->info->irq)
+ ret = -EIO;
+ spin_unlock_irqrestore(&idev->info_lock, flags);
+
+ if (ret)
+ return ret;

poll_wait(filep, &idev->wait, wait);
if (listener->event_count != atomic_read(&idev->event))
@@ -516,11 +535,17 @@ static ssize_t uio_read(struct file *filep, char __user *buf,
struct uio_listener *listener = filep->private_data;
struct uio_device *idev = listener->dev;
DECLARE_WAITQUEUE(wait, current);
- ssize_t retval;
+ ssize_t retval = 0;
s32 event_count;
+ unsigned long flags;

- if (!idev->info->irq)
- return -EIO;
+ spin_lock_irqsave(&idev->info_lock, flags);
+ if (!idev->info || !idev->info->irq)
+ retval = -EIO;
+ spin_unlock_irqrestore(&idev->info_lock, flags);
+
+ if (retval)
+ return retval;

if (count != sizeof(s32))
return -EINVAL;
@@ -567,8 +592,10 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
struct uio_device *idev = listener->dev;
ssize_t retval;
s32 irq_on;
+ unsigned long flags;

- if (!idev->info->irq) {
+ spin_lock_irqsave(&idev->info_lock, flags);
+ if (!idev->info || !idev->info->irq) {
retval = -EIO;
goto out;
}
@@ -591,6 +618,7 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
retval = idev->info->irqcontrol(idev->info, irq_on);

out:
+ spin_unlock_irqrestore(&idev->info_lock, flags);
return retval ? retval : sizeof(s32);
}

@@ -803,6 +831,13 @@ static void release_uio_class(void)
uio_major_cleanup();
}

+static void uio_device_release(struct device *dev)
+{
+ struct uio_device *idev = dev_get_drvdata(dev);
+
+ kfree(idev);
+}
+
/**
* uio_register_device - register a new userspace IO device
* @owner: module that creates the new device
@@ -823,13 +858,14 @@ int __uio_register_device(struct module *owner,

info->uio_dev = NULL;

- idev = devm_kzalloc(parent, sizeof(*idev), GFP_KERNEL);
+ idev = kzalloc(sizeof(*idev), GFP_KERNEL);
if (!idev) {
return -ENOMEM;
}

idev->owner = owner;
idev->info = info;
+ spin_lock_init(&idev->info_lock);
init_waitqueue_head(&idev->wait);
atomic_set(&idev->event, 0);

@@ -837,14 +873,19 @@ int __uio_register_device(struct module *owner,
if (ret)
return ret;

- idev->dev = device_create(&uio_class, parent,
- MKDEV(uio_major, idev->minor), idev,
- "uio%d", idev->minor);
- if (IS_ERR(idev->dev)) {
- printk(KERN_ERR "UIO: device register failed\n");
- ret = PTR_ERR(idev->dev);
+ idev->dev.devt = MKDEV(uio_major, idev->minor);
+ idev->dev.class = &uio_class;
+ idev->dev.parent = parent;
+ idev->dev.release = uio_device_release;
+ dev_set_drvdata(&idev->dev, idev);
+
+ ret = dev_set_name(&idev->dev, "uio%d", idev->minor);
+ if (ret)
+ goto err_device_create;
+
+ ret = device_register(&idev->dev);
+ if (ret)
goto err_device_create;
- }

ret = uio_dev_add_attributes(idev);
if (ret)
@@ -872,7 +913,7 @@ int __uio_register_device(struct module *owner,
err_request_irq:
uio_dev_del_attributes(idev);
err_uio_dev_add_attributes:
- device_destroy(&uio_class, MKDEV(uio_major, idev->minor));
+ device_unregister(&idev->dev);
err_device_create:
uio_free_minor(idev);
return ret;
@@ -887,6 +928,7 @@ EXPORT_SYMBOL_GPL(__uio_register_device);
void uio_unregister_device(struct uio_info *info)
{
struct uio_device *idev;
+ unsigned long flags;

if (!info || !info->uio_dev)
return;
@@ -900,7 +942,11 @@ void uio_unregister_device(struct uio_info *info)
if (info->irq && info->irq != UIO_IRQ_CUSTOM)
free_irq(info->irq, idev);

- device_destroy(&uio_class, MKDEV(uio_major, idev->minor));
+ spin_lock_irqsave(&idev->info_lock, flags);
+ idev->info = NULL;
+ spin_unlock_irqrestore(&idev->info_lock, flags);
+
+ device_unregister(&idev->dev);

return;
}
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 3c85c81b0027..42b3d7cd7413 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -68,12 +68,13 @@ struct uio_port {

struct uio_device {
struct module *owner;
- struct device *dev;
+ struct device dev;
int minor;
atomic_t event;
struct fasync_struct *async_queue;
wait_queue_head_t wait;
struct uio_info *info;
+ spinlock_t info_lock;
struct kobject *map_dir;
struct kobject *portio_dir;
};
--
2.16.2


2018-05-11 01:15:12

by Hamish Martin

[permalink] [raw]
Subject: [PATCH 1/2] uio: Reduce return paths from uio_write()

Drive all return paths for uio_write() through a single block at the
end of the function.

Signed-off-by: Hamish Martin <[email protected]>
Reviewed-by: Chris Packham <[email protected]>
---
drivers/uio/uio.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index fd4848392e0d..dd44df17004d 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -568,20 +568,29 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
ssize_t retval;
s32 irq_on;

- if (!idev->info->irq)
- return -EIO;
+ if (!idev->info->irq) {
+ retval = -EIO;
+ goto out;
+ }

- if (count != sizeof(s32))
- return -EINVAL;
+ if (count != sizeof(s32)) {
+ retval = -EINVAL;
+ goto out;
+ }

- if (!idev->info->irqcontrol)
- return -ENOSYS;
+ if (!idev->info->irqcontrol) {
+ retval = -ENOSYS;
+ goto out;
+ }

- if (copy_from_user(&irq_on, buf, count))
- return -EFAULT;
+ if (copy_from_user(&irq_on, buf, count)) {
+ retval = -EFAULT;
+ goto out;
+ }

retval = idev->info->irqcontrol(idev->info, irq_on);

+out:
return retval ? retval : sizeof(s32);
}

--
2.16.2