Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932695AbcCVNL5 (ORCPT ); Tue, 22 Mar 2016 09:11:57 -0400 Received: from mail-wm0-f67.google.com ([74.125.82.67]:36305 "EHLO mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932383AbcCVNLj (ORCPT ); Tue, 22 Mar 2016 09:11:39 -0400 From: Nicolai Stange To: Greg Kroah-Hartman Cc: Rasmus Villemoes , "Paul E. McKenney" , Alexander Viro , Jonathan Corbet , Jan Kara , Andrew Morton , Julia Lawall , Gilles Muller , Nicolas Palix , Michal Marek , linux-kernel@vger.kernel.org, cocci@systeme.lip6.fr, Nicolai Stange Subject: [PATCH v6 1/8] debugfs: prevent access to possibly dead file_operations at file open Date: Tue, 22 Mar 2016 14:11:13 +0100 Message-Id: <1458652280-19785-2-git-send-email-nicstange@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1458652280-19785-1-git-send-email-nicstange@gmail.com> References: <1458652280-19785-1-git-send-email-nicstange@gmail.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9312 Lines: 278 Nothing prevents a dentry found by path lookup before a return of __debugfs_remove() to actually get opened after that return. Now, after the return of __debugfs_remove(), there are no guarantees whatsoever regarding the memory the corresponding inode's file_operations object had been kept in. Since __debugfs_remove() is seldomly invoked, usually from module exit handlers only, the race is hard to trigger and the impact is very low. A discussion of the problem outlined above as well as a suggested solution can be found in the (sub-)thread rooted at http://lkml.kernel.org/g/20130401203445.GA20862@ZenIV.linux.org.uk ("Yet another pipe related oops.") Basically, Greg KH suggests to introduce an intermediate fops and Al Viro points out that a pointer to the original ones may be stored in ->d_fsdata. Follow this line of reasoning: - Add SRCU as a reverse dependency of DEBUG_FS. - Introduce a srcu_struct object for the debugfs subsystem. - In debugfs_create_file(), store a pointer to the original file_operations object in ->d_fsdata. - Make debugfs_remove() and debugfs_remove_recursive() wait for a SRCU grace period after the dentry has been delete()'d and before they return to their callers. - Introduce an intermediate file_operations object named "debugfs_open_proxy_file_operations". It's ->open() functions checks, under the protection of a SRCU read lock, whether the dentry is still alive, i.e. has not been d_delete()'d and if so, tries to acquire a reference on the owning module. On success, it sets the file object's ->f_op to the original file_operations and forwards the ongoing open() call to the original ->open(). - For clarity, rename the former debugfs_file_operations to debugfs_noop_file_operations -- they are in no way canonical. The choice of SRCU over "normal" RCU is justified by the fact, that the former may also be used to protect ->i_private data from going away during the execution of a file's readers and writers which may (and do) sleep. Finally, introduce the fs/debugfs/internal.h header containing some declarations internal to the debugfs implementation. Signed-off-by: Nicolai Stange --- fs/debugfs/file.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++- fs/debugfs/inode.c | 13 ++++++- fs/debugfs/internal.h | 24 +++++++++++++ include/linux/debugfs.h | 3 -- lib/Kconfig.debug | 1 + 5 files changed, 127 insertions(+), 5 deletions(-) create mode 100644 fs/debugfs/internal.h diff --git a/fs/debugfs/file.c b/fs/debugfs/file.c index d2ba12e..736ab3c 100644 --- a/fs/debugfs/file.c +++ b/fs/debugfs/file.c @@ -22,6 +22,9 @@ #include #include #include +#include + +#include "internal.h" static ssize_t default_read_file(struct file *file, char __user *buf, size_t count, loff_t *ppos) @@ -35,13 +38,99 @@ static ssize_t default_write_file(struct file *file, const char __user *buf, return count; } -const struct file_operations debugfs_file_operations = { +const struct file_operations debugfs_noop_file_operations = { .read = default_read_file, .write = default_write_file, .open = simple_open, .llseek = noop_llseek, }; +/** + * debugfs_use_file_start - mark the beginning of file data access + * @dentry: the dentry object whose data is being accessed. + * @srcu_idx: a pointer to some memory to store a SRCU index in. + * + * Up to a matching call to debugfs_use_file_finish(), any + * successive call into the file removing functions debugfs_remove() + * and debugfs_remove_recursive() will block. Since associated private + * file data may only get freed after a successful return of any of + * the removal functions, you may safely access it after a successful + * call to debugfs_use_file_start() without worrying about + * lifetime issues. + * + * If -%EIO is returned, the file has already been removed and thus, + * it is not safe to access any of its data. If, on the other hand, + * it is allowed to access the file data, zero is returned. + * + * Regardless of the return code, any call to + * debugfs_use_file_start() must be followed by a matching call + * to debugfs_use_file_finish(). + */ +static int debugfs_use_file_start(const struct dentry *dentry, int *srcu_idx) + __acquires(&debugfs_srcu) +{ + *srcu_idx = srcu_read_lock(&debugfs_srcu); + barrier(); + if (d_unlinked(dentry)) + return -EIO; + return 0; +} + +/** + * debugfs_use_file_finish - mark the end of file data access + * @srcu_idx: the SRCU index "created" by a former call to + * debugfs_use_file_start(). + * + * Allow any ongoing concurrent call into debugfs_remove() or + * debugfs_remove_recursive() blocked by a former call to + * debugfs_use_file_start() to proceed and return to its caller. + */ +static void debugfs_use_file_finish(int srcu_idx) __releases(&debugfs_srcu) +{ + srcu_read_unlock(&debugfs_srcu, srcu_idx); +} + +#define F_DENTRY(filp) ((filp)->f_path.dentry) + +#define REAL_FOPS_DEREF(dentry) \ + ((const struct file_operations *)(dentry)->d_fsdata) + +static int open_proxy_open(struct inode *inode, struct file *filp) +{ + const struct dentry *dentry = F_DENTRY(filp); + const struct file_operations *real_fops = NULL; + int srcu_idx, r; + + r = debugfs_use_file_start(dentry, &srcu_idx); + if (r) { + r = -ENOENT; + goto out; + } + + real_fops = REAL_FOPS_DEREF(dentry); + real_fops = fops_get(real_fops); + if (!real_fops) { + /* Huh? Module did not clean up after itself at exit? */ + WARN(1, "debugfs file owner did not clean up at exit: %pd", + dentry); + r = -ENXIO; + goto out; + } + replace_fops(filp, real_fops); + + if (real_fops->open) + r = real_fops->open(inode, filp); + +out: + fops_put(real_fops); + debugfs_use_file_finish(srcu_idx); + return r; +} + +const struct file_operations debugfs_open_proxy_file_operations = { + .open = open_proxy_open, +}; + static struct dentry *debugfs_create_mode(const char *name, umode_t mode, struct dentry *parent, void *value, const struct file_operations *fops, diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c index bece948..26880ad 100644 --- a/fs/debugfs/inode.c +++ b/fs/debugfs/inode.c @@ -27,9 +27,14 @@ #include #include #include +#include + +#include "internal.h" #define DEBUGFS_DEFAULT_MODE 0700 +DEFINE_SRCU(debugfs_srcu); + static struct vfsmount *debugfs_mount; static int debugfs_mount_count; static bool debugfs_registered; @@ -340,8 +345,12 @@ struct dentry *debugfs_create_file(const char *name, umode_t mode, return failed_creating(dentry); inode->i_mode = mode; - inode->i_fop = fops ? fops : &debugfs_file_operations; inode->i_private = data; + + inode->i_fop = fops ? &debugfs_open_proxy_file_operations + : &debugfs_noop_file_operations; + dentry->d_fsdata = (void *)fops; + d_instantiate(dentry, inode); fsnotify_create(d_inode(dentry->d_parent), dentry); return end_creating(dentry); @@ -565,6 +574,7 @@ void debugfs_remove(struct dentry *dentry) inode_unlock(d_inode(parent)); if (!ret) simple_release_fs(&debugfs_mount, &debugfs_mount_count); + synchronize_srcu(&debugfs_srcu); } EXPORT_SYMBOL_GPL(debugfs_remove); @@ -642,6 +652,7 @@ void debugfs_remove_recursive(struct dentry *dentry) if (!__debugfs_remove(child, parent)) simple_release_fs(&debugfs_mount, &debugfs_mount_count); inode_unlock(d_inode(parent)); + synchronize_srcu(&debugfs_srcu); } EXPORT_SYMBOL_GPL(debugfs_remove_recursive); diff --git a/fs/debugfs/internal.h b/fs/debugfs/internal.h new file mode 100644 index 0000000..c7aaa5c --- /dev/null +++ b/fs/debugfs/internal.h @@ -0,0 +1,24 @@ +/* + * internal.h - declarations internal to debugfs + * + * Copyright (C) 2016 Nicolai Stange + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License version + * 2 as published by the Free Software Foundation. + * + */ + +#ifndef _DEBUGFS_INTERNAL_H_ +#define _DEBUGFS_INTERNAL_H_ + +struct file_operations; +struct srcu_struct; + +/* declared over in file.c */ +extern const struct file_operations debugfs_noop_file_operations; +extern const struct file_operations debugfs_open_proxy_file_operations; + +extern struct srcu_struct debugfs_srcu; + +#endif /* _DEBUGFS_INTERNAL_H_ */ diff --git a/include/linux/debugfs.h b/include/linux/debugfs.h index 981e53a..fcafe2d 100644 --- a/include/linux/debugfs.h +++ b/include/linux/debugfs.h @@ -43,9 +43,6 @@ extern struct dentry *arch_debugfs_dir; #if defined(CONFIG_DEBUG_FS) -/* declared over in file.c */ -extern const struct file_operations debugfs_file_operations; - struct dentry *debugfs_create_file(const char *name, umode_t mode, struct dentry *parent, void *data, const struct file_operations *fops); diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 5a60f45..0d5177f 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -257,6 +257,7 @@ config PAGE_OWNER config DEBUG_FS bool "Debug Filesystem" + select SRCU help debugfs is a virtual file system that kernel developers use to put debugging files into. Enable this option to be able to read and -- 2.7.4