Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933089AbbFIR1q (ORCPT ); Tue, 9 Jun 2015 13:27:46 -0400 Received: from mail-vn0-f48.google.com ([209.85.216.48]:35242 "EHLO mail-vn0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751279AbbFIR1g (ORCPT ); Tue, 9 Jun 2015 13:27:36 -0400 MIME-Version: 1.0 In-Reply-To: <1433821173-2804704-1-git-send-email-calvinowens@fb.com> References: <1432005006-3428-1-git-send-email-calvinowens@fb.com> <1433821173-2804704-1-git-send-email-calvinowens@fb.com> Date: Tue, 9 Jun 2015 10:27:35 -0700 X-Google-Sender-Auth: 1POgqbeuh7Gxj6ZqBI26E501R6M Message-ID: Subject: Re: [PATCH v6] procfs: Always expose /proc//map_files/ and make it readable From: Kees Cook To: Calvin Owens Cc: Andrew Morton , Alexey Dobriyan , "Eric W. Biederman" , Al Viro , Miklos Szeredi , Zefan Li , Oleg Nesterov , Joe Perches , David Howells , LKML , kernel-team@fb.com, Andy Lutomirski , Cyrill Gorcunov , "Kirill A. Shutemov" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7522 Lines: 193 On Mon, Jun 8, 2015 at 8:39 PM, Calvin Owens wrote: > Currently, /proc//map_files/ is restricted to CAP_SYS_ADMIN, and > is only exposed if CONFIG_CHECKPOINT_RESTORE is set. > > This interface very useful because it allows userspace to stat() > deleted files that are still mapped by some process, which enables a > much quicker and more accurate answer to the question "How much disk > space is being consumed by files that are deleted but still mapped?" > than is currently possible. > > This patch moves map_files/ out from behind CONFIG_CHECKPOINT_RESTORE, > and adjusts the permissions enforced on it as follows: > > * proc_map_files_lookup() > * proc_map_files_readdir() > * map_files_d_revalidate() > > Remove the CAP_SYS_ADMIN restriction, leaving only the current > restriction requiring PTRACE_MODE_READ. > > In earlier versions of this patch, I changed the ptrace checks > in the functions above to enforce MODE_ATTACH instead of > MODE_READ. That was an oversight: all the information exposed > by the above three functions is already available with > MODE_READ from /proc/PID/maps. I was only being asked to > strengthen the protection around functionality provided by > follow_link(), not the above. > > So, I've left the checks for MODE_READ as-is, since AFAICS all > objections raised so far are addressed by the new CAP_SYS_ADMIN > check in follow_link(), explained below. > > * proc_map_files_follow_link() > > This stub has been added, and requires that the user have > CAP_SYS_ADMIN in order to follow the links in map_files/, > since there was concern on LKML both about the potential for > bypassing permissions on ancestor directories in the path to > files pointed to, and about what happens with more exotic > memory mappings created by some drivers (ie dma-buf). > > Cc: Andrew Morton > Cc: Andy Lutomirski > Cc: Cyrill Gorcunov > Cc: Joe Perches > Cc: Kees Cook > Cc: Kirill A. Shutemov > Signed-off-by: Calvin Owens > --- > Changes in v6: Require CAP_SYS_ADMIN for follow_link(). Leave other > PTRACE_MODE_READ checks as-is, since CAP_SYS_ADMIN > alone addresses all concerns raised AFAICS. > > Changes in v5: s/dentry->d_inode/d_inode(dentry)/g > > Changes in v4: Return -ESRCH from follow_link() when get_proc_task() > returns NULL. > > Changes in v3: Changed permission checks to use PTRACE_MODE_ATTACH > instead of PTRACE_MODE_READ, and added a stub to > enforce MODE_ATTACH on follow_link() as well. > > Changes in v2: Removed the follow_link() stub that returned -EPERM if > the caller didn't have CAP_SYS_ADMIN, since the caller > in my chroot() scenario gets -EACCES anyway. > > fs/proc/base.c | 42 +++++++++++++++++++++++------------------- > 1 file changed, 23 insertions(+), 19 deletions(-) > > diff --git a/fs/proc/base.c b/fs/proc/base.c > index 093ca14..0270191 100644 > --- a/fs/proc/base.c > +++ b/fs/proc/base.c > @@ -1641,8 +1641,6 @@ end_instantiate: > return dir_emit(ctx, name, len, 1, DT_UNKNOWN); > } > > -#ifdef CONFIG_CHECKPOINT_RESTORE > - > /* > * dname_to_vma_addr - maps a dentry name into two unsigned longs > * which represent vma start and end addresses. > @@ -1669,11 +1667,6 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags) > if (flags & LOOKUP_RCU) > return -ECHILD; > > - if (!capable(CAP_SYS_ADMIN)) { > - status = -EPERM; > - goto out_notask; > - } > - > inode = d_inode(dentry); > task = get_proc_task(inode); > if (!task) > @@ -1762,6 +1755,28 @@ struct map_files_info { > unsigned char name[4*sizeof(long)+2]; /* max: %lx-%lx\0 */ > }; > > +/* > + * Only allow CAP_SYS_ADMIN to follow the links, due to concerns about how the > + * symlinks may be used to bypass permissions on ancestor directories in the > + * path to the file in question. > + */ Cool, I think this looks good. Thanks! Reviewed-by: Kees Cook -Kees > +static void *proc_map_files_follow_link(struct dentry *dentry, struct nameidata *nd) > +{ > + if (!capable(CAP_SYS_ADMIN)) > + return ERR_PTR(-EPERM); > + > + return proc_pid_follow_link(dentry, nd); > +} > + > +/* > + * Identical to proc_pid_link_inode_operations except for follow_link() > + */ > +static const struct inode_operations proc_map_files_link_inode_operations = { > + .readlink = proc_pid_readlink, > + .follow_link = proc_map_files_follow_link, > + .setattr = proc_setattr, > +}; > + > static int > proc_map_files_instantiate(struct inode *dir, struct dentry *dentry, > struct task_struct *task, const void *ptr) > @@ -1777,7 +1792,7 @@ proc_map_files_instantiate(struct inode *dir, struct dentry *dentry, > ei = PROC_I(inode); > ei->op.proc_get_link = proc_map_files_get_link; > > - inode->i_op = &proc_pid_link_inode_operations; > + inode->i_op = &proc_map_files_link_inode_operations; > inode->i_size = 64; > inode->i_mode = S_IFLNK; > > @@ -1801,10 +1816,6 @@ static struct dentry *proc_map_files_lookup(struct inode *dir, > int result; > struct mm_struct *mm; > > - result = -EPERM; > - if (!capable(CAP_SYS_ADMIN)) > - goto out; > - > result = -ENOENT; > task = get_proc_task(dir); > if (!task) > @@ -1858,10 +1869,6 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx) > struct map_files_info *p; > int ret; > > - ret = -EPERM; > - if (!capable(CAP_SYS_ADMIN)) > - goto out; > - > ret = -ENOENT; > task = get_proc_task(file_inode(file)); > if (!task) > @@ -2050,7 +2057,6 @@ static const struct file_operations proc_timers_operations = { > .llseek = seq_lseek, > .release = seq_release_private, > }; > -#endif /* CONFIG_CHECKPOINT_RESTORE */ > > static int proc_pident_instantiate(struct inode *dir, > struct dentry *dentry, struct task_struct *task, const void *ptr) > @@ -2549,9 +2555,7 @@ static const struct inode_operations proc_task_inode_operations; > static const struct pid_entry tgid_base_stuff[] = { > DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), > DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), > -#ifdef CONFIG_CHECKPOINT_RESTORE > DIR("map_files", S_IRUSR|S_IXUSR, proc_map_files_inode_operations, proc_map_files_operations), > -#endif > DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations), > DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations), > #ifdef CONFIG_NET > -- > 1.8.1 > -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/