Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932550AbbFISJq (ORCPT ); Tue, 9 Jun 2015 14:09:46 -0400 Received: from mail-lb0-f182.google.com ([209.85.217.182]:36672 "EHLO mail-lb0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752971AbbFISJh (ORCPT ); Tue, 9 Jun 2015 14:09:37 -0400 MIME-Version: 1.0 In-Reply-To: References: <1432005006-3428-1-git-send-email-calvinowens@fb.com> <1433821173-2804704-1-git-send-email-calvinowens@fb.com> From: Andy Lutomirski Date: Tue, 9 Jun 2015 10:47:31 -0700 Message-ID: Subject: Re: [PATCH v6] procfs: Always expose /proc//map_files/ and make it readable To: Kees Cook Cc: Calvin Owens , Andrew Morton , Alexey Dobriyan , "Eric W. Biederman" , Al Viro , Miklos Szeredi , Zefan Li , Oleg Nesterov , Joe Perches , David Howells , LKML , kernel-team@fb.com, Cyrill Gorcunov , "Kirill A. Shutemov" Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7874 Lines: 205 On Tue, Jun 9, 2015 at 10:27 AM, Kees Cook wrote: > On Mon, Jun 8, 2015 at 8:39 PM, Calvin Owens wrote: >> Currently, /proc//map_files/ is restricted to CAP_SYS_ADMIN, and >> is only exposed if CONFIG_CHECKPOINT_RESTORE is set. >> >> This interface very useful because it allows userspace to stat() >> deleted files that are still mapped by some process, which enables a >> much quicker and more accurate answer to the question "How much disk >> space is being consumed by files that are deleted but still mapped?" >> than is currently possible. >> >> This patch moves map_files/ out from behind CONFIG_CHECKPOINT_RESTORE, >> and adjusts the permissions enforced on it as follows: >> >> * proc_map_files_lookup() >> * proc_map_files_readdir() >> * map_files_d_revalidate() >> >> Remove the CAP_SYS_ADMIN restriction, leaving only the current >> restriction requiring PTRACE_MODE_READ. >> >> In earlier versions of this patch, I changed the ptrace checks >> in the functions above to enforce MODE_ATTACH instead of >> MODE_READ. That was an oversight: all the information exposed >> by the above three functions is already available with >> MODE_READ from /proc/PID/maps. I was only being asked to >> strengthen the protection around functionality provided by >> follow_link(), not the above. >> >> So, I've left the checks for MODE_READ as-is, since AFAICS all >> objections raised so far are addressed by the new CAP_SYS_ADMIN >> check in follow_link(), explained below. >> >> * proc_map_files_follow_link() >> >> This stub has been added, and requires that the user have >> CAP_SYS_ADMIN in order to follow the links in map_files/, >> since there was concern on LKML both about the potential for >> bypassing permissions on ancestor directories in the path to >> files pointed to, and about what happens with more exotic >> memory mappings created by some drivers (ie dma-buf). >> >> Cc: Andrew Morton >> Cc: Andy Lutomirski >> Cc: Cyrill Gorcunov >> Cc: Joe Perches >> Cc: Kees Cook >> Cc: Kirill A. Shutemov >> Signed-off-by: Calvin Owens >> --- >> Changes in v6: Require CAP_SYS_ADMIN for follow_link(). Leave other >> PTRACE_MODE_READ checks as-is, since CAP_SYS_ADMIN >> alone addresses all concerns raised AFAICS. >> >> Changes in v5: s/dentry->d_inode/d_inode(dentry)/g >> >> Changes in v4: Return -ESRCH from follow_link() when get_proc_task() >> returns NULL. >> >> Changes in v3: Changed permission checks to use PTRACE_MODE_ATTACH >> instead of PTRACE_MODE_READ, and added a stub to >> enforce MODE_ATTACH on follow_link() as well. >> >> Changes in v2: Removed the follow_link() stub that returned -EPERM if >> the caller didn't have CAP_SYS_ADMIN, since the caller >> in my chroot() scenario gets -EACCES anyway. >> >> fs/proc/base.c | 42 +++++++++++++++++++++++------------------- >> 1 file changed, 23 insertions(+), 19 deletions(-) >> >> diff --git a/fs/proc/base.c b/fs/proc/base.c >> index 093ca14..0270191 100644 >> --- a/fs/proc/base.c >> +++ b/fs/proc/base.c >> @@ -1641,8 +1641,6 @@ end_instantiate: >> return dir_emit(ctx, name, len, 1, DT_UNKNOWN); >> } >> >> -#ifdef CONFIG_CHECKPOINT_RESTORE >> - >> /* >> * dname_to_vma_addr - maps a dentry name into two unsigned longs >> * which represent vma start and end addresses. >> @@ -1669,11 +1667,6 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags) >> if (flags & LOOKUP_RCU) >> return -ECHILD; >> >> - if (!capable(CAP_SYS_ADMIN)) { >> - status = -EPERM; >> - goto out_notask; >> - } >> - >> inode = d_inode(dentry); >> task = get_proc_task(inode); >> if (!task) >> @@ -1762,6 +1755,28 @@ struct map_files_info { >> unsigned char name[4*sizeof(long)+2]; /* max: %lx-%lx\0 */ >> }; >> >> +/* >> + * Only allow CAP_SYS_ADMIN to follow the links, due to concerns about how the >> + * symlinks may be used to bypass permissions on ancestor directories in the >> + * path to the file in question. >> + */ > > Cool, I think this looks good. Thanks! > > Reviewed-by: Kees Cook > Looks good to me, too. --Andy > -Kees > >> +static void *proc_map_files_follow_link(struct dentry *dentry, struct nameidata *nd) >> +{ >> + if (!capable(CAP_SYS_ADMIN)) >> + return ERR_PTR(-EPERM); >> + >> + return proc_pid_follow_link(dentry, nd); >> +} >> + >> +/* >> + * Identical to proc_pid_link_inode_operations except for follow_link() >> + */ >> +static const struct inode_operations proc_map_files_link_inode_operations = { >> + .readlink = proc_pid_readlink, >> + .follow_link = proc_map_files_follow_link, >> + .setattr = proc_setattr, >> +}; >> + >> static int >> proc_map_files_instantiate(struct inode *dir, struct dentry *dentry, >> struct task_struct *task, const void *ptr) >> @@ -1777,7 +1792,7 @@ proc_map_files_instantiate(struct inode *dir, struct dentry *dentry, >> ei = PROC_I(inode); >> ei->op.proc_get_link = proc_map_files_get_link; >> >> - inode->i_op = &proc_pid_link_inode_operations; >> + inode->i_op = &proc_map_files_link_inode_operations; >> inode->i_size = 64; >> inode->i_mode = S_IFLNK; >> >> @@ -1801,10 +1816,6 @@ static struct dentry *proc_map_files_lookup(struct inode *dir, >> int result; >> struct mm_struct *mm; >> >> - result = -EPERM; >> - if (!capable(CAP_SYS_ADMIN)) >> - goto out; >> - >> result = -ENOENT; >> task = get_proc_task(dir); >> if (!task) >> @@ -1858,10 +1869,6 @@ proc_map_files_readdir(struct file *file, struct dir_context *ctx) >> struct map_files_info *p; >> int ret; >> >> - ret = -EPERM; >> - if (!capable(CAP_SYS_ADMIN)) >> - goto out; >> - >> ret = -ENOENT; >> task = get_proc_task(file_inode(file)); >> if (!task) >> @@ -2050,7 +2057,6 @@ static const struct file_operations proc_timers_operations = { >> .llseek = seq_lseek, >> .release = seq_release_private, >> }; >> -#endif /* CONFIG_CHECKPOINT_RESTORE */ >> >> static int proc_pident_instantiate(struct inode *dir, >> struct dentry *dentry, struct task_struct *task, const void *ptr) >> @@ -2549,9 +2555,7 @@ static const struct inode_operations proc_task_inode_operations; >> static const struct pid_entry tgid_base_stuff[] = { >> DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations), >> DIR("fd", S_IRUSR|S_IXUSR, proc_fd_inode_operations, proc_fd_operations), >> -#ifdef CONFIG_CHECKPOINT_RESTORE >> DIR("map_files", S_IRUSR|S_IXUSR, proc_map_files_inode_operations, proc_map_files_operations), >> -#endif >> DIR("fdinfo", S_IRUSR|S_IXUSR, proc_fdinfo_inode_operations, proc_fdinfo_operations), >> DIR("ns", S_IRUSR|S_IXUGO, proc_ns_dir_inode_operations, proc_ns_dir_operations), >> #ifdef CONFIG_NET >> -- >> 1.8.1 >> > > > > -- > Kees Cook > Chrome OS Security -- Andy Lutomirski AMA Capital Management, LLC -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/