Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754853AbbG0VIZ (ORCPT ); Mon, 27 Jul 2015 17:08:25 -0400 Received: from emvm-gh1-uea09.nsa.gov ([63.239.67.10]:50744 "EHLO emvm-gh1-uea09.nsa.gov" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754663AbbG0VIX (ORCPT ); Mon, 27 Jul 2015 17:08:23 -0400 X-TM-IMSS-Message-ID: Message-ID: <55B69D67.4070002@tycho.nsa.gov> Date: Mon, 27 Jul 2015 17:06:47 -0400 From: Stephen Smalley Organization: National Security Agency User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: Hugh Dickins CC: prarit@redhat.com, david@fromorbit.com, mstevens@fedoraproject.org, manfred@colorfullife.com, esandeen@redhat.com, wagi@monom.org, linux-kernel@vger.kernel.org, eparis@redhat.com, linux-mm@kvack.org, linux-security-module@vger.kernel.org, dave@stgolabs.net, nyc@holomorphy.com, akpm@linux-foundation.org, torvalds@linux-foundation.org, selinux@tycho.nsa.gov Subject: Re: [PATCH v2] ipc: Use private shmem or hugetlbfs inodes for shm segments. References: <1437741275-5388-1-git-send-email-sds@tycho.nsa.gov> In-Reply-To: Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 11255 Lines: 213 On 07/27/2015 03:32 PM, Hugh Dickins wrote: > On Fri, 24 Jul 2015, Stephen Smalley wrote: > >> The shm implementation internally uses shmem or hugetlbfs inodes >> for shm segments. As these inodes are never directly exposed to >> userspace and only accessed through the shm operations which are >> already hooked by security modules, mark the inodes with the >> S_PRIVATE flag so that inode security initialization and permission >> checking is skipped. >> >> This was motivated by the following lockdep warning: >> Jul 22 14:36:40 fc23 kernel: >> ====================================================== >> Jul 22 14:36:40 fc23 kernel: [ INFO: possible circular locking >> dependency detected ] >> Jul 22 14:36:40 fc23 kernel: 4.2.0-0.rc3.git0.1.fc24.x86_64+debug #1 >> Tainted: G W >> Jul 22 14:36:40 fc23 kernel: >> ------------------------------------------------------- >> Jul 22 14:36:40 fc23 kernel: httpd/1597 is trying to acquire lock: >> Jul 22 14:36:40 fc23 kernel: (&ids->rwsem){+++++.}, at: >> [] shm_close+0x34/0x130 >> Jul 22 14:36:40 fc23 kernel: #012but task is already holding lock: >> Jul 22 14:36:40 fc23 kernel: (&mm->mmap_sem){++++++}, at: >> [] SyS_shmdt+0x4b/0x180 >> Jul 22 14:36:40 fc23 kernel: #012which lock already depends on the new lock. >> Jul 22 14:36:40 fc23 kernel: #012the existing dependency chain (in >> reverse order) is: >> Jul 22 14:36:40 fc23 kernel: #012-> #3 (&mm->mmap_sem){++++++}: >> Jul 22 14:36:40 fc23 kernel: [] lock_acquire+0xc7/0x270 >> Jul 22 14:36:40 fc23 kernel: [] __might_fault+0x7a/0xa0 >> Jul 22 14:36:40 fc23 kernel: [] filldir+0x9e/0x130 >> Jul 22 14:36:40 fc23 kernel: [] >> xfs_dir2_block_getdents.isra.12+0x198/0x1c0 [xfs] >> Jul 22 14:36:40 fc23 kernel: [] >> xfs_readdir+0x1b4/0x330 [xfs] >> Jul 22 14:36:40 fc23 kernel: [] >> xfs_file_readdir+0x2b/0x30 [xfs] >> Jul 22 14:36:40 fc23 kernel: [] iterate_dir+0x97/0x130 >> Jul 22 14:36:40 fc23 kernel: [] SyS_getdents+0x91/0x120 >> Jul 22 14:36:40 fc23 kernel: [] >> entry_SYSCALL_64_fastpath+0x12/0x76 >> Jul 22 14:36:40 fc23 kernel: #012-> #2 (&xfs_dir_ilock_class){++++.+}: >> Jul 22 14:36:40 fc23 kernel: [] lock_acquire+0xc7/0x270 >> Jul 22 14:36:40 fc23 kernel: [] >> down_read_nested+0x57/0xa0 >> Jul 22 14:36:40 fc23 kernel: [] >> xfs_ilock+0x167/0x350 [xfs] >> Jul 22 14:36:40 fc23 kernel: [] >> xfs_ilock_attr_map_shared+0x38/0x50 [xfs] >> Jul 22 14:36:40 fc23 kernel: [] >> xfs_attr_get+0xbd/0x190 [xfs] >> Jul 22 14:36:40 fc23 kernel: [] >> xfs_xattr_get+0x3d/0x70 [xfs] >> Jul 22 14:36:40 fc23 kernel: [] >> generic_getxattr+0x4f/0x70 >> Jul 22 14:36:40 fc23 kernel: [] >> inode_doinit_with_dentry+0x162/0x670 >> Jul 22 14:36:40 fc23 kernel: [] >> sb_finish_set_opts+0xd9/0x230 >> Jul 22 14:36:40 fc23 kernel: [] >> selinux_set_mnt_opts+0x35c/0x660 >> Jul 22 14:36:40 fc23 kernel: [] >> superblock_doinit+0x77/0xf0 >> Jul 22 14:36:40 fc23 kernel: [] >> delayed_superblock_init+0x10/0x20 >> Jul 22 14:36:40 fc23 kernel: [] >> iterate_supers+0xb3/0x110 >> Jul 22 14:36:40 fc23 kernel: [] >> selinux_complete_init+0x2f/0x40 >> Jul 22 14:36:40 fc23 kernel: [] >> security_load_policy+0x103/0x600 >> Jul 22 14:36:40 fc23 kernel: [] >> sel_write_load+0xc1/0x750 >> Jul 22 14:36:40 fc23 kernel: [] __vfs_write+0x37/0x100 >> Jul 22 14:36:40 fc23 kernel: [] vfs_write+0xa9/0x1a0 >> Jul 22 14:36:40 fc23 kernel: [] SyS_write+0x58/0xd0 >> Jul 22 14:36:40 fc23 kernel: [] >> entry_SYSCALL_64_fastpath+0x12/0x76 >> Jul 22 14:36:40 fc23 kernel: #012-> #1 (&isec->lock){+.+.+.}: >> Jul 22 14:36:40 fc23 kernel: [] lock_acquire+0xc7/0x270 >> Jul 22 14:36:40 fc23 kernel: [] >> mutex_lock_nested+0x7f/0x3e0 >> Jul 22 14:36:40 fc23 kernel: [] >> inode_doinit_with_dentry+0xb9/0x670 >> Jul 22 14:36:40 fc23 kernel: [] >> selinux_d_instantiate+0x1c/0x20 >> Jul 22 14:36:40 fc23 kernel: [] >> security_d_instantiate+0x36/0x60 >> Jul 22 14:36:40 fc23 kernel: [] d_instantiate+0x54/0x70 >> Jul 22 14:36:40 fc23 kernel: [] >> __shmem_file_setup+0xdc/0x240 >> Jul 22 14:36:40 fc23 kernel: [] >> shmem_file_setup+0x10/0x20 >> Jul 22 14:36:40 fc23 kernel: [] newseg+0x290/0x3a0 >> Jul 22 14:36:40 fc23 kernel: [] ipcget+0x208/0x2d0 >> Jul 22 14:36:40 fc23 kernel: [] SyS_shmget+0x54/0x70 >> Jul 22 14:36:40 fc23 kernel: [] >> entry_SYSCALL_64_fastpath+0x12/0x76 >> Jul 22 14:36:40 fc23 kernel: #012-> #0 (&ids->rwsem){+++++.}: >> Jul 22 14:36:40 fc23 kernel: [] >> __lock_acquire+0x1a78/0x1d00 >> Jul 22 14:36:40 fc23 kernel: [] lock_acquire+0xc7/0x270 >> Jul 22 14:36:40 fc23 kernel: [] down_write+0x5a/0xc0 >> Jul 22 14:36:40 fc23 kernel: [] shm_close+0x34/0x130 >> Jul 22 14:36:40 fc23 kernel: [] remove_vma+0x45/0x80 >> Jul 22 14:36:40 fc23 kernel: [] do_munmap+0x2b0/0x460 >> Jul 22 14:36:40 fc23 kernel: [] SyS_shmdt+0xb5/0x180 >> Jul 22 14:36:40 fc23 kernel: [] >> entry_SYSCALL_64_fastpath+0x12/0x76 >> Jul 22 14:36:40 fc23 kernel: #012other info that might help us debug this: >> Jul 22 14:36:40 fc23 kernel: Chain exists of:#012 &ids->rwsem --> >> &xfs_dir_ilock_class --> &mm->mmap_sem >> Jul 22 14:36:40 fc23 kernel: Possible unsafe locking scenario: >> Jul 22 14:36:40 fc23 kernel: CPU0 CPU1 >> Jul 22 14:36:40 fc23 kernel: ---- ---- >> Jul 22 14:36:40 fc23 kernel: lock(&mm->mmap_sem); >> Jul 22 14:36:40 fc23 kernel: >> lock(&xfs_dir_ilock_class); >> Jul 22 14:36:40 fc23 kernel: lock(&mm->mmap_sem); >> Jul 22 14:36:40 fc23 kernel: lock(&ids->rwsem); >> Jul 22 14:36:40 fc23 kernel: #012 *** DEADLOCK *** >> Jul 22 14:36:40 fc23 kernel: 1 lock held by httpd/1597: >> Jul 22 14:36:40 fc23 kernel: #0: (&mm->mmap_sem){++++++}, at: >> [] SyS_shmdt+0x4b/0x180 >> Jul 22 14:36:40 fc23 kernel: #012stack backtrace: >> Jul 22 14:36:40 fc23 kernel: CPU: 7 PID: 1597 Comm: httpd Tainted: G >> W 4.2.0-0.rc3.git0.1.fc24.x86_64+debug #1 >> Jul 22 14:36:40 fc23 kernel: Hardware name: VMware, Inc. VMware >> Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 >> 05/20/2014 >> Jul 22 14:36:40 fc23 kernel: 0000000000000000 000000006cb6fe9d >> ffff88019ff07c58 ffffffff81868175 >> Jul 22 14:36:40 fc23 kernel: 0000000000000000 ffffffff82aea390 >> ffff88019ff07ca8 ffffffff81105903 >> Jul 22 14:36:40 fc23 kernel: ffff88019ff07c78 ffff88019ff07d08 >> 0000000000000001 ffff8800b75108f0 >> Jul 22 14:36:40 fc23 kernel: Call Trace: >> Jul 22 14:36:40 fc23 kernel: [] dump_stack+0x4c/0x65 >> Jul 22 14:36:40 fc23 kernel: [] print_circular_bug+0x1e3/0x250 >> Jul 22 14:36:40 fc23 kernel: [] __lock_acquire+0x1a78/0x1d00 >> Jul 22 14:36:40 fc23 kernel: [] ? unlink_file_vma+0x33/0x60 >> Jul 22 14:36:40 fc23 kernel: [] lock_acquire+0xc7/0x270 >> Jul 22 14:36:40 fc23 kernel: [] ? shm_close+0x34/0x130 >> Jul 22 14:36:40 fc23 kernel: [] down_write+0x5a/0xc0 >> Jul 22 14:36:40 fc23 kernel: [] ? shm_close+0x34/0x130 >> Jul 22 14:36:40 fc23 kernel: [] shm_close+0x34/0x130 >> Jul 22 14:36:40 fc23 kernel: [] remove_vma+0x45/0x80 >> Jul 22 14:36:40 fc23 kernel: [] do_munmap+0x2b0/0x460 >> Jul 22 14:36:40 fc23 kernel: [] ? SyS_shmdt+0x4b/0x180 >> Jul 22 14:36:40 fc23 kernel: [] SyS_shmdt+0xb5/0x180 >> Jul 22 14:36:40 fc23 kernel: [] >> entry_SYSCALL_64_fastpath+0x12/0x76 >> >> Reported-by: Morten Stevens >> Signed-off-by: Stephen Smalley > > Acked-by: Hugh Dickins > but with one reservation below... > >> --- >> This version only differs in the patch description, which restores >> the original lockdep trace from Morten Stevens. It was unfortunately >> mangled in the prior version. >> >> fs/hugetlbfs/inode.c | 2 ++ >> ipc/shm.c | 2 +- >> mm/shmem.c | 4 ++-- >> 3 files changed, 5 insertions(+), 3 deletions(-) >> >> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c >> index 0cf74df..973c24c 100644 >> --- a/fs/hugetlbfs/inode.c >> +++ b/fs/hugetlbfs/inode.c >> @@ -1010,6 +1010,8 @@ struct file *hugetlb_file_setup(const char *name, size_t size, >> inode = hugetlbfs_get_inode(sb, NULL, S_IFREG | S_IRWXUGO, 0); >> if (!inode) >> goto out_dentry; >> + if (creat_flags == HUGETLB_SHMFS_INODE) >> + inode->i_flags |= S_PRIVATE; > > I wonder if you would do better just to set S_PRIVATE unconditionally > there. > > hugetlb_file_setup() has two callsites, neither of which exposes an fd. > One of them is shm.c's newseg(), which is getting us into the lockdep > trouble that you're fixing here. > > The other is mmap.c's mmap_pgoff(). Now I don't think that will ever > get into lockdep trouble (no mutex or rwsem has been taken at that > point), but might your change above introduce (perhaps now or perhaps > in future) an inconsistency between how SElinux checks are applied to > a SHM area, and how they are applied to a MAP_ANONYMOUS|MAP_HUGETLB > area, and how they are applied to a straight MAP_ANONYMOUS area? > > I think your patch as it stands brings SHM into line with > MAP_ANONYMOUS, but leaves MAP_ANONYMOUS|MAP_HUGETLB going the old way. > Perhaps an anomaly would appear when mprotect() is used? > > It's up to you: I think your patch is okay as is, > but I just wonder if it has a surprise in store for the future. That sounds reasonable, although there is the concern that hugetlb_file_setup() might be used in the future for files that are exposed as fds, unless we rename it to hugetlb_kernel_file_setup() or similar to match shmem_kernel_file_setup(). Also should probably be done as a separate change on top since it isn't directly related to ipc/shm or fixing this lockdep. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/