Return-Path:
Received: from youngberry.canonical.com ([91.189.89.112]:52756 "EHLO
	youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752119AbbGFJpT (ORCPT );
	Mon, 6 Jul 2015 05:45:19 -0400
Date: Mon, 6 Jul 2015 10:45:16 +0100
From: Luis Henriques
To: Jeff Layton
Cc: stable@vger.kernel.org, linux-nfs@vger.kernel.org,
	William Dauchy, jean@primarydata.com
Subject: Re: [PATCH][stable] nfs: take extra reference to fl->fl_file when
	running a setlk
Message-ID: <20150706094516.GB2165@ares>
References: <1435833617-4562-1-git-send-email-jeff.layton@primarydata.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
In-Reply-To: <1435833617-4562-1-git-send-email-jeff.layton@primarydata.com>
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Thu, Jul 02, 2015 at 06:40:17AM -0400, Jeff Layton wrote:
> From: Jeff Layton
>
> We had a report of a crash while stress testing the NFS client:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
> IP: [] locks_get_lock_context+0x8/0x90
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_filter ip6_tables iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw coretemp crct10dif_pclmul ppdev crc32_pclmul crc32c_intel ghash_clmulni_intel vmw_balloon serio_raw vmw_vmci i2c_piix4 shpchp parport_pc acpi_cpufreq parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih mptbase e1000 ata_generic pata_acpi
> CPU: 1 PID: 399 Comm: kworker/1:1H Not tainted 4.1.0-0.rc1.git0.1.fc23.x86_64 #1
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
> Workqueue: rpciod rpc_async_schedule [sunrpc]
> task: ffff880036aea7c0 ti: ffff8800791f4000 task.ti: ffff8800791f4000
> RIP: 0010:[] [] locks_get_lock_context+0x8/0x90
> RSP: 0018:ffff8800791f7c00 EFLAGS: 00010293
> RAX: ffff8800791f7c40 RBX: ffff88001f2ad8c0 RCX: ffffe8ffffc80305
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff8800791f7c88 R08: ffff88007fc971d8 R09: 279656d600000000
> R10: 0000034a01000000 R11: 279656d600000000 R12: ffff88001f2ad918
> R13: ffff88001f2ad8c0 R14: 0000000000000000 R15: 0000000100e73040
> FS: 0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000150 CR3: 0000000001c0b000 CR4: 00000000000407e0
> Stack:
>  ffffffff8127c5b0 ffff8800791f7c18 ffffffffa0171e29 ffff8800791f7c58
>  ffffffffa0171ef8 ffff8800791f7c78 0000000000000246 ffff88001ea0ba00
>  ffff8800791f7c40 ffff8800791f7c40 00000000ff5d86a3 ffff8800791f7ca8
> Call Trace:
>  [] ? __posix_lock_file+0x40/0x760
>  [] ? rpc_make_runnable+0x99/0xa0 [sunrpc]
>  [] ? rpc_wake_up_task_queue_locked.part.35+0xc8/0x250 [sunrpc]
>  [] posix_lock_file_wait+0x4a/0x120
>  [] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
>  [] ? nfs41_sequence_done+0xd8/0x2d0 [nfsv4]
>  [] do_vfs_lock+0x2d/0x30 [nfsv4]
>  [] nfs4_lock_done+0x1ad/0x210 [nfsv4]
>  [] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
>  [] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
>  [] rpc_exit_task+0x2c/0xa0 [sunrpc]
>  [] ? call_refreshresult+0x150/0x150 [sunrpc]
>  [] __rpc_execute+0x90/0x460 [sunrpc]
>  [] rpc_async_schedule+0x15/0x20 [sunrpc]
>  [] process_one_work+0x1bb/0x410
>  [] worker_thread+0x53/0x480
>  [] ? process_one_work+0x410/0x410
>  [] ? process_one_work+0x410/0x410
>  [] kthread+0xd8/0xf0
>  [] ? kthread_worker_fn+0x180/0x180
>  [] ret_from_fork+0x42/0x70
>  [] ? kthread_worker_fn+0x180/0x180
>
> Jean says:
>
> "Running locktests with a large number of iterations resulted in a
> client crash. The test run took a while and hasn't finished after close
> to 2 hours. The crash happened right after I gave up and killed the test
> (after 107m) with Ctrl+C."
>
> The crash happened because a NULL inode pointer got passed into
> locks_get_lock_context. The call chain indicates that file_inode(filp)
> returned NULL, which means that f_inode was NULL. Since that's zeroed
> out in __fput, that suggests that this filp pointer outlived the last
> reference.
>
> Looking at the code, that seems possible. We copy the struct file_lock
> that's passed in, but if the task is signalled at an inopportune time we
> can end up trying to use that file_lock in rpciod context after the
> process that requested it has already returned (and possibly put its
> filp reference).
>
> Fix this by taking an extra reference to the filp when we allocate the
> lock info, and put it in nfs4_lock_release.
>
> Reported-by: Jean Spector
> Signed-off-by: Jeff Layton
> Signed-off-by: Trond Myklebust
> Cc: <stable@vger.kernel.org> # all stable-series kernels
> Upstream commit: feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596

Thanks, I'm queuing it for the 3.16 kernel.

Cheers,
--
Luís
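
P.S. For anyone backporting this by hand, the shape of the fix is the
get_file()/fput() pairing sketched below. This is a sketch only, assuming
a simplified nfs4_lockdata; the real structure and helpers in
fs/nfs/nfs4proc.c carry more state (open context, lock state, sequence
arguments), so see the upstream commit for the actual diff.

/*
 * Sketch, not the literal upstream change: simplified stand-ins for
 * the nfs4_lockdata handling in fs/nfs/nfs4proc.c.
 */
#include <linux/fs.h>
#include <linux/slab.h>

struct nfs4_lockdata {
	struct file_lock fl;	/* private copy of the caller's lock */
};

static struct nfs4_lockdata *nfs4_alloc_lockdata(struct file_lock *fl)
{
	struct nfs4_lockdata *p = kzalloc(sizeof(*p), GFP_NOFS);

	if (p == NULL)
		return NULL;
	/*
	 * Pin fl->fl_file: the copied file_lock may still be used from
	 * rpciod after the requesting task has been signalled, returned,
	 * and put its own filp reference, at which point __fput() would
	 * have zeroed f_inode and locks_get_lock_context() would oops.
	 */
	get_file(fl->fl_file);
	memcpy(&p->fl, fl, sizeof(p->fl));
	return p;
}

static void nfs4_lock_release(void *calldata)
{
	struct nfs4_lockdata *data = calldata;

	/* Drop the reference taken in nfs4_alloc_lockdata(). */
	fput(data->fl.fl_file);
	kfree(data);
}

Holding the filp reference for the lifetime of the async RPC means the
copied file_lock can never outlive its file, no matter when the
requesting task bails out on a signal.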