Date: Thu, 30 Apr 2015 12:33:03 -0400
From: Steven Rostedt
To: LKML, linux-rt-users
Cc: Thomas Gleixner, Sebastian Andrzej Siewior, Clark Williams, Dave Chinner, Peter Zijlstra
Subject: [PATCH][RT] xfs: Disable preemption when grabbing all icsb counter locks
Message-ID: <20150430123303.30f5bd12@gandalf.local.home>

Running a test on a large CPU count box with xfs, I hit a live lock
with the following backtraces on several CPUs:

 Call Trace:
  __const_udelay+0x28/0x30
  xfs_icsb_lock_cntr+0x2a/0x40 [xfs]
  xfs_icsb_modify_counters+0x71/0x280 [xfs]
  xfs_trans_reserve+0x171/0x210 [xfs]
  xfs_create+0x24d/0x6f0 [xfs]
  ? avc_has_perm_flags+0xfb/0x1e0
  xfs_vn_mknod+0xbb/0x1e0 [xfs]
  xfs_vn_create+0x13/0x20 [xfs]
  vfs_create+0xcd/0x130
  do_last+0xb8f/0x1240
  path_openat+0xc2/0x490

Looking at the code I see it was stuck at:

STATIC void
xfs_icsb_lock_cntr(
	xfs_icsb_cnts_t	*icsbp)
{
	while (test_and_set_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags)) {
		ndelay(1000);
	}
}

I'm not sure why it does the ndelay() and not just a cpu_relax(), but
that's beside the point. In xfs_icsb_modify_counters() the code is
fine. There's a preempt_disable() called when taking this bit spinlock
and a preempt_enable() after it is released. The issue is that not all
locations are protected by preempt_disable() when PREEMPT_RT is set.
Namely the places that grab all CPU cntr locks.

STATIC void
xfs_icsb_lock_all_counters(
	xfs_mount_t	*mp)
{
	xfs_icsb_cnts_t *cntp;
	int		i;

	for_each_online_cpu(i) {
		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
		xfs_icsb_lock_cntr(cntp);
	}
}

STATIC void
xfs_icsb_disable_counter()
{
	[...]
	xfs_icsb_lock_all_counters(mp);
	[...]
	xfs_icsb_unlock_all_counters(mp);
}

STATIC void
xfs_icsb_balance_counter_locked()
{
	[...]
	xfs_icsb_disable_counter();
	[...]
}

STATIC void
xfs_icsb_balance_counter(
	xfs_mount_t	*mp,
	xfs_sb_field_t  fields,
	int		min_per_cpu)
{
	spin_lock(&mp->m_sb_lock);
	xfs_icsb_balance_counter_locked(mp, fields, min_per_cpu);
	spin_unlock(&mp->m_sb_lock);
}

Now, when PREEMPT_RT is not enabled, that spin_lock() disables
preemption. But for PREEMPT_RT, it does not. Although I was not able to
get a dump of the state of all tasks on my test box, I'm assuming that
some task called xfs_icsb_lock_all_counters() and was preempted by an
RT task and could not finish, causing all callers of that lock to spin
indefinitely.

Looking at all users of xfs_icsb_lock_all_counters(), they are leaf
functions and do not call anything that may block on PREEMPT_RT.
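To make the suspected failure mode concrete, here is a minimal
userspace sketch of the locking pattern. It is not the kernel code: it
models each per-CPU lock bit with a C11 atomic_bool, and the names
NR_CPUS_SKETCH, struct icsb_cnt, icsb_trylock_cntr() and
icsb_count_blocked() are hypothetical helpers made up for illustration.
It is single-threaded and does not model preemption itself. It only
shows that while one task holds every per-CPU bit, any per-CPU locker
would keep spinning until that task runs again and releases them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the model */

/* Stand-in for xfs_icsb_cnts_t: one lock bit per modelled CPU. */
struct icsb_cnt {
	atomic_bool locked;
};

static struct icsb_cnt cnts[NR_CPUS_SKETCH];

/* One test-and-set attempt; returns true if we now own the bit. */
static bool icsb_trylock_cntr(struct icsb_cnt *c)
{
	return !atomic_exchange(&c->locked, true);
}

/* Same shape as xfs_icsb_lock_cntr(): spin until the bit is ours. */
static void icsb_lock_cntr(struct icsb_cnt *c)
{
	while (!icsb_trylock_cntr(c))
		;	/* the kernel version calls ndelay(1000) here */
}

static void icsb_unlock_cntr(struct icsb_cnt *c)
{
	bool was_locked = atomic_exchange(&c->locked, false);

	assert(was_locked);	/* unlocking a free bit would be a bug */
	(void)was_locked;
}

/* Same shape as xfs_icsb_lock_all_counters(): take every bit in turn. */
static void icsb_lock_all(void)
{
	for (int i = 0; i < NR_CPUS_SKETCH; i++)
		icsb_lock_cntr(&cnts[i]);
}

static void icsb_unlock_all(void)
{
	for (int i = 0; i < NR_CPUS_SKETCH; i++)
		icsb_unlock_cntr(&cnts[i]);
}

/* Count the modelled CPUs whose locker would spin right now. */
static int icsb_count_blocked(void)
{
	int blocked = 0;

	for (int i = 0; i < NR_CPUS_SKETCH; i++) {
		if (icsb_trylock_cntr(&cnts[i]))
			icsb_unlock_cntr(&cnts[i]);
		else
			blocked++;
	}
	return blocked;
}
```

Calling icsb_count_blocked() between icsb_lock_all() and
icsb_unlock_all() reports every modelled CPU as blocked. If the holder
were preempted in that window, which is what I assume happens on
PREEMPT_RT, those spinners would never make progress.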

I believe the proper fix here is to simply disable preemption in
xfs_icsb_lock_all_counters() when PREEMPT_RT is enabled.

Signed-off-by: Steven Rostedt
---
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 51435dbce9c4..dbaa1ce3f308 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1660,6 +1660,12 @@ xfs_icsb_lock_all_counters(
 	xfs_icsb_cnts_t *cntp;
 	int		i;
 
+	/*
+	 * In PREEMPT_RT, preemption is not disabled here, and it
+	 * must be to take the xfs_icsb_lock_cntr.
+	 */
+	preempt_disable_rt();
+
 	for_each_online_cpu(i) {
 		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
 		xfs_icsb_lock_cntr(cntp);
@@ -1677,6 +1683,8 @@ xfs_icsb_unlock_all_counters(
 		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
 		xfs_icsb_unlock_cntr(cntp);
 	}
+
+	preempt_enable_rt();
 }
 
 STATIC void