Subject: Re: [PATCH] xfs: idle aild if the AIL is pushed up to the target LSN
From: Lucas Stach
To: Brian Foster
Cc: Dave Chinner, linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Date: Mon, 25 Apr 2016 20:11:37 +0200
Message-ID: <1461607897.2364.27.camel@lynxeye.de>
In-Reply-To: <20160425142444.GC33882@bfoster.bfoster>
References: <1461570163-4083-1-git-send-email-dev@lynxeye.de>
 <20160425142444.GC33882@bfoster.bfoster>

On Monday, 25.04.2016 at 10:24 -0400, Brian Foster wrote:
> On Mon, Apr 25, 2016 at 09:42:43AM +0200, Lucas Stach wrote:
> >
> > The current logic only idles aild if the AIL is empty, rescheduling
> > the worker with a timeout of 50ms otherwise. If the target LSN isn't
> > moved forward, the worker will not make any progress as it only
> > pushes the AIL up to the target LSN, leading to the empty AIL
> > condition only being met after the log worker has pushed out the
> > complete AIL. As the log worker runs on the scale of minutes, this
> > leaves aild in the "no progress, 50ms timeout" state for extended
> > periods of time, causing many unnecessary wakeups.
> >
> Where do you get the "scale of minutes" from? I believe the default
> log worker interval is 30s, though it can be increased by the admin.
>
Uh, right. It's 30s, which is still a very long time when we consider
that new items can be added to the AIL right after the log worker has
forced it out.

> The other thing to note is that the AIL target is also driven by
> demand on log space. E.g., see xlog_grant_push_ail(), which executes
> when a transaction attempts to reserve log space that isn't currently
> available in the log.
>
And that's where I think the problem stems from: the above function
pushes the AIL only up to a target LSN that makes enough room for the
required log space. Only the log worker empties the AIL completely, by
issuing an xfs_ail_push_all_sync().

> That said, I'm not sure whether there's a notable benefit of idling
> for 50ms over just scheduling out when we've hit the target lsn. It
> seems like anybody who pushes the target forward again is going to
> wake up the thread anyways. On the other hand, if the fs is idle the
> thread will eventually schedule out indefinitely. Is this a problem?

The patch tries to do exactly that: schedule out aild indefinitely when
there is no more work to do because nobody is pushing the target LSN
forward. On an idle FS this stops aild when the target LSN is hit,
wakes it up again as soon as the log worker requests the complete AIL
to be pushed out, and then lets it go back to sleep indefinitely until
the log fills up again.

> Have you observed problematic behavior due to the current heuristic?
>
Yes. On a FS with a low, but constant metadata workload I would expect
aild to be scheduled only when the log space is running out and/or the
log worker forces a sync.
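
For reference, the loop we are talking about roughly looks like this (a
simplified paraphrase of xfsaild() in fs/xfs/xfs_trans_ail.c, not the
literal kernel code):

	while (!kthread_should_stop()) {
		if (tout && tout <= 20)
			set_current_state(TASK_KILLABLE);
		else
			set_current_state(TASK_INTERRUPTIBLE);

		spin_lock(&ailp->xa_lock);

		/*
		 * Sleep indefinitely only if the AIL is completely
		 * empty and nobody moved the target since the last
		 * push.
		 */
		smp_rmb();
		if (!xfs_ail_min(ailp) &&
		    ailp->xa_target == ailp->xa_target_prev) {
			spin_unlock(&ailp->xa_lock);
			freezable_schedule();
			tout = 0;
			continue;
		}
		spin_unlock(&ailp->xa_lock);

		/* otherwise sleep for the timeout of the last push... */
		if (tout)
			freezable_schedule_timeout(msecs_to_jiffies(tout));

		__set_current_state(TASK_RUNNING);
		try_to_freeze();

		/* ...and push again */
		tout = xfsaild_push(ailp);
	}

xfsaild_push() hands back a 50ms timeout once the AIL has been pushed up
to the target (or has gone empty), so as long as anything is left in the
AIL the loop above never reaches the indefinite sleep.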
What currently happens is that aild stays in the "target LSN hit -> no
work to do, schedule a timeout every 50ms" state for almost the entire
runtime of the system. This causes about 20 CPU wakeups per second per
mountpoint, most of which are completely wasted, as there is no work to
do. My use case may be atypical, as I'm using XFS on a laptop, but in
that case the XFS aild is actually causing most of the CPU wakeups in
an otherwise idle system, which is clearly undesirable: those wakeups
aren't really needed, as they don't lead to any actual work getting
done.

> >
> > Fix this by idling aild as soon as the AIL is pushed up to the
> > target LSN. All code paths that move the target LSN forward already
> > ensure that aild is woken up again when needed.
> >
> > Signed-off-by: Lucas Stach
> > ---
> > I'm pretty sure that the above deduction is correct, but as this is
> > my first real encounter with the XFS code base, someone with a bit
> > more knowledge than me should give this some thought.
> > ---
> >  fs/xfs/xfs_trans_ail.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> > index d6c9c3e..7eca852 100644
> > --- a/fs/xfs/xfs_trans_ail.c
> > +++ b/fs/xfs/xfs_trans_ail.c
> > @@ -495,6 +495,7 @@ xfsaild(
> >  {
> >  	struct xfs_ail	*ailp = data;
> >  	long		tout = 0;	/* milliseconds */
> > +	struct xfs_log_item *lip;
> >
> >  	current->flags |= PF_MEMALLOC;
> >  	set_freezable();
> > @@ -508,7 +509,8 @@ xfsaild(
> >  		spin_lock(&ailp->xa_lock);
> >
> >  		/*
> > -		 * Idle if the AIL is empty and we are not racing with a target
> > +		 * Idle if the AIL is empty or pushed up to the requested
> > +		 * target LSN and we are not racing with a target
> >  		 * update. We check the AIL after we set the task to a sleep
> >  		 * state to guarantee that we either catch an xa_target update
> >  		 * or that a wake_up resets the state to TASK_RUNNING.
> > @@ -517,7 +519,8 @@ xfsaild(
> >  		 * The barrier matches the xa_target update in xfs_ail_push().
> >  		 */
> >  		smp_rmb();
> > -		if (!xfs_ail_min(ailp) &&
> > +		lip = xfs_ail_min(ailp);
> > +		if ((!lip || XFS_LSN_CMP(lip->li_lsn, ailp->xa_target) >= 0) &&
>
> I think you'd want to use > 0, as == 0 suggests you have an item that
> matches the target.
>
This check just mirrors what is done in xfsaild_push() to check whether
the target has been hit. Either both sites should be changed (which
would assume the target LSN is inclusive and we should push the AIL to
at least the target) or left as is (assuming the target LSN is
exclusive and we should push up to the target).

> FWIW, another option might be to increase the timeout to something
> much larger than the current 50ms when we've hit the target.

I don't think we want to extend the timeout if someone already pushed
the target LSN forward, so we would need to mirror the check for the
target LSN race here. I believe my change is less intrusive; a rough
sketch of what the timeout variant would need is appended below the
quoted text.

Regards,
Lucas

> See towards the end of xfsaild_push() where tout is determined based
> on the state of the push (and we already have logic to check against
> xa_target).
>
> Brian
>
> >
> >  		    ailp->xa_target == ailp->xa_target_prev) {
> >  			spin_unlock(&ailp->xa_lock);
> >  			freezable_schedule();
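
For completeness, here is roughly what the timeout variant would have to
look like. This is a hypothetical, untested sketch meant only to show
the duplicated race check; the 30 second value is an arbitrary number
picked for illustration, not something taken from the XFS code:

	spin_lock(&ailp->xa_lock);
	smp_rmb();
	if (!xfs_ail_min(ailp) &&
	    ailp->xa_target == ailp->xa_target_prev) {
		/* unchanged: AIL empty, sleep indefinitely */
		spin_unlock(&ailp->xa_lock);
		freezable_schedule();
		tout = 0;
		continue;
	}

	/*
	 * Hypothetical extra branch: the last push reached the target
	 * (xfsaild_push() returned the 50ms timeout) and nobody has
	 * moved the target since, so stretch the timeout. Note that
	 * this duplicates the xa_target_prev race check that already
	 * guards the indefinite sleep above.
	 */
	if (tout == 50 &&
	    ailp->xa_target == ailp->xa_target_prev)
		tout = 30 * MSEC_PER_SEC;	/* arbitrary "large" value */
	spin_unlock(&ailp->xa_lock);

	if (tout)
		freezable_schedule_timeout(msecs_to_jiffies(tout));

Compared to that, simply treating "pushed up to the target" like the
empty-AIL case, as the patch does, keeps the idle decision in one place.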