Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752757AbbH2N5e (ORCPT ); Sat, 29 Aug 2015 09:57:34 -0400 Received: from mail-pa0-f48.google.com ([209.85.220.48]:35119 "EHLO mail-pa0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752616AbbH2N5d (ORCPT ); Sat, 29 Aug 2015 09:57:33 -0400 Message-ID: <1440856650.8932.144.camel@edumazet-glaptop2.roam.corp.google.com> Subject: Re: [PATCH] task_work: remove fifo ordering guarantee From: Eric Dumazet To: Oleg Nesterov Cc: Al Viro , Linus Torvalds , "linux-kernel@vger.kernel.org" , Andrew Morton , Thomas Gleixner , Ingo Molnar , Maciej =?UTF-8?Q?=C5=BBenczykowski?= Date: Sat, 29 Aug 2015 06:57:30 -0700 In-Reply-To: <20150829124921.GA14973@redhat.com> References: <1440816150.8932.123.camel@edumazet-glaptop2.roam.corp.google.com> <20150829124921.GA14973@redhat.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2219 Lines: 63 On Sat, 2015-08-29 at 14:49 +0200, Oleg Nesterov wrote: > On 08/28, Eric Dumazet wrote: > > > > From: Eric Dumazet > > > > In commit f341861fb0b ("task_work: add a scheduling point in > > task_work_run()") I fixed a latency problem adding a cond_resched() > > call. > > > > Later, commit ac3d0da8f329 added yet another loop to reverse a list, > > bringing back the latency spike : > > > > I've seen in some cases this loop taking 275 ms, if for example a > > process with 2,000,000 files is killed. > > > > We could add yet another cond_resched() in the reverse loop, > > Can't we do this? Well, I stated in the changelog we could. Obviously we can. Adding 275 ms of pure overhead to perform this list reversal for files to be closed is quite unfortunate. > Personally I'd prefer to keep the fifo ordering. It just makes > more sense imho. Even if currently nobody depends on it (although > I am not sure about out-of-tree modules, say, systemtap). > > Let's look keyctl_session_to_parent(). It does task_work_cancel() > but only because we can not trust user-space. Otherwise we could > remove it and just do task_work_add(), but this needs fifo. So it looks like there is no problem today, right, other than the possibility to parse a long list while blocking IRQ ? > > Fifo just looks more sane to me. Well, files are closed in a random order. These are the main user of this stuff. If this is that critical, maybe use 2 lists, one for stuff needing fifo, and another one for un-ordered stuff (ed : file closing), and add a boolean to task_work_add()/task_work_cancel(). This adds yet another field into struct task_struct. Now we also could question why we needed commit 4a9d4b024a3102fc083c925c242d98ac27b1c5f6 ("switch fput to task_work_add ") since it seems quite an overhead at task exit with 10^6 of files to close. I understood the 'schedule_work() for interrupt/kernel_thread callers' part, but not the task_work_add() one. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/