Date: Tue, 16 Jun 2015 20:47:11 -0400
From: Steven Rostedt
To: Alexei Starovoitov
Cc: Daniel Wagner, paulmck@linux.vnet.ibm.com, Daniel Wagner, LKML
Subject: Re: call_rcu from trace_preempt
Message-ID: <20150616204711.0e6ea1d7@grimm.local.home>
In-Reply-To: <5580C054.2080809@plumgrid.com>
References: <557F509D.2000509@plumgrid.com> <20150615230702.GB3913@linux.vnet.ibm.com> <557F7764.5060707@plumgrid.com> <20150616021458.GE3913@linux.vnet.ibm.com> <557FB7E1.6080004@plumgrid.com> <20150616122733.GG3913@linux.vnet.ibm.com> <558018DD.1080701@monom.org> <55805AC5.8020507@plumgrid.com> <20150616133709.6c53645d@gandalf.local.home> <5580C054.2080809@plumgrid.com>

On Tue, 16 Jun 2015 17:33:24 -0700 Alexei Starovoitov wrote:

> On 6/16/15 10:37 AM, Steven Rostedt wrote:
> >>> + kfree(l);
> >>
> >> that's not right, since such a thread defeats the RCU protection of the
> >> lookup. We need either kfree_rcu/call_rcu or synchronize_rcu.
> >> Obviously the former is preferred; that's why I'm still digging into it.
> >> Probably a thread that does kfree_rcu would be ok, but we shouldn't
> >> be doing it unconditionally. For all networking programs and 99%
> >> of tracing programs the existing code is fine, and I don't want to
> >> slow it down to tackle the corner case.
> >> Extra spin_lock just to add it to the list is also quite costly.
>
> > Use an irq_work() handler to do the kfree_rcu(), and use llist (the
> > lockless list) to add items to the list.
>
> have been studying irq_work and llist... it will work, but it's quite
> costly too. Every kfree_rcu will be replaced with irq_work_queue(),
> which is irq_work_claim() with one lock_cmpxchg, plus another
> lock_cmpxchg in llist_add, plus another lock_cmpxchg for our own llist
> of 'to be kfree_rcu-ed htab elements'. That's a lot.
> There must be a better solution. Need to explore more.

Do what I do in tracing: use a (per-cpu?) bit as the test. Add the element
to the list (that will be a cmpxchg, but I'm not sure you can avoid it),
then check the bit to see if the irq work has already been activated. If
not, activate the irq work and set the bit. Then you will not have any
more cmpxchg operations in the fast path.

In your irq work handler, clear the bit, process all the entries until the
list is empty, check if the bit is set again, and repeat.

I haven't looked at the thread before I was added to the Cc, so I'm
answering this out of context.

-- Steve