Date: Wed, 12 Jul 2017 09:49:11 -0500 (CDT)
From: Christopher Lameter
To: Joonsoo Kim
cc: Laura Abbott, Pekka Enberg, David Rientjes, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kees Cook
Subject: Re: [RFC][PATCH] slub: Introduce 'alternate' per cpu partial lists
In-Reply-To: <20170614044528.GA5924@js1304-desktop>
References: <1496965984-21962-1-git-send-email-labbott@redhat.com> <20170614044528.GA5924@js1304-desktop>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 14 Jun 2017, Joonsoo Kim wrote:

> > - Some of this code is redundant and can probably be combined.
> > - The fast path is very sensitive and it was suggested I leave it alone. The
> >   approach I took means the fastpath cmpxchg always fails before trying the
> >   alternate cmpxchg. From some of my profiling, the cmpxchg seemed to be fairly
> >   expensive.
>
> It looks better to modify the fastpath for non-debugging poisoning.
> If we use the jump label, it doesn't cause any overhead to the fastpath
> for the user who doesn't use this feature. It really makes things
> simpler. Only a few more lines will be needed in the fastpath.
>
> Christoph, any opinion?

Just looked through it. Sorry, I was on vacation in Europe for a while.

The duplication in kmem_cache_cpu is not good performance-wise. Maybe just
keep the single per cpu partial list and, depending on a kmem_cache flag,
change the locking semantics in order to allow for faster debugging?
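
To make the last suggestion a bit more concrete, here is a rough userspace
sketch of "one partial list, locking chosen by a cache flag". It is not SLUB
code and not a proposed patch. Every name in it (SKETCH_FAST_DEBUG,
struct sketch_cache, sketch_put_partial and so on) is made up for
illustration, and the "lock" is only a counter standing in for whatever
locking the flagged mode would really need. The flag check is a plain branch
here. The jump label idea quoted above would be one way to keep that branch
off the fastpath for users who do not enable the feature, which this sketch
does not model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag for illustration only; not an existing kmem_cache flag. */
#define SKETCH_FAST_DEBUG 0x1u

struct sketch_page {
	struct sketch_page *next;
};

/* One cache, ONE partial list; no second "alternate" list is kept. */
struct sketch_cache {
	unsigned int flags;
	struct sketch_page *cpu_partial;
	int lock_depth;              /* toy stand-in for a real lock */
	unsigned long locked_puts;   /* puts that went through the locked mode */
	unsigned long lockless_puts; /* puts that used the default mode */
};

static void sketch_lock(struct sketch_cache *s)
{
	s->lock_depth++;
}

static void sketch_unlock(struct sketch_cache *s)
{
	assert(s->lock_depth > 0);
	s->lock_depth--;
}

/* Push a page onto the single partial list; the flag picks the locking mode. */
static void sketch_put_partial(struct sketch_cache *s, struct sketch_page *page)
{
	assert(page != NULL);

	if (s->flags & SKETCH_FAST_DEBUG) {
		/* Flagged caches: same list, stricter locking around the update. */
		sketch_lock(s);
		page->next = s->cpu_partial;
		s->cpu_partial = page;
		s->locked_puts++;
		sketch_unlock(s);
	} else {
		/* Default caches: same list, no extra locking in this toy model. */
		page->next = s->cpu_partial;
		s->cpu_partial = page;
		s->lockless_puts++;
	}
}

/* Count the pages currently on the partial list. */
static size_t sketch_partial_len(const struct sketch_cache *s)
{
	size_t n = 0;

	for (const struct sketch_page *p = s->cpu_partial; p; p = p->next)
		n++;
	return n;
}
```

The point of the sketch is only that the per cpu structure stays the same
size for both kinds of cache; the flag changes how the list is updated, not
how many lists there are.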