Date: Fri, 1 Apr 2016 11:35:33 +0900
From: Joonsoo Kim
To: Laura Abbott
Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Andrew Morton, Laura Abbott, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kees Cook
Subject: Re: [RFC][PATCH] mm/slub: Skip CPU slab activation when debugging
Message-ID: <20160401023533.GB13179@js1304-P5Q-DELUXE>
In-Reply-To: <1459205581-4605-1-git-send-email-labbott@fedoraproject.org>

On Mon, Mar 28, 2016 at 03:53:01PM -0700, Laura Abbott wrote:
> The per-cpu slab is designed to be the primary path for allocation in SLUB
> since it assumed allocations will go through the fast path if possible.
> When debugging is enabled, the fast path is disabled and per-cpu
> allocations are not used. The current debugging code path still activates
> the cpu slab for allocations and then immediately deactivates it. This
> is useless work. When a slab is enabled for debugging, skip cpu
> activation.
>
> Signed-off-by: Laura Abbott
> ---
> This is a follow on to the optimization of the debug paths for poisoning
> With this I get ~2 second drop on hackbench -g 20 -l 1000 with slub_debug=P
> and no noticable change with slub_debug=- .

I'd like to know the performance difference between slub_debug=P and
slub_debug=- with this change applied.

Although this patch improves hackbench performance, I'm not sure it's
sufficient for production systems. Concurrent slab allocation requests
will still contend on the node lock on every allocation attempt. So there
would be other use cases where the performance drop from slub_debug=P
cannot be accepted, even though it is a security feature.

How about allowing the cpu partial list for debug cases? It would not
hurt the fast path and would reduce contention on the node lock.

Thanks.
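
To illustrate the node lock concern, here is a toy userspace model, not
kernel code. It only counts how often a shared per-node lock would be
taken. All names (toy_node, toy_cpu, toy_count_locks, the batch parameter)
are hypothetical, and the real cpu partial list caches slabs rather than
a simple object count, so treat this as a rough sketch of the idea only.

```c
#include <assert.h>

/* Toy stand-in for a per-node structure; the counter models lock acquisitions. */
struct toy_node {
	unsigned long lock_acquisitions;
};

/* Toy stand-in for per-cpu state: objects available without the node lock. */
struct toy_cpu {
	unsigned int cached_objects;
};

/* Debug path as described above: every allocation takes the node lock. */
static void toy_alloc_node_only(struct toy_node *n)
{
	n->lock_acquisitions++;
}

/*
 * Assumed alternative: refill a per-cpu cache of 'batch' objects under one
 * lock acquisition, then serve allocations locally until it runs dry.
 */
static void toy_alloc_cpu_partial(struct toy_node *n, struct toy_cpu *c,
				  unsigned int batch)
{
	assert(batch > 0);
	if (c->cached_objects == 0) {
		n->lock_acquisitions++;
		c->cached_objects = batch;
	}
	c->cached_objects--;
}

/* Count node lock acquisitions for 'allocs' allocations; batch == 0 means node-only. */
unsigned long toy_count_locks(unsigned long allocs, unsigned int batch)
{
	struct toy_node n = { 0 };
	struct toy_cpu c = { 0 };
	unsigned long i;

	for (i = 0; i < allocs; i++) {
		if (batch == 0)
			toy_alloc_node_only(&n);
		else
			toy_alloc_cpu_partial(&n, &c, batch);
	}
	return n.lock_acquisitions;
}
```

In this model, toy_count_locks(1000, 0) counts 1000 lock acquisitions,
while toy_count_locks(1000, 10) counts 100. It says nothing about the
actual cost of each acquisition under contention.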