Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S965402AbcJXQbl (ORCPT ); Mon, 24 Oct 2016 12:31:41 -0400
Received: from tex.lwn.net ([70.33.254.29]:52987 "EHLO vena.lwn.net"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S938936AbcJXQbg (ORCPT ); Mon, 24 Oct 2016 12:31:36 -0400
Date: Mon, 24 Oct 2016 10:31:33 -0600
From: Jonathan Corbet
To: Tim Chen
Cc: Andrew Morton, "Huang, Ying", dave.hansen@intel.com,
	ak@linux.intel.com, aaron.lu@intel.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Hugh Dickins, Shaohua Li,
	Minchan Kim, Rik van Riel, Andrea Arcangeli,
	"Kirill A . Shutemov", Vladimir Davydov, Johannes Weiner,
	Michal Hocko, Hillf Danton
Subject: Re: [PATCH v2 2/8] mm/swap: Add cluster lock
Message-ID: <20161024103133.7c1a8f83@lwn.net>
In-Reply-To:
References:
Organization: LWN.net
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2292
Lines: 58

On Thu, 20 Oct 2016 16:31:41 -0700
Tim Chen wrote:

> From: "Huang, Ying"
>
> This patch is to reduce the lock contention of swap_info_struct->lock
> via using a more fine grained lock in swap_cluster_info for some swap
> operations.  swap_info_struct->lock is heavily contended if multiple
> processes reclaim pages simultaneously.  Because there is only one lock
> for each swap device.  While in common configuration, there is only one
> or several swap devices in the system.  The lock protects almost all
> swap related operations.

So I'm looking at this a bit.  Overall it seems like a good thing to do
(from my limited understanding of this area) but I have a probably silly
question...

>  struct swap_cluster_info {
> -	unsigned int data:24;
> -	unsigned int flags:8;
> +	unsigned long data;
>  };
> -#define CLUSTER_FLAG_FREE 1 /* This cluster is free */
> -#define CLUSTER_FLAG_NEXT_NULL 2 /* This cluster has no next cluster */
> +#define CLUSTER_COUNT_SHIFT		8
> +#define CLUSTER_FLAG_MASK		((1UL << CLUSTER_COUNT_SHIFT) - 1)
> +#define CLUSTER_COUNT_MASK		(~CLUSTER_FLAG_MASK)
> +#define CLUSTER_FLAG_FREE		1 /* This cluster is free */
> +#define CLUSTER_FLAG_NEXT_NULL		2 /* This cluster has no next cluster */
> +/* cluster lock, protect cluster_info contents and sis->swap_map */
> +#define CLUSTER_FLAG_LOCK_BIT		2
> +#define CLUSTER_FLAG_LOCK		(1 << CLUSTER_FLAG_LOCK_BIT)

Why the roll-your-own locking and data structures here?  To my naive
understanding, it seems like you could do something like:

	struct swap_cluster_info {
		spinlock_t lock;
		atomic_t count;
		unsigned int flags;
	};

Then you could use proper spinlock operations which, among other things,
would make the realtime folks happier.  That might well help with the
cache-line sharing issues as well.  Some of the count manipulations could
perhaps be done without the lock entirely; similarly, atomic bitops might
save you the locking for some of the flag tweaks - though I'd have to look
more closely to be really sure of that.

The cost, of course, is the growth of this structure, but you've already
noted that the overhead isn't all that high; seems like it could be worth
it.

I assume that I'm missing something obvious here?

Thanks,

jon
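
For illustration only, the two layouts discussed above can be put side by side in a userspace C11 sketch. This is not kernel code and not what the patch does internally: `struct packed_cluster`, `struct split_cluster` and every helper name below are hypothetical, `atomic_flag` and `atomic_uint` merely stand in for `spinlock_t` and `atomic_t`, and the packed variant updates the count with an atomic add purely to keep the sketch simple. The macro values are copied from the quoted patch, except that the lock mask uses `1UL` so that it matches the width of `data`.

```c
#include <assert.h>
#include <stdatomic.h>

/* Bit layout from the quoted patch: flags in the low 8 bits, count above. */
#define CLUSTER_COUNT_SHIFT     8
#define CLUSTER_FLAG_MASK       ((1UL << CLUSTER_COUNT_SHIFT) - 1)
#define CLUSTER_COUNT_MASK      (~CLUSTER_FLAG_MASK)
#define CLUSTER_FLAG_FREE       1
#define CLUSTER_FLAG_NEXT_NULL  2
#define CLUSTER_FLAG_LOCK_BIT   2
#define CLUSTER_FLAG_LOCK       (1UL << CLUSTER_FLAG_LOCK_BIT)

/* The lock bit (value 4) must not collide with the flag values 1 and 2. */
static_assert((CLUSTER_FLAG_LOCK &
	       (CLUSTER_FLAG_FREE | CLUSTER_FLAG_NEXT_NULL)) == 0,
	      "lock bit overlaps a flag value");

/* Hypothetical userspace model of the packed, one-word layout. */
struct packed_cluster {
	_Atomic unsigned long data;
};

static inline void packed_lock(struct packed_cluster *ci)
{
	/* Spin until this caller is the one that sets the lock bit. */
	while (atomic_fetch_or_explicit(&ci->data, CLUSTER_FLAG_LOCK,
					memory_order_acquire) & CLUSTER_FLAG_LOCK)
		;
}

static inline void packed_unlock(struct packed_cluster *ci)
{
	atomic_fetch_and_explicit(&ci->data, ~CLUSTER_FLAG_LOCK,
				  memory_order_release);
}

static inline unsigned long packed_count(struct packed_cluster *ci)
{
	return (atomic_load(&ci->data) & CLUSTER_COUNT_MASK)
		>> CLUSTER_COUNT_SHIFT;
}

static inline void packed_inc_count(struct packed_cluster *ci)
{
	/* Adding 1 << SHIFT bumps the count without touching the flag bits. */
	atomic_fetch_add(&ci->data, 1UL << CLUSTER_COUNT_SHIFT);
}

/* Hypothetical userspace model of the separate-fields layout. */
struct split_cluster {
	atomic_flag lock;	/* stand-in for spinlock_t */
	atomic_uint count;	/* stand-in for atomic_t */
	unsigned int flags;	/* protected by lock */
};

static inline void split_lock(struct split_cluster *ci)
{
	while (atomic_flag_test_and_set_explicit(&ci->lock,
						 memory_order_acquire))
		;
}

static inline void split_unlock(struct split_cluster *ci)
{
	atomic_flag_clear_explicit(&ci->lock, memory_order_release);
}

static inline void split_inc_count(struct split_cluster *ci)
{
	/* The count can change without taking the lock at all. */
	atomic_fetch_add(&ci->count, 1);
}
```

In this model the split layout lets `split_inc_count()` run with no lock held, which is the kind of saving suggested above, while the packed layout keeps everything in a single word. Comparing `sizeof(struct packed_cluster)` with `sizeof(struct split_cluster)` on a given platform gives a rough feel for the growth, though the real `spinlock_t` size depends on the kernel configuration.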