From: Ryan Roberts
To: Andrew Morton, David Hildenbrand, Matthew Wilcox, Huang Ying, Gao Xiang, Yu Zhao, Yang Shi, Michal Hocko, Kefeng Wang, Barry Song <21cnbao@gmail.com>, Chris Li
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 3/6] mm: swap: Simplify struct percpu_cluster
Date: Mon, 11 Mar 2024 15:00:55 +0000
Message-Id: <20240311150058.1122862-4-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20240311150058.1122862-1-ryan.roberts@arm.com>
References: <20240311150058.1122862-1-ryan.roberts@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

struct percpu_cluster stores the index of the cpu's current cluster and the
offset of the next entry that will be allocated for the cpu. These two
pieces of information are redundant because the cluster index is just
(offset / SWAPFILE_CLUSTER). The only reason for explicitly keeping the
cluster index is that the structure used for it also has a flag to indicate
"no cluster". However, this data structure also contains a spin lock, which
is never used in this context. As a side effect, the code copies the
spinlock_t structure, which is questionable coding practice in my view.

So let's clean this up and store only the next offset, and use a sentinel
value (SWAP_NEXT_INVALID) to indicate "no cluster". SWAP_NEXT_INVALID is
chosen to be 0, because 0 will never be seen legitimately; the first page in
the swap file is the swap header, which is always marked bad to prevent it
from being allocated as an entry. This also prevents the cluster to which it
belongs being marked free, so it will never appear on the free list.

This change saves 16 bytes per cpu. And given we are shortly going to extend
this mechanism to be per-cpu-AND-per-order, we will end up saving
16 * 9 = 144 bytes per cpu, which adds up if you have 256 cpus in the
system.

Signed-off-by: Ryan Roberts
---
 include/linux/swap.h |  9 ++++++++-
 mm/swapfile.c        | 22 +++++++++++-----------
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index f2b7f204b968..0cb082bee717 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -260,13 +260,20 @@ struct swap_cluster_info {
 #define CLUSTER_FLAG_FREE 1 /* This cluster is free */
 #define CLUSTER_FLAG_NEXT_NULL 2 /* This cluster has no next cluster */
 
+/*
+ * The first page in the swap file is the swap header, which is always marked
+ * bad to prevent it from being allocated as an entry. This also prevents the
+ * cluster to which it belongs being marked free. Therefore 0 is safe to use as
+ * a sentinel to indicate next is not valid in percpu_cluster.
+ */
+#define SWAP_NEXT_INVALID	0
+
 /*
  * We assign a cluster to each CPU, so each CPU can allocate swap entry from
  * its own cluster and swapout sequentially. The purpose is to optimize swapout
  * throughput.
  */
 struct percpu_cluster {
-	struct swap_cluster_info index; /* Current cluster index */
 	unsigned int next; /* Likely next allocation offset */
 };
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ee7e44cb40c5..3828d81aa6b8 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -609,7 +609,7 @@ scan_swap_map_ssd_cluster_conflict(struct swap_info_struct *si,
 		return false;
 
 	percpu_cluster = this_cpu_ptr(si->percpu_cluster);
-	cluster_set_null(&percpu_cluster->index);
+	percpu_cluster->next = SWAP_NEXT_INVALID;
 	return true;
 }
 
@@ -622,14 +622,14 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 {
 	struct percpu_cluster *cluster;
 	struct swap_cluster_info *ci;
-	unsigned long tmp, max;
+	unsigned int tmp, max;
 
 new_cluster:
 	cluster = this_cpu_ptr(si->percpu_cluster);
-	if (cluster_is_null(&cluster->index)) {
+	tmp = cluster->next;
+	if (tmp == SWAP_NEXT_INVALID) {
 		if (!cluster_list_empty(&si->free_clusters)) {
-			cluster->index = si->free_clusters.head;
-			cluster->next = cluster_next(&cluster->index) *
+			tmp = cluster_next(&si->free_clusters.head) *
 					SWAPFILE_CLUSTER;
 		} else if (!cluster_list_empty(&si->discard_clusters)) {
 			/*
@@ -649,9 +649,7 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 	 * Other CPUs can use our cluster if they can't find a free cluster,
 	 * check if there is still free entry in the cluster
 	 */
-	tmp = cluster->next;
-	max = min_t(unsigned long, si->max,
-		    (cluster_next(&cluster->index) + 1) * SWAPFILE_CLUSTER);
+	max = min_t(unsigned long, si->max, ALIGN(tmp + 1, SWAPFILE_CLUSTER));
 	if (tmp < max) {
 		ci = lock_cluster(si, tmp);
 		while (tmp < max) {
@@ -662,12 +660,13 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
 		unlock_cluster(ci);
 	}
 	if (tmp >= max) {
-		cluster_set_null(&cluster->index);
+		cluster->next = SWAP_NEXT_INVALID;
 		goto new_cluster;
 	}
-	cluster->next = tmp + 1;
 	*offset = tmp;
 	*scan_base = tmp;
+	tmp += 1;
+	cluster->next = tmp < max ? tmp : SWAP_NEXT_INVALID;
 	return true;
 }
 
@@ -3138,8 +3137,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 		}
 		for_each_possible_cpu(cpu) {
 			struct percpu_cluster *cluster;
+
 			cluster = per_cpu_ptr(p->percpu_cluster, cpu);
-			cluster_set_null(&cluster->index);
+			cluster->next = SWAP_NEXT_INVALID;
 		}
 	} else {
 		atomic_inc(&nr_rotate_swap);
-- 
2.25.1