Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757109AbdGXXEs (ORCPT ); Mon, 24 Jul 2017 19:04:48 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:51187 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756975AbdGXXDD (ORCPT ); Mon, 24 Jul 2017 19:03:03 -0400 From: Dennis Zhou To: Tejun Heo , Christoph Lameter , Josef Bacik CC: , , , Dennis Zhou Subject: [PATCH v2 23/23] percpu: update header to contain bitmap allocator explanation. Date: Mon, 24 Jul 2017 19:02:20 -0400 Message-ID: <20170724230220.21774-24-dennisz@fb.com> X-Mailer: git-send-email 2.13.3 In-Reply-To: <20170724230220.21774-1-dennisz@fb.com> References: <20170724230220.21774-1-dennisz@fb.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [192.168.52.123] X-Proofpoint-Spam-Reason: safe X-FB-Internal: Safe X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-07-24_14:,, signatures=0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3309 Lines: 71 From: "Dennis Zhou (Facebook)" The other patches contain a lot of information, so adding this information in a separate patch. It adds my copyright and a brief explanation of how the bitmap allocator works. There is a minor typo as well in the prior explanation so that is fixed. Signed-off-by: Dennis Zhou --- mm/percpu.c | 32 ++++++++++++++++++-------------- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index ffa9da7..a4dd0c8 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -4,6 +4,9 @@ * Copyright (C) 2009 SUSE Linux Products GmbH * Copyright (C) 2009 Tejun Heo * + * Copyright (C) 2017 Facebook Inc. + * Copyright (C) 2017 Dennis Zhou + * * This file is released under the GPLv2 license. * * The percpu allocator handles both static and dynamic areas. Percpu @@ -25,7 +28,7 @@ * * There is special consideration for the first chunk which must handle * the static percpu variables in the kernel image as allocation services - * are not online yet. In short, the first chunk is structure like so: + * are not online yet. In short, the first chunk is structured like so: * * * @@ -34,19 +37,20 @@ * percpu variables from kernel modules. Finally, the dynamic section * takes care of normal allocations. * - * Allocation state in each chunk is kept using an array of integers - * on chunk->map. A positive value in the map represents a free - * region and negative allocated. Allocation inside a chunk is done - * by scanning this map sequentially and serving the first matching - * entry. This is mostly copied from the percpu_modalloc() allocator. - * Chunks can be determined from the address using the index field - * in the page struct. The index field contains a pointer to the chunk. - * - * These chunks are organized into lists according to free_size and - * tries to allocate from the fullest chunk first. Each chunk maintains - * a maximum contiguous area size hint which is guaranteed to be equal - * to or larger than the maximum contiguous area in the chunk. This - * helps prevent the allocator from iterating over chunks unnecessarily. + * The allocator organizes chunks into lists according to free size and + * tries to allocate from the fullest chunk first. Each chunk is managed + * by a bitmap with metadata blocks. The allocation map is updated on + * every allocation and free to reflect the current state while the boundary + * map is only updated on allocation. Each metadata block contains + * information to help mitigate the need to iterate over large portions + * of the bitmap. The reverse mapping from page to chunk is stored in + * the page's index. Lastly, units are lazily backed and grow in unison. + * + * There is a unique conversion that goes on here between bytes and bits. + * Each bit represents a fragment of size PCPU_MIN_ALLOC_SIZE. The chunk + * tracks the number of pages it is responsible for in nr_pages. Helper + * functions are used to convert from between the bytes, bits, and blocks. + * All hints are managed in bits unless explicitly stated. * * To use this allocator, arch code should do the following: * -- 2.9.3