Date: Wed, 6 Mar 2024 21:23:21 +0800
From: Baoquan He
To: rulinhuang
Cc: urezki@gmail.com, akpm@linux-foundation.org, colin.king@intel.com,
    hch@infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    lstoakes@gmail.com, tianyou.li@intel.com, tim.c.chen@intel.com,
    wangyang.guo@intel.com, zhiguo.zhou@intel.com
Subject: Re: [PATCH v7 1/2] mm/vmalloc: Moved macros with no functional change happened
References: <20240301155417.1852290-1-rulin.huang@intel.com>
    <20240301155417.1852290-2-rulin.huang@intel.com>
In-Reply-To: <20240301155417.1852290-2-rulin.huang@intel.com>

Sorry, I missed this patchset in my mailbox.

On 03/01/24 at 10:54am, rulinhuang wrote:
> Moved data structures and basic helpers related to per cpu kva allocator
  ~~~ s/Moved/move/? And the subject too?
> up too to along with these macros with no functional change happened.

Maybe we should add the below line to explain why the move needs to be
done:

  This is in preparation for later VMAP_RAM checking in alloc_vmap_area().

Other than the above nitpicks, this looks good to me. If you update the
patch log and post a new version, please feel free to add:

Reviewed-by: Baoquan He

> 
> Signed-off-by: rulinhuang
> ---
> V6 -> V7: Adjusted the macros
> ---
>  mm/vmalloc.c | 262 +++++++++++++++++++++++++--------------------------
>  1 file changed, 131 insertions(+), 131 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 25a8df497255..fc027a61c12e 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -887,6 +887,137 @@ is_vn_id_valid(unsigned int node_id)
>  	return false;
>  }
>  
> +/*
> + * vmap space is limited especially on 32 bit architectures. Ensure there is
> + * room for at least 16 percpu vmap blocks per CPU.
> + */
> +/*
> + * If we had a constant VMALLOC_START and VMALLOC_END, we'd like to be able
> + * to #define VMALLOC_SPACE		(VMALLOC_END-VMALLOC_START). Guess
> + * instead (we just need a rough idea)
> + */
> +#if BITS_PER_LONG == 32
> +#define VMALLOC_SPACE		(128UL*1024*1024)
> +#else
> +#define VMALLOC_SPACE		(128UL*1024*1024*1024)
> +#endif
> +
> +#define VMALLOC_PAGES		(VMALLOC_SPACE / PAGE_SIZE)
> +#define VMAP_MAX_ALLOC		BITS_PER_LONG	/* 256K with 4K pages */
> +#define VMAP_BBMAP_BITS_MAX	1024	/* 4MB with 4K pages */
> +#define VMAP_BBMAP_BITS_MIN	(VMAP_MAX_ALLOC*2)
> +#define VMAP_MIN(x, y)		((x) < (y) ? (x) : (y)) /* can't use min() */
> +#define VMAP_MAX(x, y)		((x) > (y) ? (x) : (y)) /* can't use max() */
> +#define VMAP_BBMAP_BITS		\
> +		VMAP_MIN(VMAP_BBMAP_BITS_MAX,	\
> +		VMAP_MAX(VMAP_BBMAP_BITS_MIN,	\
> +			VMALLOC_PAGES / roundup_pow_of_two(NR_CPUS) / 16))
> +
> +#define VMAP_BLOCK_SIZE		(VMAP_BBMAP_BITS * PAGE_SIZE)
> +
> +/*
> + * Purge threshold to prevent overeager purging of fragmented blocks for
> + * regular operations: Purge if vb->free is less than 1/4 of the capacity.
> + */
> +#define VMAP_PURGE_THRESHOLD	(VMAP_BBMAP_BITS / 4)
> +
> +#define VMAP_RAM		0x1 /* indicates vm_map_ram area*/
> +#define VMAP_BLOCK		0x2 /* mark out the vmap_block sub-type*/
> +#define VMAP_FLAGS_MASK		0x3
> +
> +struct vmap_block_queue {
> +	spinlock_t lock;
> +	struct list_head free;
> +
> +	/*
> +	 * An xarray requires an extra memory dynamically to
> +	 * be allocated. If it is an issue, we can use rb-tree
> +	 * instead.
> +	 */
> +	struct xarray vmap_blocks;
> +};
> +
> +struct vmap_block {
> +	spinlock_t lock;
> +	struct vmap_area *va;
> +	unsigned long free, dirty;
> +	DECLARE_BITMAP(used_map, VMAP_BBMAP_BITS);
> +	unsigned long dirty_min, dirty_max; /*< dirty range */
> +	struct list_head free_list;
> +	struct rcu_head rcu_head;
> +	struct list_head purge;
> +};
> +
> +/* Queue of free and dirty vmap blocks, for allocation and flushing purposes */
> +static DEFINE_PER_CPU(struct vmap_block_queue, vmap_block_queue);
> +
> +/*
> + * In order to fast access to any "vmap_block" associated with a
> + * specific address, we use a hash.
> + *
> + * A per-cpu vmap_block_queue is used in both ways, to serialize
> + * an access to free block chains among CPUs(alloc path) and it
> + * also acts as a vmap_block hash(alloc/free paths). It means we
> + * overload it, since we already have the per-cpu array which is
> + * used as a hash table. When used as a hash a 'cpu' passed to
> + * per_cpu() is not actually a CPU but rather a hash index.
> + *
> + * A hash function is addr_to_vb_xa() which hashes any address
> + * to a specific index(in a hash) it belongs to. This then uses a
> + * per_cpu() macro to access an array with generated index.
> + *
> + * An example:
> + *
> + *  CPU_1  CPU_2  CPU_0
> + *    |      |      |
> + *    V      V      V
> + * 0     10     20     30     40     50     60
> + * |------|------|------|------|------|------|...
> + *   CPU0   CPU1   CPU2   CPU0   CPU1   CPU2
> + *
> + * - CPU_1 invokes vm_unmap_ram(6), 6 belongs to CPU0 zone, thus
> + *   it access: CPU0/INDEX0 -> vmap_blocks -> xa_lock;
> + *
> + * - CPU_2 invokes vm_unmap_ram(11), 11 belongs to CPU1 zone, thus
> + *   it access: CPU1/INDEX1 -> vmap_blocks -> xa_lock;
> + *
> + * - CPU_0 invokes vm_unmap_ram(20), 20 belongs to CPU2 zone, thus
> + *   it access: CPU2/INDEX2 -> vmap_blocks -> xa_lock.
> + *
> + * This technique almost always avoids lock contention on insert/remove,
> + * however xarray spinlocks protect against any contention that remains.
> + */
> +static struct xarray *
> +addr_to_vb_xa(unsigned long addr)
> +{
> +	int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
> +
> +	return &per_cpu(vmap_block_queue, index).vmap_blocks;
> +}
> +
> +/*
> + * We should probably have a fallback mechanism to allocate virtual memory
> + * out of partially filled vmap blocks. However vmap block sizing should be
> + * fairly reasonable according to the vmalloc size, so it shouldn't be a
> + * big problem.
> + */
> +
> +static unsigned long addr_to_vb_idx(unsigned long addr)
> +{
> +	addr -= VMALLOC_START & ~(VMAP_BLOCK_SIZE-1);
> +	addr /= VMAP_BLOCK_SIZE;
> +	return addr;
> +}
> +
> +static void *vmap_block_vaddr(unsigned long va_start, unsigned long pages_off)
> +{
> +	unsigned long addr;
> +
> +	addr = va_start + (pages_off << PAGE_SHIFT);
> +	BUG_ON(addr_to_vb_idx(addr) != addr_to_vb_idx(va_start));
> +	return (void *)addr;
> +}
> +
>  static __always_inline unsigned long
>  va_size(struct vmap_area *va)
>  {
> @@ -2327,137 +2458,6 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
>  
>  /*** Per cpu kva allocator ***/
>  
> -/*
> - * vmap space is limited especially on 32 bit architectures. Ensure there is
> - * room for at least 16 percpu vmap blocks per CPU.
> - */
> -/*
> - * If we had a constant VMALLOC_START and VMALLOC_END, we'd like to be able
> - * to #define VMALLOC_SPACE		(VMALLOC_END-VMALLOC_START). Guess
> - * instead (we just need a rough idea)
> - */
> -#if BITS_PER_LONG == 32
> -#define VMALLOC_SPACE		(128UL*1024*1024)
> -#else
> -#define VMALLOC_SPACE		(128UL*1024*1024*1024)
> -#endif
> -
> -#define VMALLOC_PAGES		(VMALLOC_SPACE / PAGE_SIZE)
> -#define VMAP_MAX_ALLOC		BITS_PER_LONG	/* 256K with 4K pages */
> -#define VMAP_BBMAP_BITS_MAX	1024	/* 4MB with 4K pages */
> -#define VMAP_BBMAP_BITS_MIN	(VMAP_MAX_ALLOC*2)
> -#define VMAP_MIN(x, y)		((x) < (y) ? (x) : (y)) /* can't use min() */
> -#define VMAP_MAX(x, y)		((x) > (y) ? (x) : (y)) /* can't use max() */
> -#define VMAP_BBMAP_BITS		\
> -		VMAP_MIN(VMAP_BBMAP_BITS_MAX,	\
> -		VMAP_MAX(VMAP_BBMAP_BITS_MIN,	\
> -			VMALLOC_PAGES / roundup_pow_of_two(NR_CPUS) / 16))
> -
> -#define VMAP_BLOCK_SIZE		(VMAP_BBMAP_BITS * PAGE_SIZE)
> -
> -/*
> - * Purge threshold to prevent overeager purging of fragmented blocks for
> - * regular operations: Purge if vb->free is less than 1/4 of the capacity.
> - */
> -#define VMAP_PURGE_THRESHOLD	(VMAP_BBMAP_BITS / 4)
> -
> -#define VMAP_RAM		0x1 /* indicates vm_map_ram area*/
> -#define VMAP_BLOCK		0x2 /* mark out the vmap_block sub-type*/
> -#define VMAP_FLAGS_MASK		0x3
> -
> -struct vmap_block_queue {
> -	spinlock_t lock;
> -	struct list_head free;
> -
> -	/*
> -	 * An xarray requires an extra memory dynamically to
> -	 * be allocated. If it is an issue, we can use rb-tree
> -	 * instead.
> -	 */
> -	struct xarray vmap_blocks;
> -};
> -
> -struct vmap_block {
> -	spinlock_t lock;
> -	struct vmap_area *va;
> -	unsigned long free, dirty;
> -	DECLARE_BITMAP(used_map, VMAP_BBMAP_BITS);
> -	unsigned long dirty_min, dirty_max; /*< dirty range */
> -	struct list_head free_list;
> -	struct rcu_head rcu_head;
> -	struct list_head purge;
> -};
> -
> -/* Queue of free and dirty vmap blocks, for allocation and flushing purposes */
> -static DEFINE_PER_CPU(struct vmap_block_queue, vmap_block_queue);
> -
> -/*
> - * In order to fast access to any "vmap_block" associated with a
> - * specific address, we use a hash.
> - *
> - * A per-cpu vmap_block_queue is used in both ways, to serialize
> - * an access to free block chains among CPUs(alloc path) and it
> - * also acts as a vmap_block hash(alloc/free paths). It means we
> - * overload it, since we already have the per-cpu array which is
> - * used as a hash table. When used as a hash a 'cpu' passed to
> - * per_cpu() is not actually a CPU but rather a hash index.
> - *
> - * A hash function is addr_to_vb_xa() which hashes any address
> - * to a specific index(in a hash) it belongs to. This then uses a
> - * per_cpu() macro to access an array with generated index.
> - *
> - * An example:
> - *
> - *  CPU_1  CPU_2  CPU_0
> - *    |      |      |
> - *    V      V      V
> - * 0     10     20     30     40     50     60
> - * |------|------|------|------|------|------|...
> - *   CPU0   CPU1   CPU2   CPU0   CPU1   CPU2
> - *
> - * - CPU_1 invokes vm_unmap_ram(6), 6 belongs to CPU0 zone, thus
> - *   it access: CPU0/INDEX0 -> vmap_blocks -> xa_lock;
> - *
> - * - CPU_2 invokes vm_unmap_ram(11), 11 belongs to CPU1 zone, thus
> - *   it access: CPU1/INDEX1 -> vmap_blocks -> xa_lock;
> - *
> - * - CPU_0 invokes vm_unmap_ram(20), 20 belongs to CPU2 zone, thus
> - *   it access: CPU2/INDEX2 -> vmap_blocks -> xa_lock.
> - *
> - * This technique almost always avoids lock contention on insert/remove,
> - * however xarray spinlocks protect against any contention that remains.
> - */
> -static struct xarray *
> -addr_to_vb_xa(unsigned long addr)
> -{
> -	int index = (addr / VMAP_BLOCK_SIZE) % num_possible_cpus();
> -
> -	return &per_cpu(vmap_block_queue, index).vmap_blocks;
> -}
> -
> -/*
> - * We should probably have a fallback mechanism to allocate virtual memory
> - * out of partially filled vmap blocks. However vmap block sizing should be
> - * fairly reasonable according to the vmalloc size, so it shouldn't be a
> - * big problem.
> - */
> -
> -static unsigned long addr_to_vb_idx(unsigned long addr)
> -{
> -	addr -= VMALLOC_START & ~(VMAP_BLOCK_SIZE-1);
> -	addr /= VMAP_BLOCK_SIZE;
> -	return addr;
> -}
> -
> -static void *vmap_block_vaddr(unsigned long va_start, unsigned long pages_off)
> -{
> -	unsigned long addr;
> -
> -	addr = va_start + (pages_off << PAGE_SHIFT);
> -	BUG_ON(addr_to_vb_idx(addr) != addr_to_vb_idx(va_start));
> -	return (void *)addr;
> -}
> -
>  /**
>   * new_vmap_block - allocates new vmap_block and occupies 2^order pages in this
>   *                  block. Of course pages number can't exceed VMAP_BBMAP_BITS
> -- 
> 2.43.0
> 
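
By the way, not a review comment, just a side note on the moved sizing
macros: they can be sanity-checked from userspace. Below is a tiny
standalone sketch for illustration only; the configuration values
(64-bit, 4K pages, NR_CPUS=64) and the simplified roundup_pow_of_two()
are made-up assumptions, not taken from any real kernel config:

  /*
   * Toy userspace program: evaluate the moved vmap block sizing macros
   * for one hypothetical configuration. Not kernel code.
   */
  #include <stdio.h>

  #define BITS_PER_LONG         64
  #define PAGE_SIZE             4096UL
  #define NR_CPUS               64UL
  #define roundup_pow_of_two(x) (x)   /* 64 is already a power of two */

  #if BITS_PER_LONG == 32
  #define VMALLOC_SPACE         (128UL*1024*1024)
  #else
  #define VMALLOC_SPACE         (128UL*1024*1024*1024)
  #endif

  #define VMALLOC_PAGES         (VMALLOC_SPACE / PAGE_SIZE)
  #define VMAP_MAX_ALLOC        BITS_PER_LONG
  #define VMAP_BBMAP_BITS_MAX   1024
  #define VMAP_BBMAP_BITS_MIN   (VMAP_MAX_ALLOC*2)
  #define VMAP_MIN(x, y)        ((x) < (y) ? (x) : (y))
  #define VMAP_MAX(x, y)        ((x) > (y) ? (x) : (y))
  #define VMAP_BBMAP_BITS \
                VMAP_MIN(VMAP_BBMAP_BITS_MAX, \
                VMAP_MAX(VMAP_BBMAP_BITS_MIN, \
                        VMALLOC_PAGES / roundup_pow_of_two(NR_CPUS) / 16))
  #define VMAP_BLOCK_SIZE       (VMAP_BBMAP_BITS * PAGE_SIZE)

  int main(void)
  {
          /* 128G / 4K = 32M pages; 32M / 64 CPUs / 16 = 32768 bits,
           * clamped down to VMAP_BBMAP_BITS_MAX (1024). */
          printf("VMAP_BBMAP_BITS = %lu\n", (unsigned long)VMAP_BBMAP_BITS);
          /* 1024 bits * 4K pages = 4MB per vmap block */
          printf("VMAP_BLOCK_SIZE = %lu\n", (unsigned long)VMAP_BLOCK_SIZE);
          return 0;
  }

With these values the rough per-CPU guess (32768 bits) is clamped by
VMAP_BBMAP_BITS_MAX, so each vmap block tracks 1024 pages, i.e. the 4MB
mentioned in the comment next to VMAP_BBMAP_BITS_MAX.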