From: "zhaoyang.huang"
To: Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, Baoquan He, Thomas Gleixner, hailong liu, Zhaoyang Huang
Subject: [Resend PATCHv4 1/1] mm: fix incorrect vbq reference in purge_fragmented_block
Date: Fri, 7 Jun 2024 10:31:16 +0800
Message-ID: <20240607023116.1720640-1-zhaoyang.huang@unisoc.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Zhaoyang Huang

vmalloc area runs out in our ARM64 system during an erofs test as vm_map_ram
failed[1]. Following the debug log, we find that vm_map_ram()->vb_alloc()
allocates a new vb->va, which corresponds to a 4MB vmalloc area, because
list_for_each_entry_rcu returns immediately when vbq->free->next points to
vbq->free. That is to say, 65536 page faults after the list is broken will
exhaust the whole vmalloc area. This is caused by a vbq->free->next that
points to vbq->free, which prevents list_for_each_entry_rcu from iterating
the list and finding the BUG.

[1]
PID: 1 TASK: ffffff80802b4e00 CPU: 6 COMMAND: "init"
 #0 [ffffffc08006afe0] __switch_to at ffffffc08111d5cc
 #1 [ffffffc08006b040] __schedule at ffffffc08111dde0
 #2 [ffffffc08006b0a0] schedule at ffffffc08111e294
 #3 [ffffffc08006b0d0] schedule_preempt_disabled at ffffffc08111e3f0
 #4 [ffffffc08006b140] __mutex_lock at ffffffc08112068c
 #5 [ffffffc08006b180] __mutex_lock_slowpath at ffffffc08111f8f8
 #6 [ffffffc08006b1a0] mutex_lock at ffffffc08111f834
 #7 [ffffffc08006b1d0] reclaim_and_purge_vmap_areas at ffffffc0803ebc3c
 #8 [ffffffc08006b290] alloc_vmap_area at ffffffc0803e83fc
 #9 [ffffffc08006b300] vm_map_ram at ffffffc0803e78c0

Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")

For the detailed reason why the list is broken, please refer to the URL below:
https://lore.kernel.org/all/20240531024820.5507-1-hailong.liu@oppo.com/

Suggested-by: Hailong.Liu
Signed-off-by: Zhaoyang Huang
---
v2: introduce cpu in vmap_block to record the right CPU number
v3: use get_cpu/put_cpu to prevent scheduling between cores
v4: replace get_cpu/put_cpu with another API to avoid disabling preemption
---
---
 mm/vmalloc.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 22aa63f4ef63..89eb034f4ac6 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2458,6 +2458,7 @@ struct vmap_block {
 	struct list_head free_list;
 	struct rcu_head rcu_head;
 	struct list_head purge;
+	unsigned int cpu;
 };
 
 /* Queue of free and dirty vmap blocks, for allocation and flushing purposes */
@@ -2585,8 +2586,15 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 		free_vmap_area(va);
 		return ERR_PTR(err);
 	}
-
-	vbq = raw_cpu_ptr(&vmap_block_queue);
+	/*
+	 * list_add_tail_rcu could happen in another core
+	 * rather than vb->cpu due to task migration, which
+	 * is safe as list_add_tail_rcu will ensure the list's
+	 * integrity together with list_for_each_rcu from read
+	 * side.
+	 */
+	vb->cpu = raw_smp_processor_id();
+	vbq = per_cpu_ptr(&vmap_block_queue, vb->cpu);
 	spin_lock(&vbq->lock);
 	list_add_tail_rcu(&vb->free_list, &vbq->free);
 	spin_unlock(&vbq->lock);
@@ -2614,9 +2622,10 @@ static void free_vmap_block(struct vmap_block *vb)
 }
 
 static bool purge_fragmented_block(struct vmap_block *vb,
-		struct vmap_block_queue *vbq, struct list_head *purge_list,
-		bool force_purge)
+		struct list_head *purge_list, bool force_purge)
 {
+	struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, vb->cpu);
+
 	if (vb->free + vb->dirty != VMAP_BBMAP_BITS ||
 	    vb->dirty == VMAP_BBMAP_BITS)
 		return false;
@@ -2664,7 +2673,7 @@ static void purge_fragmented_blocks(int cpu)
 			continue;
 
 		spin_lock(&vb->lock);
-		purge_fragmented_block(vb, vbq, &purge, true);
+		purge_fragmented_block(vb, &purge, true);
 		spin_unlock(&vb->lock);
 	}
 	rcu_read_unlock();
@@ -2801,7 +2810,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
 			 * not purgeable, check whether there is dirty
 			 * space to be flushed.
 			 */
-			if (!purge_fragmented_block(vb, vbq, &purge_list, false) &&
+			if (!purge_fragmented_block(vb, &purge_list, false) &&
 			    vb->dirty_max && vb->dirty != VMAP_BBMAP_BITS) {
 				unsigned long va_start = vb->va->va_start;
 				unsigned long s, e;
-- 
2.25.1
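
For readers who want the idea without the kernel context: the patch records the owning queue in the block when it is inserted, and derives the queue from the block when it is removed, instead of trusting whichever queue the caller happens to be walking. The following is only a minimal user-space sketch of that pattern, not kernel code. The names block, block_queue, NR_QUEUES, block_enqueue, block_purge and demo are hypothetical, the list helpers are simplified stand-ins for the kernel list API, and locking and RCU are left out entirely.

```c
#include <assert.h>

#define NR_QUEUES 4	/* hypothetical stand-in for the number of CPUs */

/* Minimal circular doubly linked list, modelled loosely on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

/* Stand-in for a per-CPU queue: a free list plus per-queue state. */
struct block_queue {
	struct list_node free;
	int nr_free;
};

/* Stand-in for a block that remembers which queue it was added to. */
struct block {
	struct list_node free_list;
	unsigned int cpu;
};

static struct block_queue queues[NR_QUEUES];

static void queues_init(void)
{
	for (int i = 0; i < NR_QUEUES; i++) {
		list_init(&queues[i].free);
		queues[i].nr_free = 0;
	}
}

/* Insert: record the owning queue index in the block itself. */
static void block_enqueue(struct block *b, unsigned int cpu)
{
	assert(cpu < NR_QUEUES);
	b->cpu = cpu;
	list_add_tail(&b->free_list, &queues[cpu].free);
	queues[cpu].nr_free++;
}

/* Remove: look the queue up from the block, not from the caller. */
static void block_purge(struct block *b)
{
	struct block_queue *q = &queues[b->cpu];

	list_del(&b->free_list);
	q->nr_free--;
}

/* Enqueue on queue 2, purge without naming a queue, report what is left. */
int demo(void)
{
	struct block b;

	queues_init();
	block_enqueue(&b, 2);
	block_purge(&b);
	return queues[2].nr_free;
}
```

Because block_purge() takes no queue argument, a caller that is iterating a different queue cannot update the wrong queue's state, which is the property the patch restores for purge_fragmented_block().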