Received: by 2002:a6b:500f:0:0:0:0:0 with SMTP id e15csp1682600iob; Thu, 19 May 2022 11:52:07 -0700 (PDT) X-Google-Smtp-Source: ABdhPJymzgkk4k7DIK33/j06uUVJTj/vx5mfvAy3fkLbK7lFUr29hmVTFMoK5Rh3qQhwh6X1AeZP X-Received: by 2002:a63:1b0e:0:b0:3c2:8205:17a9 with SMTP id b14-20020a631b0e000000b003c2820517a9mr5190119pgb.192.1652986327245; Thu, 19 May 2022 11:52:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1652986327; cv=none; d=google.com; s=arc-20160816; b=XahFLi5Hkv5wWKUs8lpr9ikYRc0OAWS/94Td69hXxRFWOjUm4oVv0qd+IWQPHMDnaW sv6DnulbJ+irfKZV1ktFt3E3Hvi0ZPJVLXIZAP6EFANVSSth55jchGH//ZViyuBGBh44 zZweT7pKFxY7Yp1Ok2M0nwHwTatXX0zmKeqviGgutVCc8G39GemiI7A68alZEGNlZay0 XL0Y3W2ZKCT+UJ28a5LyQUX7hkU4Wm3U8cLmYtBSkDkgMEAxiVOAFWHr5QyMmOW54ntS 16LlRwRHEAl1xB1ZSslrxqv8Nk8Z4y7PUwS+RUXqs9NQPRAJoqsVIPnIhCx8BOVhoIZG +VFQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:message-id:date:cc:to:from:subject:organization :dkim-signature; bh=p92fXrJQ4gGF6gXdL4rpxaBMAzMtQCdZw00jmCIxVH0=; b=giTnbq9xVGpR3GnCYg/JbxcDaFS5T9FUyf6KQzvRYzTBtvW8KULmKzmR7xGtk/CNHQ nqpTw9vFVJfk6LIfVfULRT8MpD9c/jv2eKjo2LDX8xaCsj/+68kLkSFYn7y50n1pFnXn qTk2zrgLt3Azr4CnhxaBEFYMu21vQJkMeOe9xNa38flhv57/r5tXfnMDJ64tbci+xwIq HuNWS4QKJC08Xr5UCPXC20+BeUHcRvOdkXmpnp+QReJLVHsE3R730A1PUlkTeuTjLMz8 rDv/Gl2V62vqtve/nSKj+e6ewl2A5F8bHmlS5I1G6n8S+PJkrTMAI7KMIiMo0huGY3c+ Ozog== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=RfvD9ZEb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y26-20020a634b1a000000b003c1abf9ef96si7230981pga.777.2022.05.19.11.51.52; Thu, 19 May 2022 11:52:07 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=RfvD9ZEb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234377AbiESIun (ORCPT + 99 others); Thu, 19 May 2022 04:50:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231838AbiESIuk (ORCPT ); Thu, 19 May 2022 04:50:40 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 31E009C2E7 for ; Thu, 19 May 2022 01:50:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1652950238; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=p92fXrJQ4gGF6gXdL4rpxaBMAzMtQCdZw00jmCIxVH0=; b=RfvD9ZEbDJQn76vLk72WCqEDr+MJryQolwXrvX16OCmdpHaT32WOCBMnKpuvvZOyk0hSrZ RjD4S7txL+Ky5t6PXYU8SAJFRLk7Qs3BCxj6eLsEUNmee4PHmQcxs0A5YhCaNJSnH6KFZ2 MCm3wCuBo3Vu/OwBxs4i2RrloLljDK4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-638-AYYgi-E3OmePuQW2dI8JAw-1; Thu, 19 May 2022 04:50:33 -0400 X-MC-Unique: AYYgi-E3OmePuQW2dI8JAw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id BD516802803; Thu, 19 May 2022 08:50:32 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.8]) by smtp.corp.redhat.com (Postfix) with ESMTP id 95948400E114; Thu, 19 May 2022 08:50:31 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH] assoc_array: Fix BUG_ON during garbage collect From: David Howells To: torvalds@linux-foundation.org Cc: stable@vger.kernel.org, Stephen Brennan , Jarkko Sakkinen , Andrew Morton , keyrings@vger.kernel.org, dhowells@redhat.com, linux-kernel@vger.kernel.org Date: Thu, 19 May 2022 09:50:30 +0100 Message-ID: <165295023086.3361286.8662079860706628540.stgit@warthog.procyon.org.uk> User-Agent: StGit/1.4 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.84 on 10.11.54.2 X-Spam-Status: No, score=-3.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Stephen Brennan A rare BUG_ON triggered in assoc_array_gc: [3430308.818153] kernel BUG at lib/assoc_array.c:1609! Which corresponded to the statement currently at line 1593 upstream: BUG_ON(assoc_array_ptr_is_meta(p)); Using the data from the core dump, I was able to generate a userspace reproducer[1] and determine the cause of the bug. [1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc After running the iterator on the entire branch, an internal tree node looked like the following: NODE (nr_leaves_on_branch: 3) SLOT [0] NODE (2 leaves) SLOT [1] NODE (1 leaf) SLOT [2..f] NODE (empty) In the userspace reproducer, the pr_devel output when compressing this node was: -- compress node 0x5607cc089380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] after: 3 At slot 0, an internal node with 2 leaves could not be folded into the node, because there was only one available slot (slot 0). Thus, the internal node was retained. At slot 1, the node had one leaf, and was able to be folded in successfully. The remaining nodes had no leaves, and so were removed. By the end of the compression stage, there were 14 free slots, and only 3 leaf nodes. The tree was ascended and then its parent node was compressed. When this node was seen, it could not be folded, due to the internal node it contained. The invariant for compression in this function is: whenever nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all leaf nodes. The compression step currently cannot guarantee this, given the corner case shown above. To fix this issue, retry compression whenever we have retained a node, and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second compression will then allow the node in slot 1 to be folded in, satisfying the invariant. Below is the output of the reproducer once the fix is applied: -- compress node 0x560e9c562380 -- free=0, leaves=0 [0] retain node 2/1 [nx 0] [1] fold node 1/1 [nx 0] [2] fold node 0/1 [nx 2] [3] fold node 0/2 [nx 2] [4] fold node 0/3 [nx 2] [5] fold node 0/4 [nx 2] [6] fold node 0/5 [nx 2] [7] fold node 0/6 [nx 2] [8] fold node 0/7 [nx 2] [9] fold node 0/8 [nx 2] [10] fold node 0/9 [nx 2] [11] fold node 0/10 [nx 2] [12] fold node 0/11 [nx 2] [13] fold node 0/12 [nx 2] [14] fold node 0/13 [nx 2] [15] fold node 0/14 [nx 2] internal nodes remain despite enough space, retrying -- compress node 0x560e9c562380 -- free=14, leaves=1 [0] fold node 2/15 [nx 0] after: 3 Changes ======= DH: - Use false instead of 0. - Reorder the inserted lines in a couple of places to put retained before next_slot. ver #2) - Fix typo in pr_devel, correct comparison to "<=" Fixes: 3cb989501c26 ("Add a generic associative array implementation.") Cc: Signed-off-by: Stephen Brennan Signed-off-by: David Howells cc: Jarkko Sakkinen cc: Andrew Morton cc: keyrings@vger.kernel.org Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.com/ # v1 Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.com/ # v2 --- lib/assoc_array.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/lib/assoc_array.c b/lib/assoc_array.c index 079c72e26493..ca0b4f360c1a 100644 --- a/lib/assoc_array.c +++ b/lib/assoc_array.c @@ -1461,6 +1461,7 @@ int assoc_array_gc(struct assoc_array *array, struct assoc_array_ptr *cursor, *ptr; struct assoc_array_ptr *new_root, *new_parent, **new_ptr_pp; unsigned long nr_leaves_on_tree; + bool retained; int keylen, slot, nr_free, next_slot, i; pr_devel("-->%s()\n", __func__); @@ -1536,6 +1537,7 @@ int assoc_array_gc(struct assoc_array *array, goto descend; } +retry_compress: pr_devel("-- compress node %p --\n", new_n); /* Count up the number of empty slots in this node and work out the @@ -1553,6 +1555,7 @@ int assoc_array_gc(struct assoc_array *array, pr_devel("free=%d, leaves=%lu\n", nr_free, new_n->nr_leaves_on_branch); /* See what we can fold in */ + retained = false; next_slot = 0; for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) { struct assoc_array_shortcut *s; @@ -1602,9 +1605,14 @@ int assoc_array_gc(struct assoc_array *array, pr_devel("[%d] retain node %lu/%d [nx %d]\n", slot, child->nr_leaves_on_branch, nr_free + 1, next_slot); + retained = true; } } + if (retained && new_n->nr_leaves_on_branch <= ASSOC_ARRAY_FAN_OUT) { + pr_devel("internal nodes remain despite enough space, retrying\n"); + goto retry_compress; + } pr_devel("after: %lu\n", new_n->nr_leaves_on_branch); nr_leaves_on_tree = new_n->nr_leaves_on_branch;