Received: by 2002:a05:6358:489b:b0:bb:da1:e618 with SMTP id x27csp5796124rwn; Mon, 12 Sep 2022 14:38:29 -0700 (PDT) X-Google-Smtp-Source: AA6agR5MbCp1ShgcYpZA6VFIXcfCcz8QtPsh3VyfX4JfQwcIZcQwwWsWtD5u7iQW9zLikVcDVv81 X-Received: by 2002:a17:906:4fc4:b0:73d:d4e9:2d6e with SMTP id i4-20020a1709064fc400b0073dd4e92d6emr19933830ejw.165.1663018709518; Mon, 12 Sep 2022 14:38:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663018709; cv=none; d=google.com; s=arc-20160816; b=fulZ24Kq8zfw/KCQDtc80uBJDTd64LrDNqAjCxmn/VLZaglNRTt9ME3GWg6taiQVMf bvWTk2TWmB4Qwa1/qHkwXH55uovL0ObsAHA/mF3b/EwqH4qnkeKWXtXNlQ11mmr74AUH oI2uLrLGyuOMNVc5/2GvFqWGZeVnMT2THYZN0w1u9wVMrc8JCMjNVXD2Rm7nPZ3ngWbv pZcWBda+DdBNxs32F9S3KnHZ0tQgaXYmqJjc+RfXS8H21Eipf0xJpZ2UK23AoDaBvwcW 83sMeTFvKEcSYXHLNYFu5R9/cEjHTJOmJP8gaxLK9G8kAE+qotilAlNBAjlYNDH1Wn8v 5sEg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=7dKzDwRUlmLye7xEXGjGhQjyTzYuCruwsDq3s5BYt1k=; b=QEP+6wyri3XNsfW2OQON78EwA5OOu9Q/It7yBg1LlXMpibeaDJtogcsz134MKzmiUq SPJLAyex3nGceWIlQjjYoqa5qA5fC7yGbwKnNcb5fM51vVqgiGRiUrEOSEcnva6Oj5N4 KpmWVqoI7EvxB0FrEmLg20dkpE8gKX/BlLAVb4+MnQkFbywEGVPRjQFD++/qmqVlr5OV j4P1yl9ZDZ+ecHtgfmz7OQE3ARE6nd6Z/uooJWS4U8mk4yLJN4koFZP7qFL8TfA6A/gI 8URIHsU8aBSiJGNZM9ORR9dUkjHJRDZ8CRX93VqC26jsqH8swl240p//nz9S5ChKaNmh z9MA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ht22-20020a170907609600b007304fac484fsi7931901ejc.466.2022.09.12.14.38.03; Mon, 12 Sep 2022 14:38:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229737AbiILVZC (ORCPT + 99 others); Mon, 12 Sep 2022 17:25:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55100 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229680AbiILVZA (ORCPT ); Mon, 12 Sep 2022 17:25:00 -0400 Received: from cae.in-ulm.de (cae.in-ulm.de [217.10.14.231]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id E1B70DFAA for ; Mon, 12 Sep 2022 14:24:55 -0700 (PDT) Received: by cae.in-ulm.de (Postfix, from userid 1000) id 222F214011C; Mon, 12 Sep 2022 23:24:52 +0200 (CEST) Date: Mon, 12 Sep 2022 23:24:52 +0200 From: "Christian A. Ehrhardt" To: Tejun Heo Cc: linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove Message-ID: References: <20220907200811.654034-1-lk@c--e.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_NONE, SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, Tejun, On Fri, Sep 09, 2022 at 12:25:36AM +0200, Christian A. Ehrhardt wrote: > > Hello Tejun, > > On Thu, Sep 08, 2022 at 07:14:43AM -1000, Tejun Heo wrote: > > Hello, Christian. > > > > On Wed, Sep 07, 2022 at 10:08:11PM +0200, Christian A. Ehrhardt wrote: > > > Concurrent calls to __kernfs_remove can race on the removal > > > of the root node: The race occurs if the root node(kn) is freed > > > during kernfs_drain. The child node(pos) is explicitly protected > > > with an additional ref count. Do the same for the root node. > > > > I don't think this is right. We don't support parallel invocations of > > __kernfs_remove() this way. If @kn can be freed during kernfs_drain(), it > > also means that it can be freed before kernfs_rwsem is grabbed in > > kernfs_remove(). > > Point taken. However, the syzkaller reproducer reliably triggers > the bug without the patch and the bug is gone with the patch. > > > The caller must be responsible for ensuring that @kn > > remains allocated. Otherwise, it can't be made reliable. > > In this case the caller of __kernfs_remove is not kernfs_remove but > kernfs_remove_by_name_ns and it fails to take a reference for the > node that it looks up and deletes. Thus a second call to > kernfs_remove_by_name_ns can remove the node while kernfs_drain > drops the semaphore. > > I'll post an updated patch tomorrow. Sorry, no patch (yet). But here's the whole story of the initial syzkaller report (form top to bottom). I'm not sure where we go wrong but I think several places could do better here: The code in net/9p/client.c creates one kmem-cache per client session. All of these kmem caches use the same name ("9p-fcall-cache"). Is it ok to create several kmem caches with the same name? My feeling is that this is somewhat unexpected but should probably be allowed. In the setup in question slab caches are not merged. Thus the slub code uses the kmem cache name directly to create the sysfs directory for the slub cache. If we allow multiple kmem caches with the same name the slub code should somehow make the sysfs names unique (e.g. by adding a serial numer to the name), right? Before creating the sysfs directory the slub code uses sysfs_remove_link (aka kernfs_remove_by_name) with the following comment: "If we have a leftover link remove it." In fact these "leftover"s are the sysfs files of an active kmem cache with the same name. Additionally, sysfs_remove_link() looks like a bit of a misnomer as it removes whatever exists under the given name. This may be a symlink but it can be an entire directory hierarchy, too. Is this intentional? At least it's been like that forever. If kmem cache creation is done in parallel we can now have concurrent invocations of kernfs_remove_by_name_ns() for the same parent and the same name. This is what eventually causes the race. The race is possible as kernfs_remove_by_name_ns() looks up the name of the target in its parent but does not acquire a ref count on the target before calling __kernfs_remove(). __kernfs_remove() may drop the kernfs_rwsem in kernfs_drain(). Thus a concurrent call to __kernfs_remove can complete the removal except for the nodes that the first instance of __kernfs_remove() holds refs for. As explained above no ref is held on the root of the removed tree. This causes the use-after-free that KASAN sees and complains about. For kernfs_remove it is reasonable to expect the caller to hold some kind of reference to prevent this type of race and from a quick check, the callers seem to get this correct. The only exception that I could find is kernfs_remove_by_name_ns. This is easy to fix if kernfs_remove_by_name_ns() hold a reference on the root node across the call to __kernfs_remove(). IMHO this should be done. Should I sent a patch? Thanks for bearing with me. Oppinons? regards Christian