Date: Wed, 6 Dec 2023 13:51:04 -0800
From: Saeed Mahameed
To: Shifeng Li
Cc: saeedm@nvidia.com, leon@kernel.org, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, eranbe@mellanox.com, moshe@mellanox.com, netdev@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, dinghui@sangfor.com.cn, lishifeng1992@126.com, Moshe Shemesh
Subject: Re: [PATCH net v4] net/mlx5e: Fix a race in command alloc flow
References: <20231202080126.1167237-1-lishifeng@sangfor.com.cn>
In-Reply-To: <20231202080126.1167237-1-lishifeng@sangfor.com.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

On 02 Dec 00:01, Shifeng Li wrote:
>Fix a cmd->ent use after free due to a race on command entry.
>Such race occurs when one of the commands releases its last refcount and
>frees its index and entry while another process running command flush
>flow takes refcount to this command entry. The process which handles
>commands flush may see this command as needed to be flushed if the other
>process allocated a ent->idx but didn't set ent to cmd->ent_arr in
>cmd_work_handler(). Fix it by moving the assignment of cmd->ent_arr into
>the spin lock.
>
>[70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
>[70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361
>[70013.081968]
>[70013.082028] Workqueue: events aer_isr
>[70013.082053] Call Trace:
>[70013.082067]  dump_stack+0x8b/0xbb
>[70013.082086]  print_address_description+0x6a/0x270
>[70013.082102]  kasan_report+0x179/0x2c0
>[70013.082173]  mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
>[70013.082267]  mlx5_cmd_flush+0x80/0x180 [mlx5_core]
>[70013.082304]  mlx5_enter_error_state+0x106/0x1d0 [mlx5_core]
>[70013.082338]  mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core]
>[70013.082377]  remove_one+0x200/0x2b0 [mlx5_core]
>[70013.082409]  pci_device_remove+0xf3/0x280
>[70013.082439]  device_release_driver_internal+0x1c3/0x470
>[70013.082453]  pci_stop_bus_device+0x109/0x160
>[70013.082468]  pci_stop_and_remove_bus_device+0xe/0x20
>[70013.082485]  pcie_do_fatal_recovery+0x167/0x550
>[70013.082493]  aer_isr+0x7d2/0x960
>[70013.082543]  process_one_work+0x65f/0x12d0
>[70013.082556]  worker_thread+0x87/0xb50
>[70013.082571]  kthread+0x2e9/0x3a0
>[70013.082592]  ret_from_fork+0x1f/0x40
>
>The logical relationship of this error is as follows:
>
>             aer_recover_work              |          ent->work
>-------------------------------------------+------------------------------
>aer_recover_work_func                      |
>|- pcie_do_recovery                        |
>  |- report_error_detected                 |
>    |- mlx5_pci_err_detected               |cmd_work_handler
>      |- mlx5_enter_error_state            |  |- cmd_alloc_index
>        |- enter_error_state               |    |- lock cmd->alloc_lock
>          |- mlx5_cmd_flush                |    |- clear_bit
>            |- mlx5_cmd_trigger_completions|    |- unlock cmd->alloc_lock
>              |- lock cmd->alloc_lock      |
>              |- vector = ~dev->cmd.vars.bitmask
>              |- for_each_set_bit          |
>                |- cmd_ent_get(cmd->ent_arr[i]) (UAF)
>              |- unlock cmd->alloc_lock    |  |- cmd->ent_arr[ent->idx]=ent
>
>The cmd->ent_arr[ent->idx] assignment and the bit clearing are not
>protected by the cmd->alloc_lock in cmd_work_handler().
>
>Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
>Reviewed-by: Moshe Shemesh
>Signed-off-by: Shifeng Li

LGTM, Applied to net-mlx5.

Thanks,
Saeed.
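
For readers following the thread, the locking pattern described in the commit message can be sketched in userspace C. This is only an illustration under stated assumptions, not the mlx5 code: a pthread mutex stands in for the cmd->alloc_lock spinlock, and MAX_CMDS, struct ent, struct cmd and all function names below are hypothetical. The point it shows is that clearing the free bit and storing the entry in ent_arr happen in the same critical section, so the flush side, which takes the same lock, should never see an allocated index without its matching entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CMDS 8 /* hypothetical slot count, chosen for the sketch */

struct ent {
	int idx;
	int refcount;
};

struct cmd {
	pthread_mutex_t alloc_lock;   /* stands in for the alloc_lock spinlock */
	unsigned long bitmask;        /* a set bit marks a free slot */
	struct ent *ent_arr[MAX_CMDS];
};

static void cmd_init(struct cmd *cmd)
{
	pthread_mutex_init(&cmd->alloc_lock, NULL);
	cmd->bitmask = (1UL << MAX_CMDS) - 1; /* every slot starts free */
	for (int i = 0; i < MAX_CMDS; i++)
		cmd->ent_arr[i] = NULL;
}

/* Claim the first free slot and publish ent in one critical section.
 * Returns the slot index, or -1 if no slot is free. */
static int cmd_alloc_index_and_publish(struct cmd *cmd, struct ent *ent)
{
	int ret = -1;

	pthread_mutex_lock(&cmd->alloc_lock);
	for (int i = 0; i < MAX_CMDS; i++) {
		if (cmd->bitmask & (1UL << i)) {
			cmd->bitmask &= ~(1UL << i); /* mark slot in use */
			ent->idx = i;
			cmd->ent_arr[i] = ent;       /* publish under the same lock */
			ret = i;
			break;
		}
	}
	pthread_mutex_unlock(&cmd->alloc_lock);
	return ret;
}

/* Unpublish the entry and return its slot to the free mask. */
static void cmd_free_index(struct cmd *cmd, int idx)
{
	pthread_mutex_lock(&cmd->alloc_lock);
	cmd->ent_arr[idx] = NULL;
	cmd->bitmask |= 1UL << idx;
	pthread_mutex_unlock(&cmd->alloc_lock);
}

/* Flush side: take a reference on every in-use slot.
 * Returns the number of entries referenced. */
static int cmd_flush_get_all(struct cmd *cmd)
{
	int n = 0;

	pthread_mutex_lock(&cmd->alloc_lock);
	unsigned long vector = ~cmd->bitmask & ((1UL << MAX_CMDS) - 1);
	for (int i = 0; i < MAX_CMDS; i++) {
		if (vector & (1UL << i)) {
			/* Holds because allocation publishes under the lock. */
			assert(cmd->ent_arr[i] != NULL);
			cmd->ent_arr[i]->refcount++;
			n++;
		}
	}
	pthread_mutex_unlock(&cmd->alloc_lock);
	return n;
}
```

If the ent_arr store were moved to after the unlock in cmd_alloc_index_and_publish(), a concurrent cmd_flush_get_all() could observe the cleared bit while ent_arr[i] still held NULL or a stale pointer, which is the window the diagram in the quoted commit message describes.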