Message-ID: <2bbc4c40-ff8b-4243-9987-dc7d5502a37c@kernel.org>
Date: Thu, 7 Dec 2023 08:54:51 -0700
Subject: Re: [PATCH V3 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
To: Jakub Kicinski, Aron Silverton
Cc: Greg Kroah-Hartman, Saeed Mahameed, Jason Gunthorpe, Arnd Bergmann, Leon Romanovsky, Jiri Pirko, Leonid Bloch, Itay Avraham, linux-kernel@vger.kernel.org, Saeed Mahameed
From: David Ahern
In-Reply-To: <20231205204855.52fa5cc1@kernel.org>

On 12/5/23 9:48 PM, Jakub Kicinski wrote:
> On Tue, 5 Dec 2023 11:11:00 -0600 Aron Silverton wrote:
>> 1. As mentioned already, we recently faced a complex problem with RDMA
>> in KVM and were getting nowhere trying to debug using the usual methods.
>> Mellanox support was able to use this debug interface to see what was
>> happening on the PCI bus and prove that the issue was caused by
>> corrupted PCIe transactions. This finally put the investigation on the
>> correct path. The debug interface was used consistently and extensively
>> to test theories about what was happening in the system and, ultimately,
>> allowed the problem to be solved.
>
> You hit on an important point, and what is also my experience working
> at Meta. I may have even mentioned it in this thread already.
> If there is a serious issue with a complex device, there are two ways
> you can get support - dump all you can and send the dump to the vendor
> or get on a live debugging session with their engineers. Users' ability
> to debug those devices is practically non-existent. The idea that we
> need access to FW internals is predicated on the assumption that we
> have an ability to make sense of those internals.
>
> Once you're on a support call with the vendor - just load a custom
> kernel, module, whatever, it's already extremely expensive manual labor.

You rail against out-of-tree drivers and vendor proprietary tools, and
now you argue for just that. There is no reason debugging capabilities
cannot be built into the OS and used when needed. That means anything
needed - from kernel modules to userspace tools.

The Meta data point is not representative of the world at large -
different scale, different needs, different expertise on staff (OS and
H/W). Getting S/W installed (especially anything requiring a compiler)
on a production server (and VMs) is not an easy request and in many
cases not even possible. When a customer hits a problem, the standard
steps are to run a script, generate a tar file and ship it to the OS
vendor. Engineers at the OS vendor go through it and may need other data
- like getting detailed dumps from individual pieces of H/W.
Every time those requests require going to a vendor web site to pull
down vendor tools, get permission to install them, schedule the run of
said tool ... it only serves to drag out the debugging process.

i.e., this open-ended stance only serves to hurt Linux users.