Date: Mon, 20 Jul 2015 21:56:39 -0500
From: Alex Thorlton
To: Alex Thorlton
Cc: Or Gerlitz, Or Gerlitz, andrew banman, Linux Kernel, Doug Ledford,
    Sean Hefty, Hal Rosenstock, "David S. Miller", Roland Dreier,
    Matan Barak, Moni Shoua, Jack Morgenstein, Yishai Hadas,
    Eran Ben Elisha, Ira Weiny, "linux-rdma@vger.kernel.org"
Subject: Re: [BUG] mellanox IB driver fails to load on large config
Message-ID: <20150721025639.GX58053@asylum.americas.sgi.com>
References: <20150710191506.GA52396@asylum.americas.sgi.com>
    <20150714182234.GD17920@asylum.americas.sgi.com>
    <20150714184820.GB58053@asylum.americas.sgi.com>
    <20150714202848.GD58053@asylum.americas.sgi.com>
    <55A74E61.1080403@mellanox.com>
    <20150720162803.GL58053@asylum.americas.sgi.com>
In-Reply-To: <20150720162803.GL58053@asylum.americas.sgi.com>

On Mon, Jul 20, 2015 at 11:28:03AM -0500, Alex Thorlton wrote:
> I've got some time on the large machine later today.  I'll give this a
> try then.

I ran a boot with this patch applied:

diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 83e80ab..c84aea0 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -45,7 +45,7 @@
 #include
 #define MAX_MSIX_P_PORT 17
-#define MAX_MSIX 64
+#define MAX_MSIX 8192
 #define MSIX_LEGACY_SZ 4
 #define MIN_MSIX_P_PORT 5

I went for a max of 8192, since I was actually booting the machine with
6144 cores (not 4096) for this run.

It doesn't look like this fixed the problem.  I still saw the same
errors during boot.  FWIW, the module does appear to still successfully
load:

8<---
# lsmod | grep mlx
mlx4_ib               151552  0
ib_sa                  32768  1 mlx4_ib
ib_mad                 49152  2 ib_sa,mlx4_ib
ib_core               102400  3 ib_sa,mlx4_ib,ib_mad
mlx4_core             278528  1 mlx4_ib
--->8

If the module loading is good enough, and we should just ignore the
errors, then I'm fine with that.  Just wanting to make sure that
everything is behaving correctly.

- Alex