Return-path:
Received: from mga03.intel.com ([143.182.124.21]:58957 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751960Ab2GQTJy (ORCPT ); Tue, 17 Jul 2012 15:09:54 -0400
Message-ID: <5005B881.8010505@intel.com> (sfid-20120717_211005_035106_EC20A0F1)
Date: Tue, 17 Jul 2012 12:09:53 -0700
From: John Fastabend
MIME-Version: 1.0
To: "Rustad, Mark D"
CC: David Miller , "" , "" , ""
Subject: Re: That's pretty much it for 3.5.0
References: <20120717.090142.125145009944045241.davem@davemloft.net> <997C449C-D599-4F46-A0A3-A2B869DEE36E@intel.com> <5005B643.2080009@intel.com>
In-Reply-To: <5005B643.2080009@intel.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Sender: linux-wireless-owner@vger.kernel.org
List-ID:

On 7/17/2012 12:00 PM, John Fastabend wrote:
> On 7/17/2012 11:48 AM, Rustad, Mark D wrote:
>> On Jul 17, 2012, at 10:41 AM, Rustad, Mark D wrote:
>>
>>> On Jul 17, 2012, at 9:01 AM, David Miller wrote:
>>>
>>>> Linus was _extremely_ generous and took in all the stuff that was
>>>> pending in the net tree just now.
>>>
>>> Maybe *too* generous. :-) I just updated and when I boot I get an
>>> early crash in update_netdev_tables which is in netprio_cgroup.c.
>>>
>>>> Besides very serious issues, I'm not willing to consider any more bug
>>>> fixes for the 'net' tree at this time.
>>>
>>> I think the above issue will have to be fixed, as it completely
>>> prevents booting for any kernel that includes the netprio_cgroup option.
>>>
>>>> Only one pending known bug qualifies, and that's the CIPSO ip option
>>>> processing OOPS'er. And I'll work on that myself if Paul Moore
>>>> doesn't show a sign of life in the next day.
>>>>
>>>> Thanks.
>>>
>>>
>>> I can start taking a look at this if you like, but I see that Gao
>>> feng has two patches in the last set of patches that may be related.
>>>
>>> To give you an idea how early the crash is, here are a few log
>>> messages leading up to it:
>>>
>>> [ 0.003455] Dentry cache hash table entries: 262144 (order: 9,
>>> 2097152 bytes)
>>> [ 0.005550] Inode-cache hash table entries: 131072 (order: 8,
>>> 1048576 bytes)
>>> [ 0.007165] Mount-cache hash table entries: 256
>>> [ 0.010289] Initializing cgroup subsys net_cls
>>> [ 0.010947] Initializing cgroup subsys net_prio
>>> [ 0.011039] BUG: unable to handle kernel NULL pointer dereference
>>> at 0000000000000828
>>> [ 0.011998] IP: [] update_netdev_tables+0x68/0xe0
>>
>>
>> I found that I can avoid the crash by configuring the netprio_cgroup
>> as a module. I don't need to have it built in, I just happened to.
>> This finding may lower the temperature of this issue a lot from what I
>> had been feeling.
>>
>
> hmm looks like we access init_net here,
>
> static void update_netdev_tables(void)
> {
> 	struct net_device *dev;
> 	u32 max_len = atomic_read(&max_prioidx) + 1;
> 	struct netprio_map *map;
>
> 	rtnl_lock();
> 	for_each_netdev(&init_net, dev) {
> 		map = rtnl_dereference(dev->priomap);
> 		if ((!map) ||
> 		    (map->priomap_len < max_len))
> 			extend_netdev_table(dev, max_len);
> 	}
> 	rtnl_unlock();
> }
>
> but init_net is initialized by pure_initcall(net_ns_init) and I
> gather pure_initcall's should not have any dependencies but it
> looks like we created one here with cgroup_init_early() in
> start_kernel().
>
> I'll poke around some more. Also had some off list help from
> Mark.
>
> .John
>

although we don't have an early_init hook for netprio_cgroup, so this is
probably not correct.