Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1756698AbYFZAuC (ORCPT ); Wed, 25 Jun 2008 20:50:02 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S1754631AbYFZAtu (ORCPT ); Wed, 25 Jun 2008 20:49:50 -0400
Received: from mga01.intel.com ([192.55.52.88]:35569 "EHLO mga01.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751768AbYFZAts (ORCPT ); Wed, 25 Jun 2008 20:49:48 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.27,705,1204531200"; d="scan'208";a="582041560"
Subject: Re: [PATCH 2/2] acpi based pci gap caluculation v2
From: Zhao Yakui
To: akataria@vmware.com
Cc: Alok kataria, "lenb@kernel.org", Ingo Molnar, linux-acpi, LKML
In-Reply-To: <1214413073.27577.67.camel@promb-2n-dhcp368.eng.vmware.com>
References: <1214333326.27577.28.camel@promb-2n-dhcp368.eng.vmware.com>
        <1214362159.9800.28.camel@yakui_zhao.sh.intel.com>
        <35f686220806242117p4a442982hb459a6b76312f391@mail.gmail.com>
        <1214372365.9800.42.camel@yakui_zhao.sh.intel.com>
        <35f686220806242304q43987059xb203dd1ac75b7583@mail.gmail.com>
        <1214383095.9800.85.camel@yakui_zhao.sh.intel.com>
        <1214413073.27577.67.camel@promb-2n-dhcp368.eng.vmware.com>
Content-Type: text/plain
Date: Thu, 26 Jun 2008 09:00:46 +0800
Message-Id: <1214442046.7055.9.camel@yakui_zhao.sh.intel.com>
Mime-Version: 1.0
X-Mailer: Evolution 2.8.0 (2.8.0-7.fc6)
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 6563
Lines: 144

On Wed, 2008-06-25 at 09:57 -0700, Alok Kataria wrote:
> On Wed, 2008-06-25 at 01:38 -0700, Zhao Yakui wrote:
> > On Tue, 2008-06-24 at 23:04 -0700, Alok kataria wrote:
> > > Hi Yakui,
> > >
> > > On Tue, Jun 24, 2008 at 10:39 PM, Zhao Yakui wrote:
> > > > On Tue, 2008-06-24 at 21:17 -0700, Alok kataria wrote:
> > > >> On Tue, Jun 24, 2008 at 7:49 PM, Zhao Yakui wrote:
> > > >> > On Tue, 2008-06-24 at 11:48 -0700, Alok Kataria wrote:
> > > >> >>
> > > >> >> Evaluates the _CRS object under PCI0 looking for producer resources.
> > > >> >> Then searches the e820 memory space for a gap within these producer resources.
> > > >> >>
> > > >> >> Allows us to find a gap for the unclaimed pci resources or MMIO resources
> > > >> >> for hotplug devices within the BIOS allowed pci regions.
> > > >> >>
> > > >> > It seems reasonable.
> > > >> > But if the resource obtained from the PCI0 _CRS method is incorrect, we
> > > >> > will get the incorrect pci_mem_start.
> > > >>
> > > >> Hi Yakui,
> > > >>
> > > >> What do you mean by the PCI0 _CRS being incorrect ? Why would the BIOS
> > > >> give a incorrect _CRS object ?
> > > >> Also we don't just take the value given from the _CRS method, we still
> > > >> read the e820_map to search for an unallocated resource. So even if
> > > >> (by chance) the _CRS method returns incorrect value we would still
> > > >> figure out if there is a collision with an already allocated resource.
> > > > In the patch the address obtained from the _CRS object will be passed
> > > > into the function of e820_search_gap. In such case maybe we will get the
> > > > pci_mem_start different with the e820_setup_gap.
> > >
> > > True..the whole idea behind doing this patch was to get a correct
> > > (different) value for pci_mem_start.
> > > We read the _CRS object over here to make sure that we assign the
> > > pci_mem_start from the address range which is reserved by the BIOS for
> > > PCI devices.
> > >
> > > Also this reading of _CRS object would be done before we start
> > > initializing the pci devices, i.e. before we start using the value of
> > > pci_mem_start, so the original value assigned by pci_setup_gap is just
> > > overwritten by this function. So that should be fine IMHO.
> > > Also we would still want the call for e820_setup_gap because there can
> > > be systems with no acpi support or acpi disabled
> > >
> > >
> > > >
> > > >> >
> > > >> > At the same time after the patch is applied, pci_mem_start will be
> > > >> > parsed in two different ways.
> > > >>
> > > >> Yes pci_mem_start would be initialized in 2 different ways but we
> > > >> still have to parse the e820_map the old way because there could be
> > > >> systems without ACPI.
> > > >>
> > > >> > If the result is different, maybe the
> > > >> > incorrect pci_mem_start will be used.
> > > >>
> > > >> Yeah, The result is different in my case. Though my BIOS reserves
> > > >> hotpluggable memory region, kernel doesn't respect that right now and
> > > >> just parses the e820_map to calculate the gap and pci_mem_start value.
> > > >> I have explained the problem in this mail.
> > > >>
> > > >> http://marc.info/?l=linux-acpi&m=121391675711763&w=2
> > > >>
> > > >> Maybe nobody has seen this problem yet, because there are no systems
> > > >> out there with less than 4GB memory to start with and which allow
> > > >> memory hotplug.
> > > >>
> > > >> But still i don't understand what do you mean by, we can get incorrect
> > > >> pci_mem_start, in which case ?
> > > >
> > > > In the function of setup_arch the pci_mem_start will be parsed by
> > > > searching the e820 table. After the patch is applied, we will parse the
> > > > pci_mem_start again in the function of pci_acpi_scan_init and it will
> > > > override the value parsed in the function of setup_arch. If the
> > > > pci_mem_start is incorrect in the second case, maybe it will have side
> > > > effect.
> > >
> > > Yes it will override. But how can the value be incorrect in the second
> > > case. As explained in my previous mail we still parse the e820_map to
> > > check if we have unclaimed resources between start_address (that read
> > > from _CRS) to 2^32.
> > > So even if this start_address is wrong we would
> > > catch that during parsing e820_map. But again why would the _CRS
> > > return incorrect values, are you talking about errors in BIOS ?
> > The pci_mem_start is still gotten by parsing the E820 table. But the
> > input parameter start_addr will be used in the function of
> > e820_search_gap.
> > If the following is the resource start address returned by the PCI0
> > _CRS object , maybe the different pci_mem_start will be gotten.
> > 0xE0000000
> > 0xE4000000
> >
> > At the same time if several start address is returned by the _CRS
> > object, the e820 table will be parsed several times.
>
> Yes we will parse e820 several times, but we don't initialize
> pci_mem_start in every pass.
> It will be initialized only twice once via the e820_setup_gap code path
> and once via pci_setup_gap.
> And i think you agree that both of these are required ?
I agree with what you said, and IMO it is OK.
>
> During the gap calculation the previous code or the code now in
> e820_setup_gap does this...
>       calculates a gap in e820_map from 0 to 2^32.
>       Initializes pci_mem_start.
>
> And now with this patch, the code in pci_setup_gap
> does this...
>       for each _CRS under PCI0
>               search gap from start_addr of _CRS to 2^32 *[1].
>       Initialize pci_mem_start with the biggest gap that we could
>       find.
Yes. But on some bogus BIOS the pci_mem_start obtained in pci_setup_gap
may be different from the one obtained in e820_setup_gap.
If the pci_mem_start obtained in pci_setup_gap can meet the requirement,
it is also reasonable.
> Essentially, what we are doing is just limiting the gap calculation to a
> smaller address space depending on the ACPI information we get.
>
> Now then, what problem do you see with this approach ?
> *[1]
> While writing this mail i figured out that, instead of searching from
> start_addr of _CRS to 2^32 we should just search till *end_addr* of _CRS resource.
> I will send a patch to that effect.
If so, IMO it will be OK.
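
For illustration only, the windowed search described above (search each
_CRS window up to its end_addr, keep the biggest gap) can be sketched in
plain user-space C. This is not the patch and not kernel code:
struct mem_range, search_gap() and pick_pci_mem_start() are hypothetical
names, the map is assumed to be sorted and non-overlapping, and any
rounding or alignment applied to pci_mem_start is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an e820 entry or a _CRS window. */
struct mem_range {
	uint64_t start;
	uint64_t end;		/* exclusive */
};

/*
 * Find the largest hole inside [win_start, win_end) that is not covered by
 * any entry of map[]. Assumes map[] is sorted by start and non-overlapping.
 * Returns the hole size (0 if none) and stores its start in *gap_start.
 */
uint64_t search_gap(const struct mem_range *map, size_t nr,
		    uint64_t win_start, uint64_t win_end,
		    uint64_t *gap_start)
{
	uint64_t best = 0;
	uint64_t cursor = win_start;	/* lowest address not yet accounted for */
	size_t i;

	for (i = 0; i < nr && cursor < win_end; i++) {
		if (map[i].end <= cursor)
			continue;	/* entry lies entirely below the cursor */
		if (map[i].start > cursor) {
			/* hole from cursor up to this entry, clipped to the window */
			uint64_t hole_end = map[i].start < win_end ?
					    map[i].start : win_end;
			if (hole_end - cursor > best) {
				best = hole_end - cursor;
				*gap_start = cursor;
			}
		}
		cursor = map[i].end;
	}
	/* tail hole between the last used entry and the end of the window */
	if (cursor < win_end && win_end - cursor > best) {
		best = win_end - cursor;
		*gap_start = cursor;
	}
	return best;
}

/*
 * For each window (standing in for one producer resource from PCI0 _CRS),
 * search only up to that window's end and keep the start of the biggest
 * hole found. Returns 0 if no window contains a hole.
 */
uint64_t pick_pci_mem_start(const struct mem_range *map, size_t nr_map,
			    const struct mem_range *win, size_t nr_win)
{
	uint64_t best_size = 0, best_start = 0;
	size_t i;

	for (i = 0; i < nr_win; i++) {
		uint64_t start = 0;
		uint64_t size = search_gap(map, nr_map,
					   win[i].start, win[i].end, &start);
		if (size > best_size) {
			best_size = size;
			best_start = start;
		}
	}
	return best_start;
}
```

Passing a single window of 0 to 2^32 corresponds to the e820-only
calculation, while passing the _CRS windows limits the search to the
ranges the BIOS reports for PCI, so the two results can differ, which is
the case discussed in this thread.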
> Thanks,
> Alok
> >
> >
> > Thanks.
> > Yakui
> >
> > > Thanks,
> > > Alok
> > >