Subject: Re: [BISECTED] v4.15-rc: Boot regression on x86_64/AMD
To: Bjorn Helgaas, Linus Torvalds
Cc: Aaro Koskinen, Andy Shevchenko, Linux Kernel Mailing List,
 linux-pci@vger.kernel.org, Boris Ostrovsky, Juergen Gross
References: <20180105220412.fzpwqe4zljdawr36@darkstar.musicnaut.iki.fi>
From: Christian König
Message-ID: <628e2b58-b16b-5792-b4ef-88bec15ab779@amd.com>
Date: Tue, 9 Jan 2018 11:37:55 +0100
MIME-Version: 1.0
Content-Type: multipart/mixed;
 boundary="------------292CEB6B2E472794B49E0431"
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

This is a multi-part message in MIME format.
--------------292CEB6B2E472794B49E0431
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

On 09.01.2018 at 00:23, Bjorn Helgaas wrote:
> [+cc Boris, Juergen, linux-pci]
>
> On Fri, Jan 5, 2018 at 6:00 PM, Linus Torvalds wrote:
>> On Fri, Jan 5, 2018 at 2:04 PM, Aaro Koskinen wrote:
>>> After v4.14, I've been unable to boot my AMD compilation box with the
>>> v4.15-rc mainline Linux. It just ends up in a silent reboot loop.
>>>
>>> I bisected this to:
>>>
>>> commit fa564ad9636651fd11ec2c79c48dee844066f73a
>>> Author: Christian König
>>> Date: Tue Oct 24 14:40:29 2017 -0500
>>>
>>> x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)
>> Hmm. That was reported to break boot earlier already.
>>
>> The breakage was supposedly fixed by three patches from Christian:
>>
>> a19e2696135e: "x86/PCI: Only enable a 64bit BAR on single-socket AMD
>> Family 15h"
>>
>> 470195f82e4e: "x86/PCI: Fix infinite loop in search for 64bit BAR placement"
>>
>> and a third one that was apparently never applied.
>>
>> I'm not sure why that third patch was never applied, I'm including it here.
>>
>> Does the system work for you if you apply that patch (instead of
>> reverting all of them)?
>>
>> I wonder why that patch wasn't applied, but if it doesn't fix things,
>> I think we do need to revert it all.
>>
>> Christian? Bjorn?
> I didn't apply the third patch ("x86/PCI: limit the size of the 64bit
> BAR to 256GB") because (a) we thought it was optional ("just a
> precaution against eventual problems"), (b) we didn't have a good
> explanation of why 256GB was the correct number, and (c) it seemed to
> be a workaround for a Xen issue that we hoped to fix in a better way.

Just for the record, I completely agree with that.

> It does apparently make Aaro's system work, but I still hesitate to
> apply it because it's magical -- avoiding the address space from
> 0x1_00000000 to 0xbd_00000000 makes things work, but we don't know
> why.
> I assume there's some unreported device in that area, but I
> don't think we have any real assurance that the
> 0xbd_00000000-0xfd_00000000 area we now use is any safer.

Well, I know why it's not working. The BIOS is not telling us the truth
about how much memory is installed.

A device above 4GB would actually be handled correctly by the code (see
the check when we walk over all the existing IO regions).

I tested a bit with Aaro and came up with the attached patch, which adds
a 16GB guard between the end of memory and the new window for the PCIe
root hub. But I agree with you that this is just a hack and not a real
solution.

> I would feel better about this if we made it opt-in via a kernel
> parameter and/or some kind of whitelist. I still don't really *like*
> it, since ACPI does provide a mechanism (_PRS/_SRS) for doing this
> safely, and we could just say "if you want to use big BARs, the BIOS
> should enable big windows or at least make them available via ACPI
> resources." The only problem is that BIOSes don't do that and we
> don't yet have Linux support for _PRS/_SRS for host bridges.

Well, that is the point I disagree on. When the memory map we get from
the BIOS is not correct, it makes no difference whether we enable the
window through the BIOS or by programming the hardware directly.

I will work with Aaro some more to come up with a solution which also
reads the memory map directly from the hardware and checks whether it is
valid before doing anything else.

> I'll prepare a revert as a back-up plan in case we don't come up with
> a better solution.

Either that, or only enable it when pci=add-root-window is given on the
kernel command line.

Just let me know what you prefer and I will hack together a patch for
this today.

Christian.

> Bjorn

--------------292CEB6B2E472794B49E0431
Content-Type: text/x-patch;
 name="0001-x86-PCI-add-16GB-guard-between-end-of-memory-and-new.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
 filename*0="0001-x86-PCI-add-16GB-guard-between-end-of-memory-and-new.pa";
 filename*1="tch"

From 101e157babcef10b91edf91e7e6f03826c2f8ade Mon Sep 17 00:00:00 2001
From: Christian König
Date: Tue, 28 Nov 2017 10:02:35 +0100
Subject: [PATCH] x86/PCI: add 16GB guard between end of memory and new PCI window
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a workaround for buggy BIOS implementations who steal memory for iGPUs from
the OS without reporting it as reserved.

Signed-off-by: Christian König
Tested-by: Aaro Koskinen
---
 arch/x86/pci/fixup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index e663d6bf1328..e1bdae2cebb6 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -713,6 +713,10 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 		}
 		res->start = conflict->end + 1;
 	}
+	/* Add 16GB guard between end of memory and new PCI window to work
+	 * around buggy BIOS implementations.
+	 */
+	res->start += 0x400000000ull;
 
 	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
 
-- 
2.11.0

--------------292CEB6B2E472794B49E0431--
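
For readers who want to check the numbers in this thread, here is a small user-space sketch of the address arithmetic. It is not kernel code and not part of the patch: GIB, GUARD_BYTES, window_start_with_guard() and range_gib() are hypothetical names made up for illustration. It only assumes what the mail and the hunk state: the window starts one byte past the last conflicting resource, the patch then adds 0x400000000 to that start, and the ranges discussed are 0x1_00000000 to 0xbd_00000000 and 0xbd_00000000 to 0xfd_00000000.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names for illustration; none of these exist in the kernel. */
#define GIB		(1ULL << 30)

/* The constant the patch adds to res->start. */
#define GUARD_BYTES	0x400000000ULL

/*
 * Mirror the two steps visible in the hunk: start just past the last
 * conflicting resource, then skip a further GUARD_BYTES.
 */
static uint64_t window_start_with_guard(uint64_t conflict_end)
{
	/* Refuse inputs that would wrap around the 64-bit address space. */
	assert(conflict_end < UINT64_MAX - GUARD_BYTES);

	return conflict_end + 1 + GUARD_BYTES;
}

/* Size in GiB of the half-open address range [start, end). */
static uint64_t range_gib(uint64_t start, uint64_t end)
{
	return (end - start) / GIB;
}
```

Calling range_gib(0xbd00000000ULL, 0xfd00000000ULL) gives the size of the window Bjorn mentions, and comparing GUARD_BYTES against 16 * GIB confirms that the hex constant matches the "16GB" in the patch subject. On a machine where the last conflict ends at 0x3ffffffff, window_start_with_guard() would place the window at 0x800000000; that input is an invented example, not a value from Aaro's system.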