2018-01-20 01:26:59

by Laura Abbott

Subject: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

Hi,

Fedora got multiple reports of an early bootup crash post -rc8.
Bisection pointed to bacf6b499e11 ("x86/mm: Use a struct to reduce
parameters for SME PGD mapping"). It doesn't revert cleanly,
but if I also revert the few other changes in arch/x86/mm/mem_encrypt.c,
it boots up fine.

Annoyingly, I can't seem to get any actual kernel logs even with
earlyprintk. It just reboots immediately (triple fault?). This
happens on both of my Lenovo machines and I can ask other reporters
for details as well.

$ git bisect log
# bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag 'trace-v4.15-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
# good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
git bisect start 'origin/master' 'v4.15-rc8'
# bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag 'mmc-v4.15-rc2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
# good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag 'mac80211-for-davem-2018-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
# good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
# bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm: Prevent use after free
git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
# good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic: Remove init_bsp_APIC()"
git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
# bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to reduce parameters for SME PGD mapping
git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
# good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up register saving in the __enc_copy() assembly code
git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
# first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to reduce parameters for SME PGD mapping


Configuration is at https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
Note that I do think this is something in the Fedora configuration
because a generic "make defconfig" booted just fine.

Thanks,
Laura


2018-01-20 02:24:56

by Gabriel C

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
> Hi,

Hi,

>
> Fedora got multiple reports of an early bootup crash post -rc8.
> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
> parameters for SME PGD mapping") . It doesn't revert cleanly
> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
> as well it boots up fine.
>
> Annoyingly, I can't seem to get any actual kernel logs even with
> earlyprintk. It just reboots immediately (triple fault?). This
> happens on both of my Lenovo machines and I can ask other reporters
> for details as well.
>

I tested these patches on two Lenovo Ideapads (both with Skylake CPUs),
on an older dual Xeon box, on two Toshibas with AMD APUs, on a Ryzen box,
and on a dual EPYC box; of course with mem_encrypt=on on the EPYC and with
it disabled on the Intel CPUs.

I also tested on top of 4.14.13 and 4.14.14, as well as on top of
4.15.0-rc7 and current master/rc8++, without seeing anything like this.

We also pushed these patches on 4.14.13/14 and didn't get any reports about
anything like this.

Which Lenovo boxes are these? Maybe I can find one to reproduce this.


> $ git bisect log
> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
> 'trace-v4.15-rc4-3' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
> git bisect start 'origin/master' 'v4.15-rc8'
> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
> 'mmc-v4.15-rc2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
> 'mac80211-for-davem-2018-01-15' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
> 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm: Prevent
> use after free
> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic: Remove
> init_bsp_APIC()"
> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to
> reduce parameters for SME PGD mapping
> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up register
> saving in the __enc_copy() assembly code
> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a
> struct to reduce parameters for SME PGD mapping
>
>
> Configuration is at
> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
> Note that I do think this is something in the Fedora configuration
> because a generic "make defconfig" booted just fine.

But maybe it is one of the Fedora patches?

Can you try a kernel with the config but without any patches?
Or a defconfig with just CONFIG_AMD_MEM_ENCRYPT enabled?

>
> Thanks,
> Laura

Regards,

Gabriel C

2018-01-20 02:41:54

by Linus Torvalds

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On Fri, Jan 19, 2018 at 5:23 PM, Laura Abbott <[email protected]> wrote:
>
> Fedora got multiple reports of an early bootup crash post -rc8.
> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
> parameters for SME PGD mapping") . It doesn't revert cleanly
> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
> as well it boots up fine.

Odd. I've tried to read through that patch three times to find
anything it actually changes, and I can't find anything.

It looks like that patch should have absolutely no actual behavioral impact.

But clearly I'm missing something. Can anybody see what the mistake in
the conversion is?

Linus
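
For readers who have not looked at the commit, the kind of conversion its title describes can be sketched roughly as follows. This is only an illustration in plain userspace C, not the kernel's code: the function names, the struct, and its fields are all hypothetical.

```c
#include <assert.h>

/* Hypothetical names throughout; this is NOT the code from the commit. */

/* Before: every piece of state is passed as its own parameter. */
unsigned long map_range_args(unsigned long base, unsigned long start,
                             unsigned long end, unsigned long flags)
{
    assert(start <= end);
    return base + (end - start) + flags;
}

/* After: the same state is bundled into one struct and passed by pointer. */
struct map_data {
    unsigned long base;
    unsigned long start;
    unsigned long end;
    unsigned long flags;
};

unsigned long map_range_struct(const struct map_data *d)
{
    assert(d->start <= d->end);
    return d->base + (d->end - d->start) + d->flags;
}
```

Both forms compute the same value from the same inputs, which is why such a change normally looks like a no-op on review. What does differ is where the values live: the second form needs a struct object in the caller, typically in its stack frame.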

2018-01-20 04:06:48

by Laura Abbott

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On 01/19/2018 06:23 PM, Gabriel C wrote:
> 2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
>> Hi,
>
> Hi ,
>
>>
>> Fedora got multiple reports of an early bootup crash post -rc8.
>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>> parameters for SME PGD mapping") . It doesn't revert cleanly
>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>> as well it boots up fine.
>>
>> Annoyingly, I can't seem to get any actual kernel logs even with
>> earlyprintk. It just reboots immediately (triple fault?). This
>> happens on both of my Lenovo machines and I can ask other reporters
>> for details as well.
>>
>
> I tested these patches on 2 Lenovo Ideapad both with Skylake CPUs
> on a older dual Xeon box , on 2 Toshibas with AMD APUs , on a RYZEN box ,
> on dual EPYC box .. ofc on EPYC with mem_encrypt=on on the Intel CPUs disabled.
>
> Also tested on top 4.14.13 , 4.14.14 as well on top 4.15.0-rc7 and on
> current master/rc8++ without to see something like this.
>
> Also we pushed these patches on 4.14.13/14 and didn't got any reports about
> something like this.
>
> What Lenovo boxes are these ? maybe I find one to reproduce.
>
>
>> $ git bisect log
>> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
>> 'trace-v4.15-rc4-3' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
>> git bisect start 'origin/master' 'v4.15-rc8'
>> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
>> 'mmc-v4.15-rc2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
>> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
>> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
>> 'mac80211-for-davem-2018-01-15' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
>> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
>> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
>> 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
>> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm: Prevent
>> use after free
>> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
>> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic: Remove
>> init_bsp_APIC()"
>> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
>> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to
>> reduce parameters for SME PGD mapping
>> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
>> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up register
>> saving in the __enc_copy() assembly code
>> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
>> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a
>> struct to reduce parameters for SME PGD mapping
>>
>>
>> Configuration is at
>> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>> Note that I do think this is something in the Fedora configuration
>> because a generic "make defconfig" booted just fine.
>
> But maybe some of the Fedora patches ?
>
> Can you try an kernel with the config but without any patches ?
> Or a defconfig and just enable CONFIG_AMD_MEM_ENCRYPT ?
>

The bisect was a vanilla kernel without Fedora patches.

>>
>> Thanks,
>> Laura
>
> Regards,
>
> Gabriel C
>


2018-01-20 04:16:58

by Tom Lendacky

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On 1/19/2018 8:38 PM, Linus Torvalds wrote:
> On Fri, Jan 19, 2018 at 5:23 PM, Laura Abbott <[email protected]> wrote:
>>
>> Fedora got multiple reports of an early bootup crash post -rc8.
>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>> parameters for SME PGD mapping") . It doesn't revert cleanly
>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>> as well it boots up fine.
>
> Odd. I've tried to read through that patch three times to find
> anything it actually changes, and I can't find anything.
>
> It looks like that patch should have absolutely no actual behavioral impact.
>
> But clearly I'm missing something. Can anybody see what the mistake in
> the conversion is?

I'll take a closer look at this, but it really shouldn't have any effect,
especially on a non-AMD box (which I'm assuming this is?) and with memory
encryption off by default.

Thanks,
Tom

>
> Linus
>

2018-01-20 05:29:47

by Gabriel C

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

2018-01-20 5:02 GMT+01:00 Laura Abbott <[email protected]>:
> On 01/19/2018 06:23 PM, Gabriel C wrote:
>>
>> 2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
>>>
>>> Hi,
>>
>>
>> Hi ,
>>
>>>
>>> Fedora got multiple reports of an early bootup crash post -rc8.
>>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>>> parameters for SME PGD mapping") . It doesn't revert cleanly
>>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>>> as well it boots up fine.
>>>
>>> Annoyingly, I can't seem to get any actual kernel logs even with
>>> earlyprintk. It just reboots immediately (triple fault?). This
>>> happens on both of my Lenovo machines and I can ask other reporters
>>> for details as well.
>>>
>>
>> I tested these patches on 2 Lenovo Ideapad both with Skylake CPUs
>> on a older dual Xeon box , on 2 Toshibas with AMD APUs , on a RYZEN box ,
>> on dual EPYC box .. ofc on EPYC with mem_encrypt=on on the Intel CPUs
>> disabled.
>>
>> Also tested on top 4.14.13 , 4.14.14 as well on top 4.15.0-rc7 and on
>> current master/rc8++ without to see something like this.
>>
>> Also we pushed these patches on 4.14.13/14 and didn't got any reports
>> about
>> something like this.
>>
>> What Lenovo boxes are these ? maybe I find one to reproduce.
>>
>>
>>> $ git bisect log
>>> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
>>> 'trace-v4.15-rc4-3' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
>>> git bisect start 'origin/master' 'v4.15-rc8'
>>> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
>>> 'mmc-v4.15-rc2-3' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
>>> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
>>> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
>>> 'mac80211-for-davem-2018-01-15' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
>>> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
>>> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
>>> 'x86-pti-for-linus' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
>>> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm:
>>> Prevent
>>> use after free
>>> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
>>> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic:
>>> Remove
>>> init_bsp_APIC()"
>>> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
>>> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to
>>> reduce parameters for SME PGD mapping
>>> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
>>> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up
>>> register
>>> saving in the __enc_copy() assembly code
>>> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
>>> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm:
>>> Use a
>>> struct to reduce parameters for SME PGD mapping
>>>
>>>
>>> Configuration is at
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>>> Note that I do think this is something in the Fedora configuration
>>> because a generic "make defconfig" booted just fine.
>>
>>
>> But maybe some of the Fedora patches ?
>>
>> Can you try an kernel with the config but without any patches ?
>> Or a defconfig and just enable CONFIG_AMD_MEM_ENCRYPT ?
>>
>
> The bisect was a vanilla kernel without Fedora patches.

OK. I did a build (v4.15-rc8-225-g8dd903d2cf7b) on my AMD
workstation (EPYC CPU) with your 64-bit config, and one on the
Ideapad (Intel i7-6498DU). I disabled SELinux, since I don't use it
here, and module signing.

Also, with your config my serial setup doesn't work and the kernel hangs,
but mem_encrypt=on/off works just fine.

I also noticed that the workstation takes forever to boot, until
'DMA-API' reports out-of-memory
(I don't know how much memory it needs, but the box has 128GB of RAM).


Can you tell us your Lenovo models, please?

Regards,

Gabriel C

2018-01-20 06:17:11

by Tom Lendacky

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On 1/19/2018 11:25 PM, Gabriel C wrote:
> 2018-01-20 5:02 GMT+01:00 Laura Abbott <[email protected]>:
>> On 01/19/2018 06:23 PM, Gabriel C wrote:
>>>
>>> 2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
>>>>
>>>> Hi,
>>>
>>>
>>> Hi ,
>>>
>>>>
>>>> Fedora got multiple reports of an early bootup crash post -rc8.
>>>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>>>> parameters for SME PGD mapping") . It doesn't revert cleanly
>>>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>>>> as well it boots up fine.
>>>>
>>>> Annoyingly, I can't seem to get any actual kernel logs even with
>>>> earlyprintk. It just reboots immediately (triple fault?). This
>>>> happens on both of my Lenovo machines and I can ask other reporters
>>>> for details as well.
>>>>
>>>
>>> I tested these patches on 2 Lenovo Ideapad both with Skylake CPUs
>>> on a older dual Xeon box , on 2 Toshibas with AMD APUs , on a RYZEN box ,
>>> on dual EPYC box .. ofc on EPYC with mem_encrypt=on on the Intel CPUs
>>> disabled.
>>>
>>> Also tested on top 4.14.13 , 4.14.14 as well on top 4.15.0-rc7 and on
>>> current master/rc8++ without to see something like this.
>>>
>>> Also we pushed these patches on 4.14.13/14 and didn't got any reports
>>> about
>>> something like this.
>>>
>>> What Lenovo boxes are these ? maybe I find one to reproduce.
>>>
>>>
>>>> $ git bisect log
>>>> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
>>>> 'trace-v4.15-rc4-3' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>>> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
>>>> git bisect start 'origin/master' 'v4.15-rc8'
>>>> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
>>>> 'mmc-v4.15-rc2-3' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
>>>> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
>>>> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
>>>> 'mac80211-for-davem-2018-01-15' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
>>>> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
>>>> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
>>>> 'x86-pti-for-linus' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
>>>> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm:
>>>> Prevent
>>>> use after free
>>>> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
>>>> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic:
>>>> Remove
>>>> init_bsp_APIC()"
>>>> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
>>>> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to
>>>> reduce parameters for SME PGD mapping
>>>> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
>>>> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up
>>>> register
>>>> saving in the __enc_copy() assembly code
>>>> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
>>>> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm:
>>>> Use a
>>>> struct to reduce parameters for SME PGD mapping
>>>>
>>>>
>>>> Configuration is at
>>>>
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>>>> Note that I do think this is something in the Fedora configuration
>>>> because a generic "make defconfig" booted just fine.
>>>
>>>
>>> But maybe some of the Fedora patches ?
>>>
>>> Can you try an kernel with the config but without any patches ?
>>> Or a defconfig and just enable CONFIG_AMD_MEM_ENCRYPT ?
>>>
>>
>> The bisect was a vanilla kernel without Fedora patches.
>
> Ok . I did an build ( v4.15-rc8-225-g8dd903d2cf7b ) on my AMD
> Workstation ( EPYC CPU )

I've tried multiple config combinations on my EPYC system and have not
been able to reproduce this issue and have not had any boot issues with
mem_encrypt=on or mem_encrypt=off. I don't have access to a non-AMD box
at the moment, but I'm really scratching my head on this one.

> with your 64bit config and one on the Ideapad ( Intel i7-6498DU ) ..
> I disabled Selinux since I don't use it here and module signing.
>
> Also with your config my serial setup won't work and the kernel hangs
> but mem_encrypt=on/off works just fine.

If you're using the EPYC serial device and you haven't enabled legacy
serial device support (if your BIOS supports that), then you're using
it as a platform device. The Fedora config does not set
CONFIG_X86_AMD_PLATFORM_DEVICE, so the module won't load and you
won't get serial output.

I'm confused when you say the kernel hangs but mem_encrypt=on/off works
just fine. Can you explain that a bit more?

>
> Also I notice on the Workstation it takes forever to boot untill
> 'DMA-API' reports out-of-memory
> ( dunno how much memory it need but the box has 128GB of RAM )..

I haven't seen this before.

Thanks,
Tom

>
>
> Can you tell us your Lenovo models please ?
>
> Reagrds,
>
> Gabriel C
>

2018-01-20 06:58:35

by Laura Abbott

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On 01/19/2018 10:15 PM, Tom Lendacky wrote:
> On 1/19/2018 11:25 PM, Gabriel C wrote:
>> 2018-01-20 5:02 GMT+01:00 Laura Abbott <[email protected]>:
>>> On 01/19/2018 06:23 PM, Gabriel C wrote:
>>>>
>>>> 2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
>>>>>
>>>>> Hi,
>>>>
>>>>
>>>> Hi ,
>>>>
>>>>>
>>>>> Fedora got multiple reports of an early bootup crash post -rc8.
>>>>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>>>>> parameters for SME PGD mapping") . It doesn't revert cleanly
>>>>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>>>>> as well it boots up fine.
>>>>>
>>>>> Annoyingly, I can't seem to get any actual kernel logs even with
>>>>> earlyprintk. It just reboots immediately (triple fault?). This
>>>>> happens on both of my Lenovo machines and I can ask other reporters
>>>>> for details as well.
>>>>>
>>>>
>>>> I tested these patches on 2 Lenovo Ideapad both with Skylake CPUs
>>>> on a older dual Xeon box , on 2 Toshibas with AMD APUs , on a RYZEN box ,
>>>> on dual EPYC box .. ofc on EPYC with mem_encrypt=on on the Intel CPUs
>>>> disabled.
>>>>
>>>> Also tested on top 4.14.13 , 4.14.14 as well on top 4.15.0-rc7 and on
>>>> current master/rc8++ without to see something like this.
>>>>
>>>> Also we pushed these patches on 4.14.13/14 and didn't got any reports
>>>> about
>>>> something like this.
>>>>
>>>> What Lenovo boxes are these ? maybe I find one to reproduce.
>>>>
>>>>
>>>>> $ git bisect log
>>>>> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
>>>>> 'trace-v4.15-rc4-3' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>>>> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
>>>>> git bisect start 'origin/master' 'v4.15-rc8'
>>>>> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
>>>>> 'mmc-v4.15-rc2-3' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
>>>>> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
>>>>> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
>>>>> 'mac80211-for-davem-2018-01-15' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
>>>>> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
>>>>> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
>>>>> 'x86-pti-for-linus' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
>>>>> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm:
>>>>> Prevent
>>>>> use after free
>>>>> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
>>>>> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic:
>>>>> Remove
>>>>> init_bsp_APIC()"
>>>>> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
>>>>> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to
>>>>> reduce parameters for SME PGD mapping
>>>>> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
>>>>> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up
>>>>> register
>>>>> saving in the __enc_copy() assembly code
>>>>> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
>>>>> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm:
>>>>> Use a
>>>>> struct to reduce parameters for SME PGD mapping
>>>>>
>>>>>
>>>>> Configuration is at
>>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>>>>> Note that I do think this is something in the Fedora configuration
>>>>> because a generic "make defconfig" booted just fine.
>>>>
>>>>
>>>> But maybe some of the Fedora patches ?
>>>>
>>>> Can you try an kernel with the config but without any patches ?
>>>> Or a defconfig and just enable CONFIG_AMD_MEM_ENCRYPT ?
>>>>
>>>
>>> The bisect was a vanilla kernel without Fedora patches.
>>
>> Ok . I did an build ( v4.15-rc8-225-g8dd903d2cf7b ) on my AMD
>> Workstation ( EPYC CPU )
>
> I've tried multiple config combinations on my EPYC system and have not
> been able to reproduce this issue and have not had any boot issues with
> mem_encrypt=on or mem_encrypt=off. I don't have access to a non-AMD box
> at the moment, but I'm really scratching my head on this one.
>
>> with your 64bit config and one on the Ideapad ( Intel i7-6498DU ) ..
>> I disabled Selinux since I don't use it here and module signing.
>>
>> Also with your config my serial setup won't work and the kernel hangs
>> but mem_encrypt=on/off works just fine.
>
> If your using the EPYC serial device and you haven't enabled legacy
> serial device support (if your BIOS supports that), then you're using
> it as a platform device. The fedora config has not set the
> CONFIG_X86_AMD_PLATFORM_DEVICE setting so you won't get the module
> to load and give you serial output.
>
> I'm confused when you say the kernel hangs but mem_encrypt=on/off works
> just fine, can you explain that a bit more?
>
>>
>> Also I notice on the Workstation it takes forever to boot untill
>> 'DMA-API' reports out-of-memory
>> ( dunno how much memory it need but the box has 128GB of RAM )..
>

The machines I have are a Lenovo X1 Carbon and a Lenovo T470s.
A Lenovo X220 ThinkPad also reported the problem.

If I comment out sme_encrypt_kernel it boots:

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 7ba5d819ebe3..443ef5d3f1fa 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -158,7 +158,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
        *p += load_delta - sme_get_me_mask();

        /* Encrypt the kernel and related (if SME is active) */
-       sme_encrypt_kernel(bp);
+       //sme_encrypt_kernel(bp);

        /*
         * Return the SME encryption mask (if SME is active) to be used as a


Interestingly, I tried printing the values used in sme_active
(sme_me_mask, sev_enabled), followed by a return, at the
very start of sme_encrypt_kernel, and that rebooted as well,
whereas it boots if I keep only the return. sme_me_mask and
sev_enabled are explicitly marked as being in .data;
is it possible they are ending up in a section that isn't
yet mapped, or did I try to print too early?
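
To show what "explicitly marked as being in .data" means, here is a minimal userspace sketch. It assumes GCC or Clang targeting ELF (e.g. Linux), and the variable and function names are made up for the illustration. In the kernel the placement is done with its own __section macro rather than the raw attribute shown here.

```c
#include <assert.h>

/* Assumes GCC or Clang on an ELF target. All names are hypothetical. */

/* A zero-initialized global would normally be placed in .bss. */
unsigned long plain_mask = 0;

/* The section attribute asks the toolchain to place this one in .data. */
unsigned long pinned_mask __attribute__((section(".data"))) = 0;

/* Reads both variables; placement does not change how they are accessed. */
unsigned long read_masks(void)
{
    return plain_mask | pinned_mask;
}
```

Running nm on the resulting object file should list the first symbol with type B (bss) and the second with type D (data), which is one way to check where a variable actually ended up.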

Thanks,
Laura


2018-01-20 07:05:12

by Laura Abbott

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On 01/19/2018 10:57 PM, Laura Abbott wrote:
> On 01/19/2018 10:15 PM, Tom Lendacky wrote:
>> On 1/19/2018 11:25 PM, Gabriel C wrote:
>>> 2018-01-20 5:02 GMT+01:00 Laura Abbott <[email protected]>:
>>>> On 01/19/2018 06:23 PM, Gabriel C wrote:
>>>>>
>>>>> 2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
>>>>>>
>>>>>> Hi,
>>>>>
>>>>>
>>>>> Hi ,
>>>>>
>>>>>>
>>>>>> Fedora got multiple reports of an early bootup crash post -rc8.
>>>>>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>>>>>> parameters for SME PGD mapping") . It doesn't revert cleanly
>>>>>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>>>>>> as well it boots up fine.
>>>>>>
>>>>>> Annoyingly, I can't seem to get any actual kernel logs even with
>>>>>> earlyprintk. It just reboots immediately (triple fault?). This
>>>>>> happens on both of my Lenovo machines and I can ask other reporters
>>>>>> for details as well.
>>>>>>
>>>>>
>>>>> I tested these patches on 2 Lenovo Ideapad both with Skylake CPUs
>>>>> on a older dual Xeon box , on 2 Toshibas with AMD APUs , on a RYZEN
>>>>> box ,
>>>>> on dual EPYC box .. ofc on EPYC with mem_encrypt=on on the Intel CPUs
>>>>> disabled.
>>>>>
>>>>> Also tested on top 4.14.13 , 4.14.14 as well on top 4.15.0-rc7 and on
>>>>> current master/rc8++ without to see something like this.
>>>>>
>>>>> Also we pushed these patches on 4.14.13/14 and didn't got any reports
>>>>> about
>>>>> something like this.
>>>>>
>>>>> What Lenovo boxes are these ? maybe I find one to reproduce.
>>>>>
>>>>>
>>>>>> $ git bisect log
>>>>>> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
>>>>>> 'trace-v4.15-rc4-3' of
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>>>>> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
>>>>>> git bisect start 'origin/master' 'v4.15-rc8'
>>>>>> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
>>>>>> 'mmc-v4.15-rc2-3' of
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
>>>>>> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
>>>>>> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
>>>>>> 'mac80211-for-davem-2018-01-15' of
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
>>>>>> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
>>>>>> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
>>>>>> 'x86-pti-for-linus' of
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>>> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
>>>>>> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm:
>>>>>> Prevent
>>>>>> use after free
>>>>>> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
>>>>>> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic:
>>>>>> Remove
>>>>>> init_bsp_APIC()"
>>>>>> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
>>>>>> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a
>>>>>> struct to
>>>>>> reduce parameters for SME PGD mapping
>>>>>> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
>>>>>> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up
>>>>>> register
>>>>>> saving in the __enc_copy() assembly code
>>>>>> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
>>>>>> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4]
>>>>>> x86/mm:
>>>>>> Use a
>>>>>> struct to reduce parameters for SME PGD mapping
>>>>>>
>>>>>>
>>>>>> Configuration is at
>>>>>>
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>>>>>>
>>>>>> Note that I do think this is something in the Fedora configuration
>>>>>> because a generic "make defconfig" booted just fine.
>>>>>
>>>>>
>>>>> But maybe some of the Fedora patches ?
>>>>>
>>>>> Can you try an kernel with the config but without any patches ?
>>>>> Or a defconfig and just enable  CONFIG_AMD_MEM_ENCRYPT ?
>>>>>
>>>>
>>>> The bisect was a vanilla kernel without Fedora patches.
>>>
>>> Ok . I did an build ( v4.15-rc8-225-g8dd903d2cf7b ) on my AMD
>>> Workstation  ( EPYC CPU )
>>
>> I've tried multiple config combinations on my EPYC system and have not
>> been able to reproduce this issue and have not had any boot issues with
>> mem_encrypt=on or mem_encrypt=off.  I don't have access to a non-AMD box
>> at the moment, but I'm really scratching my head on this one.
>>
>>>   with your 64bit config and one on the Ideapad ( Intel i7-6498DU ) ..
>>> I disabled Selinux since I don't use it here and module  signing.
>>>
>>> Also with your config my serial setup won't work and the kernel hangs
>>> but mem_encrypt=on/off works just fine.
>>
>> If your using the EPYC serial device and you haven't enabled legacy
>> serial device support (if your BIOS supports that), then you're using
>> it as a platform device.  The fedora config has not set the
>> CONFIG_X86_AMD_PLATFORM_DEVICE setting so you won't get the module
>> to load and give you serial output.
>>
>> I'm confused when you say the kernel hangs but mem_encrypt=on/off works
>> just fine, can you explain that a bit more?
>>
>>>
>>> Also I notice on the Workstation it takes forever to boot untill
>>> 'DMA-API' reports out-of-memory
>>> ( dunno how much memory it need but the box has 128GB of RAM )..
>>
>
> The machines I have are a Lenovo X1 Carbon and a Lenovo T470s.
> A Lenovo X220 ThinkPad also reported the problem.
>
> If I comment out sme_encrypt_kernel it boots:
>
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 7ba5d819ebe3..443ef5d3f1fa 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -158,7 +158,7 @@ unsigned long __head __startup_64(unsigned long
> physaddr,
>         *p += load_delta - sme_get_me_mask();
>
>         /* Encrypt the kernel and related (if SME is active) */
> -       sme_encrypt_kernel(bp);
> +       //sme_encrypt_kernel(bp);
>
>         /*
>          * Return the SME encryption mask (if SME is active) to be used
> as a
>
>
> Interestingly, I tried to print the values in sme_active
> (sme_me_mask , sev_enabled) followed by a return at the
> very start of sme_encrypt_kernel and that rebooted as well,
> vs booting if I just kept the return. sme_me_mask and
> sev_enabled are explicitly marked as being in .data,
> is it possible they are ending up in a section that isn't
> yet mapped or did I hit print too early?
>
> Thanks,
> Laura
>

One last note: This was built with gcc 7.2.1

2018-01-20 12:07:34

by Gabriel C

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

2018-01-20 7:15 GMT+01:00 Tom Lendacky <[email protected]>:
> On 1/19/2018 11:25 PM, Gabriel C wrote:
>> 2018-01-20 5:02 GMT+01:00 Laura Abbott <[email protected]>:
>>> On 01/19/2018 06:23 PM, Gabriel C wrote:
>>>>
>>>> 2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
>>>>>
>>>>> Hi,
>>>>
>>>>
>>>> Hi ,
>>>>
>>>>>
>>>>> Fedora got multiple reports of an early bootup crash post -rc8.
>>>>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>>>>> parameters for SME PGD mapping") . It doesn't revert cleanly
>>>>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>>>>> as well it boots up fine.
>>>>>
>>>>> Annoyingly, I can't seem to get any actual kernel logs even with
>>>>> earlyprintk. It just reboots immediately (triple fault?). This
>>>>> happens on both of my Lenovo machines and I can ask other reporters
>>>>> for details as well.
>>>>>
>>>>
>>>> I tested these patches on 2 Lenovo Ideapad both with Skylake CPUs
>>>> on a older dual Xeon box , on 2 Toshibas with AMD APUs , on a RYZEN box ,
>>>> on dual EPYC box .. ofc on EPYC with mem_encrypt=on on the Intel CPUs
>>>> disabled.
>>>>
>>>> Also tested on top 4.14.13 , 4.14.14 as well on top 4.15.0-rc7 and on
>>>> current master/rc8++ without to see something like this.
>>>>
>>>> Also we pushed these patches on 4.14.13/14 and didn't got any reports
>>>> about
>>>> something like this.
>>>>
>>>> What Lenovo boxes are these ? maybe I find one to reproduce.
>>>>
>>>>
>>>>> $ git bisect log
>>>>> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
>>>>> 'trace-v4.15-rc4-3' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>>>> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
>>>>> git bisect start 'origin/master' 'v4.15-rc8'
>>>>> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
>>>>> 'mmc-v4.15-rc2-3' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
>>>>> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
>>>>> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
>>>>> 'mac80211-for-davem-2018-01-15' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
>>>>> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
>>>>> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
>>>>> 'x86-pti-for-linus' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
>>>>> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm:
>>>>> Prevent
>>>>> use after free
>>>>> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
>>>>> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic:
>>>>> Remove
>>>>> init_bsp_APIC()"
>>>>> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
>>>>> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a struct to
>>>>> reduce parameters for SME PGD mapping
>>>>> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
>>>>> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up
>>>>> register
>>>>> saving in the __enc_copy() assembly code
>>>>> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
>>>>> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm:
>>>>> Use a
>>>>> struct to reduce parameters for SME PGD mapping
>>>>>
>>>>>
>>>>> Configuration is at
>>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>>>>> Note that I do think this is something in the Fedora configuration
>>>>> because a generic "make defconfig" booted just fine.
>>>>
>>>>
>>>> But maybe some of the Fedora patches ?
>>>>
>>>> Can you try an kernel with the config but without any patches ?
>>>> Or a defconfig and just enable CONFIG_AMD_MEM_ENCRYPT ?
>>>>
>>>
>>> The bisect was a vanilla kernel without Fedora patches.
>>
>> Ok . I did an build ( v4.15-rc8-225-g8dd903d2cf7b ) on my AMD
>> Workstation ( EPYC CPU )
>
> I've tried multiple config combinations on my EPYC system and have not
> been able to reproduce this issue and have not had any boot issues with
> mem_encrypt=on or mem_encrypt=off. I don't have access to a non-AMD box
> at the moment, but I'm really scratching my head on this one.
>
>> with your 64bit config and one on the Ideapad ( Intel i7-6498DU ) ..
>> I disabled Selinux since I don't use it here and module signing.
>>
>> Also with your config my serial setup won't work and the kernel hangs
>> but mem_encrypt=on/off works just fine.
>
> If your using the EPYC serial device and you haven't enabled legacy
> serial device support (if your BIOS supports that), then you're using
> it as a platform device. The fedora config has not set the
> CONFIG_X86_AMD_PLATFORM_DEVICE setting so you won't get the module
> to load and give you serial output.
>
> I'm confused when you say the kernel hangs but mem_encrypt=on/off works
> just fine, can you explain that a bit more?
>

I have to remove console=ttyS.. etc. from the kernel command line to boot,
but this is not about mem_encrypt, since once that is removed the kernel
works just fine with mem_encrypt=on/off.

>>
>> Also I notice on the Workstation it takes forever to boot untill
>> 'DMA-API' reports out-of-memory
>> ( dunno how much memory it need but the box has 128GB of RAM )..
>
> I haven't seen this before.
>
> Thanks,
> Tom
>
>>
>>
>> Can you tell us your Lenovo models please ?
>>
>> Reagrds,
>>
>> Gabriel C
>>

2018-01-20 12:11:14

by Gabriel C

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

2018-01-20 8:03 GMT+01:00 Laura Abbott <[email protected]>:
> On 01/19/2018 10:57 PM, Laura Abbott wrote:
>>
>> On 01/19/2018 10:15 PM, Tom Lendacky wrote:
>>>
>>> On 1/19/2018 11:25 PM, Gabriel C wrote:
>>>>
>>>> 2018-01-20 5:02 GMT+01:00 Laura Abbott <[email protected]>:
>>>>>
>>>>> On 01/19/2018 06:23 PM, Gabriel C wrote:
>>>>>>
>>>>>>
>>>>>> 2018-01-20 2:23 GMT+01:00 Laura Abbott <[email protected]>:
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi ,
>>>>>>
>>>>>>>
>>>>>>> Fedora got multiple reports of an early bootup crash post -rc8.
>>>>>>> Bisection showed bacf6b499e11 ("x86/mm: Use a struct to reduce
>>>>>>> parameters for SME PGD mapping") . It doesn't revert cleanly
>>>>>>> but if I revert the few other changes in arch/x86/mm/mem_encrypt.c
>>>>>>> as well it boots up fine.
>>>>>>>
>>>>>>> Annoyingly, I can't seem to get any actual kernel logs even with
>>>>>>> earlyprintk. It just reboots immediately (triple fault?). This
>>>>>>> happens on both of my Lenovo machines and I can ask other reporters
>>>>>>> for details as well.
>>>>>>>
>>>>>>
>>>>>> I tested these patches on 2 Lenovo Ideapad both with Skylake CPUs
>>>>>> on a older dual Xeon box , on 2 Toshibas with AMD APUs , on a RYZEN
>>>>>> box ,
>>>>>> on dual EPYC box .. ofc on EPYC with mem_encrypt=on on the Intel CPUs
>>>>>> disabled.
>>>>>>
>>>>>> Also tested on top 4.14.13 , 4.14.14 as well on top 4.15.0-rc7 and on
>>>>>> current master/rc8++ without to see something like this.
>>>>>>
>>>>>> Also we pushed these patches on 4.14.13/14 and didn't got any reports
>>>>>> about
>>>>>> something like this.
>>>>>>
>>>>>> What Lenovo boxes are these ? maybe I find one to reproduce.
>>>>>>
>>>>>>
>>>>>>> $ git bisect log
>>>>>>> # bad: [ec835f8104a21f4d4eeb9d316ee71d2b4a7f00de] Merge tag
>>>>>>> 'trace-v4.15-rc4-3' of
>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
>>>>>>> # good: [a8750ddca918032d6349adbf9a4b6555e7db20da] Linux 4.15-rc8
>>>>>>> git bisect start 'origin/master' 'v4.15-rc8'
>>>>>>> # bad: [79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68] Merge tag
>>>>>>> 'mmc-v4.15-rc2-3' of
>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
>>>>>>> git bisect bad 79683f80e4f07dba13cc08d0ebcf5c7b0aa1bf68
>>>>>>> # good: [161f72ed6dbe7fb176585091d3b797125d310399] Merge tag
>>>>>>> 'mac80211-for-davem-2018-01-15' of
>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
>>>>>>> git bisect good 161f72ed6dbe7fb176585091d3b797125d310399
>>>>>>> # good: [88dc7fca18001fd883e5ace775afa316b68c8f2c] Merge branch
>>>>>>> 'x86-pti-for-linus' of
>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>>>>> git bisect good 88dc7fca18001fd883e5ace775afa316b68c8f2c
>>>>>>> # bad: [d47924417319e3b6a728c0b690f183e75bc2a702] x86/intel_rdt/cqm:
>>>>>>> Prevent
>>>>>>> use after free
>>>>>>> git bisect bad d47924417319e3b6a728c0b690f183e75bc2a702
>>>>>>> # good: [fc90ccfd286eabb05ec54521367df8663cf0bbbf] Revert "x86/apic:
>>>>>>> Remove
>>>>>>> init_bsp_APIC()"
>>>>>>> git bisect good fc90ccfd286eabb05ec54521367df8663cf0bbbf
>>>>>>> # bad: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4] x86/mm: Use a
>>>>>>> struct to
>>>>>>> reduce parameters for SME PGD mapping
>>>>>>> git bisect bad bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4
>>>>>>> # good: [1303880179e67c59e801429b7e5d0f6b21137d99] x86/mm: Clean up
>>>>>>> register
>>>>>>> saving in the __enc_copy() assembly code
>>>>>>> git bisect good 1303880179e67c59e801429b7e5d0f6b21137d99
>>>>>>> # first bad commit: [bacf6b499e11760aef73a3bb5ce4e5eea74a3fd4]
>>>>>>> x86/mm:
>>>>>>> Use a
>>>>>>> struct to reduce parameters for SME PGD mapping
>>>>>>>
>>>>>>>
>>>>>>> Configuration is at
>>>>>>>
>>>>>>>
>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>>>>>>> Note that I do think this is something in the Fedora configuration
>>>>>>> because a generic "make defconfig" booted just fine.
>>>>>>
>>>>>>
>>>>>>
>>>>>> But maybe some of the Fedora patches ?
>>>>>>
>>>>>> Can you try an kernel with the config but without any patches ?
>>>>>> Or a defconfig and just enable CONFIG_AMD_MEM_ENCRYPT ?
>>>>>>
>>>>>
>>>>> The bisect was a vanilla kernel without Fedora patches.
>>>>
>>>>
>>>> Ok . I did an build ( v4.15-rc8-225-g8dd903d2cf7b ) on my AMD
>>>> Workstation ( EPYC CPU )
>>>
>>>
>>> I've tried multiple config combinations on my EPYC system and have not
>>> been able to reproduce this issue and have not had any boot issues with
>>> mem_encrypt=on or mem_encrypt=off. I don't have access to a non-AMD box
>>> at the moment, but I'm really scratching my head on this one.
>>>
>>>> with your 64bit config and one on the Ideapad ( Intel i7-6498DU ) ..
>>>> I disabled Selinux since I don't use it here and module signing.
>>>>
>>>> Also with your config my serial setup won't work and the kernel hangs
>>>> but mem_encrypt=on/off works just fine.
>>>
>>>
>>> If your using the EPYC serial device and you haven't enabled legacy
>>> serial device support (if your BIOS supports that), then you're using
>>> it as a platform device. The fedora config has not set the
>>> CONFIG_X86_AMD_PLATFORM_DEVICE setting so you won't get the module
>>> to load and give you serial output.
>>>
>>> I'm confused when you say the kernel hangs but mem_encrypt=on/off works
>>> just fine, can you explain that a bit more?
>>>
>>>>
>>>> Also I notice on the Workstation it takes forever to boot untill
>>>> 'DMA-API' reports out-of-memory
>>>> ( dunno how much memory it need but the box has 128GB of RAM )..
>>>
>>>
>>
>> The machines I have are a Lenovo X1 Carbon and a Lenovo T470s.
>> A Lenovo X220 ThinkPad also reported the problem.
>>
>> If I comment out sme_encrypt_kernel it boots:
>>
>> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>> index 7ba5d819ebe3..443ef5d3f1fa 100644
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -158,7 +158,7 @@ unsigned long __head __startup_64(unsigned long
>> physaddr,
>> *p += load_delta - sme_get_me_mask();
>>
>> /* Encrypt the kernel and related (if SME is active) */
>> - sme_encrypt_kernel(bp);
>> + //sme_encrypt_kernel(bp);
>>
>> /*
>> * Return the SME encryption mask (if SME is active) to be used
>> as a
>>
>>
>> Interestingly, I tried to print the values in sme_active
>> (sme_me_mask , sev_enabled) followed by a return at the
>> very start of sme_encrypt_kernel and that rebooted as well,
>> vs booting if I just kept the return. sme_me_mask and
>> sev_enabled are explicitly marked as being in .data,
>> is it possible they are ending up in a section that isn't
>> yet mapped or did I hit print too early?
>>
>> Thanks,
>> Laura
>>
>
> One last note: This was built with gcc 7.2.1

With retpoline patches (if so, which patchset)? Or the one from the gcc7 branch?

I haven't tested with this gcc version; I've tested with gcc6 and a
retpoline-enabled gcc8 compiler.

Regards,

Gabriel

2018-01-20 12:15:54

by Ingo Molnar

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8


* Laura Abbott <[email protected]> wrote:

> Configuration is at https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
> Note that I do think this is something in the Fedora configuration
> because a generic "make defconfig" booted just fine.

Hm, it says:

Invalid branch: rawhide/

Could you send the .config as attachment please?

Thanks,

Ingo


2018-01-20 12:34:55

by Ingo Molnar

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8


* Laura Abbott <[email protected]> wrote:

> The machines I have are a Lenovo X1 Carbon and a Lenovo T470s.
> A Lenovo X220 ThinkPad also reported the problem.
>
> If I comment out sme_encrypt_kernel it boots:
>
> diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
> index 7ba5d819ebe3..443ef5d3f1fa 100644
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -158,7 +158,7 @@ unsigned long __head __startup_64(unsigned long
> physaddr,
> *p += load_delta - sme_get_me_mask();
>
> /* Encrypt the kernel and related (if SME is active) */
> - sme_encrypt_kernel(bp);
> + //sme_encrypt_kernel(bp);
>
> /*
> * Return the SME encryption mask (if SME is active) to be used as a
>
>
> Interestingly, I tried to print the values in sme_active
> (sme_me_mask , sev_enabled) followed by a return at the
> very start of sme_encrypt_kernel and that rebooted as well,
> vs booting if I just kept the return. sme_me_mask and
> sev_enabled are explicitly marked as being in .data,
> is it possible they are ending up in a section that isn't
> yet mapped or did I hit print too early?

So all this is in awfully early code.

I think you should only be able to use early_printk here - is that what you are
using?

Like Linus, I don't see anything explicitly wrong in the patch, but it obviously
made a difference to you and others, and the commenting-out experiment verifies
the bisection result, I think.

Here's a brute-force list of historic problems in early code, and an attempt to
check whether those aspects are fine:

1) stack troubles

The bisected-to patch adds one more C function call parameter, and one of the (low
probability) possibilities would be for the initial stack to be overflowing.

But stack setup in startup_64() looks fine to me:

	/* Set up the stack for verify_cpu(), similar to initial_stack below */
	leaq	(__end_init_task - SIZEOF_PTREGS)(%rip), %rsp

	/* Sanitize CPU configuration */
	call verify_cpu


__end_init_task is defined as:

#define INIT_TASK_DATA(align) \
	. = ALIGN(align); \
	VMLINUX_SYMBOL(__start_init_task) = .; \
	*(.data..init_task) \
	VMLINUX_SYMBOL(__end_init_task) = .;


and we set up space for the init task in arch/x86/kernel/vmlinux.lds.S via:

	/* Data */
	.data : AT(ADDR(.data) - LOAD_OFFSET) {
		/* Start of data section */
		_sdata = .;

		/* init_task */
		INIT_TASK_DATA(THREAD_SIZE)

where THREAD_SIZE is at least 16K of space, more on KASAN.

So we put the initial stack PT_REGS below the end of &init_task - which should all
be good and there should be plenty of space.
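
To put rough numbers on that, here is a small user-space sketch of the
arithmetic (not kernel code). It assumes 4K pages, a THREAD_SIZE_ORDER of 2
(no KASAN) and a struct pt_regs of 21 64-bit registers; the sketch_* names
and constants are illustrative stand-ins, not the kernel's own macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for kernel constants; the values are assumptions. */
#define SKETCH_PAGE_SIZE	4096u	/* 4K pages */
#define SKETCH_PTREGS_WORDS	21u	/* registers saved in struct pt_regs */

/* THREAD_SIZE = PAGE_SIZE << THREAD_SIZE_ORDER */
static size_t sketch_thread_size(unsigned int order)
{
	return (size_t)SKETCH_PAGE_SIZE << order;
}

/*
 * Stack bytes available below an initial %rsp that sits
 * SIZEOF_PTREGS below the end of the init task area.
 */
static size_t sketch_initial_stack_room(unsigned int order)
{
	size_t sizeof_ptregs = SKETCH_PTREGS_WORDS * sizeof(uint64_t);

	assert(sketch_thread_size(order) > sizeof_ptregs);
	return sketch_thread_size(order) - sizeof_ptregs;
}
```

Under those assumptions sketch_initial_stack_room(2) works out to
16384 - 168 = 16216 bytes, so one extra function parameter is very unlikely
to overflow it.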

2) using global variables, which is unsafe in early code if the kernel is
relocatable.

The bisected-to commit uses a new sme_populate_pgd_data to collect variables that
were already on the stack, which should be position independent and safe.

But the other commits use sme_active(), which does:

bool sme_active(void)
{
	return sme_me_mask && !sev_enabled;
}
EXPORT_SYMBOL(sme_active);

And that looks PIC-unsafe to me, as both are globals:

u64 sme_me_mask __section(.data) = 0;
EXPORT_SYMBOL(sme_me_mask);

Does the code start working if you force sme_active() to 0 while keeping the
function call, i.e. something like the hack below?

Thanks,

Ingo

arch/x86/mm/mem_encrypt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 3ef362f598e3..52f7db4d08d6 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -403,7 +403,7 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size)
  */
 bool sme_active(void)
 {
-	return sme_me_mask && !sev_enabled;
+	return 0;
 }
 EXPORT_SYMBOL(sme_active);


2018-01-20 13:16:13

by Ingo Molnar

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8


* Ingo Molnar <[email protected]> wrote:

> 2)
>
> using global variables, which is unsafe in early code if the kernel is
> relocatable.
>
> The bisected to commit uses a new sme_populate_pgd_data to collect variables that
> were already on the stack, which should be position independent and safe.
>
> But the other commits use sme_active(), which does:
>
> bool sme_active(void)
> {
> return sme_me_mask && !sev_enabled;
> }
> EXPORT_SYMBOL(sme_active);
>
> And that looks PIC-unsafe to me, as both are globals:
>
> u64 sme_me_mask __section(.data) = 0;
> EXPORT_SYMBOL(sme_me_mask);
>
> Does the code start working if you force sme_active() to 0 while keeping the
> function call, i.e. something like the hack below?

BTW, this aspect of the boot code is really fragile, and depending on the compiler
there could be unsafe relocations generated without it being 'obvious' from the
patch itself. It's also pretty compiler and code layout dependent ...

A good way to check this I think would be to turn off CONFIG_RELOCATABLE=y in the
.config - does that make the kernel boot again?

If that makes a difference then we need to take a look at the relocations in the
two key files, with CONFIG_RELOCATABLE=y turned back on:

objdump -r arch/x86/kernel/head64.o
objdump -r arch/x86/mm/mem_encrypt.o

There are three types of relocations that should be there normally:

#define R_X86_64_64 1 /* Direct 64 bit */
#define R_X86_64_PC32 2 /* PC relative 32 bit signed */
#define R_X86_64_32S 11 /* Direct 32 bit sign extended */

Only R_X86_64_PC32 is safe as-is, R_X86_64_32S needs to be used via
fixup_pointer().
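
For illustration, a rough user-space model of the address translation that
fixup_pointer() performs: take the link-time address, subtract the link-time
address of _text and add the address the image was actually loaded at. This
is a sketch under my reading of the helper in head64.c; the function name,
parameter names and the addresses used with it are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of an early-boot pointer fixup (hypothetical helper):
 *   link_addr  - address the linker assigned to the object
 *   link_text  - address the linker assigned to the start of the image
 *   load_text  - address the start of the image really is at right now
 */
static uint64_t sketch_fixup_pointer(uint64_t link_addr, uint64_t link_text,
				     uint64_t load_text)
{
	assert(link_addr >= link_text);

	/* Keep the offset into the image, swap the base. */
	return link_addr - link_text + load_text;
}
```

With a link-time _text of 0xffffffff81000000, a load address of 0x1000000
and a global linked at 0xffffffff81234560, the model returns 0x1234560. A
direct (non-PC-relative) reference would instead use the link-time address,
which is not valid while running from the identity mapping.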

What makes this difficult in the SME context is that the early boot portion of
arch/x86/mm/mem_encrypt.c is not separated out, but mixed in with later code.

I missed this aspect when reviewing and merging this code :-(

Maybe a diff of the list of relocations of the before/after commit points would be
nice.

I.e. does something like:

git checkout <last_working_commit_sha1>
objdump -r arch/x86/mm/mem_encrypt.o | grep R_X86 | cut -d' ' -f2- > working.relocs

git checkout <first_broken_commit_sha1>
objdump -r arch/x86/mm/mem_encrypt.o | grep R_X86 | cut -d' ' -f2- > broken.relocs

diff -up working.relocs broken.relocs

show any changes to the relocations?

Side note:

Regardless of whether it's the root cause for this regression we definitely need
to improve the relocations robustness of early boot code: at minimum we should
isolate all critical functionality into a separate section, and then add tooling
checks to make sure all relocations are safe.
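
As a minimal sketch of what such a tooling check could look like, the helper
below classifies one line of "objdump -r" output: R_X86_64_PC32 is accepted,
any other R_X86_64_* type is flagged for a closer look. The names are
hypothetical, and a real check would also have to restrict itself to the
early-boot section once that exists.

```c
#include <assert.h>
#include <string.h>

enum sketch_reloc_verdict {
	SKETCH_NOT_A_RELOC,	/* header or blank line */
	SKETCH_RELOC_OK,	/* PC-relative, safe as-is */
	SKETCH_RELOC_SUSPECT,	/* needs review / fixup_pointer() */
};

/* Classify one line of "objdump -r" output by its relocation type. */
static enum sketch_reloc_verdict sketch_classify(const char *line)
{
	static const char safe[] = "R_X86_64_PC32";
	const char *type = strstr(line, "R_X86_64_");
	size_t len;

	if (!type)
		return SKETCH_NOT_A_RELOC;

	/* The type name runs up to the next whitespace. */
	len = strcspn(type, " \t\n");
	if (len == strlen(safe) && strncmp(type, safe, len) == 0)
		return SKETCH_RELOC_OK;

	return SKETCH_RELOC_SUSPECT;
}
```

Feeding every line of the objdump output through sketch_classify() and
failing the build on any SKETCH_RELOC_SUSPECT result would be one way to wire
this up.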

Thanks,

Ingo

2018-01-20 15:40:09

by Laura Abbott

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On 01/20/2018 04:12 AM, Ingo Molnar wrote:
>
> * Laura Abbott <[email protected]> wrote:
>
>> Configuration is at https://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/fedora.git/commit/?h=rawhide
>> Note that I do think this is something in the Fedora configuration
>> because a generic "make defconfig" booted just fine.
>
> Hm, it says:
>
> Invalid branch: rawhide/
>
> Could you send the .config as attachment please?
>
> Thanks,
>
> Ingo
>

Attached.


Attachments:
kernel-4.15.0-x86_64.config (190.30 kB)

2018-01-20 17:36:00

by Tom Lendacky

Subject: Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

On 1/20/2018 10:52 AM, Laura Abbott wrote:
> On 01/20/2018 05:13 AM, Ingo Molnar wrote:
>>
>> * Ingo Molnar <[email protected]> wrote:
>>
>>> 2)
>>>
>>> using global variables, which is unsafe in early code if the kernel is
>>> relocatable.
>>>
>>> The bisected to commit uses a new sme_populate_pgd_data to collect
>>> variables that
>>> were already on the stack, which should be position independent and safe.
>>>
>>> But the other commits use sme_active(), which does:
>>>
>>> bool sme_active(void)
>>> {
>>>          return sme_me_mask && !sev_enabled;
>>> }
>>> EXPORT_SYMBOL(sme_active);
>>>
>>> And that looks PIC-unsafe to me, as both are globals:
>>>
>>> u64 sme_me_mask __section(.data) = 0;
>>> EXPORT_SYMBOL(sme_me_mask);
>>>
>>> Does the code start working if you force sme_active() to 0 while
>>> keeping the
>>> function call, i.e. something like the hack below?
>>
>> BTW., this aspect of the boot code is really fragile, and depending on
>> compiler
>> there could be unsafe relocations generated without it being 'obvious'
>> from the
>> patch itself. It's also pretty compiler and code layout dependent ...
>>
>> A good way to check this I think would be to turn off
>> CONFIG_RELOCATABLE=y in the
>> .config - does that make the kernel boot again?
>>
>> If that makes a difference then we need to take a look at the
>> relocations in the
>> two key files, with CONFIG_RELOCATABLE=y turned back on:
>>
>>    objdump -r arch/x86/kernel/head64.o
>>    objdump -r arch/x86/mm/mem_encrypt.o
>>
>> There's three types of relocations that should be there normally:
>>
>> #define R_X86_64_64             1       /* Direct 64 bit  */
>> #define R_X86_64_PC32           2       /* PC relative 32 bit signed */
>> #define R_X86_64_32S            11      /* Direct 32 bit sign extended */
>>
>> Only R_X86_64_PC32 is safe as-is, R_X86_64_32S needs to be used via
>> fixup_pointer().
>>
>> What makes this difficult in the SME context is that the early boot
>> portion of
>> arch/x86/mm/mem_encrypt.c is not separated out, but mixed in with later
>> code.
>>
>> I missed this aspect when reviewing and merging this code :-(
>>
>> Maybe a diff of the list of relocations of the before/after commit
>> points would be
>> nice.
>>
>> I.e. does something like:
>>
>>    git checkout <last_working_commit_sha1>
>>    objdump -r arch/x86/mm/mem_encrypt.o  | grep R_X86 | cut -d' ' -f2- >
>> working.relocs
>>
>>    git checkout <first_broken_commit_sha1>
>>    objdump -r arch/x86/mm/mem_encrypt.o  | grep R_X86 | cut -d' ' -f2- >
>> broken.relocs
>>
>>    diff -up working.relocs broken.relocs
>>
>> show any changes to the relocations?
>>
>> Side note:
>>
>> Regardless of whether it's the root cause for this regression we
>> definitely need
>> to improve the relocations robustness of early boot code: at minimum we
>> should
>> isolate all critical functionality into a separate section, and then add
>> tooling
>> checks to make sure all relocations are safe.
>>
>> Thanks,
>>
>>     Ingo
>>
>
> For the previous question, forcing sme_active() to 0 _does_ make the
> kernel work. Unfortunately, I can't test without relocations since
> I need to boot with CONFIG_EFI_STUB, but the relocations did show
> something interesting:
>
> +R_X86_64_PC32     __stack_chk_fail-0x0000000000000004
>
> There's a new call to __stack_chk_fail and if I dump the end of
> sme_encrypt_kernel I do see that stuck in there. I bet the size
> of struct sme_populate_pgd_data is now large enough to trigger
> a stack check. If I add __nostackprotector to sme_encrypt_kernel
> like sme_enable has, it boots fine. This would explain why that
> particular commit showed as the problem in bisection.

Great find Laura. It must have something to do with compiler levels
since my level didn't insert that check.
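
A minimal user-space sketch of the pattern, for reference. As far as I
understand, -fstack-protector-strong adds a canary to functions that take
the address of a local, which is what passing a local struct by pointer
does; whether that is exactly what happened here is an assumption. The
struct and the sketch_* names are hypothetical stand-ins, and the macro only
approximates what the kernel's __nostackprotector expands to.

```c
#include <assert.h>

/* Pick whichever "no stack protector" attribute the compiler offers. */
#if defined(__has_attribute)
# if __has_attribute(no_stack_protector)
#  define SKETCH_NOSTACKPROTECTOR __attribute__((no_stack_protector))
# endif
#endif
#ifndef SKETCH_NOSTACKPROTECTOR
# define SKETCH_NOSTACKPROTECTOR __attribute__((optimize("no-stack-protector")))
#endif

/* Hypothetical stand-in for a parameter block like sme_populate_pgd_data. */
struct sketch_map_data {
	unsigned long paddr;
	unsigned long vaddr;
	unsigned long vaddr_end;
};

/* The callee takes a pointer, so the caller's local has its address taken. */
static unsigned long sketch_span(const struct sketch_map_data *d)
{
	return d->vaddr_end - d->vaddr;
}

/* Marked so that the compiler does not emit a canary check for it. */
SKETCH_NOSTACKPROTECTOR
static unsigned long sketch_caller(unsigned long start, unsigned long len)
{
	struct sketch_map_data d = {
		.paddr = start,
		.vaddr = start,
		.vaddr_end = start + len,
	};

	return sketch_span(&d);
}
```

Building this with -fstack-protector-strong, with and without the attribute,
and looking for __stack_chk_fail in the disassembly should show the
difference.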

Thanks,
Tom

>
> Thanks,
> Laura

2018-01-21 01:17:14

by Laura Abbott

Subject: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel

Commit bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for
SME PGD mapping") moved some parameters into a structure. The
structure was large enough to trigger the stack protection canary
in sme_encrypt_kernel which doesn't work this early, causing reboots.
Mark sme_encrypt_kernel appropriately to not use the canary.

Fixes: bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for
SME PGD mapping")
Signed-off-by: Laura Abbott <[email protected]>
---
I hadn't seen this picked up yet so sending explicitly
---
arch/x86/mm/mem_encrypt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 3ef362f598e3..e1d61e8500f9 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -738,7 +738,7 @@ static unsigned long __init sme_pgtable_calc(unsigned long len)
 	return total;
 }

-void __init sme_encrypt_kernel(struct boot_params *bp)
+void __init __nostackprotector sme_encrypt_kernel(struct boot_params *bp)
 {
 	unsigned long workarea_start, workarea_end, workarea_len;
 	unsigned long execute_start, execute_end, execute_len;
--
2.15.1


2018-01-21 01:25:13

by Linus Torvalds

Subject: Re: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel

On Sat, Jan 20, 2018 at 5:14 PM, Laura Abbott <[email protected]> wrote:
>
> I hadn't seen this picked up yet so sending explicitly

Ingo, I just took this directly as a patch.

Linus

2018-01-21 01:51:10

by Gabriel C

Subject: Re: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel

On 21.01.2018 02:23, Linus Torvalds wrote:
> On Sat, Jan 20, 2018 at 5:14 PM, Laura Abbott <[email protected]> wrote:
>>
>> I hadn't seen this picked up yet so sending explicitly
>
> Ingo, I just took this directly as a patch.

Added stable to CC since the patch series this patch fixes
is in stable-queue.

Regards,

Gabriel C


2018-01-21 04:18:25

by Linus Torvalds

Subject: Re: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel

On Sat, Jan 20, 2018 at 5:49 PM, Gabriel C <[email protected]> wrote:
>
> Added stable to CC since the patch series this patch fixes
> is in stable-queue.

Oh, it wasn't clear from the commit message. But I guess the "Fixes:"
tag would have caught Greg's eye regardless.

Anyway, Laura's fix is commit 91cfc88c66bf ("x86: Use __nostackprotect
for sme_encrypt_kernel") in my tree.

Linus

2018-01-21 08:48:13

by Ingo Molnar

Subject: Re: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel


* Linus Torvalds <[email protected]> wrote:

> On Sat, Jan 20, 2018 at 5:14 PM, Laura Abbott <[email protected]> wrote:
> >
> > I hadn't seen this picked up yet so sending explicitly
>
> Ingo, I just took this directly as a patch.

Thanks!

Ingo

2018-01-21 09:38:23

by Greg Kroah-Hartman

Subject: Re: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel

On Sat, Jan 20, 2018 at 08:16:31PM -0800, Linus Torvalds wrote:
> On Sat, Jan 20, 2018 at 5:49 PM, Gabriel C <[email protected]> wrote:
> >
> > Added stable to CC since the patch series this patch fixes
> > is in stable-queue.
>
> Oh, it wasn't clear from the commit message. But I guess the "Fixes:"
> tag would have caught Greg's eye regardless.

Ugh, I am _so_ behind in looking at patches that only have a Fixes: tag
in it and not a "Cc: stable@" tag, due to the recent high-volume of the
latter.

But they will end up in a mbox that I need to dig out of eventually, but
it will take time, so if you know you want a patch in a stable release,
it's much easier to just use the "Cc: stable@" tag please.

thanks,

greg k-h

2018-01-21 09:51:49

by Ingo Molnar

Subject: Re: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel


* Greg Kroah-Hartman <[email protected]> wrote:

> On Sat, Jan 20, 2018 at 08:16:31PM -0800, Linus Torvalds wrote:
> > On Sat, Jan 20, 2018 at 5:49 PM, Gabriel C <[email protected]> wrote:
> > >
> > > Added stable to CC since the patch series this patch fixes
> > > is in stable-queue.
> >
> > Oh, it wasn't clear from the commit message. But I guess the "Fixes:"
> > tag would have caught Greg's eye regardless.
>
> Ugh, I am _so_ behind in looking at patches that only have a Fixes: tag
> in it and not a "Cc: stable@" tag, due to the recent high-volume of the
> latter.
>
> But they will end up in a mbox that I need to dig out of eventually, but
> it will take time, so if you know you want a patch in a stable release,
> it's much easier to just use the "Cc: stable@" tag please.

Just to make it easier, please put this upstream fix into -stable:

91cfc88c66bf: ("x86: Use __nostackprotect for sme_encrypt_kernel")

I believe all the prerequisite upstream commits are in -stable already:

1303880179e6: x86/mm: Clean up register saving in the __enc_copy() assembly code
bacf6b499e11: x86/mm: Use a struct to reduce parameters for SME PGD mapping
2b5d00b6c2cd: x86/mm: Centralize PMD flags in sme_encrypt_kernel()
cc5f01e28d6c: x86/mm: Prepare sme_encrypt_kernel() for PAGE aligned encryption
107cd2532181: x86/mm: Encrypt the initrd earlier for BSP microcode update

Thanks,

Ingo

2018-01-21 10:37:25

by Greg Kroah-Hartman

Subject: Re: [PATCH] x86: Use __nostackprotect for sme_encrypt_kernel

On Sun, Jan 21, 2018 at 10:50:20AM +0100, Ingo Molnar wrote:
>
> * Greg Kroah-Hartman <[email protected]> wrote:
>
> > On Sat, Jan 20, 2018 at 08:16:31PM -0800, Linus Torvalds wrote:
> > > On Sat, Jan 20, 2018 at 5:49 PM, Gabriel C <[email protected]> wrote:
> > > >
> > > > Added stable to CC since the patch series this patch fixes
> > > > is in stable-queue.
> > >
> > > Oh, it wasn't clear from the commit message. But I guess the "Fixes:"
> > > tag would have caught Greg's eye regardless.
> >
> > Ugh, I am _so_ behind in looking at patches that only have a Fixes: tag
> > in it and not a "Cc: stable@" tag, due to the recent high-volume of the
> > latter.
> >
> > But they will end up in a mbox that I need to dig out of eventually, but
> > it will take time, so if you know you want a patch in a stable release,
> > it's much easier to just use the "Cc: stable@" tag please.
>
> Just to make it easier, please put this upstream fix into -stable:
>
> 91cfc88c66bf: ("x86: Use __nostackprotect for sme_encrypt_kernel")

Thanks, now queued up.

> I believe all the prerequisite upstream commits are in -stable already:
>
> 1303880179e6: x86/mm: Clean up register saving in the __enc_copy() assembly code
> bacf6b499e11: x86/mm: Use a struct to reduce parameters for SME PGD mapping
> 2b5d00b6c2cd: x86/mm: Centralize PMD flags in sme_encrypt_kernel()
> cc5f01e28d6c: x86/mm: Prepare sme_encrypt_kernel() for PAGE aligned encryption
> 107cd2532181: x86/mm: Encrypt the initrd earlier for BSP microcode update

Yes, those are all there already.

greg k-h