2018-01-07 22:55:40

by Linus Torvalds

Subject: Linux 4.15-rc7

Ok, we had an interesting week, and by now everybody knows why we were
merging all those odd x86 page table isolation patches without
following all of the normal release timing rules.

But rc7 itself is actually pretty calm. Yes, there were a few small
follow-up patches to the PTI code still, and yes, there's been a fair
amount of discussion about the exact details of the Spectre fixes, but
at least in general things have been nice and calm. And we're actually
back to "normal" in that most of the patches are drivers (mainly GPU,
some crypto, some random small things - input layer, platform drivers
etc). There are misc small filesystem and arch updates too.

The appended shortlog is small enough that it's easy to just scroll
down and get a feel for what happened.

The one thing I want to do now that Meltdown and Spectre are public,
is to give a *big* shout-out to the x86 people, and Thomas Gleixner in
particular for really being on top of this. It's been one huge
annoyance, and honestly, Thomas really went over and beyond in this
whole mess. A lot of other people have obviously been involved too,
don't get me wrong, but this is exactly the kind of issue that easily
results in lots of nasty hacky patches because people are falling all
over themselves trying to fix it and they can't even talk about why
they are doing it in public, and Thomas &co ended up being a huge
reason for why it was all much easier for me to merge: because of the
literally _months_ of work on quality control and gating these patches
and making sure the end result was a clean and manageable series.

So a big *BIG* thanks to Thomas for making it so much easier for me to
merge all this stuff. The whole nasty TLB isolation patches would
have been just _so_ much more horrible without him.

Anyway, due to this all, 4.15 will obviously be one of the releases
with an rc8, even if things are starting to really calm down by now.
We'll see, hopefully we won't need any more than that.

Linus

---

Aaron Ma (1):
Input: elantech - add new icbody type 15

Al Viro (2):
sget(): handle failures of register_shrinker()
fix "netfilter: xt_bpf: Fix XT_BPF_MODE_FD_PINNED mode of 'xt_bpf_info_v1'"

Alejandro Mery (3):
ARM: davinci: Use platform_device_register_full() to create pdev for dm365's eDMA
ARM: davinci: Add dma_mask to dm365's eDMA device
ARM: davinci: fix mmc entries in dm365's dma_slave_map

Alexey Brodkin (2):
ARC: Fix detection of dual-issue enabled
ARC: [plat-hsdk] Switch DisplayLink driver from fbdev to DRM

Aliaksei Karaliou (2):
xfs: quota: fix missed destroy of qi_tree_lock
xfs: quota: check result of register_shrinker()

Andrea Arcangeli (1):
userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails

Andrew Morton (1):
kernel/exit.c: export abort() to modules

Andrey Ryabinin (1):
x86/mm: Set MODULES_END to 0xffffffffff000000

Anshuman Khandual (1):
mm/mprotect: add a cond_resched() inside change_pmd_range()

Anthony Kim (1):
Input: hideep - fix compile error due to missing include file

Antoine Tenart (3):
crypto: inside-secure - free requests even if their handling failed
crypto: inside-secure - fix request allocations in invalidation path
crypto: inside-secure - do not use areq->result for partial results

Ard Biesheuvel (1):
efi/capsule-loader: Reinstate virtual capsule mapping

Arnd Bergmann (3):
ARM: dts: ls1021a: fix incorrect clock references
ARM: dts: tango4: remove bogus interrupt-controller property
crypto: chelsio - select CRYPTO_GF128MUL

Baoquan He (1):
mm/sparse.c: wrong allocation for mem_section

Bogdan Mirea (2):
arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property
arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property

Boris Brezillon (1):
mtd: nand: pxa3xx: Fix READOOB implementation

Chen-Yu Tsai (1):
ARM: dts: sunxi: Convert to CCU index macros for HDMI controller

Chris Mason (1):
btrfs: fix refcount_t usage when deleting btrfs_delayed_nodes

Christian Borntraeger (2):
KVM: s390: fix cmma migration for multiple memory slots
KVM: s390: prevent buffer overrun on memory hotplug during migration

Dan Carpenter (1):
afs: Potential uninitialized variable in afs_extract_data()

Darrick J. Wong (1):
xfs: fix s_maxbytes overflow problems

Dave Young (2):
x86/efi: Fix kernel param add_efi_memmap regression
mm: check pfn_valid first in zero_resv_unavail

David Howells (3):
fscache: Fix the default for fscache_maybe_release_page()
afs: Fix unlink
afs: Fix missing error handling in afs_write_end()

David Lechner (1):
ARM: dts: da850-lego-ev3: Fix battery voltage gpio

David Woodhouse (1):
x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm

Dhinakaran Pandiyan (1):
drm/i915/psr: Fix register name mess up.

Dmitry Torokhov (1):
Input: elants_i2c - do not clobber interrupt trigger on x86

Eric Biggers (3):
crypto: chacha20poly1305 - validate the digest size
crypto: pcrypt - fix freeing pcrypt instances
capabilities: fix buffer overread on very short xattr

Eric W. Biederman (1):
pid: Handle failure to allocate the first pid in a pid namespace

Eugeniy Paltsev (4):
ARC: [plat-hsdk]: Set initial core pll output frequency
ARC: [plat-hsdk]: Get rid of core pll frequency set in platform code
ARC: [plat-axs103]: Set initial core pll output frequency
ARC: [plat-axs103] refactor the quad core DT quirk code

Hans Verkuil (1):
omapdrm/dss/hdmi4_cec: fix interrupt handling

Heiko Carstens (1):
s390/sclp: disable FORTIFY_SOURCE for early sclp code

Heiko Stuebner (3):
ARM: dts: rockchip: add cpu0-regulator on rk3066a-marsboard
arm64: dts: rockchip: fix trailing 0 in rk3328 tsadc interrupts
arm64: dts: rockchip: limit rk3328-rock64 gmac speed to 100MBit for now

Helge Deller (6):
parisc: Show unhashed hardware inventory
parisc: Show initial kernel memory layout unhashed
parisc: Show unhashed HPA of Dino chip
parisc: Show unhashed EISA EEPROM address
parisc: Fix alignment of pa_tlb_lock in assembly on 32-bit SMP kernel
parisc: qemu idle sleep support

Icenowy Zheng (1):
arm64: allwinner: a64: add Ethernet PHY regulator for several boards

Jacek Anaszewski (1):
leds: core: Fix regression caused by commit 2b83ff96f51d

Jagan Teki (1):
arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3

James Hogan (1):
lib/mpi: Fix umul_ppmm() for MIPS64r6

Jan Engelhardt (1):
crypto: n2 - cure use after free

Javier Martinez Canillas (1):
ARM: dts: exynos: Enable Mixer node for Exynos5800 Peach Pi machine

Jean-Philippe Brucker (1):
iommu/arm-smmu-v3: Don't free page table ops twice

Jeffy Chen (1):
mailmap: update Mark Yao's email address

Jim Mattson (1):
kvm: vmx: Scrub hardware GPRs at VM-exit

Joel Stanley (1):
ARM: dts: aspeed-g4: Correct VUART IRQ number

John Johansen (1):
apparmor: fix regression in mount mediation when feature set is pinned

John Sperbeck (1):
powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERR

Jonathan Cameron (1):
crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t

Josh Poimboeuf (2):
x86/dumpstack: Fix partial register dumps
x86/dumpstack: Print registers for first stack frame

Kees Cook (1):
exec: Weaken dumpability for secureexec

Klaus Goger (1):
arm64: dts: rockchip: remove vdd_log from rk3399-puma

Linus Torvalds (1):
Linux 4.15-rc7

Lucas De Marchi (1):
drm/i915: Apply Display WA #1183 on skl, kbl, and cfl

Markus Heiser (1):
docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.c

Martin Schwidefsky (1):
s390: fix preemption race in disable_sacf_uaccess

Masahiro Yamada (1):
arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC

Matt Fleming (1):
MAINTAINERS: Remove Matt Fleming as EFI co-maintainer

Matthew Wilcox (1):
mm/debug.c: provide useful debugging information for VM_BUG

Maxime Ripard (1):
ARM: dts: sun8i: a711: Reinstate the PMIC compatible

Nick Desaulniers (1):
x86/process: Define cpu_tss_rw in same section as declaration

Nikolay Borisov (1):
btrfs: Fix flush bio leak

Ofer Heifetz (1):
crypto: inside-secure - per request invalidation

Oleg Nesterov (1):
kernel/acct.c: fix the acct->needcheck check in check_free_space()

Oleksandr Andrushchenko (1):
Input: xen-kbdfront - do not advertise multi-touch pressure support

Olof Johansson (1):
Input: joystick/analog - riscv has get_cycles()

Peter Rosin (1):
ARM: dts: at91: disable the nxp,se97b SMBUS timeout on the TSE-850

Peter Zijlstra (1):
x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers

Randy Dunlap (1):
documentation/gpu/i915: fix docs build error after file rename

Rob Herring (1):
ARM: dts: rockchip: fix rk3288 iep-IOMMU interrupts property cells

Robin Murphy (1):
iommu/arm-smmu-v3: Cope with duplicated Stream IDs

Russell King (5):
drm/armada: fix leak of crtc structure
drm/armada: fix SRAM powerdown
drm/armada: fix UV swap code
drm/armada: improve efficiency of armada_drm_plane_calc_addrs()
drm/armada: fix YUV planar format framebuffer offsets

Sebastian Ott (1):
s390/pci: handle insufficient resources during dma tlb flush

Sergey Matyukevich (1):
arm64: dts: orange-pi-zero-plus2: fix sdcard detect

Sergey Senozhatsky (2):
arc: do not use __print_symbol()
mm/zsmalloc.c: include fs.h

Sinan Kaya (1):
mfd: rtsx: Release IRQ during shutdown

Stefan Brüns (1):
sunxi-rsb: Include OF based modalias in device uevent

Stefan Haberland (1):
s390/dasd: fix wrongly assigned configuration data

Tetsuo Handa (1):
mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed.

Thomas Gleixner (7):
x86/pti: Enable PTI by default
x86/pti: Make sure the user/kernel PTEs match
x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
x86/mm: Map cpu_entry_area at the same place on 4/5 level
x86/kaslr: Fix the vaddr_end mess
x86/tlb: Drop the _GPL from the cpu_tlbstate export
x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN

Tom Lendacky (1):
x86/cpu, x86/pti: Do not enable PTI on AMD processors

Ville Syrjälä (2):
drm/i915: Disable DC states around GMBUS on GLK
drm/i915: Put all non-blocking modesets onto an ordered wq

Vineet Gupta (3):
ARC: uaccess: dont use "l" gcc inline asm constraint modifier
ARC: handle gcc generated __builtin_trap()
ARC: handle gcc generated __builtin_trap for older compiler

Wei Yongjun (1):
xen/pvcalls: use GFP_ATOMIC under spin lock

Xiongwei Song (1):
drm/ttm: check the return value of kzalloc

Yue Hin Lau (1):
drm/amd/display: call set csc_default if enable adjustment is false

Zhen Lei (1):
Input: ims-pcu - fix typo in the error message


2018-01-08 07:20:39

by Thomas Gleixner

Subject: Re: Linux 4.15-rc7

Linus,

On Sun, 7 Jan 2018, Linus Torvalds wrote:

> The one thing I want to do now that Meltdown and Spectre are public,
> is to give a *big* shout-out to the x86 people, and Thomas Gleixner in
> particular for really being on top of this. It's been one huge
> annoyance, and honestly, Thomas really went over and beyond in this
> whole mess. A lot of other people have obviously been involved too,
> don't get me wrong, but this is exactly the kind of issue that easily
> results in lots of nasty hacky patches because people are falling all
> over themselves trying to fix it and they can't even talk about why
> they are doing it in public, and Thomas &co ended up being a huge
> reason for why it was all much easier for me to merge: because of the
> literally _months_ of work on quality control and gating these patches
> and making sure the end result was a clean and manageable series.
>
> So a big *BIG* thanks to Thomas for making it so much easier for me to
> merge all this stuff. The whole nasty TLB isolation patches would
> have been just _so_ much more horrible without him.

I'm deeply moved and feel a little ashamed as without the help of others
this wouldn't have been possible at all. So it's on me to hand over the
*BIG* thanks to:

Ingo Molnar who was the git logistics mastermind behind this, the last
sanity check before commit and the initial stress tester. Thanks
especially for taking over most of the regular tip maintenance workload.

Andy Lutomirski for the great work on the entry code, cpu entry area, LDT
mapping and the PCID rework and his reviews.

Borislav Petkov for his meticulous reviews, his help with all AMD issues
and being always on standby for testing and debugging despite his
workload of backporting KAISER to dead kernels.

Peter Zijlstra for his work on the tlb flush / PCID code, reviews and
supporting me on the short trip into LDT VMA mapping which we had to drop
for various reasons.

Dave Hansen who did the initial KAISER port and helped all along with the
rework in various ways.

Josh Poimboeuf for fixing up all the stacktrace issues we encountered

Juergen Gross for helping out on the XEN side of things so we did
not have to dig into the innards of XEN/PV.

Kees Cook for helping with coordination behind the scenes

Greg Kroah-Hartman for not pestering us with all the pre 4.14 backports
and the smooth integration and exposure to 4.14 stable which gave us more
test coverage and helped us to iron out the inevitable hiccups.

Linus for keeping his diving harpoon in the cabinet and giving us great
support for getting this into his tree on time which allowed 4.14 to gain
all the goods as well.

The team at TU Graz who did the initial KAISER implementation. I'm really
impressed by what kernel first timers can achieve and I have to say that I
see worse code in my daily work as a maintainer. Congrats to them for
their findings in the guts of our CPUs as well. Really impressive!

This list is surely not complete, so I extend the thanks to everyone who
helped with review, patches, testing, bug reports and regression hunting.

I want to take the opportunity to say thanks to my wife Monika, my family
and my great team @linutronix for bearing with the extraordinarily grumpy old
greybeard which I certainly was for the past two months.

It's been an interesting challenge to sort that out in such a short
timeframe, but I'm sure all of the involved people would have preferred to
do this with the head start which at least one other OS got on this.

But it's not time yet for a post-mortem of this mess, we still have to sort
out the spectre mitigations and it seems Linus expects me to keep my hand
on things for the next time. Aye, aye, captain!

Let's sort this in a technical manner, with the security of our users in
mind and then take a break and after that sit down and gain the performance
back which we lost on the way. Lots of work ahead.

Thanks,

Thomas

2018-01-10 23:32:55

by Pavel Machek

Subject: Re: Linux 4.15-rc7

Hi!

> The one thing I want to do now that Meltdown and Spectre are public,
> is to give a *big* shout-out to the x86 people, and Thomas Gleixner in
> particular for really being on top of this. It's been one huge
> annoyance, and honestly, Thomas really went over and beyond in this
> whole mess. A lot of other people have obviously been involved too,

As I understand it: KPTI prevents Meltdown attack on x86-64, but
Spectre means even x86-64 is not expected to be safe?

Ok, so Meltdown is public... And I still have some nice 32-bit
machines I'd like to keep working.

Proof of concept is out, https://github.com/IAIK/meltdown/ .

Is anyone working on KPTI for x86-32? SLES11 should still be
supported, and that should have x86-32 version; any chance SUSE can
share some patches?

Thanks,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-11 11:35:57

by Olivier Galibert

Subject: Re: Linux 4.15-rc7

Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?

OG.



2018-01-11 14:06:22

by Nikolay Borisov

Subject: Re: Linux 4.15-rc7



On 11.01.2018 13:29, Olivier Galibert wrote:
> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?

4g/4g was never accepted upstream


2018-01-11 14:07:25

by Jiri Kosina

Subject: Re: Linux 4.15-rc7

On Thu, 11 Jan 2018, Pavel Machek wrote:

> Is anyone working on KPTI for x86-32? SLES11 should still be supported,
> and that should have x86-32 version; any chance SUSE can share some
> patches?

We are sharing sources of all our kernels at

http://kernel.suse.com/

If you can find the x86-32 support there, it's yours (hint: you won't).

Otherwise, you'd either have to wait until we (or someone else) implements
it (it's on our list), or implement it yourself.

Thanks,

--
Jiri Kosina
SUSE Labs

2018-01-12 11:06:28

by Pavel Machek

Subject: Re: Linux 4.15-rc7

Hi!

> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?

Good point. Is that still supported? Was it ever?

Umm. I seem to recall that 4G/4G layout was out of tree but never
merged.

High Memory Support
1. off (NOHIGHMEM)
2. 4GB (HIGHMEM4G)
> 3. 64GB (HIGHMEM64G)
choice[1-3]: 3
Memory split
> 1. 3G/1G user/kernel split (VMSPLIT_3G) (NEW)
2. 2G/2G user/kernel split (VMSPLIT_2G)
3. 1G/3G user/kernel split (VMSPLIT_1G)
choice[1-3?]:

Does anyone have recent patches?

Best regards,
Pavel


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-12 13:23:23

by Arnd Bergmann

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 12:06 PM, Pavel Machek <[email protected]> wrote:
> Hi!
>
>> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?
>
> Good point. Is that still supported? Was it ever?
>
> Umm. I seem to recall that 4G/4G layout was out of tree but never
> merged.

I think that's correct: it was in RHEL3 and RHEL4 but never merged
upstream.

However, there is an important difference between KPTI and X86_4G:
The former unmaps the kernel pages from the user space page tables,
but keeps both the linear mapping and the user pages visible in
kernel mode, while the latter must have also unmapped user space
pages from kernel mode, requiring a more expensive get_user/put_user
implementation.
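
To make that distinction concrete, here is a toy model in Python. It is only a sketch and has nothing to do with real kernel code: the scheme names and the `MAPPED` table are made up for illustration, and it ignores the small entry/trampoline area that KPTI still has to keep mapped in the user page tables.

```python
# Toy model, not kernel code: which halves of the address space stay mapped
# in the page tables that are active in each CPU mode.
# Scheme names and this table are hypothetical, for illustration only.
MAPPED = {
    # classic layout: one set of page tables for both modes
    "shared": {"user": {"user", "kernel"}, "kernel": {"user", "kernel"}},
    # KPTI: kernel pages dropped from the user-mode tables only
    # (simplification: the small entry area that must stay mapped is ignored)
    "kpti": {"user": {"user"}, "kernel": {"user", "kernel"}},
    # 4G/4G: user pages are also dropped from the kernel-mode tables
    "4g4g": {"user": {"user"}, "kernel": {"kernel"}},
}


def user_tables_map_kernel(scheme):
    """True if kernel pages are present in the tables used while in user mode."""
    return "kernel" in MAPPED[scheme]["user"]


def kernel_sees_user_pages(scheme):
    """True if the kernel can dereference user pointers directly."""
    return "user" in MAPPED[scheme]["kernel"]


for scheme in MAPPED:
    print(scheme,
          "kernel mapped in user tables:", user_tables_map_kernel(scheme),
          "user pages visible to kernel:", kernel_sees_user_pages(scheme))
```

In this model the only scheme where `kernel_sees_user_pages()` is false is 4G/4G, which is the case where get_user/put_user can no longer be a plain memory access.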

Kees mentioned an idea to also unmap user pages from kernel
mode as an additional safeguard on top of KPTI, which would get
it even closer to the X86_4G implementation:
https://outflux.net/blog/archives/2018/01/04/smep-emulation-in-pti/

Could you be more specific which 32-bit x86 chips you have that are
affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
and AMD NexGen K6/K7 are also affected by Spectre but probably not
Meltdown, and most other 32-bit microarchitectures seem to be purely
in-order.

Arnd

2018-01-12 14:43:36

by Pavel Machek

Subject: Re: Linux 4.15-rc7

Hi!

> >> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?
> >
> > Good point. Is that still supported? Was it ever?
> >
> > Umm. I seem to recall that 4G/4G layout was out of tree but never
> > merged.
>
> I think that's correct: it was in RHEL3 and RHEL4 but never merged
> upstream.

Too bad.

> However, there is an important difference between KPTI and X86_4G:
> The former unmaps the kernel pages from the user space page tables,
> but keeps both the linear mapping and the user pages visible in
> kernel mode, while the latter must have also unmapped user space
> pages from kernel mode, requiring a more expensive get_user/put_user
> implementation.
>
> Kees mentioned an idea to also unmap user pages from kernel
> mode as an additional safeguard on top of KPTI, which would get
> it even closer to the X86_4G implementation:
> https://outflux.net/blog/archives/2018/01/04/smep-emulation-in-pti/

Well, I guess at this point I'm looking for a good place to start from...

> Could you be more specific which 32-bit x86 chips you have that are
> affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
> laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
> and AMD NexGen K6/K7 also affected by Spectre but probably not
> Meltdown, and most other 32-bit microarchitectures seem to be purely
> in-order.

I do have a Core Solo here I'd like to keep working (and useful for web
browsing). Then there's Pentium M. Occasionally I run 32-bit kernels on
modern machines for testing.

Thanks,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-12 17:20:59

by Vito Caputo

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 02:23:20PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 12, 2018 at 12:06 PM, Pavel Machek <[email protected]> wrote:
> > Hi!
> >
> >> Wasn't/Isn't the 4G/4G memory layout for 32 bits essentially KPTI?
> >
> > Good point. Is that still supported? Was it ever?
> >
> > Umm. I seem to recall that 4G/4G layout was out of tree but never
> > merged.
>
> I think that's correct: it was in RHEL3 and RHEL4 but never merged
> upstream.
>
> However, there is an important difference between KPTI and X86_4G:
> The former unmaps the kernel pages from the user space page tables,
> but keeps both the linear mapping and the user pages visible in
> kernel mode, while the latter must have also unmapped user space
> pages from kernel mode, requiring a more expensive get_user/put_user
> implementation.
>
> Kees mentioned an idea to also unmap user pages from kernel
> mode as an additional safeguard on top of KPTI, which would get
> it even closer to the X86_4G implementation:
> https://outflux.net/blog/archives/2018/01/04/smep-emulation-in-pti/
>
> Could you be more specific which 32-bit x86 chips you have that are
> affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
> laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
> and AMD NexGen K6/K7 also affected by Spectre but probably not
> Meltdown, and most other 32-bit microarchitectures seem to be purely
> in-order.
>

I have some Celeron D, 4GiB dedicated servers with a 32-bit stack.
They've proven to be very reliable boxes, and are the most affordable
baremetal x86 machines I've found. I'd appreciate a PTI implementation
on them.

Thanks,
Vito Caputo

2018-01-12 17:34:08

by Linus Torvalds

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 5:23 AM, Arnd Bergmann <[email protected]> wrote:
>
> However, there is an important difference between KPTI and X86_4G:
> The former unmaps the kernel pages from the user space page tables,
> but keeps both the linear mapping and the user pages visible in
> kernel mode, while the latter must have also unmapped user space
> pages from kernel mode, requiring a more expensive get_user/put_user
> implementation.

Indeed. And I think that the 4G:4G patches do things wrong.

People are already complaining about the PTI costs. Separating user
space entirely is much much worse, and makes all user accesses from
kernel space too painful for words.

Honestly, I didn't merge the old 4G:4G patches originally, and I'm not
going to merge them this time around either.

Linus

2018-01-12 19:38:18

by Pavel Machek

Subject: Re: Linux 4.15-rc7

On Fri 2018-01-12 09:34:03, Linus Torvalds wrote:
> On Fri, Jan 12, 2018 at 5:23 AM, Arnd Bergmann <[email protected]> wrote:
> >
> > However, there is an important difference between KPTI and X86_4G:
> > The former unmaps the kernel pages from the user space page tables,
> > but keeps both the linear mapping and the user pages visible in
> > kernel mode, while the latter must have also unmapped user space
> > pages from kernel mode, requiring a more expensive get_user/put_user
> > implementation.
>
> Indeed. And I think that the 4G:4G patches do things wrong.

Yeah. But if there's a copy around for something recent, I'd still like
to see it.

> People are already complaining about the PTI costs. Separating user
> space entirely is much much worse, and makes all user accesses from
> kernel space too painful for words.
>
> Honestly, I didn't merge the old 4G:4G patches originally, and I'm not
> going to merge them this time around either.

I'll try to do the right thing. OTOH... I don't like the fact that
kernel memory on my machine is currently readable, probably even from
javascript.

I tried disabling CPU caches. Just like that, off, boom. My system
will not survive that, and it looks like 100x slowdown. So 2x slowdown
would be an improvement (and 4G:4G can probably do better than that).

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-12 19:44:52

by Linus Torvalds

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 11:38 AM, Pavel Machek <[email protected]> wrote:
>
> I'll try to do the right thing. OTOH... I don't like the fact that
> kernel memory on my machine is currently readable, probably even from
> javascript.

Oh, absolutely. I'm just saying that it's probably best to try to
start from the x86-64 KPTI model, and see how that works for x86-32.

Maybe some of the 4G:4G entry code could come in handy as a "these are
the issues" kind of thing.

> I tried disabling CPU caches. Just like that, off, boom. My system
> will not survive that, and it looks like 100x slowdown.

Yeah, no. That is not a realistic thing to do on any hardware since
the PPro, I'm afraid.

Linus

2018-01-12 20:11:40

by Arnd Bergmann

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 6:20 PM, <[email protected]> wrote:
> On Fri, Jan 12, 2018 at 02:23:20PM +0100, Arnd Bergmann wrote:

>> Could you be more specific which 32-bit x86 chips you have that are
>> affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
>> laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
>> and AMD NexGen K6/K7 also affected by Spectre but probably not
>> Meltdown, and most other 32-bit microarchitectures seem to be purely
>> in-order.
>>
>
> I have some Celeron D, 4GiB dedicated servers with a 32-bit stack.
> They've proven to be very reliable boxes, and are the most affordable
> baremetal x86 machines I've found. I'd appreciate a PTI implementation
> on them.

That's an interesting setup for a number of reasons:

- Celeron D are mostly 64-bit CPUs, but it depends on the particular
model/stepping, so if you have a couple of them, you might be able to
avoid the meltdown bug by running a 64-bit kernel with KPTI at least on
some of them, or trivially replace the CPU on others. This usually
works without changing user space, and tends to result in a faster
system than running a 32-bit kernel as you avoid highmem.
- I haven't found a definite answer on whether Netburst-based CPUs
are affected by meltdown at all. Some people claim it's affected,
others say it's not. If the code from https://github.com/IAIK/meltdown
is successful on your Celeron D, then we know it's affected, if not,
then you could decide to not care about KPTI (Spectre would still
be an issue).
- A 32-bit system running with mostly highmem (only the low 768 MB
out of 4GB are directly mapped) means some of the exploits are
harder to do in practice, as most of the page cache is not visible
in the kernel, and reading data from other processes will fail more
often than succeed.
- Economically, it seems barely worth running these if you pay for
the electricity: the CPU costs a few dollars/euros, it only takes
a couple of weeks of continuous operation to exceed that in
operating cost. Replacing the mainboard with a modern low end
all-in-one board at 10W might pay off within a year. If you don't pay
for electricity, that obviously doesn't work.
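
For the first two points, a quick check along these lines might help decide which boxes can simply move to a 64-bit kernel. It is a rough Python sketch under a few assumptions that are not from this thread: it looks for the `lm` (long mode) flag in `/proc/cpuinfo`, and it reads `/sys/devices/system/cpu/vulnerabilities/meltdown`, which only exists on kernels that carry the vulnerability reporting, so a missing file says nothing either way. The function names are made up for the example.

```python
from pathlib import Path


def supports_64bit(cpuinfo_text):
    """True if a 'flags' line in /proc/cpuinfo-style text lists 'lm' (long mode)."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "lm" in value.split():
            return True
    return False


def meltdown_status(path="/sys/devices/system/cpu/vulnerabilities/meltdown"):
    """Return the kernel's own verdict as text, or None if the file is unavailable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        # Older kernels do not provide this file; that is not a clean bill of health.
        return None


if __name__ == "__main__":
    try:
        cpuinfo = Path("/proc/cpuinfo").read_text()
    except OSError:
        cpuinfo = ""
    print("CPU can run a 64-bit kernel:", supports_64bit(cpuinfo))
    print("Kernel-reported Meltdown status:", meltdown_status())
```

This only reports what the CPU flags and the running kernel say; whether a given Netburst part is really affected still has to be settled by running the proof of concept.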

Arnd

2018-01-12 20:41:11

by Pavel Machek

Subject: Re: Linux 4.15-rc7

On Fri 2018-01-12 11:44:48, Linus Torvalds wrote:
> On Fri, Jan 12, 2018 at 11:38 AM, Pavel Machek <[email protected]> wrote:
> >
> > I'll try to do the right thing. OTOH... I don't like the fact that
> > kernel memory on my machine is currently readable, probably even from
> > javascript.
>
> Oh, absolutely. I'm just saying that it's probably best to try to
> start from the x86-64 KPTI model, and see how that works for x86-32.
>
> Maybe some of the 4G:4G entry code could come in handy as a "these are
> the issues" kind of thing.

Ok, so I do have the diff that compiles, and it is 300 lines. Those
will be extremely tricky 300 lines, but...

> > I tried disabling CPU caches. Just like that, off, boom. My system
> > will not survive that, and it looks like 100x slowdown.
>
> Yeah, no. That is not a realistic thing to do on any hardware since
> the PPro, I'm afraid.

What is special about PPro?

Well -- cache off kind of is what I want -- kills Spectre _and_
Meltdown ;-), attacking close to the fundamental issue. And it really
should be doable on a UP system, right?

I guess I should re-try with a plain VGA console, not framebuffer.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-12 22:04:15

by Vito Caputo

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 09:11:38PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 12, 2018 at 6:20 PM, <[email protected]> wrote:
> > On Fri, Jan 12, 2018 at 02:23:20PM +0100, Arnd Bergmann wrote:
>
> >> Could you be more specific which 32-bit x86 chips you have that are
> >> affected by Meltdown? Do you mean pre-2004 Pentiums or Core-Duo
> >> laptops? I would guess that Cyrix/Natsemi/AMD 6x86/MediaGX/Geode
> >> and AMD NexGen K6/K7 also affected by Spectre but probably not
> >> Meltdown, and most other 32-bit microarchitectures seem to be purely
> >> in-order.
> >>
> >
> > I have some Celeron D, 4GiB dedicated servers with a 32-bit stack.
> > They've proven to be very reliable boxes, and are the most affordable
> > baremetal x86 machines I've found. I'd appreciate a PTI implementation
> > on them.
>
> That's an interesting setup for a number of reasons:
>
> - Celeron D are mostly 64-bit CPUs, but it depends on the particular
> model/stepping, so if you have a couple of them, you might be able to
> avoid the meltdown bug by running a 64-bit kernel with KPTI at least on
> some of them, or trivially replace the CPU on others. This usually
> works without changing user space, and tends to result in a faster
> system than running a 32-bit kernel as you avoid highmem.
>

This may be possible; I'll need to try booting an x86_64 kernel on one
and see. I would rather not change all of userspace.

> - I haven't found a definite answer on whether Netburst-based CPUs
> are affected by meltdown at all. Some people claim it's affected,
> others say it's not. If the code from https://github.com/IAIK/meltdown
> is successful on your Celeron D, then we know it's affected, if not,
> then you could decide to not care about KPTI (Spectre would still
> be an issue).
>

I tried that when the code was first made public, but libkdump doesn't
support 32-bit; it's full of 64-bit register use in the assembly bits.

> - A 32-bit system running with mostly highmem (only the low 768 MB
> out of 4GB are directly mapped) means some of the exploits are
> harder to do in practice, as most of the page cache is not visible
> in the kernel, and reading data from other processes will fail more
> often than succeed.
>

Well that's good news.

> - Economically, it seems barely worth running these if you pay for
> the electricity: the CPU costs a few dollars/euros, it only takes
> a couple of weeks of continuous operation to exceed that in
> operating cost. Replacing the mainboard with a modern low end
> all-in-one board at 10W might pay off within a year. If you don't pay
> for electricity, that obviously doesn't work.
>

I don't pay for the electricity, these are old dedicated servers hosted
by a third party. Not my hardware, and any more modern dedicated x86
servers I've found are substantially more expensive and always SMP.

This particular hosting provider has tried selling me upgrades to their
current low-end offering (which is still SMP), but the price basically
doubles. These boxes are mostly idle, performing just personal email
and ssh duties. For this situation reliability and security are the
priorities; power efficiency and performance are not.

Thanks,
Vito Caputo

2018-01-12 22:09:01

by Arnd Bergmann

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 11:04 PM, <[email protected]> wrote:
> On Fri, Jan 12, 2018 at 09:11:38PM +0100, Arnd Bergmann wrote:
>
>> - I haven't found a definite answer on whether Netburst-based CPUs
>> are affected by meltdown at all. Some people claim it's affected,
>> others say it's not. If the code from https://github.com/IAIK/meltdown
>> is successful on your Celeron D, then we know it's affected, if not,
>> then you could decide to not care about KPTI (Spectre would still
>> be an issue).
>>
>
> I tried that when the code was first made public, but libkdump doesn't
> support 32-bit; it's full of 64-bit register use in the assembly bits.

Apparently 32-bit support was added on Wednesday, maybe you
can try again with today's version.

Arnd

2018-01-12 22:59:04

by Vito Caputo

Subject: Re: Linux 4.15-rc7

On Fri, Jan 12, 2018 at 11:08:58PM +0100, Arnd Bergmann wrote:
> On Fri, Jan 12, 2018 at 11:04 PM, <[email protected]> wrote:
> > On Fri, Jan 12, 2018 at 09:11:38PM +0100, Arnd Bergmann wrote:
> >
> >> - I haven't found a definite answer on whether Netburst-based CPUs
> >> are affected by meltdown at all. Some people claim it's affected,
> >> others say it's not. If the code from https://github.com/IAIK/meltdown
> >> is successful on your Celeron D, then we know it's affected, if not,
> >> then you could decide to not care about KPTI (Spectre would still
> >> be an issue).
> >>
> >
> > I tried that when the code was first made public, but libkdump doesn't
> > support 32-bit; it's full of 64-bit register use in the assembly bits.
>
> Apparently 32-bit support was added on Wednesday, maybe you
> can try again with today's version.
>

Thanks for informing me of this, I hadn't noticed.

I just tried it out, and confirmed the Celeron D is vulnerable to
meltdown.

Regards,
Vito Caputo

2018-01-13 12:52:13

by Pavel Machek

Subject: kernel page table isolation for x86-32 was Re: Linux 4.15-rc7

Hi!

> > I'll try to do the right thing. OTOH... I don't like the fact that
> > kernel memory on my machine is currently readable, probably even from
> > javascript.
>
> Oh, absolutely. I'm just saying that it's probably best to try to
> start from the x86-64 KPTI model, and see how that works for x86-32.

Ok, it should not be too bad. Here's something... getting it to
compile should be easy, getting it to work might be trickier. Not sure
what needs to be done for the LDT.

Pavel

diff --git a/Documentation/x86/pti.txt b/Documentation/x86/pti.txt
index d11eff6..e13e1e5 100644
--- a/Documentation/x86/pti.txt
+++ b/Documentation/x86/pti.txt
@@ -124,7 +124,7 @@ Possible Future Work
boot-time switching.

Testing
-========
+=======

To test stability of PTI, the following test procedure is recommended,
ideally doing all of these in parallel:
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 45a63e0..b0485cc 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -332,6 +332,99 @@ For 32-bit we have the following conventions - kernel is built with

#endif

+#else
+
+/*
+ * x86-32 kernel page table isolation.
+ */
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+
+/*
+ * PAGE_TABLE_ISOLATION PGDs are 8k. Flip bit 12 to switch between the two
+ * halves:
+ */
+#define PTI_SWITCH_PGTABLES_MASK (1<<PAGE_SHIFT)
+#define PTI_SWITCH_MASK (PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT))
+
+.macro ADJUST_KERNEL_CR3 reg:req
+ /* Clear PCID and "PAGE_TABLE_ISOLATION bit", point CR3 at kernel pagetables: */
+ andl $(~PTI_SWITCH_MASK), \reg
+.endm
+
+.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
+ movl %cr3, \scratch_reg
+ ADJUST_KERNEL_CR3 \scratch_reg
+ movl \scratch_reg, %cr3
+.Lend_\@:
+.endm
+
+.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
+ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
+ mov %cr3, \scratch_reg
+
+.Lwrcr3_\@:
+ /* Flip the PGD and ASID to the user version */
+ orl $(PTI_SWITCH_MASK), \scratch_reg
+ mov \scratch_reg, %cr3
+.Lend_\@:
+.endm
+
+.macro SWITCH_TO_USER_CR3_STACK scratch_reg:req
+ pushl %eax
+ SWITCH_TO_USER_CR3_NOSTACK scratch_reg=\scratch_reg scratch_reg2=%eax
+ popl %eax
+.endm
+
+.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+ ALTERNATIVE "jmp .Ldone_\@", "", X86_FEATURE_PTI
+ movl %cr3, \scratch_reg
+ movl \scratch_reg, \save_reg
+ /*
+ * Is the "switch mask" all zero? That means that both of
+ * these are zero:
+ *
+ * 1. The user/kernel PCID bit, and
+ * 2. The user/kernel "bit" that points CR3 to the
+ * bottom half of the 8k PGD
+ *
+ * That indicates a kernel CR3 value, not a user CR3.
+ */
+ testl $(PTI_SWITCH_MASK), \scratch_reg
+ jz .Ldone_\@
+
+ ADJUST_KERNEL_CR3 \scratch_reg
+ movl \scratch_reg, %cr3
+
+.Ldone_\@:
+.endm
+
+.macro RESTORE_CR3 scratch_reg:req save_reg:req
+ ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
+
+ /*
+ * The CR3 write could be avoided when not changing its value,
+ * but would require a CR3 read *and* a scratch register.
+ */
+ movl \save_reg, %cr3
+.Lend_\@:
+.endm
+
+#else /* CONFIG_PAGE_TABLE_ISOLATION=n: */
+
+.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+.endm
+.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
+.endm
+.macro SWITCH_TO_USER_CR3_STACK scratch_reg:req
+.endm
+.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+.endm
+.macro RESTORE_CR3 scratch_reg:req save_reg:req
+.endm
+
+#endif
+
#endif /* CONFIG_X86_64 */

/*
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index d2ef7f32..be8759b 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -46,6 +46,8 @@
#include <asm/frame.h>
#include <asm/nospec-branch.h>

+#include "calling.h"
+
.section .entry.text, "ax"

/*
@@ -428,6 +430,7 @@ ENTRY(entry_SYSENTER_32)
pushl $0 /* pt_regs->ip = 0 (placeholder) */
pushl %eax /* pt_regs->orig_ax */
SAVE_ALL pt_regs_ax=$-ENOSYS /* save rest */
+ SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%edx save_reg=%ecx

/*
* SYSENTER doesn't filter flags, so we need to clear NT, AC
@@ -464,6 +467,7 @@ ENTRY(entry_SYSENTER_32)
ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
"jmp .Lsyscall_32_done", X86_FEATURE_XENPV

+ RESTORE_CR3 scratch_reg=%edx save_reg=%ecx
/* Opportunistic SYSEXIT */
TRACE_IRQS_ON /* User mode traces as IRQs on. */
movl PT_EIP(%esp), %edx /* pt_regs->ip */
@@ -539,6 +543,7 @@ ENTRY(entry_INT80_32)
ASM_CLAC
pushl %eax /* pt_regs->orig_ax */
SAVE_ALL pt_regs_ax=$-ENOSYS /* save rest */
+ SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%edx save_reg=%ecx

/*
* User mode is traced as though IRQs are on, and the interrupt gate
@@ -552,6 +557,7 @@ ENTRY(entry_INT80_32)

restore_all:
TRACE_IRQS_IRET
+ RESTORE_CR3 scratch_reg=%eax save_reg=%ecx
.Lrestore_all_notrace:
#ifdef CONFIG_X86_ESPFIX32
ALTERNATIVE "jmp .Lrestore_nocheck", "", X86_BUG_ESPFIX
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index e51b65a..a87fb89 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -4,7 +4,7 @@
*
* Copyright (C) 1991, 1992 Linus Torvalds
* Copyright (C) 2000, 2001, 2002 Andi Kleen SuSE Labs
- * Copyright (C) 2000 Pavel Machek <[email protected]>
+ * Copyright (C) 2000 Pavel Machek SuSE Labs
*
* entry.S contains the system-call and fault low-level handling routines.
*
diff --git a/arch/x86/include/asm/pgtable_32.h b/arch/x86/include/asm/pgtable_32.h
index e67c062..1b36e56 100644
--- a/arch/x86/include/asm/pgtable_32.h
+++ b/arch/x86/include/asm/pgtable_32.h
@@ -107,4 +107,78 @@ do { \
*/
#define LOWMEM_PAGES ((((2<<31) - __PAGE_OFFSET) >> PAGE_SHIFT))

+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+/*
+ * All top-level PAGE_TABLE_ISOLATION page tables are order-1 pages
+ * (8k-aligned and 8k in size). The kernel one is at the beginning 4k and
+ * the user one is in the last 4k. To switch between them, you
+ * just need to flip the 12th bit in their addresses.
+ */
+#define PTI_PGTABLE_SWITCH_BIT PAGE_SHIFT
+
+#ifndef __ASSEMBLY__
+/*
+ * This generates better code than the inline assembly in
+ * __set_bit().
+ */
+static inline void *ptr_set_bit(void *ptr, int bit)
+{
+ unsigned long __ptr = (unsigned long)ptr;
+
+ __ptr |= BIT(bit);
+ return (void *)__ptr;
+}
+static inline void *ptr_clear_bit(void *ptr, int bit)
+{
+ unsigned long __ptr = (unsigned long)ptr;
+
+ __ptr &= ~BIT(bit);
+ return (void *)__ptr;
+}
+
+static inline pgd_t *kernel_to_user_pgdp(pgd_t *pgdp)
+{
+ return ptr_set_bit(pgdp, PTI_PGTABLE_SWITCH_BIT);
+}
+
+static inline pgd_t *user_to_kernel_pgdp(pgd_t *pgdp)
+{
+ return ptr_clear_bit(pgdp, PTI_PGTABLE_SWITCH_BIT);
+}
+
+static inline p4d_t *kernel_to_user_p4dp(p4d_t *p4dp)
+{
+ return ptr_set_bit(p4dp, PTI_PGTABLE_SWITCH_BIT);
+}
+
+static inline p4d_t *user_to_kernel_p4dp(p4d_t *p4dp)
+{
+ return ptr_clear_bit(p4dp, PTI_PGTABLE_SWITCH_BIT);
+}
+#endif
+#endif /* CONFIG_PAGE_TABLE_ISOLATION */
+
+#ifndef __ASSEMBLY__
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd);
+
+/*
+ * Take a PGD location (pgdp) and a pgd value that needs to be set there.
+ * Populates the user and returns the resulting PGD that must be set in
+ * the kernel copy of the page tables.
+ */
+static inline pgd_t pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+ if (!static_cpu_has(X86_FEATURE_PTI))
+ return pgd;
+ return __pti_set_user_pgd(pgdp, pgd);
+}
+#else
+static inline pgd_t pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+ return pgd;
+}
+#endif
+#endif
+
#endif /* _ASM_X86_PGTABLE_32_H */
diff --git a/arch/x86/include/asm/pgtable_32_types.h b/arch/x86/include/asm/pgtable_32_types.h
index ce245b0..804fc33 100644
--- a/arch/x86/include/asm/pgtable_32_types.h
+++ b/arch/x86/include/asm/pgtable_32_types.h
@@ -62,4 +62,6 @@ extern bool __vmalloc_start_set; /* set once high_memory is set */

#define MAXMEM (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE)

+#define LDT_BASE_ADDR 0 /* FIXME */
+
#endif /* _ASM_X86_PGTABLE_32_DEFS_H */
diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index 6a60fea..8f1cf71 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -39,10 +39,6 @@
#define CR3_PCID_MASK 0xFFFull
#define CR3_NOFLUSH BIT_ULL(63)

-#ifdef CONFIG_PAGE_TABLE_ISOLATION
-# define X86_CR3_PTI_SWITCH_BIT 11
-#endif
-
#else
/*
* CR3_ADDR_MASK needs at least bits 31:5 set on PAE systems, and we save
@@ -53,4 +49,8 @@
#define CR3_NOFLUSH 0
#endif

+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+# define X86_CR3_PTI_SWITCH_BIT 11
+#endif
+
#endif /* _ASM_X86_PROCESSOR_FLAGS_H */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index dbaf14d..849e073 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1,5 +1,7 @@
#define pr_fmt(fmt) "SMP alternatives: " fmt

+#define DEBUG
+
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/mutex.h>
@@ -28,7 +30,7 @@ EXPORT_SYMBOL_GPL(alternatives_patched);

#define MAX_PATCH_LEN (255-1)

-static int __initdata_or_module debug_alternative;
+static int __initdata_or_module debug_alternative = 1;

static int __init debug_alt(char *str)
{
@@ -60,7 +62,7 @@ __setup("noreplace-paravirt", setup_noreplace_paravirt);
#define DPRINTK(fmt, args...) \
do { \
if (debug_alternative) \
- printk(KERN_DEBUG "%s: " fmt "\n", __func__, ##args); \
+ printk( "%s: " fmt "\n", __func__, ##args); \
} while (0)

#define DUMP_BYTES(buf, len, fmt, args...) \
@@ -71,7 +73,7 @@ do { \
if (!(len)) \
break; \
\
- printk(KERN_DEBUG fmt, ##args); \
+ printk( fmt, ##args); \
for (j = 0; j < (len) - 1; j++) \
printk(KERN_CONT "%02hhx ", buf[j]); \
printk(KERN_CONT "%02hhx\n", buf[j]); \
@@ -373,6 +375,8 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
u8 *instr, *replacement;
u8 insnbuf[MAX_PATCH_LEN];

+ printk("apply_alternatives: entering\n");
+
DPRINTK("alt table %p -> %p", start, end);
/*
* The scan order should be from start to end. A later scanned
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index c290209..002ffaf 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -505,6 +505,31 @@ __INITDATA
GLOBAL(early_recursion_flag)
.long 0

+#define NEXT_PAGE(name) \
+ .balign PAGE_SIZE; \
+GLOBAL(name)
+
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+/*
+ * Each PGD needs to be 8k long and 8k aligned. We do not
+ * ever go out to userspace with these, so we do not
+ * strictly *need* the second page, but this allows us to
+ * have a single set_pgd() implementation that does not
+ * need to worry about whether it has 4k or 8k to work
+ * with.
+ *
+ * This ensures PGDs are 8k long:
+ */
+#define PTI_USER_PGD_FILL 1024
+/* This ensures they are 8k-aligned: */
+#define NEXT_PGD_PAGE(name) \
+ .balign 2 * PAGE_SIZE; \
+GLOBAL(name)
+#else
+#define NEXT_PGD_PAGE(name) NEXT_PAGE(name)
+#define PTI_USER_PGD_FILL 0
+#endif
+
__REFDATA
.align 4
ENTRY(initial_code)
@@ -516,24 +541,26 @@ ENTRY(setup_once_ref)
* BSS section
*/
__PAGE_ALIGNED_BSS
- .align PAGE_SIZE
#ifdef CONFIG_X86_PAE
-.globl initial_pg_pmd
+NEXT_PGD_PAGE(initial_pg_pmd)
initial_pg_pmd:
.fill 1024*KPMDS,4,0
+ .fill PTI_USER_PGD_FILL,4,0
#else
-.globl initial_page_table
+NEXT_PGD_PAGE(initial_page_table)
initial_page_table:
.fill 1024,4,0
+ .fill PTI_USER_PGD_FILL,4,0
#endif
initial_pg_fixmap:
.fill 1024,4,0
.globl empty_zero_page
empty_zero_page:
.fill 4096,1,0
-.globl swapper_pg_dir
+NEXT_PGD_PAGE(swapper_pg_dir)
swapper_pg_dir:
.fill 1024,4,0
+ .fill PTI_USER_PGD_FILL,4,0
EXPORT_SYMBOL(empty_zero_page)

/*
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 04a625f..57f5cd4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -3,7 +3,7 @@
* linux/arch/x86/kernel/head_64.S -- start in 32bit and switch to 64bit
*
* Copyright (C) 2000 Andrea Arcangeli <[email protected]> SuSE
- * Copyright (C) 2000 Pavel Machek <[email protected]>
+ * Copyright (C) 2000 Pavel Machek
* Copyright (C) 2000 Karsten Keil <[email protected]>
* Copyright (C) 2001,2002 Andi Kleen <[email protected]>
* Copyright (C) 2005 Eric Biederman <[email protected]>
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 2a4849e..896b53b 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -543,7 +543,11 @@ EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs);
static void ptdump_walk_user_pgd_level_checkwx(void)
{
#ifdef CONFIG_PAGE_TABLE_ISOLATION
+#ifdef CONFIG_X86_64
pgd_t *pgd = (pgd_t *) &init_top_pgt;
+#else
+ pgd_t *pgd = swapper_pg_dir;
+#endif

if (!static_cpu_has(X86_FEATURE_PTI))
return;
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index ce38f16..029b5c8 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -113,8 +113,11 @@ pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
* Top-level entries added to init_mm's usermode pgd after boot
* will not be automatically propagated to other mms.
*/
+#ifdef CONFIG_X86_64
+ /* FIXME? */
if (!pgdp_maps_userspace(pgdp))
return pgd;
+#endif

/*
* The user page tables get the full PGD, accessible from
@@ -166,7 +169,9 @@ static __init p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)

set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page)));
}
+#ifdef CONFIG_X86_64
BUILD_BUG_ON(pgd_large(*pgd) != 0);
+#endif

return p4d_offset(pgd, address);
}
diff --git a/security/Kconfig b/security/Kconfig
index 1f96e19..ad77de4 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -57,7 +57,7 @@ config SECURITY_NETWORK
config PAGE_TABLE_ISOLATION
bool "Remove the kernel mapping in user mode"
default y
- depends on X86_64 && !UML
+ depends on (X86_32 || X86_64) && !UML
help
This feature reduces the number of hardware side channels by
ensuring that the majority of kernel addresses are not mapped


--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2018-01-19 10:30:59

by Pavel Machek

Subject: Re: Linux 4.15-rc7

On Thu 2018-01-11 15:07:22, Jiri Kosina wrote:
> On Thu, 11 Jan 2018, Pavel Machek wrote:
>
> > Is anyone working on KPTI for x86-32? SLES11 should still be supported,
> > and that should have x86-32 version; any chance SUSE can share some
> > patches?
>
> We are sharing sources of all our kernels at
>
> http://kernel.suse.com/
>
> If you can find the x86-32 support there, it's yours (hint: you won't).
>
> Otherwise, you'd either have to wait until we (or someone else) implements
> it (it's on our list), or implement it yourself.

Hmm. Seems Joerg Roedel from SUSE sent an implementation after all. And
it should boot; mine does not yet. Let me do some testing...

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

