2022-12-19 00:23:47

by Guilherme G. Piccoli

[permalink] [raw]
Subject: [PATCH 6.0.y / 6.1.y] x86/split_lock: Add sysctl to control the misery mode

commit 727209376f4998bc84db1d5d8af15afea846a92b upstream.

Commit b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
changed the way the split lock detector works when in "warn" mode;
basically, it not only shows the warn message, but also intentionally
introduces a slowdown through sleeping plus serialization mechanism
on such task. Based on discussions in [0], seems the warning alone
wasn't enough motivation for userspace developers to fix their
applications.

This slowdown is enough to totally break some proprietary (aka.
unfixable) userspace[1].

Happens that originally the proposal in [0] was to add a new mode
which would warns + slowdown the "split locking" task, keeping the
old warn mode untouched. In the end, that idea was discarded and
the regular/default "warn" mode now slows down the applications. This
is quite aggressive with regards proprietary/legacy programs that
basically are unable to properly run in kernel with this change.
While it is understandable that a malicious application could DoS
by split locking, it seems unacceptable to regress old/proprietary
userspace programs through a default configuration that previously
worked. An example of such breakage was reported in [1].

Add a sysctl to allow controlling the "misery mode" behavior, as per
Thomas suggestion on [2]. This way, users running legacy and/or
proprietary software are allowed to still execute them with a decent
performance while still observing the warning messages on kernel log.

[0] https://lore.kernel.org/lkml/[email protected]/
[1] https://github.com/doitsujin/dxvk/issues/2938
[2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/

[ dhansen: minor changelog tweaks, including clarifying the actual
problem ]

Fixes: b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Tested-by: Andre Almeida <[email protected]>
Link: https://lore.kernel.org/all/20221024200254.635256-1-gpiccoli%40igalia.com
---


Hi folks, I've build tested this on both 6.0.13 and 6.1, worked fine. The
split lock detector code changed almost nothing since 6.0, so that makes
sense...

I think this is important to have in stable, some gaming community members
seems excited with that, it'll help with general proprietary software
(that is basically unfixable), making them run smoothly on 6.0.y and 6.1.y.

I've CCed some folks more than just the stable list, to gather more
opinions on that, so apologies if you received this email but think
that you shouldn't have.

Thanks in advance,

Guilherme


Documentation/admin-guide/sysctl/kernel.rst | 23 ++++++++
arch/x86/kernel/cpu/intel.c | 63 +++++++++++++++++----
2 files changed, 76 insertions(+), 10 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 98d1b198b2b4..c2c64c1b706f 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -1314,6 +1314,29 @@ watchdog work to be queued by the watchdog timer function, otherwise the NMI
watchdog — if enabled — can detect a hard lockup condition.


+split_lock_mitigate (x86 only)
+==============================
+
+On x86, each "split lock" imposes a system-wide performance penalty. On larger
+systems, large numbers of split locks from unprivileged users can result in
+denials of service to well-behaved and potentially more important users.
+
+The kernel mitigates these bad users by detecting split locks and imposing
+penalties: forcing them to wait and only allowing one core to execute split
+locks at a time.
+
+These mitigations can make those bad applications unbearably slow. Setting
+split_lock_mitigate=0 may restore some application performance, but will also
+increase system exposure to denial of service attacks from split lock users.
+
+= ===================================================================
+0 Disable the mitigation mode - just warns the split lock on kernel log
+ and exposes the system to denials of service from the split lockers.
+1 Enable the mitigation mode (this is the default) - penalizes the split
+ lockers with intentional performance degradation.
+= ===================================================================
+
+
stack_erasing
=============

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 2d7ea5480ec3..427899650483 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1034,8 +1034,32 @@ static const struct {

static struct ratelimit_state bld_ratelimit;

+static unsigned int sysctl_sld_mitigate = 1;
static DEFINE_SEMAPHORE(buslock_sem);

+#ifdef CONFIG_PROC_SYSCTL
+static struct ctl_table sld_sysctls[] = {
+ {
+ .procname = "split_lock_mitigate",
+ .data = &sysctl_sld_mitigate,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_douintvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {}
+};
+
+static int __init sld_mitigate_sysctl_init(void)
+{
+ register_sysctl_init("kernel", sld_sysctls);
+ return 0;
+}
+
+late_initcall(sld_mitigate_sysctl_init);
+#endif
+
static inline bool match_option(const char *arg, int arglen, const char *opt)
{
int len = strlen(opt), ratelimit;
@@ -1146,12 +1170,20 @@ static void split_lock_init(void)
split_lock_verify_msr(sld_state != sld_off);
}

-static void __split_lock_reenable(struct work_struct *work)
+static void __split_lock_reenable_unlock(struct work_struct *work)
{
sld_update_msr(true);
up(&buslock_sem);
}

+static DECLARE_DELAYED_WORK(sl_reenable_unlock, __split_lock_reenable_unlock);
+
+static void __split_lock_reenable(struct work_struct *work)
+{
+ sld_update_msr(true);
+}
+static DECLARE_DELAYED_WORK(sl_reenable, __split_lock_reenable);
+
/*
* If a CPU goes offline with pending delayed work to re-enable split lock
* detection then the delayed work will be executed on some other CPU. That
@@ -1169,10 +1201,9 @@ static int splitlock_cpu_offline(unsigned int cpu)
return 0;
}

-static DECLARE_DELAYED_WORK(split_lock_reenable, __split_lock_reenable);
-
static void split_lock_warn(unsigned long ip)
{
+ struct delayed_work *work;
int cpu;

if (!current->reported_split_lock)
@@ -1180,14 +1211,26 @@ static void split_lock_warn(unsigned long ip)
current->comm, current->pid, ip);
current->reported_split_lock = 1;

- /* misery factor #1, sleep 10ms before trying to execute split lock */
- if (msleep_interruptible(10) > 0)
- return;
- /* Misery factor #2, only allow one buslocked disabled core at a time */
- if (down_interruptible(&buslock_sem) == -EINTR)
- return;
+ if (sysctl_sld_mitigate) {
+ /*
+ * misery factor #1:
+ * sleep 10ms before trying to execute split lock.
+ */
+ if (msleep_interruptible(10) > 0)
+ return;
+ /*
+ * Misery factor #2:
+ * only allow one buslocked disabled core at a time.
+ */
+ if (down_interruptible(&buslock_sem) == -EINTR)
+ return;
+ work = &sl_reenable_unlock;
+ } else {
+ work = &sl_reenable;
+ }
+
cpu = get_cpu();
- schedule_delayed_work_on(cpu, &split_lock_reenable, 2);
+ schedule_delayed_work_on(cpu, work, 2);

/* Disable split lock detection on this CPU to make progress */
sld_update_msr(false);
--
2.38.1


2022-12-19 10:32:20

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH 6.0.y / 6.1.y] x86/split_lock: Add sysctl to control the misery mode

On Sun, Dec 18, 2022 at 08:44:00PM -0300, Guilherme G. Piccoli wrote:
> commit 727209376f4998bc84db1d5d8af15afea846a92b upstream.
>
> Commit b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
> changed the way the split lock detector works when in "warn" mode;
> basically, it not only shows the warn message, but also intentionally
> introduces a slowdown through sleeping plus serialization mechanism
> on such task. Based on discussions in [0], seems the warning alone
> wasn't enough motivation for userspace developers to fix their
> applications.
>
> This slowdown is enough to totally break some proprietary (aka.
> unfixable) userspace[1].
>
> Happens that originally the proposal in [0] was to add a new mode
> which would warns + slowdown the "split locking" task, keeping the
> old warn mode untouched. In the end, that idea was discarded and
> the regular/default "warn" mode now slows down the applications. This
> is quite aggressive with regards proprietary/legacy programs that
> basically are unable to properly run in kernel with this change.
> While it is understandable that a malicious application could DoS
> by split locking, it seems unacceptable to regress old/proprietary
> userspace programs through a default configuration that previously
> worked. An example of such breakage was reported in [1].
>
> Add a sysctl to allow controlling the "misery mode" behavior, as per
> Thomas suggestion on [2]. This way, users running legacy and/or
> proprietary software are allowed to still execute them with a decent
> performance while still observing the warning messages on kernel log.
>
> [0] https://lore.kernel.org/lkml/[email protected]/
> [1] https://github.com/doitsujin/dxvk/issues/2938
> [2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/
>
> [ dhansen: minor changelog tweaks, including clarifying the actual
> problem ]
>
> Fixes: b041b525dab9 ("x86/split_lock: Make life miserable for split lockers")
> Suggested-by: Thomas Gleixner <[email protected]>
> Signed-off-by: Guilherme G. Piccoli <[email protected]>
> Signed-off-by: Dave Hansen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> Tested-by: Andre Almeida <[email protected]>
> Link: https://lore.kernel.org/all/20221024200254.635256-1-gpiccoli%40igalia.com
> ---
>
>
> Hi folks, I've build tested this on both 6.0.13 and 6.1, worked fine. The
> split lock detector code changed almost nothing since 6.0, so that makes
> sense...
>
> I think this is important to have in stable, some gaming community members
> seems excited with that, it'll help with general proprietary software
> (that is basically unfixable), making them run smoothly on 6.0.y and 6.1.y.

What specific programs have this problem and what are the exact results
of it?

Also, this is really a new feature and not really a "fix", but one could
argue a lot that this is a "resolve a performance problem" if you want
to and have the numbers to back it up {hint}

thanks,

greg k-h

2022-12-19 12:00:17

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 6.0.y / 6.1.y] x86/split_lock: Add sysctl to control the misery mode

On Mon, Dec 19, 2022 at 11:17:39AM +0100, Greg KH wrote:

> What specific programs have this problem and what are the exact results
> of it?

IIRC it was God of War (2018) that triggered this initially. But it's
possible more games were found to tickle this specific thing. Since it's
binary only gunk that is unlikely to ever get fixed we need something to
allow for it.

(slow motion Kratos yelling B...o...y...)

> Also, this is really a new feature and not really a "fix", but one could
> argue a lot that this is a "resolve a performance problem" if you want
> to and have the numbers to back it up {hint}

Right, there were some, they should indeed have been included.

2022-12-19 13:01:01

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH 6.0.y / 6.1.y] x86/split_lock: Add sysctl to control the misery mode

On 19/12/2022 07:59, Peter Zijlstra wrote:
> On Mon, Dec 19, 2022 at 11:17:39AM +0100, Greg KH wrote:
>
>> What specific programs have this problem and what are the exact results
>> of it?
>
> IIRC it was God of War (2018) that triggered this initially. But it's
> possible more games were found to tickle this specific thing. Since it's
> binary only gunk that is unlikely to ever get fixed we need something to
> allow for it.
>
> (slow motion Kratos yelling B...o...y...)
>
>> Also, this is really a new feature and not really a "fix", but one could
>> argue a lot that this is a "resolve a performance problem" if you want
>> to and have the numbers to back it up {hint}
>
> Right, there were some, they should indeed have been included.

Thanks Peter, that's exactly it - the current report is linked on commit
message.

About performance numbers, the only "numbers" I have is: game is
unplayable, according to the report "When I launch God of War through
Steam or Lutris I get around 25 fps, on lowest settings and at 10%
resolution scaling", FPS for for games is double of that usually.

I understand this is not a regular fix in which something is completely
broken, but it does fix a behavior introduced on kernel that prevent
some userspace binaries to properly run, in practice. Ofc some will
argue that we already have the kernel parameter, but it's different -
requires reboot and bootloader understanding.

If 6.0.y is too much, I'd ask we have it at least for 6.1, which is
long-term, that will help a lot of people for sure.
Thanks,


Guilherme

2023-03-29 17:15:44

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 6.0.y / 6.1.y] x86/split_lock: Add sysctl to control the misery mode

+Stas

On Mon, Dec 19, 2022, Guilherme G. Piccoli wrote:
> On 19/12/2022 07:59, Peter Zijlstra wrote:
> > On Mon, Dec 19, 2022 at 11:17:39AM +0100, Greg KH wrote:
> >
> >> What specific programs have this problem and what are the exact results
> >> of it?
> >
> > IIRC it was God of War (2018) that triggered this initially. But it's
> > possible more games were found to tickle this specific thing. Since it's
> > binary only gunk that is unlikely to ever get fixed we need something to
> > allow for it.
> >
> > (slow motion Kratos yelling B...o...y...)
> >
> >> Also, this is really a new feature and not really a "fix", but one could
> >> argue a lot that this is a "resolve a performance problem" if you want
> >> to and have the numbers to back it up {hint}
> >
> > Right, there were some, they should indeed have been included.
>
> Thanks Peter, that's exactly it - the current report is linked on commit
> message.
>
> About performance numbers, the only "numbers" I have is: game is
> unplayable, according to the report "When I launch God of War through
> Steam or Lutris I get around 25 fps, on lowest settings and at 10%
> resolution scaling", FPS for for games is double of that usually.
>
> I understand this is not a regular fix in which something is completely
> broken, but it does fix a behavior introduced on kernel that prevent
> some userspace binaries to properly run, in practice. Ofc some will
> argue that we already have the kernel parameter, but it's different -
> requires reboot and bootloader understanding.
>
> If 6.0.y is too much, I'd ask we have it at least for 6.1, which is
> long-term, that will help a lot of people for sure.
> Thanks,

Resurrecting this thread with a concrete use case.

dosemu2, which uses KVM to accelerate DOS emulation (stating the obvious), ran
into a problem where the hardcoded (prior to this patch) behavior will effectively
hang userspace if the 10ms sleep is interrupted, e.g. by a periodic SIGALRM[*].

Alternatively, we could try to figure out a way to ensure forward progress without
letting userpace run an end-around on the enforced misery, but backporting this
patch to stable kernels seems easier.

Stas, do you happen to know what the oldest stable kernel your users use, i.e.
how far back this backport would need to go?

[*] https://github.com/dosemu2/dosemu2/issues/1968#issuecomment-1486709361

2023-03-29 18:00:37

by stsp

[permalink] [raw]
Subject: Re: [PATCH 6.0.y / 6.1.y] x86/split_lock: Add sysctl to control the misery mode

Hi Sean, thanks for a head-up!

29.03.2023 22:14, Sean Christopherson пишет:
> Resurrecting this thread with a concrete use case.
> dosemu2, which uses KVM to accelerate DOS emulation (stating the obvious), ran
> into a problem where the hardcoded (prior to this patch) behavior will effectively
> hang userspace if the 10ms sleep is interrupted, e.g. by a periodic SIGALRM[*].

Yes, and we also created a ready-to-use
host test-case with no dosemu2 involved.

> Alternatively, we could try to figure out a way to ensure forward progress without
> letting userpace run an end-around on the enforced misery, but backporting this
> patch to stable kernels seems easier.
>
> Stas, do you happen to know what the oldest stable kernel your users use, i.e.
> how far back this backport would need to go?
It seems, the "broken" code was added by
the patch b041b525dab95 which is "described"
as v5.18-rc4-1-gb041b525dab9.
So I guess the answer would be "down to 5.18".

Of course I don't believe the patch you
mention, is a real solution. Its good for
-stable, but something else needs to be
developed in the future.