From: ebiederm@xmission.com (Eric W. Biederman)
To: Willy Tarreau
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, tglx@linutronix.de, gnomes@lxorguk.ukuu.org.uk, torvalds@linux-foundation.org
Subject: Re: [PATCH RFC 0/4] Per-task PTI activation
Date: Tue, 09 Jan 2018 15:07:07 -0600
Message-ID: <87d12iivwk.fsf@xmission.com>
In-Reply-To: <20180109160215.GA13065@1wt.eu> (Willy Tarreau's message of "Tue, 9 Jan 2018 17:02:15 +0100")
References: <1515427939-10999-1-git-send-email-w@1wt.eu> <87a7xnkq0g.fsf@xmission.com> <20180109160215.GA13065@1wt.eu>

Willy Tarreau writes:

> Hi Eric,
>
> On Tue, Jan 09, 2018 at 09:31:27AM -0600, Eric W. Biederman wrote:
>> The dangerous scenario is someone exploiting a buffer overflow, or
>> otherwise getting a network facing application to misbehave, and then
>> using these new attacks to assist in gaining privilege escalation.
>
> For most use cases sure. But for *some* use cases, if they can take
> control of the application, you've already lost everything you had.
> Private keys, clear text traffic, etc. We're precisely talking about
> such applications where userspace is as important as the kernel, and
> where there's hardly anything left to lose once the application is
> cracked. However, a significant performance drop on the application
> definitely is a problem, first making it weaker when facing attacks,
> or even failing to deal with traffic peaks.

From reading the earlier emails it was not clear that all was lost if
they were compromised. In that case this makes plenty of sense.

>> Googling seems to indicate that there is about one issue a year found in
>> haproxy. So this is not an unrealistic concern for the case you
>> mention.
>
> I agree. But in practice, we had two exploitable bugs, one in 2002
> (overflow in the logs), and one in 2014 requiring a purposely written
> config which makes no practical sense at all. Most other vulnerabilities
> involve freezes, occasionally crashes, though that's even more rare.
> And even with the two above, you just have one chance to try to exploit
> it, if you get your pointer wrong, it dies and you have to wait for the
> admin to restart it. In practice, seeing the process die is the worst
> nightmare of admins as the service simply stops. I'm not saying we don't
> want to defend them, we even chroot to an empty directory and drop
> privileges to mitigate such a risk. But when the intruder is in the
> process it's really too late.
>
>> So unless I am seeing things wrong this is a patchset designed to drop
>> your defenses on the most vulnerable applications.
>
> In fact it can be seen very differently. By making it possible for exposed
> but critical applications to share some risks with the rest of the system,
> we also ensure they remain strong for their initial purpose and against
> the most common types of attacks. And quite frankly we're not weakening
> much given the risks already involved by the process itself.
>
> What I'm describing represents a small category of processes in only
> certain environments. Some database servers will have the same issue.
> Imagine a Redis server for example, which normally is very fast and
> easily saturates whatever network is around it. Some DNS providers may
> have the same problem when dealing with hundreds of thousands to
> millions of UDP packets each second (not counting attacks).
>
> All such services are critical in themselves, but the fact that we accept
> to let them share the risks with the system doesn't mean they should be
> running without the protections from the occasional operations guy just
> allowed to connect there to verify if logs are full and to retrieve
> stats.

Reasonable.

>> Disabling protection on the most vulnerable applications is not behavior
>> I would encourage.
>
> I'm not encouraging this behaviour either but right now the only option
> for performance critical applications (even if they are vulnerable) is
> to make the whole system vulnerable.
>
>> It seems better than disabling protection system
>> wide but only slightly. I definitely don't think this is something we
>> want applications disabling themselves.
>
> In fact that's what I liked with the wrapper approach, except that it
> had the downside of being harder to manage in terms of administration
> and we'd risk to see it used everywhere by default. The arch_prctl()
> approach ensures that only applications where this is relevant can do
> it. In the case of haproxy, I can trivially add a config option like
> "disable-page-isolation" to let the admin enable it on purpose.

How is that different from the option?

> But I suspect there might be some performance critical applications that
> cannot be patched, and that's where the wrapper could still provide some
> value.

I just don't want to encourage changing this option by default, as a
lot of applications get installed on home servers or other places where
they are not performance critical.
In those places, disabling the KPTI protection by default would reduce
the level of protection of everything.

But ultimately I only brought this up so that people are thinking about
the other side of this: not how it will affect the single-function
high-performance servers, but how it will affect the millions of little
servers that do many things all from a single machine.

Certainly I would not want this enabled in a container or a virtual
private server. The capable(CAP_SYS_RAWIO) check seems to handle that
beautifully.

>> Certainly this is something that should look at no-new-privs and if
>> no-new-privs is set not allow disabling this protection.
>
> I don't know what is "no-new-privs" and couldn't find info on it
> unfortunately. Do you have a link please ?

Probably because I used dashes. The no new privs flag is documented in:

Documentation/userspace-api/no_new_privs.rst

It is a sandboxing flag that guarantees a process can not gain
privileges after it has been set. You can search for PFA_NO_NEW_PRIVS
in sched.h if you want to see where it is defined.

Eric
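
For reference, a minimal userspace sketch of the no_new_privs flag
mentioned above. It only shows how a task sets and reads the flag through
prctl(), as described in no_new_privs.rst. The helper names
set_no_new_privs() and get_no_new_privs() are hypothetical, and how the
PTI patchset would consult the flag on the kernel side is not shown here.

```c
#include <sys/prctl.h>

/*
 * Hypothetical helper: set no_new_privs on the calling thread.
 * Returns 0 on success, -1 on error (errno set by prctl).
 */
int set_no_new_privs(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1UL, 0UL, 0UL, 0UL);
}

/*
 * Hypothetical helper: read the flag back.
 * Returns 1 if set, 0 if not set, -1 on error.
 */
int get_no_new_privs(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0UL, 0UL, 0UL, 0UL);
}
```

Once set, the flag cannot be cleared and is inherited across fork(),
clone() and execve(), which is why it looks like a natural thing to
check before letting a task turn a protection off.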