Date: Fri, 15 Feb 2019 10:19:12 -0500
From: Sasha Levin
To: Michal Hocko
Cc: Greg Kroah-Hartman, Andrew Morton, stable@vger.kernel.org, Linus Torvalds, Richard Weinberger, Samuel Dionne-Riel, LKML, graham@grahamc.com, Oleg Nesterov, Kees Cook
Subject: Re: Userspace regression in LTS and stable kernels
Message-ID: <20190215151912.GA10616@sasha-vm>
References: <20190214122027.c0df36282d65dc9979248117@linux-foundation.org>
 <20190215070022.GD14473@kroah.com>
 <20190215091000.GT4525@dhcp22.suse.cz>
 <20190215092013.GA32575@kroah.com>
 <20190215094205.GW4525@dhcp22.suse.cz>
In-Reply-To: <20190215094205.GW4525@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 15, 2019 at 10:42:05AM +0100, Michal Hocko wrote:
>On Fri 15-02-19 10:20:13, Greg KH wrote:
>> On Fri, Feb 15, 2019 at 10:10:00AM +0100, Michal Hocko wrote:
>> > On Fri 15-02-19 08:00:22, Greg KH wrote:
>> > > On Thu, Feb 14, 2019 at 12:20:27PM -0800, Andrew Morton wrote:
>> > > > On Thu, 14 Feb 2019 09:56:46 -0800 Linus Torvalds wrote:
>> > > >
>> > > > > On Wed, Feb 13, 2019 at 3:37 PM Richard Weinberger wrote:
>> > > > > >
>> > > > > > Your shebang line exceeds BINPRM_BUF_SIZE.
>> > > > > > Before the said commit the kernel silently truncated the shebang line
>> > > > > > (and corrupted it),
>> > > > > > now it tells the user that the line is too long.
>> > > > >
>> > > > > It doesn't matter if it "corrupted" things by truncating it. All that
>> > > > > matters is "it used to work, now it doesn't"
>> > > > >
>> > > > > Yes, maybe it never *should* have worked. And yes, it's sad that
>> > > > > people apparently had cases that depended on this odd behavior, but
>> > > > > there we are.
>> > > > >
>> > > > > I see that Kees has a patch to fix it up.
>> > > > >
>> > > >
>> > > > Greg, I think we have a problem here.
>> > > >
>> > > > 8099b047ecc431518 ("exec: load_script: don't blindly truncate shebang
>> > > > string") wasn't marked for backporting. And, presumably as a
>> > > > consequence, Kees's fix "exec: load_script: allow interpreter argument
>> > > > truncation" was not marked for backporting.
>> > > >
>> > > > 8099b047ecc431518 hasn't even appeared in a Linus released kernel, yet
>> > > > it is now present in 4.9.x, 4.14.x, 4.19.x and 4.20.x.
>> > >
>> > > It came in 5.0-rc1, so it fits the "in a Linus released kernel"
>> > > requirement. If we are to wait until it shows up in a -final, that
>> > > would be months too late for almost all of these types of patches that
>> > > are picked up.
>> >
>> > rc1 is just a too early. Waiting few more rcs or even a final release
>> > for something that people do not see as an issue should be just fine.
>> > Consider this particular patch and tell me why it had to be rushed in
>> > the first place. The original code was broken for _years_ but I do not
>> > remember anybody would be complaining.
>>
>> This patch was in 4.20.10, which was released on Feb 12 while 5.0-rc1
>> came out on Jan 6. Over a month delay.
>
>Obviously not long enough.

You're assuming that if we hadn't taken this patch to stable, someone else would somehow have noticed this bug and fixed it. What test do we have that would catch this? Which test suite tests for long shebang lines? Where is the test added together with this patch that covers this and similar cases?

The fact is that many patches are not tested until they get to stable, whether we add them the same week they went upstream or months later.

This is a great case for this: I doubt anyone but NixOS does this crazy thing with shebang lines, so who else would discover the bug?

If this is indeed a case of us jumping the gun and shipping stuff too early before all tests are complete, please point me to the test that we missed and I'll make sure that for any future kernel release it gets run before we ship a stable kernel.

>> > > > I don't know if Oleg considered backporting that patch. I certainly
>> > > > did (I always do), and I decided against doing so. Yet there it is.
>> > >
>> > > This came in through Sasha's tools, which give people a week or so to
>> > > say "hey, this isn't a stable patch!" and it seems everyone ignored that
>> > > :(
>> >
>> > I thought we were through this already. Automagic autoselection of
>> > patches in the core kernel (or mmotm tree patches in particular) is too
>> > dangerous. We try hard to consider each and every patch for stable. Even
>> > if something slips through then it is much more preferred to ask for a
>> > stable backport in the respective email thread and wait for a conclusion
>> > before adding it.
>>
>> We have a list of blacklisted files/subsystems for people that do not
>> want this to happen to their area of the kernel. The patch seemed to
>> make sense, and it passed all known tests that we currently have.
>
>Yes, the patch makes sense (I wouldn't give my acked-by otherwise). But
>this is one of the area where things that make sense might still break
>because it is hard to assume what userspace depends on.

Great, so the solution is to just not take these things into stable at all?

The solution should be to add tests to the patches that go in there to verify their correctness and that they don't regress in the future. If you're really concerned about subsystems being brittle, the solution is to improve their testing rather than push stuff in and hope nothing explodes.

On one hand you Ack it, saying it looks great to you and should be merged, but on the other hand you're saying that you don't really trust the patch?

Really, if I hadn't picked this patch now, what do you think would have happened? It would just pop up in a few months as we roll our stable kernel forward.

>> Sometimes things will slip through like this, it happens. And really, a
>> 3 day turn-around-time to resolve this is pretty good, don't you think?
>
>Yes, but that doesn't make any difference on the fact that this was not
>marked for stable and I still think this is not a stable material - at
>least not at this moment.

Hindsight is 20/20 :)

If people were good at understanding the impact and implications their patch has on the kernel, we would never introduce new bugs! I'll happily list a bunch more patches that folks didn't think were stable material, but which turned out to be important fixes.

--
Thanks,
Sasha