by Greg KH

Subject: Re: [PATCH 0/29] arm meltdown fix backporting review for lts 4.9

On Fri, Mar 02, 2018 at 05:14:50PM +0800, Alex Shi wrote:
>
>
> On 03/01/2018 11:24 PM, Greg KH wrote:
> > On Wed, Feb 28, 2018 at 11:56:22AM +0800, Alex Shi wrote:
> >> Hi All,
> >>
> >> This backport patchset fixes the meltdown issue; its original branch is:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti
> >> A few dependency or fixing patches are also picked up, if they are necessary
> >> and make no functional changes.
> >>
> >> The patchset is also in this repository:
> >> git://git.linaro.org/kernel/linux-linaro-stable.git lts-4.9-spectrevv2
> >>
> >> No bug found yet from kernelci.org and lkft testing.
> >
> > No bugs is good, but does it actually fix the meltdown problem? What
> > did you test it on?
>
> Oh, I have no A73/A75 CPU, so I cannot reproduce the meltdown bug.

Then why should I trust this backport at all?

Please test on the hardware that is affected, otherwise you do not know
if your patches do anything or not.

> > And why are you making this patchset up? What is wrong with the patches
> > in the android-common tree for this?
>
> We believe the LTS is the base kernel for android/lsk, so the fixing
> patches should land there first and then be merged into other trees.

But you know that android-common is already fine here, the needed
patches are all integrated into there, so no additional work is needed
for android devices. So what devices do you expect to use this 4.9
backport?

What is "lsk"?

> >> Any comments are appreciated!
> >
> > You need to start versioning this changeset, as I have no idea if this
> > is the "latest" one or not, right?
> > Or have you not sent out this patchset before? How does this interact
> > with the "spectre" patches? Or am I totally confused here?
>
> It is the first patchset for meltdown. Yes, I will resend this patchset
> with versioning after the renesas board booting is fixed.
>
> Meltdown and spectre are 2 different bugs, and the fixing patchsets are
> isolated from each other. So I did the backport as 2 different patchsets. And
> merging them together is relatively simple. I will come with a merge
> patch next time, after the meltdown patchset is ready. (kernelci hasn't
> worked well in recent days)

I don't want a merged patchset, but having one dependent on the other is
just fine.

Again, test this on real hardware properly first.

But really, I don't see the need for this, as all ARM devices that I know
of that are stuck on 4.9.y are already using the android-common tree. Same
for 4.4.y. Do you know of any that are not, and that cannot just use
4.14.y instead?

thanks,

greg k-h

2018-03-05 14:16:29

On Mon, Mar 05, 2018 at 02:08:59PM +0100, Greg KH wrote:

> I know there is lots more than Android to ARM, but the huge majority by
> quantity is Android.

> What I'm saying here is look at all of the backports that were required
> to get this working in the android tree. It was non-trivial by a long
> shot, and based on that work, this series feels really "small" and I'm
> really worried that it's not really working or solving the problem here.

Unfortunately what's been coming over was just the bit about using
android-common, not the bit about why you're worried about the code. :(

> There are major features that were backported to the android trees for
> ARM that the upstream features for Spectre and Meltdown built on top of
> to get their solution. To not backport all of that is a huge risk,
> right?

I'm not far enough into the details to comment on the specifics here;
there's other people in the CCs who are. Let's let people look at the
code and see if they think some of the fixes are useful in LTS. The
Android tree does have things beyond what's in LTS and there's been more
time for analysis since the changes were made there.

> So that's why I keep pointing people at the android trees. Look at what
> they did there. There's nothing stopping anyone who is really insistent
> on staying on these old kernel versions from pulling from those branches
> to get these bugfixes in a known stable, and tested, implementation.

I think there's enough stuff going on in the Android tree to make that
unpalatable for a good segment of users.

> Or just move to 4.14.y. Seriously, that's probably the safest thing in
> the long run for anyone here. And when you realize you can't do that,
> go yell at your SoC for forcing you into the nightmare that they conned
> you into by their 3+ million lines added to their kernel tree. You were
> always living on borrowed time, and it looks like that time is finally
> up...

Yes, there are some people who are stuck with enormous out of tree patch
sets on most architectures (just look at the enterprise distros!) - but
there are also people who are at or very close to vanilla and just
trying to control their validation costs by not changing too much when
they don't need to. There's a good discussion to be had about it being
sensible for people to accept more change in that segment of the market
but equally those same attitudes have been an important part of the
pressure that's been placed on vendors long term to get things in
mainline.

> [1] It's also why I keep doing the LTS merges into the android-common
> trees within days of the upstream LTS release (today being an
> exception). That way once you do a pull/merge, you can just keep
> always merging to keep a secure device that is always up to date
> with the latest LTS releases in a simple way. How much easier can I
> make it for the ARM ecosystem here, really?

That's great for the Android ecosystem, it's fantastic work and is doing
a lot to overcome resistances people had there to merging up the LTS
which is going to help many people. While that's a very large part of
the ARM ecosystem it's not all of it; there are also chip vendors and system
integrators who have made deliberate choices to minimize out of tree
code just as we've been encouraging them to.

2018-03-06 17:26:56

by Greg KH

Subject: Re: [PATCH 0/29] arm meltdown fix backporting review for lts 4.9

On Tue, Mar 06, 2018 at 02:26:34PM +0000, Mark Brown wrote:
> On Mon, Mar 05, 2018 at 02:08:59PM +0100, Greg KH wrote:
>
> > I know there is lots more than Android to ARM, but the huge majority by
> > quantity is Android.
>
> > What I'm saying here is look at all of the backports that were required
> > to get this working in the android tree. It was non-trivial by a long
> > shot, and based on that work, this series feels really "small" and I'm
> > really worried that it's not really working or solving the problem here.
>
> Unfortunately what's been coming over was just the bit about using
> android-common, not the bit about why you're worried about the code. :(

Sorry, it's been a long few months, my ability to communicate well about
this topic is tough at times without assuming everyone else has been
dealing with it for as long as some of us have.

> > There are major features that were backported to the android trees for
> > ARM that the upstream features for Spectre and Meltdown built on top of
> > to get their solution. To not backport all of that is a huge risk,
> > right?
>
> I'm not far enough into the details to comment on the specifics here;
> there's other people in the CCs who are. Let's let people look at the
> code and see if they think some of the fixes are useful in LTS. The
> Android tree does have things beyond what's in LTS and there's been more
> time for analysis since the changes were made there.

I suggest looking at the backports in the android-common tree that are
needed for this "feature" to work properly, and pull them out and test
them if you really want it in your Linaro trees. If you think some of
them should be added to the LTS kernels, I'll be glad to consider them,
but don't do a hack to try to work around the lack of these features,
otherwise you will not be happy in the long-run.

Again, look at the mess we have for x86 in 4.4.y and 4.9.y. You do not
want that for ARM for the simple reason that ARM systems usually last
"longer" with those old kernels than the x86 systems do.

> > So that's why I keep pointing people at the android trees. Look at what
> > they did there. There's nothing stopping anyone who is really insistent
> > on staying on these old kernel versions from pulling from those branches
> > to get these bugfixes in a known stable, and tested, implementation.
>
> I think there's enough stuff going on in the Android tree to make that
> unpalatable for a good segment of users.

Really? Like what? Last I looked it's only about 300 or so patches.
Something like less than .5% of the normal SoC backport size for any ARM
system recently. There were some numbers published a few months ago
about the real count, I can dig them up if you are curious.

> > Or just move to 4.14.y. Seriously, that's probably the safest thing in
> > the long run for anyone here. And when you realize you can't do that,
> > go yell at your SoC for forcing you into the nightmare that they conned
> > you into by their 3+ million lines added to their kernel tree. You were
> > always living on borrowed time, and it looks like that time is finally
> > up...
>
> Yes, there are some people who are stuck with enormous out of tree patch
> sets on most architectures (just look at the enterprise distros!) - but
> there are also people who are at or very close to vanilla and just
> trying to control their validation costs by not changing too much when
> they don't need to.

Great, then move to 4.14.y :)

And before someone says "but it takes more to validate a new kernel
version than it does to just validate a core backport for the
architecture code", well...

> There's a good discussion to be had about it being sensible for people
> to accept more change in that segment of the market but equally those
> same attitudes have been an important part of the pressure that's been
> placed on vendors long term to get things in mainline.
>
> > [1] It's also why I keep doing the LTS merges into the android-common
> > trees within days of the upstream LTS release (today being an
> > exception). That way once you do a pull/merge, you can just keep
> > always merging to keep a secure device that is always up to date
> > with the latest LTS releases in a simple way. How much easier can I
> > make it for the ARM ecosystem here, really?
>
> That's great for the Android ecosystem, it's fantastic work and is doing
> a lot to overcome resistances people had there to merging up the LTS
> which is going to help many people. While that's a very large part of
> the ARM ecosystem it's not all of it; there are also chip vendors and system
> integrators who have made deliberate choices to minimize out of tree
> code just as we've been encouraging them to.

Again great, go use 4.14.y for those systems please. It's better in the
long run.

thanks,

greg k-h

2018-03-06 21:32:48

On Tue, Mar 13, 2018 at 01:01:43PM +0000, Ard Biesheuvel wrote:
> On 13 March 2018 at 10:38, Greg KH wrote:
> > On Tue, Mar 13, 2018 at 10:13:26AM +0000, Ard Biesheuvel wrote:
> >> On 13 March 2018 at 10:04, Greg KH wrote:
> >> > On Wed, Mar 07, 2018 at 06:24:09PM +0000, Ard Biesheuvel wrote:
> >> >> On 2 March 2018 at 16:54, Greg KH wrote:
> ...
> >> >> > Please test on the hardware that is affected, otherwise you do not know
> >> >> > if your patches do anything or not.
> >> >> >
> >> >>
> >> >> I don't think it is feasible to test these backports by confirming
> >> >> that they make the fundamental issue go away. We simply don't have the
> >> >> code to reproduce all the variants, and we have to rely on the
> >> >> information provided by ARM Ltd. regarding which cores are affected
> >> >> and which aren't.
> >> >
> >> > You really don't have the reproducers? Please work with ARM to resolve
> >> > that, this should not be a non-tested set of patches. That's really
> >> > worse than no patches at all, as if they were applied, that would
> >> > provide a false-sense of "all is fixed".
> >> >
> >>
> >> I know that on x86, the line between architecture and platform is
> >> blurry. That is not the case on ARM, though.
> >>
> >> Unlike platform firmware, the OS is built on top of an abstracted
> >> platform which is described by ARM's Architecture Reference Manual. If
> >> ARM Ltd. issues recommendations regarding what firmware PSCI methods
> >> to call when doing a context switch, or which barrier instruction to
> >> issue in certain circumstances, they do so because a certain class of
> >> hardware may require it in some cases. It is really not up to me to go
> >> find some exploit code on GitHub, run it before and after applying the
> >> patch and conclude that the problem is fixed. Instead, what I should
> >> do is confirm that the changes result in the recommended actions to be
> >> taken at the appropriate times.
> >
> > To _not_ take that exploit code and run it to _verify_ that your patches
> > work, would be foolish, right?
> >
>
> Oh, absolutely. But that presupposes access to both the affected
> hardware and the exploit code.

If you all don't have access to both, then someone is doing something
seriously wrong. Go complain to ARM please, we all know they have both.

I just got done yelling at a whole bunch of vendors last week about this
whole mess at a very large meeting of a lot of different Linux-based
companies. It's crazy that the dysfunction is still happening.

greg k-h