LinuxLists.cc - [GIT PULL] omap changes for v2.6.39 merge window

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 17, 2011 at 11:30 AM, Tony Lindgren <[email protected]> wrote:
>
> Please pull omap changes for this merge window from:

Gaah. Guys, this whole ARM thing is a f*cking pain in the ass.

You need to stop stepping on each other's toes. There is no way that
your changes to those crazy clock-data files should constantly result
in those annoying conflicts, just because different people in
different ARM trees do some masturbatory renaming of some random
device. Seriously.

That usb_musb_init() thing in arch/arm/mach-omap2/usb-musb.c also
seems to be totally insane. I wonder what kind of insanity I'm missing
just because I don't happen to see the merge conflicts, just because
people were lucky enough to happen to not touch the same file within a
few lines.

Somebody needs to get a grip in the ARM community. I do want to do
these merges, just to see how screwed up things are, but guys, this is
just ridiculous. The pure amount of crazy churn is annoying in itself,
but when I then get these "independent" pull requests from four
different people, and they touch the same files, that indicates that
something is wrong.

And stop the crazy renaming already! Just leave it off. Don't rename
boards and drivers "just because", at least not when there clearly are
clashes. There's no point. I'm not even talking about the file renames
(which happened and can also make it "fun" to try to resolve the
conflicts when somebody else then makes _other_ changes), but about
the stupid "change human-readable names in board files just to annoy
whoever needs to merge the crap".

Somebody in the ARM community really needs to step up and tell people
to stop dicking around.

(I'm replying to the omap pull request, because that's the one I did
last, but I don't know who to "blame". I don't care. It really doesn't
matter. I realize that ARM vendors do crazy shit and haven't figured
out this whole "platform" thing yet, but you guys need to push back on
the people sending you crap).

Linus

2011-03-18 03:03:16

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 17, 2011 at 11:30 AM, Tony Lindgren <[email protected]> wrote:
>
> Please pull omap changes for this merge window from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6.git omap-for-linus

Btw, that "generic hardware spinlock" thing needs to be hidden from
sane people and architectures that don't need it.

Make platforms that need it "select" it or something. It looks like
nobody but an OMAP4 could _possibly_ ever want to answer 'y' to that
question, SO DON'T ASK IT!

Linus

2011-03-18 07:07:08

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Linus Torvalds <[email protected]> [110317 19:48]:
> On Thu, Mar 17, 2011 at 11:30 AM, Tony Lindgren <[email protected]> wrote:
> >
> > Please pull omap changes for this merge window from:
>
> Gaah. Guys, this whole ARM thing is a f*cking pain in the ass.
>
> You need to stop stepping on each other's toes. There is no way that
> your changes to those crazy clock-data files should constantly result
> in those annoying conflicts, just because different people in
> different ARM trees do some masturbatory renaming of some random
> device. Seriously.
>
> That usb_musb_init() thing in arch/arm/mach-omap2/usb-musb.c also
> seems to be totally insane. I wonder what kind of insanity I'm missing
> just because I don't happen to see the merge conflicts, just because
> people were lucky enough to happen to not touch the same file within a
> few lines.

This merge conflict was really unfortunate, the plan was to queue
driver related changes via the usb-devel list and the platform init
changes via the linux-omap list. But obviously something went wrong..

Anyways, we are close to done making the platform init code shared
between omaps and then drivers become arch independent. So things
should get easier for next merge window already.

> Somebody needs to get a grip in the ARM community. I do want to do
> these merges, just to see how screwed up things are, but guys, this is
> just ridiculous. The pure amount of crazy churn is annoying in itself,
> but when I then get these "independent" pull requests from four
> different people, and they touch the same files, that indicates that
> something is wrong.

Well in this case the conflicts were between driver changes and arch
related changes :) For the ARM and various ARM related SoC changes
there is a lot of work going on to make things more generic. So with
that the crazy churn should also ease, but it takes a lot of work to
get there.

> And stop the crazy renaming already! Just leave it off. Don't rename
> boards and drivers "just because", at least not when there clearly are
> clashes. There's no point. I'm not even talking about the file renames
> (which happened and can also make it "fun" to try to resolve the
> conflicts when somebody else then makes _other_ changes), but about
> the stupid "change human-readable names in board files just to annoy
> whoever needs to merge the crap".

OK, point taken. One part of the problem here is the current dependencies
between the driver code and platform code, which make it hard to patch
one without the other. Hopefully this too will ease as the drivers
become separated from the platform code.

> Somebody in the ARM community really needs to step up and tell people
> to stop dicking around.
>
> (I'm replying to the omap pull request, because that's the one I did
> last, but I don't know who to "blame". I don't care. It really doesn't
> matter. I realize that ARM vendors do crazy shit and haven't figured
> out this whole "platform" thing yet, but you guys need to push back on
> the people sending you crap).

OK we'll pass on the message.

Regards,

Tony

2011-03-18 07:09:42

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Linus Torvalds <[email protected]> [110317 20:00]:
> On Thu, Mar 17, 2011 at 11:30 AM, Tony Lindgren <[email protected]> wrote:
> >
> > Please pull omap changes for this merge window from:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6.git omap-for-linus
>
> Btw, that "generic hardware spinlock" thing needs to be hidden from
> sane people and architectures that don't need it.
>
> Make platforms that need it "select" it or something. It looks like
> nobody but an OMAP4 could _possibly_ ever want to answer 'y' to that
> question, SO DON'T ASK IT!

No problem, let's make it depends on OMAP4.

Regards,

Tony

2011-03-18 08:07:13

by Ohad Ben Cohen

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, Mar 18, 2011 at 9:09 AM, Tony Lindgren <[email protected]> wrote:
> * Linus Torvalds <[email protected]> [110317 20:00]:
>> Make platforms that need it "select" it or something. It looks like
>> nobody but an OMAP4 could _possibly_ ever want to answer 'y' to that
>> question, SO DON'T ASK IT!
>
> No problem, let's make it depends on OMAP4.

(patch also attached in case gmail eats my whitespace)

From d086e8f994b9272f8c999af0a4d32d870749c77a Mon Sep 17 00:00:00 2001
From: Ohad Ben-Cohen <[email protected]>
Date: Fri, 18 Mar 2011 10:01:11 +0200
Subject: [PATCH] hwspinlock: depend on OMAP4

Currently only OMAP4 supports hwspinlocks, so don't bother asking
anyone else.

Signed-off-by: Ohad Ben-Cohen <[email protected]>
---
drivers/hwspinlock/Kconfig | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/hwspinlock/Kconfig b/drivers/hwspinlock/Kconfig
index eb4af28..1f29bab 100644
--- a/drivers/hwspinlock/Kconfig
+++ b/drivers/hwspinlock/Kconfig
@@ -4,6 +4,7 @@

 config HWSPINLOCK
 	tristate "Generic Hardware Spinlock framework"
+	depends on ARCH_OMAP4
 	help
 	  Say y here to support the generic hardware spinlock framework.
 	  You only need to enable this if you have hardware spinlock module
--
1.7.1

Attachments:

0001-hwspinlock-depend-on-OMAP4.patch (844.00 B)
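
The two options discussed above ("depends on" versus "select") can be
illustrated with a toy model. This is only a sketch in Python, not how
Kconfig is implemented: the Symbol class and both helper functions are
hypothetical, and real Kconfig handles tristate values and full
dependency expressions that are ignored here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Symbol:
    """Hypothetical, heavily simplified stand-in for a Kconfig symbol."""
    name: str
    prompt: Optional[str] = None       # None: the user is never asked
    depends_on: Optional[str] = None   # a single symbol name, for simplicity
    selects: Tuple[str, ...] = ()


def questions_to_ask(symbols: Iterable[Symbol], enabled: Set[str]) -> List[str]:
    """Return the symbols whose prompt would be shown to the user."""
    return [
        s.name
        for s in symbols
        if s.prompt is not None
        and (s.depends_on is None or s.depends_on in enabled)
    ]


def apply_selects(symbols: Iterable[Symbol], enabled: Set[str]) -> Set[str]:
    """Force on everything selected by an enabled symbol, transitively."""
    symbols = list(symbols)
    result = set(enabled)
    changed = True
    while changed:
        changed = False
        for s in symbols:
            if s.name in result:
                for target in s.selects:
                    if target not in result:
                        result.add(target)
                        changed = True
    return result


# Variant 1: the patch above. The prompt exists but is hidden without OMAP4.
WITH_DEPENDS = [
    Symbol("ARCH_OMAP4"),
    Symbol("HWSPINLOCK", prompt="Generic Hardware Spinlock framework",
           depends_on="ARCH_OMAP4"),
]

# Variant 2: the "select" idea. No prompt at all; the platform turns it on.
WITH_SELECT = [
    Symbol("ARCH_OMAP4", selects=("HWSPINLOCK",)),
    Symbol("HWSPINLOCK"),
]
```

In the first variant an x86 build is never asked about HWSPINLOCK, while
an OMAP4 build still is. In the second variant nobody is asked, and the
symbol is enabled whenever ARCH_OMAP4 is.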

2011-03-18 10:16:13

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 17, 2011 at 07:50:36PM -0700, Linus Torvalds wrote:
> On Thu, Mar 17, 2011 at 11:30 AM, Tony Lindgren <[email protected]> wrote:
> >
> > Please pull omap changes for this merge window from:
>
> Gaah. Guys, this whole ARM thing is a f*cking pain in the ass.

Please be more specific. ARM is not small. ARM as an architecture is
_massive_, with lots of people working in it. By blaming 'ARM', it's a
bit like blaming the entire planet for problems in one country.

You've only had merge conflicts with IMX/MXC and OMAP so far, and that
hardly warrants tarring the "whole ARM thing" with the same brush.

Most people are trying very hard to do the right thing.

> Somebody needs to get a grip in the ARM community. I do want to do
> these merges, just to see how screwed up things are, but guys, this is
> just ridiculous.

It isn't as bad as you think it is for the majority of ARM stuff.

> The pure amount of crazy churn is annoying in itself,
> but when I then get these "independent" pull requests from four
> different people, and they touch the same files, that indicates that
> something is wrong.

I have already complained to Uwe and Sascha about the IMX/MXC conflicts.
It already struck me that there's something seriously wrong at
pengutronix.com as both Uwe and Sascha work in the same area, yet don't
coordinate their efforts. It seems to me that Uwe is completely
independent of everyone else.

As for anything else, I really don't see a problem. There are changes
in core ARM code which have impacts on _every_ ARM platform. It's just
a fact of life that there could be conflicts between core changes and
platform changes. That's not because things are uncoordinated - I do
try to ensure that such patches receive as many acks from maintainers
as possible.

> Somebody in the ARM community really needs to step up and tell people
> to stop dicking around.

You mean like I've already done with Uwe and Sascha?

I do get the impression that you're extremely unhappy with the way ARM
stuff works, and I've no real idea how to solve that. I think much of
it is down to perception rather than anything tangible.

Maybe the only solution is for ARM to fork the kernel, which is something
I really don't want to do - but from what I'm seeing it's the only solution
which could come close to making you happy.

2011-03-18 11:13:36

by Uwe Kleine-König

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Hello Russell,

On Fri, Mar 18, 2011 at 10:15:12AM +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 17, 2011 at 07:50:36PM -0700, Linus Torvalds wrote:
> > The pure amount of crazy churn is annoying in itself,
> > but when I then get these "independent" pull requests from four
> > different people, and they touch the same files, that indicates that
> > something is wrong.
>
> I have already complained to Uwe and Sascha about the IMX/MXC conflicts.
> It already struck me that there's something seriously wrong at
> pengutronix.com as both Uwe and Sascha work in the same area, yet don't
> coordinate their efforts. It seems to me that Uwe is completely
> independent of everyone else.

I feel blamed wrongly here. My part of the "crazy churn" in
v2.6.38..$todayslinus/master is that I touched drivers/net/Kconfig[1] and
arch/arm/mach-mxs/gpio.c[2]. The former went in via the net tree; the
latter via Russell's tree with Sascha's Ack. Please correct me if I'm
wrong but I think this was the correct thing to do. I don't know how
that qualifies as "completely independent of everyone else".

IMHO the cooperation between Sascha and me works fine. In fact nearly
all[3] of my patches that touch imx related things under arch/arm/ go in
via Sascha's tree.

Best regards
Uwe

[1] 085e79e (net/fec: consolidate all i.MX options to CONFIG_ARM)
[2] bf0c111 (ARM: 6744/1: mxs: irq_data conversion)
[3] Some exceptions I found are:
bf0c111 ARM: 6744/1: mxs: irq_data conversion
4df772d ARM: 6322/1: imx/pca100: Fix name of spi platform data
868003c ARM: 6280/1: imx: Fix build failure when including <mach/gpio.h> without <linux/spinlock.h>
All of these are OK IMHO.

--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |

2011-03-18 23:43:26

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Ohad Ben-Cohen <[email protected]> [110318 01:04]:
> On Fri, Mar 18, 2011 at 9:09 AM, Tony Lindgren <[email protected]> wrote:
> > * Linus Torvalds <[email protected]> [110317 20:00]:
> >> Make platforms that need it "select" it or something. It looks like
> >> nobody but an OMAP4 could _possibly_ ever want to answer 'y' to that
> >> question, SO DON'T ASK IT!
> >
> > No problem, let's make it depends on OMAP4.
>
> From d086e8f994b9272f8c999af0a4d32d870749c77a Mon Sep 17 00:00:00 2001
> From: Ohad Ben-Cohen <[email protected]>
> Date: Fri, 18 Mar 2011 10:01:11 +0200
> Subject: [PATCH] hwspinlock: depend on OMAP4
>
> Currently only OMAP4 supports hwspinlocks, so don't bother asking
> anyone else.
>
> Signed-off-by: Ohad Ben-Cohen <[email protected]>

Thanks I'll queue this.

Regards,

Tony

2011-03-30 17:07:00

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 18 March 2011, Russell King - ARM Linux wrote:
> I do get the impression that you're extremely unhappy with the way ARM
> stuff works, and I've no real idea how to solve that. I think much of
> it is down to perception rather than anything tangible.
>
> Maybe the only solution is for ARM to fork the kernel, which is something
> I really don't want to do - but from what I'm seeing it's the only solution
> which could come close to making you happy.

I'm still new to the ARM world, but I think one real problem is the way
that all platforms have their own trees with a very flat hierarchy --
a lot of people directly ask Linus to pull their trees, and the main
way to sort out conflicts is linux-next. The number of platforms in the
ARM arch is still increasing, so I assume that this only gets worse.

This would be no easier if everyone was asking you to pull their trees,
as I believe was the case before that. The amount of code getting changed
there is too large to get reviewed by a single person, and I believe
neither of you really wants the burden to judge if all of the branches
are ok (and complain to the authors when they are not).

Russell, do you think it would help to have an additional ARM platform
tree that collects all the changes that impact only the platform code but
not the core architecture? I believe that would be a way out, but requires
a careful selection of people responsible for it. In particular, I don't
think a single person can handle it without good sub-maintainers.

The way that x86 is maintained is to have a small group of people that
all have write access to one tree, so patches and branches from downstream
maintainers can get pulled by a number of people, when at least one of
them is totally comfortable with the contents and nobody else objects.
In case one of them is unhappy about something that went in, it can always
get reverted and will not be applied again until everybody is happy with it.

I think a similar setup would be possible for ARM, but only if you are
in the team, plus one person from at least ARM Ltd (Catalin?),
Linaro (Nicolas?) and maybe one or two more, but none of the actual
SoC vendors that produce the bulk of the code that would go in there.

It would also require buy-in from Linus eventually, to make it clear that
he would pull from that tree directly and no longer from other
subarchitecture trees, once the process has been proven to work
for everyone.

I don't think I would want to be on the committers team myself, to make
it not too Linaro-heavy, but I can definitely offer a significant amount
of time for reviewing patches to be committed by someone else.

Also, I would assume that your own time would keep focused on the core
ARM tree that already keeps you busy enough.

Arnd

2011-03-30 19:21:55

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 10:06 AM, Arnd Bergmann <[email protected]> wrote:
>
> I'm still new to the ARM world, but I think one real problem is the way
> that all platforms have their own trees with a very flat hierarchy --
> a lot of people directly ask Linus to pull their trees, and the main
> way to sort out conflicts is linux-next. The number of platforms in the
> ARM arch is still increasing, so I assume that this only gets worse.

As far as I'm concerned, the biggest issue is that some of the ARM
crap is just CRAP. It's idiotic tables that get updated by multiple
people, and in totally nonsensical ways. When I see conflicts in those
damn clock-data files, I just go mental. Those files are an
abomination.

Why the hell is the clock-data for fifty (number taken out of my ass)
different clocking rules in one array? And why do we have eight
different files of that kind of crap for omap2?

THAT is an example of something that is totally and utterly screwed
up. Those kinds of random board-level detail files abound in the ARM
tree. They should either be in a per-board file, or (much better) the
ARM people should have standardized this ten f*cking years ago, and
put it in a bootloader or something that just initializes the crap so
that the kernel doesn't have to have random tables like that at all.

ARM right now is a nightmare, and most of it is because ARM hardware
manufacturers are morons. But the way the ARM tree is then laid out
has made that even more painful, and the decision to put all the crazy
board details in the kernel tables instead of trying to have a
per-board boot loader that fills in the details is just crazy.

Look at the dirstat for arch/ in just the current merge window
(cut-off at 5% just to not get too much):

[torvalds@i5 linux]$ git diff -C --dirstat=5 --cumulative v2.6.38.. arch
14.0% arch/arm/mach-omap2/
5.8% arch/arm/plat-mxc/include/mach/
6.3% arch/arm/plat-mxc/
57.1% arch/arm/
5.4% arch/m68k/
9.6% arch/unicore32/
6.9% arch/x86/
100.0% arch/

almost *SIXTY* percent of all arch updates were to ARM code. And
that's despite the fact that one of those architectures (unicore32) is
a totally new architecture, and despite m68k having gone through a
first-level unification of nommu and mmu code!

And was this just a fluke? No. Doing the same for 2.6.37..38 gives
58.3% for arch/arm (and in that release we had a blackfin unification
effort, otherwise arm would have been an even bigger percentage).
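
For readers unfamiliar with the output above, the idea behind a
cumulative dirstat can be sketched as follows. This is a simplified
illustration in Python, not git's implementation: it assumes a plain
"amount of change per file" number, whereas git uses its own measure of
change. The function name and the sample figures are made up for the
example and are not taken from the real merge window.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def dirstat(changes: Dict[str, int], cutoff: float = 5.0) -> List[Tuple[str, float]]:
    """Return (directory, percent) pairs at or above `cutoff` percent.

    `changes` maps a file path to an amount of change. Every ancestor
    directory of a file is credited with that file's change, which is
    what makes the result cumulative.
    """
    total = sum(changes.values())
    if total == 0:
        return []
    per_dir: Dict[str, int] = defaultdict(int)
    for path, amount in changes.items():
        parts = path.split("/")[:-1]          # drop the file name
        for depth in range(1, len(parts) + 1):
            per_dir["/".join(parts[:depth]) + "/"] += amount
    rows = [(d, round(100.0 * n / total, 1)) for d, n in per_dir.items()]
    return sorted((d, pct) for d, pct in rows if pct >= cutoff)


# Made-up sample data that adds up to 100 units of change.
sample = {
    "arch/arm/mach-omap2/clock_data.c": 14,
    "arch/arm/kernel/setup.c": 43,
    "arch/x86/kernel/apic.c": 7,
    "arch/m68k/Kconfig": 5,
    "arch/unicore32/kernel/entry.S": 10,
    "arch/mips/kernel/traps.c": 3,
    "arch/powerpc/kernel/irq.c": 4,
    "arch/sh/kernel/cpu.c": 4,
    "arch/s390/kernel/smp.c": 4,
    "arch/sparc/kernel/irq.c": 3,
    "arch/ia64/kernel/acpi.c": 3,
}

for directory, percent in dirstat(sample):
    print(f"{percent:6.1f}% {directory}")
```

With the cut-off at 5%, directories such as arch/mips/ drop out of the
listing even though their changes still count toward the arch/ total.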

That's ridiculous. It's entirely due to the whole f*cked-up arm ecosystem.

Something needs to be done. A small part is to make sure the source
code is more hierarchical, so that we don't have those crazy shared
data-files that are ugly as hell and get conflicts because different
boards all think they need to care.

But the larger problem is that somebody really REALLY needs to think
about how to get those crazy board details out of the kernel entirely.
Having per-board drivers for real hardware is sane - having to have
per-board detail files for clock chips is just crazy. Split that
thing off into a "Linux ARM second-stage bootloader" project that has the
per-board tables or something. Don't pollute the main kernel with
crazy details like this.

Because as far as I can tell, most of that board support really is
about crazy details that the kernel shouldn't even care about. Come up
with a table that describes them, have one common parsing routine, and
push the table into a bootloader. And get rid of having to add a board
file for every crazy random piece of hardware that nobody really cares
about.

Or something.

Linus
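
The "one table, one common parsing routine" suggestion above could look
roughly like this. It is a sketch only, written in Python for brevity:
the three-column text format, the ClockEntry type, and the clock names
are all hypothetical, and this is not the device tree format or any
existing kernel interface.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ClockEntry:
    name: str
    parent: Optional[str]   # None for a root clock
    rate_hz: int


def parse_clock_table(text: str) -> Dict[str, ClockEntry]:
    """Parse a hypothetical per-board table: 'name parent rate_hz' per line.

    '#' starts a comment and '-' marks a clock without a parent. The same
    routine serves every board; only the table differs.
    """
    clocks: Dict[str, ClockEntry] = {}
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 'name parent rate_hz'")
        name, parent, rate = fields
        if name in clocks:
            raise ValueError(f"line {lineno}: duplicate clock {name!r}")
        clocks[name] = ClockEntry(name, None if parent == "-" else parent, int(rate))
    # Check parents only after the whole table is read, so order is free.
    for clk in clocks.values():
        if clk.parent is not None and clk.parent not in clocks:
            raise ValueError(f"clock {clk.name!r}: unknown parent {clk.parent!r}")
    return clocks


# A made-up board description, as it might be handed over at boot.
BOARD_TABLE = """
# name      parent   rate_hz
sys_ck      -        26000000
uart_fck    sys_ck   48000000
usb_fck     sys_ck   48000000
"""

board_clocks = parse_clock_table(BOARD_TABLE)
```

Adding a board then means adding a table, not touching a shared source
file that other boards also edit.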

2011-03-30 20:41:47

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Linus Torvalds wrote:

> ARM right now is a nightmare, and most of it is because ARM hardware
> manufacturers are morons.

If in your mind "competitors" == "morons" then you might be right.

> But the way the ARM tree is then laid out
> has made that even more painful, and the decision to put all the crazy
> board details in the kernel tables instead of trying to have a
> per-board boot loader that fills in the details is just crazy.

I beg to disagree.

Trying to rely on bootloaders doing things right is like saying that x86
should always rely on the BIOS doing things right. We have this chance
in the OMAP case to have a manufacturer who is smart enough to document
all those things so that the kernel can be autonomous and more reliable,
and the BIOS joke avoided entirely. When something needs fixing it is
much easier to update the kernel ourselves than waiting after any
bootloader updates which are themselves much more risky to perform.

Granted, things could be structured in a better way so to minimize the
risk of conflicts when clocks for unrelated drivers are updated at the
same time. Something like initcall tables or the like.

> Look at the dirstat for arch/ in just the current merge window
> (cut-off at 5% just to not get too much):
>
> [torvalds@i5 linux]$ git diff -C --dirstat=5 --cumulative v2.6.38.. arch
> 14.0% arch/arm/mach-omap2/
> 5.8% arch/arm/plat-mxc/include/mach/
> 6.3% arch/arm/plat-mxc/
> 57.1% arch/arm/
> 5.4% arch/m68k/
> 9.6% arch/unicore32/
> 6.9% arch/x86/
> 100.0% arch/
>
> almost *SIXTY* percent of all arch updates were to ARM code.

Absolutely not! You have 14% going to OMAP code which happens to be
under arch/arm/ but there is nothing ARM specific in there. If OMAP was
using a PPC or a MIPS core then you'd have the same result except under
arch/powerpc or arch/mips. There is very little in terms of ARM
specific peculiarities under arch/arm/mach-omap2/ in fact.

And after all the beating we've given those embedded
(ARM) SOC manufacturers, trying to push the point home that it is far
better for everyone to have support merged in the mainline kernel
instead of keeping their patches piled up on an obsolete kernel
version, it happens that the OMAP folks are the top champions when it's
time to upstream their code. Are we going to complain to them now that
they're doing exactly what the upstream kernel community people have
been asking of them for so many years?

> And that's despite the fact that one of those architectures
> (unicore32) is a totally new architecture, and despite m68k having
> gone through a first-level unification of nommu and mmu code!

So what? That only shows that those architectures are not being used as
much as ARM is. This is probably just a reflection of actual market share.

> And was this just a fluke? No. Doing the same for 2.6.37..38 gives
> 58.3% for arch/arm (and in that release we had a blackfin unification
> effort, otherwise arm would have been an even bigger percentage).

And don't be surprised if the dirstat result for arch/arm/ goes up even
further in the near future. Other SOC manufacturers which happen to
have chosen an ARM core for their CPU are also coming out of their
moronic stupor and waking up to speed with this Open Source thing. If
they were choosing a MIPS core this could rebalance the dirstat, but
they happen to also have an ARM core with a _completely_ different
architecture around it since that's what they compete over. If ARM Ltd
was to dictate everything composing the ARM architecture like what
happened on X86 then you'd end up with a much lower level of competition
and innovation.

If that means doing something similar to "git mv arch/arm/mach-omap2
arch/omap2" for the dirstat to be more meaningful then let's do it. But
I think that a different interpretation from the one you made would be more
appropriate here.

> That's ridiculous. It's entirely due to the whole f*cked-up arm ecosystem.

Well, let's face it, ARM is at the moment highly successful. And yes it
might be destabilizing as ARM might be surpassing the X86 kernel
development rate while X86 always used to be the ultimate reference.
But calling the whole ecosystem "f*cked-up" because we have issues
scaling to it properly is a rather cheap argument.

> Something needs to be done. A small part is to make sure the source
> code is more hierarchical, so that we don't have those crazy shared
> data-files that are ugly as hell and get conflicts because different
> boards all think they need to care.

Absolutely!

> But the larger problem is that somebody really REALLY needs to think
> about how to get those crazy board details out of the kernel entirely.
> Having per-board drivers for real hardware is sane - having to have
> per-board detail files for clock chips is just crazy. Split that
> thing off into a "Linux ARM second-stage bootloader" project that has the
> per-board tables or something. Don't pollute the main kernel with
> crazy details like this.

There is on-going work to bring device tree support to ARM. Maybe that
will be the way to go to move those clock details out of the kernel.
And maybe that will become another unfixable PC BIOS fiasco. We'll see.
I don't particularly like the idea of _more_ APIs between bootloaders
and the kernel. Keeping everything fixable in only one place is way
more convenient than the burden of the occasional merge conflict.

Sure, something has to be done to minimize the pain, your pain, but not
by increasing the pain elsewhere. I think that you know pretty well
already how painful dealing with BIOS data, or ACPI, or any other vendor
controlled (sometimes closed source) config tables might be. We've
sidestepped that pain entirely on ARM so far and that really feels good.

Nicolas

2011-03-30 21:10:31

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Linus Torvalds wrote:
> On Wed, Mar 30, 2011 at 10:06 AM, Arnd Bergmann <[email protected]> wrote:
> >
> > I'm still new to the ARM world, but I think one real problem is the way
> > that all platforms have their own trees with a very flat hierarchy --
> > a lot of people directly ask Linus to pull their trees, and the main
> > way to sort out conflicts is linux-next. The number of platforms in the
> > ARM arch is still increasing, so I assume that this only gets worse.
>
> Because as far as I can tell, most of that board support really is
> about crazy details that the kernel shouldn't even care about. Come up
> with a table that describes them, have one common parsing routine, and
> push the table into a bootloader. And get rid of having to add a board
> file for every crazy random piece of hardware that nobody really cares
> about.

There is effort on the way to address that with device tree support,
but that won't solve the other problem Arnd mentioned.

Let me phrase it differently.

The main problem is NOT that these things conflict, the main problem -
and I can tell you after working through all that irq/gpio/mfd
sh*tpile - is that these subarchs start a life on their own and find
tons of creative ways to work around shortcomings of infrastructure
code up to the point where infrastructure code cannot be changed
anymore w/o breaking the world and some more. There is a f*cking good
reason why I made myself run through all that horror. These
shortcomings are partially real, but most of the time the failure is
on those folks simply because they do not understand how it works. If
the shortcoming is real they fail to talk to the infrastructure
maintainers and just hack something which boots.

The ARM core code and CPU/TLB/CACHE abominations handling which is in
Russell's hands is working very well. Piling the babysitting of
sub-arch support onto Russell as well simply cannot scale, as the
whole madness of inconsistency of the ARM core architecture itself is
a full time job on its own.

Watching the rapidly increasing number of SoCs which are spilling out
in the ARM ecosystem and their totally non-architected "glue together
random IP cores" philosophy, I'm convinced that we need a full-time
gatekeeper who babysits the subarch stuff and keeps an eye on those
ever repeating failure patterns and works on resolving them.

The only problem is to find a person, who is willing to do that, has
enough experience, broad shoulders and a strong accepted voice. Not to
talk about finding someone who is willing to pay a large enough
compensation for pain and suffering.

This is getting worse now as there seems to be a strong incentive to
get all that vendor BSP crap into mainline and I did a couple of
reviews in the last months which were more than frustrating. Running
through a 10 rounds review for a 200 lines driver is not really
encouraging - though at least the review prevented that a 1000 lines
horror crap got merged. I don't blame those people too much as they
have been thrown into Linux development after dealing with black hole
OS cores for years and therefore being spoiled with the thought of
"work around the failure / missing feature" somehow w/o the ability to
talk to anyone about it. That's a system failure of the established
commercial OS world and it will take some time to show those people
that it can be solved differently. But that takes quite some manpower.

So one person will not be enough, that needs to be a whole team of
experienced people in the very near future to deal with the massive
tsunami of crap which is targeted at mainline. If we fail to set that
up, then we run into a very ugly maintainability issue in no time.

Thanks,

tglx

2011-03-30 21:30:43

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 1:41 PM, Nicolas Pitre <[email protected]> wrote:
>
> If in your mind "competitors" == "morons" then you might be right.

There's a difference between "competition" and "do things differently
just to be difficult".

> Trying to rely on bootloaders doing things right is like saying that x86
> should always rely on the BIOS doing things right.

No. Not at all.

The problem with firmware/BIOS is that it's set in stone and closed-source.

I'm suggesting splitting out the crazy part into a separate project
that does this. Open-source. Like a mini-kernel. Because the thing is,
the main kernel doesn't care, and _shouldn't_ care. Those board files
are just noise.

The long-term situation should be that you should be able to have ONE
binary kernel "just work". That's where we are on x86. Really.

Without that kind of long-term view, where do you think ARM is going
to be in five years?

>> almost *SIXTY* percent of all arch updates were to ARM code.
>
> Absolutely not! You have 14% going to OMAP code which happens to be
> under arch/arm/ but there is nothing ARM specific in there. If OMAP was
> using a PPC or a MIPS core then you'd have the same result except under
> arch/powerpc or arch/mips. There is very little in terms of ARM
> specific peculiarities under arch/arm/mach-omap2/ in fact.

But that's my point - the problem is all the crazy board crap.

I've never claimed that this is about the ARM cpu (which has its own
issues, but that's a separate rant entirely). It's about the broken
infrastructure.

Now, some of it is quite understandable - ie real drivers for real
hardware. But a _lot_ of it seems to be just descriptor tables, and
I'm getting the very strong feeling that ARM people aren't even
_trying_ to make it sane, and trying to standardize things, or trying
to aim for the whole notion of "one kernel image, with much more hw
description done elsewhere".

Sure, you'll fundamentally always need several images (due to the
afore-mentioned crazy CPU architecture flaws - arm6 vs arm7 vs
armxyz), but I'm looking at the future, and arch/arm will get
_totally_ unmaintainable unless you guys have a plan for getting out
of the crazy hole you are in now.

arch/arm is already about 3x the size of arch/x86. And it's pretty
much all the crazy infrastructure afaik. timer chips, irq chips, gpio
differences - crap like that.

And the fact that you don't even seem to UNDERSTAND the problem, and
think that it's ok, and that continued future explosion of this is all
fine makes me even more nervous.

Linus

2011-03-30 21:38:36

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 07:06:41PM +0200, Arnd Bergmann wrote:
> On Friday 18 March 2011, Russell King - ARM Linux wrote:
> > I do get the impression that you're extremely unhappy with the way ARM
> > stuff works, and I've no real idea how to solve that. I think much of
> > it is down to perception rather than anything tangible.
> >
> > Maybe the only solution is for ARM to fork the kernel, which is something
> > I really don't want to do - but from what I'm seeing it's the only solution
> > which could come close to making you happy.
>
> I'm still new to the ARM world, but I think one real problem is the way
> that all platforms have their own trees with a very flat hierarchy --
> a lot of people directly ask Linus to pull their trees, and the main
> way to sort out conflicts is linux-next. The number of platforms in the
> ARM arch is still increasing, so I assume that this only gets worse.

The reason that we've ended up with a flat hierarchy in terms of
developers is down to pressure. There was a time when we had a more
structured system, where the sub-tree people submitted their patches
to me and the list, they'd be reviewed (mostly and mostly only) by me
before being merged into my tree and going upstream from there.

As the community grew, it got harder and harder to do decent reviews
of those patches and so the acceptance rate dropped.

Eventually we switched to the current arrangement where I'm essentially
only concerned about core ARM code, and a few platforms which I have
personal interest in (or am contracted to look after).

For the rest I just look at the patches, and send back what feedback I
can on them (which is mostly when my mailer turns a line red because
it's matched one of my mutt regexps for spotting common mistakes.)

> This would be no easier if everyone was asking you to pull their trees,
> as I believe was the case before that. The amount of code getting changed
> there is too large to get reviewed by a single person, and I believe
> neither of you really wants the burden to judge if all of the branches
> are ok (and complain to the authors when they are not).

Absolutely right - and the problem is that we still have no one who is
willing to step up and do the review.

What I was promised at the time was that by giving sub-tree maintainers
the loaded pistol, this problem of code quality would in effect be self-
correcting. If they make a hash out of it, they'd have to be the ones
to fix it themselves.

Instead, what's happening is that the _entire_ ARM community, ARM
hardware manufacturers and so forth is being blamed here.

> Russell, do you think it would help to have an additional ARM platform
> tree that collects all the changes that impact only the platform code but
> not the core architecture? I believe that would be a way out, but requires
> a careful selection of people responsible for it. In particular, I don't
> think a single person can handle it without good sub-maintainers.

It's not that simple, as what happens when we have core ARM code updates
which end up touching every single board file? The result is conflicts
between trees, and that could get extremely messy indeed.

To be honest, given the politics, I don't want to be the one stuck in the
middle, receiving an endless stream of Linus' complaints about the way
the ARM community works, or the board support code. However, in spite of
the sub-tree maintainers having the responsibility for their own code, I
still find myself in the firing line.

And I have got to the point of just not giving a damn. I can't change
the ARM community (I've tried over the years to get more active review
of platform changes and failed - and had it pointed out by folk like
Alan Cox, that such a system is impossible due to lack of motivation
by, eg, an OMAP person to review a Samsung change.)

If this ultimately means that Linus decides to throw ARM out of the
mainline kernel, then I guess we'll need a git tree set up somewhere to
track mainline, and to take the ARM merges, and do our own kernel
releases with different version numbering. I'm quite prepared to do
that and run such a tree, and not give a damn about what the sub-arch
people do provided I can still make the necessary core ARM code changes
as required (and if some sub-arch breaks they get to fix it.) However,
that would be entirely limited to just ARM arch stuff and not drivers.

2011-03-30 21:44:52

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Linus Torvalds <[email protected]> [110330 12:19]:
> On Wed, Mar 30, 2011 at 10:06 AM, Arnd Bergmann <[email protected]> wrote:
> >
> > I'm still new to the ARM world, but I think one real problem is the way
> > that all platforms have their own trees with a very flat hierarchy --
> > a lot of people directly ask Linus to pull their trees, and the main
> > way to sort out conflicts is linux-next. The number of platforms in the
> > ARM arch is still increasing, so I assume that this only gets worse.
>
> As far as I'm concerned, the biggest issue is that some of the ARM
> crap is just CRAP. It's idiotic tables that get updated by multiple
> people, and in totally nonsensical ways. When I see conflicts in those
> damn clock-data files, I just go mental. Those files are an
> abomination.

Those kinds of conflicts should not happen any longer, that's for sure.

> Why the hell is the clock-data for fifty (number taken out of my ass)
> different clocking rules in one array? And why do we have eight
> different files of that kind of crap for omap2?

The clock*data.c files could eventually come from device tree, but that
still requires quite a bit of work to get there.

> Look at the dirstat for arch/ in just the current merge window
> (cut-off at 5% just to not get too much):
>
> [torvalds@i5 linux]$ git diff -C --dirstat=5 --cumulative v2.6.38.. arch
> 14.0% arch/arm/mach-omap2/
> 5.8% arch/arm/plat-mxc/include/mach/
> 6.3% arch/arm/plat-mxc/
> 57.1% arch/arm/
> 5.4% arch/m68k/
> 9.6% arch/unicore32/
> 6.9% arch/x86/
> 100.0% arch/
>
> almost *SIXTY* percent of all arch updates were to ARM code. And
> that's despite the fact that one of those architectures (unicore32) is
> a totally new architecture, and despite m68k having gone through a
> first-level unification of nommu and mmu code!
>
> And was this just a fluke? No. Doing the same for 2.6.37..38 gives
> 58.3% for arch/arm (and in that release we had a blackfin unification
> effort, otherwise arm would have been an even bigger percentage).
>
> That's ridiculous. It's entirely due to the whole f*cked-up arm ecosystem.

Yeah, there's no BIOS and there are no scannable busses, which leads
to a huge amount of data patches that show up in the diffstat.

Anyways, let's plan on kicking out per-SoC and per-board data from
the kernel and get it from the bootloader via device tree in the
long run. Most of the data is already separate from the code, so
it should not be that hard to do, just takes some time.

Regards,

Tony

2011-03-30 21:54:58

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Thomas Gleixner <[email protected]> [110330 14:07]:
>
> So one person will be not enough, that needs to be a whole team of
> experienced people in the very near future to deal with the massive
> tsunami of crap which is targeted at mainline. If we fail to set that
> up, then we run into a very ugly maintainability issue in no time.

One thing that will help here and distribute the load is to move
more things under drivers/ as then we have more maintainers looking
at the code.

Regards,

Tony

2011-03-30 22:08:24

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Nicolas Pitre <[email protected]> [110330 13:39]:
>
> Trying to rely on bootloaders doing things right is like saying that x86
> should always rely on the BIOS doing things right. We have this chance
> in the OMAP case to have a manufacturer who is smart enough to document
> all those things so that the kernel can be autonomous and more reliable,
> and the BIOS joke avoided entirely. When something needs fixing it is
> much easier to update the kernel ourselves than waiting after any
> bootloader updates which are themselves much more risky to perform.
>
> Granted, things could be structured in a better way so to minimize the
> risk of conflicts when clocks for unrelated drivers are updated at the
> same time. Something like initcall tables or the like.

We are already hitting a problem with these data files for omap2+
where the kernel gets too bloated. So the device tree approach would
help make a more distro-friendly kernel.

If we want to keep the data in the kernel, they should be loadable
kernel modules except for the few core clocks etc needed to bring
up the system.

I guess an alternative to device tree could be to place them under
drivers/firmware or something similar. That does not help with
per-board data though, it would only help with the clocks needed
by device drivers.

> There is on-going work to bring device tree support to ARM. Maybe that
> will be the way to go to move those clock details out of the kernel.
> And maybe that will become another unfixable PC BIOS fiasco. We'll see.
> I don't particularly like the idea of _more_ APIs between bootloaders
> and the kernel. Keeping everything fixable in only one place is way
> more convenient than the burden of the occasional merge conflict.
>
> Sure, something has to be done to minimize the pain, your pain, but not
> by increasing the pain elsewhere. I think that you know pretty well
> already how painful dealing with BIOS data, or ACPI, or any other vendor
> controlled (sometimes closed source) config tables might be. We've
> sidestepped that pain entirely on ARM so far and that really feels good.

Yeah the syncing up with the bootloader and patching kernel around
bootloader bugs is an issue. But the bloat issue might be hard to
work around otherwise.

Regards,

Tony

2011-03-30 22:14:35

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Russell King - ARM Linux <[email protected]> [110330 14:05]:
> On Wed, Mar 30, 2011 at 07:06:41PM +0200, Arnd Bergmann wrote:
>
> And I have got to the point of just not giving a damn. I can't change
> the ARM community (I've tried over the years to get more active review
> of platform changes and failed - and had it pointed out by folk like
> Alan Cox, that such a system is impossible due to lack of motivation
> by, eg, an OMAP person to review a Samsung change.)

I think this is happening more and more as we have more ARM generic
and Linux generic code.

Regards,

Tony

2011-03-30 22:25:22

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Tony Lindgren wrote:

> * Thomas Gleixner <[email protected]> [110330 14:07]:
> >
> > So one person will be not enough, that needs to be a whole team of
> > experienced people in the very near future to deal with the massive
> > tsunami of crap which is targeted at mainline. If we fail to set that
> > up, then we run into a very ugly maintainability issue in no time.
>
> One thing that will help here and distribute the load is to move
> more things under drivers/ as then we have more maintainers looking
> at the code.

Guess what that's going to solve? Nothing, nada.

Really, you move the problem to people who are not prepared to deal
with the wave either. So what's the gain?

FYI, lots of the wreckage I observed regarding irq stuff was in
drivers/*

My statement stays the same. Whatever playground you try to shift
that problem into, it's not going to scale the way it is right now.

Thanks,

tglx

2011-03-30 22:26:22

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 2:44 PM, Tony Lindgren <[email protected]> wrote:
>>
>> That's ridiculous. It's entirely due to the whole f*cked-up arm ecosystem.
>
> Yeh there's no BIOS and there are no scannable busses.. Which leads
> to huge amount of data patches that show up in the diffstat.

Yes. And due to all the traditional embedded models, there's no
historical "platform" model at all, unlike pretty much all other
architectures. Sure, other architectures often have more than a single
platform, but there's usually at least a couple of standard ways of
doing things, and defined interfaces to figure out at least most of
the big issues.

The embedded world has always been painfully different, and arm is
just the most successful entry (by far) in that world.

So as a result, it's not just "no scannable busses", it's pretty much
_everything_ that you can't take for granted. The clock chip details
and crazy irq controllers are a symptom.

Now, I'm not a huge fan of ACPI and always point out how firmware
people inevitably get something wrong, but the _real_ reason I think
ACPI was such a broken idea is that it was basically used as an excuse
to break the PC "platform" notion - the fundamental problem with ACPI
was that it was a way to avoid making platform decisions, and let all
the hw crazies make bad decisions and then "fix them up" in ACPI.

So I always felt that Intel should just have documented the hardware
standard instead, and pushed that as the platform (which, in all
honesty, Intel has done for a lot of very successful things - PCI,
AHCI, USB etc etc are all examples of Intel creating those kinds of
hardware platform standards). ACPI allowed (and still allows) Sony and
others to make crazy ad-hoc decisions about some random motherboard
device, and just encourages _bad_ hardware without enumeration or
good high-level rules.

But for ARM, I suspect even ACPI would actually be an improvement.
Because on ARM, the crazy non-platform hw people already happened, and
took over the insane asylum. So having a complicated description
language with an interpreter wouldn't be worse than what we already do
there.

I'm only half kidding. I wouldn't wish for ACPI even on ARM. But..

> Anyways, let's plan on kicking out per-SoC and per-board data from
> the kernel and get it from the bootloader via device tree in the
> long run. Most of the data is already separate from the code, so
> it should not be that hard to do, just takes some time.

This is basically my hope for the future. I just think that ARM people
should be very very aggressive about it, because the longer it isn't
done, the more crud there will be to convert.

I bet it will be painful to do. But it will be even more painful to
_not_ do it, and then five years from now realize that it should have
been done ten years ago.

Linus

2011-03-30 22:38:14

by Paul E. McKenney

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 02:54:35PM -0700, Tony Lindgren wrote:
> * Thomas Gleixner <[email protected]> [110330 14:07]:
> >
> > So one person will be not enough, that needs to be a whole team of
> > experienced people in the very near future to deal with the massive
> > tsunami of crap which is targeted at mainline. If we fail to set that
> > up, then we run into a very ugly maintainability issue in no time.
>
> One thing that will help here and distribute the load is to move
> more things under drivers/ as then we have more maintainers looking
> at the code.

In many cases, the ARM SoC vendors will want their people producing the
code, so although moving things to drivers might be a good thing to do,
it won't really increase the number of people involved. Plus the move
to the drivers subtree would be a problem for devices with tight ties
to the board or SoC.

There is work on pushing towards common code, but there is a lot of code
and this will take time and a lot of work.

Thanx, Paul

2011-03-30 22:40:00

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Linus Torvalds <[email protected]> [110330 15:18]:
> On Wed, Mar 30, 2011 at 2:44 PM, Tony Lindgren <[email protected]> wrote:
>
> But for ARM, I suspect even ACPI would actually be an improvement.
> Because on ARM, the crazy non-platform hw people already happened, and
> took over the insane asylum. So having a complicated description
> language with an interpreter wouldn't be worse than what we already do
> there.
>
> I'm only half kidding. I wouldn't wish for ACPI even on ARM. But..

Heh I think the device tree is saner here than ACPI :)

> > Anyways, let's plan on kicking out per-SoC and per-board data from
> > the kernel and get it from the bootloader via device tree in the
> > long run. Most of the data is already separate from the code, so
> > it should not be that hard to do, just takes some time.
>
> This is basically my hope for the future. I just think that ARM people
> should be very very aggressive about it, because the longer it isn't
> done, the more crud there will be to convert.

At least now all that data is in one place in a convertible format instead
of direct register tinkering of shared registers in each device driver
probe function.

> I bet it will be painful to do. But it will be even more painful to
> _not_ do it, and then five years from now realize that it should have
> been done ten years ago.

Something needs to be done for sure.

Tony

2011-03-30 22:45:47

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Thomas Gleixner <[email protected]> [110330 15:22]:
> On Wed, 30 Mar 2011, Tony Lindgren wrote:
>
> > * Thomas Gleixner <[email protected]> [110330 14:07]:
> > >
> > > So one person will be not enough, that needs to be a whole team of
> > > experienced people in the very near future to deal with the massive
> > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > up, then we run into a very ugly maintainability issue in no time.
> >
> > One thing that will help here and distribute the load is to move
> > more things under drivers/ as then we have more maintainers looking
> > at the code.
>
> Guess what's that going to solve? Nothing, nada.
>
> Really, you move the problem to people who are not prepared to deal
> with the wave either. So what's the gain?

I guess my point is that with creating more common frameworks people
will be using common code. Some examples that come to mind are clock
framework, gpiolib, dma engine, runtime PM and so on.

Then even arch specific driver code becomes generic and separated
from the arch specific code. And then the driver subsystem maintainer
can review it easier.

Regards,

Tony

2011-03-30 22:48:06

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Paul E. McKenney <[email protected]> [110330 15:35]:
> On Wed, Mar 30, 2011 at 02:54:35PM -0700, Tony Lindgren wrote:
> > * Thomas Gleixner <[email protected]> [110330 14:07]:
> > >
> > > So one person will be not enough, that needs to be a whole team of
> > > experienced people in the very near future to deal with the massive
> > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > up, then we run into a very ugly maintainability issue in no time.
> >
> > One thing that will help here and distribute the load is to move
> > more things under drivers/ as then we have more maintainers looking
> > at the code.
>
> In many cases, the ARM SoC vendors will want their people producing the
> code, so although moving things to drivers might be a good thing to do,
> it won't really increase the number of people involved. Plus the move
> to the drivers subtree would be a problem for devices with tight ties
> to the board or SoC.
>
> There is work on pushing towards common code, but there is a lot of code
> and this will take time and a lot of work.

I agree on the common code part, then even drivers with tight
ties to board or SoC become just generic drivers that are easy
to review.

Tony

2011-03-30 22:56:47

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Tony Lindgren wrote:

> * Thomas Gleixner <[email protected]> [110330 15:22]:
> > On Wed, 30 Mar 2011, Tony Lindgren wrote:
> >
> > > * Thomas Gleixner <[email protected]> [110330 14:07]:
> > > >
> > > > So one person will be not enough, that needs to be a whole team of
> > > > experienced people in the very near future to deal with the massive
> > > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > > up, then we run into a very ugly maintainability issue in no time.
> > >
> > > One thing that will help here and distribute the load is to move
> > > more things under drivers/ as then we have more maintainers looking
> > > at the code.
> >
> > Guess what's that going to solve? Nothing, nada.
> >
> > Really, you move the problem to people who are not prepared to deal
> > with the wave either. So what's the gain?
>
> I guess my point is that with creating more common frameworks people
> will be using common code. Some examples that come to mind are clock
> framework, gpiolib, dma engine, runtime PM and so on.

For all that to happen you need a really experienced team with a
strong team lead to fight that through and go through the existing
horror while dealing with the incoming flood at the same time.

See commit 9ad198cb for illustration.

Sigh, I fought that battle for a couple of months to deal with shite
coming in faster than you can fix it and everyone ignoring it.

Thanks,

tglx

2011-03-30 23:13:42

by Paul E. McKenney

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 03:47:52PM -0700, Tony Lindgren wrote:
> * Paul E. McKenney <[email protected]> [110330 15:35]:
> > On Wed, Mar 30, 2011 at 02:54:35PM -0700, Tony Lindgren wrote:
> > > * Thomas Gleixner <[email protected]> [110330 14:07]:
> > > >
> > > > So one person will be not enough, that needs to be a whole team of
> > > > experienced people in the very near future to deal with the massive
> > > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > > up, then we run into a very ugly maintainability issue in no time.
> > >
> > > One thing that will help here and distribute the load is to move
> > > more things under drivers/ as then we have more maintainers looking
> > > at the code.
> >
> > In many cases, the ARM SoC vendors will want their people producing the
> > code, so although moving things to drivers might be a good thing to do,
> > it won't really increase the number of people involved. Plus the move
> > to the drivers subtree would be a problem for devices with tight ties
> > to the board or SoC.
> >
> > There is work on pushing towards common code, but there is a lot of code
> > and this will take time and a lot of work.
>
> I agree on the common code part, then even drivers with tight
> ties to board or SoC become just generic drivers that are easy
> to review.

Yep! The trick is getting to that point. Some drivers will be easier
than others.

Thanx, Paul

2011-03-30 23:14:20

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Tony Lindgren wrote:

> * Paul E. McKenney <[email protected]> [110330 15:35]:
> > On Wed, Mar 30, 2011 at 02:54:35PM -0700, Tony Lindgren wrote:
> > > * Thomas Gleixner <[email protected]> [110330 14:07]:
> > > >
> > > > So one person will be not enough, that needs to be a whole team of
> > > > experienced people in the very near future to deal with the massive
> > > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > > up, then we run into a very ugly maintainability issue in no time.
> > >
> > > One thing that will help here and distribute the load is to move
> > > more things under drivers/ as then we have more maintainers looking
> > > at the code.
> >
> > In many cases, the ARM SoC vendors will want their people producing the
> > code, so although moving things to drivers might be a good thing to do,
> > it won't really increase the number of people involved. Plus the move
> > to the drivers subtree would be a problem for devices with tight ties
> > to the board or SoC.
> >
> > There is work on pushing towards common code, but there is a lot of code
> > and this will take time and a lot of work.
>
> I agree on the common code part, then even drivers with tight
> ties to board or SoC become just generic drivers that are easy
> to review.

You wish. There is an already existing problem that the identical IP
cores of peripheral crap are reused across architectures. And of
course because it is a different architecture we have two different
drivers with different issues.

See: http://marc.info/?l=linux-kernel&m=130041568128164

We already fail to detect this on the driver level, so please answer
the question I asked before: How do you spread the load and scale with
the amount of shite which is coming in?

The above example is probably not the only one in tree and we will see
lots of unnoticed instances of drivers dealing with minimally different
versions of the same IP crappola in the near future simply because the
vendors claim that their stuff is unique and only works with their
particular instance of hackery unless we have enough capable people to
look over this. Whether it's in arch/ or drivers/ it does not
matter. We are simply not prepared for the amount of crap coming in.

Thanks,

tglx

2011-03-30 23:28:24

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Thomas Gleixner <[email protected]> [110330 16:11]:
> On Wed, 30 Mar 2011, Tony Lindgren wrote:
>
> > * Paul E. McKenney <[email protected]> [110330 15:35]:
> > > On Wed, Mar 30, 2011 at 02:54:35PM -0700, Tony Lindgren wrote:
> > > > * Thomas Gleixner <[email protected]> [110330 14:07]:
> > > > >
> > > > > So one person will be not enough, that needs to be a whole team of
> > > > > experienced people in the very near future to deal with the massive
> > > > > tsunami of crap which is targeted at mainline. If we fail to set that
> > > > > up, then we run into a very ugly maintainability issue in no time.
> > > >
> > > > One thing that will help here and distribute the load is to move
> > > > more things under drivers/ as then we have more maintainers looking
> > > > at the code.
> > >
> > > In many cases, the ARM SoC vendors will want their people producing the
> > > code, so although moving things to drivers might be a good thing to do,
> > > it won't really increase the number of people involved. Plus the move
> > > to the drivers subtree would be a problem for devices with tight ties
> > > to the board or SoC.
> > >
> > > There is work on pushing towards common code, but there is a lot of code
> > > and this will take time and a lot of work.
> >
> > I agree on the common code part, then even drivers with tight
> > ties to board or SoC become just generic drivers that are easy
> > to review.
>
> You wish. There is an already existing problem that the identical IP
> cores of peripheral crap are reused across architectures. And of
> course because it is a different architecture we have two different
> drivers with different issues.
>
> See: http://marc.info/?l=linux-kernel&m=130041568128164

Yeah, that's a problem. And getting people to create generic device
drivers is hard, takes tons of commenting and still needs some hardware
workaround options passed in the platform_data.

> We already fail to detect this on the driver level, so please answer
> the question I asked before: How do you spread the load and scale with
> the amount of shite which is coming in?

Sorry I don't have a solution to that :) I'm struggling with that issue
big time myself.

> The above example is probably not the only one in tree and we will see
> lots of unnoticed instances of drivers dealing with minimally different
> versions of the same IP crappola in the near future simply because the
> vendors claim that their stuff is unique and only works with their
> particular instance of hackery unless we have enough capable people to
> look over this. Whether it's in arch/ or drivers/ it does not
> matter. We are simply not prepared for the amount of crap coming in.

Yes I agree. Tools like checkpatch and sparse don't help with issues
like this.

Tony

2011-03-30 23:32:03

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Linus Torvalds wrote:

> On Wed, Mar 30, 2011 at 1:41 PM, Nicolas Pitre <[email protected]> wrote:
> >
> > If in your mind "competitors" == "morons" then you might be right.
>
> There's a difference between "competition" and "do things differently
> just to be difficult".

Absolutely. We've seen that from some proprietary software companies.

> > Trying to rely on bootloaders doing things right is like saying that x86
> > should always rely on the BIOS doing things right.
>
> No. Not at all.
>
> The problem with firmware/BIOS is that it's set in stone and closed-source.
>
> I'm suggesting splitting out the crazy part into a separate project
> that does this. Open-source. Like a mini-kernel. Because the thing is,
> the main kernel doesn't care, and _shouldn't_ care. Those board files
> are just noise.

Sure, but important noise nevertheless. As long as the noise is
confined to a limited set of .c files I'm happy. OTOH I have very
little hope for a separate project that would only deal with that noise.
That will simply never fly, even less so as an Open Source project.
The incentive for people to work on such a thing simply isn't there, as
that is totally uninteresting and without any rewards.

Furthermore, this does create pain. You have to keep things in sync
between the kernel and the mini-kernel (let's call it bootloader). In
practice the bootloader is always maintained separately from the kernel,
at its own pace and with its own release schedule. Trying to
synchronize independent projects is really painful as you know already,
otherwise the user space for perf would still be maintained separately
from the kernel, right?

Now, when there is a bug in one of the clock settings, or one clock
is missing for that new kernel driver to work properly, the
bootloader would have to be fixed, revalidated, and the fix deployed
separately but still in addition to the kernel. This process still adds
to the pain such that what people do in those cases is simply to hack
the driver code in the kernel. Instead, the OMAP folks created a table
to abstract them into something more manageable.

And here's the final catch. Most of those clocks are often derived from
each other in a tree structure inside the SOC. And for power saving
reasons, some crazy people want to dynamically change the config for
those clocks at run time according to the required frequency for given
loads, turn them off when possible, and of course turn the parent clock
off as well if all the children clocks are themselves turned off. So
the kernel has NO CHOICE but to be fully aware of them.
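
The parent/child gating described above can be sketched in a few lines of C.
This is only an illustration under stated assumptions: struct clk_node,
clk_node_enable() and clk_node_disable() are hypothetical names, not the OMAP
clock framework's API, and the hw_enabled flag stands in for a real gate
register write.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clock node; names and fields are illustrative only. */
struct clk_node {
    const char *name;
    struct clk_node *parent;   /* NULL for a root clock */
    unsigned int usecount;     /* active users, enabled children included */
    bool hw_enabled;           /* stands in for a real gate register write */
};

/* Enable a clock, first making sure its whole parent chain is running. */
void clk_node_enable(struct clk_node *clk)
{
    if (clk == NULL)
        return;
    if (clk->usecount++ == 0) {
        clk_node_enable(clk->parent);
        clk->hw_enabled = true;
    }
}

/* Drop one reference; gate the clock, then its parent, once unused. */
void clk_node_disable(struct clk_node *clk)
{
    if (clk == NULL || clk->usecount == 0)
        return;
    if (--clk->usecount == 0) {
        clk->hw_enabled = false;
        clk_node_disable(clk->parent);
    }
}
```

With two children on one parent, the parent stays on until the last child is
disabled, which is why the code doing the gating has to know the whole tree.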

Then come power domains with the cascade of regulators and so forth,
again all software controlled. Add to the mix the different sleep
states that can be derived from that, which is far more sophisticated
than ACPI states on Intel. And in some cases, the hardware capabilities
are there but people still didn't find the optimal way to drive them, so
research is still on-going software wise. And obviously those SOC
vendors do compete on that front since power consumption is the killing
weapon these days. No wonder why they are so different from each other
with all that "board crap".

> The long-term situation should be that you should be able to have ONE
> binary kernel "just work". That's where we are on x86. Really.

But X86 is peanuts. Really. There was one machine called the IBM PC at
some point that everybody cloned, and the rest was totally irrelevant.
Then came that thing called Windows that reinforced this hardware
monoculture as it was used for the ultimate conformance testing. This
is damn easy in that case to produce a kernel that works virtually
everywhere.

On ARM there is simply no such thing as a single machine design to
clone, and a closed source test bench to design for.

And this is orthogonal to this discussion anyway, as having in-kernel
clock tables is not incompatible with a single kernel binary. Dropping
at runtime those clock tables that are irrelevant to the currently
running hardware is not rocket science.
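
A minimal sketch of that idea, in Python for brevity: several tables are
built into one image, and the ones that do not match the running machine
are discarded at boot. The table contents and the select_clock_table()
helper are hypothetical, and deleting dictionary entries merely stands in
for freeing init-time memory in a real kernel.

```python
# Hypothetical per-SoC clock tables, all compiled into a single image.
CLOCK_TABLES = {
    "soc_a": ["pll_a", "uart_a_clk", "mmc_a_clk"],
    "soc_b": ["pll_b", "uart_b_clk"],
    "soc_c": ["osc_c", "timer_c_clk"],
}


def select_clock_table(tables, machine):
    """Keep only the table for the running machine and drop the rest."""
    if machine not in tables:
        raise KeyError(f"no clock table for {machine!r}")
    active = tables[machine]
    for name in list(tables):
        if name != machine:
            del tables[name]  # stands in for freeing unused init data
    return active


active = select_clock_table(CLOCK_TABLES, "soc_b")
print(active)
print(sorted(CLOCK_TABLES))
```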

> Without that kind of long-term view, where do you think ARM is going
> to be in five years?

ARM is going to still be relevant simply because they now have Linux
that they can modify to suit their latest changes. That's one thing
with Open Source which can be good or bad: full hardware compatibility
is no longer an issue since the software can be adapted at will.

Still... there are on-going efforts to consolidate things amongst all
the ARM vendors. The ARM architecture is standardizing more and more
stuff in the whole stack in every revision. But they won't standardize
everything otherwise they'll kill that competing ecosystem.

> >> almost *SIXTY* percent of all arch updates were to ARM code.
> >
> > Absolutely not! You have 14% going to OMAP code which happens to be
> > under arch/arm/ but there is nothing ARM specific in there. If OMAP was
> > using a PPC or a MIPS core then you'd have the same result except under
> > arch/powerpc or arch/mips. There is very little in terms of ARM
> > specific peculiarities under arch/arm/mach-omap2/ in fact.
>
> But that's my point - the problem is all the crazy board crap.
>
> I've never claimed that this is about the ARM cpu (which has its own
> issues, but that's a separate rant entirely). It's about the broken
> infrastructure.

Let's see how we can fix it then. Trying to shovel the problem away
won't help the situation. Those ARM vendors are crazy for sure. But
it's not a few merge conflicts, small relative to the volume of
changes, that will make us flinch, right?

> Now, some of it is quite understandable - ie real drivers for real
> hardware. But a _lot_ of it seems to be just descriptor tables, and
> I'm getting the very strong feeling that ARM people aren't even
> _trying_ to make it sane, and trying to standardize things, or trying
> to aim for the whole notion of "one kernel image, with much more hw
> description done elsewhere".

That work is happening. It is not ready. I'm not against it but I
remain sceptical. I still think that a self contained kernel is more
maintainable.

Still, because ARM is just a CPU architecture, those SOC vendors will
always have something new to differentiate themselves from the other SOC
vendors. And that cannot be described in a table alone. The power
management hardware from TI will still require separate _executable_
code from the Freescale one, or the Samsung one, or the Nvidia one, or
the Qualcomm one, or the Marvell one, yada yada. And I really don't
want to see that code turned into some vendor provided buggy ACPI
bytecode or similar.

> arch/arm is already about 3x the size of arch/x86. And it's pretty
> much all the crazy infrastructure afaik. timer chips, irq chips, gpio
> differences - crap like that.

Indeed. And I expect it to grow even bigger. Be warned.

> And the fact that you don't even seem to UNDERSTAND the problem, and
> think that it's ok, and that continued future explosion of this is all
> fine makes me even more nervous.

I do understand the problem. And so far, the way we scaled is to have
TI people care about the OMAP code, Freescale people care about the iMX
code, and so on. If one of them produces crap code then so it is, and
the other vendor is totally unaffected, which is why I'm not too
nervous. Blaming a merge conflict on the entire ARM ecosystem just
because one team was large enough to have separate people doing
different things that intersected into the clock table is blowing things
totally out of proportion.

And if those hardware vendors are still in business in the future, and
apparently new ones are joining in, then the arch/arm/ directory will
continue to gain weight. And on ARM, Linux is very very successful
that's all.

Nicolas

2011-03-31 00:00:36

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 07:31:59PM -0400, Nicolas Pitre wrote:
> Sure, but important noise nevertheless. As long as the noise is
> confined to a limited set of .c files I'm happy. OTOH I have very
> little hope for a separate project that would only deal with that noise.
> That will simply never fly, even less so as an Open Source project.

It's also an excuse for people to make it a closed source project, and
so you end up with platforms with a closed source binary blob passed
into the kernel, which has hacky patches to fixup the binary blob parser
to make it all work.

We already see this with the damn simple memory layout stuff we already
(try) to require of boot loaders.

> Still... there are on-going efforts to consolidate things amongst all
> the ARM vendors. The ARM architecture is standardizing more and more
> stuff in the whole stack in every revision. But they won't standardize
> everything otherwise they'll kill that competing ecosystem.

Let's not kid ourselves over that effort: because there is already so
much code in mainline, the efforts to consolidate things can in themselves
create _big_ patches which will inflate the %age change under arch/arm.

> > Now, some of it is quite understandable - ie real drivers for real
> > hardware. But a _lot_ of it seems to be just descriptor tables, and
> > I'm getting the very strong feeling that ARM people aren't even
> > _trying_ to make it sane, and trying to standardize things, or trying
> > to aim for the whole notion of "one kernel image, with much more hw
> > description done elsewhere".
>
> That work is happening. It is not ready. I'm not against it but I
> remain sceptical. I still think that a self contained kernel is more
> maintainable.

Let's not forget that in the future, the hardware should improve.
There are efforts to standardise some of the peripherals, such as
interrupt controllers and timers.

The first attempt at architecting an interrupt controller for ARM has
actually caused *more* complexity, as the architected interrupt
controller (GIC) contained no power management. That has caused SoC vendors
to bolt a second power management interrupt controller alongside the
GIC which needs to be kept in sync. So... the result was yet more code
which doesn't sit at all well with data descriptions of systems.

> Still, because ARM is just a CPU architecture, those SOC vendors will
> always have something new to differentiate themselves from the other SOC
> vendors. And that cannot be described in a table alone. The power
> management hardware from TI will still require separate _executable_
> code from the Freescale one, or the Samsung one, or the Nvidia one, or
> the Qualcomm one, or the Marvell one, yada yada. And I really don't
> want to see that code turned into some vendor provided buggy ACPI
> bytecode or similar.

To get rid of all the platform related stuff, I think you'd need some
kind of bytecode to deal with some of the procedural stuff with various
platforms. Without bytecode, the only other way is to keep the stuff
as C functions in the kernel and find some way of binding them to
drivers through DT, which means we're still going to have platform
specific C files littering the kernel.

While I can see DT solving the "declare this data structure" problem,
I believe that's only part of the issue.

This is exactly why when DT was proposed as a miracle cure-all for ARM,
I wanted to see DT on a real ARM platform rather than just ARM Ltd's
simple and similar development boards.

Certainly, though, DT for ARM is progressing.

2011-03-31 00:15:37

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 12:21:32PM -0700, Linus Torvalds wrote:
> Look at the dirstat for arch/ in just the current merge window
> (cut-off at 5% just to not get too much):
>
> [torvalds@i5 linux]$ git diff -C --dirstat=5 --cumulative v2.6.38.. arch
> 14.0% arch/arm/mach-omap2/
> 5.8% arch/arm/plat-mxc/include/mach/
> 6.3% arch/arm/plat-mxc/
> 57.1% arch/arm/

In this merge window, I deleted at least 6000 lines from arch/arm, and
by quoting diffstat percentages, you're using that against the ARM
community. Why did I bother (that's not a question).

2011-03-31 00:15:45

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Russell King - ARM Linux <[email protected]> [110330 16:57]:
> On Wed, Mar 30, 2011 at 07:31:59PM -0400, Nicolas Pitre wrote:
>
> > Still, because ARM is just a CPU architecture, those SOC vendors will
> > always have something new to differentiate themselves from the other SOC
> > vendors. And that cannot be described in a table alone. The power
> > management hardware from TI will still require separate _executable_
> > code from the Freescale one, or the Samsung one, or the Nvidia one, or
> > the Qualcomm one, or the Marvell one, yada yada. And I really don't
> > want to see that code turned into some vendor provided buggy ACPI
> > bytecode or similar.
>
> To get rid of all the platform related stuff, I think you'd need some
> kind of bytecode to deal with some of the procedural stuff with various
> platforms. Without bytecode, the only other way is to keep the stuff
> as C functions in the kernel and find some way of binding them to
> drivers through DT, which means we're still going to have platform
> specific C files littering the kernel.

The SoC specific code still needs to be different for things like PM,
but that's pretty small compared to the mux/clock/hwmod data on omaps.

Also I think we can make the PM code into loadable modules eventually.

> While I can see DT solving the "declare this data structure" problem,
> I believe that's only part of the issue.

Yup I agree there are other issues too.

> This is exactly why when DT was proposed as a miracle cure-all for ARM,
> I wanted to see DT on a real ARM platform rather than just ARM Ltd's
> simple and similar development boards.
>
> Certainly, though, DT for ARM is progressing.

At least omap mux/clock/hwmod data could come either from devicetree
or be a loadable kernel module for most entries.

Regards,

Tony

2011-03-31 00:31:10

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Nico:

On Wed, Mar 30, 2011 at 6:31 PM, Nicolas Pitre <[email protected]> wrote:
>
> But X86 is peanuts. Really.

Finally, a voice of reason!

> On ARM there is simply no such thing as a single machine design to
> clone, and a closed source test bench to design for.

... and there almost certainly won't ever be.

I recognize the problems present in the ARM code. But any attempt to
homogenize the ARM platform code just doesn't fit with reality. ARM
machines are created to solve different problems from PCs, and the
diversity in the ARM platform community is a reflection of that.

I don't think that all the chaos in the ARM kernel code comes from
shite programming; I think it comes from the chaos that is ARM
hardware.

And I'm highly skeptical that any of these problems can be simply
pushed into a bootloader. Most of my clients struggle with the
bare-minimum that is required of bootloaders now, and I don't think
they are in any way out of the ordinary.

I think that the ARM beast has to be tamed with kernel code, because
kernel programmers are the best minds available to do it.

b.g.
--
Bill Gatliff
[email protected]

2011-03-31 00:40:27

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Nicolas Pitre wrote:

> On Wed, 30 Mar 2011, Linus Torvalds wrote:
>
>> On Wed, Mar 30, 2011 at 1:41 PM, Nicolas Pitre <[email protected]> wrote:
>>>
>>> Trying to rely on bootloaders doing things right is like saying that x86
>>> should always rely on the BIOS doing things right.
>>
>> No. Not at all.
>>
>> The problem with firmware/BIOS is that it's set in stone and closed-source.
>>
>> I'm suggesting splitting out the crazy part into a separate project
>> that does this. Open-source. Like a mini-kernel. Because the thing is,
>> the main kernel doesn't care, and _shouldn't_ care. Those board files
>> are just noise.
>
> Sure, but important noise nevertheless. As long as the noise is
> confined to a limited set of .c files I'm happy. OTOH I have very
> little hope for a separate project that would only deal with that noise.
> That will simply never fly, even less so as an Open Source project.
> The incentive for people to work on such a thing simply isn't there as
> that is totally uninteresting and without any rewards.
>
> Furthermore, this does create pain. You have to keep things in sync
> between the kernel and the mini-kernel (let's call it bootloader). In
> practice the bootloader is always maintained separately from the kernel,
> on its own pace and with its own release schedule. Trying to
> synchronize independent projects is really painful as you know already,
> otherwise the user space for perf would still be maintained separately
> from the kernel, right?

Being separate from the kernel with its own release schedule could be a
good thing.

using the example of clocks. if the clock definitions were in the
bootloader project, then when a new board is produced with a slightly
different clock arrangement, all you have to do is to update the
bootloader to pass the new definition to the kernel, and then you can use
a well tested kernel that has been put through its paces on other
hardware already.

Today you have to get the change upstream into the kernel, and then use
the new kernel (which always includes new features and bugs that you have
to test for)

you aren't saying that you are allowing arbitrary binary blobs to be
passed to the kernel from the bootloader, you are only saying that you
allow well defined board definition descriptions to be passed to the
kernel from the bootloader.

yes the bootloader can try to pass binary garbage to the kernel, but the
kernel doesn't have to be written to accept it. The kernel side remains
under your control even if the bootloader piece is owned by someone else.

the two pieces do not need to be released and updated in lockstep. yes,
there will be (many) cases where a new kernel adds support for a new type
of device, but the communications format between the bootloader and the
kernel can be designed to be tolerant of such skew. Even before the kernel
knows how to drive the hardware you can have the format of the information
about that hardware defined (allowing the bootloader to pass information
to the kernel that it just ignores because it doesn't have a driver in it
for that particular piece of hardware), and if the bootloader doesn't
tell the kernel about some device, the kernel will just ignore that
device.
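
The skew tolerance described above could look roughly like the sketch
below, written in Python for brevity. The entry layout, the DRIVERS
table and the device types are all hypothetical; this is not an existing
bootloader-to-kernel format, only an illustration of "ignore what you
have no driver for".

```python
# Hypothetical drivers known to this particular kernel build.
DRIVERS = {
    "uart": lambda params: f"uart at {params['base']:#x}",
    "gpio": lambda params: f"gpio bank with {params['nr']} lines",
}


def probe_devices(board_description):
    """Bind the devices we know; skip entries that have no driver here."""
    bound, skipped = [], []
    for entry in board_description:
        driver = DRIVERS.get(entry["type"])
        if driver is None:
            # Newer hardware described to an older kernel: not an error.
            skipped.append(entry["type"])
            continue
        bound.append(driver(entry.get("params", {})))
    return bound, skipped


# What a (hypothetical) bootloader might hand over for one board.
board = [
    {"type": "uart", "params": {"base": 0x48020000}},
    {"type": "gpio", "params": {"nr": 32}},
    {"type": "fancy-dsp", "params": {"firmware": "v2"}},
]

bound, skipped = probe_devices(board)
print(bound)
print(skipped)
```

A device the bootloader never mentions simply does not appear in the
list, so the kernel never touches it, which matches the second half of
the paragraph above.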

this means that you need to have some group doing the equivalent of
assigning device numbers for the different devices (and in this case going
just a little further to define what setup parameters will be needed),
initially this may be a little rough, but after a very short time I would
expect the people doing this work to start recognising that even though
vendor A who first proposes this device has some things hard-wired, the
definition format should support these things as variables instead of
being assumed.

David Lang

2011-03-31 00:56:38

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 5:15 PM, Russell King - ARM Linux
<[email protected]> wrote:
>
> In this merge window, I deleted at least 6000 lines from arch/arm, and
> by quoting diffstat percentages, you're using that against the ARM
> community. Why did I bother (that's not a question).

Umm. The actual stats are still:

1349 files changed, 62230 insertions(+), 33993 deletions(-)

which is sad. And the end result speaks for itself: this is lines per
architecture:

...
124022 total arch/sh
124418 total arch/sparc
181997 total arch/m68k
246717 total arch/mips
254785 total arch/x86
370912 total arch/powerpc
732732 total arch/arm

notice how ARM ends up being in a class of its own. This is a PROBLEM.
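
For scale, the ratios implied by those figures can be worked out
directly. The short Python sketch below uses only the counts listed
above (the listing is truncated, so smaller architectures are missing)
and normalizes them to arch/x86; the variable names are just for
illustration.

```python
# Line counts exactly as listed above (the smaller entries were elided).
lines_per_arch = {
    "arch/sh": 124022,
    "arch/sparc": 124418,
    "arch/m68k": 181997,
    "arch/mips": 246717,
    "arch/x86": 254785,
    "arch/powerpc": 370912,
    "arch/arm": 732732,
}

x86 = lines_per_arch["arch/x86"]
ratios = {arch: count / x86 for arch, count in lines_per_arch.items()}

for arch, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{arch:13s} {lines_per_arch[arch]:7d}  {ratio:.2f}x arch/x86")

arm_vs_powerpc = lines_per_arch["arch/arm"] / lines_per_arch["arch/powerpc"]
print(f"arch/arm is {arm_vs_powerpc:.2f}x arch/powerpc")
```

By these numbers arch/arm is a little under 2.9 times the size of
arch/x86 and just under twice the size of arch/powerpc, the next
largest entry.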

And ARM fanbois can say "oh, but arm is special" all they want, but
they need to realize that the lack of common platform for ARM is a
real major issue. It's not a "feature", and I'm sorry, but anybody who
calls x86 "peanuts" is a moron and should be ashamed of himself.
Instead of trying to feel superior, those people should feel like
pariah.

The fact that x86 has a platform, and people have cared about
compatibility, and actually gets things to work with less code is a
good thing. I know ARM people who think that x86 is an "ugly"
architecture. But the fact is, of all the architectures out there, ARM
right now is the ugliest BY FAR. Exactly because of people who don't
seem to understand that this kind of crap is a problem.

Linus

2011-03-31 01:15:04

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Linus:

On Wed, Mar 30, 2011 at 7:55 PM, Linus Torvalds
<[email protected]> wrote:
> 124022 total arch/sh
> 124418 total arch/sparc
> 181997 total arch/m68k
> 246717 total arch/mips
> 254785 total arch/x86
> 370912 total arch/powerpc
> 732732 total arch/arm

I'm not sure this metric is completely fair to ARM. If you want to
level the field, I think you have to divide each result by the number
of SoC's (or equivalent, in the case of x86) represented by that
architecture. Otherwise you aren't taking the diversity of the
various implementations of that instruction set into account.

Doing that, I think you'll find that ARM is in much better shape than
it appears.

> And ARM fanbois can say "oh, but arm is special" all they want, but
> they need to realize that the lack of common platform for ARM is a
> real major issue. It's not a "feature", and I'm sorry, but anybody who
> calls x86 "peanuts" is a moron and should be ashamed of himself.
> Instead of trying to feel superior, those people should feel like
> pariah.

I didn't say it was peanuts, but I agreed with the statement and I
stand by it. I don't think x86 is even close to the diversity you
find in the various ARM implementations.

> The fact that x86 has a platform, and people have cared about
> compatibility, and actually gets things to work with less code is a
> good thing.

Depends on who you ask. I have had to completely re-do entire
projects because we weren't able to bend the x86's notion of
"platform" to fit the task at hand (the decision to go with x86 was
made before I was involved in those projects, fwiw).

Furthermore, why aren't you saying the same thing about SH? They
don't appear to have a concept of "platform" that's any more evolved
than ARM. But there are a lot fewer SH SoCs supported in-kernel, so
the "problem" doesn't look as pronounced.

> I know ARM people who think that x86 is an "ugly" architecture.

You don't know me, but I think x86 is an ugly architecture.

> But the fact is, of all the architectures out there, ARM
> right now is the ugliest BY FAR. Exactly because of people who don't
> seem to understand that this kind of crap is a problem.

It's an OPPORTUNITY, not a problem. ARM's absence of a "platform"
concept allows developers to bend ARM into the shape needed to solve
the problem--- something that you can't say about x86.

This is a kernel architecture problem, not an SoC problem.

b.g.
--
Bill Gatliff
[email protected]

2011-03-31 01:38:03

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 6:15 PM, Bill Gatliff <[email protected]> wrote:
>
> I'm not sure this metric is completely fair to ARM. If you want to
> level the field, I think you have to divide each result by the number
> of SoC's

But that's the problem with ARM. Hardware companies that do one-off
stuff, with no sense of compatibility.

And calling it an "opportunity" is just stupid.

There's nothing good about causing extra work just because ARM hasn't
had the sense to standardize on one (or a couple) of interrupt
controllers etc.

Linus

2011-03-31 01:44:52

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Linus:

On Wed, Mar 30, 2011 at 8:37 PM, Linus Torvalds
<[email protected]> wrote:
>
> There's nothing good about causing extra work just because ARM hasn't
> had the sense to standardize on one (or a couple) of interrupt
> controllers etc.

You should go talk with ARM about it, I'm sure they'll be very
reasonable with you. And accommodating.

In the meantime, we have to live with the chips that exist and the
ones coming down the pipe. Until ARM and all their licensees start
consulting us on such matters, we'll just have to find a way to deal
with what we're given after the fact.

b.g.
--
Bill Gatliff
[email protected]

2011-03-31 01:57:10

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 6:44 PM, Bill Gatliff <[email protected]> wrote:
>
> In the meantime, we have to live with the chips that exist and the
> ones coming down the pipe. Until ARM and all their licensees start
> consulting us on such matters, we'll just have to find a way to deal
> with what we're given after the fact.

Yes. But:

(a) we don't have to be stupid and think it's a good design and an
"opportunity" like you do.

and

(b) the kernel source code doesn't have to be the mess of code that
it is. Those things should be abstracted out somehow (and yes,
devicetree is hopefully one way)

I really don't understand why you seem to be arguing against trying to
fix a real problem, and why you also seem to be arguing that the messy
ARM situation is somehow "good". I find your attitude about the lack
of platform being "good" to be incomprehensibly stupid. There is
absolutely _no_ advantage to anybody from the crazy arm fragmentation.

I know, I know, a lot of companies make money supporting the whole
crazy mess. I guess that can make people confused and think that being
messy is good, and could be seen as an advantage.

But most embedded companies seem to have realized that they should
move up the stack, rather than worry about some crazy GPIO or stupid
driver details.

Linus

2011-03-31 02:20:10

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Linus:

On Wed, Mar 30, 2011 at 8:56 PM, Linus Torvalds
<[email protected]> wrote:
> (a) we don't have to be stupid and think it's a good design and an
> "opportunity" like you do.

The complexity that is the current state of the ARM ecosystem presents
the opportunity to find a way to accommodate all those chips within
the Linux kernel.

If it isn't opportunity, then you must be arguing that we shouldn't
add any new ARM SoC support to the kernel. Is that what you are
saying?

> (b) the kernel source code doesn't have to be the mess of code that
> it is. Those things should be abstracted out somehow (and yes,
> devicetree is hopefully one way)

We are in violent agreement here.

> I really don't understand why you seem to be arguing against trying to
> fix a real problem, and why you also seem to be arguing that the messy
> ARM situation is somehow "good". I find your attitude about the lack
> of platform being "good" to be incomprehensibly stupid. There is
> absolutely _no_ advantage to anybody from the crazy arm fragmentation.

I guess I wasn't being clear.

Some of the proposals in this thread seemed to argue for the creation
of a one-size-fits-all kernel solution for ARM. I think pursuit of
such a solution is a complete waste of time, because it denies the
reality that ARM machines aren't very uniform. It's far better to
admit to and embrace that diversity than it is to try to deny it.

What we have today clearly isn't optimal, and it isn't going to scale
much farther. But we can't entertain the option of chucking the whole
mess over the wall into bootloader-land, or VHDL-land, or whatever.
Those simply aren't options. We have to find a solution that will
work in kernel space.

> I know, I know, a lot of companies make money supporting the whole
> crazy mess. I guess that can make people confused and think that being
> messy is good, and could be seen as an advantage.

No, I don't think messy is good. And I'm an individual who makes his
living making Linux run on ARM (among others) platforms. Messy wastes
my time, because I have to deal with the mess rather than making
platforms that solve real problems. Messy equates to overhead and
tedium. Messy invites error. Messy sucks.

> But most embedded companies seem to have realized that they should
> move up the stack, rather than worry about some crazy GPIO or stupid
> driver details.

Now that you mention it, GPIO is a perfect example of what I'm talking
about. Every ARM chip does that differently. Rather than deny that,
David Brownell came up with some code that embraces it. His code
might not be perfect, but we're learning as we go along. Same can be
said about the kernel's ARM situation as a whole.

And I still don't think that ARM is as bad as you think it is. Sure
it's ugly, but I don't think it's uglier than SH or PPC. It's just
ugly at a larger scale, because there are so many more ARM chips to
choose from.

Does ARM need some fixing? Yes! But let's be realistic about what
forms the solution might take.

b.g.
--
Bill Gatliff
[email protected]

2011-03-31 03:17:50

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, [email protected] wrote:

> On Wed, 30 Mar 2011, Nicolas Pitre wrote:
>
> > Furthermore, this does create pain. you have to make things in sync
> > between the kernel and the mini-kernel (let's call it bootloader). In
> > practice the bootloader is always maintained separately from the kernel,
> > on its own pace and with its own release schedule. Trying to
> > synchronize independent projects is really painful as you know already,
> > otherwise the user space for perf would still be maintained separately
> > from the kernel, right?
>
> Being separate from the kernel with its own release schedule could be a good
> thing.

It could. It didn't work well for oprofile. That's why perf was
included in the kernel tree which so far appears to be a success.

> using the example of clocks. if the clock definitions were in the bootloader
> project, then when a new board is produced with a slightly different clock
> arrangement, all you have to do is to update the bootloader to pass the new
> definition to the kernel, and then you can use a well tested kernel that has
> been put through its paces on other hardware already.

Sure. And that's the perfect case. I've yet to see things work so
well in practice.

> Today you have to get the change upstream into the kernel, and then use the
> new kernel (which always includes new features and bugs that you have to test
> for)

No. Today you have to get the change working in the current stable
kernel locally and test it before you submit it upstream.

> the two pieces do not need to be released and updated in lockstep. yes, there
> will be (many) cases where a new kernel adds support for a new type of device,
> but the communications format between the bootloader and the kernel can be
> designed to be tolerant of such skew. Even before the kernel knows how to
> drive the hardware you can have the format of the information about that
> hardware defined (allowing the bootloader to pass information to the kernel
> that it just ignores because it doesn't have a driver in it for that
> particular piece of hardware), and if the bootloader doesn't tell the kernel
> about some device, the kernel will just ignore that device.

Today we have no such communication format with its limitations and
compatibility issues to care about. That allows for much greater
flexibility just as the kernel internal APIs are not guaranteed to be
stable. The major drawback is a lack of forward compatibility meaning
that for every new piece of hardware to come you need a kernel update.
There is simply no magic solution.

> this means that you need to have some group doing the equivalent of assigning
> device numbers for the different devices (and in this case going just a little
> further to define what setup parameters will be needed), initially this may be
> a little rough, but after a very short time I would expect the people doing
> this work to start recognising that even though vendor A who first proposes
> this device has some things hard-wired, the definition format should support
> these things as variables instead of being assumed.

Ideally, yes. But if every vendor has a different set of peripherals,
and from one SOC revision to the next from the same vendor you still
have different hardware knobs, then you still have to add yet more code
to the kernel. And that doesn't solve the issue of dynamic clock and
power management at runtime either for which custom code is still
required.

As long as SOC vendors keep producing wildly different architectures
besides the core CPU we'll have this problem. Denying the reality won't
make that problem go away either. And device tree won't stop those
vendors from still trying to do things differently (better?) because they
are not constrained by having to ensure this single proprietary software
stack still boots.

Nicolas

2011-03-31 03:29:41

by Dave Airlie

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

> As long as SOC vendors keep producing wildly different architectures
> besides the core CPU we'll have this problem. Denying the reality won't
> make that problem go away either. And device tree won't stop those
> vendors from still trying to do things differently (better?) because they
> are not constrained by having to ensure this single proprietary software
> stack still boots.

So you are saying the only way to get the Linux ARM shit cleaned up is
to hope Microsoft succeeds in making Windows a success on ARM?

Dave.

2011-03-31 03:32:33

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 7:20 PM, Bill Gatliff <[email protected]> wrote:
>
> If it isn't opportunity, then you must be arguing that we shouldn't
> add any new ARM SoC support to the kernel. Is that what you are
> saying?

What I'm saying is that we should not be adding ANY MINDLESS BOARD
DRIVERS for ARM.

Because they don't work. Most of them are totally unmaintainable CRAP
in the long run. They are extra code that actually has negative
worth. It shouldn't be in the kernel at all.

So let's take a really simple example of this kind of crap.

Do this:

git ls-files arch/arm/ | grep gpio

and cry. That's 145 files in the arm directory that are some kind of
crazy gpio support.

Now, we have gpio drivers in other parts of the kernel too, but ARM is
at the point where it's just crazy.

And most GPIO drivers I've ever seen are actually basically "turn this
bit on or off in this register to turn it into an Input or Output"
along with "read/write this other bit to actually see/set the value".
Repeat that for 'nr' bits, where 'nr' is just some small value,
usually in single digits.

Now, not all of them are that, by all means, and the details are often
slightly different. Sometimes the read register is the same as the
write register, sometimes it isn't. Sometimes you have a "clear
register" and a "set register" instead of a register you write the
value to. And I haven't checked what those 145 files do, but I bet a
_lot_ of them could be described by having a single generic gpio
driver, and then just using devicetree to give that common driver a
few values to describe where the IO ports are, which bits they are,
and which type of gpio it is.

And then when you have another ARM SoC, instead of writing yet another
mindless board driver for the gpio's on it, just add the <nr> entries
for the GPIO's to the device tree. NOT A SINGLE LINE OF CODE.
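
A minimal sketch of the generic-driver idea described above, written in Python only to keep it short (a real driver would be C in the kernel). Every name here (GpioDesc, FakeBus, GenericGpio, the register offsets) is hypothetical and not taken from any existing driver; the point is only that the per-SoC part can be a handful of values rather than code.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class GpioDesc:
    """Hypothetical per-controller description, e.g. what a device tree node could carry."""
    base: int                       # base address of the register block
    nr: int                         # number of GPIO lines
    dir_off: int                    # direction register offset (bit set = output)
    in_off: int                     # register read to see pin values
    out_off: Optional[int] = None   # plain data register for writes, if any
    set_off: Optional[int] = None   # write-1-to-set register, if any
    clr_off: Optional[int] = None   # write-1-to-clear register, if any


class FakeBus:
    """Stand-in for memory-mapped I/O: a dict of address -> value."""
    def __init__(self) -> None:
        self.regs: Dict[int, int] = {}

    def read(self, addr: int) -> int:
        return self.regs.get(addr, 0)

    def write(self, addr: int, val: int) -> None:
        self.regs[addr] = val


class GenericGpio:
    """One driver; all controller differences come from the GpioDesc."""
    def __init__(self, desc: GpioDesc, bus: FakeBus) -> None:
        has_set_clr = desc.set_off is not None and desc.clr_off is not None
        if desc.out_off is None and not has_set_clr:
            raise ValueError("need out_off or both set_off and clr_off")
        self.desc, self.bus, self.has_set_clr = desc, bus, has_set_clr

    def _bit(self, line: int) -> int:
        if not 0 <= line < self.desc.nr:
            raise ValueError(f"line {line} out of range")
        return 1 << line

    def direction_output(self, line: int) -> None:
        addr = self.desc.base + self.desc.dir_off
        self.bus.write(addr, self.bus.read(addr) | self._bit(line))

    def direction_input(self, line: int) -> None:
        addr = self.desc.base + self.desc.dir_off
        self.bus.write(addr, self.bus.read(addr) & ~self._bit(line))

    def set_value(self, line: int, value: int) -> None:
        bit = self._bit(line)
        if self.has_set_clr:
            # Separate set/clear registers: write the bit to the right one.
            off = self.desc.set_off if value else self.desc.clr_off
            self.bus.write(self.desc.base + off, bit)
        else:
            # Single data register: read-modify-write.
            addr = self.desc.base + self.desc.out_off
            cur = self.bus.read(addr)
            self.bus.write(addr, cur | bit if value else cur & ~bit)

    def get_value(self, line: int) -> int:
        return (self.bus.read(self.desc.base + self.desc.in_off) >> line) & 1
```

Under these assumptions, supporting another controller of the same simple kind means writing another GpioDesc, not another driver. Controllers behind i2c, or with muxing and pulsing, would not fit this sketch.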

Yes, yes, there are always exceptions. Many GPIO's are actually behind
some i2c bus or something. Others can do pulsing or are just generally
more complex than an array of single bits. So I'm sure we couldn't
replace all those 145 gpio files under arch/arm with a single driver
and some devicetree entries, but maybe half of them match the simple
pattern. I bet in the SoC case it's more than half; it would be silly to
do i2c on an SoC. But I dunno. I really didn't look.

PowerPC does exactly the above, btw. So I'm not just talking about
some magical theoretical thing. I seriously think every ARM person who
has ever written any of those "gpio" files should look at powerpc.
Now, I suspect that most powerpc SoC's tend to share more IP blocks
than the crazy ARM situation, but even so, please just check it out.
Check out the device tree files (*.dts) and do that same

git ls-files arch/arm/ | grep gpio

except do it on powerpc.

See the difference?

The powerpc people even wrote documentation about the thing, which is
just above and beyond reasonable.

Linus

2011-03-31 04:09:32

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Linus Torvalds wrote:

> Umm. The actual stats are still:
>
> 1349 files changed, 62230 insertions(+), 33993 deletions(-)
>
> which is sad. And the end result speaks for itself: this is lines per
> architecture:
>
> ...
> 124022 total arch/sh
> 124418 total arch/sparc
> 181997 total arch/m68k
> 246717 total arch/mips
> 254785 total arch/x86
> 370912 total arch/powerpc
> 732732 total arch/arm
>
> notice how ARM ends up being in a class of its own. This is a PROBLEM.

OK let's think of it in terms of a problem. How do _we_ fix it? Maybe
we can change the Linux license and enforce some policies on the
hardware manufacturers, forcing them to conform to a strict model before
they are allowed to use Linux. Do you think you can have such an
influence on them? I really don't think I do.

Or maybe we can tell all those crazy people to get together and stop
trying to differentiate themselves otherwise we'll just ignore them?
That could be an option, those people will go away, and embedded Linux
will go back underground like it used to be. Wouldn't that be
wonderful?

> And ARM fanbois can say "oh, but arm is special" all they want, but
> they need to realize that the lack of common platform for ARM is a
> real major issue. It's not a "feature", and I'm sorry, but anybody who
> calls x86 "peanuts" is a moron and should be ashamed of himself.
> Instead of trying to feel superior, those people should feel like
> pariah.

Oh come on. You just provided actual numbers above showing that ARM is
simply fscked up (your words) compared to X86. I would be curious to
know what people like tglx who did significant work on both
architectures actually think of X86 relative to ARM when it comes to
kernel maintenance.

No one is saying there is no problem. There is _indeed_ a problem in ARM
land. But this is actually a _hardware_ problem that no one refutes.
On the software side I'd say that we're struggling but still coping
relatively well with it given the actual hardware jungle out there. It
is not as if _we_ could do something about _that_.

> The fact that x86 has a platform, and people have cared about
> compatibility, and actually gets things to work with less code is a
> good thing. I know ARM people who think that x86 is an "ugly"
> architecture. But the fact is, of all the architectures out there, ARM
> right now is the ugliest BY FAR. Exactly because of people who don't
> seem to understand that this kind of crap is a problem.

No disagreement here. Certainly not from my side. I would much prefer
to have only one or two ARM variants to deal with like in the old days
when only a very few ARM vendors were relevant. But the reality has
changed, and unless we start boycotting those who try to be different
just because they can, I don't think we have much choice but to cope.
And so far we do cope remarkably well given the diversity involved.

Things can be improved and they are indeed being improved every merge
window. But at the same time there is a new and different SOC to
support each merge window as well which just can't be fitted in the
existing numerous SOC models. And this is really frustrating indeed
because there is simply no magic solution and yet more code needs to be
written and reviewed again.

At least we managed to isolate the crap into separate directories and
try hard for one vendor not to screw up the other ones.

Nicolas

2011-03-31 04:38:42

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Dave Airlie wrote:

> > As long as SOC vendors keep producing wildly different architectures
> > besides the core CPU we'll have this problem. Denying the reality won't
> > make that problem go away either. And device tree won't stop those
> > vendor from still trying to do things differently (better?) because they
> > are not constrained by having to ensure this single proprietary software
> > stack still boot.
>
> So you are saying the only way to get the Linux ARM shit cleaned up is
> to hope Microsoft succeeds in making Windows a success on ARM?

Absolutely. On Intel, Windows is (still) the reference. If Windows
doesn't boot on your motherboard you have a problem. So motherboard
vendors won't make crazy incompatible things. They are constrained to
fix their hardware because they just cannot alter Windows to suit their
hardware differences. That really helps keep actual differences to a
minimum and only to things that are not fundamental. So Windows really
helped make a uniform hardware platform on X86.

On ARM you have no prepackaged "real" Windows. That lets hardware people
try things. So they do change the hardware platform all the time to
gain some edge. And this is no problem for them because most of the
time they have access to the OS source code and they modify it
themselves directly. No wonder Linux is so popular on ARM. I'm sure
hardware designers really enjoy this freedom. We software developers
would much prefer if the whole hardware platform was standardized and
set in stone. That would certainly make our lives so much better and
then we would have spare cycles to actually abstract all those GPIO
drivers even further. But that would benefit Windows on ARM quite
significantly too.

Nicolas

2011-03-31 05:06:29

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Nicolas Pitre wrote:

> On Wed, 30 Mar 2011, [email protected] wrote:
>
>> On Wed, 30 Mar 2011, Nicolas Pitre wrote:
>>
>
>> this means that you need to have some group doing the equivalent of assigning
>> device numbers for the different devices (and in this case going just a little
>> further to define what setup parameters will be needed), initially this may be
>> a little rough, but after a very short time I would expect the people doing
>> this work to start recognising that even though vendor A who first proposes
>> this device has some things hard-wired, the definition format should support
>> these things as variables instead of being assumed.
>
> Ideally, yes. but if every vendor has a different set of peripherals,
> and from one SOC revision to the next from the same vendor you still
> have different hardware knobs, then you still have to add yet more code
> to the kernel. And that doesn't solve the issue of dynamic clock and
> power management at runtime either for which custom code is still
> required.
>
> As long as SOC vendors keep producing wildly different architectures
> besides the core CPU we'll have this problem. Denying the reality won't
> make that problem go away either. And device tree won't stop those
> vendor from still trying to do things differently (better?) because they
> are not constrained by having to ensure this single proprietary software
> stack still boot.

the thing that you are not convincing us of is that all these different
SoCs are so wildly different architectures.

back in the early days of the PCs, different systems from different
vendors had different bus types, peripherals at different addresses, etc.
that didn't make all of those vendors' systems different architectures,
instead those things were variants of the x86 architecture.

with ARM you do have a couple different architectures (arm5 vs arm7 for
example), but what you are hearing people say is that

arm7+IPblock1+IPblock2
arm7+IPblock1+IPblock3
arm7+IPblock2+IPblock3

are not three different architectures, they are one architecture with
different devices attached.

what's more, you seem to be saying that

arm7+IPblock1

and

arm7+IPblock1

are different architectures if the wiring between the arm core and
IPblock1 are different (they are different 'boards' or different chip
models, possibly from different manufacturers)

I see the variations here as a good thing, just like having a huge number
of pluggable cards in a PC was a good thing (even though it made it hard
to have an OS that supports every card out there)

in the case of the PC, systems that were too different died off, systems
that could have their differences abstracted into different
drivers prospered.

I am _not_ saying that all arm systems need to standardize on one
interrupt controller, I am saying that the kernel support for ARM needs to
be able to _easily_ be told that this chip has interrupt chip type 24
connected this way, and interrupt chip type 87 connected that way, without
needing to create a new architecture. If the kernel is compiled with the
appropriate drivers, it should even be able to be done without needing to
recompile the kernel.
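
One way to picture "tell the kernel which interrupt chip type is connected which way" is a registry of drivers plus a board description that is pure data. The sketch below is illustrative only: the type strings ("example,irqchip-24", "example,irqchip-87"), the node fields and the function names are all made up, and Python stands in for what would be C in a kernel.

```python
from typing import Callable, Dict, List, Tuple

# Registry of compiled-in drivers, keyed by a hypothetical type string.
DRIVERS: Dict[str, Callable[[dict], str]] = {}


def register_driver(compatible: str):
    """Decorator that records a probe function under its type string."""
    def wrap(probe: Callable[[dict], str]) -> Callable[[dict], str]:
        DRIVERS[compatible] = probe
        return probe
    return wrap


@register_driver("example,irqchip-24")
def probe_irqchip_24(node: dict) -> str:
    return f"irqchip-24 at {node['reg']:#x}, parent irq {node['parent_irq']}"


@register_driver("example,irqchip-87")
def probe_irqchip_87(node: dict) -> str:
    return f"irqchip-87 at {node['reg']:#x}, {node['nr_irqs']} lines"


def bind_devices(board: List[dict]) -> Tuple[List[str], List[str]]:
    """Match each described device to a driver; report what could not be bound."""
    bound, unbound = [], []
    for node in board:
        probe = DRIVERS.get(node["compatible"])
        if probe is None:
            unbound.append(node["compatible"])
        else:
            bound.append(probe(node))
    return bound, unbound
```

With this shape, a new board that reuses known device types is a new list of nodes handed to bind_devices(), with no change to the driver code. A device type nobody has written a driver for still ends up in the unbound list, so the sketch covers wiring differences only.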

Now I understand that this isn't how things are done today in ARM, but
that's not how things were done 10 years ago in x86 either. back then you
had kernels compiled specifically for each system. nowadays that is still
done where space or performance is critical, but a huge number of systems
sacrifice a few % of speed, and some storage and ram for the flexibility
and supportability of using a one-size-fits-many kernel (along with a
large number of loadable modules)

why can't ARM do this?

look at serial port support on x86.

while serial ports are becoming rare nowadays, the kernel has support for
_many_ different types of ports, and all of the port types support being
wired up in different ways (to different addresses and interrupts)

the kernel could have gone the route that ARM did of having a master
config that listed every system known and where its serial ports were
wired, but thanks to the fact that many of these were plug-in options,
that would have been painful enough that it drove them to do it the right
way.

take ARM down a similar path, treat the on-chip devices in a similar way
to off-chip devices.

define the different types of GPIO behavior in arch-level drivers and make
the chip-level (or board level) definitions say "I have a type 6 GPIO
device wired this way" instead of including an entire GPIO driver in that
definition.

what is happening in ARM is being driven by the short-term ease of the
chip manufacturers, they do things any way they want, and their engineers
cut-n-paste their way to make things work as a new architecture.

in the long run, making things more flexible and easier for the device
designers and the people modifying the designs down the road will grow the
pie by making it much easier for people to drop an ARM onto a new design,
add a smattering of random (to the chip manufacturer) chips to do various
things, and get linux running on the result.

get it down to where a new board can be designed by a guy in his garage,
with a linux ARM distro able to run out of the box on it (with the
appropriate definitions of how the guy wired it), and the variations will
proliferate to the point where the current variation in ARM will look
trivial by comparison, and the result will be even more use (and therefore
sales) of ARM chips in all sorts of things that nobody can imagine today.

I'm working on projects that are small embedded systems, right now it's a
big-boy's game. If you want to do a small project you better copy the
reference board pretty closely or have a lot of kernel hacking ability
(along with the ability to convince the management types that you can do
this). In my project I lost the argument and instead of using ARM and
linux the people funding things opted to go with a proprietary OS from the
chip vendor instead. this maze is costing users and developers.

Turn things around and make it easier to have more variations and you will
get more support from all directions.

David Lang

2011-03-31 05:45:55

by Geert Uytterhoeven

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 01:31, Nicolas Pitre <[email protected]> wrote:
> On Wed, 30 Mar 2011, Linus Torvalds wrote:
>> The long-term situation should be that you should be able to have ONE
>> binary kernel "just work". That's where we are on x86. Really.
>
> But X86 is peanuts. Really. There was one machine called the IBM PC at
> some point that everybody cloned, and the rest was totally irrelevant.
> Then came that thing called Windows that reinforced this hardware
> monoculture as it was used for the ultimate conformance testing. This
> is damn easy in that case to produce a kernel that works virtually
> everywhere.
>
> On ARM there is simply not such thing as a single machine design to
> clone, and a closed source test bench to design for.

There are other architectures that didn't start from a single root platform,
but still support multi-platform kernels.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2011-03-31 06:42:29

by Olof Johansson

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 8:24 PM, Linus Torvalds
<[email protected]> wrote:

> Check out the device tree files (*.dts) and do that same
>
> git ls-files arch/arm/ | grep gpio
>
> except do it on powerpc.
>
> See the difference?
>
> The powerpc people even wrote documentation about the thing, which is
> just above and beyond reasonable.

Powerpc does have the benefit of only having about three (active)
silicon vendors, out of which two are doing most of the arch
infrastructure work. There are more chefs in the kitchen in the ARM
world. But yeah, the arch/powerpc/platforms/* directories are tiny
compared to the ARM equivalents.

That being said: arch/ppc used to be messy too. A lot of cleanups were
done when the ppc+ppc64 -> powerpc merge happened.

Starting over on a new base would avoid some of the problems of
dealing with new incoming platforms while things are being cleaned up.
It would decouple the need to start reviewing _everything_ at the same
time as the cleanup is underway, and would give a chance to set good
examples for how things should be handled as new platforms are brought
over.

Then, when the time comes, start refusing new boards/SoCs/features on
the old subtree and have new clean stuff go directly into the new one.

Of course, main drawback is that this would duplicate the actual arch
code (the parts Russell is handling), and he is already stretched
thin as it is. That code isn't what needs the cleanup most though, so
maybe it can just be shared to start with to avoid that overhead.

-Olof

2011-03-31 06:55:08

by David Brown

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30 2011, Linus Torvalds wrote:

> And most GPIO drivers I've ever seen are actually basically "turn this
> bit on or off in this register to turn it into an Input or Output"
> along with "read/write this other bit to actually see/set the value".
> Repeat that for 'nr' bits, where 'nr' is just some small value,
> usually in single digits.

A few hundred is more typical.

At least with the MSMs, it's even messier. The chip has a moderate
number of pins, and a large number of peripherals, more than can
actually be connected to these pins at a time. Multiple devices, and a
general GPIO are frequently muxed. This has to be configured, in
addition to input or output. Even for just GPIO, there may be multiple
types of pullups, drive strengths. Different devices will need
different configurations to handle low power, etc.

Of the 145 files with 'gpio' in the name, 50 or so of them seem to
actually be GPIO drivers. At least of the more common platforms, these
are quite different, and seem to deal with fairly complex devices. Many
at least use the gpiolib.

The rest of the files seem to be more concerned with what the different
GPIOs are connected to. This seems like the place to focus on
generalizing. Maybe device tree.

> Now, not all of them are that, by all means, and the details are often
> slightly different. Sometimes the read register is the same as the
> write register, sometimes it isn't. Sometimes you have a "clear
> register" and a "set register" instead of a register you write the
> value to. And I haven't checked what those 145 files do, but I bet a
> _lot_ of them could be described by having a single generic gpio
> driver, and then just using devicetree to give that common driver a
> few values to describe where the IO ports are, which bits they are,
> and which type of gpio it is.

If we can come up with a general way to describe the diverse things a
GPIO can be used for. Even something like drive strength can have
widely differing possibilities, and may need different settings for low
power modes (which means it can't just be the boot loader setting it).
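
To make the difficulty concrete, here is a rough sketch of what a data description of pin settings might need to hold once muxing, pulls, drive strength and low-power states are included. The field names, state names ("active", "sleep") and values are assumptions chosen for illustration, not an existing format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PinState:
    function: str   # "gpio" or the name of the peripheral muxed onto the pin
    pull: str       # "none", "up" or "down"
    drive_ma: int   # drive strength in mA


# pin number -> power state name -> settings; "active" is always present.
PinTable = Dict[int, Dict[str, PinState]]


def settings_for(table: PinTable, state: str) -> Dict[int, PinState]:
    """Settings for every pin in a power state, falling back to "active"."""
    return {pin: states.get(state, states["active"]) for pin, states in table.items()}


def changed_pins(table: PinTable, old: str, new: str) -> List[int]:
    """Pins that must be reprogrammed at runtime when moving between states."""
    before, after = settings_for(table, old), settings_for(table, new)
    return sorted(pin for pin in table if before[pin] != after[pin])
```

Any pin returned by changed_pins() needs kernel involvement at runtime, which is why boot loader setup alone is not enough. How widely the possible values of fields such as drive_ma differ between vendors is exactly the open question raised above.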

> And then when you have another ARM SoC, instead of writing yet another
> mindless board driver for the gpio's on it, just add the <nr> entries
> for the GPIO's to the device tree. NOT A SINGLE LINE OF CODE.

Agree, at least in theory.

> Yes, yes, there are always exceptions. Many GPIO's are actually behind
> some i2c bus or something. Others can do pulsing or are just generally
> more complex than an array of single bits. So I'm sure we couldn't
> replace all those 145 gpio files under arch/arm with a single driver
> and some devicetree entries, but maybe half of them match the simple
> pattern. I bet the SoC case it's more than half, it would be silly to
> do i2c on an SoC. But I dunno. I really didn't look.

At least one MSM has GPIOs behind i2c. Too many pins, otherwise.

> PowerPC does exactly the above, btw. So I'm not just talking about
> some magical theoretical thing. I seriously think every ARM person who
> has ever written any of those "gpio" files should look at powerpc.
> Now, I suspect that most powerpc SoC's tend to share more IP blocks
> than the crazy ARM situation, but even so, please just check it out.
> Check out the device tree files (*.dts) and do that same
>
> git ls-files arch/arm/ | grep gpio
>
> except do it on powerpc.
>
> See the difference?

Yeah, powerpc doesn't seem to have as complex of a use of gpios.

David

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

2011-03-31 07:15:45

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, [email protected] wrote:

> On Wed, 30 Mar 2011, Nicolas Pitre wrote:
>
> > As long as SOC vendors keep producing wildly different architectures
> > besides the core CPU we'll have this problem. Denying the reality won't
> > make that problem go away either. And device tree won't stop those
> > vendor from still trying to do things differently (better?) because they
> > are not constrained by having to ensure this single proprietary software
> > stack still boot.
>
> the thing that you are not convincing us of is that all these different SoCs
> are so wildly different architectures.

You can have a look at the code. It is all there. And the different
GPIO controllers are just the tip of the iceberg. But by all means if
you actually find a way to abstract most differences between those SOCs
then please say so. Many people tried already. Maybe they weren't
smart enough.

> back in the early days of the PCs, different systems from different vendors
> had different bus types, peripherals at different addresses, etc. that didn't
> make all of those vendors' systems different architectures, instead those
> things were variants of the x86 architecture.

Most of them didn't survive. That really helps.

> with ARM you do have a couple different architectures (arm5 vs arm7 for
> example),

That's the easy part. Well not necessarily that easy, but RMK did a
damn good job of it and looking at the result you may think that was
easy.

> but what you are hearing people say is that
>
> arm7+IPblock1+IPblock2
> arm7+IPblock1+IPblock3
> arm7+IPblock2+IPblock3
>
> are not three different architectures, they are one architecture with
> different devices attached.
>
> what's more, you seem to be saying that
>
> arm7+IPblock1
>
> and
>
> arm7+IPblock1
>
> are different architectures if the wiring between the arm core and IPblock1
> are different (they are different 'boards' or different chip models, possibly
> from different manufacturers)

That's again the easy part. If that was all that simple we would be all
happy campers.

> I see the variations here as a good thing, just like having a huge number of
> pluggable cards in a PC was a good thing (even though it made it hard to have
> an OS that supports every card out there)

What we have here is more like a large number of PCs, each one with a
different set of cards. The different SOCs are like PCs, and the cards
are those set of peripherals to be found in those SOCs.

But it is not just about peripherals, and system level things like
timers and interrupt controllers. It is also about the huge complexity
put into the hardware for power management. You will find different
clocks/PLLs, power domains and regulators, and that has to be intermixed
with various sleep states, wake sources and conditions, etc. And no
resemblance is to be found between different vendors for those pm
features unfortunately. Even with a common high level abstraction
(which abstraction is still being worked on) you will still need the
backend code to interface with the hardware.

Maybe someday the optimal (or even just good enough) power management
hardware infrastructure will be found and everybody will standardize on
that. But we're apparently not there yet.

> in the case of the PC, systems that were too different died off, systems that
> could have their differences abstracted into different drivers prospered.

Maybe that will happen eventually. In the mean time they are rather all
successful despite the major hardware discrepancies because the kernel
is Open Source and it is relatively easy for each vendor to hack the
kernel until it works on their own hardware. User space is largely
unaffected so the kernel becomes the actual abstraction.

> I am _not_ saying that all arm systems need to standardize on one interrupt
> controller,

I wish they had.

> I am saying that the kernel support for ARM needs to be able to
> _easily_ be told that this chip has interrupt chip type 24 connected this way,
> and interrupt chip type 87 connected that way, without needing to create a new
> architecture.

But then we have the next SOC coming up with yet another IRQ controller
built into it. So yet more code is required to deal with it.
Amongst the 15 ARM vendors or so, no IRQ controller is the same.

> If the kernel is compiled with the appropriate drivers, it
> should even be able to be done without needing to recompile the kernel.

Obviously. And that's not a problem. The problem is to actually
support all the different hardware blocks. Hence the volume of code
out there.

If you do configure your kernel for one specific target which has a
specific SOC implying a specific IRQ, timer, GPIO, clock, power, memory
and peripheral bus, then you'll end up using relatively few lines of
code from the arch/arm/ directory. If instead you want support for a
SOC from another vendor, then you'll need a different set of source code
similar to the differences between an ARM SOC and a MIPS SOC.

> Now I understand that this isn't how things are done today in ARM, but that's
> not how things were done 10 years ago in x86 either. back then you had kernels
> compiled specifically for each system. nowadays that is still done where space
> or performance is critical, but a huge number of systems sacrifice a few % of
> speed, and some storage and ram for the flexibility and supportability of
> using a one-size-fits-many kernel (along with a large number of loadable
> modules)

We do have that partially on ARM. It is already possible to build a
kernel that supports multiple different boards at once, as long as
they're all using the same SOC family. I invite you to have a look at
http://armlinux.simtec.co.uk/kautobuild/2.6.39-rc1/index.html. Notice
the defconfig names which are mostly after SOC names, and the number of
actual machines they do include for each of those configs. For example,
the omap2plus_defconfig produces a single kernel binary that can boot on
31 different boards. Performance is not optimal as the instruction set
is limited to the oldest supported by all those boards. But it works.
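
As a rough picture of how one binary can serve several boards of the same SOC family: the image carries a table of board descriptions and picks one at boot. The sketch assumes the boot loader hands over a numeric board identifier; the IDs, names and architecture versions below are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Machine:
    machine_id: int     # hypothetical number handed over by the boot loader
    name: str
    arch_version: int   # ARM architecture version of the board's CPU


# Hypothetical boards built into a single image.
MACHINES: List[Machine] = [
    Machine(1001, "example-board-a", 6),
    Machine(1002, "example-board-b", 7),
    Machine(1003, "example-board-c", 7),
]


def select_machine(machines: List[Machine], machine_id: int) -> Optional[Machine]:
    """Find the description of the board the image is booting on, if built in."""
    for m in machines:
        if m.machine_id == machine_id:
            return m
    return None


def baseline_arch(machines: List[Machine]) -> int:
    """The image must use only instructions the oldest built-in CPU supports."""
    return min(m.arch_version for m in machines)
```

baseline_arch() is the reason performance is not optimal: adding one older board lowers the instruction set available to every board in the image.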

This last merge window, some prerequisite infrastructure changes were
pushed upstream to eventually allow for many SOC families to be compiled
together by performing binary patching of the kernel at boot time like
on X86.

But that has nothing to do with the fact that those SOCs still need
their own specific support code to work because each vendor did their
own things around the CPU core. And there are a _lot_ of different SOCs
on ARM, far more than any other architecture in the kernel tree.

> look at serial port support on x86.
>
> while serial ports are becoming rare nowadays, the kernel has support for
> _many_ different types of ports, and all of the port types support being wired
> up in different ways (to different addresses and interrupts)
>
> the kernel could have gone the route that ARM did of having a master config
> that listed every system known and where its serial ports were wired, but
> thanks to the fact that many of these were plug-in options, that would have
> been painful enough that it drove them to do it the right way.

No... You don't understand. What you're talking about above is all
about different ways to wire the same serial controller chip, or
derivatives of that chip, all based on the venerable 8250 from National
Semiconductor. While it is true that some ARM SOCs do have a compatible
UART and rightfully use the 8250.c driver, many (most?) SOCs do have
totally incompatible UARTs requiring separate drivers. That's what I
mean by each vendor doing their own things differently.

> take ARM down a similar path, treat the on-chip devices in a similar way to
> off-chip devices.

But we do already. Care to provide an example where it is not the case?

> define the different types of GPIO behavior in arch-level drivers and make the
> chip-level (or board level) definitions say "I have a type 6 GPIO device wired
> this way" instead of including an entire GPIO driver in that definition.

GPIO is too easy. Please take the clock tree and the multiple power
domains found on the various TI OMAP and tell me how you would change
that code into such abstract data, and make that format also useful and
usable for the Marvell Kirkwood, and/or the Freescale i.MX, and/or the
ST-Ericsson UX500, and/or ... (the list can go on like this for long).

> what is happening in ARM is being driven by the short-term ease of the chip
> manufacturers, they do things any way they want, and their engineers
> cut-n-paste their way to make things work as a new architecture.

Sure that's what they do. But that code never gets merged in mainline.
Trust me, what we have now, even if it doesn't look nice from a
10000-foot point of view, has gone through some review and often a
complete rewrite from people outside of those manufacturers.

> in the long run, making things more flexible and easier for the device
> designers and the people modifying the designs down the road will grow the pie
> by making it much easier for people to drop an ARM onto a new design, add a
> smattering of random (to the chip manufacturer) chips to do various things,
> and get linux running on the result.
>
> get it down to where a new board can be designed by a guy in his garage, with
> a linux ARM distro able to run out of the box on it (with the appropriate
> definitions of how the guy wired it), and the variations will proliferate to
> the point where the current variation in ARM will look trivial by comparison,
> and the result will be even more use (and therefore sales) of ARM chips in all
> sorts of things that nobody can imagine today.

Hey, are you trying to convince _me_? I'm currently not in the
semiconductor industry, and for the 2.5 years that I did work for a
semiconductor vendor I had very little to say about the hardware design.
I do wish that the hardware platform on ARM was more uniform. That
would certainly make my job easier. But reality is different, for
better or worse.

Nicolas

2011-03-31 07:21:35

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Geert Uytterhoeven wrote:

> On Thu, Mar 31, 2011 at 01:31, Nicolas Pitre <[email protected]> wrote:
> > On ARM there is simply not such thing as a single machine design to
> > clone, and a closed source test bench to design for.
>
> There are other architectures that didn't start from a single root platform,
> but still support multi-platform kernels.

Sure, so does ARM with some restrictions.

Nicolas

2011-03-31 08:07:13

by Ingo Molnar

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Nicolas Pitre <[email protected]> wrote:

> On Wed, 30 Mar 2011, [email protected] wrote:
>
> > back in the early days of the PCs, different systems from different vendors
> > had different bus types, peripherals at different addresses, etc. that didn't
> > make all of those vendors' systems different architectures, instead those
> > things were variants of the x86 architecture.
>
> Most of them didn't survive. That really helps.

That's not the point, 99% of the current ARM boards will not 'survive' either,
10-20 years down the road.

I think you missed David's main point: life inevitably went on and few of the
old x86 hardware 'survived' physically, but past hardware versions have not
littered the kernel source with half a million lines of source code in the
process ...

Having strong, effective platform abstractions inside the kernel really helps
even if the hardware space itself is inevitably fragmented: both powerpc and
x86 have shown that. Until you realize and appreciate that, you really have not
understood the problem, I think.

Thanks,

Ingo

2011-03-31 08:10:59

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 10:05:41PM -0700, [email protected] wrote:
> with ARM you do have a couple different architectures (arm5 vs arm7 for
> example), but what you are hearing people say is that
>
> arm7+IPblock1+IPblock2
> arm7+IPblock1+IPblock3
> arm7+IPblock2+IPblock3
>
> are not three different architectures, they are one architecture with
> different devices attached.

Wrong. Let's take an example. If you have an OMAP SoC with ARMv7 + GIC +
OMAP timer, and another SoC (eg, MSM) with GIC + their own timer, then
the common code will be used to support ARMv7 on both SoCs.

The common GIC support code will be used to talk to the GIC interrupt
controller.

The OMAP timer code will be used to handle the OMAP timer, and the MSM
timer code will be used to handle the MSM timer.

We're not crazy. We don't have N sets of code implementing support for
the GIC interrupt controller. Same happens for the VIC code - we have
common code supporting VIC implementations across the different SoCs
which have a VIC.

But the GIC is a totally different beast to the VIC. The GIC is SMP
capable and has two software interfaces - the CPU local part and the
CPU global part. The VIC doesn't have any of that as it is UP only.
They function entirely differently. How can you have some common code
to support both of those?
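
A minimal sketch of that split, assuming made-up names throughout (none of
these structures are real kernel interfaces): the interrupt controller code
exists once per controller type and is shared, while each SoC brings its own
timer. The "up-soc" entry is a hypothetical VIC-based SoC added only to show
that a different controller needs a different implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; none of them are real kernel interfaces. */
struct irqc_ops {
    const char *name;
    int smp_capable;                    /* GIC: yes, VIC: no (UP only) */
};

struct timer_ops {
    const char *name;
};

/* One implementation per interrupt controller type, shared by every SoC. */
const struct irqc_ops gic_ops = { "gic", 1 };
const struct irqc_ops vic_ops = { "vic", 0 };

/* Timers are SoC specific, so each SoC supplies its own. */
const struct timer_ops omap_timer_ops = { "omap-timer" };
const struct timer_ops msm_timer_ops  = { "msm-timer" };
const struct timer_ops up_timer_ops   = { "up-timer" };

struct soc_desc {
    const char *name;
    const struct irqc_ops *irqc;
    const struct timer_ops *timer;
};

const struct soc_desc socs[] = {
    { "omap",   &gic_ops, &omap_timer_ops },
    { "msm",    &gic_ops, &msm_timer_ops },
    { "up-soc", &vic_ops, &up_timer_ops },  /* hypothetical VIC-based SoC */
};

/* Return the descriptor for a named SoC, or NULL if it is unknown. */
const struct soc_desc *soc_lookup(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(socs) / sizeof(socs[0]); i++)
        if (strcmp(socs[i].name, name) == 0)
            return &socs[i];
    return NULL;
}
```

In this sketch, looking up "omap" and "msm" yields the same irqc pointer but
different timer pointers, which is the sharing being described.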

> what's more, you seem to be saying that
>
> arm7+IPblock1
>
> and
>
> arm7+IPblock1
>
> are different architectures if the wiring between the arm core and
> IPblock1 are different (they are different 'boards' or different chip
> models, possibly from different manufacturers)

Over the years in which I was overseeing platform support, I tried to ensure
as much sharing of code as possible across different platforms. I no longer
oversee platform-specific stuff, and so it's entirely possible that several
SoCs have the same IP block but their own code to drive it.

That's where Thomas is right - we need a team of people to provide
review of that to catch it and get it consolidated. Such a team would
need funding. Where does that funding come from? I've no idea.
This review issue is something which I've been on about for the last
ten years, and it hasn't really got much better during that time.

We also need the various SoC designers and ARM architecture people to
realise that the hardware situation is ridiculous; I have commented
about this lack of standardisation to ARM in past years. ARM have had
a standard set of peripherals for ten years, but the SoC people haven't
really taken them up - and when they do, they seem to always introduce
their own tweaks, sometimes with no way to detect those tweaks.

So far, I've avoided merging code to change the way the driver support
works for those peripherals to allow the platform level code to describe
those differences because I don't like it. It sounds like I should
continue to avoid merging it and, hopefully, it'll just go away.

2011-03-31 08:31:18

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 10:06:34AM +0200, Ingo Molnar wrote:
> Having strong, effective platform abstractions inside the kernel really helps
> even if the hardware space itself is inevitably fragmented: both powerpc and
> x86 have shown that. Until you realize and appreciate that, you really have not
> understood the problem i think.

No, I think it is the other way around. Folk like me and Nicolas over
the last ten years have put considerable amounts of effort into trying
to keep the ARM support code as clean and maintainable as possible.

That is true of the common ARM stuff, but there's no way we can do this
for all SoC support - there aren't the hours in the day to provide such
a wide oversight. That's why we have SoC maintainers, and the SoC
maintainers have the responsibility to sort out their own sub-trees.

2011-03-31 09:55:17

by Alan

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

> Absolutely. On Intel, it is (still) Windows the reference. If Windows
> doesn't boot on your motherboard you have a problem. So motherboard
> vendors won't make crazy incompatible things. They are constrained to

OLPC, Moorestown ?

> fix their hardware because they just cannot alter Windows to suit their
> hardware differences. That really helps keeping actual differences to a
> minimum and only to things that are not fundamental. So Windows really
> helped making a uniform hardware platform on X86.

That and the fact the Microsoft driver validation has driven a lot of
standardisation along the "we could write a driver and go through WHQL
and ... and ..." or we could just use a standard interface.

Thing is though - the x86 platform does change, modern PC systems are
very different to old ones (different IRQ controllers, different power
management, different processor features, different bus interfaces,
different firmware, ...) but someone bothered to make these
*discoverable*.

If I boot a Linux kernel on an AMD K6 I'm running with an 8259 interrupt
controller, 8254/5 supporting I/O, a PC style keyboard controller,
graphics via PCI or maybe AGP using memory on the card mostly, a one
command at a time ATA interface based on WD1010 registers, APM based
firmware that implements an extended version of the PC BIOS.

If I boot it on a current PC I'm booting on a multiprocessor system with
different timers, totally different IRQ controllers, different keyboard
controllers (USB), PCI Express, an IOMMU, NCQ SATA, ACPI, graphics
running in shared host memory able to give/take pages from the host,
extra instructions, etc etc

And the same kernel boots just fine on both.
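
A rough sketch of why discoverability makes that possible, assuming a
PCI-style ID match (the IDs, driver names and helper functions here are all
invented for illustration): the kernel carries a table of drivers, the
hardware reports what is present, and only matching drivers bind. Nothing
about the machine is hard-coded in a board file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs and driver names, loosely modelled on PCI-style matching. */
struct dev_id {
    uint16_t vendor;
    uint16_t device;
};

struct sketch_driver {
    const char *name;
    struct dev_id id;
};

const struct sketch_driver sketch_drivers[] = {
    { "legacy-irq", { 0x1111, 0x0001 } },
    { "modern-irq", { 0x1111, 0x0002 } },
    { "ncq-sata",   { 0x2222, 0x0010 } },
};

/* Return the name of the driver matching a discovered ID, or NULL. */
const char *match_driver(struct dev_id found)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_drivers) / sizeof(sketch_drivers[0]); i++)
        if (sketch_drivers[i].id.vendor == found.vendor &&
            sketch_drivers[i].id.device == found.device)
            return sketch_drivers[i].name;
    return NULL;
}

/* Bind every discovered device that has a driver; unknown devices are skipped. */
size_t bind_all(const struct dev_id *found, size_t n, const char **bound)
{
    size_t i, count = 0;

    assert(found != NULL && bound != NULL);
    for (i = 0; i < n; i++) {
        const char *name = match_driver(found[i]);

        if (name != NULL)
            bound[count++] = name;
    }
    return count;
}
```

The same table serves an old machine that reports one device and a new one
that reports several; devices without a driver are simply left unbound.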

2011-03-31 10:12:09

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Nicolas Pitre wrote:
> On Wed, 30 Mar 2011, Linus Torvalds wrote:
> > And ARM fanbois can say "oh, but arm is special" all they want, but
> > they need to realize that the lack of common platform for ARM is a
> > real major issue. It's not a "feature", and I'm sorry, but anybody who
> > calls x86 "peanuts" is a moron and should be ashamed of himself.
> > Instead of trying to feel superior, those people should feel like
> > pariah.
>
> Oh come on. You just provided actual numbers above showing that ARM is
> simply fscked up (your words) compared to X86. I would be curious to
> know what people like tglx who did significant work on both
> architectures actually think of X86 relative to ARM when it comes to
> kernel maintenance.

To be honest both suck in their own way. The only reason why x86 is
slightly less horrible is the fact that it's better architectured at
the hardware level.

But I see the same mess coming in with all those wonderful Atom based
SoCs on the horizon, which are nothing else than any other random ARM
SoC, just that they glue an x86 core into the same cheepo random IP
peripherals conglomerate. In fact some of those chips have been ARM
powered before they got an x86 injected.

And worse: the Intel folks went there and wrote a new driver for an IP
block which had already an "ARM associated" driver.

So I say that it is not only an ARM problem, it's a general problem
that people do not realize that the IP cores are reused all over the
place and across architectures.

I'm pretty sure after I went through all the irq code recently that
lots of those ARM SoCs from vendors across the board could share a lot
of driver code if someone would actually sit down and analyse the
situation. Right now we have nobody who has the time and the stomach
to go through this and at the same time prevent that more copied
crappola is hitting the tree.

I'm sure that device tree is part of the solution, but that only helps
if we find a way to prevent duplicate drivers in the first place.

Thanks,

tglx

2011-03-31 10:42:15

by Ingo Molnar

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* Russell King - ARM Linux <[email protected]> wrote:

> On Thu, Mar 31, 2011 at 10:06:34AM +0200, Ingo Molnar wrote:
> > Having strong, effective platform abstractions inside the kernel really helps
> > even if the hardware space itself is inevitably fragmented: both powerpc and
> > x86 have shown that. Until you realize and appreciate that, you really have not
> > understood the problem i think.
>
> No, I think it is the other way around. Folk like me and Nicolas over the
> last ten years have put considerable amounts of effort into trying to keep
> the ARM support code as clean and maintainable as possible.

Absolutely no argument about that, whenever i have read core ARM code it was
always a pleasure. You guys are doing a fine job there.

What i argued with was what Nicolas said:

> > > back in the early days of the PCs, different systems from different
> > > vendors had different bus types, peripherals at different addresses,
> > > etc. that didn't make all of those vendors' systems different
> > > architectures, instead those things were variants of the x86
> > > architecture.
> >
> > Most of them didn't survive. That really helps.

It does not matter whether hardware survives or not - most pieces of hardware
do not survive. What matters is whether the inevitable hardware-churn is
allowed to litter the kernel tree with unmaintainable pieces of crap.

You even mention that it's not maintainable to you:

> That is true of the common ARM stuff, but there's no way we can do this for
> all SoC support - there aren't the hours in the day to provide such a wide
> oversight. [...]

The problem is the solution:

> That's why we have SoC maintainers, and the SoC maintainers have the
> responsibility to sort out their own sub-trees.

... which sets the wolves to mind the sheep, so to say. Self-oversight never
worked very well (unless you believe in perpetual bank bailouts).

So Linus and Thomas (with the genirq hat on) are pushing back, because both of
them feel affected negatively by crap.

"All is fine" or "it's just natural" do not seem like the right answers to
those concerns.

Thanks,

Ingo

2011-03-31 10:49:26

by Felipe Balbi

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Hi,

On Thu, Mar 31, 2011 at 09:09:54AM +0100, Russell King - ARM Linux wrote:
> > what's more, you seem to be saying that
> >
> > arm7+IPblock1
> >
> > and
> >
> > arm7+IPblock1
> >
> > are different architectures if the wiring between the arm core and
> > IPblock1 are different (they are different 'boards' or different chip
> > models, possibly from different manufacturers)

This is utter BS, see e.g. that the same MUSB driver is re-used on OMAP,
on an external discrete chip TUSB6010, on DaVinci, on Blackfin, on
UX500, etc. The exact same driver is re-used on all those situations
with a little platform glue layer. We can't live without that small glue
layer for each different platform though and they sum up to 2600+ lines
of code (all different platform glues).

It's a pain to keep the core code generic enough so that it's useful in
all those cases, especially because, between OMAP and AM35x, even the
register file of that particular IP block is different. Still, we have
people working to keep the IP block drivers generic enough to be re-used
in several situations.
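
To make the "core driver plus small glue layer" idea concrete, here is a
minimal sketch. It is not the real MUSB code: the structure, the function
names and the 0x400 offset are all invented for illustration. The core read
path is written once, and the glue only tells it where a generic register
lives on a given platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical glue interface; these are not the real MUSB structures. */
struct glue_ops {
    const char *platform;
    /* Map a generic register offset (in bytes) to this platform's layout. */
    uint32_t (*reg_offset)(uint32_t generic_reg);
};

/* Platform A: registers sit at their generic offsets. */
uint32_t flat_offset(uint32_t reg)
{
    return reg;
}

/* Platform B: the same registers, moved up by an invented 0x400 bytes. */
uint32_t shifted_offset(uint32_t reg)
{
    return 0x400 + reg;
}

const struct glue_ops omap_glue  = { "omap",  flat_offset };
const struct glue_ops am35x_glue = { "am35x", shifted_offset };

/* Core code: identical for every platform, only the glue differs. */
uint32_t core_read(const struct glue_ops *glue, const uint32_t *regs,
                   uint32_t reg)
{
    assert(glue != NULL && regs != NULL);
    return regs[glue->reg_offset(reg) / 4];
}
```

Each additional platform costs one more small glue table, which is roughly
where the per-platform line counts mentioned above come from.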

> Over the years in which I was overseeing platform support, I tried to ensure
> as much sharing of code as possible across different platforms. I no longer
> oversee platform-specific stuff, and so it's entirely possible that several
> SoCs have the same IP block but their own code to drive it.
>
> That's where Thomas is right - we need a team of people to provide
> review of that to catch it and get it consolidated. Such a team would
> need funding. Where does that funding come from? I've no idea.

Fully agree with you Russell.

> We also need the various SoC designers and ARM architecture people to
> realise that the hardware situation is ridiculous; I have commented
> about this lack of standardisation to ARM in past years. ARM have had
> a standard set of peripherals for ten years, but the SoC people haven't
> really taken them up - and when they do, they seem to always introduce
> their own tweaks, sometimes with no way to detect those tweaks.

For sure that's happening, but should we prevent ARM vendors from adding
their tweaks? Like Nicolas said, that's fuel for innovation.

--
balbi

2011-03-31 10:50:39

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 10:54:40AM +0100, Alan Cox wrote:
> If I boot it on a current PC I'm booting on a multiprocessor system with
> different timers, totally different IRQ controllers, different keyboard
> controllers (USB), PCI Express, an IOMMU, NCQ SATA, ACPI, graphics
> running in shared host memory able to give/take pages from the host,
> extra instructions, etc etc
>
> And the same kernel boots just fine on both.

We've been there for a long time with ARM. Right from the start I had
a single kernel image which booted over a range of ARM CPUs and
platforms.

As far as ARM CPU architectures go, today we can have a single kernel
image which covers ARMv3 to ARMv5, and a separate kernel image which
covers ARMv6 to ARMv7 including SMP and UP variants. The thing which
currently stops ARMv3 to ARMv7 all together is the different page table
layouts, the ASID tagging, the exclusive load/store support for cmpxchg
and other atomic operations, etc.

I wouldn't want to try to patch out the exclusive load/store operations
with some kind of function call to one of the generic implementations in
asm-generic as that gets you into ABI problems with GCC - it'd mean having
to tell GCC that various registers are clobbered all over the place.

With page tables, we can use the old format for ARMv5 with ARMv6 and
later, but that means we lose stuff like NX support to prevent instruction
prefetches hitting devices, which is of course a problem if you have
read-sensitive registers such as FIFOs there.

Can an x86 kernel with PAE support run on an x86 without PAE support?
The differences between ARMv5 and ARMv6 are much like PAE.

Outside of the CPU architecture, things become a lot more complicated.
The biggest one up until this merge window was that there is no fixed
address for system RAM, which makes stuff like virt_to_phys() rather
horrible to deal with - which in turn makes setting up and walking page
tables a nightmare. We've just solved that issue with run-time patching
of the kernel code to replace the add/sub instructions with ones with
the appropriate offset, so we're a step closer to unifying everything
into one single kernel image. This work alone produced this diffstat:

87 files changed, 450 insertions(+), 190 deletions(-)

so it actually resulted in a net increase in the amount of code to be
maintained rather than reducing it. That's hardly surprising as what
that replaced was just a bunch of #define's for PHYS_OFFSET with some
complex assembly code to do run-time patching of instructions.
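
As a simplified illustration of the difference, assuming a 0xC0000000
kernel virtual base and hypothetical function names: the old scheme fixes
the RAM base at build time with a #define, while the sketch below keeps it
in a variable set at boot. The real work patches the add/sub instructions
in place rather than reading a variable, so this only shows the arithmetic,
not the patching mechanism.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_OFFSET 0xC0000000u  /* assumed kernel virtual base */

/*
 * Old style, fixed per platform at build time:
 *     #define PHYS_OFFSET 0x80000000u
 * New style (sketched): the RAM base is only known once we are running.
 */
static uint32_t sketch_phys_offset;

/* Record where system RAM starts on the machine we actually booted on. */
void sketch_set_phys_offset(uint32_t ram_base)
{
    sketch_phys_offset = ram_base;
}

/* Translate a kernel virtual address to a physical address. */
uint32_t sketch_virt_to_phys(uint32_t virt)
{
    assert(virt >= SKETCH_PAGE_OFFSET);
    return virt - SKETCH_PAGE_OFFSET + sketch_phys_offset;
}

/* Translate a physical address back to a kernel virtual address. */
uint32_t sketch_phys_to_virt(uint32_t phys)
{
    assert(phys >= sketch_phys_offset);
    return phys - sketch_phys_offset + SKETCH_PAGE_OFFSET;
}
```

One image can then serve a platform with RAM at 0x80000000 and another with
RAM at 0x00000000; only the value passed to sketch_set_phys_offset() changes.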

The barriers against a single kernel image are being worked on, and it's
actually one of the things which Linaro is actively tasked to achieve.

One thing which I've been working on over the last six months is to
unify some of the ARM platforms which I use for testing the kernel on,
and I'd like to see one kernel image booting on all of those.

65 files changed, 1168 insertions(+), 1752 deletions(-)

Given this thread, I've lost the motivation to continue with it because
it's just going to cause more 'pointless churn' and end up annoying
Linus even more.

And I'm not going to be merging anything into my tree for the time being.
I know there's no way for me to continue without being moaned at by someone.
So I'm just going to take the easy option at the moment and do precisely
nothing in terms of queueing patches until something gets resolved one way
or the other. I'm not even going to review any patches because I currently
see it as a total waste of time - I've no idea whether they'll stand any
chance what so ever of making it into mainline.

What's the way out of this? I've no idea. Can ARM continue being part
of the mainline kernel? I've no idea. Will we be ripping out all the
ARM platform code from the mainline kernel? I've no idea.

I am now completely demotivated.

2011-03-31 11:02:23

by Artem Bityutskiy

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 2011-03-30 at 23:10 +0200, Thomas Gleixner wrote:
> The only problem is to find a person, who is willing to do that, has
> enough experience, broad shoulders and a strong accepted voice. Not to
> talk about finding someone who is willing to pay a large enough
> compensation for pain and suffering.

If I understand correctly, this is exactly what Linaro would need to do.

--
Best Regards,
Artem Bityutskiy (Артём Битюцкий)

2011-03-31 11:27:14

by Felipe Balbi

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Hi,

On Wed, Mar 30, 2011 at 08:24:30PM -0700, Linus Torvalds wrote:
> So let's take a really simple example of this kind of crap.
>
> Do this:
>
> git ls-files arch/arm/ | grep gpio
>
> and cry. That's 145 files in the arm directory that are some kind of
> crazy gpio support.

Most likely those are remnants from the times when we didn't have
gpiolib. It's not long ago (about 2 years) that David Brownell (rip, my
friend) introduced it to the kernel. So those should be converted to
gpiolib and moved to drivers/gpio/, but it doesn't mean we don't need
that piece of code.

With different register layout, register offsets, register sizes, etc,
it would be way uglier to combine all those into one big hacky driver.
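
A small sketch of what that looks like, assuming a simplified chip
structure that is only loosely inspired by gpiolib (the struct, the two
register layouts and every function name here are made up): each controller
keeps its own small driver for its own layout, and consumers only ever call
the common entry points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified chip abstraction; not the real gpiolib API. */
struct sketch_gpio_chip {
    uint32_t regs[4];   /* stand-in for the controller's register file */
    void (*set)(struct sketch_gpio_chip *chip, unsigned int pin, int value);
    int (*get)(struct sketch_gpio_chip *chip, unsigned int pin);
};

/* Layout A: a single read/write data register at index 0. */
void a_set(struct sketch_gpio_chip *chip, unsigned int pin, int value)
{
    if (value)
        chip->regs[0] |= 1u << pin;
    else
        chip->regs[0] &= ~(1u << pin);
}

int a_get(struct sketch_gpio_chip *chip, unsigned int pin)
{
    return (chip->regs[0] >> pin) & 1u;
}

/* Layout B: write-1-to-set at index 1, write-1-to-clear at index 2,
 * pin state read back from index 3. */
void b_set(struct sketch_gpio_chip *chip, unsigned int pin, int value)
{
    uint32_t mask = 1u << pin;

    chip->regs[value ? 1 : 2] = mask;   /* the write the hardware would see */
    if (value)                          /* emulate the hardware's reaction */
        chip->regs[3] |= mask;
    else
        chip->regs[3] &= ~mask;
}

int b_get(struct sketch_gpio_chip *chip, unsigned int pin)
{
    return (chip->regs[3] >> pin) & 1u;
}

/* Common consumer-facing calls: no knowledge of either layout. */
void sketch_gpio_set_value(struct sketch_gpio_chip *chip, unsigned int pin,
                           int value)
{
    assert(chip != NULL && pin < 32);
    chip->set(chip, pin, value);
}

int sketch_gpio_get_value(struct sketch_gpio_chip *chip, unsigned int pin)
{
    assert(chip != NULL && pin < 32);
    return chip->get(chip, pin);
}
```

The two per-layout drivers stay separate and small, which is the
alternative to one big driver full of special cases.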

--
balbi

2011-03-31 12:04:38

by Jean-Christophe PLAGNIOL-VILLARD

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:

> On Thu, Mar 31, 2011 at 10:06:34AM +0200, Ingo Molnar wrote:
> > Having strong, effective platform abstractions inside the kernel really helps
> > even if the hardware space itself is inevitably fragmented: both powerpc and
> > x86 have shown that. Until you realize and appreciate that, you really have not
> > understood the problem i think.
>
> No, I think it is the other way around. Folk like me and Nicolas over
> the last ten years have put considerable amounts of effort into trying
> to keep the ARM support code as clean and maintainable as possible.
>
> That is true of the common ARM stuff, but there's no way we can do this
> for all SoC support - there aren't the hours in the day to provide such

That's what I said. You and Nicolas won't scale.

> a wide oversight. That's why we have SoC maintainers, and the SoC
> maintainers have the responsibility to sort out their own sub-trees.

But the current SoC maintainer model does not work either. The SoC
maintainers care about their sandbox and have exactly zero incentive
to look at the overall picture, e.g reuse of code for the same IP
blocks, better abstraction mechanisms etc.

Therefore you need a team of experienced kernel developers who are
NOT associated with a particular vendor, who are able to tame that SoC
crowd and work closely with you and Nicolas to keep stuff in sync.

Thanks,

tglx

2011-03-31 12:37:46

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On 11:50 Thu 31 Mar , Russell King - ARM Linux wrote:
> On Thu, Mar 31, 2011 at 10:54:40AM +0100, Alan Cox wrote:
> > If I boot it on a current PC I'm booting on a multiprocessor system with
> > different timers, totally different IRQ controllers, different keyboard
> > controllers (USB), PCI Express, an IOMMU, NCQ SATA, ACPI, graphics
> > running in shared host memory able to give/take pages from the host,
> > extra instructions, etc etc
> >
> > And the same kernel boots just fine on both.
>
> We've been there for a long time with ARM. Right from the start I had
> a single kernel image which booted over a range of ARM CPUs and
> platforms.
>
> As far as ARM CPU architectures go, today we can have a single kernel
> image which covers ARMv3 to ARMv5, and a separate kernel image which
> covers ARMv6 to ARMv7 including SMP and UP variants. The thing which
> currently stops ARMv3 to ARMv7 all together is the different page table
> layouts, the ASID tagging, the exclusive load/store support for cmpxchg
> and other atomic operations, etc.

As we can see, a lot of people are working on this, so that we no longer add
thousands of boards but try to have only a few.

Personally I do it on at91, as an example, and will continue to try to have one
board in the kernel with board information passed via Barebox, when it's possible.
I think it's a common effort made by the ARM community, and this will imply
a lot of changesets.

The work done by Linaro with the device tree will help a lot to simplify
passing the information from the boot loader to the kernel. But we can already
do it today.

Best Regards,
J.

2011-03-31 12:38:38

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 2011-03-31 at 11:50 +0100, Russell King - ARM Linux wrote:
> Given this thread, I've lost the motivation to continue with it because
> it's just going to cause more 'pointless churn' and end up annoying
> Linus even more.

I don't think the criticism was directed at the core ARM code that you
maintain (Ingo and others even praised it). I also don't think that you
stopping maintaining it would help in any way with this situation.

We probably shouldn't take criticism personally. Linus has some points
which the ARM community is aware of already since there is ongoing work
for consolidating the platform code (recent v2p patches, SMP-on-UP, FDT
and probably more will come) only that this won't happen overnight. If
you stop merging any of these, there's definitely no way out (other than
doing the work separately for the next two years and replacing the
arch/arm in a single pull request).

--
Catalin

2011-03-31 13:02:37

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 01:38:21PM +0100, Catalin Marinas wrote:
> On Thu, 2011-03-31 at 11:50 +0100, Russell King - ARM Linux wrote:
> > Given this thread, I've lost the motivation to continue with it because
> > it's just going to cause more 'pointless churn' and end up annoying
> > Linus even more.
>
> I don't think the criticism was directed at the core ARM code that you
> maintain (Ingo and others even praised it). I also don't think that you
> stopping maintaining it would help in any way with this situation.
>
> We probably shouldn't take criticism personally. Linus has some points
> which the ARM community is aware of already since there is ongoing work
> for consolidating the platform code (recent v2p patches, SMP-on-UP, FDT
> and probably more will come) only that this won't happen overnight. If
> you stop merging any of these, there's definitely no way out (other than
> doing the work separately for the next two years and replacing the
> arch/arm in a single pull request).

But are we going to be allowed to continue this effort without being
constantly blamed for "pointless churn" all the time? I don't think
so, so it may well be better to give up with pushing stuff into mainline
for two years, and then do a massive re-merge as a single major "replace
everything".

I don't like the idea, but I don't see much alternative.

And since Linus' whinge about ARM defconfigs, I really *hate* merging
anything with *any* defconfig changes in - as a result, I don't
particularly want to deal with ARM defconfig changes anymore. I'm sure
they'll make Linus explode about it again in the near future. That's
why this time around, I kept them in a separate branch in case Linus
refused to pull them.

And again, as a result of this thread I've given up for the time being
on the idea of continuing to consolidate the ARM Integrator/Versatile/
Realview/Versatile Express code. I just don't see the point of wasting
time trying to consolidate stuff if it's just going to be used against
us in terms of diffstat percentages and churn complaints.

Just look at the removal of AAEC2000, LH7A40x and 2000 lines from the
mach-types file removed 6000 lines, which in itself is about the number
of lines of change submitted during the last merge window for any one
non-ARM architecture. At this point in time with this complaint, I've
absolutely no idea why I bothered to do that. I should've left it well
alone and then the diffstat percentage would've been smaller. After
all, it's "pointless churn".

Yes, I'm severely hacked off and fed up with this. Whatever we do will
ultimately be used against us in one way or another.

2011-03-31 13:26:21

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 12:41:52PM +0200, Ingo Molnar wrote:
> * Russell King - ARM Linux <[email protected]> wrote:
> > On Thu, Mar 31, 2011 at 10:06:34AM +0200, Ingo Molnar wrote:
> > > Having strong, effective platform abstractions inside the kernel really helps
> > > even if the hardware space itself is inevitably fragmented: both powerpc and
> > > x86 have shown that. Until you realize and appreciate that, you really have not
> > > understood the problem i think.
> >
> > No, I think it is the other way around. Folk like me and Nicolas over the
> > last ten years have put considerable amounts of effort into trying to keep
> > the ARM support code as clean and maintainable as possible.
>
> Absolutely no argument about that, whenever i have read core ARM code it was
> always a pleasure. You guys are doing a fine job there.

Thanks for your vote of confidence. It's really appreciated.

2011-03-31 13:39:54

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 30 Mar 2011, Linus Torvalds wrote:

> On Wed, Mar 30, 2011 at 6:15 PM, Bill Gatliff <[email protected]> wrote:
> >
> > I'm not sure this metric is completely fair to ARM. If you want to
> > level the field, I think you have to divide each result by the number
> > of SoC's
>
> But that's the problem with ARM. Hardware companies that do one-off
> stuff, with no sense of compatibility.
>
> And calling it an "opportunity" is just stupid.
>
> There's nothing good about causing extra work just because ARM hasn't
> had the sense to standardize on one (or a couple) of interrupt
> controllers etc.

Well, ARM is not the only one there, it's the top one, but mips and
power are not less crazy. Here are the top five architectures counted
in number of irq_chip implementations.

arch/arm 139
arch/mips 75
arch/powerpc 44
arch/alpha 21
arch/x86 15

Thanks,

tglx

2011-03-31 13:54:42

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thursday 31 March 2011, Artem Bityutskiy wrote:
> On Wed, 2011-03-30 at 23:10 +0200, Thomas Gleixner wrote:
> > The only problem is to find a person, who is willing to do that, has
> > enough experience, broad shoulders and a strong accepted voice. Not to
> > talk about finding someone who is willing to pay a large enough
> > compensation for pain and suffering.
>
> If I understand correctly, this is exactly what Linaro would need to do.

Absolutely. Getting the work done that each of the SoC vendors needs but
none of them would do on their own is the main reason why Linaro is there,
so if we can agree what needs to be done, I'm sure we can find someone
to do it.

Arnd

2011-03-31 13:54:53

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:
> And I'm not going to be merging anything into my tree for the time being.
> I know there's no way for me to continue without being moaned at by someone.
> So I'm just going to take the easy option at the moment and do precisely
> nothing in terms of queueing patches until something gets resolved one way
> or the other. I'm not even going to review any patches because I currently
> see it as a total waste of time - I've no idea whether they'll stand any
> chance what so ever of making it into mainline.
>
> What's the way out of this? I've no idea. Can ARM continue being part
> of the mainline kernel? I've no idea. Will we be ripping out all the
> ARM platform code from the mainline kernel? I've no idea.

Moving arm out of mainline would be a complete disaster.

> I am now completely demotivated.

I can understand that, but I have to say again, that you do a pretty
damned good job on keeping the ARM core code clean and I always enjoy
working with you when infrastructure I'm working on needs to deal with
ARM requirements. We disagree from time to time, but I don't remember
a single incident where we could not resolve that on a technical base.

It's not your fault, that you can't clone yourself 10 times to deal
with the subarch flood. It's also not your fault that the subarch
maintainers tend not to look over the brim of their SoC sandbox and
figure out how to solve the big picture.

So no, ARM needs to stay and the ARM crowd needs to find competent
people who babysit the SoC crowd and work with you to get the big
picture straight. Something like this should have been done _before_
the wave of ARM spilling out of every other silicon shop became
tsunami sized. But not doing it at all will not solve anything.

Thanks,

tglx

2011-03-31 14:43:20

by Kevin Hilman

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Thomas Gleixner <[email protected]> writes:

> But the current SoC maintainer model does not work either. The SoC
> maintainers care about their sandbox and have exactly zero incentive
> to look at the overall picture, e.g reuse of code for the same IP
> blocks, better abstraction mechanisms etc.

zero incentive? that's a bit strong, IMO.

That may be true for some SoCs, it's not really fair as a sweeping
statement.

Some SoC families (like OMAP) have a huge amount of diversity even within
the SoC family, so better abstractions and generic infrastructure
improvements are an obvious win, even staying within the SoC.

There are several examples of SoC maintainers looking at the overall
picture and contributing to better abstractions and common
infrastructure code.

One is USB as Felipe already pointed out where the same USB OTG IP block
(with vendor tweaks of course) is used across several completely
different SoCs with common infrastructure code.

Another example that I'm more familiar with is power management. In
OMAP land, we have been very supportive and active in generic
infrastructure improvements (like runtime PM). In fact, runtime PM was
born partially because one of the other ARM SoC maintainers (Magnus
Damm, SH-mobile) proposed the idea as he was implementing PM for that
SoC family. We have been actively contributing to the runtime PM
infrastructure with code and testing, converting our drivers over to
using runtime PM, and contributing back fixes and enhancements as we
find problems or limitations. In addition, personally, I have spent the
last year evangelizing the importance of using common frameworks like
runtime PM to the embedded community via talks at the Embedded Linux
Conference (ELC, US and Europe.) Especially as IP blocks are reused
across SoC families, abstractions like runtime PM are the only way to
keep the SoC specifics of PM out of the common driver.
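
As a rough sketch of that separation (hypothetical names only; this is not
the kernel's runtime PM API): the common driver brackets hardware access
with a get/put pair, a usage count decides when power actually changes, and
the SoC-specific part lives entirely behind two callbacks.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_device;

/* SoC-specific hooks; the common driver never looks inside these. */
struct sketch_pm_ops {
    void (*power_on)(struct sketch_device *dev);
    void (*power_off)(struct sketch_device *dev);
};

struct sketch_device {
    int usage_count;                    /* outstanding get() calls */
    int powered;                        /* stand-in for real clock/power state */
    const struct sketch_pm_ops *ops;
};

/* One hypothetical SoC's implementation of the hooks. */
void soc_power_on(struct sketch_device *dev)
{
    dev->powered = 1;
}

void soc_power_off(struct sketch_device *dev)
{
    dev->powered = 0;
}

const struct sketch_pm_ops soc_pm_ops = { soc_power_on, soc_power_off };

/* Called by the common driver before touching the hardware. */
void sketch_pm_get(struct sketch_device *dev)
{
    assert(dev != NULL && dev->ops != NULL);
    if (dev->usage_count++ == 0)
        dev->ops->power_on(dev);        /* first user powers the block up */
}

/* Called by the common driver once it is done with the hardware. */
void sketch_pm_put(struct sketch_device *dev)
{
    assert(dev != NULL && dev->ops != NULL && dev->usage_count > 0);
    if (--dev->usage_count == 0)
        dev->ops->power_off(dev);       /* last user powers it down */
}
```

A driver shared across SoC families would then differ only in which ops
table its device is given.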

Yes, ARM SoC maintainers have to make up some ground. But compare this
to just a couple years ago where the common complaint was "why aren't
embedded SoC people contributing code to mainline", and you'll see we
have come a long way.

Kevin

Maintainer of parts of the ARM kernel:
- TI Davinci SoC family
- TI OMAP Power Management infrastructure

2011-03-31 14:55:38

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Russell:

On Thu, Mar 31, 2011 at 8:01 AM, Russell King - ARM Linux
<[email protected]> wrote:
>
> And since Linus' whinge about ARM defconfigs, I really *hate* merging
> anything with *any* defconfig changes in - as a result, I don't
> particularly want to deal with ARM defconfig changes anymore.

I think the defconfigs are as important as the code itself! Sure there
is a lot of churn in them--- just let git deal with it and move on.

The defconfigs and board code provide critical reference models for
new platform developers. If these models aren't right, then you end
up providing bad references that get replicated over and over again.
It multiplies the problem we're trying to solve.

> Just look at the removal of AAEC2000, LH7A40x and 2000 lines from the
> mach-types file removed 6000 lines, which in itself is about the number
> of lines of change submitted during the last merge window for any one
> non-ARM architecture. At this point in time with this complaint, I've
> absolutely no idea why I bothered to do that. I should've left it well
> alone and then the diffstat percentage would've been smaller. After
> all, it's "pointless churn".

I think you did it because it was the Right Thing To Do. Even
positive change can be painful at times.

The majority is exceedingly grateful for the effort you make.

b.g.

--
Bill Gatliff
[email protected]

2011-03-31 15:02:00

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Kevin Hilman wrote:

> Thomas Gleixner <[email protected]> writes:
>
> > But the current SoC maintainer model does not work either. The SoC
> > maintainers care about their sandbox and have exactly zero incentive
> > to look at the overall picture, e.g. reuse of code for the same IP
> > blocks, better abstraction mechanisms etc.
>
> zero incentive? that's a bit strong, IMO.
>
> That may be true for some SoCs, it's not really fair as a sweeping
> statement.

Fair enough, but it's the perception in general.

> Conference (ELC, US and Europe.) Especially as IP blocks are reused
> across SoC families, abstractions like runtime PM are the only way to
> keep the SoC specifics of PM out of the common driver.

Right, I know that these things happen, but at the same time the sheer
amount of stuff flowing in makes it hard for this infrastructure work
to really pay off. And we are only at the beginning of the big
shuffle "code into mainline" game.

After cleaning up the whole irq stuff across the tree I can tell you
that the mess grows non-linearly with the number of instances.

You can see the patterns which are:
- copy and paste
- introduce different bugs
- add more abuse

That's what I'm really concerned about.

> Yes, ARM SoC maintainers have to make up some ground. But compare this
> to just a couple years ago where the common complaint was "why aren't
> embedded SoC people contributing code to mainline", and you'll see we
> have come a long way.

Well, code comes in, which is progress. But we need to figure out how
to deal with the increasingly growing flood before we drown in it.

Thanks,

tglx

2011-03-31 15:06:15

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 05:01:40PM +0200, Thomas Gleixner wrote:
> On Thu, 31 Mar 2011, Kevin Hilman wrote:
>
> > Thomas Gleixner <[email protected]> writes:
> >
> > > But the current SoC maintainer model does not work either. The SoC
> > > maintainers care about their sandbox and have exactly zero incentive
> > > to look at the overall picture, e.g. reuse of code for the same IP
> > > blocks, better abstraction mechanisms etc.
> >
> > zero incentive? that's a bit strong, IMO.
> >
> > That may be true for some SoCs, it's not really fair as a sweeping
> > statement.
>
> Fair enough, but it's the perception in general.
>
> > Conference (ELC, US and Europe.) Especially as IP blocks are reused
> > across SoC families, abstractions like runtime PM are the only way to
> > keep the SoC specifics of PM out of the common driver.
>
> Right, I know that these things happen, but at the same time the sheer
> amount of stuff flowing in makes it hard for this infrastructure work
> to really pay off. And we are only at the beginning of the big
> shuffle "code into mainline" game.
>
> After cleaning up the whole irq stuff across the tree I can tell you
> that the mess grows non-linearly with the number of instances.
>
> You can see the patterns which are:
> - copy and paste
> - introduce different bugs
> - add more abuse
>
> That's what I'm really concerned about.
>
> > Yes, ARM SoC maintainers have to make up some ground. But compare this
> > to just a couple years ago where the common complaint was "why aren't
> > embedded SoC people contributing code to mainline", and you'll see we
> > have come a long way.
>
> Well, code comes in, which is progress. But we need to figure out how
> to deal with the increasingly growing flood before we drown in it.

How about we declare the remainder of this cycle and the next merge window
as being only for bug and regression fixes, and consolidation of stuff like
the IRQ controller and GPIO controller code for the next merge window?

2011-03-31 15:24:01

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thursday 31 March 2011, Kevin Hilman wrote:
> Some SoCs families (like OMAP) have huge amount of diversity even within
> the SoC family, so better abstractions and generic infrastructure
> improvements are an obvious win, even staying within the SoC.

But that's the point. The incentive is there for managing the infrastructure
within the SoC, but not across SoCs. Allow me to use OMAP as a bad example
while pointing out that it's really one of the best supported platforms
we currently have, while the others are usually much worse in terms of
working with the community (or at least they are behind on the learning
curve but getting there):

* OMAP2 introduced the hwmod concept as an attempt to reduce duplication
between board code, but the code was done on the mach-omap2 level
instead of finding a way to make it work across SOC vendors, or using
an existing solution.

* The IOMMU code in omap2 duplicates the API we have in the common kernel,
with slight differences, instead of using the existing code, making it
impossible to share a driver between SOC families.

* The ti-st code duplicates parts of the bluetooth layer (apparently
that is getting fixed soon).

* The DSS display drivers introduce new infrastructure, including new bus
types that have the complexity to make them completely generic, but
in practice can only work on OMAP, and are clearly not written with
cross-vendor abstractions in mind.

Arnd

2011-03-31 15:47:09

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:

>
> On Thu, Mar 31, 2011 at 05:01:40PM +0200, Thomas Gleixner wrote:
>> On Thu, 31 Mar 2011, Kevin Hilman wrote:
>>
>>> Thomas Gleixner <[email protected]> writes:
>>
>>> Yes, ARM SoC maintainers have to make up some ground. But compare this
>>> to just a couple years ago where the common complaint was "why aren't
>>> embedded SoC people contributing code to mainline", and you'll see we
>>> have come a long way.
>>
>> Well, code comes in, which is progress. But we need to figure out how
>> to deal with the increasingly growing flood before we drown in it.
>
> How about we declare the remainder of this cycle and the next merge window
> as being only for bug and regression fixes, and consolidation of stuff like
> the IRQ controller and GPIO controller code for the next merge window?

well, now that -rc1 has been released, the remainder of this cycle is
already only bug and regression fixes.

declaring the next merge window as the same may or may not help, depending
on whether it pushes people to do more consolidation work or just to delay
submitting the work they are doing.

David Lang

2011-03-31 16:04:27

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:

> On Thu, Mar 31, 2011 at 10:06:34AM +0200, Ingo Molnar wrote:
>> Having strong, effective platform abstractions inside the kernel really helps
>> even if the hardware space itself is inevitably fragmented: both powerpc and
>> x86 have shown that. Until you realize and appreciate that, you really have not
>> understood the problem, I think.
>
> No, I think it is the other way around. Folk like me and Nicolas over
> the last ten years have put considerable amounts of effort into trying
> to keep the ARM support code as clean and maintainable as possible.

In this case I owe you and Nicolas an apology.

I think that part of the issue is that when Linus points out a problem,
the response isn't "we agree and are working on it, here's what we are
doing", instead it seems to be mostly "there is no problem, this is just
because there is so much variation in ARM"

Linus does look at the code he pulls. If he is pulling changesets that are
described as consolidations and cleanups, he won't be whining about code
churn.

but if he is just pulling changesets that are described as "add support for
board X" or "modify defconfig defaults" he is going to complain.

it's not the total amount of code, and it's not even the total amount of
change to the code that's the issue. It's that the changes are conflicting
with each other (due to things like central config tables that multiple
people are updating in different ways) and the same files getting modified
frequently, many times in ways that don't seem to have a clear direction
(defconfigs for example)

David Lang

2011-03-31 16:46:44

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 09:03:28AM -0700, [email protected] wrote:
> In this case I owe you and Nicolas an apology.

Thanks.

> it's not the total amount of code, and it's not even the total amount of
> change to the code that's the issue.

I think you're not entirely correct - have a look at Linus' message where
there's a comparison of the size of arch/arm with other architectures, and
you'll find that it is partly about size of source code.

2011-03-31 16:59:12

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Arnd Bergmann wrote:

> On Thursday 31 March 2011, Kevin Hilman wrote:
> > Some SoCs families (like OMAP) have huge amount of diversity even within
> > the SoC family, so better abstractions and generic infrastructure
> > improvements are an obvious win, even staying within the SoC.
>
> But that's the point. The incentive is there for managing the infrastructure
> within the SoC, but not across SoCs. Allow me to use OMAP as a bad example
> while pointing out that it's really one of the best supported platforms
> we currently have, while the others are usually much worse in terms of
> working with the community (or at least they are behind on the learning
> curve but getting there):
>
> * OMAP2 introduced the hwmod concept as an attempt to reduce duplication
> between board code, but the code was done on the mach-omap2 level
> instead of finding a way to make it work across SOC vendors, or using
> an existing solution.
>
> * The IOMMU code in omap2 duplicates the API we have in the common kernel,
> with slight differences, instead of using the existing code, making it
> impossible to share a driver between SOC families.
>
> * The ti-st code duplicates parts of the bluetooth layer (apparently
> that is getting fixed soon).
>
> * The DSS display drivers introduce new infrastructure, including new bus
> types that have the complexity to make them completely generic, but
> in practice can only work on OMAP, and are clearly not written with
> cross-vendor abstractions in mind.

Right, but the problem starts in way simpler areas like irq chips and
gpio stuff, where lots of the IP cores are similar and trivial enough
to be shared across many SoC families.

Even the OMAP "consolidated" code is silly:

static void _set_gpio_dataout(struct gpio_bank *bank, int gpio, int enable)
{
	void __iomem *reg = bank->base;
	u32 l = 0;

	switch (bank->method) {
#ifdef CONFIG_ARCH_OMAP1
	case METHOD_MPUIO:
		reg += OMAP_MPUIO_OUTPUT / bank->stride;
		l = __raw_readl(reg);
		if (enable)
			l |= 1 << gpio;
		else
			l &= ~(1 << gpio);
		break;
#endif
#ifdef CONFIG_ARCH_OMAP15XX
	case METHOD_GPIO_1510:
		reg += OMAP1510_GPIO_DATA_OUTPUT;
		l = __raw_readl(reg);
		if (enable)
			l |= 1 << gpio;
		else
			l &= ~(1 << gpio);
		break;
#endif
#ifdef CONFIG_ARCH_OMAP16XX
	case METHOD_GPIO_1610:
		if (enable)
			reg += OMAP1610_GPIO_SET_DATAOUT;
		else
			reg += OMAP1610_GPIO_CLEAR_DATAOUT;
		l = 1 << gpio;
		break;
#endif
#if defined(CONFIG_ARCH_OMAP730) || defined(CONFIG_ARCH_OMAP850)
	case METHOD_GPIO_7XX:
		reg += OMAP7XX_GPIO_DATA_OUTPUT;
		l = __raw_readl(reg);
		if (enable)
			l |= 1 << gpio;
		else
			l &= ~(1 << gpio);
		break;
#endif
#if defined(CONFIG_ARCH_OMAP2) || defined(CONFIG_ARCH_OMAP3)
	case METHOD_GPIO_24XX:
		if (enable)
			reg += OMAP24XX_GPIO_SETDATAOUT;
		else
			reg += OMAP24XX_GPIO_CLEARDATAOUT;
		l = 1 << gpio;
		break;
#endif
#ifdef CONFIG_ARCH_OMAP4
	case METHOD_GPIO_44XX:
		if (enable)
			reg += OMAP4_GPIO_SETDATAOUT;
		else
			reg += OMAP4_GPIO_CLEARDATAOUT;
		l = 1 << gpio;
		break;
#endif
	default:
		WARN_ON(1);
		return;
	}
	__raw_writel(l, reg);
}

So we have 2 types of sections:

#1
	data = read_reg();
	if (enable)
		data |= bit;
	else
		data &= ~bit;
	write_reg(data);

#2
	if (enable)
		write_enable_reg(bit);
	else
		write_disable_reg(bit);

But the code above has 6 cases in the switch because nobody abstracted
it out consistently. Not to mention the ifdef mess.

So now look at the tons of other gpio implementations all over the DOZENS
of ARM plat-/mach- directories and guess what.

Most have either type #1 or type #2, just copied with slight differences
and abstracted more or less well, and I'm pretty damned sure that you could
consolidate all that stuff into a handful of drivers or even fewer which
provide the code across the board.
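
As a rough illustration of how type #1 and type #2 could collapse into one
helper, here is a minimal sketch. It is not the actual OMAP or gpiolib code:
the struct, the enum and the function name are all hypothetical, and the
registers are modelled as a plain array of 32-bit words instead of
__iomem accessors so the sketch can be compiled in userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a GPIO bank; not taken from any real driver. */
enum gpio_reg_type {
	GPIO_REG_RMW,		/* type #1: one data register, read-modify-write */
	GPIO_REG_SET_CLEAR,	/* type #2: separate set and clear registers */
};

struct generic_gpio_bank {
	volatile uint32_t *base;	/* start of the bank's register window */
	enum gpio_reg_type type;
	unsigned int data_off;		/* word offset of the data register (type #1) */
	unsigned int set_off;		/* word offset of the set register (type #2) */
	unsigned int clear_off;		/* word offset of the clear register (type #2) */
};

/* One function covers both register styles; the per-SoC part is only data. */
static void generic_gpio_set_dataout(struct generic_gpio_bank *bank,
				     unsigned int gpio, bool enable)
{
	uint32_t bit;

	assert(gpio < 32);
	bit = UINT32_C(1) << gpio;

	switch (bank->type) {
	case GPIO_REG_RMW: {
		uint32_t l = bank->base[bank->data_off];

		if (enable)
			l |= bit;
		else
			l &= ~bit;
		bank->base[bank->data_off] = l;
		break;
	}
	case GPIO_REG_SET_CLEAR:
		/* Writing a 1 to the set/clear register affects only that line. */
		bank->base[enable ? bank->set_off : bank->clear_off] = bit;
		break;
	}
}
```

With something along these lines, each of the six cases in the switch above
would shrink to a table entry giving the type and the offsets, and the
ifdefs would only guard data, not code.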

Same for irq chips. Most of these gpio things have callbacks which do:

irq_xxx(struct irq_data *d)
{
	gpio = irq_data_get_irq_chip_data(d);
	irq = d->irq - gpio->base_irq;
	reg = convert_to_reg(gpio, irq);
	mask = convert_to_mask(gpio);

	write(reg, mask);
}

I saw all those incarnations lately and you can boil them down to a
handful or fewer as well, which fit all over the place.
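
The callback pattern above could be shared in roughly the following way.
This is only a sketch under assumptions: the struct layout and function
names are invented for illustration, the irq number is passed directly
instead of through struct irq_data, and the registers are a plain array so
that it builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-bank description for a GPIO interrupt controller. */
struct generic_gpio_irq {
	volatile uint32_t *base;	/* start of the bank's register window */
	unsigned int base_irq;		/* first irq number handled by this bank */
	unsigned int mask_off;		/* word offset: writing a bit masks that line */
	unsigned int unmask_off;	/* word offset: writing a bit unmasks that line */
};

/* Common part: translate the irq number to a bit and write it to a register. */
static void generic_gpio_irq_write(struct generic_gpio_irq *gpio,
				   unsigned int irq, unsigned int reg_off)
{
	unsigned int line;

	assert(irq >= gpio->base_irq);
	line = irq - gpio->base_irq;
	assert(line < 32);

	gpio->base[reg_off] = UINT32_C(1) << line;
}

static void generic_gpio_irq_mask(struct generic_gpio_irq *gpio, unsigned int irq)
{
	generic_gpio_irq_write(gpio, irq, gpio->mask_off);
}

static void generic_gpio_irq_unmask(struct generic_gpio_irq *gpio, unsigned int irq)
{
	generic_gpio_irq_write(gpio, irq, gpio->unmask_off);
}
```

The per-SoC difference is then reduced to the offsets stored in the bank
description, which is the point of the argument here.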

Start off with such a trivial but immensely effective cleanup and see
how much it helps to share code even across SoC vendors. They all glue
together random IP blocks from the market and there are not so many
sources which are relevant. This makes sense in all aspects:

1) less and better code
2) faster setup for new SoCs
3) shared benefit for all vendors

Thanks,

tglx

2011-03-31 17:18:22

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 9:45 AM, Russell King - ARM Linux
<[email protected]> wrote:
>
> I think you're not entirely correct - have a look at Linus' message where
> there's a comparison of the size of arch/arm with other architectures, and
> you'll find that it is partly about size of source code.

It definitely is partly about code size, although I tend to use it as
an easy illustration of the issues rather than anything deeper than
that.

The code size is a symptom of the problem. The deeper problem is the
crazy arm hardware infrastructure. We can't do much about that, but it
does mean that I _do_ think we need to take approaches that aren't
necessary when that deeper problem doesn't exist.

For example, I don't think it's problematic that we have various
drivers that do their own gpio thing. Why? Because those things are
generally part of a bigger chip/driver that is discoverable, and they
don't tend to proliferate in various crazy random amounts - nor do
they tend to impact anything else. Nor does it look like that is going
to explode in the future any more than any normal driver work is
exploding.

Or look at the irq controllers we have on most other architectures.
x86 has several of them too, and it's annoying. But it's "several",
not "hundreds", and again, it's not exploding or looking like it will
be a major pain to support. I doubt Thomas enjoyed having to work with
the legacy controllers, but I also doubt he had huge problems on the
x86 side.

What does that mean? _I_ think it means that ARM platform managers
need to worry about things that other platform managers don't
necessarily need to worry about as much. Things that work for others
will _not_ work for ARM, because the platforms don't have the same
kind of sticking power. Other architectures just don't have the same
kinds of issues, so they don't need to worry about them.

And right now just from the kinds of pulls I do, I see all these ARM
platform maintainers doing their own thing, and adding new platforms
all the time (with a _very_ occasional removal of some old platform
support just because it was a prototype that never went anywhere at
all).

What I'm _not_ seeing is a lot of cross-platform maintenance or sense
of people trying to rein things in and look for solutions to the
proliferation of random stupid and mindless platform code.

Even _within_ platforms, I see conflicts like the crazy clock files -
and between platforms I don't see any conflicts for the simple reason
that people are just duplicating crap and adding more and more of
these mindless things. "A new platform? Let's just create a new
directory, fill it with all the same template crud, and then tweak the
code to match".

It's simple, but it's not maintainable.

So it's not the size PER SE. But the size is a damn easy first-order
"we have a problem" sign. ARM is clearly an important architecture,
but that doesn't excuse the crazy bloating.

x86 had this too - the whole mindless duplication of x86-64. It got
merged. It required cleanups, it required effort, it required time.
And it required some abstractions. x86 never had much of a _platform_
duplication, and the different platforms that did exist were so
clearly secondary that they never rated any first-class code: they
always knew that they were a second-class citizen and had to work
within the framework of trying to hook into the main PC platform code.

Power did a lot of platform unification, despite having much less of a
problem than ARM to begin with. The "less of a problem" made it easier
to do, but also less critical.

And I don't think it's realistic to convert all ARM platforms in one
go, and nobody should even plan anything like that.

But I _do_ think that some of the major platforms (like omap) should
seriously just say "we can't continue doing this", and look at
converting at least everything they do to something like devicetree
(NOT some omap-specific infrastructure!) rather than lots of explicit
code. Done right, that hopefully then shows other platforms how to do
the same, and get the ball rolling.

Linus

2011-03-31 17:23:34

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:

> On Thu, Mar 31, 2011 at 10:54:40AM +0100, Alan Cox wrote:
>> If I boot it on a current PC I'm booting on a multiprocessor system with
>> different timers, totally different IRQ controllers, different keyboard
>> controllers (USB), PCI Express, an IOMMU, NCQ SATA, ACPI, graphics
>> running in shared host memory able to give/take pages from the host,
>> extra instructions, etc etc
>>
>> And the same kernel boots just fine on both just fine.
>
> We've been there for a long time with ARM. Right from the start I had
> a single kernel image which booted over a range of ARM CPUs and
> platforms.
>
> As far as ARM CPU architectures go, today we can have a single kernel
> image which covers ARMv3 to ARMv5, and a separate kernel image which
> covers ARMv6 to ARMv7 including SMP and UP variants. The thing which
> currently stops ARMv3 to ARMv7 all together is the different page table
> layouts, the ASID tagging, the exclusive load/store support for cmpxchg
> and other atomic operations, etc.

I don't think the push is to get a single kernel image that boots
absolutely everywhere. having separate ARM5 and ARM7 kernels doesn't seem
to be a big deal to anyone.

> Outside of the CPU architecture, things become a lot more complicated.

exactly, and this is where there is an issue.

> The biggest one up until this merge window was that there is no fixed
> address for system RAM, which makes stuff like virt_to_phys() rather
> horrible to deal with - which in turn makes setting up and walking page
> tables a nightmare. We've just solved that issue with run-time patching
> of the kernel code to replace the add/sub instructions with ones with
> the appropriate offset, so we're a step closer to unifying everything
> into one single kernel image. This work alone produced this diffstat:
>
> 87 files changed, 450 insertions(+), 190 deletions(-)
>
> so it actually resulted in a net increase in the amount of code to be
> maintained rather than reducing it. That's hardly surprising as what
> that replaced was just a bunch of #define's for PHYS_OFFSET with some
> complex assembly code to do run-time patching of instructions.

but I don't think this sort of work is what anyone is complaining about.

> Given this thread, I've lost the motivation to continue with it because
> it's just going to cause more 'pointless churn' and end up annoying
> Linus even more.

it sounds like you are part of the solution, not part of the problem. the
biggest problem is the general response from 'the ARM community' when
these sorts of issues are raised claiming that there is no problem. you
seem to be very aware of the problem and are working to fix it. that is a
very different situation.

David Lang

2011-03-31 17:56:16

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, [email protected] wrote:

> I think that part of the issue is that when Linus points out a problem, the
> response isn't "we agree and are working on it, here's what we are doing",
> instead it seems to be mostly "there is no problem, this is just because there
> is so much variation in ARM"

The problem is two-fold:

1) - "ARM hardware manufacturers are morons"...
- ARM vendors "do things differently just to be difficult"...
- "the crazy arm fragmentation"...
Translate this into whatever ways you like. The fact is that ARM
is quite popular as a CPU core but there is very little in terms of
standardization around that CPU core. *OBVIOUSLY* this is a problem.
But there is _nothing_ we can do about that besides the current
moaning and the hope that those vendors will hear us and stop trying
to be different from their competitors. Apparently that won't happen
in the near future, so we can either sit on our arses proclaiming
repeatedly that this is a problem until those hardware vendors put
their acts together, or we find ways to deal with it somehow.

2) Because of (1) we do end up being flooded by SOC specific support code
on an unprecedented scale. Here's the stat:

$ git diff --shortstat v2.6.38..v2.6.39-rc1 arch/arm/
1319 files changed, 61303 insertions(+), 33780 deletions(-)
$ git diff --shortstat v2.6.38..v2.6.39-rc1 arch/x86/
241 files changed, 6508 insertions(+), 4326 deletions(-)

That's ten (10) times more lines added in the ARM directory than in
the X86 directory. Is this a sudden burst or a tendency?

$ git diff --shortstat v2.6.37..v2.6.38 arch/arm/
1257 files changed, 72412 insertions(+), 29361 deletions(-)
$ git diff --shortstat v2.6.37..v2.6.38 arch/x86/
216 files changed, 10021 insertions(+), 5016 deletions(-)

$ git diff --shortstat v2.6.36..v2.6.37 arch/arm/
1314 files changed, 55072 insertions(+), 17620 deletions(-)
$ git diff --shortstat v2.6.36..v2.6.37 arch/x86/
299 files changed, 16130 insertions(+), 12800 deletions(-)

$ git diff --shortstat v2.6.35..v2.6.36 arch/arm
1041 files changed, 53428 insertions(+), 25722 deletions(-)
$ git diff --shortstat v2.6.35..v2.6.36 arch/x86/
231 files changed, 7216 insertions(+), 8028 deletions(-)

So it appears to be quite "normal" to see ARM vendors together
producing many times the level of activity of X86.
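
As a quick sanity check on the "ten times" figure, the ratio of inserted
lines can be computed from the diffstats quoted above. This is an
illustrative sketch only: the struct and function names are made up here,
and the only real data are the insertion counts copied from the stats in
this message.

```c
#include <assert.h>

/* Insertion counts per release range, copied from the diffstats above. */
struct cycle_stat {
	const char *range;
	long arm_insertions;
	long x86_insertions;
};

static const struct cycle_stat cycle_stats[] = {
	{ "v2.6.38..v2.6.39-rc1", 61303, 6508 },
	{ "v2.6.37..v2.6.38",     72412, 10021 },
	{ "v2.6.36..v2.6.37",     55072, 16130 },
	{ "v2.6.35..v2.6.36",     53428, 7216 },
};

/* Ratio of lines inserted under arch/arm to lines inserted under arch/x86. */
static double insertion_ratio(const struct cycle_stat *s)
{
	assert(s->x86_insertions > 0);
	return (double)s->arm_insertions / (double)s->x86_insertions;
}
```

The ratios come out at roughly 9.4, 7.2, 3.4 and 7.4 for the four ranges,
so "ten times" rounds up the most recent cycle, while "many times" holds
for all of them.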

So... Is there missed opportunity for better code reuse here? Most
probably. Is all that code the result of misabstracted and duplicated
code? Certainly not. Let's just presume that half of that code is
genuine crap and the other half is simply the result of new hardware for
which there is no existing model to fit it in. Even then, do we have 5
times the reviewer bandwidth to properly review all that code compared
to X86? Absolutely not, not even close.

If prominent people looking at this from the side line continue bashing
at those who are both feet in the mud trying to contain the flood rather
than actually helping then nothing will change. Instead this only
creates despair and the splashed people may simply decide to throw in
the towel, at which point things will collapse for real. In reality,
the system has been going as it is for quite a while and with more or
less the same level of intensity. And the fact is that _users_ of the
ARM kernel are not complaining. Things are far from being perfect, but
so far things have been "good enough" for the majority of the people
involved, and improvements are constantly being worked on with the
manpower available.

Nicolas

2011-03-31 18:09:01

by Koen Kooi

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Op 31 mrt 2011, om 19:22 heeft [email protected] het volgende geschreven:

> On Thu, 31 Mar 2011, Russell King - ARM Linux wrote:
>
>> On Thu, Mar 31, 2011 at 10:54:40AM +0100, Alan Cox wrote:
>>> If I boot it on a current PC I'm booting on a multiprocessor system with
>>> different timers, totally different IRQ controllers, different keyboard
>>> controllers (USB), PCI Express, an IOMMU, NCQ SATA, ACPI, graphics
>>> running in shared host memory able to give/take pages from the host,
>>> extra instructions, etc etc
>>>
>>> And the same kernel boots just fine on both just fine.
>>
>> We've been there for a long time with ARM. Right from the start I had
>> a single kernel image which booted over a range of ARM CPUs and
>> platforms.
>>
>> As far as ARM CPU architectures go, today we can have a single kernel
>> image which covers ARMv3 to ARMv5, and a separate kernel image which
>> covers ARMv6 to ARMv7 including SMP and UP variants. The thing which
>> currently stops ARMv3 to ARMv7 all together is the different page table
>> layouts, the ASID tagging, the exclusive load/store support for cmpxchg
>> and other atomic operations, etc.
>
> I don't think the push is to get a single kernel image that boots absolutely everywhere. having separate ARM5 and ARM7 kernels doesn't seem to be a big deal to anyone.

You mean ARMv5 and ARMv7, right? ARM5 and ARM7 are completely different things.

The short, but inaccurate version:

ARM9 -> ARMv4t, ARMv5te*
ARM11 -> ARMv6*
CORTEX-A* -> ARMv7a

regards,

Koen-

2011-03-31 18:10:18

by Alexander Holler

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Hello,

Am 31.03.2011 10:09, schrieb Russell King - ARM Linux:

> We also need the various SoC designers and ARM architecture people to
> realise that the hardware situation is ridiculous; I have commented
> about this lack of standardisation to ARM in past years. ARM have had
> a standard set of peripherals for ten years, but the SoC people haven't
> really taken them up - and when they do, they seem to always introduce
> their own tweaks, sometimes with no way to detect those tweaks.

As a user of several ARM boards I fully agree. I've come to the
conclusion that if device tree or something similar, which offers a
vendor-independent description of the hardware, doesn't come up, the ARM
market (at least with Linux as an OS) won't function. It's already
almost impossible to update an old vendor kernel to a mainline kernel
version without having schematics. Up to now this hasn't been a big problem
because most of the ARM hardware people are playing with is developer boards,
but that's already changing and more and more stuff will come without
schematics. And without the help of something like the x86 BIOS (or DT for
ARM) you are just lost using ARM hardware for which you don't have the
schematics, when you don't want to use the vendor-provided kernel sources
(for which you almost never get e.g. any security fixes). Finding all the
small knobs to turn in the vendor-provided kernel sources is a pain and a
waste of time, a job you can almost never finish before the hardware in
question becomes obsolete.

Just my 2¢ on that topic from a somewhat user point of view from one who
isn't really involved that much in kernel development.

At least I find such a rant from Linus from time to time a good thing.
Sometimes it helps if someone says out loud what's wrong. And if Linus
weren't the one, who else would be courageous enough to do that? I
wouldn't (and I can't, I have to thank all kernel developers for their
hard work).

Regards,

Alexander

2011-03-31 18:12:57

by Sam Ravnborg

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

> And since Linus' whinge about ARM defconfigs, I really *hate* merging
> anything with *any* defconfig changes in - as a result, I don't
> particularly want to deal with ARM defconfig changes anymore.

I thought we solved this with the introduction of "make savedefconfig"
that created much much smaller defconfig.
At least the defconfigs should be down in the noise level in the
diffstats.

We can always argue about the usefulness of the 100+ defconfigs,
which are still unreadable.
But at least they no longer dominate the diffs (if people remember
to use "make savedefconfig").

Sam

2011-03-31 18:18:29

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 08:12:54PM +0200, Sam Ravnborg wrote:
> > And since Linus' whinge about ARM defconfigs, I really *hate* merging
> > anything with *any* defconfig changes in - as a result, I don't
> > particularly want to deal with ARM defconfig changes anymore.
>
> I thought we solved this with the introduction of "make savedefconfig"
> that created much much smaller defconfig.

Did we solve it to Linus' satisfaction, or did we solve it sufficiently
to avoid the immediate threat of them all being removed and we still have
work to do? I've no idea.

Personally, I'd still like to see fewer of them but without significantly
impacting the usefulness of automated build tools like Simtec's
kautobuild.

2011-03-31 18:23:25

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Thomas Gleixner wrote:

> Start off with such a trivial but immensely effective cleanup and see
> how much it helps to share code even across SoC vendors. They all glue
> together random IP blocks from the market and there are not so many
> sources which are relevant. This makes sense in all aspects:
>
> 1) less and better code
> 2) faster setup for new SoCs
> 3) shared benefit for all vendors

If only this were always true. Someone commented on the fact that the IP
block providing USB on OMAP is shared with a couple other platforms.
But about 2600 lines of pure glue is still necessary around the common
driver to make it work for everyone. I'm not saying that separate
drivers are called for here, simply that hardware people _will_ screw it
up, especially when they are hooking it up to a non-standard
SOC-specific bus.

Another example: there used to be many different IP blocks providing
MMC/SD/SDIO support that people were adding to their SOCs. Each SOC
would have its own reinvention of the wheel, but they were all different
but simple wheels, and drivers for them were obvious and
straightforward. Then came the SDHCI "standard". At first few
implementations existed, so the sdhci driver was rather straightforward
too. But hardware manufacturers thought (rightfully) that it would be a
good idea to use that standard instead of their custom simple wheel.
And so they did, releasing new SOC revisions with the old wheel replaced
by a square implementation of the sdhci one. Today the sdhci driver is
literally bastardized by all the quirks needed to work around all the
different and creative bugs, or even misinterpretations of the
standard, out there in the field. And in many cases the sdhci version is
even _less_ functional than the custom and already supported
implementation it replaced.

And what would the hardware guys tell you? That software is cheap.

Nicolas

2011-03-31 18:34:15

by Jesse Barnes

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011 19:17:51 +0100
Russell King - ARM Linux <[email protected]> wrote:

> On Thu, Mar 31, 2011 at 08:12:54PM +0200, Sam Ravnborg wrote:
> > > And since Linus' whinge about ARM defconfigs, I really *hate* merging
> > > anything with *any* defconfig changes in - as a result, I don't
> > > particularly want to deal with ARM defconfig changes anymore.
> >
> > I thought we solved this with the introduction of "make savedefconfig"
> > that created much much smaller defconfig.
>
> Did we solve it to Linus' satisfaction, or did we solve it sufficiently
> to avoid the immediate threat of them all being removed and we still have
> work to do? I've no idea.
>
> Personally, I'd still like to see less of them but without significantly
> impacting the usefulness of automated build tools like Simtec's
> kautobuild.

Might be neat if kbuild could wget config files automatically from a
known location. Bonus points if it properly detected the right one to
grab in the first place...

--
Jesse Barnes, Intel Open Source Technology Center

2011-03-31 18:35:20

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Nicolas Pitre wrote:
> On Thu, 31 Mar 2011, [email protected] wrote:
>
> > I think that part of the issue is that when Linus points out a problem, the
> > response isn't "we agree and are working on it, here's what we are doing",
> > instead it seems to be mostly "there is no problem, this is just because there
> > is so much variation in ARM"
>
> If prominent people looking at this from the side line continue bashing
> at those who are both feet in the mud trying to contain the flood rather
> than actually helping then nothing will change. Instead this only
> creates despair and the splashed people may simply decide to throw in
> the towel, at which point things will collapse for real. In reality,
> the system has been going as it is for quite a while and with more or
> less the same level of intensity. And the fact is that _users_ of the
> ARM kernel are not complaining. Things are far from being perfect, but
> so far things have been "good enough" for the majority of the people
> involved, and improvements are constantly being worked on with the
> manpower available.

And that's the whole point why I was ranting in the first place. I
know that there are clever folks working on the solution, but it's
entirely clear to me, that they are simply not enough compared to the
massive inbound flood. So neither you nor Russell can cope with it,
you simply do not scale. That's why I suggested that the ARM community
needs to push competent manpower into this.

You say the concept of subarch maintainers is working quite well. That
depends on the definition of working. It works in the sense that users
can use it, but it does not work from a maintainability POV.

Nobody wants to bash on those who are working on it, but IMNSHO the
current way is running into an utter nightmare even w/o you and
Russell throwing in the towel.

I went through quite a few iterations of large scale cleanups, so I
know how you feel.

Thanks,

tglx

2011-03-31 18:56:05

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Nicolas Pitre wrote:

> On Thu, 31 Mar 2011, Thomas Gleixner wrote:
>
> > Start off with such a trivial, but immensely effective cleanup and see
> > how it helps to share code even across SoC vendors. They all glue
> > together random IP blocks from the market and there are not so many
> > sources which are relevant. This makes sense in all aspects:
> >
> > 1) less and better code
> > 2) faster setup for new SoCs
> > 3) shared benefit for all vendors
>
> If only this were always true. Someone commented on the fact that the IP
> block providing USB on OMAP is shared with a couple other platforms.
> But about 2600 lines of pure glue is still necessary around the common
> driver to make it work for everyone. I'm not saying that separate
> drivers are called for here, simply that hardware people _will_ screw it
> up, especially when they are hooking it up to a non-standard
> SOC-specific bus.

Right. That's a problem, but we should not ignore the places where
reusing stuff is easily possible. And we should make good examples out
of them.

And it really _IS_ worth the trouble. Look at the git log of
drivers/spi/pxa2xx* . We could have slapped the other "x86" driver
into spi, but that does not make any sense from a software engineering
and maintainability POV. And it would have been more work in the end
to clean up the separate driver than to isolate the existing one and
reuse it.

This is a sustainability issue. And we need to become more clever
about identifying the places where we can abstract stuff into shared
drivers and infrastructure if we want to sustain Linux for another
few decades.

> And what would the hardware guys tell you? That software is cheap.

If you can prove with simple examples that using existing software
removes 6 months of useless reinventing the wheel and another 6 months
of testing plus the fight with the kernel folks, then eventually they
start to listen as you can express this in $$$.

Thanks,

tglx

2011-03-31 19:11:18

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 10:56 AM, Nicolas Pitre <[email protected]> wrote:
>
> So... Is there missed opportunity for better code reuse here? Most
> probably. Is all that code the result of misabstracted and duplicated
> code? Certainly not. Let's just presume that half of that code is
> genuine crap and the other half is simply the result of new hardware for
> which there is no existing model to fit it in. Even then, do we have 5
> times the reviewer bandwidth to properly review all that code compared
> to X86? Absolutely not, not even close.

That's an odd assumption. And it's followed by a total red herring
that doesn't even make any sense. And then your conclusion seems to be
that ARM could never have the same quality anyway, because the
whole lack of review issue is fundamental.

And from there you seem to go on to think that there are no major
problems, and things are "good enough"!

> If prominent people looking at this from the side line continue bashing
> at those who are both feet in the mud trying to contain the flood rather
> than actually helping then nothing will change.

The reason nothing seems to be changing is that you don't seem to
think it's even WORTH fixing. I really don't understand your
arguments.

They seem to boil down to the same thing that always happens in the
embedded world, and why most of the hardware and software is crap:
people don't think further than their own small project.

It's why embedded OS's have always been crap, and it's why Linux is
becoming so important to ARM - exactly because the embedded world
(both software and hardware) always just looks at its own issue, and
say "hey, this is working for me right now, so I won't bother to try
to solve the bigger issues, because it's not worth my time".

To hammer that in:

> ... And the fact is that _users_ of the
> ARM kernel are not complaining. Things are far from being perfect, but
> so far things have been "good enough" for the majority of the people
> involved, and improvements are constantly being worked on with the
> manpower available.

You really don't seem to care about how Thomas was complaining about
the whole maintenance issue. As he was trying to clean up irq
handling, the pure flow of more crap just made it hard to ever catch
up. THAT is the kind of maintenance problem where I go "This is a big
problem".

But for some individual board user or the code-monkey who creates yet
another board description, this isn't a problem. Because he's looking
at a single kernel and a single board at a time, and seldom cares
about anything else.

But guess what? That really _is_ a problem. And it's likely to be a
bigger problem in the future. Look at how we're actually starting to
see vendors who are making ARM into more of a real platform, rather
than a succession of one-off embedded boards, and think about what
that actually will entail.

I'm talking about things like ASUS getting their feet wet making
netbooks with ARM. Things like that have been promised for the last
several years now, and so far it's been a total failure.

And it's going to CONTINUE to be a failure, unless the ARM platform
mess can be sorted out.

Why? Think of the Ubuntu's etc of the world. If you can't make
half-way generic install images, you can't have a reasonably generic
distribution. And if you don't have that, then what happens to your
developer situation? Most sane people won't touch it with a ten-foot
pole, because the bother is simply not worth their time.

So the current embedded mindset of "hey, it's working for all these
individual people" is just broken. It's broken for multiple reasons.
It's broken because it makes it much harder to do top-level
maintenance (but the low-level guys don't care), and it's broken
because it results in insane fragmentation where it basically is never
an option to support anything but one - or a couple - particular
device.

The ARM -core- situation is simple, and those high-level people can
(and do) say that they'll just support ARM7 and screw all the other
cores.

But the platform problem is real. And it does need some solution,
because continuing to just do the same thing really does mean that
some new things simply cannot be done.

And the fact that it's been "good enough" in the past when every
single board was always just a one-off and had nothing to do with
other boards does _not_ mean that it's going to continue to be good
enough.

So the _good_ news is that all the high-end ARM's are largely
consolidating anyway, and when we're talking Cortex-A9 class hardware,
there generally aren't millions of SoC's. And I'm hoping the hardware
people are actually aware of this (presumably because their customers
are starting to push back against pointless hw churn too) and clearly
some manufacturers are trying to -create- platforms like OMAP that try
to have lots of shared characteristics (and then screw up a lot of the
details, but whatever).

But I still do think that on the software side, people need to stop
doing the whole "let's just copy that platform code for this other
platform that is quite similar but has a different XYZ chip".

Linus

2011-03-31 19:25:32

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Linus Torvalds wrote:

> What I'm _not_ seeing is a lot of cross-platform maintenance or sense
> of people trying to rein things in and look for solutions to the
> proliferation of random stupid and mindless platform code.

I do that, Russell does that, Catalin does that, Tony does that, and
maybe less than a handful of other important people I'm not listing
(sorry). But clearly we are far far from being enough people doing that
kind of work. And the fact is that the volume of ARM platform code is
steadily being produced at a rate far surpassing X86, and even higher
than all the other architectures put together. Linaro is trying to help
here, but Linaro cannot conjure the needed experience and knowledge for
that kind of work with a magic wand.

So we need help! If core kernel people could get off their X86 stool
and get down in the ARM mud to help sort out this mess that would be
really nice (thanks tglx). Until then all that the few of us can do is
to contain the flood and hope for the best, and so far things being as
they are have still worked surprisingly well in practice for users. If
compensation is a concern then I think Linaro might be able to arrange
something.

And we can't count on vendor people doing this work. They are all busy
porting the kernel to their next SOC version so they can win the next
big Android hardware design, and doing so with our kernel quality
standards is already quite a struggle for them.

What is going on at the moment is some effort to introduce DT support to
ARM. The core support is there, but that is useless until platform code
is moved to it, and corresponding work is put into bootloaders, etc.
That is progressing... slowly.

Also there is some work to be able to build a kernel supporting more
than one SOC family at once. Of course there is no practical use for a
single kernel binary that boots on all existing boards, but moving
towards such a goal has beneficial side effects such as good
consolidation when possible.

But we also need some slack wrt number of lines changed. An increased
consolidation effort will create more churn not less, at least for a
while. The OMAP clock merge conflict was the result of some cleanup
which will make further consolidation easier.

Nicolas

2011-03-31 20:05:32

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 12:25 PM, Nicolas Pitre <[email protected]> wrote:
>
> So we need help! If core kernel people could get off their X86 stool
> and get down in the ARM mud to help sort out this mess that would be
> really nice (thanks tglx). Until then all that the few of us can do is
> to contain the flood and hope for the best, and so far things being as
> they are have still worked surprisingly well in practice for users. If
> compensation is a concern then I think Linaro might be able to arrange
> something.

The thing is, maintainers don't scale.

The only way to get quality code is to try to improve the quality from
the "leaf nodes", because otherwise you'll always end up playing
catch-up. You'll get new bad code faster than you can clean it up.

I've told people this before, and I'll tell it again: when I flame
submaintainers, they should try to push the pain down. I'm not really
asking those submaintainers to clean up all the stuff they are
getting: I'm basically asking people to say "no", or at least push
back a lot, and argue with the people who send you code. Tell them
what you don't like about the code, and tell them that you can't take
it any more.

> And we can't count on vendor people doing this work. They are all busy
> porting the kernel to their next SOC version so they can win the next
> big Android hardware design, and doing so with our kernel quality
> standards is already quite a struggle for them.

This really isn't the argument. The argument should be that if they
want their code up-stream, they need to do a good job. If they don't,
why should you take it at all?

> What is going on at the moment is some effort to introduce DT support to
> ARM. The core support is there, but that is useless until platform code
> is moved to it, and corresponding work is put into bootloaders, etc.
> That is progressing... slowly.

How about not moving platform code TO it, but at least saying that you
won't accept new platform code that doesn't use it? When somebody
sends you a new platform, just say "no" if it's another copy-paste job
or another "add yet another #ifdef or conditional to a messy driver".

> But we also need some slack wrt number of lines changed. An increased
> consolidation effort will create more churn not less, at least for a
> while. The OMAP clock merge conflict was the result of some cleanup
> which will make further consolidation easier.

Umm. The whole "number of lines of code" thing has become a total red herring.

THAT IS NOT WHY I STARTED TO COMPLAIN!

The reason I point out the number of lines of code is because it's one
of the more obvious _symptoms_ of the problem.

But trust me, if you start doing a better job at platform code, I
won't be complaining when I get lots of deleted code, or when I start
getting devicetree descriptions instead of new drivers.

So "number of lines of code" and "massive churn" is a problem, but
look at how I'm not complaining about the drivers/ subdirectory, which
is still the bulk of all lines in the kernel by far. I may complain
about particular subsystems in drivers (gpu..) for other reasons, but
it's not "lines of code" per se.

In the case of ARM, the reason I point out the size of the arch/arm is
because it's illustrative of just how _different_ ARM is from the
other architectures. And those differences are pretty much all about
the board/platform issues.

Linus

2011-03-31 20:35:33

by Kevin Hilman

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Arnd Bergmann <[email protected]> writes:

> On Thursday 31 March 2011, Kevin Hilman wrote:
>> Some SoC families (like OMAP) have a huge amount of diversity even within
>> the SoC family, so better abstractions and generic infrastructure
>> improvements are an obvious win, even staying within the SoC.
>
> But that's the point. The incentive is there for managing the infrastructure
> within the SoC, but not across SoCs.

OK, but the rest of my thread went on to describe how at least a few ARM
SoC maintainers are actually actively working on infrastructure that is
cross SoC, like runtime PM. It might start because of an abstraction
within an SoC family like supporting both SH and SH-mobile, or
OMAP[12345], but it does sometimes result in not only cross-SoC code but
cross-platform frameworks.

Admittedly, the percentage of ARM SoC developers actively working on
these common, cross-platform infrastructure layers is rather small, but
at least it is non-zero. :)

> Allow me to use OMAP as a bad example while pointing out that it's
> really one of the best supported platforms we currently have, while
> the others are usually much worse in terms of working with the
> community (or at least they are behind on the learning curve but
> getting there):
>
> * OMAP2 introduced the hwmod concept as an attempt to reduce duplication
> between board code, but the code was done on the mach-omap2 level
> instead of finding a way to make it work across SOC vendors, or using
> an existing solution.

Well, before deciding whether something like hwmod should be a cross-SoC
abstraction, it's important to be clear about what level of abstraction
is needed, or practical for a given feature. For power management, we
already have (and use) existing abstractions for the drivers. The clock
framework, system PM and runtime PM framework are all existing
abstraction layers for drivers.

Remember that power management is one of those areas that ARM SoC
vendors like to "differentiate" on, so the hardware is vastly different
between ARM vendors. Having worked on embedded Linux power management
for a while now, I currently do not think any cross-SoC abstractions
below the clock framework or runtime PM are worth it. I'm certainly
willing to be persuaded otherwise, but currently don't see the
usefulness.

With that as background, hwmod was never intended as something to be
cross-SoC. If you look at the data that's in an omap_hwmod, it's
entirely OMAP hardware specific, and mostly focused on power management
hardware details, register descriptions, feature capabilities etc. This
allows the OMAP PM core code to be generalized and work across all SoCs
in the OMAP family. But again, it was intended for OMAP PM core code.
At that level, there really isn't much to share with other SoCs since
the PM hardware for the various SoC vendors is so "differentiated"
(a.k.a fsck'd up in extremely different ways.)

Kevin

2011-03-31 20:35:50

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 1:05 PM, Linus Torvalds
<[email protected]> wrote:
>
> Umm. The whole "number of lines of code" thing has become a total red herring.
>
> THAT IS NOT WHY I STARTED TO COMPLAIN!
>
> The reason I point out the number of lines of code is because it's one
> of the more obvious _symptoms_ of the problem.
>
> But trust me, if you start doing a better job at platform code, I
> won't be complaining when I get lots of deleted code, or when I start
> getting devicetree descriptions instead of new drivers.

So if you don't like lines of code, how about just "number of files touched".

This is Thomas this merge window. Remember: he's traditionally doing
timers and interrupts and stuff. Do:

    git log --author=tglx v2.6.38.. --oneline --numstat |
        cut -f3- | grep -v ' ' | cut -d/ -f1-2 |
        sort | uniq -c | sort -n | tail

and see another example of arm standing out (ok, so "kernel/irq" also
stands out in this case, but you'd kind of _expect_ that, when the
whole series is about irq controllers, wouldn't you?)

And no, mips doesn't look so hot either.
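
The same tally can be approximated in a few lines of Python. This is only
a minimal sketch: the function name count_touched_dirs, its depth
parameter and the sample input are made up for illustration, and it
assumes text in the format that "git log --oneline --numstat" prints
(tab-separated "added, deleted, path" rows between one-line subjects).

```python
from collections import Counter


def count_touched_dirs(numstat_text, depth=2):
    """Count touched files per leading directory prefix (hypothetical helper)."""
    counts = Counter()
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        # numstat rows are "<added>\t<deleted>\t<path>"; subject lines have no tabs
        if len(parts) != 3:
            continue
        path = parts[2]
        # like the grep -v ' ' step, ignore anything with a space (e.g. renames)
        if " " in path:
            continue
        prefix = "/".join(path.split("/")[:depth])
        counts[prefix] += 1
    return counts


# Made-up sample input, not real history.
sample = "\n".join([
    "abc1234 genirq: example subject line",
    "3\t1\tkernel/irq/chip.c",
    "2\t2\tarch/arm/mach-omap2/irq.c",
    "def5678 arm: another example subject",
    "1\t1\tarch/arm/plat-omap/gpio.c",
    "4\t0\tarch/mips/kernel/irq.c",
])

# Print smallest counts first, as "sort -n" would.
for prefix, n in sorted(count_touched_dirs(sample).items(), key=lambda kv: kv[1]):
    print(n, prefix)
```

With the made-up sample, arch/arm is counted twice and the other two
prefixes once each; feeding it real "git log" output is left to the
reader.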

Anyway, the point is, there are many ways to show the whole "arm is a
maintenance problem" issue. Don't get too hung up about number of
lines changed. It's just the simplest kind of thing where the tools
give answers very quickly without the above kinds of games.

Btw, the reason why the subject line on this rant is what it is, is
that during this merge window, arm was also one of the more annoying
to merge. And NO, that does NOT mean that I want you guys to merge
things behind my back just to hide the problem. I'm just saying that
it's another facet of this whole issue. Not that I can't do merges,
but that ARM simply ends up having these issues that others don't.

Linus

2011-03-31 21:40:53

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Nicolas Pitre wrote:
> On Thu, 31 Mar 2011, Linus Torvalds wrote:
> > What I'm _not_ seeing is a lot of cross-platform maintenance or sense
> > of people trying to rein things in and look for solutions to the
> > proliferation of random stupid and mindless platform code.
>
> I do that, Russell does that, Catalin does that, Tony does that, and
> maybe less than a handful of other important people I'm not listing
> (sorry). But clearly we are far far from being enough people doing that
> kind of work. And the fact is that the volume of ARM platform code is
> steadily being produced at a rate far surpassing X86, and even higher
> than all the other architectures put together. Linaro is trying to help
> here, but Linaro cannot conjure the needed experience and knowledge for
> that kind of work with a magic wand.
>
> So we need help! If core kernel people could get off their X86 stool
> and get down in the ARM mud to help sort out this mess that would be
> really nice (thanks tglx). Until then all that the few of us can do is

The main help we can give (aside from actually looking at code and
concepts) is to feed back the experience we have with massive cleanups,
which mechanisms work and which do not, and - at least I can speak for
myself - to stand at your side when it comes to pushing that through.

One thing that really helps to force people to get their act together
is that you as maintainers identify trouble spots which are easily
addressable. Then put those spots on your list and require people who
want to submit new features in that area to clean up the mess first and
then put their new feature thing on top. We successfully used this to
get the unification work done after the mechanical i386/x86_64
merger. It requires a lot of stubbornness, but it reduces the work burden
of the maintainers a lot.

Once you have the easy spots addressed (that could be device tree
stuff, gpio, irq_chips for the beginning) and have seen how it works
out, then go steps further.

The important point here is from my POV that you put down the
requirements with no wiggle room and just ignore the "oh it could be
done better" whining. Either people come up with a patch which solves
the whole issue better or they just will cope. But when you have
consolidated stuff then you can and need to look from a high level
perspective and refine the infrastructure.

I really regret in hindsight, that I did not enforce the cleanup of
the irq layer way earlier and that I did not see the abuse of it early
enough. At some point I realized that being polite is the wrong
solution, so I forced myself to push this cleanup through. That's an
experience which I don't wish anybody else to have. Especially because
as long as the oldstyle stuff works, oldstyle crap comes in faster than
you can fix it. See commit 9ad198c. And it's a massive effort to do
something which results in:

Total patches: 414
Total files touched: 702
Total insertions: 6805
Total deletions: 8632
Lines added -1827

And that massive effort is just because you cannot break stuff after
the fact on a massive scale. The whole thing introduced a mere 6
fixup patches of fallout, which were applied one day after rc1. It could
have been avoided, but ....

So yes, we can and will help with advice, a certain amount of review
(especially on the concept level) and giving you all the support you
need to fight that through.

Thanks,

tglx

2011-03-31 22:49:29

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 31 Mar 2011, Linus Torvalds wrote:

> On Thu, Mar 31, 2011 at 12:25 PM, Nicolas Pitre <[email protected]> wrote:
> >
> > So we need help! If core kernel people could get off their X86 stool
> > and get down in the ARM mud to help sort out this mess that would be
> > really nice (thanks tglx). Until then all that the few of us can do is
> > to contain the flood and hope for the best, and so far things being as
> > they are have still worked surprisingly well in practice for users. If
> > compensation is a concern then I think Linaro might be able to arrange
> > something.
>
> The thing is, maintainers don't scale.

True. My remark about core kernel people still stands though.

> The only way to get quality code is to try to improve the quality from
> the "leaf nodes", because otherwise you'll always end up playing
> catch-up. You'll get new bad code faster than you can clean it up.

Leaf nodes on ARM are people coming from corporate background with the
old school software development methodologies. They do it as a _job_
first and foremost. They only work on Linux because that's what their
boss assigned them to. Don't get me wrong: that doesn't mean they are
bad people. Simply that they are not into it for the public recognition
(or flaming) from their peers. Once their code works they lose interest
and move on. That mindset is extremely hard to change and takes time, on
a scale of years. Much more time than required to produce the code
needed to support that new SOC out of the pipeline. There are notable
exceptions obviously. But this is still a scalability problem in
itself. So we need men-in-the-middle attacks.

> I've told people this before, and I'll tell it again: when I flame
> submaintainers, they should try to push the pain down. I'm not really
> asking those submaintainers to clean up all the stuff they are
> getting: I'm basically asking people to say "no", or at least push
> back a lot, and argue with the people who send you code. Tell them
> what you don't like about the code, and tell them that you can't take
> it any more.

I wish there were enough of us to be able to determine what we
actually don't like about the code. There are simply not enough core
kernel people with the required visibility doing such work in ARM land.
That's the fundamental problem. The fact that the most successful
"real" ARM devices running Linux out there still aren't supported in
mainline doesn't help building a community of enthusiasts around it
either.

> > And we can't count on vendor people doing this work. They are all busy
> > porting the kernel to their next SOC version so they can win the next
> > big Android hardware design, and doing so with our kernel quality
> > standards is already quite a struggle for them.
>
> This really isn't the argument. The argument should be that if they
> want their code up-stream, they need to do a good job. If they don't,
> why should you take it at all?

Embedded vendors did keep their code out of the kernel before. We've
been hammering them about upstreaming their code for years. Now they
are striking back with too much code for our review capacity. So
problematic code gets merged without anyone noticing because it compiles
and does work, until someone comes along with a wide scale API cleanup
and stumbles on it.

The alternative is to only accept fully reviewed code, but to scale with
the numbers we've all seen, 60% of the reviewers should be looking at
ARM code and that's not happening. We've been there before, like two
years ago or so. Pressure builds up at the maintainer gate as the
backlog grows and key people get burned out, then the system collapses.
No one wants to go back there.

> > What is going on at the moment is some effort to introduce DT support to
> > ARM. The core support is there, but that is useless until platform code
> > is moved to it, and corresponding work is put into bootloaders, etc.
> > That is progressing... slowly.
>
> How about not moving platform code TO it, but at least saying that you
> won't accept new platform code that doesn't use it? When somebody
> sends you a new platform, just say "no" if it's another copy-paste job
> or another "add yet another #ifdef or conditional to a messy driver".

DT has to prove itself on ARM with a few existing platforms before we
can open the flood gate towards it. If something is wrong with DT
support it is best to fix only the core stuff without also having to fix
all users and possibly all bootloaders, etc. That work is progressing
slowly because there are more people praising DT than people doing the
actual work.

Nicolas

2011-04-01 00:53:22

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 06:49:11PM -0400, Nicolas Pitre wrote:
> On Thu, 31 Mar 2011, Linus Torvalds wrote:

> > The only way to get quality code is to try to improve the quality from
> > the "leaf nodes", because otherwise you'll always end up playing
> > catch-up. You'll get new bad code faster than you can clean it up.
>
> Leaf nodes on ARM are people coming from corporate background with the
> old school software development methodologies. They do it as a _job_
> first and foremost. They only work on Linux because that's what their
> boss assigned them to. Don't get me wrong: that doesn't mean they are
> bad people. Simply that they are not into it for the public recognition
> (or flaming) from their peers. Once their code works they lose interest
> and move on. That mindset is extremely hard to change and takes time, on
> a scale of years. Much more time than required to produce the code
> needed to support that new SOC out of the pipeline. There are notable
> exceptions obviously. But this is still a scalability problem in
> itself. So we need men-in-the-middle attacks.

It's also often the case that the leaf maintainers are themselves
overloaded with work, especially those who don't have much code in tree
already or who have advanced power management features in their devices
that they're trying to support (which tend to be the area that requires
most work as they're system wide in impact).

> > This really isn't the argument. The argument should be that if they
> > want their code up-stream, they need to do a good job. If they don't,
> > why should you take it at all?

> Embedded vendors did keep their code out of the kernel before. We've
> been hammering them about upstreaming their code for years. Now they
> are striking back with too much code for our review capacity. So
> problematic code gets merged without anyone noticing because it compiles
> and does work, until someone comes along with a wide scale API cleanup
> and stumbles on it.

Plus the fact that even if the code isn't of the quality we'd ideally
like you do tend to get *some* quality improvement from pushing things
into mainline simply by virtue of 1000 foot review and it's much more
likely that random people will come along and contribute improvements to
mainline than to vendor BSPs. Speaking as someone who works over many
different embedded CPUs (not just ARM) I'm generally thankful when I'm
working with mainline code, even if it's not the mainline code I might
dream of. There are some great out of tree BSPs but there's also
others.

2011-04-01 01:18:07

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Mar 30, 2011 at 03:14:10PM -0700, Tony Lindgren wrote:
> * Russell King - ARM Linux <[email protected]> [110330 14:05]:

> > And I have got to the point of just not giving a damn. I can't change
> > the ARM community (I've tried over the years to get more active review
> > of platform changes and failed - and had it pointed out by folk like
> > Alan Cox, that such a system is impossible due to lack of motivation
> > by, eg, an OMAP person to review a Samsung change.)

> I think this is happening more and more as we have more ARM generic
> and Linux generic code.

Plus you've now got some non-trivial code for off-SoC devices which
means you've got a growing number of people who do actively work over
many SoCs.

2011-04-01 01:43:09

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31, 2011 at 12:56:33AM +0200, Thomas Gleixner wrote:
> On Wed, 30 Mar 2011, Tony Lindgren wrote:
> > * Thomas Gleixner <[email protected]> [110330 15:22]:
> > > On Wed, 30 Mar 2011, Tony Lindgren wrote:

> > > > One thing that will help here and distribute the load is to move
> > > > more things under drivers/ as then we have more maintainers looking
> > > > at the code.

> > > Guess what's that going to solve? Nothing, nada.

> > > Really, you move the problem to people who are not prepared to deal
> > > with the wave either. So what's the gain?

> > I guess my point is that with creating more common frameworks people
> > will be using common code. Some examples that come to mind are clock
> > framework, gpiolib, dma engine, runtime PM and so on.

> For all that to happen you need a really experienced team with a
> strong team lead to fight that through and go through the existing
> horror while dealing with the incoming flood at the same time.

My experience is that it's not that bad doing this providing you can
convince people to actually show their code to the relevant subsystem
maintainers and they have time to look at the code. The first step is
reasonably tractable since it's a fairly basic level of review and as a
subsystem maintainer you're well enough motivated to at least ensure
that people aren't breaking the abstractions enough to cause problems
for anyone but the people directly working with the drivers.

2011-04-01 04:49:47

by David Brown

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, Mar 31 2011, Nicolas Pitre wrote:

> Leaf nodes on ARM are people coming from corporate background with the
> old school software development methodologies. They do it as a _job_
> first and foremost. They only work on Linux because that's what their
> boss assigned them to. Don't get me wrong: that doesn't mean they are
> bad people. Simply that they are not into it for the public recognition
> (or flaming) from their peers. Once their code works they lose interest
> and move on. That mindset is extremely hard to change and takes time, on
> a scale of years. Much more time than required to produce the code
> needed to support that new SOC out of the pipeline. There are notable
> exceptions obviously. But this is still a scalability problem in
> itself. So we need men-in-the-middle attacks.

An additional mindset that is difficult to work with in this environment
is that the corporate development methodology has a focus on schedules
and deliverables. Even people who would otherwise like to contribute
will have pressure to get something done. Many think of "submit to
mainline" as kind of a last step in a development process, instead of
even a goal to accomplish.

When we push back, there is a good chance they just won't bother, not
because they don't want to do it, but because it doesn't fit a schedule,
and there is already something else for them to work on.

So what's the right answer here? Practically, someone just sent out a
fairly complete DMA driver for a new MSM device. Naturally, this
hardware is nothing like anyone else's DMA, but the driver itself is
pretty much independent from other kernel APIs. It isn't even similar to the
existing DMA driver in the MSM. With it are patches to ifdef-up various
drivers to use the appropriate DMA.

The DMA code, by itself, seems reasonably well written (with some
cleanup and such needed), but it makes everything that uses it messy.
In this particular case, DMA engine will probably need some work to
either incorporate the unique capabilities of this hardware, or at least
allow them to be used. The author probably won't be able to do this on
their own.

I could pull the driver into the tree, and now we have yet another
driver with yet another API. If I push back, realistically, it will
probably end up out-of-tree, along with everything that depends on it.

Up until now, it seems that the attitude has been that it is better to be
in-tree than out of tree, but are we getting too much stuff to continue
that?

Today, most of the MSM code lives out of tree. The QuIC tree for MSM
(currently based off of 2.6.35):

git diff --stat v2.6.35..HEAD | tail -1
3432 files changed, 1144473 insertions(+), 17315 deletions(-)
git diff --stat v2.6.35..HEAD arch/arm/mach-msm | tail -1
595 files changed, 286054 insertions(+), 1928 deletions(-)

There's a large amount of work just to get the code up to kernel
standards (the coding style has been fairly well enforced), and there is
constant development for new hardware.

Thanks,
David Brown

--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

2011-04-01 07:32:44

by Tomi Valkeinen

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thu, 2011-03-31 at 17:23 +0200, Arnd Bergmann wrote:

> * The DSS display drivers introduce new infrastructure including new bus
> types that have the complexity to make them completely generic, but
> in practice can only work on OMAP, and are clearly not written with
> cross-vendor abstractions in mind.

If you mean the panel drivers, then I disagree. They are currently OMAP
specific, but they are designed so that making them generic shouldn't be
too difficult. It's been my aim for a long time already to make the
panel drivers generic, but I've never had time and it's never been quite
clear to me what would be the best way to do that.

The core DSS driver is OMAP specific, and while the DSS IP could in
theory be used in some other platform, that is not currently the case
and I wouldn't want to needlessly start abstracting things for just the
sake of abstracting.

Tomi

2011-04-01 07:45:44

by Ingo Molnar

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

* David Brown <[email protected]> wrote:

> When we push back, there is a good chance they just won't bother, not because
> they don't want to do it, but because it doesn't fit a schedule, and there is
> already something else for them to work on.
>
> So what's the right answer here. [...]

IMO the right answer is what Linus and Thomas outlined:

1) provide a small number of clean examples and clean abstractions
2) to not pull new crap from that point on
3) do this gradually but consistently

I.e. make all your requirements technical and actionable - avoid sweeping,
impossible to meet requirements. Do not require people to clean up all of the
existing mess straight away (they cannot realistically do it), do not summarily
block the flow of patches, but be firm about drawing a line in the sand and be
firm about not introducing new mess in a gradually growing list of well-chosen
areas of focus.

Rinse, repeat.

If companies do not 'bother to push upstream', then management will eventually
notice negative economic consequences:

- Higher short-term production costs: upstream feedback/review/testing
improves the product, so the lack of upstream feedback/review/testing
increases the production costs of the product.

- Higher long-term production costs: gradually slower SoC development due to a
morass of out-of-tree hacks that weren't pushed upstream causing gradually
higher development costs. This means higher payroll costs and longer time to
market - in which time more flexible competitors can beat you.

- Brain drain: developers like to show their good work upstream as well, not
just in some ship-and-forget out-of-tree kernel. Good developers will
gravitate towards SoC companies that encourage them to work upstream.
No matter how good a business idea a company has, it will not get far
if there are no good developers.

- Less revenue: a product can not possibly be more appealing to SoC customers
if the upstream Linux kernel does not support it. As ARM moves up the food
chain towards more complex, higher profit margin products longer term
thinking gains foothold gradually.

- Competitive disadvantages: most SoC competitors push their changes upstream,
so they get free development assistance, they get free exposure, they get
free PR and they get opportunities. Not pushing upstream is a lost
opportunity.

All of these effects translate into real $$$$$$$ and affect the bottom line
very directly, both short and long term. These costs also increase with time so
they are not fixed.

If management does not actively encourage upstream-quality changes then
management will have to justify why they exposed the company to these extra
costs, complications and risks - just to save on the relatively minor (and
fixed) cost of working with upstream.

If despite all that management still believes (rightly or wrongly) that it's
cheaper for the company to do low quality throw-away code and does not care
about any of the short and long-term costs listed above then this really means
that they really do not care about you or about the upstream kernel - so they
do not exist as far as the upstream kernel is concerned.

Why should you then reward them with pulling crap and why should you be willing
to invest future maintenance overhead into their "we do not care about you"
solution?

Working with upstream is a quid pro quo with plenty of advantages on both
sides, which gives maintainers a heck of a leverage to push back on crap while
still having all the incentives in the world to help produce a high quality
kernel.

Thanks,

Ingo

2011-04-01 11:23:57

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011, Tomi Valkeinen wrote:
> On Thu, 2011-03-31 at 17:23 +0200, Arnd Bergmann wrote:
>
> > * The DSS display drivers introduce new infrastructure including new bus
> > types that have the complexity to make them completely generic, but
> > in practice can only work on OMAP, and are clearly not written with
> > cross-vendor abstractions in mind.
>
> If you mean the panel drivers, then I disagree. They are currently OMAP
> specific, but they are designed so that making them generic shouldn't be
> too difficult. It's been my aim for a long time already to make the
> panel drivers generic, but I've never had time and it's never been quite
> clear to me what would be the best way to do that.
>
> The core DSS driver is OMAP specific, and while the DSS IP could in
> theory be used in some other platform, that is not currently the case
> and I wouldn't want to needlessly start abstracting things for just the
> sake of abstracting.

Ok, fair enough. I haven't looked at the OMAP DSS code in detail, so
I apologise if I did it injustice. What I did review is the ST Ericsson
MCDE code which was written by taking the OMAP code as an example.

The symptom I'm describing is that infrastructure is getting added
to platform specific code without making clear that it is meant to
be generic. I.e. the code is hidden away in the drivers/video/omap
directory, where other people would not go looking for it.

What I would have hoped you to do is to tell the ST Ericsson people
when they posted their code that they should instead work with you
to integrate the two implementations. As far as I remember (I may be
wrong again), that did not happen.

Arnd

2011-04-01 11:30:25

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thursday 31 March 2011, Kevin Hilman wrote:
> Arnd Bergmann <[email protected]> writes:
> >
> > But that's the point. The incentive is there for managing the infrastructure
> > within the SoC, but not across SoCs.
>
> OK, but the rest of my thread went on to describe how at least a few ARM
> SoC maintainers are actually actively working on infrastructure that is
> cross SoC, like runtime PM. It might start because of an abstraction
> within an SoC family like supporting both SH and SH-mobile, or
> OMAP[12345], but it does sometimes result in not only cross-SoC code but
> cross-platform frameworks.
>
> Admittedly, the percentage of ARM SoC developers actively working on
> these common, cross-platform infrastructure layers is rather small, but
> at least it is non-zero. :)

True, I was oversimplifying. Still, the problem exists that to a large
degree, infrastructure also gets added to platform specific code where it
has no place.

> With that as background, hwmod was never intended as something to be
> cross-SoC. If you look at the data that's in an omap_hwmod, it's
> entirely OMAP hardware specific, and mostly focused on power management
> hardware details, register descriptions, feature capabilities etc. This
> allows the OMAP PM core code to be generalized and work across all SoCs
> in the OMAP family. But again, it was intended for OMAP PM core code.
> At that level, there really isn't much to share with other SoCs since
> the PM hardware for the various SoC vendors is so "differentiated"
> (a.k.a fsck'd up in extremely different ways.)

There is an important difference between code that knows about board,
soc and platform specific issues ("drivers") and code that manages
these ("infrastructure"). Obviously, any hardware implementation, broadly
speaking, that is different from the other ones needs a driver.

However, the infrastructure for managing multiple drivers should be
written in a way that works for as many similar drivers as possible.

My complaint about the four examples I've given is that they mix the
driver with the infrastructure.
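
A minimal sketch of that split, written as standalone C rather than kernel
code. Every name in it (pin_ops, pin_chip_register, soc_a_bank and so on) is
made up for illustration and does not correspond to an existing kernel API.
The upper half is "infrastructure" and contains no SoC knowledge at all; the
lower half is a "driver" for one imaginary SoC whose single data register is
simulated in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Infrastructure (hypothetical): generic, contains no SoC knowledge. */
struct pin_ops {
	void (*set)(void *hw, unsigned pin, int value);
	int  (*get)(void *hw, unsigned pin);
};

struct pin_chip {
	const struct pin_ops *ops;
	void *hw;		/* driver-private state */
	unsigned base;		/* first global pin number served */
	unsigned npins;
};

#define MAX_CHIPS 8
static struct pin_chip chips[MAX_CHIPS];
static size_t nchips;

int pin_chip_register(const struct pin_ops *ops, void *hw,
		      unsigned base, unsigned npins)
{
	assert(ops != NULL && ops->set != NULL && ops->get != NULL);
	if (nchips == MAX_CHIPS)
		return -1;
	chips[nchips++] = (struct pin_chip){ ops, hw, base, npins };
	return 0;
}

/* Find the chip that owns a global pin number, or NULL. */
static struct pin_chip *pin_to_chip(unsigned pin)
{
	for (size_t i = 0; i < nchips; i++)
		if (pin >= chips[i].base && pin - chips[i].base < chips[i].npins)
			return &chips[i];
	return NULL;
}

int pin_set(unsigned pin, int value)
{
	struct pin_chip *c = pin_to_chip(pin);

	if (c == NULL)
		return -1;
	c->ops->set(c->hw, pin - c->base, value);
	return 0;
}

int pin_get(unsigned pin)
{
	struct pin_chip *c = pin_to_chip(pin);

	return c ? c->ops->get(c->hw, pin - c->base) : -1;
}

/* Driver (hypothetical "SoC A"): the only part that knows the hardware. */
struct soc_a_bank {
	unsigned long data;	/* stands in for a memory-mapped register */
};

static void soc_a_set(void *hw, unsigned pin, int value)
{
	struct soc_a_bank *b = hw;

	if (value)
		b->data |= 1UL << pin;
	else
		b->data &= ~(1UL << pin);
}

static int soc_a_get(void *hw, unsigned pin)
{
	struct soc_a_bank *b = hw;

	return (int)((b->data >> pin) & 1UL);
}

const struct pin_ops soc_a_pin_ops = { soc_a_set, soc_a_get };
```

A second SoC would add only another ops structure and its private state;
pin_set() and pin_get() stay untouched, which is the property the
"infrastructure" half is supposed to have.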

Arnd

2011-04-01 11:33:26

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thursday 31 March 2011, Thomas Gleixner wrote:
> On Thu, 31 Mar 2011, Arnd Bergmann wrote:
>
> Right, but the problem starts in way simpler areas like irq chips and
> gpio stuff, where lots of the IP cores are similar and trivial enough
> to be shared across many SoC families.

Yes, I'm sure that there are more obvious examples than the ones I've
given, those were just the ones that I had noticed myself.

> Even the OMAP "consolidated" code is silly:
>
> But the code above has 6 cases in the switch because nobody abstracted
> it out consequently. Not to talk about the ifdef mess.

Nice illustration.
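
The code Thomas quoted is not reproduced in this mail. As a purely
hypothetical sketch of what "abstracting it out" could look like in general:
instead of a switch over the controller variant in every accessor, the
per-variant differences go into a table that is selected once at init time.
The variant names and register offsets below are invented and are not taken
from the OMAP code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-variant register layout, offsets in 32-bit words. */
struct bank_regs {
	size_t irqstatus;
	size_t irqenable;
};

enum bank_variant { VARIANT_A, VARIANT_B, VARIANT_COUNT };

/* Invented offsets: the only place where the variants differ. */
static const struct bank_regs bank_layouts[VARIANT_COUNT] = {
	[VARIANT_A] = { .irqstatus = 0x18 / 4, .irqenable = 0x1c / 4 },
	[VARIANT_B] = { .irqstatus = 0x2c / 4, .irqenable = 0x34 / 4 },
};

struct bank {
	volatile uint32_t *base;
	const struct bank_regs *regs;	/* chosen once, at init time */
};

void bank_init(struct bank *b, volatile uint32_t *base, enum bank_variant v)
{
	assert(v < VARIANT_COUNT);
	b->base = base;
	b->regs = &bank_layouts[v];
}

/* No switch and no #ifdef here: the layout table carries the difference. */
uint32_t bank_pending_irqs(const struct bank *b)
{
	return b->base[b->regs->irqstatus] & b->base[b->regs->irqenable];
}
```

Adding a further variant then means adding one table entry rather than one
more case to every switch statement.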

Arnd

2011-04-01 11:55:24

by Tomi Valkeinen

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

(dropping people from cc, as this is getting quite DSS specific)

On Fri, 2011-04-01 at 13:22 +0200, Arnd Bergmann wrote:
> On Friday 01 April 2011, Tomi Valkeinen wrote:
> > On Thu, 2011-03-31 at 17:23 +0200, Arnd Bergmann wrote:
> >
> > > * The DSS display drivers introduce new infrastructure including new bus
> > > types that have the complexity to make them completely generic, but
> > > in practice can only work on OMAP, and are clearly not written with
> > > cross-vendor abstractions in mind.
> >
> > If you mean the panel drivers, then I disagree. They are currently OMAP
> > specific, but they are designed so that making them generic shouldn't be
> > too difficult. It's been my aim for a long time already to make the
> > panel drivers generic, but I've never had time and it's never been quite
> > clear to me what would be the best way to do that.
> >
> > The core DSS driver is OMAP specific, and while the DSS IP could in
> > theory be used in some other platform, that is not currently the case
> > and I wouldn't want to needlessly start abstracting things for just the
> > sake of abstracting.
>
> Ok, fair enough. I haven't looked at the OMAP DSS code in detail, so
> I apologise if I did it injustice. What I did review is the ST Ericsson
> MCDE code which was written by taking the OMAP code as an example.
>
> The symptom I'm describing is that infrastructure is getting added
> to platform specific code without making clear that it is meant to
> be generic. I.e. the code is hidden away in the drivers/video/omap
> directory, where other people would not go looking for it.
>
> What I would have hoped you to do is to tell the ST Ericsson people
> when they posted their code that they should instead work with you
> to integrate the two implementations. As far as I remember (I may be
> wrong again), that did not happen.

I don't seem to remember seeing anything from ST Ericsson... While my
memory doesn't always serve me well, I would imagine I'd remember if I'd
seen code based on my code.

Ah, found them from fbdev mail archive. I was rather busy at that
period, I didn't really read the mailing lists.

I totally agree with you that we should have a common panel interface
layer. As I said, I've had it as a target for a long time. And hopefully
now that I moved from Nokia to TI I'll finally have time to work on it
also.

Thanks for pointing me to the MCDE stuff. It doesn't seem to be merged,
though. I need to contact them and see if they're still interested in
working on the common interface.

Tomi

2011-04-01 12:07:28

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011, Tomi Valkeinen wrote:
> Thanks for pointing me to the MCDE stuff. It doesn't seem to be merged,
> though. I need to contact them and see if they're still interested in
> working on the common interface.

I pushed back quite hard on some of the aspects there, which probably
prevented it from going in so far. If the code is as much based on
the OMAP DSS as I think, quite a number of changes are required to
both in order to get them into shape for a decent cross-platform layer,
but there should not be any fundamental issues.

Arnd

2011-04-01 12:15:22

by Tomi Valkeinen

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, 2011-04-01 at 14:07 +0200, Arnd Bergmann wrote:
> On Friday 01 April 2011, Tomi Valkeinen wrote:
> > Thanks for pointing me to the MCDE stuff. It doesn't seem to be merged,
> > though. I need to contact them and see if they're still interested in
> > working on the common interface.
>
> I pushed back quite hard on some of the aspects there, which probably
> prevented it from going in so far. If the code is as much based on
> the OMAP DSS as I think, quite a number of changes are required to
> both in order to get them into shape for a decent cross-platform layer,
> but there should not be any fundamental issues.

I only looked at it briefly, but I'm not sure if there's that much code
that could be common. But I need to read the mail thread properly.

The driver for the display HW on the SoC doesn't probably have anything
in common with OMAP one. What could and should be common is the panel
side, which was just a single patch in that patch set.

Tomi

2011-04-01 12:42:08

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Thursday 31 March 2011, Bill Gatliff wrote:
> On Thu, Mar 31, 2011 at 8:01 AM, Russell King - ARM Linux <[email protected]> wrote:

> > Just look at the removal of AAEC2000, LH7A40x and 2000 lines from the
> > mach-types file removed 6000 lines, which in itself is about the number
> > of lines of change submitted during the last merge window for any one
> > non-ARM architecture. At this point in time with this complaint, I've
> > absolutely no idea why I bothered to do that. I should've left it well
> > alone and then the diffstat percentage would've been smaller. After
> > all, it's "pointless churn".
>
> I think you did it because it was the Right Thing To Do. Even
> positive change can be painful at times.
>
> The majority is exceedingly grateful for the effort you make.

Definitely. I haven't seen anyone in this thread blame Russell for the
mess. As far as I'm concerned, the code in arch/arm consists of
the well-maintained {mm,kernel,lib,common,tools,include} directories
that are actively being taken care of by Russell, and a huge amount
of crap that has accumulated in mach-* and plat-*. Some of it is
arguably better than other parts, but the problem is not that someone
in particular did a bad job writing the code. The problem is that
nobody today is pushing back hard enough on crap getting added.

There is a lot of good work going on to reduce the amount of crap in
the mach code, but my feeling is that it's not keeping up with the
rate of crap getting added by other people. In some ways, the Linaro
project has actually made this worse by helping people get their code
into shape for inclusion (which of course is generally a good thing
to do).

Arnd

2011-04-01 13:54:44

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011, Ingo Molnar wrote:
> IMO the right answer is what Linus and Thomas outlined:
>
> 1) provide a small number of clean examples and clean abstractions
> 2) to not pull new crap from that point on
> 3) do this gradually but consistently
>
> I.e. make all your requirements technical and actionable - avoid sweeping,
> impossible to meet requirements. Do not require people to clean up all of the
> existing mess straight away (they cannot realistically do it), do not summarily
> block the flow of patches, but be firm about drawing a line in the sand and be
> firm about not introducing new mess in a gradually growing list of well-chosen
> areas of focus.
>
> Rinse, repeat.

I believe getting to point 1 is the hard part here. There are a lot of things
that are wrong with the mach-* (and also plat-*) implementations, and I don't
think we have one today that can really serve as an example. Most decisions
made in there made a lot of sense when they were introduced, and declaring
code that was perfectly acceptable yesterday to be unacceptable crap today
is not going to be met with much understanding by someone who just
wants to add support for one more board to 100 already existing ones in the
same SoC family.

I would actually suggest a different much more radical start: Fork the way
that platforms are managed today, and start an alternative way of setting
up boards and devices together with the proven ARM core kernel infrastructure,
based on these observations (please correct me if some of them don't make
sense):

1. The core arch code is not a problem (Russell does a great job here)
2. The platform specific code contains a lot of crap that doesn't belong there
(not enough reviewers to push back on crap)
3. The amount of crap in platform specific files is growing exponentially,
despite the best efforts of a handful of people to clean it up.
4. Having one source file per board does not scale any more.
5. Discoverable hardware would solve this, but is not going to happen
in practice.
6. Board firmware would not solve this and is usually not present.
7. Boot loaders can not be trusted to pass valid information
8. Device tree blobs can solve a lot of the problems, and nobody has
come up with a better solution.
9. All interesting work is going into a handful of platforms, all of which
are ARMv7 based.
10. We do not want to discontinue support for old boards that work fine.
11. Massive changes to existing platforms would cause massive breakage.
12. Supporting many different boards with a single kernel binary is a
useful goal.
13. Infrastructure code should be cross-platform, not duplicated across
platforms.
14. 32 bit ARM is hitting the wall in the next years (Cortex-A15 is
actually adding PAE support, which has failed to solve this on
other architectures).
15. We need to solve the platform problem before 64 bit support comes
and adds another dimension to the complexity.

Based on these assumptions, my preferred strategy would be to create a new
mach-nocrap directory with a documented set of rules (to be adapted when
necessary):

* Strictly no crap
* No board files
* No hardcoded memory maps
* No lists of interrupts and GPIOs
* All infrastructure added must be portable to all ARMv7 based SoCs.
(ARMv6 can be added later)
* 64 bit safe code only.
* SMP safe code only.
* All board specific information must come from a device tree and
be run-time detected.
* Must use the same device drivers as existing platforms
* Should share platform drivers (interrupt controller, gpio, timer, ...)
with existing platforms where appropriate.
* Code quality takes priority over stability in mach-nocrap, but must not
break other platforms.
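
To make the "no board files, no hardcoded memory maps, no lists of
interrupts" rules concrete, here is a small standalone sketch. It is not
kernel code and does not use the real device tree API; struct node, struct
driver and probe_all() are hypothetical stand-ins. The board is described
purely as data, and one generic loop binds drivers to it by compatible
string, so supporting another board means supplying another table instead of
another source file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened board description, standing in for a device tree. */
struct node {
	const char *compatible;
	uint32_t reg_base;	/* comes from the description, not from a header */
	uint32_t reg_size;
	int irq;
};

/* Hypothetical driver entry: matched by string, never by board name. */
struct driver {
	const char *compatible;
	int (*probe)(const struct node *n);
};

/* Bind the first matching driver to each node; return how many probed OK. */
size_t probe_all(const struct node *nodes, size_t nnodes,
		 const struct driver *drivers, size_t ndrivers)
{
	size_t bound = 0;

	assert(nodes != NULL && drivers != NULL);
	for (size_t i = 0; i < nnodes; i++) {
		for (size_t j = 0; j < ndrivers; j++) {
			if (strcmp(nodes[i].compatible,
				   drivers[j].compatible) != 0)
				continue;
			if (drivers[j].probe(&nodes[i]) == 0)
				bound++;
			break;
		}
	}
	return bound;
}

/* Example driver: address and interrupt are taken from the node. */
static uint32_t uart_base_seen;

static int uart_probe(const struct node *n)
{
	uart_base_seen = n->reg_base;
	return n->irq >= 0 ? 0 : -1;
}

const struct driver example_drivers[] = {
	{ "vendor,uart", uart_probe },
};
```

Nodes without a matching driver are simply skipped, so one binary can carry
drivers for hardware that a given board does not have.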

Until we have something working there, I think we should still generally
allow new code to the existing platforms, and even new platforms to be
added, while trying to keep the quality as high as possible but without
changing the rules for them or doing any major treewide reworks.

Once the mach-nocrap approach has turned into something usable, we can
proceed on three fronts:
1. delete actively maintained boards from the other platforms once they
are no longer needed there
2. generalize concepts from mach-nocrap by applying them to all boards,
similar to the cleanup work that people have always been doing.
3. gradually make the rules for adding new code in other platforms stricter,
up to the point where they are bugfix only.

> If companies do not 'bother to push upstream', then management will eventually
> notice negative economic consequences:
>
> ...

Good points, I fully agree with these. I also think that the SoC companies
are actually understanding this nowadays, and that is exactly the reason
why we see so much code getting pushed in.

Arnd

2011-04-01 14:18:35

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wednesday 30 March 2011, Russell King - ARM Linux wrote:
> On Wed, Mar 30, 2011 at 07:06:41PM +0200, Arnd Bergmann wrote:
> >
> > I'm still new to the ARM world, but I think one real problem is the way
> > that all platforms have their own trees with a very flat hierarchy --
> > a lot of people directly ask Linus to pull their trees, and the main
> > way to sort out conflicts is linux-next. The number of platforms in the
> > ARM arch is still increasing, so I assume that this only gets worse.
>
> The reason that we've ended up with a flat hierarchy in terms of
> developers is down to pressure. There was a time when we had a more
> structured system, where the sub-tree people submitted their patches
> to me and the list, they'd be reviewed (mostly and mostly only) by me
> before being merged into my tree and going upstream from there.
>
> As the community grew, it got harder and harder to do decent reviews
> of those patches and so the acceptance rate dropped.
>
> Eventually we switched to the current arrangement where I'm essentially
> only concerned about core ARM code, and a few platforms which I have
> personal interest in (or are contracted to look after.)
>
> For the rest I just look at the patches, and send back what feedback I
> can on them (which is mostly when my mailer turns a line red because
> it's matched one of my mutt regexps for spotting common mistakes.)

Thanks for the background information.

> > This would be no easier if everyone was asking you to pull their trees,
> > as I believe was the case before that. The amount of code getting changed
> > there is too large to get reviewed by a single person, and I believe
> > neither of you really wants the burden to judge if all of the branches
> > are ok (and complain to the authors when they are not).
>
> Absolutely right - and the problem is that we still have no one who is
> willing to step up and do the review.
>
> What I was promised at the time was that by giving sub-tree maintainers
> the loaded pistol, this problem of code quality would in effect be self-
> correcting. If they make a hash out of it, they'd have to be the ones
> to fix it themselves.
>
> Instead, what's happening is that the _entire_ ARM community, ARM
> hardware manufacturers and so forth is being blamed here.

This is not my impression. A lot of people are pointing out that there
are problems, and how they perceive them, but I don't think that anyone
really wants to blame the entire community. Even less do I believe that
people that understand the situation are blaming you personally.

> > Russell, do you think it would help to have an additional ARM platform
> > tree that collects all the changes that impact only the platform code but
> > not the core architecture? I believe that would be a way out, but requires
> > a careful selection of people responsible for it. In particular, I don't
> > think a single person can handle it without good sub-maintainers.
>
> It's not that simple, as what happens when we have core ARM code updates
> which ends up touching every single board file? The result is conflicts
> between trees, and that could get extremely messy indeed.

I believe that conflicts between two trees are really not the issue,
we have tools to solve those in multiple ways, e.g. by pulling in such
updates from a topic branch into both trees, or by declaring one of
the two trees the master that can pull in the other one occasionally
in order to resolve the conflicts.

> To be honest, given the politics, I don't want to be the one stuck in the
> middle, receiving an endless stream of Linus' complaints about the way
> the ARM community works, or the board support code. However, in spite of
> the sub-tree maintainers having the responsibility for their own code I
> still find myself in the firing line.

I think that is partly a perception problem on your side. Understandably,
you still identify yourself with all of the code under arch/arm, so
if someone says that the ARM architecture code has problems, you take
it personally, even though the problems that are cited are almost
exclusively for code that you are not responsible for.

> And I have got to the point of just not giving a damn. I can't change
> the ARM community (I've tried over the years to get more active review
> of platform changes and failed - and had it pointed out by folk like
> Alan Cox, that such a system is impossible due to lack of motivation
> by, eg, an OMAP person to review a Samsung change.)

I think we're actually just getting there. You were not the only one
to point out the problem and Linaro was specifically founded to solve
this issue, as far as I can tell.

Arnd

2011-04-01 14:59:51

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011, Detlef Vollmann wrote:
> On 04/01/11 15:54, Arnd Bergmann wrote:

> > 9. All interesting work is going into a handful of platforms, all of which
> > are ARMv7 based.
> Define interesting.

The ones that are causing the churn that we're talking about.
Platforms that have been working forever and only need to get
the occasional bug fix are boring, i.e. not the problem.

> > 12. Supporting many different boards with a single kernel binary is a
> > useful goal.
> Generally not for embedded systems (for me, a mobile PDA/phone is just a
> small computer with a crappy keyboard, but not an embedded system).

True. For embedded, this would not be an important thing to do, but
also not hurt. For anything that a user might want to put a new
kernel on, this would be helpful though.

> >
> > Based on these assumptions, my preferred strategy would be to create a new
> > mach-nocrap directory with a documented set of rules (to be adapted when
> > necessary):
> >
> > * Strictly no crap
> > * No board files
> Where do you put code that needs to run very early (e.g. pinging the
> watchdog)?

Don't know. I'd hope we can get fast enough to the phase where device
drivers get initialized.

> > * No hardcoded memory maps
> > * No lists of interrupts and GPIOs
> > * All infrastructure added must be portable to all ARMv7 based SoCs.
> > (ARMv6 can be added later)
> > * 64 bit safe code only.
> > * SMP safe code only.
> > * All board specific information must come from a device tree and
> > be run-time detected.
> What do you mean by "run-time detected"?
> For powerpc, we currently have the device tree as DTS in the kernel
> and compile and bundle it together with the kernel.
> As you wrote above: "Discoverable hardware [...] is not going to happen"

I mean writing

    if (device_is_compatible(dev, SOMETHING))
        do_something();

instead of

    #ifdef CONFIG_SOMETHING
        do_something();
    #endif

The run-time information could come from anywhere (device tree, hardware
registers, today one might use the board number), the important point is
not to assume that hardware is present just because someone enabled
a Kconfig option.

I believe that rule is generally accepted today, but we don't always
enforce it.
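
As a rough illustration of that rule, here is a minimal user-space C sketch of a run-time compatibility check. The names `struct sketch_device`, `sketch_device_is_compatible()` and `sketch_setup()` and the string "vendor,something" are hypothetical and only stand in for whatever mechanism supplies the information (device tree, hardware registers, board number); they are not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device description. In practice the strings could come
 * from a device tree, a hardware register, or a board number lookup. */
struct sketch_device {
	const char *compatible[4];	/* NULL-terminated list of names */
	int quirk_applied;
};

/* Return 1 if the device lists 'name' among its compatible strings. */
static int sketch_device_is_compatible(const struct sketch_device *dev,
				       const char *name)
{
	size_t i;

	assert(dev != NULL && name != NULL);
	for (i = 0; i < 4 && dev->compatible[i] != NULL; i++)
		if (strcmp(dev->compatible[i], name) == 0)
			return 1;
	return 0;
}

/* Apply the quirk only when the running hardware reports it, not merely
 * because support for it was compiled in. */
static void sketch_setup(struct sketch_device *dev)
{
	if (sketch_device_is_compatible(dev, "vendor,something"))
		dev->quirk_applied = 1;
}
```

The same binary can then run on a board without the device, because the decision is taken from the description of the running hardware rather than from the build configuration.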

> > * Must use the same device drivers as existing platforms
> > * Should share platform drivers (interrupt controller, gpio, timer, ...)
> > with existing platforms where appropriate.
> > * Code quality takes priority over stability in mach-nocrap, but must not
> > break other platforms.
>
> I agree with the general idea, but nailing down the details in a world
> as diverse as the ARM world will not be easy...

Absolutely, I did not claim to have the single solution that everyone else
couldn't see. Please see this more as an RFC.

Arnd

2011-04-01 15:27:58

by Will Deacon

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Hi Arnd,

On Fri, 2011-04-01 at 14:54 +0100, Arnd Bergmann wrote:
> I would actually suggest a different much more radical start: Fork the way
> that platforms are managed today, and start an alternative way of setting
> up boards and devices together with the proven ARM core kernel infrastructure,
> based on these observations (please correct me if some of them don't make
> sense):
>
> 1. The core arch code is not a problem (Russell does a great job here)
> 2. The platform specific code contains a lot of crap that doesn't belong there
> (not enough reviewers to push back on crap)
> 3. The amount of crap in platform specific files is growing exponentially,
> despite the best efforts of a handful of people to clean it up.
> 4. Having one source file per board does not scale any more.
> 5. Discoverable hardware would solve this, but is not going to happen
> in practice.
> 6. Board firmware would not solve this and is usually not present.
> 7. Boot loaders can not be trusted to pass valid information
> 8. Device tree blobs can solve a lot of the problems, and nobody has
> come up with a better solution.

Right, so this is directly related to point (5) because in essence FDT
is a way to make undiscoverable hardware discoverable by probing the
tree. The `it's just data' mantra sums it up nicely.

> 9. All interesting work is going into a handful of platforms, all of which
> are ARMv7 based.

I think starting out ARMv7-only might make this more manageable but
there are still shed loads of pre-v7 chips out there which we should try
not to break.

> 10. We do not want to discontinue support for old boards that work fine.

[...]

> Based on these assumptions, my preferred strategy would be to create a new
> mach-nocrap directory with a documented set of rules (to be adapted when
> necessary):

This is a nice idea, but I don't think it's entirely practical:

> * Strictly no crap
> * No board files

I don't understand how you can handle `early quirks' without board
files. Does this follow on from Linus' suggestion about moving code out
of the kernel and into the bootloader?

Realistically, I don't think you will ever get away from board files.
The trick is probably to make them as small as possible and common to as
many boards as possible (like the platforms directory for PowerPC).

> * No hardcoded memory maps
> * No lists of interrupts and GPIOs

This is largely just data, so should be do-able once this stuff isn't
needed at compile-time (which is becoming the case with stuff like
dynamic p2v).

Will

2011-04-01 15:50:42

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011, Detlef Vollmann wrote:
> On 04/01/11 16:59, Arnd Bergmann wrote:
> > On Friday 01 April 2011, Detlef Vollmann wrote:
> >> On 04/01/11 15:54, Arnd Bergmann wrote:
> >
> >>> 9. All interesting work is going into a handful of platforms, all of which
> >>> are ARMv7 based.
> >> Define interesting.
> >
> > The ones that are causing the churn that we're talking about.
> > Platforms that have been working forever and only need to get
> > the occasional bug fix are boring, i.e. not the problem.
> In the ARM tree I only know mach-at91.
> Atmel still introduces new SOCs based on ARM926EJ-S, and that makes
> perfect sense for lots of applications.

I thought new ones were generally Cortex-M3 based. Either way, even
if there are exceptions, focusing on ARMv7 at first should give
a good representation of the new development.

> >>> 12. Supporting many different boards with a single kernel binary is a
> >>> useful goal.
> >> Generally not for embedded systems (for me, a mobile PDA/phone is just a
> >> small computer with a crappy keyboard, but not an embedded system).
> >
> > True. For embedded, this would not be an important thing to do, but
> > also not hurt.
> It costs you flash space.

Well, the idea was not to force everyone to enable all options. When this
is done right, the kernel would not be any bigger.

> >>> * Strictly no crap
> >>> * No board files
> >> Where do you put code that needs to run very early (e.g. pinging the
> >> watchdog)?
> >
> > Don't know. I'd hope we can get fast enough to the phase where device
> > drivers get initialized.
> Nope, never happened for me :-(
> (Watchdog timeouts are often 1s or less.)

1s is a long time. Most of the boot process is drivers anyway, so we
just need to make sure that the watchdog is early enough.

> > I believe that rule is generally accepted today, but we don't always
> > enforce it.
> Without device tree, Kconfig option is the only way that really
> works today (no runtime HW detection, and same board ID with different
> setups).

I believe that has never been an accepted way of doing things, you are
supposed to get a new board ID for every new board, hence the name ;-).

Arnd

2011-04-01 15:56:24

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011, Will Deacon wrote:

> > 1. The core arch code is not a problem (Russell does a great job here)
> > 2. The platform specific code contains a lot of crap that doesn't belong there
> > (not enough reviewers to push back on crap)
> > 3. The amount of crap in platform specific files is growing exponentially,
> > despite the best efforts of a handful of people to clean it up.
> > 4. Having one source file per board does not scale any more.
> > 5. Discoverable hardware would solve this, but is not going to happen
> > in practice.
> > 6. Board firmware would not solve this and is usually not present.
> > 7. Boot loaders can not be trusted to pass valid information
> > 8. Device tree blobs can solve a lot of the problems, and nobody has
> > come up with a better solution.
>
> Right, so this is directly related to point (5) because in essence FDT
> is a way to make undiscoverable hardware discoverable by probing the
> tree. The `it's just data' mantra sums it up nicely.

Well, except that because of point 7, device trees are still inferior to
having correct and complete information in hardware.

> > 9. All interesting work is going into a handful of platforms, all of which
> > are ARMv7 based.
>
> I think starting out ARMv7-only might make this more manageable but
> there are still shed loads of pre-v7 chips out there which we should try
> not to break.

Yes, see below: the idea is to touch as little of the existing code
as possible, at least in the first stages.

> > 10. We do not want to discontinue support for old boards that work fine.
>
> [...]
>
> > Based on these assumptions, my preferred strategy would be to create a new
> > mach-nocrap directory with a documented set of rules (to be adapted when
> > necessary):
>
> This is a nice idea, but I don't think it's entirely practical:
>
> > * Strictly no crap
> > * No board files
>
> I don't understand how you can handle `early quirks' without board
> files. Does this follow on from Linus' suggestion about moving code out
> of the kernel and into the bootloader?

There are multiple ways of dealing with this. One way would be to
mandate that the boot loader does the quirks, ideally as little
as possible.

Another option is to have a boot wrapper with board specific code,
which gets run between the regular boot loader and the common
kernel entry point. We might need such a wrapper anyway to pass the
device tree to the kernel.

> Realistically, I don't think you will ever get away from board files.
> The trick is probably to make them as small as possible and common to as
> many boards as possible (like the platforms directory for PowerPC).

Perhaps. But we can start out with strict rules and add exceptions
later when we run out of options.

> > * No hardcoded memory maps
> > * No lists of interrupts and GPIOs
>
> This is largely just data, so should be do-able once this stuff isn't
> needed at compile-time (which is becoming the case with stuff like
> dynamic p2v).

Right.

Arnd

2011-04-01 16:40:33

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, Apr 1, 2011 at 8:55 AM, Arnd Bergmann <[email protected]> wrote:
>
> Well, except that because of point 7, device trees are still inferior to
> having correct and complete information in hardware.

Oh, absolutely.

If you have discoverable hardware, use it.

But by "discoverable hardware" I mean something like PCI config
cycles. IOW, real hardware features. Not some code like

    if (board_signature_is(xyz)) {
        ...

that just maps some _other_ hardware knowledge (reading a SoC ID or
something) into an unrelated thing ("I know this SoC has these bits of
hardware").

So devicetree should never override actual "hardware tells me it
exists here". But you might well have a mapping from SoC ID's to a
compiled-in devicetree thing (this is largely what POWER does, iirc).
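
A minimal C sketch of those two points, under stated assumptions: the SoC ID values, the placeholder blobs and all `sketch_` names are made up for illustration, and real built-in device trees would be full compiled blobs rather than four bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder blobs; real ones would be complete compiled device trees. */
static const uint8_t sketch_dtb_soc_a[] = { 0xd0, 0x0d, 0xfe, 0xed };
static const uint8_t sketch_dtb_soc_b[] = { 0xd0, 0x0d, 0xfe, 0xed };

struct sketch_soc_entry {
	uint32_t soc_id;	/* value read from an ID register (made up) */
	const uint8_t *dtb;	/* built-in description for that SoC */
};

/* One table instead of "if (board_signature_is(xyz))" spread over the code. */
static const struct sketch_soc_entry sketch_soc_table[] = {
	{ 0x1001, sketch_dtb_soc_a },
	{ 0x2002, sketch_dtb_soc_b },
};

/* Look up the built-in description for a SoC ID; NULL if unknown. */
static const uint8_t *sketch_dtb_for_soc(uint32_t soc_id)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_soc_table) / sizeof(sketch_soc_table[0]); i++)
		if (sketch_soc_table[i].soc_id == soc_id)
			return sketch_soc_table[i].dtb;
	return NULL;
}

/*
 * Hardware that can announce itself wins over the description.
 * probed: 1 = hardware says present, 0 = hardware says absent,
 *        -1 = hardware cannot tell, fall back to the description.
 */
static int sketch_device_present(int probed, int described)
{
	assert(probed >= -1 && probed <= 1);
	return probed >= 0 ? probed : described;
}
```

The description only fills in what the hardware cannot report by itself; it never overrides a real probe result.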

Linus

2011-04-01 17:45:50

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, Apr 01, 2011 at 05:50:17PM +0200, Arnd Bergmann wrote:
> On Friday 01 April 2011, Detlef Vollmann wrote:
> > On 04/01/11 16:59, Arnd Bergmann wrote:
> > > On Friday 01 April 2011, Detlef Vollmann wrote:
> > >> On 04/01/11 15:54, Arnd Bergmann wrote:
> > >
> > >>> 9. All interesting work is going into a handful of platforms, all of which
> > >>> are ARMv7 based.
> > >> Define interesting.
> > >
> > > The ones that are causing the churn that we're talking about.
> > > Platforms that have been working forever and only need to get
> > > the occasional bug fix are boring, i.e. not the problem.
> > In the ARM tree I only know mach-at91.
> > Atmel still introduces new SOCs based on ARM926EJ-S, and that makes
> > perfect sense for lots of applications.
>
> I thought new ones were generally Cortex-M3 based. Either way, even
> if there are exceptions, focusing on ARMv7 at first should give
> a good representation of the new development.

If they're M3 then they're a microcontroller, and so would be using uclinux.

2011-04-01 19:54:50

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, 1 Apr 2011, Arnd Bergmann wrote:

> On Friday 01 April 2011, Detlef Vollmann wrote:
> > On 04/01/11 16:59, Arnd Bergmann wrote:
> > > On Friday 01 April 2011, Detlef Vollmann wrote:
> > >> On 04/01/11 15:54, Arnd Bergmann wrote:
> > >
> > >>> 9. All interesting work is going into a handful of platforms, all of which
> > >>> are ARMv7 based.
> > >> Define interesting.
> > >
> > > The ones that are causing the churn that we're talking about.
> > > Platforms that have been working forever and only need to get
> > > the occasional bug fix are boring, i.e. not the problem.
> > In the ARM tree I only know mach-at91.
> > Atmel still introduces new SOCs based on ARM926EJ-S, and that makes
> > perfect sense for lots of applications.
>
> I thought new ones were generally Cortex-M3 based. Either way, even
> if there are exceptions, focusing on ARMv7 at first should give
> a good representation of the new development.

The actual CPU core doesn't matter at all. Whether it is ARM926EJ-S,
XScale, PJ4 or Cortex-A8/A9, _that_ is the part that is extremely well
maintained and abstracted already. The focus should instead be put on
those platforms that are the most used irrespective of their cores. And
by selecting the most used platforms, we have a greater chance to create
community momentum, and good examples will be spread more quickly.

> > >>> 12. Supporting many different boards with a single kernel binary is a
> > >>> useful goal.
> > >> Generally not for embedded systems (for me, a mobile PDA/phone is just a
> > >> small computer with a crappy keyboard, but not an embedded system).
> > >
> > > True. For embedded, this would not be an important thing to do, but
> > > also not hurt.
> > It costs you flash space.
>
> Well, the idea was not to force everyone to enable all options. When this
> is done right, the kernel would not be any bigger.

With many SOCs each with their own peculiarities, the kernel would
obviously grow bigger. But the major advantage of being _able_ to do
that is not ultimately to have only one kernel with multi-board support
even if in some context this has great value, but rather to enforce good
code reuse and abstraction.

Russell suggested that we enable CONFIG_ARM_PATCH_PHYS_VIRT by
default. This is already one way to remove one of the most
fundamental board specific pieces of information that can be deduced at
run time instead of having compile time constants per SOC.
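
To make the idea concrete, here is a small C sketch of a physical/virtual translation that uses an offset determined at boot instead of a per-SOC constant. It is only a model: the real option patches the translation instructions at boot rather than loading a variable, and the `sketch_` names, the 0xC0000000 virtual base and the 16 MiB alignment check are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Kernel virtual base for a 3G/1G split; assumed for this sketch. */
#define SKETCH_PAGE_OFFSET 0xC0000000u

/* Filled in once at boot from where the kernel finds its RAM, instead
 * of being a compile-time constant that differs for every SOC. */
static uint32_t sketch_phys_offset;

static void sketch_init_phys_offset(uint32_t phys_base)
{
	/* Illustrative constraint: RAM base aligned to 16 MiB. */
	assert((phys_base & 0x00FFFFFFu) == 0);
	sketch_phys_offset = phys_base;
}

static uint32_t sketch_virt_to_phys(uint32_t virt)
{
	return virt - SKETCH_PAGE_OFFSET + sketch_phys_offset;
}

static uint32_t sketch_phys_to_virt(uint32_t phys)
{
	return phys - sketch_phys_offset + SKETCH_PAGE_OFFSET;
}
```

With the offset known only at run time, one binary can serve SOCs whose RAM starts at different physical addresses.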

I however don't think it is practical to go off in a separate
mach-nocrap space and do things in parallel. Taking OMAP as an example,
there is already way too big of an infrastructure in place to simply
rewrite it in parallel to new OMAP versions coming up.

It would be more useful and scalable to simply sit down, look at the
current mess, and identify common patterns that can be easily factored
out into some shared library code, and all that would be left in the
board or SOC specific files eventually is the data to register with that
library code. Nothing so complicated as grand plans or planification
that makes it look like a mountain.

Two patterns were identified so far, and they are:

1) GPIO drivers

As Linus observed, in the majority of the cases GPIOs are accessed
through simple memory-mapped registers. Some have absolute state
registers, the others have separate clear/set registers. Suffice to
create two generic GPIO drivers each covering those two common cases,
and those generic drivers would simply register with the higher level
gpiolib code, and all the board code would have to do is to provide
the data for those GPIOs (register offsets, number of GPIOs, etc.).
Whether this data eventually comes from DT is an orthogonal issue.

2) IRQ chip drivers

Again, as Thomas observed, the same issue exists with the majority of
the IRQ chip drivers. Most of them follow a common simple pattern
that can be abstracted in some generic library code due to their very
similar mode of operation. Writing a common driver would leave the
board specific code with only a data table describing hardware
registers.
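
For the GPIO case, a minimal C sketch of what "two generic drivers plus data" could look like. All names (`struct sketch_gpio_bank`, `sketch_gpio_set()`, `sketch_gpio_get()`) and the register layout are hypothetical; this is not the gpiolib interface, and a real driver would also need locking around the read-modify-write path.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_gpio_style {
	SKETCH_GPIO_STATE_REG,	/* one register holds the absolute pin state */
	SKETCH_GPIO_SET_CLEAR,	/* separate write-1-to-set and write-1-to-clear */
};

/* The only thing a board or SOC file would have to provide: data. */
struct sketch_gpio_bank {
	enum sketch_gpio_style style;
	volatile uint32_t *base;	/* mapped register block */
	unsigned int data_reg;		/* word index of the state register */
	unsigned int set_reg;		/* word index, SET_CLEAR style only */
	unsigned int clear_reg;		/* word index, SET_CLEAR style only */
	unsigned int ngpio;		/* number of pins, at most 32 here */
};

static void sketch_gpio_set(const struct sketch_gpio_bank *bank,
			    unsigned int pin, int value)
{
	uint32_t mask;

	assert(bank->ngpio <= 32 && pin < bank->ngpio);
	mask = (uint32_t)1 << pin;

	if (bank->style == SKETCH_GPIO_SET_CLEAR) {
		/* A single write, no read-modify-write needed. */
		bank->base[value ? bank->set_reg : bank->clear_reg] = mask;
	} else {
		/* Read-modify-write; a real driver would hold a lock here. */
		uint32_t v = bank->base[bank->data_reg];

		bank->base[bank->data_reg] = value ? (v | mask) : (v & ~mask);
	}
}

static int sketch_gpio_get(const struct sketch_gpio_bank *bank,
			   unsigned int pin)
{
	assert(bank->ngpio <= 32 && pin < bank->ngpio);
	return (int)((bank->base[bank->data_reg] >> pin) & 1u);
}
```

A board file would then shrink to a table of `struct sketch_gpio_bank` entries, and whether that table is written in C or filled from a device tree does not change the driver code.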

A good example of such rationalization that already happened is the
leds-gpio driver (./drivers/leds/leds-gpio.c), or similarly the
gpio-keys driver (drivers/input/keyboard/gpio_keys.c). I remember when
those board files were implementing their own simple drivers hooking
directly to the input API or the LED API.

After that let's take another identified common pattern and factorize it
out from board code. That might be timers (see RMK's recent
sched_clock() rationalization). That might be clocks (patches from
Jeremy Kerr exist and need merged). Etc.

Eventually we won't be able to find any more identifiable patterns which
are factorisable, and what will be left in board files is only genuine
SOC differences. And if all that is left is actually only data tables,
then maybe such board files could go entirely and that data be passed
via device tree, but that is still a long way off.

I think what is needed here is a bunch of people willing to work on such
things, extracting those common patterns, and creating the
infrastructure to cover them. Once that is in place then we will be in
a position to push back on code submissions that don't use that
infrastructure, and be on the lookout for new patterns to emerge.

Just with the above I think there is sufficient work to keep us busy for
a while.

Nicolas

2011-04-01 20:19:34

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, 1 Apr 2011, Arnd Bergmann wrote:

> On Friday 01 April 2011, Will Deacon wrote:
>
> > > 1. The core arch code is not a problem (Russell does a great job here)
> > > 2. The platform specific code contains a lot of crap that doesn't belong there
> > > (not enough reviewers to push back on crap)
> > > 3. The amount of crap in platform specific files is growing exponentially,
> > > despite the best efforts of a handful of people to clean it up.
> > > 4. Having one source file per board does not scale any more.
> > > 5. Discoverable hardware would solve this, but is not going to happen
> > > in practice.
> > > 6. Board firmware would not solve this and is usually not present.
> > > 7. Boot loaders can not be trusted to pass valid information
> > > 8. Device tree blobs can solve a lot of the problems, and nobody has
> > > come up with a better solution.
> >
> > Right, so this is directly related to point (5) because in essence FDT
> > is a way to make undiscoverable hardware discoverable by probing the
> > tree. The `it's just data' mantra sums it up nicely.
>
> Well, except that because of point 7, device trees are still inferior to
> having correct and complete information in hardware.

I helped with the design of a rather simple patch for ARM allowing for:

cat zImage foobar.dtb > zImage_with_dtb

Then the kernel is smart enough to detect it has a dtb on its tail and
use it.

In a perfect world the bootloader would be bug free and always up to
date with the best DT data. In practice I'm very skeptical this will
always be the case and painless. At least the above makes it very
simple to have a self contained kernel when (not if) need be.
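
As an illustration of how such a tail check can work, here is a minimal C sketch that looks for a flattened device tree header right behind the kernel image. The magic value 0xd00dfeed and the big-endian total size in the second header word come from the flattened device tree format; everything else (the `sketch_` names, passing the kernel length as a parameter) is an assumption for this example and not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FDT_MAGIC 0xd00dfeedu	/* flattened device tree magic */

/* Read a 32-bit big-endian value, as used in the blob header. */
static uint32_t sketch_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Given the concatenated file and the length of the kernel image alone,
 * return a pointer to the appended blob and store its size in *dtb_len,
 * or return NULL if no plausible blob follows the kernel.
 */
static const uint8_t *sketch_find_appended_dtb(const uint8_t *file,
					       size_t file_len,
					       size_t kernel_len,
					       uint32_t *dtb_len)
{
	const uint8_t *tail;
	uint32_t total;

	assert(file != NULL && dtb_len != NULL);
	if (kernel_len > file_len || file_len - kernel_len < 8)
		return NULL;		/* no room for magic + size */

	tail = file + kernel_len;
	if (sketch_be32(tail) != SKETCH_FDT_MAGIC)
		return NULL;		/* nothing appended */

	total = sketch_be32(tail + 4);
	if (total < 8 || total > file_len - kernel_len)
		return NULL;		/* truncated or corrupt blob */

	*dtb_len = total;
	return tail;
}
```

When the check fails, the kernel would simply fall back to whatever the bootloader passed, so images without an appended blob keep working.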

> > > 9. All interesting work is going into a handful of platforms, all of which
> > > are ARMv7 based.
> >
> > I think starting out ARMv7-only might make this more manageable but
> > there are still shed loads of pre-v7 chips out there which we should try
> > not to break.
>
> Yes, see below: the idea is to touch as little of the existing code
> as possible, at least in the first stages.

I don't think this is a realistic approach. See my previous mail. Once
you start identifying concrete and well defined areas that need
cleaning, it is best to come up with solutions that cover as much
existing code as possible, validating that the solution is also worth it
in the process. The more existing code you may cover with your cleanup,
the more likely it will fit future hardware as well.

Nicolas

2011-04-01 21:01:29

by Uwe Kleine-König

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Hello,

On Fri, Apr 01, 2011 at 03:54:47PM -0400, Nicolas Pitre wrote:
> It would be more useful and scalable to simply sit down, look at the
> current mess, and identify common patterns that can be easily factored
> out into some shared library code, and all that would be left in the
> board or SOC specific files eventually is the data to register with that
> library code. Nothing so complicated as grand plans or planification
> that makes it look like a mountain.
>
> Two patterns were identified so far, and they are:
>
> 1) GPIO drivers
>
> As Linus observed, in the majority of the cases GPIOs are accessed
> through simple memory-mapped registers. Some have absolute state
> registers, the others have separate clear/set registers. Suffice to
> create two generic GPIO drivers each covering those two common cases,
> and those generic drivers would simply register with the higher level
> gpiolib code, and all the board code would have to do is to provide
> the data for those GPIOs (register offsets, number of GPIOs, etc.).
> Whether this data eventually comes from DT is an orthogonal issue.
>
> 2) IRQ chip drivers
>
> Again, as Thomas observed, the same issue exists with the majority of
> the IRQ chip drivers. Most of them follow a common simple pattern
> that can be abstracted in some generic library code due to their very
> similar mode of operation. Writing a common driver would leave the
> board specific code with only a data table describing hardware
> registers.
>
> A good example of such rationalization that already happened is the
> leds-gpio driver (./drivers/leds/leds-gpio.c), or similarly the
> gpio-keys driver (drivers/input/keyboard/gpio_keys.c). I remember when
> those board files were implementing their own simple drivers hooking
> directly to the input API or the LED API.
>
> After that let's take another identified common pattern and factorize it
> out from board code. That might be timers (see RMK's recent
> sched_clock() rationalization). That might be clocks (patches from
> Jeremy Kerr exist and need merged). Etc.
Another one is pwm (git ls-files arch/arm | grep pwm). A general
pwm framework was already discussed on lkml and linux-embedded
(http://thread.gmane.org/gmane.linux.ports.mips.general/29037/focus=44475);
I don't know the details though.

Best regards
Uwe

--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |

2011-04-01 21:10:12

by Kevin Hilman

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

Arnd Bergmann <[email protected]> writes:

> On Friday 01 April 2011, Detlef Vollmann wrote:
>> On 04/01/11 15:54, Arnd Bergmann wrote:
>
>> > 9. All interesting work is going into a handful of platforms, all of which
>> > are ARMv7 based.
>> Define interesting.
>
> The ones that are causing the churn that we're talking about.
> Platforms that have been working forever and only need to get
> the occasional bug fix are boring, i.e. not the problem.

I'm not sure I follow the ARMv7-only thinking either.

Picking ARMv7 only would be a good way to avoid part of the problem, but
IMO, it doesn't really address the root causes. Part of the ugliness of
the platform-specific hackery (and the "churn" to clean some of it up)
is precisely due to support for multiple ARM architecture versions, and
the various SoCs in a family that use them. For example, linux-omap
supports OMAP1 (ARMv5), OMAP2 (ARMv6), OMAP3 (ARMv7) and OMAP4 (ARMv7
SMP), and OMAP2/3/4 in a single binary.

Also, since we've only very recently got to the point of being able to
support ARMv6 + ARMv7 UP & SMP in the same kernel, making a decision now
that only ARMv7 is important seems like a step backwards. If the
ultimate goal is getting to a point where we have infrastructure that can
be cross-SoC, surely this same infrastructure should support multiple ARM
architecture revisions.

The kernel is only part of many open-source projects, and many of these
projects are still using older hardware because it's cheap, available
and hackable. Supporting ARMv7 only might be a win for those selling
new hardware, but not necessarily a win for the broader open-source
community.

Kevin (obviously not speaking for my new employer)

2011-04-01 21:32:48

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011 23:10:04 Kevin Hilman wrote:
> Arnd Bergmann <[email protected]> writes:
>
> > On Friday 01 April 2011, Detlef Vollmann wrote:
> >> On 04/01/11 15:54, Arnd Bergmann wrote:
> >
> >> > 9. All interesting work is going into a handful of platforms, all of which
> >> > are ARMv7 based.
> >> Define interesting.
> >
> > The ones that are causing the churn that we're talking about.
> > Platforms that have been working forever and only need to get
> > the occasional bug fix are boring, i.e. not the problem.
>
> I'm not sure I follow the ARMv7-only thinking either.
>
> Picking ARMv7 only would be a good way to avoid part of the problem, but
> IMO, it doesn't really address the root causes. Part of the ugliness of
> the platform-specific hackery (and the "churn" to clean some of it up)
> is precisely due to support for multiple ARM architecture versions, and
> the various SoCs in a family that use them. For example, linux-omap
> supports OMAP1 (ARMv5), OMAP2 (ARMv6), OMAP3 (ARMv7) and OMAP4 (ARMv7
> SMP), and OMAP2/3/4 in a single binary.
>
> Also, since we've only very recently got to the point of being able to
> support ARMv6 + ARMv7 UP & SMP in the same kernel, making a decision now
> that only ARMv7 is important seems like a step backwards. If the
> ultimate goal is getting to a point where we have infrastructure that can
> be cross-SoC, surely this same infrastructure should support multiple ARM
> architecture revisions.

Yes, forget about the ARMv7 part of my proposal, that was not a main point.

If we decide to have a new clean platform variant the way I suggested,
it would be nice to support all machines in a single kernel binary,
and at least v6+v7 is a solved problem.

Supporting a second kernel binary up to v5 with the same source is also
simple, as would be big-endian/little-endian variants, or thumb2/arm variants.
We might not want to do all combinations from the start though, and I would
choose ARMv6/v7-thumb2-le simply because that's what Linaro is focusing
on. The idea is to start with a clearly defined set, but write the code
in a way that makes it possible to extend in other directions.

Arnd

2011-04-01 21:51:22

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday, 1 April 2011, Arnd Bergmann <[email protected]> wrote:
> On Friday 01 April 2011 23:10:04 Kevin Hilman wrote:
>> Arnd Bergmann <[email protected]> writes:
>>
>> > On Friday 01 April 2011, Detlef Vollmann wrote:
>> >> On 04/01/11 15:54, Arnd Bergmann wrote:
>> >
>> >> > 9. All interesting work is going into a handful of platforms, all of which
>> >> > are ARMv7 based.
>> >> Define interesting.
>> >
>> > The ones that are causing the churn that we're talking about.
>> > Platforms that have been working forever and only need to get
>> > the occasional bug fix are boring, i.e. not the problem.
>>
>> I'm not sure I follow the ARMv7-only thinking either.
>>
>> Picking ARMv7 only would be a good way to avoid part of the problem, but
>> IMO, it doesn't really address the root causes. Part of the ugliness of
>> the platform-specific hackery (and the "churn" to clean some of it up)
>> is precisely due to support for multiple ARM architecture versions, and
>> the various SoCs in a family that use them. For example, linux-omap
>> supports OMAP1 (ARMv5), OMAP2 (ARMv6), OMAP3 (ARMv7) and OMAP4 (ARMv7
>> SMP), and OMAP2/3/4 in a single binary.
>>
>> Also, since we've only very recently got to the point of being able to
>> support ARMv6 + ARMv7 UP & SMP in the same kernel, making a decision now
>> that only ARMv7 is important seems like a step backwards. If the
>> ultimate goal is getting to a point where we have infrastructure that can
>> be cross-SoC, surely this same infrastructure should support multiple ARM
>> architecture revisions.
>
> Yes, forget about the ARMv7 part of my proposal, that was not a main point.
>
> If we decide to have a new clean platform variant the way I suggested,
> it would be nice to support all machines in a single kernel binary,
> and at least v6+v7 is a solved problem.
>
> Supporting a second kernel binary up to v5 with the same source is also
> simple, as would be big-endian/little-endian variants, or thumb2/arm variants.
> We might not want to do all combinations from the start though, and I would
> choose ARMv6/v7-thumb2-le simply because that's what Linaro is focusing
> on. The idea is to start with a clearly defined set, but write the code
> in a way that makes it possible to extend in other directions.

Thumb-2 is ARMv7 only. If you want a v6+v7 binary it would need to be
compiled to ARM.

--
Catalin

2011-04-01 22:09:03

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Friday 01 April 2011 21:54:47 Nicolas Pitre wrote:
> On Fri, 1 Apr 2011, Arnd Bergmann wrote:
> >
> > I thought new ones were generally Cortex-M3 based. Either way, even
> > if there are exceptions, focusing on ARMv7 at first should give
> > a good representation of the new development.
>
> The actual CPU core doesn't matter at all. Whether it is ARM926EJ-S,
> XScale, PJ4 or Cortex-A8/A9, _that_ is the part that is extremely well
> maintained and abstracted already. The focus should instead be put on
> those platforms that are the most used irrespective of their cores. And
> by selecting the most used platforms, we have a greater chance to create
> community momentum, and good examples will be spread more quickly.

Agreed.

> I however don't think it is practical to go off in a separate
> mach-nocrap space and do things in parallel. Taking OMAP as an example,
> there is already way too big of an infrastructure in place to simply
> rewrite it in parallel to new OMAP versions coming up.
>
> It would be more useful and scalable to simply sit down, look at the
> current mess, and identify common patterns that can be easily factored
> out into some shared library code, and all that would be left in the
> board or SOC specific files eventually is the data to register with that
> library code. Nothing so complicated as grand plans or planification
> that makes it look like a mountain.

This is exactly the question it comes down to. So far, we have focused
on cleaning up platforms bit by bit. Given sufficient resources, I'm
sure this can work. You assume that continuing on this path is the
fastest way to clean up the whole mess, while my suggestion is based
on the assumption that we can do better by starting a small fork.

I think we can both agree that by equally distributing the workforce
to both approaches, we'd be worse off than doing one of them right ;-)

> Two patterns were identified so far, and they are:
>
> 1) GPIO drivers
>
> As Linus observed, in the majority of the cases GPIOs are accessed
> through simple memory-mapped registers. Some have absolute state
> registers, the others have separate clear/set registers. Suffice to
> create two generic GPIO drivers each covering those two common cases,
> and those generic drivers would simply register with the higher level
> gpiolib code, and all the board code would have to do is to provide
> the data for those GPIOs (register offsets, number of GPIOs, etc.).
> Whether this data eventually comes from DT is an orthogonal issue.

Yes, this sounds like a great idea, but it's also unrelated to whether
we'd do a new platform, or introduce this into the existing platforms.
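
To make the two register styles concrete, here is a minimal user-space
sketch. It is not code from any tree: struct mmio_gpio_bank, the enum and
both helpers are hypothetical names, and plain variables stand in for the
memory-mapped registers. The only thing a board would supply is the data
in the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: how a bank's output registers behave. */
enum gpio_reg_style {
	GPIO_STYLE_STATE,	/* one register holds the absolute pin state */
	GPIO_STYLE_SET_CLEAR,	/* separate write-1-to-set / write-1-to-clear */
};

/* Hypothetical per-bank data a board or SoC file would provide. */
struct mmio_gpio_bank {
	enum gpio_reg_style style;
	volatile uint32_t *dat;	/* state register, also used for reads */
	volatile uint32_t *set;	/* only used with GPIO_STYLE_SET_CLEAR */
	volatile uint32_t *clr;	/* only used with GPIO_STYLE_SET_CLEAR */
	unsigned int ngpio;	/* number of pins in this bank (at most 32) */
};

/* Drive one pin high or low using whichever style the bank declares. */
static void mmio_gpio_set(struct mmio_gpio_bank *b, unsigned int pin, int value)
{
	uint32_t bit;

	assert(pin < b->ngpio && pin < 32);
	bit = (uint32_t)1 << pin;

	if (b->style == GPIO_STYLE_SET_CLEAR) {
		/* Hardware changes only the pins whose bit is written as 1. */
		if (value)
			*b->set = bit;
		else
			*b->clr = bit;
	} else {
		/* Read-modify-write of the absolute state register. */
		if (value)
			*b->dat |= bit;
		else
			*b->dat &= ~bit;
	}
}

/* Read back the current level of one pin from the state register. */
static int mmio_gpio_get(const struct mmio_gpio_bank *b, unsigned int pin)
{
	assert(pin < b->ngpio && pin < 32);
	return (int)((*b->dat >> pin) & 1);
}
```

A real driver would also need locking around the read-modify-write path
and would register the bank with gpiolib; both are left out here.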

> 2) IRQ chip drivers
>
> Again, as Thomas observed, the same issue exists with the majority of
> the IRQ chip drivers. Most of them follow a common simple pattern
> that can be abstracted in some generic library code due to their very
> similar mode of operation. Writing a common driver would leave the
> board specific code with only a data table describing hardware
> registers.

Also sounds really good.
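
As a rough illustration of "only a data table describing hardware
registers", the sketch below drives a simple interrupt controller from a
descriptor. Everything in it is an assumption made for the example: the
struct names, the field layout, and the use of an array in place of real
MMIO. It is not the API of any existing kernel library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a simple interrupt controller. */
struct simple_irq_chip_desc {
	size_t mask_reg;	/* word index of the mask/enable register */
	size_t ack_reg;		/* word index of the acknowledge register */
	int mask_is_enable;	/* nonzero: 1 = enabled; zero: 1 = masked */
	unsigned int nr_irqs;	/* number of interrupt lines (at most 32) */
};

struct simple_irq_chip {
	volatile uint32_t *regs;		/* register block base */
	const struct simple_irq_chip_desc *desc;
};

/* Common helper: translate "masked or not" into the chip's polarity. */
static void simple_irq_write_mask(struct simple_irq_chip *c,
				  unsigned int irq, int masked)
{
	uint32_t bit;
	int set_bit;

	assert(irq < c->desc->nr_irqs && irq < 32);
	bit = (uint32_t)1 << irq;
	set_bit = c->desc->mask_is_enable ? !masked : masked;

	if (set_bit)
		c->regs[c->desc->mask_reg] |= bit;
	else
		c->regs[c->desc->mask_reg] &= ~bit;
}

static void simple_irq_mask(struct simple_irq_chip *c, unsigned int irq)
{
	simple_irq_write_mask(c, irq, 1);
}

static void simple_irq_unmask(struct simple_irq_chip *c, unsigned int irq)
{
	simple_irq_write_mask(c, irq, 0);
}

/* Acknowledge by writing a 1 to the line's bit in the ack register. */
static void simple_irq_ack(struct simple_irq_chip *c, unsigned int irq)
{
	assert(irq < c->desc->nr_irqs && irq < 32);
	c->regs[c->desc->ack_reg] = (uint32_t)1 << irq;
}
```

With this shape, supporting another controller of the same family means
adding one more descriptor rather than another copy of the functions.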

> I think what is needed here is a bunch of people willing to work on such
> things, extracting those common patterns, and creating the
> infrastructure to cover them. Once that is in place then we will be in
> a position to push back on code submissions that don't use that
> infrastructure, and be on the lookout for new patterns to emerge.
>
> Just with the above I think there is sufficient work to keep us busy for
> a while.

That is true, and I think we will need to do this. But as far as I can tell,
the problems that you talk about addressing are a different class from the
ones I was thinking of, because they only deal with areas that are already
isolated drivers with an existing API.

The things that I see as harder to do are where we need to change the
way that parts of the platform code interact with each other:

* platform specific IOMMU interfaces that need to be migrated to common
interfaces
* duplicated but slightly different header files in include/mach/
* static platform device definitions that get migrated to device tree
definitions.

Changing these tree-wide feels like open-heart surgery, and we'd spend
much time trying not to break stuff that could better be used to fix
other stuff.

The example that I have in mind is the time when we had a powerpc and a
ppc architecture in parallel, with ppc supporting a lot of hardware
that powerpc did not, but all new development getting done on powerpc.

This took years longer than we had expected at first, but I still think
it was a helpful fork. On ARM, we are in a much better shape in the
core code than what arch/ppc was, so there would be no point forking
that, but the problem on the platform code is quite similar.

Arnd

2011-04-02 02:24:52

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Sat, 2 Apr 2011, Arnd Bergmann wrote:

> On Friday 01 April 2011 21:54:47 Nicolas Pitre wrote:
> > I however don't think it is practical to go off in a separate
> > mach-nocrap space and do things in parallel. Taking OMAP as an example,
> > there is already way too big of an infrastructure in place to simply
> > rewrite it in parallel to new OMAP versions coming up.
> >
> > It would be more useful and scalable to simply sit down, look at the
> > current mess, and identify common patterns that can be easily factored
> > out into some shared library code, and all that would be left in the
> > board or SOC specific files eventually is the data to register with that
> > library code. Nothing so complicated as grand plans or planification
> > that makes it look like a mountain.
>
> This is exactly the question it comes down to. So far, we have focused
> on cleaning up platforms bit by bit. Given sufficient resources, I'm
> sure this can work. You assume that continuing on this path is the
> fastest way to clean up the whole mess, while my suggestion is based
> on the assumption that we can do better by starting a small fork.

I don't think any fork would gain any traction. That would only, heh,
fork the work force into two suboptimal branches for quite a while, and
given that we're talking about platform code, by the time the new branch
is usable and useful the hardware will probably be obsolete. The only
way this may work is for totally new platforms but we're not talking
about a fork in that case.

> I think we can both agree that by equally distributing the workforce
> to both approaches, we'd be worse off than doing one of them right ;-)

Absolutely.

> > I think what is needed here is a bunch of people willing to work on such
> > things, extracting those common patterns, and creating the
> > infrastructure to cover them. Once that is in place then we will be in
> > a position to push back on code submissions that don't use that
> > infrastructure, and be on the lookout for new patterns to emerge.
> >
> > Just with the above I think there is sufficient work to keep us busy for
> > a while.
>
> That is true, and I think we will need to do this. But as far as I can tell,
> the problems that you talk about addressing are a different class from the
> ones I was thinking of, because they only deal with areas that are already
> isolated drivers with an existing API.

They are areas with the best return on the investment. This has the
potential of making quite a bunch of code go away quickly. And the
goal is indeed to keep platform code hooking into existing APIs under
control, so that global maintenance tasks such as the one tglx did are
less painful. Obscure board code that no one else cares about because no
other boards share the same hardware model, and which doesn't rely on
common kernel infrastructure, is not really a problem even if it looks
like crap because no one will have to touch it. And eventually the
board will become unused and we'll just delete that code.

> The things that I see as harder to do are where we need to change the
> way that parts of the platform code interact with each other:
>
> * platform specific IOMMU interfaces that need to be migrated to common
> interfaces

This can be done by actually forking the platform specific IOMMU code
only, just for the time required to migrate drivers to the common
interface.

> * duplicated but slightly different header files in include/mach/

Oh, actually that's part of the easy problems. This simply requires time
to progressively do the boring work.

With CONFIG_ARM_PATCH_PHYS_VIRT turned on we can get rid of almost all
instances of arch/arm/mach-*/include/mach/memory.h already.

Getting rid of all instances of arch/arm/mach-*/include/mach/vmalloc.h
can be trivially achieved by simply moving the VMALLOC_END values into
the corresponding struct machine_desc instances.

And so on for many other files. This is all necessary for the
single-binary multi-SOC kernel work anyway.
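
For illustration only, the vmalloc.h point can be sketched as follows:
a compile-time constant per platform becomes a field read at run time
from the selected machine description. The struct below is a simplified
stand-in, not the kernel's struct machine_desc, and the vmalloc_end
field and helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a machine description (hypothetical fields). */
struct machine_desc_sketch {
	const char *name;
	unsigned long vmalloc_end;	/* replaces a per-platform #define */
};

/* The machine selected at boot; NULL until select_machine() runs. */
static const struct machine_desc_sketch *current_machine;

static void select_machine(const struct machine_desc_sketch *m)
{
	assert(m != NULL);
	current_machine = m;
}

/* Run-time replacement for a compile-time VMALLOC_END constant. */
static unsigned long vmalloc_end(void)
{
	assert(current_machine != NULL);
	return current_machine->vmalloc_end;
}
```

The trade-off is a memory load instead of an immediate constant, in
exchange for one binary that can serve platforms with different values.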

> * static platform device definitions that get migrated to device tree
> definitions.

That requires some kind of compatibility layer to make the transition
transparent to users. I think Grant had some good ideas for this.

> Changing these tree-wide feels like open-heart surgery, and we'd spend
> much time trying not to break stuff that could better be used to fix
> other stuff.

Well, depends how you see it. Sure this might cause some occasional
breakages, but normally those should be pretty obvious and easy to fix.
And the more we can do that stuff, the better future code adhering to the
new model will be.

> The example that I have in mind is the time when we had a powerpc and a
> ppc architecture in parallel, with ppc supporting a lot of hardware
> that powerpc did not, but all new development getting done on powerpc.
>
> This took years longer than we had expected at first, but I still think
> it was a helpful fork. On ARM, we are in a much better shape in the
> core code than what arch/ppc was, so there would be no point forking
> that, but the problem on the platform code is quite similar.

Nah, I don't think we want to go there at all. The problem on the
platform code is probably much worse on ARM due to the greater diversity
of supported hardware. If on PPC moving stuff across the fork took more
time on a year scale than expected, I think that on ARM we would simply
never see the end of it. And the incentive would not really be there
either, unlike when the core code is concerned and everyone is affected.

Nicolas

2011-04-02 02:59:29

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, Apr 01, 2011 at 03:54:47PM -0400, Nicolas Pitre wrote:

> 1) GPIO drivers

> As Linus observed, in the majority of the cases GPIOs are accessed
> through simple memory-mapped registers. Some have absolute state
> registers, the others have separate clear/set registers. Suffice to
> create two generic GPIO drivers each covering those two common cases,
> and those generic drivers would simply register with the higher level
> gpiolib code, and all the board code would have to do is to provide
> the data for those GPIOs (register offsets, number of GPIOs, etc.).
> Whether this data eventually comes from DT is an orthogonal issue.

For this case we actually already have the basic_mmio_gpio driver in
tree, we should be pushing for wider usage of that.

2011-04-02 03:27:43

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, Apr 01, 2011 at 05:55:57PM +0200, Arnd Bergmann wrote:
> On Friday 01 April 2011, Will Deacon wrote:

> > I don't understand how you can handle `early quirks' without board
> > files. Does this follow on from Linus' suggestion about moving code out
> > of the kernel and into the bootloader?

> There are multiple ways of dealing with this. One way would be to
> mandate that the boot loader does the quirks, ideally as little
> as possible.

Though we then get into the issues with bootloader quality and risk.

> Another option is to have a boot wrapper with board specific code,
> which gets run between the regular boot loader and the common
> kernel entry point. We might need such a wrapper anyway to pass the
> device tree to the kernel.

This sounds an awful lot like a board file which doesn't get to use any
of the in-kernel infrastructure like bus controller drivers or chip
drivers to help, which feels retrograde. I understand where you're
coming from on this but an absolute ban feels overly restrictive here;
it seems like we'd be better off allowing board files but pushing back
strongly on anything that should be data...

> > Realistically, I don't think you will ever get away from board files.
> > The trick is probably to make them as small as possible and common to as
> > many boards as possible (like the platforms directory for PowerPC).

> Perhaps. But we can start out with strict rules and add exceptions
> later when we run out of options.

...which is pretty much what you're saying here.

2011-04-02 04:38:20

by Richard Cochran

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, Apr 01, 2011 at 04:19:31PM -0400, Nicolas Pitre wrote:
>
> In a perfect world the bootloader would be bug free and always up to
> date with the best DT data. In practice I'm very skeptical this will
> always be the case and painless. At least the above makes it very
> simple to have a self contained kernel when (not if) need be.

Yes, my experience with DT on powerpc teaches me that, although DT
sounds wonderful in theory, in practice kernel/dtb/uboot form a love
triangle (or perhaps a hate triangle) where all three points must be
exactly up to date with each other. If one part is even just a month
or two too old/new, then your kernel might not boot.

Richard

2011-04-03 15:28:22

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Saturday 02 April 2011, Nicolas Pitre wrote:
> On Sat, 2 Apr 2011, Arnd Bergmann wrote:
> > On Friday 01 April 2011 21:54:47 Nicolas Pitre wrote:
> > > I however don't think it is practical to go off in a separate
> > > mach-nocrap space and do things in parallel. Taking OMAP as an example,
> > > there is already way too big of an infrastructure in place to simply
> > > rewrite it in parallel to new OMAP versions coming up.
> > >
> > > It would be more useful and scalable to simply sit down, look at the
> > > current mess, and identify common patterns that can be easily factored
> > > out into some shared library code, and all that would be left in the
> > > board or SOC specific files eventually is the data to register with that
> > > library code. Nothing so complicated as grand plans or planification
> > > that makes it look like a mountain.
> >
> > This is exactly the question it comes down to. So far, we have focused
> > on cleaning up platforms bit by bit. Given sufficient resources, I'm
> > sure this can work. You assume that continuing on this path is the
> > fastest way to clean up the whole mess, while my suggestion is based
> > on the assumption that we can do better by starting a small fork.
>
> I don't think any fork would gain any traction. That would only, heh,
> fork the work force into two suboptimal branches for quite a while, and
> given that we're talking about platform code, by the time the new branch
> is usable and useful the hardware will probably be obsolete. The only
> way this may work is for totally new platforms but we're not talking
> about a fork in that case.

Doing it just for new platforms could be an option if we decide not
to do a fork. The potential danger there is that new platform maintainers
could feel unfairly treated because they'd have to do much more
work than the existing ones in order to get merged.

> > The things that I see as harder to do are where we need to change the
> > way that parts of the platform code interact with each other:
> >
> > * platform specific IOMMU interfaces that need to be migrated to common
> > interfaces
>
> This can be done by actually forking the platform specific IOMMU code
> only, just for the time required to migrate drivers to the common
> interface.

True.

> > * duplicated but slightly different header files in include/mach/
>
> Oh, actually that's part of the easy problems. This simply requires time
> to progressively do the boring work.
>
> With CONFIG_ARM_PATCH_PHYS_VIRT turned on we can get rid of almost all
> instances of arch/arm/mach-*/include/mach/memory.h already.
>
> Getting rid of all instances of arch/arm/mach-*/include/mach/vmalloc.h
> can be trivially achieved by simply moving the VMALLOC_END values into
> the corresponding struct machine_desc instances.
>
> And so on for many other files. This is all necessary for the
> single-binary multi-SOC kernel work anyway.

I would phrase that differently: There are multiple good reasons why we
want to get rid of conflicting mach/*.h files, but there are at least
two ways to get there.

> > * static platform device definitions that get migrated to device tree
> > definitions.
>
> That requires some kind of compatibility layer to make the transition
> transparent to users. I think Grant had some good ideas for this.

Yes, there are a number of good ideas (device tree fragments,
platform_data constructors, gradually replacing platform data
with properties, and possibly some more things). We'll probably
use a combination of these, and something is needed either
way.

> > The example that I have in mind is the time when we had a powerpc and a
> > ppc architecture in parallel, with ppc supporting a lot of hardware
> > that powerpc did not, but all new development getting done on powerpc.
> >
> > This took years longer than we had expected at first, but I still think
> > it was a helpful fork. On ARM, we are in a much better shape in the
> > core code than what arch/ppc was, so there would be no point forking
> > that, but the problem on the platform code is quite similar.
>
> Nah, I don't think we want to go there at all. The problem on the
> platform code is probably much worse on ARM due to the greater diversity
> of supported hardware. If on PPC moving stuff across the fork took more
> time on a year scale than expected, I think that on ARM we would simply
> never see the end of it. And the incentive would not really be there
> either, unlike when the core code is concerned and everyone is affected.

What actually took really long was getting to the point where we
could completely delete the old arch/ppc directory, and we might
never want to do the equivalent here and move all existing platforms
over to common code.

There are a few other examples that were done in a similar way:
* The drivers/ide code still serves a few hardware platforms that
never had anyone write a new libata code. Libata itself has
been in a good shape for a long time though.
* Same thing with ALSA: sound/oss is still there for some really
odd hardware, while ALSA is used everywhere else
* Many of the drivers getting into drivers/staging are so bad that
they simply get rewritten into a new driver and then deleted,
like arch/ppc.

We generally try to do gradual cleanups to any kernel code that is
worth keeping, because as you say the duplication itself causes a
lot of friction. For particularly hard cases, doing a replacement
implementation is an exceptional way out. What we need to find a
consensus on is how bad the problem in arch/arm/mach-*/ is:

1. No fundamental problem, just needs some care to clean up (your
position, I guess), so we do what we always do and keep doing
gradual improvements, including treewide API changes.
2. Bad enough that starting a new competing implementation is easier
because it lets us try different things more easily and reduce
the number of treewide changes to all existing platforms.
(this is where I think we are) Like IDE and OSS, the old code
can still get improved and bug fixed, but concentrating on new
code gives us better freedom to make progress more quickly.
3. In need of a complete replacement, like arch/ppc and a lot of
drivers/staging. I'm not arguing that it's that bad.

Arnd

2011-04-03 16:04:09

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Sun, Apr 03, 2011 at 05:26:37PM +0200, Arnd Bergmann wrote:
> There are a few other examples that were done in a similar way:
> * The drivers/ide code still serves a few hardware platforms that
> never had anyone write a new libata code. Libata itself has
> been in a good shape for a long time though.

And there are platforms where libata just doesn't work but the old
ide driver does. My firewall with a CF card in is completely unusable
with libata, but works fine with the old IDE driver.

That's partly caused by there being no support for the ISA IOCS16 signal
on that hardware, so the ISA bus needs specific handling depending on
the behaviour of the ISA device being addressed. Yes, crap hardware,
nothing new there.

> We generally try to do gradual cleanups to any kernel code that is
> worth keeping, because as you say the duplication itself causes a
> lot of friction. For particularly hard cases, doing a replacement
> implementation is an exceptional way out. What we need to find a
> consensus on is how bad the problem in arch/arm/mach-*/ is:
>
> 1. No fundamental problem, just needs some care to clean up (your
> position, I guess), so we do what we always do and keep doing
> gradual improvements, including treewide API changes.
> 2. Bad enough that starting a new competing implementation is easier
> because it lets us try different things more easily and reduce
> the number of treewide changes to all existing platforms.
> (this is where I think we are) Like IDE and OSS, the old code
> can still get improved and bug fixed, but concentrating on new
> code gives us better freedom to make progress more quickly.
> 3. In need of a complete replacement, like arch/ppc and a lot of
> drivers/staging. I'm not arguing that it's that bad.

Having just looked at the clocksource stuff, there's 9 up-counting 32-bit
clocksources which are relatively easy to port to a single piece of code.

There's a number of down-counting clocksources using various methods to
convert to an up-counting value - sometimes -readl(), sometimes
cs->mask - readl() and sometimes ~readl().

Then there's those which are either 16-bit or 32-bit, and some of those
16-bit implementations must use readw() to avoid bus faults.

Combining all those together you end up with something pretty disgusting,
and an initialization function taking 7 arguments (iomem pointer, name,
rating, tick rate, size, up/down counter, clocksource flags).

Does it lead to more maintainable code? I'm not convinced - while it
certainly reduces the amount of code, the reduced code is rather
more complex.

Would an alternative be to introduce separate mmio-32bit-up, mmio-32bit-down,
mmio-16bit-up and mmio-16bit-down clocksources? That also doesn't give
me a good feeling about the result.

Then there's those which change the cs->read function pointer at runtime,
and those which share that pointer with their sched_clock() implementation.

And those which need to do special things (like loop until read values are
stable) in their read functions which probably can't ever be supported by
common code.
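
For readers unfamiliar with the "loop until read values are stable" case,
the pattern looks roughly like the sketch below. The reader callback and
the fake register sequence are hypothetical and exist only so the example
can run outside the kernel; a real implementation would read the counter
register directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessor for a counter register. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Re-read the counter until two consecutive reads agree, for hardware
 * whose counter may return a torn value while it is changing.
 */
static uint32_t read_stable(reg_read_fn rd, void *ctx)
{
	uint32_t prev = rd(ctx);
	uint32_t cur = rd(ctx);

	while (prev != cur) {
		prev = cur;
		cur = rd(ctx);
	}
	return cur;
}

/* Fake register for demonstration: replays a fixed list of values. */
struct fake_seq {
	const uint32_t *vals;
	size_t n;	/* number of values, must be at least 1 */
	size_t pos;	/* number of reads performed so far */
};

static uint32_t fake_read(void *ctx)
{
	struct fake_seq *s = ctx;
	size_t i = s->pos < s->n ? s->pos : s->n - 1;

	s->pos++;
	return s->vals[i];	/* the last value repeats forever */
}
```

The extra reads and the data-dependent loop are what make this awkward
to fold into a single table-driven read function.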

2011-04-03 22:19:17

by Benjamin Herrenschmidt

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, 2011-04-01 at 16:28 +0200, Detlef Vollmann wrote:
> > * No board files
> Where do you put code that needs to run very early (e.g. pinging the
> watchdog)?

Even on powerpc I keep board files :-)

The main thing is:

- The generic -> board linkage must not be hard (ie, no
platform_restart, but a board_ops.restart() etc....)

- An average board file is a few hundred lines long, that's it, mostly
it hooks up to generically provided functions, tho it gets the choice of
_which_ ones to hookup.

- It can still quirk/fixup a thing or two if needed, I think it's
useful to keep that around, as long as such "quirks" remain small and
few. At the end of the day, if dealing with one board special case gives
you the choice between changing a ton of infrastructure/core to
introduce a new abstraction to deal with -that- special case vs. having
a one liner fixup in the platform code, the latter is the most sensible
option. The hard part of course is to have sensible maintainers to make
sure this doesn't grow back to the old mess.
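
The "soft linkage" in the first point can be sketched as an ops table
with optional hooks: generic code calls through the table and falls back
to a default when a board leaves a hook empty. This is a user-space
illustration; struct board_ops, its fields and the two example boards
are hypothetical, and the counters exist only to make the behaviour
observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-board hooks; any of them may be left NULL. */
struct board_ops {
	const char *name;
	void (*restart)(char mode);
	void (*init_early)(void);	/* place for a small quirk/fixup */
};

static const struct board_ops *board;
static int generic_restart_calls;
static int example_restart_calls;

/* Default behaviour provided by generic code. */
static void generic_restart(char mode)
{
	(void)mode;
	generic_restart_calls++;
}

/* A board that needs its own restart sequence. */
static void example_board_restart(char mode)
{
	(void)mode;
	example_restart_calls++;
}

static const struct board_ops example_board = {
	"example-board", example_board_restart, NULL,
};

/* A board that is happy with every default. */
static const struct board_ops plain_board = {
	"plain-board", NULL, NULL,
};

static void register_board(const struct board_ops *ops)
{
	assert(ops != NULL);
	board = ops;
}

/* Generic entry point: no link-time dependency on any board symbol. */
static void machine_restart(char mode)
{
	if (board && board->restart)
		board->restart(mode);
	else
		generic_restart(mode);
}
```

Because nothing in the generic path references a board symbol by name,
several boards can be built into one image and selected at boot.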

Cheers,
Ben.

2011-04-04 00:15:01

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Monday 04 April 2011, Benjamin Herrenschmidt wrote:
> On Fri, 2011-04-01 at 16:28 +0200, Detlef Vollmann wrote:
> > > * No board files
> > Where do you put code that needs to run very early (e.g. pinging the
> > watchdog)?
>
> Even on powerpc I keep board files :-)
>
> The main thing is:
>
> - The generic -> board linkage must not be hard (ie, no
> platform_restart, but a board_ops.restart() etc....)
>
> - An average board file is a few hundred lines long, that's it, mostly
> it hooks up to generically provided functions, tho it gets the choice of
> _which_ ones to hookup.

I believe a machine_type is more general than a board file, i.e. what
gets described as a machine in powerpc would often currently correspond
to multiple board files, if I am not mistaken.

The fact that we have a more diverse set of hardware on ARM, and that
it's growing quicker than powerpc also means that we should try harder
to reduce duplication than is necessary there.

> - It can still quirk/fixup a thing or two if needed, I think it's
> useful to keep that around, as long as such "quirks" remain small and
> few. At the end of the day, if dealing with one board special case gives
> you the choice between changing a ton of infrastructure/core to
> introduce a new abstraction to deal with -that- special case vs. having
> a one liner fixup in the platform code, the latter is the most sensible
> option. The hard part of course is to have sensible maintainers to make
> sure this doesn't grow back to the old mess.

I guess quirks are fine, as long as it's not required to have them
for each board. We can have a function that gets called for any matching
"compatible" property of the root node, but I think the default should
be not to need it eventually.
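
A function "called for any matching compatible property of the root
node" could look roughly like the sketch below. It is a stand-alone
illustration: the quirk table, the "vendor,example-board" string and the
watchdog fixup are all made up, and the root node's compatible list is
passed in as a plain string array rather than read from a device tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a compatible string and its fixup. */
struct board_quirk {
	const char *compatible;
	void (*fixup)(void);
};

static int watchdog_pinged;

/* Example of an early fixup a board might need. */
static void example_ping_watchdog(void)
{
	watchdog_pinged++;
}

/* Boards without an entry here need no board-specific code at all. */
static const struct board_quirk quirks[] = {
	{ "vendor,example-board", example_ping_watchdog },
};

/*
 * Run every quirk whose string appears in the root node's compatible
 * list. Returns the number of quirks that ran.
 */
static int run_board_quirks(const char *const *root_compat, size_t n)
{
	size_t i, j;
	int ran = 0;

	for (i = 0; i < n; i++) {
		for (j = 0; j < sizeof(quirks) / sizeof(quirks[0]); j++) {
			if (strcmp(root_compat[i], quirks[j].compatible) == 0) {
				quirks[j].fixup();
				ran++;
			}
		}
	}
	return ran;
}
```

The default case is the empty one: a board whose compatible strings
match nothing in the table boots without running any quirk.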

This is one area where I think I can illustrate how a gradual change
from the status quo differs from a parallel new platform implementation:
To gradually change one board file, you would convert the existing
machine description to match the compatible property of the device
tree root node and possibly at a later stage remove that code again
once it's possible to work without it.
When starting out with a fresh implementation, we first need to
change all device drivers that are used on the board to work
without a machine description, but then would not have to change
any code twice, and the work for a similar board is almost done.

Arnd

2011-04-04 01:00:36

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Sunday 03 April 2011, Russell King - ARM Linux wrote:
> On Sun, Apr 03, 2011 at 05:26:37PM +0200, Arnd Bergmann wrote:
> > There are a few other examples that were done in a similar way:
> > * The drivers/ide code still serves a few hardware platforms that
> > never had anyone write a new libata code. Libata itself has
> > been in a good shape for a long time though.
>
> And there are platforms where libata just doesn't work but the old
> ide driver does. My firewall with a CF card in is completely unusable
> with libata, but works fine with the old IDE driver.
>
> That's partly caused by there being no support for the ISA IOCS16 signal
> on that hardware, so the ISA bus needs specific handling depending on
> the behaviour of the ISA device being addressed. Yes, crap hardware,
> nothing new there.

Yes, I think that also illustrates the approach: There is no fundamental
reason why it could not be made to work with libata, but there is also
no real incentive to do the work because all users can deal with the
IDE driver being essentially in bug-fix-only maintenance.

There is the occasional discussion about removing drivers/ide, but
right now it's more valuable to keep it, and the bug fixes cause
little trouble.

> Having just looked at the clocksource stuff, there's 9 up-counting 32-bit
> clocksources which are relatively easy to port to a single piece of code.
>
> There's a number of down-counting clocksources using various methods to
> convert to an up-counting value - sometimes -readl(), sometimes
> cs->mask - readl() and sometimes ~readl().

All three methods for down-counting can be abstracted as offset-readl(),
right?
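
A quick way to check that claim, with the counter width applied as a
mask: -readl() is offset 0, while cs->mask - readl() and ~readl() are
both offset == mask. The helper name below is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a raw value from a down-counting register into an
 * up-counting one. 'mask' is the counter width (0xffff or 0xffffffff),
 * 'offset' is a per-clocksource constant.
 */
static uint32_t down_to_up(uint32_t raw, uint32_t offset, uint32_t mask)
{
	/* Unsigned arithmetic wraps modulo 2^32, then the mask trims it. */
	return (offset - raw) & mask;
}
```

The -readl() and ~readl() variants differ by exactly one count, which
does not matter for a clocksource because only differences between
successive reads are used.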

> Then there's those which are either 16-bit or 32-bit, and some of those
> 16-bit implementations must use readw() to avoid bus faults.
>
> Combining all those together you end up with something pretty disgusting,
> and an initialization function taking 7 arguments (iomem pointer, name,
> rating, tick rate, size, up/down counter, clocksource flags).

I probably wouldn't use seven arguments, but put all arguments that are
typically constant per board into a data structure. Not sure we'd even
need the name unless there are a lot of cases where we'd register
multiple of those clocksources at once. Size and up/down are just
flags I guess.
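
Something along these lines, as a user-space sketch. The struct, the flag
names and the read helper are hypothetical, and ordinary variables stand
in for the MMIO register; the point is only that width and direction
collapse into two flag bits on a per-board descriptor.

```c
#include <assert.h>
#include <stdint.h>

#define MMIO_CS_16BIT	(1u << 0)	/* register must be read as 16 bits */
#define MMIO_CS_DOWN	(1u << 1)	/* hardware counts down */

/* Hypothetical per-board description of one MMIO counter. */
struct mmio_clocksource_desc {
	const volatile void *reg;	/* counter register */
	unsigned int rating;
	unsigned long rate_hz;		/* tick rate */
	unsigned int flags;		/* MMIO_CS_* */
};

/* Single read function covering the four width/direction combinations. */
static uint32_t mmio_cs_read(const struct mmio_clocksource_desc *d)
{
	uint32_t raw, mask;

	assert(d && d->reg);

	if (d->flags & MMIO_CS_16BIT) {
		/* A 16-bit access, for hardware that faults on wider reads. */
		raw = *(const volatile uint16_t *)d->reg;
		mask = 0xffffu;
	} else {
		raw = *(const volatile uint32_t *)d->reg;
		mask = 0xffffffffu;
	}

	if (d->flags & MMIO_CS_DOWN)
		raw = ~raw;	/* turn a down-counter into an up-counter */

	return raw & mask;
}
```

The cost is two flag tests on every read, which is the complexity
trade-off being discussed here.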

> Does it lead to more maintainable code? I'm not convinced - while it
> certainly reduces the amount of code, the reduced code is rather
> more complex.

Isn't that always the tradeoff of generalizing the code? The big win
is that we don't get new copies of the same code with slightly different
bugs and that any changes to the code that are needed for supporting
new hardware have to get reviewed by people that know it.

The disadvantage is it's more complex code.

> Would an alternative be to introduce separate mmio-32bit-up, mmio-32bit-down,
> mmio-16bit-up and mmio-16bit-down clocksources? That also doesn't give
> me a good feeling about the result.

Just two booleans doesn't really justify separate drivers, I guess.

> Then there's those which change the cs->read function pointer at runtime,

I found omap, mxs and mxc doing that. In all cases, the differences between
the functions that are set are for parameters that depend on the CPU core
(MMIO address, 16/32 bit read), so that would be covered by the detection
logic.

> and those which share that pointer with their sched_clock() implementation.

Abstracting sched_clock() to be run-time selected is something that
needs to be taken care of. Maybe we could have a generic sched_clock
implementation that is written on top of clocksource instead of jiffies,
and always select that on architectures that have a decent clocksource.
Are there any platforms on ARM where that would be a bad idea? I believe
the main reason why they are separate is that on x86 you can use the TSC
for sched_clock in many cases where you cannot use it for clocksource.
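
To sketch what "sched_clock on top of clocksource" might involve: take
the counter value, compute the wrapped delta since the last read, and
scale it to nanoseconds with a mult/shift pair, accumulating into a
64-bit total. The names below are invented for the example and this is
not an existing kernel interface; it also assumes the function is called
at least once per counter wrap period.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state for a clocksource-backed nanosecond clock. */
struct sched_clock_state {
	uint32_t last_cyc;	/* counter value at the previous update */
	uint64_t ns;		/* accumulated nanoseconds */
	uint32_t mask;		/* counter width, e.g. 0xffff or 0xffffffff */
	uint32_t mult;		/* ns = (cycles * mult) >> shift */
	uint32_t shift;
};

/*
 * Fold a new counter reading into the 64-bit nanosecond total.
 * Must be called at least once per wrap of the counter. The
 * per-call truncation from the shift is ignored in this sketch.
 */
static uint64_t sched_clock_update(struct sched_clock_state *s, uint32_t cyc)
{
	uint32_t delta;

	assert(s && s->shift < 32);
	delta = (cyc - s->last_cyc) & s->mask;	/* handles wraparound */
	s->last_cyc = cyc;
	s->ns += ((uint64_t)delta * s->mult) >> s->shift;
	return s->ns;
}
```

For a 1 MHz counter, mult = 1000 and shift = 0 give one microsecond per
tick; real mult/shift pairs are chosen to keep precision for rates that
do not divide evenly.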

> And those which need to do special things (like loop until read values are
> stable) in their read functions which probably can't ever be supported by
> common code.

That's probably fine, because it's basically why we have the abstraction
at the clocksource level and not below it.

Arnd

2011-04-04 02:49:50

by Jean-Christophe PLAGNIOL-VILLARD

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, 4 Apr 2011, Benjamin Herrenschmidt wrote:

> On Fri, 2011-04-01 at 16:28 +0200, Detlef Vollmann wrote:
> > > * No board files
> > Where do you put code that needs to run very early (e.g. pinging the
> > watchdog)?
>
> Even on powerpc I keep board files :-)
>
> The main thing is:
>
> - The generic -> board linkage must not be hard (ie, no
> platform_restart, but a board_ops.restart() etc....)

We have that on ARM already. See for example the struct machine_desc
definition in arch/arm/include/asm/mach/arch.h.

> - An average board file is a few hundred lines long, that's it, mostly
> it hooks up to generically provided functions, tho it gets the choice of
> _which_ ones to hookup.

Again that's largely the same situation on ARM. Taking Kirkwood for
example (wc -l arch/arm/mach-kirkwood/*-setup.c) the average board
file tends to be more towards 200 lines though. Here DT could make a
difference by moving the statically defined board resources elsewhere.

> - It can still quirk/fixup a thing or two if needed, I think it's
> useful to keep that around, as long as such "quirks" remain small and
> few. At the end of the day, if dealing with one board special case gives
> you the choice between changing a ton of infrastructure/core to
> introduce a new abstraction to deal with -that- special case vs. having
> a one liner fixup in the platform code, the latter is the most sensible
> option. The hard part of course is to have sensible maintainers to make
> sure this doesn't grow back to the old mess.

Totally agreed. I think that's the core of the issue on ARM: we simply
aren't factoring out the duplication aggressively enough in order to
keep the board code to the absolute minimum. The fact that different
SoCs are totally alien to each other never encouraged that.

Nicolas

2011-04-04 02:59:47

by Benjamin Herrenschmidt

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Fri, 2011-04-01 at 09:39 -0700, Linus Torvalds wrote:
> that just maps some _other_ hardware knowledge (reading a SoC ID or
> something) into an unrelated thing ("I know this SoC has these bits of
> hardware").
>
> So devicetree should never override actual "hardware tells me it
> exists here". But you might well have a mapping from SoC ID's to a
> compiled-in devicetree thing (this is largely what POWER does, iirc).

Not quite :-)

On the most common non-embedded platforms the device-tree comes from the
firmware which generates it (ie, pSeries, macs, ...)

On most embedded platforms, the device-tree is either flashed separately
in a separate flash partition (and thus comes from u-boot) or is wrapped
with the kernel zImage at install time.

We have almost no case of detecting the board via some magic ID and
using that to slap a device-tree at runtime, mostly because the old
embedded platforms before most bootloaders grew the ability to pass us
the DT blob from flash, simply didn't even have such a board ID...

They did pass -some- information but to some extent we were in a worse
position than ARM which does have them.

However, what I've observed (sadly) in practice is that many
manufacturers just ship products with bogus board IDs as well, they make
them up, often non-registered or registered to other vendors, duplicate
between products etc..., I've seen that with my Marvell based D-Link NAS
box for example.

But yes, I agree with pretty much everything you said :-) There is -one-
advantage in -one- specific case to also providing a device node
representation for devices that are otherwise discoverable (PCI etc.):
when the device-tree can carry useful auxiliary information
that the devices themselves don't provide. The typical example is the
growing tendency in the ARM world to stick USB ethernet controllers on board
without a MAC address SEEPROM. As long as the device is soldered
(and thus can be addressed via a stable "path" of ports/hubs), it can be
useful to stick in a device-node for it (and for its parent controller,
potentially PCI, etc.).

Cheers,
Ben.

2011-04-04 05:25:00

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On 17:30 Fri 01 Apr , Detlef Vollmann wrote:
> On 04/01/11 16:59, Arnd Bergmann wrote:
> >On Friday 01 April 2011, Detlef Vollmann wrote:
> >>On 04/01/11 15:54, Arnd Bergmann wrote:
> >
> >>>9. All interesting work is going into a handful of platforms, all of which
> >>> are ARMv7 based.
> >>Define interesting.
> >
> >The ones that are causing the churn that we're talking about.
> >Platforms that have been working forever and only need to get
> >the occasional bug fix are boring, i.e. not the problem.
> In the ARM tree I only know mach-at91.
> Atmel still introduces new SOCs based on ARM926EJ-S, and that makes
> perfect sense for lots of applications.
> And if they add support for a new SOC, they just copy an existing one,
> change some GPIOs, and submit it as new files (sorry, I'm over-
> simplifying here).
At the SoC level that oversimplifies quite a lot: we do what is necessary to
factorise as much as we can. You can take a look at what we did when adding the
g20, g10 or g45, etc.
> And if you happen to wire your board a bit differently than they do,
> you have to patch their generic file (in addition to adding your own
> board file).
> And though I only know the mach-at91 closely, I'm pretty sure quite
> a number of other mach-* are not better.
> So this is actually why the ARM tree has such a bad reputation:
> lots of code repetition, and still more of that.
On AT91 this is no longer the case: we have started to reduce the number of
machines and we do not allow this anymore.
>
> >>>12. Supporting many different boards with a single kernel binary is a
> >>> useful goal.
> >>Generally not for embedded systems (for me, a mobile PDA/phone is just a
> >>small computer with a crappy keyboard, but not an embedded system).
> >
> >True. For embedded, this would not be an important thing to do, but
> >also not hurt.
> It costs you flash space.
I disagree: having a single kernel for multiple boards is very useful, as it can
allow you to have one firmware for multiple versions of a product. It will really
simplify maintenance.

It also simplifies kernel maintenance.
>
> >>>* Strictly no crap
> >>> * No board files
> >>Where do you put code that needs to run very early (e.g. pinging the
> >>watchdog)?
> >
> >Don't know. I'd hope we can get fast enough to the phase where device
> >drivers get initialized.
> Nope, never happened for me :-(
> (Watchdog timeouts are often 1s or less.)
You can do a lot in 1s:
you start the board, download the code from NFS and jump into the kernel,
as I do on at91sam9261ek with barebox.
>
> >>>* All board specific information must come from a device tree and
> >>> be run-time detected.
> >>What do you mean by "run-time detected"?
> >>For powerpc, we currently have the device tree as DTS in the kernel
> >>and compile and bundle it together with the kernel.
> >>As you wrote above: "Discoverable hardware [...] is not going to happen"
> >
> >I mean writing
> >
> > if (device_is_compatible(dev, SOMETHING))
> > do_something();
> >
> >instead of
> >
> >#ifdef CONFIG_SOMETHING
> > do_something();
> >#endif
> >
> >The run-time information could come from anywhere (device tree, hardware
> >registers, today one might use the board number), the important point is
> >not to assume that hardware is present just because someone enabled
> >a Kconfig option.
> Understood and I agree.
>
> >I believe that rule is generally accepted today, but we don't always
> >enforce it.
> Without device tree, Kconfig option is the only way that really
> works today (no runtime HW detection, and same board ID with different
> setups).
You can already do it today with system_rev. DTS will make it more
generic and pass more information, and on at91 we could easily have only a few
boards. Except for a few cases, such as really short boot-time requirements, this
will not impact the system; on the contrary.
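
As a rough illustration of the run-time check discussed above (not code from
any kernel tree): the sketch below decides whether to service a watchdog by
looking at a hardware description handed over at run time, instead of relying
on a compile-time option. The struct device_desc type, the is_compatible()
helper, the watchdog_pings counter and the "vendor,watchdog" string are all
hypothetical names made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical run-time description of one device on the board. */
struct device_desc {
	const char *compatible;
};

/* Hypothetical helper: does this description match the given name? */
static int is_compatible(const struct device_desc *dev, const char *name)
{
	return dev != NULL && dev->compatible != NULL &&
	       strcmp(dev->compatible, name) == 0;
}

/* Counts how often the (simulated) watchdog was serviced. */
static int watchdog_pings;

/* Only touch the watchdog if the description says one is present. */
static void init_board(const struct device_desc *devs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (is_compatible(&devs[i], "vendor,watchdog"))
			watchdog_pings++;
}
```

The same binary can then run on a board with or without the watchdog; only
the description passed to init_board() changes.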

Best Regards,
J.
> Device tree in my world is just one big Kconfig option instead of
> many small...
>
> Detlef

2011-04-04 08:26:07

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, 2011-04-04 at 02:59 +0200, Arnd Bergmann wrote:

> Abstracting sched_clock() to be run-time selected is something that
> needs to be taken care of. Maybe we could have a generic sched_clock
> implementation that is written on top of clocksource instead of jiffies,
> and always select that on architectures that have a decent clocksource.
> Are there any platforms on ARM where that would be a bad idea? I believe
> the main reason why they are separate is that on x86 you can use the TSC
> for sched_clock in many cases where you cannot use it for clocksource.

I've proposed a mechanism for a run-time selectable sched_clock()
implementation as part of my A15 timer patch set:
http://www.spinics.net/lists/arm-kernel/msg116891.html
and more specifically patches #10 and #11.

I'm not completely pleased with it (the fact that it embeds a copy of
the generic sched_clock() to be used as a default is properly ugly), but
maybe this could be used as a base for further discussion.

Cheers,

M.
--
Reality is an implementation detail.

2011-04-04 09:29:04

by Nicolas Ferre

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On 01/04/2011 17:30, Detlef Vollmann wrote:
> On 04/01/11 16:59, Arnd Bergmann wrote:
>> On Friday 01 April 2011, Detlef Vollmann wrote:
>>> On 04/01/11 15:54, Arnd Bergmann wrote:
>>
>>>> 9. All interesting work is going into a handful of platforms, all of
>>>> which
>>>> are ARMv7 based.
>>> Define interesting.
>>
>> The ones that are causing the churn that we're talking about.
>> Platforms that have been working forever and only need to get
>> the occasional bug fix are boring, i.e. not the problem.
> In the ARM tree I only know mach-at91.
> Atmel still introduces new SOCs based on ARM926EJ-S, and that makes
> perfect sense for lots of applications.
> And if they add support for a new SOC, they just copy an existing one,
> change some GPIOs, and submit it as new files (sorry, I'm over-
> simplifying here).
> And if you happen to wire your board a bit differently than they do,
> you have to patch their generic file (in addition to adding your own
> board file).
> And though I only know the mach-at91 closely, I'm pretty sure quite
> a number of other mach-* are not better.
> So this is actually why the ARM tree has such a bad reputation:
> lots of code repetition, and still more of that.

Yes, certainly time has come for a change.

Note however that the AT91 community has been making a great effort to:
- publish and maintain support for every single chip/board, for more than 5
years now (and well before that for the first, venerable at91rm9200): if you
recall, that was before most of the code that appeared in arch/arm/mach-*
directories ;-)
- integrate ideas and patches from contributors for simplifying and
reducing board duplication
- try to conform to new infrastructures that are appearing on ARM Linux
for better convergence of code: gpiolib, leds, buttons, clocks (work in
progress)...

We know that work has to be done and we will for sure follow this effort
of consolidation. And remember: contributions welcomed ;-).

Best regards,
--
Nicolas Ferre

2011-04-04 11:04:00

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, 2011-04-04 at 01:59 +0100, Arnd Bergmann wrote:
> On Sunday 03 April 2011, Russell King - ARM Linux wrote:
> > Then there's those which change the cs->read function pointer at runtime,
...
> > and those which share that pointer with their sched_clock() implementation.
>
> Abstracting sched_clock() to be run-time selected is something that
> needs to be taken care of. Maybe we could have a generic sched_clock
> implementation that is written on top of clocksource instead of jiffies,
> and always select that on architectures that have a decent clocksource.

On Cortex-A15 with the virtualisation extensions and architected timers
the clocksource is implemented using a physical counter (as we want
wall-clock timing). But for sched_clock() we may want to use a virtual
counter (which is basically an offset from the physical one, set by the
hypervisor during guest OS switching). Marc already posted some patches
for this.

--
Catalin

2011-04-04 11:21:59

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, Apr 04, 2011 at 12:03:42PM +0100, Catalin Marinas wrote:
> On Mon, 2011-04-04 at 01:59 +0100, Arnd Bergmann wrote:
> > On Sunday 03 April 2011, Russell King - ARM Linux wrote:
> > > Then there's those which change the cs->read function pointer at runtime,
> ...
> > > and those which share that pointer with their sched_clock() implementation.
> >
> > Abstracting sched_clock() to be run-time selected is something that
> > needs to be taken care of. Maybe we could have a generic sched_clock
> > implementation that is written on top of clocksource instead of jiffies,
> > and always select that on architectures that have a decent clocksource.
>
> On Cortex-A15 with the virtualisation extensions and architected timers
> the clocksource is implemented using a physical counter (as we want
> wall-clock timing). But for sched_clock() we may want to use a virtual
> counter (which is basically an offset from the physical one, set by the
> hypervisor during guest OS switching). Marc already posted some patches
> for this.

I had a quick look at the two patches, but I was far from impressed
due to the apparent complexity I saw.

There's no point in trying to consolidate stuff if it results in a net
increase in the amount of code to be maintained as that just increases
the burden, churn and maintenance headache.

Is there an easier way to consolidate it across all platforms? I think
so:

static DEFINE_CLOCK_DATA(cd);
static u32 sched_clock_mask;
static u32 (*read_sched_clock)(void);

unsigned long long notrace sched_clock(void)
{
	if (read_sched_clock) {
		u32 cyc = read_sched_clock();
		return cyc_to_sched_clock(&cd, cyc, sched_clock_mask);
	} else {
		/* jiffies based code */
	}
}

static void notrace update_sched_clock(void)
{
	u32 cyc = read_sched_clock();
	update_sched_clock(&cd, cyc, sched_clock_mask);
}

void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate)
{
	BUG_ON(bits > 32);
	read_sched_clock = read;
	sched_clock_mask = (1 << bits) - 1;
	init_sched_clock(&cd, update_sched_clock, bits, rate);
}

and then get rid of the per-platform implementations entirely - all
that platforms then have to provide is a read function and a call to
setup_sched_clock().

Whether it's worth it or not is questionable - the above is more lines
of code than many of the existing implementations, and we're not going
to shrink the existing implementations by much (maybe one to three
lines.) The only thing we gain is the ability to select an implementation
at runtime.
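
To make the idea of a run-time selectable sched_clock() concrete, here is a
minimal userspace model of the scheme outlined above. It is only a sketch
under simplifying assumptions: every name in it (model_sched_clock(),
setup_clock(), fake_read() and so on) is invented for this example and is not
a kernel API, the conversion uses whole nanoseconds per cycle, and counter
wrap-around is not accumulated the way a real implementation would have to.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model state; none of these names exist in the kernel. */
static uint32_t (*read_counter)(void);	/* NULL until a platform registers one */
static uint32_t counter_mask;
static uint64_t ns_per_cycle;		/* simplification: whole ns per cycle */
static uint64_t fallback_ticks;		/* stands in for jiffies */
static const uint64_t ns_per_tick = 10000000ULL;	/* assumes HZ=100 */

/* Stand-in hardware counter so the model can be exercised. */
static uint32_t fake_cycles;
static uint32_t fake_read(void)
{
	return fake_cycles;
}

/* Register a counter of the given width and rate; returns -1 on bad input. */
static int setup_clock(uint32_t (*read)(void), unsigned bits, uint64_t rate_hz)
{
	if (read == NULL || bits == 0 || bits > 32 ||
	    rate_hz == 0 || rate_hz > 1000000000ULL)
		return -1;
	/* Build the mask in 64 bits so that bits == 32 is well defined. */
	counter_mask = (uint32_t)((1ULL << bits) - 1);
	ns_per_cycle = 1000000000ULL / rate_hz;
	read_counter = read;
	return 0;
}

/* One indirect call when a counter is registered, tick fallback otherwise. */
static uint64_t model_sched_clock(void)
{
	if (read_counter)
		return (uint64_t)(read_counter() & counter_mask) * ns_per_cycle;
	return fallback_ticks * ns_per_tick;
}
```

The cost of the run-time choice is the NULL check plus the indirect call on
every read; the benefit is that one image can fall back to the tick-based
path when no counter has been registered.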

2011-04-04 13:24:04

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, 2011-04-04 at 12:21 +0100, Russell King - ARM Linux wrote:
> On Mon, Apr 04, 2011 at 12:03:42PM +0100, Catalin Marinas wrote:
> > On Mon, 2011-04-04 at 01:59 +0100, Arnd Bergmann wrote:
> > > On Sunday 03 April 2011, Russell King - ARM Linux wrote:
> > > > Then there's those which change the cs->read function pointer at runtime,
> > ...
> > > > and those which share that pointer with their sched_clock() implementation.
> > >
> > > Abstracting sched_clock() to be run-time selected is something that
> > > needs to be taken care of. Maybe we could have a generic sched_clock
> > > implementation that is written on top of clocksource instead of jiffies,
> > > and always select that on architectures that have a decent clocksource.
> >
> > On Cortex-A15 with the virtualisation extensions and architected timers
> > the clocksource is implemented using a physical counter (as we want
> > wall-clock timing). But for sched_clock() we may want to use a virtual
> > counter (which is basically an offset from the physical one, set by the
> > hypervisor during guest OS switching). Marc already posted some patches
> > for this.
>
> I had a quick look at the two patches, but I was far from impressed
> due to the apparent complexity I saw.
>
> There's no point in trying to consolidate stuff if it results in a net
> increase in the amount of code to be maintained as that just increases
> the burden, churn and maintenance headache.
>
> Is there an easier way to consolidate it across all platforms? I think
> so:
>
> static DEFINE_CLOCK_DATA(cd);
> static u32 sched_clock_mask;
> static u32 (*read_sched_clock)(void);
>
> unsigned long long notrace sched_clock(void)
> {
> 	if (read_sched_clock) {
> 		u32 cyc = read_sched_clock();
> 		return cyc_to_sched_clock(&cd, cyc, sched_clock_mask);
> 	} else {
> 		/* jiffies based code */
> 	}
> }
>
> static void notrace update_sched_clock(void)
> {
> 	u32 cyc = read_sched_clock();
> 	update_sched_clock(&cd, cyc, sched_clock_mask);
> }
>
> void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate)
> {
> 	BUG_ON(bits > 32);
> 	read_sched_clock = read;
> 	sched_clock_mask = (1 << bits) - 1;
> 	init_sched_clock(&cd, update_sched_clock, bits, rate);
> }
>
> and then get rid of the per-platform implementations entirely - all
> that platforms then have to provide is a read function and a call to
> setup_sched_clock().

The complexity mostly comes from the fact that I tried to avoid having more
runtime complexity on platforms that didn't need to select their
sched_clock() implementation at runtime (no indirection while calling
sched_clock()).

If this can be relaxed, then your implementation is definitely better.

> Whether it's worth it or not is questionable - the above is more lines
> of code than many of the existing implementations, and we're not going
> to shrink the existing implementations by much (maybe one to three
> lines.) The only thing we gain is the ability to select an implementation
> at runtime.

I believe this last point to be rather important if we plan to have this
mythical single kernel covering several architectures. It's also nice
for the A15 to be able to use some default sched_clock() implementation
as a fallback if the generic timers are not available for some reason.

M.
--
Reality is an implementation detail.

2011-04-04 13:31:39

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, Apr 04, 2011 at 02:24:17PM +0100, Marc Zyngier wrote:
> On Mon, 2011-04-04 at 12:21 +0100, Russell King - ARM Linux wrote:
> > Whether it's worth it or not is questionable - the above is more lines
> > of code than many of the existing implementations, and we're not going
> > to shrink the existing implementations by much (maybe one to three
> > lines.) The only thing we gain is the ability to select an implementation
> > at runtime.
>
> I believe this last point to be rather important if we plan to have this
> mythical single kernel covering several architectures. It's also nice
> for the A15 to be able to use some default sched_clock() implementation
> as a fallback if the generic timers are not available for some reason.

If ARM are going to architect a set of timers into the hardware, let's
make sure that all such hardware has them so we can dig ourselves out
of this crappy mess that we find ourselves in today.

I do hope they're not missing functionality like the GIC is - otherwise
they're just going to make the situation worse than it already is.

2011-04-04 13:57:15

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, 2011-04-04 at 14:31 +0100, Russell King - ARM Linux wrote:
> On Mon, Apr 04, 2011 at 02:24:17PM +0100, Marc Zyngier wrote:
> > On Mon, 2011-04-04 at 12:21 +0100, Russell King - ARM Linux wrote:
> > > Whether it's worth it or not is questionable - the above is more lines
> > > of code than many of the existing implementations, and we're not going
> > > to shrink the existing implementations by much (maybe one to three
> > > lines.) The only thing we gain is the ability to select an implementation
> > > at runtime.
> >
> > I believe this last point to be rather important if we plan to have this
> > mythical single kernel covering several architectures. It's also nice
> > for the A15 to be able to use some default sched_clock() implementation
> > as a fallback if the generic timers are not available for some reason.
>
> If ARM are going to architect a set of timers into the hardware, let's
> make sure that all such hardware has them so we can dig ourselves out
> of this crappy mess that we find ourselves in today.

As far as I know, A15 always has a set of generic timers.

It may be that they are not available (frequency not programmed into the
CNTFREQ register), or that someone decided to use a better alternative
(for some particular interpretation of "better").

Overall, it seems like we need some degree of flexibility to have
several sched_clock() implementations within a single image, whether it
is to support multiple platforms, or to allow a single architecture to
pick the best alternative given a set of initial conditions.

M.
--
Reality is an implementation detail.

2011-04-04 20:08:53

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

2011/4/4 Marc Zyngier <[email protected]>:
> On Mon, 2011-04-04 at 14:31 +0100, Russell King - ARM Linux wrote:
>>
>> If ARM are going to architect a set of timers into the hardware, let's
>> make sure that all such hardware has them so we can dig ourselves out
>> of this crappy mess that we find ourselves in today.
>
> As far as I know, A15 always has a set of generic timers.
>
> It may be that they are not available (frequency not programmed into the
> CNTFREQ register), or that someone decided to use a better alternative
> (for some particular interpretation of "better").

I guess this thing is inside that A15 core?

First, what happens the day any vendors start making SoCs on this is
they turn the A15 core off whenever it is not used, losing all state
including this timer, I believe.

This forces them all to add some totally different clocksource, event and
wakeup in some always-on voltage domain and rate that higher than
the A15 timer(s). They will then implement sched_clock() and
clocksource on that instead and only use A15 for localtimers.

(Leading to the proliferation of board/SoC timer hacks discussed
so much recently...)

The only way to reuse that poor thing in practice is if you engineer
a separate power domain with stuff that is supposed to be always-on
in the A15 macro (including these timers) so vendors must implement
this so as not to lose its state. Is this the case?

Else, in effect it will only be used as clocksource and sched_clock()
with these Versatile boards where the power is always on anyway.

Second, have you taken into account the effect of changing the
frequency of the A15 core, which is something every vendor also
does? As you know, Colin Cross already has a patch pending
for that on the TWD localtimer which has not yet reached
the kernel. (Or is A15 fixed frequency? Forgive my ignorance...)

(And third it will also eventually need to hook into the timer-based
delay framework that I think Nokia is working on to be really
useful, else all delays become unpredictable.)

Yours,
Linus Walleij

2011-04-05 06:42:42

by Santosh Shilimkar

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On 4/5/2011 1:38 AM, Linus Walleij wrote:
> 2011/4/4 Marc Zyngier<[email protected]>:
>> On Mon, 2011-04-04 at 14:31 +0100, Russell King - ARM Linux wrote:
>>>
>>> If ARM are going to architect a set of timers into the hardware, let's
>>> make sure that all such hardware has them so we can dig ourselves out
>>> of this crappy mess that we find ourselves in today.
>>
>> As far as I know, A15 always has a set of generic timers.
>>
>> It may be that they are not available (frequency not programmed into the
>> CNTFREQ register), or that someone decided to use a better alternative
>> (for some particular interpretation of "better").
>
> I guess this thing is inside that A15 core?
>
Yes but the power domain partitioning can be used.

> First, what happens the day any vendors start making SoCs on this is
> they turn the A15 core off whenever it is not used, losing all state
> including this timer, I believe.
>
> This forces them all to add some totally different clocksource, event and
> wakeup in some always-on voltage domain and rate that higher than
> the A15 timer(s). They will then implement sched_clock() and
> clocksource on that instead and only use A15 for localtimers.
>
> (Leading to the proliferation of board/SoC timer hacks discussed
> so much recently...)
>
> The only way to reuse that poor thing in practice is if you engineer
> a separate power domain with stuff that is supposed to be always-on
> in the A15 macro (including these timers) so vendors must implement
> this so as not to lose its state. Is this the case?
>
Yes. From what I have understood of the A15 timer architecture so far,
the A15 counter is supposed to be kept in an ALWAYS ON domain by SoC vendors.
That's the requirement.

> Else, in effect it will only be used as clocksource and sched_clock()
> with these Versatile boards where the power is always on anyway.
>
> Second, have you taken into account the effect of changing the
> frequency of the A15 core, which is something every vendor also
> does, as you know Colin Cross already has a patch pending
> for that on the TWD localtimer which has not yet reached
> the kernel. (Or is A15 fixed frequency? Forgive my ignorance...)
>
This one is also addressed. The counter will run at a fixed
frequency, and even if the clock input changes, the
counting will be as if the clock source were constant.
e.g.
- @6 MHz, the counter will increment the count by 1 every ~166 ns
- @32 kHz, the counter will increment the count by ~183 every
~30 us

So overall the clock-source should work in all scenarios, including low
power scenarios.
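
The two figures above can be checked with a little arithmetic. The sketch
below is only an illustration: the function names are made up, and it assumes
that "32 kHz" means the usual 32768 Hz low-power clock, which the message
does not state.

```c
#include <stdint.h>

/*
 * Counts to add per edge of the input clock so that the counter appears
 * to run at nominal_hz, rounded to the nearest integer.
 */
static uint64_t counts_per_input_tick(uint64_t nominal_hz, uint64_t input_hz)
{
	return (nominal_hz + input_hz / 2) / input_hz;
}

/* Period of one input clock edge in nanoseconds, rounded to nearest. */
static uint64_t tick_period_ns(uint64_t input_hz)
{
	return (1000000000ULL + input_hz / 2) / input_hz;
}
```

With a 6 MHz nominal rate this gives a step of 1 per ~167 ns edge at 6 MHz,
and a step of 183 per ~30.5 us edge at 32768 Hz, so the counter value
advances at the same apparent rate in both cases.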

The only issue I see is with the clock-events implemented using
local timer capabilities in low power modes. The local timers
won't be able to wake up the CPU from the DORMANT or OFF state, and hence
you will need an additional wakeup-capable clock-event
working together with the local timers using clock-notifiers.

> (And third it will also eventually need to hook into the timer-based
> delay framework that I think Nokia is working on to be really
> useful, else all delays become unpredictable.)
>
Do you mean udelay()/mdelay() here ?

Regards
Santosh

2011-04-05 07:30:10

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Mon, 2011-04-04 at 22:08 +0200, Linus Walleij wrote:
> 2011/4/4 Marc Zyngier <[email protected]>:
> > On Mon, 2011-04-04 at 14:31 +0100, Russell King - ARM Linux wrote:
> >>
> >> If ARM are going to architect a set of timers into the hardware, let's
> >> make sure that all such hardware has them so we can dig ourselves out
> >> of this crappy mess that we find ourselves in today.
> >
> > As far as I know, A15 always has a set of generic timers.
> >
> > It may be that they are not available (frequency not programmed into the
> > CNTFREQ register), or that someone decided to use a better alternative
> > (for some particular interpretation of "better").
>
> I guess this thing is inside that A15 core?
>
> First, what happens the day any vendors start making SoCs on this is
> they turn the A15 core off whenever it is not used, losing all state
> including this timer, I believe.

The main counter is located in an ALWAYS_ON power domain, and should
keep going whatever happens in the system.

[...]

> Second, have you taken into account the effect of changing the
> frequency of the A15 core, which is something every vendor also
> does, as you know Colin Cross already has a patch pending
> for that on the TWD localtimer which has not yet reached
> the kernel. (Or is A15 fixed frequency? Forgive my ignorance...)

Fixed frequency, with a minimal roll-over time of 40 years.
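
For a feel of what such a roll-over guarantee implies, the time to wrap is
simply 2^width divided by the counter frequency. The helper names below are
invented for this sketch, and both the counter widths and the 6 MHz rate used
in the note afterwards are example inputs, not figures taken from this
message.

```c
#include <stdint.h>

/*
 * Seconds until a free-running counter of the given width wraps.
 * Returns 0 for unsupported arguments (width outside 1..63, zero rate).
 */
static uint64_t rollover_seconds(unsigned bits, uint64_t rate_hz)
{
	if (bits == 0 || bits > 63 || rate_hz == 0)
		return 0;
	return (1ULL << bits) / rate_hz;
}

/* Same, in whole 365-day years; good enough for an order-of-magnitude check. */
static uint64_t rollover_years(unsigned bits, uint64_t rate_hz)
{
	return rollover_seconds(bits, rate_hz) / (365ULL * 24 * 60 * 60);
}
```

At an assumed 6 MHz, a 32-bit counter wraps after 715 seconds (about 12
minutes), while a 56-bit counter would last roughly 380 years, which is why a
wide counter removes the need for wrap-around handling in practice.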

M.
--
Reality is an implementation detail.

2011-04-05 07:45:52

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Tue, Apr 05, 2011 at 12:10:24PM +0530, Santosh Shilimkar wrote:
> The only issue I see is the clock-events implemented using
> local timers capabilities in low power modes. The local timers
> won't be able wakeup CPU from DORMANT or OFF state and hence
> you will need an additional wakeup capable clock-event
> working together with the local timers using clock-notifiers.

So yet again, we have something that almost works but doesn't, and
we're going to have to have board-specific hacks to work around this.

This makes the architected timers totally *pointless*.

2011-04-05 14:16:10

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Tue, 2011-04-05 at 08:45 +0100, Russell King - ARM Linux wrote:
> On Tue, Apr 05, 2011 at 12:10:24PM +0530, Santosh Shilimkar wrote:
> > The only issue I see is the clock-events implemented using
> > local timers capabilities in low power modes. The local timers
> > won't be able wakeup CPU from DORMANT or OFF state and hence
> > you will need an additional wakeup capable clock-event
> > working together with the local timers using clock-notifiers.
>
> So yet again, we have something that almost works but doesn't, and
> we're going to have to have board-specific hacks to work around this.
>
> This makes the architected timers totally *pointless*.

The point of architected timers is that you now get fast access to a
stable clock source with a fixed frequency that doesn't change with
cpufreq. You can also configure the clock source to be accessible from
user space (via another kuser helper) so that you have a fast
gettimeofday implementation (useful in the graphics world). Virtual
counters are another advantage.

The architecture doesn't say much about the power domains, so
implementations are allowed to vary here. Of course it would have been
better if it did but this doesn't make the architected timers totally
pointless (compare them with the A9 timers).

--
Catalin

2011-04-05 22:16:27

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

2011/4/5 Santosh Shilimkar <[email protected]>:
> [Me]
>> (And third it will also eventually need to hook into the timer-based
>> delay framework that I think Nokia is working on to be really
>> useful, else all delays become unpredictable.)
>>
> Do you mean udelay()/mdelay() here ?

Yes. Stephen Boyd from Qualcomm has floated patches to fix it for the
ARM architecture, I just pushed him again. We use it in our
ST-Ericsson products.

Ideally you'd want that to go along with the A15 timer stuff so that this
monotonic high-precision timer is also used for udelay()/mdelay().

Yours,
Linus Walleij

2011-04-05 22:22:20

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

2011/4/5 Santosh Shilimkar <[email protected]>:

> The only issue I see is the clock-events implemented using
> local timers capabilities in low power modes. The local timers
> won't be able wakeup CPU from DORMANT or OFF state and hence
> you will need an additional wakeup capable clock-event
> working together with the local timers using clock-notifiers.

And this is because the IRQs it emits are local and thus cannot wake
the system? This sounds way backwards... A simple naive solution
would have been to just route out an external IRQ line back from a
selected timer and into the GIC so it will be able to wake up the
system, right?

Yours,
Linus Walleij

2011-04-05 23:19:39

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

2011/4/1 Linus Torvalds <[email protected]>:

> If you have discoverable hardware, use it.
>
> But by "discoverable hardware" I mean something like PCI config
> cycles. IOW, real hardware features.

The ARM AMBA architecture actually has such a thing, or a
little of it, found in drivers/amba/bus.c.

Basically it requires you to get the physical address and size of
each peripheral; then, at offset -0x10 from the end address (usually
at even 4K pages), if you find the magic number 0xB105F00D
(ARM has a sense of humour, obviously), you can find something
like the PCI IDs at offset -0x20: manufacturer ID, version number
and revision of the hardware.
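
A small sketch of that probing scheme, run against an in-memory copy of a
register block rather than real hardware. It assumes that each ID is spread
over four consecutive 32-bit registers carrying one byte each, least
significant byte first; the function names read_id() and probe_block() are
invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define CELL_ID_MAGIC 0xB105F00Du	/* magic value quoted in the message */

/*
 * Assemble a 32-bit ID from four consecutive 32-bit registers located
 * back_offset bytes before the end of the block, taking the low byte of
 * each register, least significant byte first (an assumption, see above).
 */
static uint32_t read_id(const uint32_t *regs, size_t size_bytes,
			size_t back_offset)
{
	const uint32_t *p = regs + (size_bytes - back_offset) / 4;
	uint32_t id = 0;

	for (int i = 0; i < 4; i++)
		id |= (p[i] & 0xFFu) << (i * 8);
	return id;
}

/* Returns 1 and stores the peripheral ID if the magic cell ID is present. */
static int probe_block(const uint32_t *regs, size_t size_bytes,
		       uint32_t *periph_id)
{
	if (read_id(regs, size_bytes, 0x10) != CELL_ID_MAGIC)
		return 0;
	*periph_id = read_id(regs, size_bytes, 0x20);
	return 1;
}
```

A block without the magic number is simply reported as not identifiable,
which is the situation described further down for silicon that never
implemented the scheme.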

It isn't very hard to imagine that mechanism providing IRQ
numbers or DMA channel allocation and other such data
so it becomes more plug'n'play-ish. If the hardware had a
list of device physical whereabouts in a specific location too,
the system would be quite self-describing.

Not as sexy as the separate PCI configuration space
though, it's just hardcoded in along with the device I/O pages.

Apparently ARM pushed this in their few initial cells
and manufacturers are free to reuse this system for their
silicon. ST Microelectronics for example actually use it to
a larger extent. But since it was not mandatory and there
was no clear way on how to register magic numbers with
this system (like the PCI-SIG), it simply failed. Silicon
foundries didn't care or even know about it, neglecting to put
this 0xB105F00D in place.

IMO the world would have been much better off if
ARM mandated that all vendors *must* use this scheme
for their hardware blocks if they are to license the AMBA
bus incarnations, but they don't.

Maybe the ARM guys on the list have some background on
this?

Yours,
Linus Walleij

2011-04-06 06:11:43

by Barry Song

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

2011/4/1 Arnd Bergmann <[email protected]>:
> On Friday 01 April 2011, Ingo Molnar wrote:
>> IMO the right answer is what Linus and Thomas outlined:
>>
>> 1) provide a small number of clean examples and clean abstractions
>> 2) to not pull new crap from that point on
>> 3) do this gradually but consistently
>>
>> I.e. make all your requirements technical and actionable - avoid sweeping,
>> impossible to meet requirements. Do not require people to clean up all of the
>> existing mess straight away (they cannot realistically do it), do not summarily
>> block the flow of patches, but be firm about drawing a line in the sand and be
>> firm about not introducing new mess in a gradually growing list of well-chosen
>> areas of focus.
>>
>> Rinse, repeat.
>
> I believe getting to point 1 is the hard part here. There are a lot of things
> that are wrong with the mach-* (and also plat-*) implementations, and I don't
> think we have one today that can really serve as an example. Most decisions
> made in there made a lot of sense when they were introduced, and declaring
> code that was perfectly acceptable yesterday to be unacceptable crap today
> is not going to be met with much understanding by the someone who just
> wants to add support for one more board to 100 already existing ones in the
> same SoC family.
>
> I would actually suggest a different much more radical start: Fork the way
> that platforms are managed today, and start an alternative way of setting
> up boards and devices together with the proven ARM core kernel infrastructure,
> based on these observations (please correct me if some of them don't make
> sense):
>
> 1. The core arch code is not a problem (Russell does a great job here)
> 2. The platform specific code contains a lot of crap that doesn't belong there
> (not enough reviewers to push back on crap)
> 3. The amount of crap in platform specific files is growing exponentially,
> despite the best efforts of a handful of people to clean it up.
> 4. Having one source file per board does not scale any more.
> 5. Discoverable hardware would solve this, but is not going to happen
> in practice.
> 6. Board firmware would not solve this and is usually not present.
> 7. Boot loaders can not be trusted to pass valid information
> 8. Device tree blobs can solve a lot of the problems, and nobody has
> come up with a better solution.

ARM BSP is still blasting! We are planning to merge our new ARM
Cortex-A9 SoC into the kernel, so I am wondering whether the traditional
ARM BSP approach can still be accepted, or whether we must move to
device tree. I haven't seen any ARM device tree code enter mainline yet,
although we can get those patches from Linaro 2.6.38. So what's the plan
for merging ARM device tree support?

What I have seen is that even the BSP directory layout differs between
ARM SoC companies.

samsung has three levels:
plat-samsung
           plat-s3c24xx
                    mach-s3c2410
                    mach-s3c2440
           plat-s5p
                    mach-s5pv210
                    mach-s5pv310

TI has two levels:
plat-omap
           mach-omap1
           mach-omap2

Nvidia has one level:
mach-tegra

I didn't find any rule about what code should be placed in which
directories. Different companies have different ways. It looks like
the only agreement is that board files are in mach-xxx. Any suggestions
for that?

BTW, we don't want to "dick around", which Linus has been very angry
about. We want to fix more of the issues this email pointed out before
we send patches.

> 9. All interesting work is going into a handful of platforms, all of which
> are ARMv7 based.
> 10. We do not want to discontinue support for old boards that work fine.
> 11. Massive changes to existing platforms would cause massive breakage.
> 12. Supporting many different boards with a single kernel binary is a
> useful goal.
> 13. Infrastructure code should be cross-platform, not duplicated across
> platforms.
> 14. 32 bit ARM is hitting the wall in the next years (Cortex-A15 is
> actually adding PAE support, which has failed to solve this on
> other architectures).
> 15. We need to solve the platform problem before 64 bit support comes
> and adds another dimension to the complexity.
>
> Based on these assumptions, my preferred strategy would be to create a new
> mach-nocrap directory with a documented set of rules (to be adapted when
> necessary):
>
> * Strictly no crap
>   * No board files
>   * No hardcoded memory maps
>   * No lists of interrupts and GPIOs
> * All infrastructure added must be portable to all ARMv7 based SoCs.
> (ARMv6 can be added later)
> * 64 bit safe code only.
> * SMP safe code only.
> * All board specific information must come from a device tree and
> be run-time detected.
> * Must use the same device drivers as existing platforms
> * Should share platform drivers (interrupt controller, gpio, timer, ...)
> with existing platforms where appropriate.
> * Code quality takes priority over stability in mach-nocrap, but must not
> break other platforms.
>
> Until we have something working there, I think we should still generally
> allow new code to the existing platforms, and even new platforms to be
> added, while trying to keep the quality as high as possible but without
> changing the rules for them or doing any major treewide reworks.
>
> Once the mach-nocrap approach has turned into something usable, we can
> proceed on three fronts:
> 1. delete actively maintained boards from the other platforms once they
> are no longer needed there
> 2. generalize concepts from mach-nocrap by applying them to all boards,
> similar to the cleanup work that people have always been doing.
> 3. gradually make the rules for adding new code in other platforms stricter,
> up to the point where they are bugfix only.
>
>> If companies do not 'bother to push upstream', then management will eventually
>> notice negative economic consequences:
>>
>> ...
>
> Good points, I fully agree with these. I also think that the SoC companies
> are actually understanding this nowadays, and that is exactly the reason
> why we see so much code getting pushed in.
>
> Arnd
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

2011-04-06 06:42:20

by Santosh Shilimkar

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On 4/6/2011 3:52 AM, Linus Walleij wrote:
> 2011/4/5 Santosh Shilimkar<[email protected]>:
>
>> The only issue I see is the clock-events implemented using
>> local timers capabilities in low power modes. The local timers
>> won't be able to wake up the CPU from DORMANT or OFF state and hence
>> you will need an additional wakeup capable clock-event
>> working together with the local timers using clock-notifiers.
>
> And this is because the IRQs it emits are local and thus cannot wake
> the system? This sounds way backwards... A simple naïve solution
> would have been to just route out an external IRQ line back from a
> selected timer and into the GIC so it will be able to wake up the
> system, right?
>
Even the GIC would be dead in certain low power modes. It needs a
GIC extension to route these signals, and it seems to be a bit
tricky to implement in hardware since the timer logic
needs to be moved to the ALWAYS ON power domain.

Regards
Santosh

2011-04-06 06:44:18

by Santosh Shilimkar

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On 4/6/2011 3:46 AM, Linus Walleij wrote:
> 2011/4/5 Santosh Shilimkar<[email protected]>:
>> [Me]
>>> (And third it will also eventually need to hook into the timer-based
>>> delay framework that I think Nokia is working on to be really
>>> useful, else all delays become unpredictable.)
>>>
>> Do you mean udelay()/mdelay() here ?
>
> Yes. Stephen Boyd from Qualcomm has floated patches to fix it for the
> ARM architecture, I just pushed him again. We use it in our
> ST-Ericsson products.
>
I remember the post.

> Ideally you'd want that to go along with the A15 timer stuff so that this
> monotonic high-precision timer is also used for udelay()/mdelay().
>
The A15 counter which is always available and running can be used
to emulate this.
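
To make the idea concrete, here is a minimal user-space sketch of a
counter-based delay. It is only an illustration of the general polling
pattern, not the patches mentioned above and not kernel code. The names
usecs_to_ticks, counter_udelay and fake_counter are hypothetical, and
fake_counter merely stands in for reading an always-running hardware
counter.

```c
#include <stdint.h>

/* Hypothetical callback that returns the current value of a
 * free-running, monotonically increasing counter. */
typedef uint64_t (*read_counter_fn)(void);

/* Convert microseconds to counter ticks, rounding up so that the
 * delay is never shorter than requested. */
uint64_t usecs_to_ticks(uint64_t usecs, uint64_t counter_hz)
{
    return (usecs * counter_hz + 999999u) / 1000000u;
}

/* Busy-wait until at least `usecs` microseconds have elapsed
 * according to the counter. */
void counter_udelay(read_counter_fn read_counter, uint64_t counter_hz,
                    uint64_t usecs)
{
    uint64_t start = read_counter();
    uint64_t ticks = usecs_to_ticks(usecs, counter_hz);

    /* Unsigned subtraction stays correct if the counter wraps. */
    while (read_counter() - start < ticks)
        ;
}

/* Stand-in for hardware (assumption for this example only): every
 * read advances the simulated counter by three ticks. */
uint64_t fake_ticks;

uint64_t fake_counter(void)
{
    fake_ticks += 3;
    return fake_ticks;
}
```

Because the loop depends only on the counter frequency, the delay does
not need recalibrating when the CPU clock changes, which is the point
of using such a counter for udelay()/mdelay().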

Regards
Santosh

2011-04-06 07:32:23

by Bryan Wu

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, Apr 6, 2011 at 2:11 PM, Barry Song <[email protected]> wrote:
> 2011/4/1 Arnd Bergmann <[email protected]>:
>> On Friday 01 April 2011, Ingo Molnar wrote:
>>> IMO the right answer is what Linus and Thomas outlined:
>>>
>>>    1) provide a small number of clean examples and clean abstractions
>>>    2) to not pull new crap from that point on
>>>    3) do this gradually but consistently
>>>
>>> I.e. make all your requirements technical and actionable - avoid sweeping,
>>> impossible to meet requirements. Do not require people to clean up all of the
>>> existing mess straight away (they cannot realistically do it), do not summarily
>>> block the flow of patches, but be firm about drawing a line in the sand and be
>>> firm about not introducing new mess in a gradually growing list of well-chosen
>>> areas of focus.
>>>
>>> Rinse, repeat.
>>
>> I believe getting to point 1 is the hard part here. There are a lot of things
>> that are wrong with the mach-* (and also plat-*) implementations, and I don't
>> think we have one today that can really serve as an example. Most decisions
>> made in there made a lot of sense when they were introduced, and declaring
>> code that was perfectly acceptable yesterday to be unacceptable crap today
>> is not going to be met with much understanding by someone who just
>> wants to add support for one more board to 100 already existing ones in the
>> same SoC family.
>>
>> I would actually suggest a different much more radical start: Fork the way
>> that platforms are managed today, and start an alternative way of setting
>> up boards and devices together with the proven ARM core kernel infrastructure,
>> based on these observations (please correct me if some of them don't make
>> sense):
>>
>> 1. The core arch code is not a problem (Russell does a great job here)
>> 2. The platform specific code contains a lot of crap that doesn't belong there
>>    (not enough reviewers to push back on crap)
>> 3. The amount of crap in platform specific files is growing exponentially,
>>    despite the best efforts of a handful of people to clean it up.
>> 4. Having one source file per board does not scale any more.
>> 5. Discoverable hardware would solve this, but is not going to happen
>>    in practice.
>> 6. Board firmware would not solve this and is usually not present.
>> 7. Boot loaders can not be trusted to pass valid information
>> 8. Device tree blobs can solve a lot of the problems, and nobody has
>>    come up with a better solution.
>
> ARM BSP is still blasting! we are planning to merge our new ARM
> cortex-a9 SoC into kernel.

As far as I know, Barry is working on a new SoC family based on
Cortex-A9. He asked me/Eric personally about this issue before; it is
quite confusing for newcomers. On one hand, they want to follow the
mainline style to join our upstream family; on the other hand, if they
duplicate some crap from other SoC families, they will bring us
trouble or more crap.

> So I am just wondering whether traditional
> ARM BSP way can still be accepted, or we must move to use device tree?
> but i have't seen any arm device tree codes enter mainline yet. but we
> can get those patches from linaro 2.6.38. So what's the plan for
> merging arm device tree?
>

I suggest you get a dedicated person to work on DT support for
your SoC. As far as I can tell from this thread, DT will soon be
heavily supported by other SoCs.

> What i have seen is that the BSP architecture of different ARM SoC
> companies is even different.
>
> samsung has three levels:
> plat-samsung
>            plat-s3c24xx
>                     mach-s3c2410
>                     mach-s3c2440
>            plat-s5p
>                     mach-s5pv210
>                     mach-s5pv310
>
> TI has two levels:
> plat-omap
>            mach-omap1
>            mach-omap2
>
> Nvidia has one level:
> mach-tegra
>
> I didn't find any rule about what codes should be placed in what
> directories. Different companies have different ways. It looks like
> the only agreement is board files are in mach-xxx. Any suggestions for
> that?
>

That's totally frustrating for a newcomer, I think. Is it possible
that we do more unification first and then let newcomers follow,
like:

plat-common (or just named 'plat') - a common platform framework for
all ARM based SoCs, which might contain the IRQ framework, GPIO, timer,
clock, PWM or other common things. SoC players would just need to add
one file to enable the common platform things on their SoC, such as
plat-omap.c, plat-imx.c, plat-samsung.c, etc.

mach-'soc' - for machine or board related code, such as mach-omap,
mach-imx, ..., or maybe we can also introduce mach-common to share other
common machine or board layer code. I guess it will be some common
machine related API functions.

It's just a simple idea; we still need lots of work to make that happen.

Thanks,
-Bryan

> BTW, we don't want to "dick around", which Linus has been very angry.
> we want to fix more issues this email pointed out before we send
> patches.
>
>> 9. All interesting work is going into a handful of platforms, all of which
>>    are ARMv7 based.
>> 10. We do not want to discontinue support for old boards that work fine.
>> 11. Massive changes to existing platforms would cause massive breakage.
>> 12. Supporting many different boards with a single kernel binary is a
>>     useful goal.
>> 13. Infrastructure code should be cross-platform, not duplicated across
>>     platforms.
>> 14. 32 bit ARM is hitting the wall in the next years (Cortex-A15 is
>>     actually adding PAE support, which has failed to solve this on
>>     other architectures).
>> 15. We need to solve the platform problem before 64 bit support comes
>>     and adds another dimension to the complexity.
>>
>> Based on these assumptions, my preferred strategy would be to create a new
>> mach-nocrap directory with a documented set of rules (to be adapted when
>> necessary):
>>
>> * Strictly no crap
>>   * No board files
>>   * No hardcoded memory maps
>>   * No lists of interrupts and GPIOs
>> * All infrastructure added must be portable to all ARMv7 based SoCs.
>>   (ARMv6 can be added later)
>> * 64 bit safe code only.
>> * SMP safe code only.
>> * All board specific information must come from a device tree and
>>   be run-time detected.
>> * Must use the same device drivers as existing platforms
>> * Should share platform drivers (interrupt controller, gpio, timer, ...)
>>   with existing platforms where appropriate.
>> * Code quality takes priority over stability in mach-nocrap, but must not
>>   break other platforms.
>>
>> Until we have something working there, I think we should still generally
>> allow new code to the existing platforms, and even new platforms to be
>> added, while trying to keep the quality as high as possible but without
>> changing the rules for them or doing any major treewide reworks.
>>
>> Once the mach-nocrap approach has turned into something usable, we can
>> proceed on three fronts:
>> 1. delete actively maintained boards from the other platforms once they
>>    are no longer needed there
>> 2. generalize concepts from mach-nocrap by applying them to all boards,
>>    similar to the cleanup work that people have always been doing.
>> 3. gradually make the rules for adding new code in other platforms stricter,
>>    up to the point where they are bugfix only.
>>
>>> If companies do not 'bother to push upstream', then management will eventually
>>> notice negative economic consequences:
>>>
>>> ...
>>
>> Good points, I fully agree with these. I also think that the SoC companies
>> are actually understanding this nowadays, and that is exactly the reason
>> why we see so much code getting pushed in.
>>
>>         Arnd
>>

2011-04-06 08:42:02

[permalink] [raw]

Subject: Re: [GIT PULL] omap changes for v2.6.39 merge window

On Wed, 2011-04-06 at 00:19 +0100, Linus Walleij wrote:
> 2011/4/1 Linus Torvalds <[email protected]>:
>
> > If you have discoverable hardware, use it.
> >
> > But by "discoverable hardware" I mean something like PCI config
> > cycles. IOW, real hardware features.
>
> The ARM AMBA architecture actually has such a thing, or a
> little of it, found in drivers/amba/bus.c.
>
> Basically it requires you to get the physical address and size of
> each peripheral, then at offset -0x10 from the end address (usually
> at even 4K pages), if you find the magic number 0xB105F00D
> (ARM has a sense of humour, obviously) you can find something
> alike the PCI IDs at offset -0x20, manufacturer ID, version number
> and revision of the hardware.

I don't think this was ever part of the AMBA specification. It was just
some convention used within ARM for the PrimeCell peripherals. Other
AMBA licensees did something else so it's not a reliable mechanism (I
think ARM gave up on this as well in recent peripherals).
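
For readers unfamiliar with the convention quoted above, here is a
minimal sketch of the probe it describes. It is an illustration under
stated assumptions, not the code in drivers/amba/bus.c: each ID is
spread over four consecutive 32-bit registers carrying one byte each,
least significant byte first, and the register window is treated as
plain memory. The names read_id and primecell_probe are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Magic cell ID quoted in the thread. */
#define PRIMECELL_CID 0xB105F00Du

/* Assemble a 32-bit ID from four consecutive 32-bit registers, using
 * only the low byte of each, least significant byte first. */
static uint32_t read_id(const volatile uint32_t *regs)
{
    uint32_t id = 0;

    for (int i = 0; i < 4; i++)
        id |= (regs[i] & 0xFFu) << (i * 8);
    return id;
}

/* base: start of the peripheral's register window (hypothetical mapping)
 * size: size of the window in bytes, usually 4 KiB
 *
 * Returns 1 and stores the peripheral ID if the magic number is found
 * at offset -0x10 from the end of the window, otherwise returns 0. */
int primecell_probe(const volatile uint32_t *base, size_t size,
                    uint32_t *periph_id)
{
    if (size < 0x20)
        return 0;

    const volatile uint32_t *cid_regs = base + (size - 0x10) / sizeof(uint32_t);
    const volatile uint32_t *pid_regs = base + (size - 0x20) / sizeof(uint32_t);

    if (read_id(cid_regs) != PRIMECELL_CID)
        return 0;

    *periph_id = read_id(pid_regs);
    return 1;
}
```

Note that the caller still has to supply the base address and size of
every peripheral up front, so the scheme identifies a block rather than
discovering it.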

> IMO the world would have been much better off if
> ARM mandated that all vendors *must* use this scheme
> for their hardware blocks if they are to license the AMBA
> bus incarnations, but they don't.

As you said, it had limitations like not providing IRQ or DMA
information, so not entirely useful.

FDT may have its problems but so far is a more generic approach for
specifying such information.

--
Catalin

2011-04-07 01:46:04