2005-09-05 07:49:45

by Rolf Eike Beer

Subject: Re: rc5 seemed to kill a disk that rc4-mm1 likes. Also some X trouble.

On Tuesday, 30 August 2005 10:07, Rolf Eike Beer wrote:
>Linus Torvalds wrote:
>>On Mon, 22 Aug 2005, Rolf Eike Beer wrote:
>>> >It's a PII-350 with more or less SuSE 9.3. The machine has no net
>>> > access, so I can only try to narrow it down to one rc at the weekend.
>>>
>>> 2.6.12 works fine, everything since 2.6.13-rc1 breaks it.
>>
>>Gaah. I don't see anything really obvious in that range. However, I notice
>>that pci_mmap_resource() (in drivers/pci/pci-sysfs.c) now has
>>
>>+ if (i >= PCI_ROM_RESOURCE)
>>+ return -ENODEV;
>>
>>which seems a bit bogus. Why wouldn't we allow the ROM resource to be
>>mapped? I could imagine that the X server would very much like to mmap it,
>>although I don't know if modern X actually does that. The fact that it
>>works when root runs the X server and causes problems for normal users
>>does seem like there's something that root can do that users can't do, and
>>doing a mmap() on /dev/mem might be just that.
>>
>>Eike, maybe you could change the ">=" to just ">" instead?
>>
>>PS. The patch that introduced this was billed as "no change for anything
>>but ppc". Tssk.
>
>This does not fix the problem. I'll narrow it down to one git snapshot next
>weekend (forgot the tarball on friday).

The problem appeared between 2.6.12-git3 and 2.6.12-git4.

Eike



2005-09-05 08:45:37

by Linus Torvalds

Subject: Re: rc5 seemed to kill a disk that rc4-mm1 likes. Also some X trouble.


On Mon, 5 Sep 2005, Rolf Eike Beer wrote:
>
> The problem appeared between 2.6.12-git3 and 2.6.12-git4.

Just for reference, that's git ID's

1d345dac1f30af1cd9f3a1faa12f9f18f17f236e..2a5a68b840cbab31baab2d9b2e1e6de3b289ae1e

and that's 225 commits and the diff is 55,781 lines long.

It would be very good if you could try to use raw git and narrow it down a
bit more. It's really easy these days with a recent git version, just do

git bisect start
git bisect good 1d345dac1f30af1cd9f3a1faa12f9f18f17f236e
git bisect bad 2a5a68b840cbab31baab2d9b2e1e6de3b289ae1e

and off you go.. That will select a new kernel for you to try, which
basically cuts down the commits to ~110 - and if you can test just a few
kernels and binary-search a bit more, we'd have it down to just a couple.
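
The halving described above can be sketched in a few lines of Python. This is only an illustration of the binary search idea, not how git implements it: it assumes a strictly linear history (real kernel history is a DAG with merges), and the names `bisect_first_bad`, `FIRST_BAD` and the commit numbering are hypothetical.

```python
import math


def bisect_first_bad(commits, is_bad):
    """Binary-search a linear history for the first bad commit.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is bad as well.
    Returns (index of first bad commit, number of kernels tested).
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1  # one "build and boot this kernel" round
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return bad, steps


# Hypothetical history: index 0 is the known-good commit,
# followed by 225 candidate commits (the count quoted in the mail).
commits = list(range(226))
FIRST_BAD = 137  # made-up culprit, for illustration only

first_bad, steps = bisect_first_bad(commits, lambda c: c >= FIRST_BAD)
max_steps = math.ceil(math.log2(len(commits) - 1))
print(first_bad, steps, max_steps)
```

Under these assumptions a range of 225 commits needs at most 8 test kernels, which is why the first step already brings it down to roughly 110.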

If you want to try to work smarter (rather than a brute-force binary search
thing), this command line:

git-whatchanged -p \
1d345dac1f30af1cd9f3a1faa12f9f18f17f236e..2a5a68b840cbab31baab2d9b2e1e6de3b289ae1e \
drivers/video

will actually give you some very good information on what to try (I forget
your exact original problem - I'm writing this from Italy, and I don't
have my full email archives here. It was some MGA card that stopped
working, no? Or was there something else?).

Anyway, git users really have a lot of nifty tools to help chase down bugs
like this. I used that "git bisect" thing twice myself last week. And the
"git-whatchanged" thing really is pretty flexible: as you can see, you can
limit it to both a range of commits and a certain subdirectory (or a _set_
of subdirectories and/or individual files - you can have as many pathname
limits as you want).

And that "-p" thing makes it show the whole diff for the thing (replace it
with a "-s" if you just want to see the descriptions and be silent about
the actual diff).

All in my never-ending quest to make people more aware of how they can use
git to pinpoint the source of kernel bugs.

Linus

2005-09-05 20:02:28

by Sonny Rao

Subject: Re: rc5 seemed to kill a disk that rc4-mm1 likes. Also some X trouble.

On Mon, Sep 05, 2005 at 01:45:28AM -0700, Linus Torvalds wrote:
>
> On Mon, 5 Sep 2005, Rolf Eike Beer wrote:
> >
> > The problem appeared between 2.6.12-git3 and 2.6.12-git4.
>
> Just for reference, that's git ID's
>
> 1d345dac1f30af1cd9f3a1faa12f9f18f17f236e..2a5a68b840cbab31baab2d9b2e1e6de3b289ae1e
>
> and that's 225 commits and the diff is 55,781 lines long.
>
> It would be very good if you could try to use raw git and narrow it down a
> bit more. It's really easy these days with a recent git version, just do
>
> git bisect start
> git bisect good 1d345dac1f30af1cd9f3a1faa12f9f18f17f236e
> git bisect bad 2a5a68b840cbab31baab2d9b2e1e6de3b289ae1e
>
> and off you go.. That will select a new kernel for you to try, which
> basically cuts down the commits to ~110 - and if you can test just a few
> kernels and binary-search a bit more, we'd have it down to just a couple.

Can this method detect breakages that are spread across more than one
patch? I suppose it'll just trigger on the last patch committed in the
set in this case?

Sonny

2005-09-06 07:44:39

by Linus Torvalds

Subject: Re: rc5 seemed to kill a disk that rc4-mm1 likes. Also some X trouble.



On Mon, 5 Sep 2005, Sonny Rao wrote:
>
> Can this method detect breakages that are spread across more than one
> patch? I suppose it'll just trigger on the last patch committed in the
> set in this case?

It will trigger on just the commit that introduces the user-visible
breakage, so yes, it's usually the last in a series (or the first one, for
that matter).

And it's not perfect. A problem that fades in and out is not something you
can do binary searching on. For example, sometimes a bug gets introduced
and ends up being dependent on things like cache alignment or some
variable layout etc, so you only _see_ the problem occasionally, and it
ends up happening due to totally unrelated changes - then the bisection
algorithm ends up being totally useless.
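
A small Python sketch of that failure mode, under made-up assumptions: a linear history of 100 commits, a bug actually introduced at commit 40, and a symptom that only shows up on every third commit (standing in for an unrelated layout or alignment effect). The names `bisect_first_bad`, `INTRODUCED` and `symptom_visible` are hypothetical.

```python
def bisect_first_bad(commits, is_bad):
    """Plain binary search; only valid if 'bad' never flips back to 'good'."""
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return bad


INTRODUCED = 40  # commit that really introduced the bug (made up)


def symptom_visible(commit):
    # The bug is present from INTRODUCED onwards, but in this toy model it
    # is only observable when the commit number is a multiple of 3.
    return commit >= INTRODUCED and commit % 3 == 0


commits = list(range(100))  # commit 0 tests good, commit 99 tests bad
blamed = bisect_first_bad(commits, symptom_visible)
print("bug introduced at", INTRODUCED, "but bisection blames", blamed)
```

Every midpoint the search happens to test looks good here, so it walks all the way to the end and blames commit 99, which has nothing to do with the real cause.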

Linus