2011-06-01 19:25:33

by Steven Rostedt

[permalink] [raw]
Subject: [PATCH] ide: Fix bug caused by git merge

Running ktest.pl randconfig on an older box of mine that has an ide cd,
I found a config that caused the machine not to boot. Doing a ktest
config bisect, I found that if I have CONFIG_BLK_DEV_PIIX set, it caused
the bug.

Then I took that same config and found that 2.6.39 boots fine. So I
kicked off ktest to do a git bisect, and this is where everything went
to hell. The bisect came to a point where it wouldn't boot from a branch
that was based off of v2.6.39-rc4. But then I found v2.6.39-rc4 didn't
boot either. Something fixed this bug between rc4 and 39, but this
branch didn't have it.

I than ran a reverse git bisect with ktest (where good is bad and bad is
good) between v2.6.39 and v2.6.39-rc4 to find what fixed this bug, and
it found:

commit 7eec77a1816a7042591a6cbdb4820e9e7ebffe0e
Author: Tejun Heo <[email protected]>
ide: unexport DISK_EVENT_MEDIA_CHANGE for ide-gd and ide-cd

Great I thought, and then applied this patch to my git bisect again
(modified ktest for v3.1) to find the other bug. But instead I found
that this was the same bug, but it reappeared. The culprit was a line in
drivers/ide/ide-cd.c. What happened was that a git merge brought back
some code that was deleted.

The merge 0eb8e885726a3a93206510092bbc7e39e272f6ef that merged the
commits:

af75cd3c67845ebe31d2df9a780889a5ebecef11

and

0a58e077eb600d1efd7e54ad9926a75a39d7f8ae

had an issue.

If I did a:

git blame af75cd3c67845ebe31d2df9a780889a5ebecef11 drivers/ide/ide-cd.c

I would find:

5b03a1b1 (Tejun Heo 2011-03-09 19:54:27 +0100 1785) g->events = DISK_EVENT_MEDIA_CHANGE;

But if I did a:

git blame 0a58e077eb600d1efd7e54ad9926a75a39d7f8ae drivers/ide/ide-cd.c

Which contains the commit 7eec77a1816a7042591a6cbdb4820e9e7ebffe0e which
removes this line, it does not exist.

The issue here is that the commit 5b03a1b1 is a common ancestor to both
commits. But even though the commit 7eec77a1 removed that line, for some
reason, the merge put that line back. No other commit added that line to
the code.

By simply removing that line, my machine now boots again with the bad
config.

Signed-off-by: Steven Rostedt <[email protected]>

diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 6e5123b..144d272 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -1782,7 +1782,6 @@ static int ide_cd_probe(ide_drive_t *drive)
ide_cd_read_toc(drive, &sense);
g->fops = &idecd_ops;
g->flags |= GENHD_FL_REMOVABLE | GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
- g->events = DISK_EVENT_MEDIA_CHANGE;
add_disk(g);
return 0;



2011-06-01 20:28:39

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] ide: Fix bug caused by git merge

On Thu, Jun 2, 2011 at 4:25 AM, Steven Rostedt <[email protected]> wrote:
>
> The issue here is that the commit 5b03a1b1 is a common ancestor to both
> commits. But even though the commit 7eec77a1 removed that line, for some
> reason, the merge put that line back. No other commit added that line to
> the code.

Incorrect.

What added that line back to the code was commit 698567f3fa79, a merge by Jens.

It had conflicts in drivers/ide/ide-cd.c (due to commit d4dc210f69bc,
which added the flag GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE on the line
just before). And Jens did a mis-merge there.

Jens - this is a prime example of why people SHOULD NOT DO CROSS-MERGES!

You are just not as used to doing merges as I am, and when there are
conflicts, you clearly didn't actually look at the two sides to notice
that one side removed the line, so when you resolved it by taking all
the conflicting stuff, you re-introduced the line that should have
been deleted!

At the time of the conflicting merge, a

gitk --merge drivers/ide/ide-cd.c

would have shown the two conflicting commits, and told you that the
line needs to not exist in the merge.

You can re-create that "gitk --merge" in the current tree with

gitk 698567f3fa79^...698567f3fa79^2 drivers/ide/ide-cd.c

and because you just picked the result from one parent (instead of
picking the proper result of *combining* the changes), a

git show 698567f3fa79

doesn't show anything at all.

Grr. I now wondered about the OTHER conflicts that are mentioned in
that commit, AND THEY ALL HAVE THE SAME BUG. For all three files you
just picked one side, not the other.

Jens, you really can't do that. When you see a conflict marker, you
need to look at WHY. The conflict may well be because of removed lines
rather than added lines, the correct resolution is NOT necessarily to
just take everything within the conflicting region.

For example, the conflict in pcd.c looks like this (I'm re-doing the
merge properly to get the fixed diff):

disk->fops = &pcd_bdops;
<<<<<<< HEAD
disk->flags = GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
disk->events = DISK_EVENT_MEDIA_CHANGE;
=======
>>>>>>> 698567f3fa79^2
}

and the rsolution is simply to

disk->fops = &pcd_bdops;
disk->flags = GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
}

not the one you did (which left the "disk->events = .." line that was
removed in the other branch,

And the merge diff should have looked like this:

diff --cc drivers/block/paride/pcd.c
index a0aabd904a51,8690e31d9932..000000000000
--- a/drivers/block/paride/pcd.c
+++ b/drivers/block/paride/pcd.c
@@@ -320,8 -320,6 +320,7 @@@ static void pcd_init_units(void
disk->first_minor = unit;
strcpy(disk->disk_name, cd->name); /* umm... */
disk->fops = &pcd_bdops;
+ disk->flags = GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
- disk->events = DISK_EVENT_MEDIA_CHANGE;
}
}



which shows how one line was added in one branch, and another one
removed. But because you just mindlessly picked one side over the
other, the merge diff ends up being empty (that's the "trivially
resolved" thing).

Pretty much exactly the same thing for the two other files.

PLEASE DON'T DO MERGES IF YOU CANNOT DO THEM!

Don't do them to remove conflicts. I'm *good* at handling conflicts. I
do it every friggin day. I might get them wrong occasionally, but it
is very very rare, exactly because I'm so used to doing merges, and I
carefully look at what actually happened on the two sides when I
resolve things. Git has wonderful support for helping resolve merges
intelligently, but you need to know how to do it. See the gitk example
above - run it and ponder.

Grr.

I'll fix it properly.

And thanks Steven for noticing, even if your analysis ended up missing this.

(Junio - what made it harder for Steven to see the reason may be due
the default history simplification. I do wonder if we should just make
"--simplify-merges" the default, because the aggressive and simple
default culling makes it hard to see merge commits like this that just
pick one side over the other. --simplify-merges is more expensive,
but doesn't have some of the problems the aggressive simplification
has)

Linus

2011-06-01 20:42:05

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] ide: Fix bug caused by git merge

On 2011-06-01 22:27, Linus Torvalds wrote:
> Jens, you really can't do that. When you see a conflict marker, you
> need to look at WHY. The conflict may well be because of removed lines
> rather than added lines, the correct resolution is NOT necessarily to
> just take everything within the conflicting region.

Of course I look at the conflicts. I don't know what happened in this
case. Contrary to what you think, I do do regular merges. Usually not in
external branches, and if I do, it's only to resolve non-trivial
conflicts.

And I don't know why this one got so screwed up, well reversed really.

--
Jens Axboe

2011-06-01 20:57:51

by Andreas Schwab

[permalink] [raw]
Subject: Re: [PATCH] ide: Fix bug caused by git merge

Linus Torvalds <[email protected]> writes:

> Git has wonderful support for helping resolve merges intelligently,
> but you need to know how to do it. See the gitk example above - run it
> and ponder.

Btw, when I redo the merge and let kdiff3 resolve the conflicts (via git
mergetool) it does exactly the right thing. :-)

Andreas.

--
Andreas Schwab, [email protected]
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2011-06-01 21:55:30

by Junio C Hamano

[permalink] [raw]
Subject: Re: [PATCH] ide: Fix bug caused by git merge

Linus Torvalds <[email protected]> writes:

> Junio - what made it harder for Steven to see the reason may be due
> the default history simplification. I do wonder if we should just make
> "--simplify-merges" the default, because the aggressive and simple
> default culling makes it hard to see merge commits like this that just
> pick one side over the other. --simplify-merges is more expensive,
> but doesn't have some of the problems the aggressive simplification
> has.

It is more expensive not just in computation cycles but in latency, as it
is inherently a "limited" operation that needs to first walk the history
and then post-process.

$ time sh -c 'git log drivers/ide/ide-cd.c | head -n 200 >/dev/null'

with and without "--simplify-merges" shows more than 50-fold differences
(0.15s vs 8s).

I am more disturbed by the fact that "git show 698567f3fa7" does not show
the mismerge (even with -c). Of course I know that not showing "taking
from one side" is by design of -c/--cc, but I still feel there should be
something we could do about it.

2011-06-01 22:02:12

by Junio C Hamano

[permalink] [raw]
Subject: Re: [PATCH] ide: Fix bug caused by git merge

Junio C Hamano <[email protected]> writes:

> Linus Torvalds <[email protected]> writes:
>
>> Junio - what made it harder for Steven to see the reason may be due
>> the default history simplification. I do wonder if we should just make
>> "--simplify-merges" the default, because the aggressive and simple
>> default culling makes it hard to see merge commits like this that just
>> pick one side over the other. --simplify-merges is more expensive,
>> but doesn't have some of the problems the aggressive simplification
>> has.
>
> It is more expensive not just in computation cycles but in latency, as it
> is inherently a "limited" operation that needs to first walk the history
> and then post-process.
>
> $ time sh -c 'git log drivers/ide/ide-cd.c | head -n 200 >/dev/null'
>
> with and without "--simplify-merges" shows more than 50-fold differences
> (0.15s vs 8s).
>
> I am more disturbed by the fact that "git show 698567f3fa7" does not show
> the mismerge (even with -c). Of course I know that not showing "taking
> from one side" is by design of -c/--cc, but I still feel there should be
> something we could do about it.

Your comment was about digging history after problem happened, so this is
a tangent in that sense, but I am wondering if it makes sense to advertise
and encourage the use of --conflict=diff3 even more, to prevent accidents
like this from happening in the first place.

Here is how our default merge conflict marker (modelled after RCS and CVS
output) look like when recreating the merge:

diff --cc drivers/ide/ide-cd.c
index 6e5123b,a5ec5a7..0000000
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@@ -1781,8 -1781,7 +1781,12 @@@ static int ide_cd_probe(ide_drive_t *dr

ide_cd_read_toc(drive, &sense);
g->fops = &idecd_ops;
++<<<<<<< HEAD
+ g->flags |= GENHD_FL_REMOVABLE | GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
+ g->events = DISK_EVENT_MEDIA_CHANGE;
++=======
+ g->flags |= GENHD_FL_REMOVABLE;
++>>>>>>> 61c4f2c
add_disk(g);
return 0;

With --conflict=diff3, it becomes crystal clear that g->events assignment
needs to be removed for even somebody who does not know the history of
this part of the kernel at all (like me).

diff --cc drivers/ide/ide-cd.c
index 6e5123b,a5ec5a7..0000000
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@@ -1781,8 -1781,7 +1781,15 @@@ static int ide_cd_probe(ide_drive_t *dr

ide_cd_read_toc(drive, &sense);
g->fops = &idecd_ops;
++<<<<<<< ours
+ g->flags |= GENHD_FL_REMOVABLE | GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
+ g->events = DISK_EVENT_MEDIA_CHANGE;
++||||||| base
+ g->flags |= GENHD_FL_REMOVABLE;
++ g->events = DISK_EVENT_MEDIA_CHANGE;
++=======
++ g->flags |= GENHD_FL_REMOVABLE;
++>>>>>>> theirs
add_disk(g);
return 0;

2011-06-02 00:01:06

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] ide: Fix bug caused by git merge

On Wed, 2011-06-01 at 15:01 -0700, Junio C Hamano wrote:

> With --conflict=diff3, it becomes crystal clear that g->events assignment
> needs to be removed for even somebody who does not know the history of
> this part of the kernel at all (like me).

OK, I like this. This has been one of my biggest complaints about
conflicts in git over the old .rej files that patch produced. In fact, I
find myself using a script from Peter Zijlstra that creates .rej files
from git conflicts, and puts back the old file before the commit was
done.

The default git commit shows you change1 and change2. It does not show
you what it was before either change. With a conflict giving a .rej
file, you see what is there now, what it was that you were trying to
commit, as well as what that commit expected to be there but wasn't. The
default git conflict lacks that last part. I have to remember this
option as it shows what I was missing from patch.

Thanks!

-- Steve

>
> diff --cc drivers/ide/ide-cd.c
> index 6e5123b,a5ec5a7..0000000
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@@ -1781,8 -1781,7 +1781,15 @@@ static int ide_cd_probe(ide_drive_t *dr
>
> ide_cd_read_toc(drive, &sense);
> g->fops = &idecd_ops;
> ++<<<<<<< ours
> + g->flags |= GENHD_FL_REMOVABLE | GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
> + g->events = DISK_EVENT_MEDIA_CHANGE;
> ++||||||| base
> + g->flags |= GENHD_FL_REMOVABLE;
> ++ g->events = DISK_EVENT_MEDIA_CHANGE;
> ++=======
> ++ g->flags |= GENHD_FL_REMOVABLE;
> ++>>>>>>> theirs
> add_disk(g);
> return 0;