2009-07-01 06:00:29

by Stephen Rothwell

Subject: linux-next: boot failure

Hi Eric,

next-20090630 failed to boot (on PowerPC Power5/6 machines):

calling .audit_watch_init+0x0/0x80 @ 1
Unable to handle kernel paging request for data at address 0xffffffffffffffff
Faulting instruction address: 0xc00000000008b440
cpu 0x0: Vector: 300 (Data Access) at [c0000000be683990]
pc: c00000000008b440: .srcu_read_lock+0x20/0x40
lr: c0000000001607cc: .fsnotify_recalc_global_mask+0x2c/0xa0
sp: c0000000be683c10
msr: 8000000000009032
dar: ffffffffffffffff
dsisr: 40010000
current = 0xc0000000be67e000
paca = 0xc00000000093b200
pid = 1, comm = swapper
enter ? for help
[link register ] c0000000001607cc .fsnotify_recalc_global_mask+0x2c/0xa0
[c0000000be683c10] c0000000be683ca0 (unreliable)
[c0000000be683ca0] c000000000160bc0 .fsnotify_obtain_group+0x1e0/0x260
[c0000000be683d60] c0000000007879e4 .audit_watch_init+0x34/0x80
[c0000000be683de0] c00000000000947c .do_one_initcall+0x6c/0x1e0
[c0000000be683ee0] c00000000076fd6c .kernel_init+0x23c/0x2c0
[c0000000be683f90] c00000000002a9bc .kernel_thread+0x54/0x70

next-20090629 was fine. The commits in 0630 and not in 0629 (from the
fsnotify tree) are:

Audit: clean up the audit_watch split
audit: convert audit watches to use fsnotify instead of inotify
audit: redo audit watch locking and refcnt in light of fsnotify
audit: do not get and put just to free a watch
fsnotify: duplicate fsnotify_mark_entry data between 2 marks
fsnotify: allow addition of duplicate fsnotify marks
audit: reimplement audit_trees using fsnotify rather than inotify
inotify: deprecate the inotify kernel interface

If I have time, I may do a bisection.
--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/



2009-07-01 11:53:19

by Sachin Sant

Subject: Re: linux-next: boot failure

Stephen Rothwell wrote:
> next-20090629 was fine. The commits in 0630 and not in 0629 (from the
> fsnotify tree) are:
>
> Audit: clean up the audit_watch split
> audit: convert audit watches to use fsnotify instead of inotify
> audit: redo audit watch locking and refcnt in light of fsnotify
> audit: do not get and put just to free a watch
> fsnotify: duplicate fsnotify_mark_entry data between 2 marks
> fsnotify: allow addition of duplicate fsnotify marks
> audit: reimplement audit_trees using fsnotify rather than inotify
> inotify: deprecate the inotify kernel interface
>
Stephen / Eric,

I too am facing a similar issue on both Power and x86.
The culprit seems to be the 2nd patch in the above list.

commit e1b79967e2b29839d16c12b534597a15d8630fc4
audit: convert audit watches to use fsnotify instead of inotify

I wasn't able to remove only patch 2 because of dependencies
on other patches. After I reverted patches 2 to 7 in the above
list, I was able to boot the machine. Then I applied patch 2
and the machine failed to boot.

Thanks
-Sachin


--

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

2009-07-01 12:21:01

by Stephen Rothwell

Subject: Re: linux-next: boot failure

Hi Sachin,

On Wed, 01 Jul 2009 17:22:59 +0530 Sachin Sant <[email protected]> wrote:
>
> I too am facing a similar issue on both Power and x86.
> The culprit seems to be the 2nd patch in the above list.
>
> commit e1b79967e2b29839d16c12b534597a15d8630fc4
> audit: convert audit watches to use fsnotify instead of inotify
>
> I wasn't able to remove only patch 2 because of dependencies
> on other patches. After I reverted patches 2 to 7 in the above
> list, I was able to boot the machine. Then I applied patch 2
> and the machine failed to boot.

Thanks for tracking this down. Now we just need a solution :-)

--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/



2009-07-01 12:35:26

by Eric Paris

Subject: Re: linux-next: boot failure

On Wed, 2009-07-01 at 16:00 +1000, Stephen Rothwell wrote:
> Hi Eric,
>
> next-20090630 failed to boot (on PowerPC Power5/6 machines):
>
> calling .audit_watch_init+0x0/0x80 @ 1
> Unable to handle kernel paging request for data at address 0xffffffffffffffff
> Faulting instruction address: 0xc00000000008b440
> cpu 0x0: Vector: 300 (Data Access) at [c0000000be683990]
> pc: c00000000008b440: .srcu_read_lock+0x20/0x40
> lr: c0000000001607cc: .fsnotify_recalc_global_mask+0x2c/0xa0
> sp: c0000000be683c10
> msr: 8000000000009032
> dar: ffffffffffffffff
> dsisr: 40010000
> current = 0xc0000000be67e000
> paca = 0xc00000000093b200
> pid = 1, comm = swapper
> enter ? for help
> [link register ] c0000000001607cc .fsnotify_recalc_global_mask+0x2c/0xa0
> [c0000000be683c10] c0000000be683ca0 (unreliable)
> [c0000000be683ca0] c000000000160bc0 .fsnotify_obtain_group+0x1e0/0x260
> [c0000000be683d60] c0000000007879e4 .audit_watch_init+0x34/0x80
> [c0000000be683de0] c00000000000947c .do_one_initcall+0x6c/0x1e0
> [c0000000be683ee0] c00000000076fd6c .kernel_init+0x23c/0x2c0
> [c0000000be683f90] c00000000002a9bc .kernel_thread+0x54/0x70

Hmmmm, I'm looking. The best I can guess is that the srcu struct in
fsnotify_recalc_global_mask hasn't been initialized. Both
audit_watch_init() (where you are having problems) and fsnotify_init()
(where we initialize the srcu struct) use subsys_initcall().

I will check the makefiles to see if kernel/built-in.o is linked in
before fs/built-in.o. I don't see a reason why audit watches need to
be that early in the kernel init process. This isn't happening on my
system so I'm asking if anyone hitting it can apply this patch and test?

Audit: audit watch init should not be before fsnotify init

From: Eric Paris <[email protected]>

Audit watch init and fsnotify init both use subsys_initcall() but since the
audit watch code is linked in before the fsnotify code the audit watch code
would be using the fsnotify srcu struct before it was initialized. This
patch fixes that problem by moving audit watch init to device_initcall() so
it happens after fsnotify is ready.

---

kernel/audit_watch.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)


diff --git a/kernel/audit_watch.c b/kernel/audit_watch.c
index 177e4b8..1295120 100644
--- a/kernel/audit_watch.c
+++ b/kernel/audit_watch.c
@@ -584,4 +584,4 @@ static int __init audit_watch_init(void)
 	}
 	return 0;
 }
-subsys_initcall(audit_watch_init);
+device_initcall(audit_watch_init);

2009-07-01 14:46:31

by Sachin Sant

Subject: Re: linux-next: boot failure

Eric Paris wrote:
> I will check the makefiles to see if kernel/built-in.o is linked in
> before fs/built-in.o. I don't see a reason why audit watches need to
> be that early in the kernel init process. This isn't happening on my
> system so I'm asking if anyone hitting it can apply this patch and test?
>
> Audit: audit watch init should not be before fsnotify init
>
> From: Eric Paris <[email protected]>
With this patch I am able to boot today's next on my machines.

Tested-by: Sachin Sant <[email protected]>

Thanks
-Sachin


--

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------