2008-10-10 09:41:59

by Pekka Enberg

[permalink] [raw]
Subject: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

(I'm cc'ing Tom, Jens, and LKML.)

On Fri, 2008-10-10 at 12:10 +0300, Pekka Enberg wrote:
> > I'm seeing a hard lock on my machine when I run kmemtraced with the
> > following patch applied:
> >
> > http://git.kernel.org/?p=linux/kernel/git/penberg/slab-2.6.git;a=commitdiff;h=17ca1d5506b1db433f0b7167a627bfd55d319dd3
> >
> > I can enable/disable kmemtrace via the debugfs files fine and can also
> > read the relay files with cat.
> >
> > Any idea where this is coming from?
>
> OK, it's the first splice() call in reader_thread() that causes the
> hang. Hmm.

To recap, with a CONFIG_KMEMTRACE enabled kernel from the
"topic/kmemtrace" branch of:

git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git topic/kmemtrace

running the "kmemtraced" program from

git://git.kernel.org/pub/scm/linux/kernel/git/penberg/kmemtrace-user.git

results to a hard lock on my machine. I am unable to find anything
obviously wrong with it and as I can read/write the relay files just
fine, I'm beginning to think it's problem in relayfs splice
implementation.

Tom, thoughts?

Pekka


Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

On Fri, Oct 10, 2008 at 12:42:03PM +0300, Pekka Enberg wrote:
> (I'm cc'ing Tom, Jens, and LKML.)
>
> On Fri, 2008-10-10 at 12:10 +0300, Pekka Enberg wrote:
> > > I'm seeing a hard lock on my machine when I run kmemtraced with the
> > > following patch applied:
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/penberg/slab-2.6.git;a=commitdiff;h=17ca1d5506b1db433f0b7167a627bfd55d319dd3
> > >
> > > I can enable/disable kmemtrace via the debugfs files fine and can also
> > > read the relay files with cat.
> > >
> > > Any idea where this is coming from?
> >
> > OK, it's the first splice() call in reader_thread() that causes the
> > hang. Hmm.
>
> To recap, with a CONFIG_KMEMTRACE enabled kernel from the
> "topic/kmemtrace" branch of:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git topic/kmemtrace
>
> running the "kmemtraced" program from
>
> git://git.kernel.org/pub/scm/linux/kernel/git/penberg/kmemtrace-user.git
>
> results to a hard lock on my machine. I am unable to find anything
> obviously wrong with it and as I can read/write the relay files just
> fine, I'm beginning to think it's problem in relayfs splice
> implementation.
>
> Tom, thoughts?
>
> Pekka
>

Hmm, I've seen kmemtraced not stopping, but no hard locks. It seems not
even SIGKILL can stop it. As far as I know, SIGKILL should stop even a
blocking splice() (in case this is the problem). This is what I did:
# ./kmemtraced &
# killall -KILL kmemtraced
... kmemtraced still running and using CPU.


Eduard

2008-10-11 04:59:16

by Tom Zanussi

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice


On Fri, 2008-10-10 at 12:42 +0300, Pekka Enberg wrote:
> (I'm cc'ing Tom, Jens, and LKML.)
>
> On Fri, 2008-10-10 at 12:10 +0300, Pekka Enberg wrote:
> > > I'm seeing a hard lock on my machine when I run kmemtraced with the
> > > following patch applied:
> > >
> > > http://git.kernel.org/?p=linux/kernel/git/penberg/slab-2.6.git;a=commitdiff;h=17ca1d5506b1db433f0b7167a627bfd55d319dd3
> > >
> > > I can enable/disable kmemtrace via the debugfs files fine and can also
> > > read the relay files with cat.
> > >
> > > Any idea where this is coming from?
> >
> > OK, it's the first splice() call in reader_thread() that causes the
> > hang. Hmm.
>
> To recap, with a CONFIG_KMEMTRACE enabled kernel from the
> "topic/kmemtrace" branch of:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git topic/kmemtrace
>
> running the "kmemtraced" program from
>
> git://git.kernel.org/pub/scm/linux/kernel/git/penberg/kmemtrace-user.git
>
> results to a hard lock on my machine. I am unable to find anything
> obviously wrong with it and as I can read/write the relay files just
> fine, I'm beginning to think it's problem in relayfs splice
> implementation.
>
> Tom, thoughts?
>

It looks like you hit the same problem as described here:

commit 8191ecd1d14c6914c660dfa007154860a7908857

splice: fix infinite loop in generic_file_splice_read()

relay uses the same loop but it never got noticed or fixed. Can you try
the following patch:

diff --git a/kernel/relay.c b/kernel/relay.c
index 8d13a78..6a4d439 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -1318,12 +1318,9 @@ static ssize_t relay_file_splice_read(struct file *in,
if (ret < 0)
break;
else if (!ret) {
- if (spliced)
- break;
- if (flags & SPLICE_F_NONBLOCK) {
+ if (flags & SPLICE_F_NONBLOCK)
ret = -EAGAIN;
- break;
- }
+ break;
}

*ppos += ret;


It worked for me, but I also had to apply the following patch to
kmemtraced:

diff --git a/kmemtraced.c b/kmemtraced.c
index 217478d..324ced9 100644
--- a/kmemtraced.c
+++ b/kmemtraced.c
@@ -109,6 +109,8 @@ static void *reader_thread(void *data)
if (retval < 0)
panic("splice() (from) failed: %s\n",
strerror(errno));
+ if (!retval)
+ continue;
retval = splice(pipe_fd[0], NULL, log_fd, NULL,
128, SPLICE_F_MOVE);
if (retval < 0)


Otherwise it would end up hanging kmemtraced in the second splice (pipe
to log_fd) if the return from the first splice was 0 (i.e. there's no
data available (and we can never know if there will ever be any
more)).

I'm not sure why kmemtraced is only splicing 128 bytes at a time - it
seems to defeat the purpose - or why it wouldn't be using poll to know
when there's at least a whole sub-buffer to splice, but to each his own.
Hopefully the kernel patch at least fixes the loop.

Tom


Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

On Fri, Oct 10, 2008 at 11:58:51PM -0500, Tom Zanussi wrote:
> It worked for me, but I also had to apply the following patch to
> kmemtraced:
>
> diff --git a/kmemtraced.c b/kmemtraced.c
> index 217478d..324ced9 100644
> --- a/kmemtraced.c
> +++ b/kmemtraced.c
> @@ -109,6 +109,8 @@ static void *reader_thread(void *data)
> if (retval < 0)
> panic("splice() (from) failed: %s\n",
> strerror(errno));
> + if (!retval)
> + continue;
> retval = splice(pipe_fd[0], NULL, log_fd, NULL,
> 128, SPLICE_F_MOVE);
> if (retval < 0)
>
> Otherwise it would end up hanging kmemtraced in the second splice (pipe
> to log_fd) if the return from the first splice was 0 (i.e. there's no
> data available (and we can never know if there will ever be any
> more)).

Thanks, I'll apply it.

> I'm not sure why kmemtraced is only splicing 128 bytes at a time - it
> seems to defeat the purpose - or why it wouldn't be using poll to know
> when there's at least a whole sub-buffer to splice, but to each his own.
> Hopefully the kernel patch at least fixes the loop.

Yeah, it was a misguided attempt to fix the strange behavior.

> Tom


Cheers,
Eduard

2008-10-13 06:57:54

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

Hi Tom,

On Fri, 2008-10-10 at 23:58 -0500, Tom Zanussi wrote:
> It looks like you hit the same problem as described here:
>
> commit 8191ecd1d14c6914c660dfa007154860a7908857
>
> splice: fix infinite loop in generic_file_splice_read()
>
> relay uses the same loop but it never got noticed or fixed. Can you try
> the following patch:
>
> diff --git a/kernel/relay.c b/kernel/relay.c
> index 8d13a78..6a4d439 100644
> --- a/kernel/relay.c
> +++ b/kernel/relay.c
> @@ -1318,12 +1318,9 @@ static ssize_t relay_file_splice_read(struct file *in,
> if (ret < 0)
> break;
> else if (!ret) {
> - if (spliced)
> - break;
> - if (flags & SPLICE_F_NONBLOCK) {
> + if (flags & SPLICE_F_NONBLOCK)
> ret = -EAGAIN;
> - break;
> - }
> + break;
> }
>
> *ppos += ret;
>

Indeed. That fixes the deadlock.

However, now I don't get anything to the cpu*.out files if I run
kmemtraced with kmemtrace disabled. If I enable kmemtrace manually and
then run kmemtraced, I do receive some data. I did apply the
kmemtrace-user patch as well.

Hmm?

Pekka

2008-10-14 04:04:18

by Tom Zanussi

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

Hi Pekka,

On Mon, 2008-10-13 at 09:57 +0300, Pekka Enberg wrote:
> Hi Tom,
>
> On Fri, 2008-10-10 at 23:58 -0500, Tom Zanussi wrote:
> > It looks like you hit the same problem as described here:
> >
> > commit 8191ecd1d14c6914c660dfa007154860a7908857
> >
> > splice: fix infinite loop in generic_file_splice_read()
> >
> > relay uses the same loop but it never got noticed or fixed. Can you try
> > the following patch:
> >
> > diff --git a/kernel/relay.c b/kernel/relay.c
> > index 8d13a78..6a4d439 100644
> > --- a/kernel/relay.c
> > +++ b/kernel/relay.c
> > @@ -1318,12 +1318,9 @@ static ssize_t relay_file_splice_read(struct file *in,
> > if (ret < 0)
> > break;
> > else if (!ret) {
> > - if (spliced)
> > - break;
> > - if (flags & SPLICE_F_NONBLOCK) {
> > + if (flags & SPLICE_F_NONBLOCK)
> > ret = -EAGAIN;
> > - break;
> > - }
> > + break;
> > }
> >
> > *ppos += ret;
> >
>
> Indeed. That fixes the deadlock.
>
> However, now I don't get anything to the cpu*.out files if I run
> kmemtraced with kmemtrace disabled. If I enable kmemtrace manually and
> then run kmemtraced, I do receive some data. I did apply the
> kmemtrace-user patch as well.
>
> Hmm?

To me, that sounds like how it should work - if kmemtrace is disabled,
it shouldn't be logging anything, and that's in fact what I saw when
debugging this - it started out disabled and therefore nothing being
logged to relay (printks confirmed that). When I wrote 1 to the enabled
file, data started getting logged to relay and to the *.out files.

So I don't know why the enabled state behaves the way it does, or if
it's unexpected, but that anyway doesn't seem like a relay problem to
me.

Tom

>
> Pekka
>

2008-10-14 05:17:44

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

Hi Tom,

Tom Zanussi wrote:
> To me, that sounds like how it should work - if kmemtrace is disabled,
> it shouldn't be logging anything, and that's in fact what I saw when
> debugging this - it started out disabled and therefore nothing being
> logged to relay (printks confirmed that). When I wrote 1 to the enabled
> file, data started getting logged to relay and to the *.out files.
>
> So I don't know why the enabled state behaves the way it does, or if
> it's unexpected, but that anyway doesn't seem like a relay problem to
> me.

Oh, right. Looking at kmemtraced.c, we never enable kmemtrace, just
disable it (which doesn't make much sense). Bug in README or the code.
Eduard?

Btw, Tom, you can add my

Tested-by: Pekka Enberg <[email protected]>

to your patch if you want. Are you going to send it to Andrew or do you
want me to pick it up with rest of the kmemtrace patches?

Pekka

2008-10-14 05:47:18

by Tom Zanussi

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

Hi Pekka,

On Tue, 2008-10-14 at 08:13 +0300, Pekka Enberg wrote:
> Hi Tom,
>
> Tom Zanussi wrote:
> > To me, that sounds like how it should work - if kmemtrace is disabled,
> > it shouldn't be logging anything, and that's in fact what I saw when
> > debugging this - it started out disabled and therefore nothing being
> > logged to relay (printks confirmed that). When I wrote 1 to the enabled
> > file, data started getting logged to relay and to the *.out files.
> >
> > So I don't know why the enabled state behaves the way it does, or if
> > it's unexpected, but that anyway doesn't seem like a relay problem to
> > me.
>
> Oh, right. Looking at kmemtraced.c, we never enable kmemtrace, just
> disable it (which doesn't make much sense). Bug in README or the code.
> Eduard?
>
> Btw, Tom, you can add my
>
> Tested-by: Pekka Enberg <[email protected]>
>
> to your patch if you want. Are you going to send it to Andrew or do you
> want me to pick it up with rest of the kmemtrace patches?

If you pick it up that would be fine with me. Here's my sob:

Signed-off-by: Tom Zanussi <[email protected]>

Thanks,

Tom

>
> Pekka

2008-10-14 06:58:18

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

On Tue, 2008-10-14 at 00:46 -0500, Tom Zanussi wrote:
> > Tested-by: Pekka Enberg <[email protected]>
> >
> > to your patch if you want. Are you going to send it to Andrew or do you
> > want me to pick it up with rest of the kmemtrace patches?
>
> If you pick it up that would be fine with me. Here's my sob:
>
> Signed-off-by: Tom Zanussi <[email protected]>

Applied, thanks!

http://git.kernel.org/?p=linux/kernel/git/penberg/slab-2.6.git;a=commitdiff;h=51b19be3535c8fbcce6b6f838d89b9a6a4cc5b92

2008-10-14 07:05:35

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

On Tue, 2008-10-14 at 08:13 +0300, Pekka Enberg wrote:
> Hi Tom,
>
> Tom Zanussi wrote:
> > To me, that sounds like how it should work - if kmemtrace is disabled,
> > it shouldn't be logging anything, and that's in fact what I saw when
> > debugging this - it started out disabled and therefore nothing being
> > logged to relay (printks confirmed that). When I wrote 1 to the enabled
> > file, data started getting logged to relay and to the *.out files.
> >
> > So I don't know why the enabled state behaves the way it does, or if
> > it's unexpected, but that anyway doesn't seem like a relay problem to
> > me.
>
> Oh, right. Looking at kmemtraced.c, we never enable kmemtrace, just
> disable it (which doesn't make much sense). Bug in README or the code.

Eduard, here's a fix for the userspace counterpart:

>From aa95e997458fd4d2a3b9a5c75e1452e22a3159af Mon Sep 17 00:00:00 2001
From: Pekka Enberg <[email protected]>
Date: Tue, 14 Oct 2008 10:03:51 +0300
Subject: [PATCH] kmemtrace: enable tracing in kmemtraced

If kmemtrace is not enabled at boot, we need to enable it before we start
relaying data to userspace.

Signed-off-by: Pekka Enberg <[email protected]>
---
kmemtraced.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kmemtraced.c b/kmemtraced.c
index 217478d..5cc1ada 100644
--- a/kmemtraced.c
+++ b/kmemtraced.c
@@ -165,8 +165,6 @@ int main(int argc, char *argv[])
if (!readers)
panic("Out of memory!\n");

- write_str("/sys/kernel/debug/kmemtrace/enabled", "0");
-
printf("Copying /proc/kallsyms...\n");
copy_kallsyms();

@@ -180,6 +178,8 @@ int main(int argc, char *argv[])
strerror(errno));
}

+ write_str("/sys/kernel/debug/kmemtrace/enabled", "1");
+
printf("Logging... Press Control-C to stop.\n");

while (sigwait(&signals, &signal) == 0) {
--
1.5.4.3


Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

On Tue, Oct 14, 2008 at 12:46:47AM -0500, Tom Zanussi wrote:
> Hi Pekka,
>
> On Tue, 2008-10-14 at 08:13 +0300, Pekka Enberg wrote:
> > Hi Tom,
> >
> > Tom Zanussi wrote:
> > > To me, that sounds like how it should work - if kmemtrace is disabled,
> > > it shouldn't be logging anything, and that's in fact what I saw when
> > > debugging this - it started out disabled and therefore nothing being
> > > logged to relay (printks confirmed that). When I wrote 1 to the enabled
> > > file, data started getting logged to relay and to the *.out files.
> > >
> > > So I don't know why the enabled state behaves the way it does, or if
> > > it's unexpected, but that anyway doesn't seem like a relay problem to
> > > me.
> >
> > Oh, right. Looking at kmemtraced.c, we never enable kmemtrace, just
> > disable it (which doesn't make much sense). Bug in README or the code.
> > Eduard?
> >
> > Btw, Tom, you can add my
> >
> > Tested-by: Pekka Enberg <[email protected]>
> >
> > to your patch if you want. Are you going to send it to Andrew or do you
> > want me to pick it up with rest of the kmemtrace patches?
>
> If you pick it up that would be fine with me. Here's my sob:
>
> Signed-off-by: Tom Zanussi <[email protected]>
>
> Thanks,
>
> Tom
>

Great, this makes it go swifter. I'll commit the change to
kmemtrace-user.

2008-10-24 04:50:05

by Peter Teoh

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

after doing a

git clone git://repo.or.cz/linux-2.6/kmemtrace.git

I reboot the OS and encountered several application crashes.....the
logs are as per attached....please comment.

for bug0 and bug1, the system go into a state of complete
non-responsive, even connecting via SSH into the system becomes
impossible, and bug0 and bug1 was generated just before it goes into
this state.

for bug2, the mouse seemed to respond, i managed to output the dmesg
(which is bug2 itself).

any comments?

On Fri, Oct 10, 2008 at 5:42 PM, Pekka Enberg <[email protected]> wrote:
> (I'm cc'ing Tom, Jens, and LKML.)
>
> On Fri, 2008-10-10 at 12:10 +0300, Pekka Enberg wrote:
>> > I'm seeing a hard lock on my machine when I run kmemtraced with the
>> > following patch applied:
>> >
>> > http://git.kernel.org/?p=linux/kernel/git/penberg/slab-2.6.git;a=commitdiff;h=17ca1d5506b1db433f0b7167a627bfd55d319dd3
>> >
>> > I can enable/disable kmemtrace via the debugfs files fine and can also
>> > read the relay files with cat.
>> >
>> > Any idea where this is coming from?
>>
>> OK, it's the first splice() call in reader_thread() that causes the
>> hang. Hmm.
>
> To recap, with a CONFIG_KMEMTRACE enabled kernel from the
> "topic/kmemtrace" branch of:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git topic/kmemtrace
>
> running the "kmemtraced" program from
>
> git://git.kernel.org/pub/scm/linux/kernel/git/penberg/kmemtrace-user.git
>
> results to a hard lock on my machine. I am unable to find anything
> obviously wrong with it and as I can read/write the relay files just
> fine, I'm beginning to think it's problem in relayfs splice
> implementation.
>
> Tom, thoughts?
>
> Pekka
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>



--
Regards,
Peter Teoh


Attachments:
(No filename) (1.96 kB)
bug0 (34.45 kB)
bug1 (121.89 kB)
bug2 (90.31 kB)
Download all attachments

2008-10-24 14:17:19

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

On Fri, 2008-10-24 at 12:44 +0800, Peter Teoh wrote:
> after doing a
>
> git clone git://repo.or.cz/linux-2.6/kmemtrace.git
>
> I reboot the OS and encountered several application crashes.....the
> logs are as per attached....please comment.
>
> for bug0 and bug1, the system go into a state of complete
> non-responsive, even connecting via SSH into the system becomes
> impossible, and bug0 and bug1 was generated just before it goes into
> this state.
>
> for bug2, the mouse seemed to respond, i managed to output the dmesg
> (which is bug2 itself).
>
> any comments?

Hmm, you're hitting out-of-memory. I can't immediately see how kmemtrace
is at fault here. Eduard, thoughts?

2008-10-25 00:56:19

by Peter Teoh

[permalink] [raw]
Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

ok. just ignore it. i think based on "git log" of"git://repo.or.cz/linux-2.6/kmemtrace.git":
commit 0441e5ff6ab71cf7a3e9ec3116f14d0fd7d20d51Author: Eduard - Gabriel Munteanu <[email protected]>Date: Thu Jul 10 20:20:05 2008 +0300
kmemtrace: SLOB hooks.
the repo seemed way out of date. But a clone of Pekka's slab-2.6'stopic/kmemtrace branch have no problem. The steps are as follows:
1. git clone git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git2. git checkout -m origin/topic/kmemtrace3. git checkout -b kmemtrace
git log gives:
commit 51b19be3535c8fbcce6b6f838d89b9a6a4cc5b92Author: Tom Zanussi <[email protected]>Date: Fri Oct 10 23:58:51 2008 -0500
relayfs: fix infinite loop with splice()
But the userspace tool I still get it from: git://repo.or.cz/kmemtrace-user.git
Correct?
Now I have some problems:
a. I would like to extract out all the commit as diff - may I knowhow to do that? ("git log" only gives the descriptive part).b. boot-time memory profiling....how can it be done (or extractedout)? (kmemtrace-user does not have that?)c. please provide some pointers to documentation: how do Iinterpret the following:
Allocation #83468 (CPU0) already exists! by __kmalloc_track_caller+25 last touched by __kmalloc_track_callerAllocation #83740 (CPU0) already exists! by __kmalloc_track_caller+25 last touched by __kmalloc_track_caller
For my purpose, I would like to trace how and where is memoryallocated - just one one single userspace program which access themmap("/dev/mem") to read the memory. How can it be done? "Tracingmemory" in my context will mean list all the kernel-function +memory-allocated-within-the-function + its-PFN-or-PTE-information (topinpoint the exact physical page). My understanding is that thesecan be done, via checking the value of "current", as each userspaceprogram started will have a unique "current" context value, andtherefore within any kernel function, just check this value, to focustracing on just one particular userspace program. Correct?
d. The following:
/sys/kernel/debug/kmemtrace>ls -al?ܼ? 0drwxr-xr-x 2 root root 0 2008-10-25 .drwxr-xr-x 9 root root 0 2008-10-25 ..-r-------- 1 root root 0 2008-10-25 abi_version-r-------- 1 root root 2946936192 2008-10-25 cpu0-r-------- 1 root root 1701263952 2008-10-25 cpu1-rw------- 1 root root 0 2008-10-25 enabled-r-------- 1 root root 0 2008-10-25 total_overruns
So the information is stored in memory, right? Is it possible toreset it? I don't want these information to clog up the memory.
e. The balance between cpu0 and cpu1 does not seemed equal, anyimplications on the scheduler/ IO Scheduler? or memory allocationscheduler (if there exists one :-))....sorry about that....myknowledge is limited.
On Fri, Oct 24, 2008 at 10:15 PM, Pekka Enberg <[email protected]> wrote:> On Fri, 2008-10-24 at 12:44 +0800, Peter Teoh wrote:>> after doing a>>>> git clone git://repo.or.cz/linux-2.6/kmemtrace.git>>>> I reboot the OS and encountered several application crashes.....the>> logs are as per attached....please comment.>>>> for bug0 and bug1, the system go into a state of complete>> non-responsive, even connecting via SSH into the system becomes>> impossible, and bug0 and bug1 was generated just before it goes into>> this state.>>>> for bug2, the mouse seemed to respond, i managed to output the dmesg>> (which is bug2 itself).>>>> any comments?>> Hmm, you're hitting out-of-memory. I can't immediately see how kmemtrace> is at fault here. Eduard, thoughts?>>


-- Regards,Peter Teoh????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

Subject: Re: [PROBLEM] hard-lock with kmemtrace, relayfs, and splice

On Sat, Oct 25, 2008 at 08:56:07AM +0800, Peter Teoh wrote:
> ok. just ignore it. i think based on "git log" of
> "git://repo.or.cz/linux-2.6/kmemtrace.git":
>
> commit 0441e5ff6ab71cf7a3e9ec3116f14d0fd7d20d51
> Author: Eduard - Gabriel Munteanu <[email protected]>
> Date: Thu Jul 10 20:20:05 2008 +0300
>
> kmemtrace: SLOB hooks.
>
> the repo seemed way out of date. But a clone of Pekka's slab-2.6's
> topic/kmemtrace branch have no problem. The steps are as follows:
>
> 1. git clone git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6.git
> 2. git checkout -m origin/topic/kmemtrace
> 3. git checkout -b kmemtrace

Hi,

Yes, that repo may be outdated, please use Pekka's for the latest
working stuff.

> git log gives:
>
> commit 51b19be3535c8fbcce6b6f838d89b9a6a4cc5b92
> Author: Tom Zanussi <[email protected]>
> Date: Fri Oct 10 23:58:51 2008 -0500
>
> relayfs: fix infinite loop with splice()
>
> But the userspace tool I still get it from: git://repo.or.cz/kmemtrace-user.git
>
> Correct?

Yes, I maintain this one and it's up-to-date.

> Now I have some problems:
>
> a. I would like to extract out all the commit as diff - may I know
> how to do that? ("git log" only gives the descriptive part).

You can use "git-log -p" to get diffs along with commit history.

> b. boot-time memory profiling....how can it be done (or extracted
> out)? (kmemtrace-user does not have that?)

If you enable kmemtrace at boot-time using the "kmemtrace.enabled=yes"
parameter, kmemtrace will start logging just after kmem_cache_init()
finishes. There are plans for making tracing possible even earlier,
however it involves quite a big rework of how relay allocates its
buffers.

> c. please provide some pointers to documentation: how do I
> interpret the following:
>
> Allocation #83468 (CPU0) already exists!
> by __kmalloc_track_caller+25
> last touched by __kmalloc_track_caller
> Allocation #83740 (CPU0) already exists!
> by __kmalloc_track_caller+25
> last touched by __kmalloc_track_caller

The userspace app writes out those messages when it looks like a memory
region has been allocated twice. In reality, it's most likely a free
hasn't been traced (because a misc free function isn't traced).

> For my purpose, I would like to trace how and where is memory
> allocated - just one one single userspace program which access the
> mmap("/dev/mem") to read the memory. How can it be done? "Tracing
> memory" in my context will mean list all the kernel-function +
> memory-allocated-within-the-function + its-PFN-or-PTE-information (to
> pinpoint the exact physical page). My understanding is that these
> can be done, via checking the value of "current", as each userspace
> program started will have a unique "current" context value, and
> therefore within any kernel function, just check this value, to focus
> tracing on just one particular userspace program. Correct?

I have made provisions for exporting the whole stack trace, however it
is not yet implemented.

> d. The following:
>
> /sys/kernel/debug/kmemtrace>ls -al
> ???? 0
> drwxr-xr-x 2 root root 0 2008-10-25 .
> drwxr-xr-x 9 root root 0 2008-10-25 ..
> -r-------- 1 root root 0 2008-10-25 abi_version
> -r-------- 1 root root 2946936192 2008-10-25 cpu0
> -r-------- 1 root root 1701263952 2008-10-25 cpu1
> -rw------- 1 root root 0 2008-10-25 enabled
> -r-------- 1 root root 0 2008-10-25 total_overruns
>
> So the information is stored in memory, right? Is it possible to
> reset it? I don't want these information to clog up the memory.

Relay uses fixed size buffers, so there is no point in clearing them.

> e. The balance between cpu0 and cpu1 does not seemed equal, any
> implications on the scheduler/ IO Scheduler? or memory allocation
> scheduler (if there exists one :-))....sorry about that....my
> knowledge is limited.

Much of the booting sequence is done on a single CPU, then the kernel
switches to SMP. It is natural to see one CPU log more allocations.

> --
> Regards,
> Peter Teoh

Thanks for testing.


Cheers,
Eduard