2013-05-21 03:57:48

by zhangwei(Jovi)

Subject: [ANNOUNCE] ktap 0.1 released


Dear all,

I'm pleased to announce the release of ktap v0.1. This is the first official
release of the ktap project. It is not expected to be fully functional or
very stable, and we welcome bug reports and fixes for any issues.

= what's ktap?

A New Scripting Dynamic Tracing Tool For Linux

KTAP is a new scripting dynamic tracing tool for Linux. It uses a
scripting language and lets users trace the Linux kernel dynamically.
KTAP is designed to give operational insights with interoperability
that allows users to tune, troubleshoot and extend the kernel and applications.

KTAP has different design principles from the mainstream Linux dynamic tracing
languages in that it is based on bytecode, so it doesn't depend on GCC and
doesn't require compiling a kernel module for each script. It is intended to be
safe to use in production environments and to fulfill the embedded ecosystem's
tracing needs.

KTAP is also designed to enable great interoperability with the Linux kernel:
it gives users the power to modify and extend the system, and lets them
explore the system in an easy way.

KTAP is released under the GPL license.

More information can be found in the ktap/doc directory.

= Features

Because this is the first release, it doesn't include many features, just
several basic tracing features. Here is the summary:

1) support for x86-32 and x86-64 (other architectures are not tested yet)
2) support for tracepoints, syscalls, kprobes, kretprobes
3) timer
4) dumpstack
5) many built-in functions and library functions

There are many features on the todo list, so future releases will support
more features and be more stable than this one.

= Planned Changes

We are planning to enable more kernel interoperability in ktap, implement more
sample scripts, and improve performance.

= Code

Please download code from:
https://github.com/ktap/ktap.git

= Building & Running

[root@jovi]# cd linux/kernel/trace/
[root@jovi]# git clone https://github.com/ktap/ktap.git

[root@jovi]# cd ktap
[root@jovi]# make          # generate ktapvm kernel module
[root@jovi]# make ktap     # generate userspace ktap tool

[root@jovi]# insmod ./ktapvm.ko
[root@jovi]# ./ktap scripts/syscalls.kp


= Simple syscall tracing example

function eventfun (e) {
	printf("%d %d\t%s\t%s", cpu(), pid(), execname(), e.tostring())
}

kdebug.probe("tp:syscalls", eventfun)

kdebug.probe_end(function () {
	printf("probe end\n")
})


= Examples/Documentation

Examples are in ktap/scripts/
Documentation is in ktap/doc/

= Mailing list

ktap@freelists.org
You can subscribe to the KTAP mailing list at: http://www.freelists.org/list/ktap

= Contribution

KTAP is still under active development, so contributions are welcome.
You are encouraged to report bugs, provide feedback, send feature requests, or hack on it.

.jovi


2013-05-21 18:13:15

by Frank Ch. Eigler

Subject: Re: [ANNOUNCE] ktap 0.1 released

"zhangwei(Jovi)" writes:

> I'm pleased to announce that ktap release v0.1, this is the first official
> release of ktap project [...]

Congrats.


> = what's ktap?
>
> Because this is the first release, so there wouldn't include too
> much features, just contain several basic features about tracing,
> [...]

Nice progress. Reviewing the safety/security items from
https://lkml.org/lkml/2013/1/17/623, I see improvement in most.

For example, you seem to be using GFP_ATOMIC for run-time memory
allocation, which is safer than before (though still could exhaust
resources). OTOH your code doesn't handle *failure* of such
allocation attempts (see call sites to kp_*alloc).
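
To make the allocation-failure point concrete, here is a minimal userspace
sketch, not ktap code. All names (sketch_alloc, sketch_strdup,
sketch_fail_after) are hypothetical stand-ins for the kp_*alloc helpers and
their callers, and malloc() stands in for a GFP_ATOMIC kernel allocation. The
idea is only that every call site checks the result and propagates an error
instead of dereferencing NULL; the fault-injection knob shows one way a stress
test could exercise that path.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical fault-injection knob for tests:
 *   -1      never fail
 *   n >= 0  every allocation fails once n more have succeeded
 */
int sketch_fail_after = -1;

/* Hypothetical stand-in for a kp_*alloc helper; may return NULL. */
void *sketch_alloc(size_t size)
{
	if (sketch_fail_after == 0)
		return NULL;
	if (sketch_fail_after > 0)
		sketch_fail_after--;
	return malloc(size);
}

/*
 * Duplicate a string. On allocation failure, return -ENOMEM and leave
 * *out untouched so the caller can abort the script cleanly.
 */
int sketch_strdup(const char *src, char **out)
{
	size_t len = strlen(src) + 1;
	char *buf = sketch_alloc(len);

	if (!buf)
		return -ENOMEM;

	memcpy(buf, src, len);
	*out = buf;
	return 0;
}
```

A caller would treat any negative return value as a reason to stop running the
script rather than continue with a missing object.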

There still doesn't seem to be safety constraints on the incoming
byte code (like jump ranges, or loop counts).
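
As an illustration of a jump-range constraint, the following is a minimal
sketch of a load-time check, assuming a hypothetical instruction layout
(struct sketch_insn) and hypothetical opcodes (SK_OP_*). ktap's real bytecode
format is different, so this shows only the shape of the check: reject unknown
opcodes, reject any jump whose target falls outside the program, and require
the program to end in a return. Bounding loop counts would additionally need
something like a run-time instruction budget, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; not ktap's real instruction set. */
enum sketch_op { SK_OP_NOP, SK_OP_JMP, SK_OP_RET };

/* Hypothetical instruction layout. */
struct sketch_insn {
	uint8_t op;
	int32_t offset;	/* jump displacement, relative to the next instruction */
};

/*
 * Return 0 if the program looks safe to run, -1 otherwise.
 * Checks: non-empty, ends in RET, only known opcodes, and every
 * jump target lies inside [0, n).
 */
int sketch_verify(const struct sketch_insn *code, size_t n)
{
	size_t i;

	if (n == 0 || code[n - 1].op != SK_OP_RET)
		return -1;

	for (i = 0; i < n; i++) {
		if (code[i].op > SK_OP_RET)
			return -1;

		if (code[i].op == SK_OP_JMP) {
			int64_t target = (int64_t)i + 1 + code[i].offset;

			if (target < 0 || target >= (int64_t)n)
				return -1;
		}
	}
	return 0;
}
```

Running such a pass once, before the interpreter sees the code, keeps the
per-instruction fast path free of range checks.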

It's nice to see some arithmetic OP_* checks, and the user_string
function is probably safe enough now. You'll need something analogous
for kernel space (and possibly as verification for the various %s
kp_printfs). The hash tables might be susceptible to the deliberate
hash collision attacks from last year.

It's nice to see the *_STACK_SIZE constraints in the bytecode
interpreter; is there any C-level recursion remaining to consume
excessive kernel stack?

Exposing os.sleep/os.wait (or general kernel functions) to become
callable from the scripts is fraught with danger. You just can't call
the underlying functions from random kernel context (imagine from the
most pessimal possible kprobe or tracepoint, somewhere within an
atomic section), and you'll get crashes.

You should write several time/space/invasivity stress-tests to help
see how future progress improves the code's performance/safety on
these and other problem areas.


> = Planned Changes
>
> we are planning to enable more kernel interoperability into ktap [...]

As per the above, you'll want to be extremely careful about the idea
to export FFI to let the lua scripts call into arbitrary kernel
functions. Perhaps wrap it into a 'guru' mode flag?
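
One way such a flag could be modelled is sketched below. Everything here is
hypothetical (the sketch_mode enum, the needs_guru field, sketch_may_call);
nothing in the announcement says how ktap would implement it. The sketch only
shows the gating idea: each library function is tagged as safe or unsafe, and
unsafe ones are refused unless the user explicitly opted in.

```c
/* Hypothetical run mode selected by the user when starting a script. */
enum sketch_mode { SK_MODE_SAFE, SK_MODE_GURU };

/* Hypothetical descriptor for a function exposed to scripts. */
struct sketch_func {
	const char *name;
	int needs_guru;	/* nonzero if the function can harm the system */
};

/* Return 1 if a script running in `mode` may call `f`, 0 otherwise. */
int sketch_may_call(enum sketch_mode mode, const struct sketch_func *f)
{
	return !f->needs_guru || mode == SK_MODE_GURU;
}
```

Checking at the point where a script resolves a function name, rather than on
every call, would keep the cost out of probe handlers.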


- FChE

2013-05-21 22:19:10

by Jonathan Corbet

Subject: Re: [ANNOUNCE] ktap 0.1 released

On Tue, 21 May 2013 11:56:14 +0800
"zhangwei(Jovi)" wrote:

> we welcome bug reports and fixes for the issues.

I'm messing with it...first impression:

unable create tracepoint event sys_enter_mmap on cpu 4, err: -19
unable create tracepoint event sys_enter_mmap on cpu 5, err: -19
unable create tracepoint event sys_enter_mmap on cpu 6, err: -19
unable create tracepoint event sys_enter_mmap on cpu 7, err: -19
[...]

Unsurprising, this is a four-core system. The code reads:

> for_each_possible_cpu(cpu)
> 	enable_tracepoint_on_cpu(cpu, &attr, call, arg, type);

Maybe that needs to be for_each_online_cpu() instead?
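
The difference can be modelled in userspace as below. This is a sketch, not
kernel code: CPU sets are plain bitmasks, and sketch_enable_on_cpu and
sketch_count_failures are hypothetical names. Error -19 is -ENODEV on Linux,
which fits an attempt to attach to a CPU that is possible but not online. A
real fix would presumably also have to consider CPUs coming and going through
hotplug.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Pretend to attach an event on one CPU. Fails with -ENODEV when the
 * CPU is not set in online_mask.
 */
int sketch_enable_on_cpu(uint32_t online_mask, unsigned int cpu)
{
	return ((online_mask >> cpu) & 1u) ? 0 : -ENODEV;
}

/* Try every CPU set in `mask` and count the failed attach attempts. */
int sketch_count_failures(uint32_t mask, uint32_t online_mask)
{
	int failures = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < 32; cpu++) {
		if (!((mask >> cpu) & 1u))
			continue;
		if (sketch_enable_on_cpu(online_mask, cpu) < 0)
			failures++;
	}
	return failures;
}
```

With a possible mask of 0xff and an online mask of 0x0f, walking the possible
mask fails on CPUs 4 to 7, while walking the online mask fails on none.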

jon

2013-05-22 03:30:27

by zhangwei(Jovi)

Subject: Re: [ANNOUNCE] ktap 0.1 released

On 2013/5/22 6:19, Jonathan Corbet wrote:
> On Tue, 21 May 2013 11:56:14 +0800
> "zhangwei(Jovi)" wrote:
>
>> we welcome bug reports and fixes for the issues.
>
> I'm messing with it...first impression:
>
> unable create tracepoint event sys_enter_mmap on cpu 4, err: -19
> unable create tracepoint event sys_enter_mmap on cpu 5, err: -19
> unable create tracepoint event sys_enter_mmap on cpu 6, err: -19
> unable create tracepoint event sys_enter_mmap on cpu 7, err: -19
> [...]
>
> Unsurprising, this is a four-core system. The code reads:
>
>> for_each_possible_cpu(cpu)
>> enable_tracepoint_on_cpu(cpu, &attr, call, arg, type);
>
> Maybe that needs to be for_each_online_cpu() instead?
>
Jon, thank you very much for this bug report. I fixed it as you suggested.

.jovi




2013-05-22 04:15:59

by Ming Lei

Subject: Re: [ANNOUNCE] ktap 0.1 released

On Tue, May 21, 2013 at 11:56 AM, zhangwei(Jovi) wrote:
> [...]

Nice job, I have already run it on ARM with only a one-line change.

But it looks like Control-C can't stop the tracing, or it needs some time to
complete; see below:

$sudo ./ktap scripts/syscalls_histogram.kp
.....
Press Control-C to stop.
^C
^C^C^C
^C
value                      ------------- Distribution ------------- count
sys_enter_rt_sigprocmask  |@@@@@@@@@@@@@                            70
sys_enter_select          |@@@@@@@@@                                49
sys_enter_read            |@@@@                                     25
sys_enter_write           |@@@@                                     22
sys_enter_clock_gettime   |@@@                                      19
sys_enter_ioctl           |@                                        6
sys_enter_gettimeofday    |                                         4
sys_enter_munmap          |                                         3
sys_enter_fstat64         |                                         3
sys_enter_open            |                                         3
sys_enter_close           |                                         3
sys_enter_rt_sigaction    |                                         1
sys_enter_nanosleep       |                                         1
sys_enter_stat64          |                                         1



Thanks,
--
Ming Lei

2013-05-22 04:19:30

by Ming Lei

Subject: Re: [ANNOUNCE] ktap 0.1 released

On Wed, May 22, 2013 at 12:15 PM, Ming Lei wrote:
> On Tue, May 21, 2013 at 11:56 AM, zhangwei(Jovi)
>
> Nice job, I have run it on ARM already with only one line change.
>
> But looks 'Control-C' can't stop the tracing or need some time to complete it,
> see below:

Sometimes, it doesn't work:

$ sudo ./ktap ./scripts/kretprobe.kp
......
vfs_read! (execname sshd); retval:38
^Cvfs_read! (execname sshd); retval:38

probe ending
vfs_read! (execname sshd); retval:48
vfs_read! (execname sshd); retval:2
^C
^C^C^C
^C^C^C^C^C^C^C

^C^C^C^C^C^C



Thanks,
--
Ming Lei

2013-05-22 04:34:08

by Ming Lei

Subject: Re: [ANNOUNCE] ktap 0.1 released

On Wed, May 22, 2013 at 12:19 PM, Ming Lei wrote:
> On Wed, May 22, 2013 at 12:15 PM, Ming Lei wrote:
>> On Tue, May 21, 2013 at 11:56 AM, zhangwei(Jovi)
>>
>> Nice job, I have run it on ARM already with only one line change.
>>
>> But looks 'Control-C' can't stop the tracing or need some time to complete it,
>> see below:
>
> Sometimes, it doesn't work:
>
> $ sudo ./ktap ./scripts/kretprobe.kp
> ......
> vfs_read! (execname sshd); retval:38
> ^Cvfs_read! (execname sshd); retval:38
>
> probe ending
> vfs_read! (execname sshd); retval:48
> vfs_read! (execname sshd); retval:2
> ^C
> ^C^C^C
> ^C^C^C^C^C^C^C
>
> ^C^C^C^C^C^C

This one can be fixed by the patch below, but the patch doesn't work for
the last one.

diff --git a/vm.c b/vm.c
index c5f5733..a24a389 100644
--- a/vm.c
+++ b/vm.c
@@ -1060,7 +1060,7 @@ ktap_State *kp_newthread(ktap_State *mainthread)
 
 void kp_user_complete(ktap_State *ks)
 {
-	if (!ks || !G(ks)->user_completion)
+	if (!ks || !G(ks) || !G(ks)->user_completion)
 		return;
 
 	complete(G(ks)->user_completion);



Thanks,
--
Ming Lei

2013-05-22 07:00:53

by zhangwei(Jovi)

Subject: Re: [ANNOUNCE] ktap 0.1 released

On 2013/5/22 12:34, Ming Lei wrote:
> On Wed, May 22, 2013 at 12:19 PM, Ming Lei wrote:
>> On Wed, May 22, 2013 at 12:15 PM, Ming Lei wrote:
>>> On Tue, May 21, 2013 at 11:56 AM, zhangwei(Jovi)
>>>
>>> Nice job, I have run it on ARM already with only one line change.
>>>
>>> But looks 'Control-C' can't stop the tracing or need some time to complete it,
>>> see below:
>>
>> Sometimes, it doesn't work:
>>
>> $ sudo ./ktap ./scripts/kretprobe.kp
>> ......
>> vfs_read! (execname sshd); retval:38
>> ^Cvfs_read! (execname sshd); retval:38
>>
>> probe ending
>> vfs_read! (execname sshd); retval:48
>> vfs_read! (execname sshd); retval:2
>> ^C
>> ^C^C^C
>> ^C^C^C^C^C^C^C
>>
>> ^C^C^C^C^C^C
>
> This one can be fixed by below patch, but can't work on
> the last one.
>
> diff --git a/vm.c b/vm.c
> index c5f5733..a24a389 100644
> --- a/vm.c
> +++ b/vm.c
> @@ -1060,7 +1060,7 @@ ktap_State *kp_newthread(ktap_State *mainthread)
>
> void kp_user_complete(ktap_State *ks)
> {
> - if (!ks || !G(ks)->user_completion)
> + if (!ks || !G(ks) || !G(ks)->user_completion)
> return;
>
> complete(G(ks)->user_completion);
>
>
>
> Thanks,
>
Hi Ming,

Thanks for testing and for your fix.

The ktap exit mechanism is not quite safe, as you saw, so I plan to rewrite the
logic to make it safer. Hopefully I can complete this work in the next few days.

.jovi






2013-05-24 03:20:05

by zhangwei(Jovi)

Subject: Re: [ANNOUNCE] ktap 0.1 released

On 2013/5/22 2:13, Frank Ch. Eigler wrote:
> "zhangwei(Jovi)" writes:
>
>> I'm pleased to announce that ktap release v0.1, this is the first official
>> release of ktap project [...]
>
> Congrats.
>
>
>> = what's ktap?
>>
>> Because this is the first release, so there wouldn't include too
>> much features, just contain several basic features about tracing,
>> [...]
>
> Nice progress. Reviewing the safety/security items from
> https://lkml.org/lkml/2013/1/17/623, I see improvement in most.
Thanks, Frank. You gave me a lot of helpful technical comments in that RFC mail,
and in this one too :) Thanks a lot.
>
> For example, you seem to be using GFP_ATOMIC for run-time memory
> allocation, which is safer than before (though still could exhaust
> resources). OTOH your code doesn't handle *failure* of such
> allocation attempts (see call sites to kp_*alloc).
Yes, memory allocation will be changed to be safer.
>
> There still doesn't seem to be safety constraints on the incoming
> byte code (like jump ranges, or loop counts).
>
> It's nice to see some arithmetic OP_* checks, and the user_string
> function is probably safe enough now. You'll need something analogous
> for kernel space (and possibly as verification for the various %s
> kp_printfs). The hash tables might be susceptible to the deliberate
> hash collision attacks from last year.
The current hash table implementation is efficient, but it needs more
attention to security, as you pointed out.
>
> It's nice to see the *_STACK_SIZE constraints in the bytecode
> interpreter; is there any C-level recursion remaining to consume
> excessive kernel stack?
Library C functions should not be a problem; as with other kernel functions,
the author has to take care to avoid stack overflow, at their own risk.
>
> Exposing os.sleep/os.wait (or general kernel functions) to become
> callable from the scripts is fraught with danger. You just can't call
> the underlying functions from random kernel context (imagine from the
> most pessimal possible kprobe or tracepoint, somewhere within an
> atomic section), and you'll get crashes.
Right, so those functions can only be called from the main thread.
I will add this check later.

>
> You should write several time/space/invasivity stress-tests to help
> see how future progress improves the code's performance/safety on
> these and other problem areas.
Yes, there is already a test/ directory for basic functionality testing.
Obviously it's not enough; I will add more benchmark and safety-checking test cases.
>
>
>> = Planned Changes
>>
>> we are planning to enable more kernel interoperability into ktap [...]
>
> As per the above, you'll want to be extremely careful about the idea
> to export FFI to let the lua scripts call into arbitrary kernel
> functions. Perhaps wrap it into a 'guru' mode flag?
Definitely, there needs to be a mode flag to separate safe from unsafe operation.
>
>
> - FChE
>
> .
>