Message-ID: <54EDAFE6.3000503@rosalab.ru>
Date: Wed, 25 Feb 2015 14:20:06 +0300
From: Eugene Shatokhin
Organization: ROSA
To: LKML
Subject: Re: Kprobes: pre-handler with interrupts enabled - is it possible?
References: <54ED88BC.8080705@rosalab.ru>
In-Reply-To: <54ED88BC.8080705@rosalab.ru>

> (2015/02/24 15:04), Eugene Shatokhin wrote:
>> 24.02.2015 06:47, Masami Hiramatsu wrote:
>>> No, that is not allowed. I mean, you can do anything you want to do
>>> on your handler (enabling preemption/irq etc.) but the result may be
>>> not safe (it can crash your kernel, but it's not a kprobes' bug).
>>
>> Yes, that is why I am asking.
>>
>>> Actually, enable interrupts on kprobe handlers can cause reentering
>>> kprobes (by kprobes on interrupt handlers), and currently kprobe skips
>>> all those reentered kprobes.
>>> Is it acceptable that some of your kprobe handlers are not fired when
>>> hitting?
>>
>> I think, yes. When a software breakpoint hits, my system decodes the
>> instruction, finds the address that is about to be accessed and tries to
>> place a hardware breakpoint on that memory area.
>>
>> There are only 4 hardware breakpoints a CPU can use on x86, so if the
>> software breakpoint hits too often, the system will not be able to
>> process all hits anyway because all HW breakpoints may be already in use.
>>
>>> Would you mean sleep on your handler??
>>
>> No, I use mdelay(). It is, in essence, a busy-wait loop as far as I
>> know. The delay intervals may vary, the default is 5 jiffies.
>
> Hmm, here I couldn't understand. If mdelay() does busy-wait loop, why
> would you like to enable irq??
> Other code doesn't work on the core while waiting.

I'd rather not enable IRQs explicitly. What I want is to execute my
handler with the same (or similar) restrictions as the original
instruction would have. If the insn executed with IRQs enabled, so would
the handler, etc. So I am looking for a way to avoid *additionally*
disabling IRQs (and, perhaps, preemption, although this might be harder).

The breakpoints and delays already incur a penalty on the system's
responsiveness. However, if, say, I probe an insn executing in process
context with IRQs enabled, interrupts may be serviced on this CPU during
the delay. If, additionally, preemption is not disabled and the kernel is
built with CONFIG_PREEMPT=y then, I guess, mdelay() can be preempted,
allowing some other task to run, which is good for overall responsiveness.

Usually, the longer the delays, the more likely the races are to be
detected, but the performance overhead increases too. I do not have exact
numbers yet, but still. So, while 5-10 jiffies are often enough, sometimes
it could be beneficial to wait longer.

For example, when I used the system to confirm a race between the .probe()
and .ndo_open() callbacks in the e1000 driver a year ago, I used a delay of
about one second or more (for NetworkManager to start working with the
device), which would be too much with IRQs disabled, I think. Both .probe()
and .ndo_open() executed in process context, by the way.

Well, I was actually thinking about something like the following (for x86,
at least).

If a Kprobe's pre_handler returns non-zero, single-step will not be
performed, right? As far as I can see in the code, Jprobes rely on that.
Preemption will still be disabled and the Jprobe's handler enables it when
ready.

What if I place a Kprobe on an insn of interest and the pre_handler changes
regs->ip to the address of my function, say, "my_thunk_pre" (see below),
then returns non-zero? Handling of int3 then completes, the context is
restored, and the interrupts are re-enabled (if they were enabled before
int3). Preemption remains off because the Kprobe's implementation disabled
it.

Execution resumes in "my_thunk_pre", which is written in assembly and may
look like this on x86_64 (x86_32 is similar):

----------------------
my_thunk_pre:
	// NB: a complete thunk would also have to save and restore the
	// other caller-saved registers and the flags around the call.
	push %rax
	call my_handler	// my_handler() is a C function, with the default
			// calling convention/linkage.
			// Returns the address of the copied insn in the
			// Kprobe's insn slot in %rax.

	// restore the orig value of %rax and push the address
	// to jump to on the stack
	xchg %rax, (%rsp)

	// Jump to the copied insn (and fix %rsp at the same time):
	ret
----------------------

In this case, my_handler() seems to execute in the same context as the
original insn, except for disabled preemption. It may use kprobe_running()
to get the Kprobe and, perhaps, some structure of mine that contains that
Kprobe. Then, I guess, it might call preempt_enable_no_resched() like a
Jprobe's handler does (maybe some other actions are needed?).

After that, my_handler() can do the rest of its job: arm the HW
breakpoints, call mdelay(), etc.

my_handler() will return the address of the copied insn in the Kprobe's
insn slot. Control will be passed there by my_thunk_pre().

For this to work, the copied insn stored in the Kprobe's insn slot needs to
be followed by a jump back to the original code, to the next insn, I mean.
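
To make the "jump back" requirement concrete, here is a minimal user-space
sketch of how a slot could be filled: the copied insn, then a near relative
jmp (opcode 0xE9 plus a 32-bit displacement, 5 bytes in total) to the insn
that follows the probed one. It is an illustration only, not the Kprobes
implementation. The names fill_slot_with_jump_back(), SLOT_SIZE and
JMP_REL32_SIZE are made up for this example, the 16-byte slot size is an
assumption, and insns with RIP-relative operands (which would need their
displacement adjusted after the copy) are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE      16 /* assumed slot size (hypothetical name) */
#define JMP_REL32_SIZE 5  /* 0xE9 opcode + 32-bit displacement */

/*
 * Hypothetical helper: copy an insn of insn_len bytes into the slot and
 * append "jmp rel32" to the insn that follows the original one.
 *   slot_addr - address the slot will be executed from
 *   orig_addr - address of the probed insn in the original code
 * Returns 0 on success, -1 if the jump does not fit in the slot or the
 * target is out of rel32 range.
 */
static int fill_slot_with_jump_back(uint8_t slot[SLOT_SIZE],
                                    uint64_t slot_addr,
                                    const uint8_t *insn, size_t insn_len,
                                    uint64_t orig_addr)
{
	assert(slot != NULL && insn != NULL);

	/* Same strict size condition as quoted in this thread. */
	if (insn_len == 0 || insn_len + JMP_REL32_SIZE >= SLOT_SIZE)
		return -1;

	/* rel32 is counted from the end of the jmp instruction itself. */
	uint64_t target  = orig_addr + insn_len;
	uint64_t jmp_end = slot_addr + insn_len + JMP_REL32_SIZE;
	int64_t  disp    = (int64_t)(target - jmp_end);

	if (disp < INT32_MIN || disp > INT32_MAX)
		return -1;

	memcpy(slot, insn, insn_len);
	slot[insn_len] = 0xE9;

	/* Store the displacement in little-endian byte order. */
	uint32_t rel = (uint32_t)(int32_t)disp;
	for (int i = 0; i < 4; i++)
		slot[insn_len + 1 + i] = (uint8_t)(rel >> (8 * i));

	return 0;
}
```

For a 3-byte insn at 0x1000 copied to a slot at 0x2000, the jump target is
0x1003 and the jmp ends at 0x2008, so the stored displacement is -0x1005.
In the kernel the slot lives in executable memory managed by Kprobes, so
the sketch only shows the arithmetic and the byte layout.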
Of course, such a jump is not necessary for some control-transfer insns.
But my system mostly works with insns that access data rather than with
these.

Looks like Kprobes already do something similar and place such jumps in the
insn slots (Kprobes with ainsn.boostable == 1) if there is enough space
there. That is, if the size of the copied insn + 5 (size of jmp near
relative) < 16 (MAX_INSN_SIZE). However, this seems to be done after
single-step, which will not happen in my case.

Still, I could place the jumps after the insns in the slots earlier, e.g.,
before I arm the Kprobes. Perhaps, it will not interfere with other
functions of Kprobes.

So, if all this worked, I suppose, my system would get everything it needs:
my_handler() will do the delays in the same context and with the same
restrictions as the original insn executes.

Or perhaps, I am missing something critical here? Could this scheme break
Kprobes somehow, what do you think?

If there are no visible problems, I think, I will give it a try.

So, what is your opinion?

By the way, thanks for your time, this letter of mine became unusually
long.

Regards,
Eugene

--
Eugene Shatokhin, ROSA
www.rosalab.com