Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932065AbWCTKqH (ORCPT ); Mon, 20 Mar 2006 05:46:07 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751114AbWCTKqH (ORCPT ); Mon, 20 Mar 2006 05:46:07 -0500 Received: from smtp.osdl.org ([65.172.181.4]:48352 "EHLO smtp.osdl.org") by vger.kernel.org with ESMTP id S1751093AbWCTKqG (ORCPT ); Mon, 20 Mar 2006 05:46:06 -0500 Date: Mon, 20 Mar 2006 02:42:48 -0800 From: Andrew Morton To: prasanna@in.ibm.com Cc: ak@suse.de, davem@davemloft.net, suparna@in.ibm.com, richardj_moore@uk.ibm.com, linux-kernel@vger.kernel.org Subject: Re: [1/3 PATCH] Kprobes: User space probes support- base interface Message-Id: <20060320024248.426b97ec.akpm@osdl.org> In-Reply-To: <20060320060931.GD31091@in.ibm.com> References: <20060320060745.GC31091@in.ibm.com> <20060320060931.GD31091@in.ibm.com> X-Mailer: Sylpheed version 1.0.4 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4956 Lines: 151 Prasanna S Panchamukhi wrote: > > This patch provides two interfaces to insert and remove > user space probes. Each probe is uniquely identified by > inode and offset within that executable/library file. > Insertion of a probe involves getting the code page for > a given offset, mapping it into the memory and then insert > the breakpoint at the given offset. Also the probe is added > to the uprobe_table hash list. A uprobe_module data strcture > is allocated for every probed application/library image on disk. > Removal of a probe involves getting the code page for a given > offset, mapping that page into the memory and then replacing > the breakpoint instruction with a the original opcode. > This patch also provides aggrigate probe handler feature, > where user can define multiple handlers per probe. > > +/** > + * Finds a uprobe at the specified user-space address in the current task. > + * Points current_uprobe at that uprobe and returns the corresponding kprobe. > + */ > +struct kprobe __kprobes *get_uprobe(void *addr) > +{ > + struct mm_struct *mm = current->mm; > + struct vm_area_struct *vma; > + struct inode *inode; > + unsigned long offset; > + struct kprobe *p, *kpr; > + struct uprobe *uprobe; > + > + vma = find_vma(mm, (unsigned long)addr); > + > + BUG_ON(!vma); /* this should not happen, not in our memory map */ > + > + offset = (unsigned long)addr - (vma->vm_start + > + (vma->vm_pgoff << PAGE_SHIFT)); > + if (!vma->vm_file) > + return NULL; All this appears to be happening without mmap_sem held? > +/** > + * Wait for the page to be unlocked if someone else had locked it, > + * then map the page and insert or remove the breakpoint. > + */ > +static int __kprobes map_uprobe_page(struct page *page, struct uprobe *uprobe, > + process_uprobe_func_t process_kprobe_user) > +{ > + int ret = 0; > + kprobe_opcode_t *uprobe_address; > + > + if (!page) > + return -EINVAL; /* TODO: more suitable errno */ > + > + wait_on_page_locked(page); > + /* could probably retry readpage here. */ > + if (!PageUptodate(page)) > + return -EINVAL; /* TODO: more suitable errno */ > + > + lock_page(page); That's a lot of fuss and might be racy with truncate. Why not just run lock_page()? > + uprobe_address = (kprobe_opcode_t *)kmap(page); > + uprobe_address = (kprobe_opcode_t *)((unsigned long)uprobe_address + > + (uprobe->offset & ~PAGE_MASK)); > + ret = (*process_kprobe_user)(uprobe, uprobe_address); > + kunmap(page); kmap_atomic() is preferred. > +/** > + * Gets exclusive write access to the given inode to ensure that the file > + * on which probes are currently applied does not change. Use the function, > + * deny_write_access_to_inode() we added in fs/namei.c. > + */ > +static inline int ex_write_lock(struct inode *inode) > +{ > + return deny_write_access_to_inode(inode); > +} hm, this code likes to poke around in VFS internals. It would be nice to have an overall description of what it's trying to do, why and how. > +/** > + * unregister_uprobe: Disarms the probe, removes the uprobe > + * pointers from the hash list and unhooks readpage routines. > + */ > +void __kprobes unregister_uprobe(struct uprobe *uprobe) > +{ > + struct address_space *mapping; > + struct uprobe_module *umodule; > + struct page *page; > + unsigned long flags; > + int ret = 0; > + > + if (!uprobe->inode) > + return; > + > + mapping = uprobe->inode->i_mapping; > + > + page = find_get_page(mapping, uprobe->offset >> PAGE_CACHE_SHIFT); > + > + spin_lock_irqsave(&uprobe_lock, flags); > + ret = remove_uprobe(uprobe); > + spin_unlock_irqrestore(&uprobe_lock, flags); > + > + mutex_lock(&uprobe_mutex); > + if (!(umodule = get_module_by_inode(uprobe->inode))) > + goto out; > + > + hlist_del(&uprobe->ulist); > + if (hlist_empty(&umodule->ulist_head)) { > + list_del(&umodule->mlist); > + ex_write_unlock(uprobe->inode); > + path_release(&umodule->nd); > + kfree(umodule); > + } > + > +out: > + mutex_unlock(&uprobe_mutex); > + if (ret) > + ret = map_uprobe_page(page, uprobe, remove_kprobe_user); > + > + if (ret == -EINVAL) > + return; > + /* > + * TODO: unregister_uprobe should not fail, need to handle > + * if it fails. > + */ > + flush_vma(mapping, page, uprobe); > + > + if (page) > + page_cache_release(page); > +} That's some pretty awkward coding. Buggy too. It doesn't drop the refcount on the page if map_uprobe_page() returned -EINVAL because it's assuming that EINVAL meant "there was no page". But there are multiple reasons for map_uprobe_page() to return -EINVAL. If that page isn't uptodate, we leak a ref. This function should be doing the checking for a find_get_page() failure, not map_uprobe_page(). - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/