Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752848AbbHDWi0 (ORCPT ); Tue, 4 Aug 2015 18:38:26 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:58191 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751383AbbHDWiZ (ORCPT ); Tue, 4 Aug 2015 18:38:25 -0400 Date: Tue, 4 Aug 2015 15:38:23 -0700 From: Andrew Morton To: yalin wang Cc: Ingo Molnar , dave@sr71.net, David Rientjes , fabf@skynet.be, bhe@redhat.com, open list Subject: Re: [RFC] kcore:change kcore_read to make sure the kernel read is safe Message-Id: <20150804153823.51b8ba7b9307dc663d635907@linux-foundation.org> In-Reply-To: <593904DC-7E60-4B95-B82F-B50A5024085C@gmail.com> References: <593904DC-7E60-4B95-B82F-B50A5024085C@gmail.com> X-Mailer: Sylpheed 3.4.1 (GTK+ 2.24.23; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4432 Lines: 123 On Tue, 4 Aug 2015 11:37:57 +0800 yalin wang wrote: > This change kcore_read() to use __copy_from_user_inatomic() to > copy data from kernel address, because kern_addr_valid() just make sure > page table is valid during call it, whne it return, the page table may > change, for example, like set_fixmap() function will change kernel page > table, then maybe trigger kernel crash if encounter this unluckily. The changelog is a bit hard to follow. How does this version look? : read_kcore() does a copy_to_user() from kernel memory. This could cause a : crash if the source (kernel) address is concurrently unmapped via, say, : set_fixmap(). The kern_addr_valid() check is racy and won't reliably : prevent this. : : Change kcore_read() to use __copy_from_user_inatomic() via a temporary : buffer to catch such situations. What actually happens when copy_to_user() gets a fault on the source address? It *could* handle it and return -EFAULT. I forget... Also... what is special about this particular copy_to_user()? Isn't every copy_to_user() in the kernel vulnerable to a concurrent set_fixmap()? Is it that only read_kcore() will read pages which are subject to set_fixmap() alteration? > --- a/fs/proc/kcore.c > +++ b/fs/proc/kcore.c > @@ -86,8 +86,8 @@ static size_t get_kcore_size(int *nphdr, size_t *elf_buflen) > size = try; > *nphdr = *nphdr + 1; > } > - *elf_buflen = sizeof(struct elfhdr) + > - (*nphdr + 2)*sizeof(struct elf_phdr) + > + *elf_buflen = sizeof(struct elfhdr) + > + (*nphdr + 2)*sizeof(struct elf_phdr) + Unrelated whitespace fixes really shouldn't be in here. They don't bother me too much, but some people get upset ;) > 3 * ((sizeof(struct elf_note)) + > roundup(sizeof(CORE_STR), 4)) + > roundup(sizeof(struct elf_prstatus), 4) + > @@ -435,6 +435,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos) > size_t elf_buflen; > int nphdr; > unsigned long start; > + unsigned long page = 0; "page" isn't a very good name - when we see that identifier we expect it to be a `struct page *'. Maybe call it copy_buf or something. (And incoming argument "buffer" was poorly named. "buffer" implies some temporary intermediate thing, which is inappropriate here!) > read_lock(&kclist_lock); > size = get_kcore_size(&nphdr, &elf_buflen); > @@ -485,7 +486,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos) > start = kc_offset_to_vaddr(*fpos - elf_buflen); > if ((tsz = (PAGE_SIZE - (start & ~PAGE_MASK))) > buflen) > tsz = buflen; > - > + > while (buflen) { > struct kcore_list *m; > > @@ -515,15 +516,32 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos) > } else { > if (kern_addr_valid(start)) { Do we still need the (racy) kern_addr_valid() test? The code should work OK if this is removed? > unsigned long n; > + mm_segment_t old_fs = get_fs(); > + > + if (page == 0) { > + page = __get_free_page(GFP_KERNEL); > + if (page == 0) > + return -ENOMEM; > > - n = copy_to_user(buffer, (char *)start, tsz); > + } > + set_fs(KERNEL_DS); > + pagefault_disable(); > + n = __copy_from_user_inatomic((void *)page, > + (__force const void __user *)start, > + tsz); > + pagefault_enable(); > + set_fs(old_fs); We should have a code comment in here telling people what's going on. A concurrent set_fixmap() on the source memory is unexpected! > + if (n) > + memset((void *)page + tsz - n, 0, n); > + > + n = copy_to_user(buffer, (char *)page, tsz); > /* > * We cannot distinguish between fault on source > * and fault on destination. When this happens > * we clear too and hope it will trigger the > * EFAULT again. > */ > - if (n) { > + if (n) { > if (clear_user(buffer + tsz - n, > n)) > return -EFAULT; > @@ -540,7 +558,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos) > start += tsz; > tsz = (buflen > PAGE_SIZE ? PAGE_SIZE : buflen); > } > - > + free_page(page); > return acc; > } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/