Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752882AbbLLTla (ORCPT ); Sat, 12 Dec 2015 14:41:30 -0500 Received: from mail-ig0-f175.google.com ([209.85.213.175]:37402 "EHLO mail-ig0-f175.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752754AbbLLTl1 (ORCPT ); Sat, 12 Dec 2015 14:41:27 -0500 MIME-Version: 1.0 In-Reply-To: References: <20151212162342.GF11257@ret.masoncoding.com> Date: Sat, 12 Dec 2015 11:41:26 -0800 X-Google-Sender-Auth: 9twtwhMfqyYPxi_Jw-pD_Xlb710 Message-ID: Subject: Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR From: Linus Torvalds To: Chris Mason , Peter Zijlstra , Dave Jones , LKML , Jon Christopherson Cc: NeilBrown , Ingo Molnar , David Howells , Steven Whitehouse Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1958 Lines: 64 On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds wrote: > > Peter, did that patch also handle just plain "lock_page()" case? Looking more at it, I think this all goes back to commit 743162013d40 ("sched: Remove proliferation of wait_on_bit() action functions"). Before that, we had wait_on_page_bit() doing: __wait_on_bit(page_waitqueue(page), &wait, sleep_on_page, TASK_UNINTERRUPTIBLE); and after that, the "sleep_on_page" got changed to "bit_wait_io". But that is bogus, because sleep_on_page() used to look like this: static int sleep_on_page(void *word) { io_schedule(); return 0; } while bit_wait_io() looks like this: __sched int bit_wait_io(void *word) { if (signal_pending_state(current->state, current)) return 1; io_schedule(); return 0; } which is ok, because as long as the task state is TASK_UNINTERRUPTIBLE, the whole signal_pending_state() thing turns into a no-op. So far, so fine. However, then commit 68985633bccb ("sched/wait: Fix signal handling in bit wait helpers") _really_ screwed up, and changed the function to __sched int bit_wait(struct wait_bit_key *word) { schedule(); if (signal_pending(current)) return -EINTR; return 0; } so now it returns an error when no error should happen. Which in turn makes __wait_on_bit() exit the bit-wait loop early. It looks like PeterZ's pending patch should fix this, by passing in the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going back to signal_pending_state(). PeterZ, did I follow the history of this correctly? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/