Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966589AbbBCTXr (ORCPT ); Tue, 3 Feb 2015 14:23:47 -0500 Received: from mail-ig0-f178.google.com ([209.85.213.178]:60926 "EHLO mail-ig0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966015AbbBCTXp (ORCPT ); Tue, 3 Feb 2015 14:23:45 -0500 MIME-Version: 1.0 In-Reply-To: <20150203173413.GI3702@8bytes.org> References: <20150201144058.GM2974@kvack.org> <20150201221458.GN2974@kvack.org> <20150202001628.GO2974@kvack.org> <20150203112733.GM26304@twins.programming.kicks-ass.net> <20150203122551.GJ24151@twins.programming.kicks-ass.net> <20150203173413.GI3702@8bytes.org> Date: Tue, 3 Feb 2015 11:23:44 -0800 X-Google-Sender-Auth: BJShpVLbCYbdvjPcKrXW6X4DYIk Message-ID: Subject: Re: [PATCH] iommu/amd: Fix amd_iommu_free_device() From: Linus Torvalds To: Joerg Roedel Cc: Peter Zijlstra , Benjamin LaHaise , linux-aio@kvack.org, Linux Kernel , Jesse Barnes Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3213 Lines: 75 On Tue, Feb 3, 2015 at 9:34 AM, Joerg Roedel wrote: > > Hmm, have you seen spurious wakeups happening? The wakeup only comes > from put_device_state() and only when the reference count goes to zero. Even if that were true - and it isn't - the patch in question making it simpler and more robust, removing more lines than it adds. Maybe the callers have other events pending that might wake that process up? Are you going to guarantee that no caller ever might be doing their own wakeup thing? In fact, even if you do *(not* have anything in your call chain that has other wait queues, active, it's still wrong to do the single-wakeup thing, because we've also traditionally had cases where we do directed wakeups of threads Basically, you should always assume you can get spurious wakeups due to multiple independent wait-queues, the same way you should always assume you can get spurious hardware interrupts due to shared interrupt lines among multiple idnependent devices. Because there are also non-wait-queue wakeup events, although they are relatively rare. The obvious one that people are aware of is signals, which only affects TASK_INTERRUPTIBLE - or TASK_KILLABLE - waits, but there can actually be spurious wakeups even for TASK_UNINTERRUPTIBLE cases. In particular, there are things like "wake_up_process()" which generally doesn't wake up random tasks, but we do have cases where we use it locklessly (taking a reference to a task). See for example "kernel/locking/rwsem-add.c", which uses this model for waking up a process *without* necessarily holding a lock, which can result in the process actually being woken up *after* it already started running and doing something else. The __rwsem_do_wake() thing does smp_mb(); waiter->task = NULL; wake_up_process(tsk); put_task_struct(tsk); to basically say "I can wake up the process even after it has released the "waiter" structure, using the "waiter->task" thing as a release. The waiter, in turn, waits for waiter.task to become NULL, but what this all means is that you can actually get a "delayed wakeup" from having done a successful down_read() some time ago. All of these are really really unlikely to hit you, but basically you should *never* assume anything about "single wakeup". The linux wakeup model has always been that there can be multiple sources of wakeup events, and the proper way to wait for something is generally a variation of prepare_to_wait(..); for (;;) { set_task_state(..); .. test for condition and break out .. schedule() } finish_wait() although obviously these days we *heavily* favor the "wait_event()" interfaces that encapsulate something that looks like that loop and makes for a much simpler programming model. We should basically never favor the open-coded version, because it's so easy to get it wrong. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/