This patch fixes a bug in the accounting of the device_state.
In the current code, the device_state was put (decremented) too many times,
which sometimes led to the driver getting stuck permanently in
put_device_state_wait(). That happened because device_state->count would go
below zero, which is never supposed to happen.

The root cause is that the device_state was decremented in put_pasid_state()
and put_pasid_state_wait(), but also in all the functions that call those
functions. Therefore, the device_state was decremented twice in each of these
code paths.

The fix is to decouple the device_state accounting from the pasid_state
accounting: remove the call to put_device_state() from put_pasid_state()
and put_pasid_state_wait().

Signed-off-by: Oded Gabbay <[email protected]>
---
drivers/iommu/amd_iommu_v2.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
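
Note (not part of the commit): the double drop described above can be
modelled in user space. The sketch below is only an illustration under
stated assumptions: C11 atomics stand in for the kernel's atomic_t, the
structures are reduced to their counters, and the names dec_and_test(),
put_pasid_state_old(), put_pasid_state_new(),
put_device_state_wait_would_block() and simulate_caller() are hypothetical
helpers, not functions from amd_iommu_v2.c. The caller sequence is a
simplified guess at one code path, not the driver's actual flow.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, stripped-down stand-ins for the driver's structures. */
struct device_state { atomic_int count; };
struct pasid_state  { atomic_int count; struct device_state *device_state; };

/* Like atomic_dec_and_test(): true only if the new value is exactly 0. */
static bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1 == 0;
}

static void put_device_state(struct device_state *ds)
{
	/* The kernel would wake up a waiter when this hits zero. */
	(void)dec_and_test(&ds->count);
}

/* Pre-patch behaviour: the last pasid put also drops a device reference. */
static void put_pasid_state_old(struct pasid_state *ps)
{
	if (dec_and_test(&ps->count))
		put_device_state(ps->device_state);
}

/* Post-patch behaviour: pasid accounting no longer touches the device. */
static void put_pasid_state_new(struct pasid_state *ps)
{
	(void)dec_and_test(&ps->count);
}

/*
 * Models the final put_device_state_wait(): the waiter proceeds only if its
 * own decrement brings the count to exactly zero. Returns true if it would
 * sleep instead, with nobody left to wake it.
 */
static bool put_device_state_wait_would_block(struct device_state *ds)
{
	return !dec_and_test(&ds->count);
}

/* Simplified caller: take a device ref, drop the pasid, drop the device ref. */
bool simulate_caller(bool patched)
{
	struct device_state ds;
	struct pasid_state ps;

	atomic_init(&ds.count, 1);	/* long-lived reference held by the owner */
	atomic_init(&ps.count, 1);
	ps.device_state = &ds;

	atomic_fetch_add(&ds.count, 1);	/* caller takes its device reference */
	if (patched)
		put_pasid_state_new(&ps);
	else
		put_pasid_state_old(&ps);	/* hidden extra device put */
	assert(atomic_load(&ps.count) == 0);
	put_device_state(&ds);		/* caller drops its own reference */

	return put_device_state_wait_would_block(&ds);
}
```

In this model, simulate_caller(false) leaves the device count at zero before
the final wait, so the waiter's decrement takes it to -1 and it would sleep
forever. With simulate_caller(true) the count is still 1 at that point and
the waiter proceeds.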
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 0d387db..7e0614b 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -272,10 +272,8 @@ static void free_pasid_state(struct pasid_state *pasid_state)
 static void put_pasid_state(struct pasid_state *pasid_state)
 {
-	if (atomic_dec_and_test(&pasid_state->count)) {
-		put_device_state(pasid_state->device_state);
+	if (atomic_dec_and_test(&pasid_state->count))
 		wake_up(&pasid_state->wq);
-	}
 }
 static void put_pasid_state_wait(struct pasid_state *pasid_state)
@@ -284,9 +282,7 @@ static void put_pasid_state_wait(struct pasid_state *pasid_state)
 	prepare_to_wait(&pasid_state->wq, &wait, TASK_UNINTERRUPTIBLE);
-	if (atomic_dec_and_test(&pasid_state->count))
-		put_device_state(pasid_state->device_state);
-	else
+	if (!atomic_dec_and_test(&pasid_state->count))
 		schedule();
 	finish_wait(&pasid_state->wq, &wait);
--
1.9.1
On Mon, Nov 10, 2014 at 12:21:39PM +0200, Oded Gabbay wrote:
> This patch fixes a bug in the accounting of the device_state.
> In the current code, the device_state was put (decremented) too many times,
> which sometimes led to the driver getting stuck permanently in
> put_device_state_wait(). That happened because device_state->count would go
> below zero, which is never supposed to happen.
>
> The root cause is that the device_state was decremented in put_pasid_state()
> and put_pasid_state_wait(), but also in all the functions that call those
> functions. Therefore, the device_state was decremented twice in each of these
> code paths.
>
> The fix is to decouple the device_state accounting from the pasid_state
> accounting: remove the call to put_device_state() from put_pasid_state()
> and put_pasid_state_wait().
Right, there was a double drop of the reference to device state. An
alternative would have been to remove the put_device_state() call at the
end of the amd_iommu_unbind_pasid() function. But this patch works as
well and is slightly better, so: Applied, thanks.

Joerg