Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp108342pxb; Wed, 20 Oct 2021 17:41:08 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxna2JSDzqoXbQSnWbhJ0MbtFwXo4q/TgpNR8pnI/BBKxqCHdpeM5/qJgWWtTejwfC5JM1d X-Received: by 2002:a63:344d:: with SMTP id b74mr1959328pga.142.1634776868127; Wed, 20 Oct 2021 17:41:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1634776868; cv=none; d=google.com; s=arc-20160816; b=i3BIfScvAAfrQ/hHPJNJtsXvXRzOFM9E74sS09R968cg2qDpycxkM8uORC854Pjem6 RIFNdIRpbsZL2enjzte2a/mREQjXtrtLrZfdmqBK5eMgjYQ46Cqzol3EcsVvRtXYlWBB sYoubF4zGJ1olUlrCbwR7N4m0dWrM9GfD78zJj15RF2iWCQ8auFn0cOMcXoxDhORud9n 94HXD04ybbdOyF3q+YdOcKrXyDGabKubkcvVTKyL7qZcGUUEynCXLrapv3tgHbLQ1xLq Jw9DLuOeeQ5HSiDoxmSh+MMIwZf0DEd1eSaZ6vFv3l5vl82PcxhN5lD6RRprbsbeQlFn dpVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=nINcHNP1LunHuFgS+T7bSGFWStA5gFA7A821E6TM/8E=; b=V8uax/mocvpHYmvb33hFAzROEMgzciFdfxRYYDxm07Pul/EFQDZ9REMTzodbtPCxsS EsePg8USVajub4Mt3cbZiNnIRJ2qT6TtUX5z1ln3voM5fAS+8ctfSjdEmWLIWwkjzL/O h7SvWsIGZ26BAqw4ekQAvB6T57WyFxGVEVKnizewOxPZaYenTH6B+FuQw52uraYngJqf rB0e7GJd9kHU7CW3FYF46tC/HADQffsXtMJRO3ATiX8Fy2ylgT7EE+/UC8qZsynmxVXl wUGfHgsMZVNn4vvQ9MZkEqfDydwnFufDWC7xH5TMcz/qqRIHpNG1s41vXjOCwXDpeAv1 MnqQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=CbX4hham; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id k16si4590916pls.8.2021.10.20.17.40.55; Wed, 20 Oct 2021 17:41:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=CbX4hham; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230225AbhJUAmF (ORCPT + 99 others); Wed, 20 Oct 2021 20:42:05 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:31531 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230478AbhJUAmF (ORCPT ); Wed, 20 Oct 2021 20:42:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1634776789; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=nINcHNP1LunHuFgS+T7bSGFWStA5gFA7A821E6TM/8E=; b=CbX4hham1ntphzzf2fngYUoCDjB9g0f5F7dzZutLUX2T2N81zOMzauIC7LZuKjXWgktulN GZz+kuHKyprLzi4EAa7kBlTBHcy0XZxqm7EzOf2hiZud4eIcx60bJjG2wVywj1y9TpfNp8 yNfIzATb646+lOevZoiTEkv4cXoqmtY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-557-c3Q0yG6OP5e-tVuS9OPEsw-1; Wed, 20 Oct 2021 20:39:45 -0400 X-MC-Unique: c3Q0yG6OP5e-tVuS9OPEsw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AFD5A1006AA2; Thu, 21 Oct 2021 00:39:42 +0000 (UTC) Received: from T590 (ovpn-8-19.pek2.redhat.com [10.72.8.19]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 2231319D9B; Thu, 21 Oct 2021 00:39:10 +0000 (UTC) Date: Thu, 21 Oct 2021 08:39:05 +0800 From: Ming Lei To: Luis Chamberlain Cc: Benjamin Herrenschmidt , Paul Mackerras , tj@kernel.org, gregkh@linuxfoundation.org, akpm@linux-foundation.org, minchan@kernel.org, jeyu@kernel.org, shuah@kernel.org, bvanassche@acm.org, dan.j.williams@intel.com, joe@perches.com, tglx@linutronix.de, keescook@chromium.org, rostedt@goodmis.org, linux-spdx@vger.kernel.org, linux-doc@vger.kernel.org, linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, ming.lei@redhat.com Subject: Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 20, 2021 at 08:48:04AM -0700, Luis Chamberlain wrote: > On Wed, Oct 20, 2021 at 09:15:20AM +0800, Ming Lei wrote: > > On Tue, Oct 19, 2021 at 12:36:42PM -0700, Luis Chamberlain wrote: > > > On Wed, Oct 20, 2021 at 12:29:53AM +0800, Ming Lei wrote: > > > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c > > > > index d0cae7a42f4d..a14ba3d350ea 100644 > > > > --- a/drivers/block/zram/zram_drv.c > > > > +++ b/drivers/block/zram/zram_drv.c > > > > @@ -1704,12 +1704,12 @@ static void zram_reset_device(struct zram *zram) > > > > set_capacity_and_notify(zram->disk, 0); > > > > part_stat_set_all(zram->disk->part0, 0); > > > > > > > > - up_write(&zram->init_lock); > > > > /* I/O operation under all of CPU are done so let's free */ > > > > zram_meta_free(zram, disksize); > > > > memset(&zram->stats, 0, sizeof(zram->stats)); > > > > zcomp_destroy(comp); > > > > reset_bdev(zram); > > > > + up_write(&zram->init_lock); > > > > } > > > > > > > > static ssize_t disksize_store(struct device *dev, > > > > > > With this, it still ends up in a state where we loop and can't get out of: > > > > > > zram: Can't change algorithm for initialized device > > > > Again, you are running two zram02.sh[1] on /dev/zram0, that isn't unexpected > > You mean that it is not expected? If so then yes, of course. My meaning is clear: it is not unexpected, so it is expected. > > > behavior. Here the difference is just timing. > > Right, but that is what helped reproduce a difficutl to re-produce customer > bug. Once you find an easy way to reproduce a reported issue you stick > with it and try to make the situation worse to ensure no more bugs are > present. > > > Also you did not answer my question about your test expected result when > > running the following script from two terminal concurrently: > > > > while true; do > > PATH=$PATH:$PWD:$PWD/../../../lib/ ./zram02.sh; > > done > > If you run this, you should see no failures. OK, not see any failure when running single zram02.sh after applying my patch V2. > > Once you start a second script that one should cause odd issues on both > sides but never crash or stall the module. crash can't be observed with my patch V2, what do you mean 'stall' the module? Is that 'zram' can't be unloaded after the test is terminated via multiple 'ctrl-c'? > > A second series of tests is hitting CTRL-C on either randonly and > restarting testing once again randomly. ltp/zram02.sh has cleanup handler via trap to clean everything(swapoff/umount/reset/ rmmod), ctrl-c will terminate current forground task and cause shell to run the cleanup handler first, but further 'ctrl-c' will terminate the cleanup handler, then the cleanup won't be done completely, such as zram disk is left as swap device and zram can't be unloaded. The idea can be observed via the following script: #!/bin/bash trap 'echo "enter trap"; sleep 20; echo "exit trap";' INT sleep 30 After the above script is run foreground, when 1st ctrl-c is pressed, 'sleep 30' is terminated, then the trap command is run, so you can see "enter trap" dumped. Then if you pressed 2nd ctrl-c, 'sleep 20' is terminated immediately. So 'swapoff' from zram02.sh's trap function can be terminated in this way. zram disk being left as swap disk can be observed with your patch too after terminating via multiple ctrl-c which has to be done this way because the test is dead loop. So it is hard to cleanup everything completely after multiple 'CTRL-C' is involved, and it should be impossible. It needs violent multiple ctrl-c to terminate the dealoop test. So it isn't reasonable to expect that zram can be always unloaded successfully after the test script is terminated via multiple ctrl-c. But zram can be unloaded after running swapoff manually, from driver viewpoint, nothing is wrong. > > Again, neither should crash the kernel or stall the module. > > In the end of these tests you should be able to run the script alone > just once and not see issues. Thanks, Ming