2023-06-20 13:37:56

by Daniel Wagner

Subject: [PATCH blktests v1 1/3] nvme/048: Check for queue count check directly

The test monitored the state changes live -> resetting -> connecting ->
live, to verify that the queue count change was successful.

The fc transport reconnects very fast, so the state transitions
are not observed by the current approach.

So instead of trying to monitor the state changes, just wait for the
live state and the correct queue count.

As the queue count depends on the number of online CPUs, explicitly
use 1 and 2 for qid_max. This means the queue_count value
needs to reach either 2 or 3 (the admin queue included).

Signed-off-by: Daniel Wagner <[email protected]>
---
tests/nvme/048 | 35 ++++++++++++++++++++++++-----------
1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/tests/nvme/048 b/tests/nvme/048
index 81084f0440c2..3dc5169132de 100755
--- a/tests/nvme/048
+++ b/tests/nvme/048
@@ -42,6 +42,26 @@ nvmf_wait_for_state() {
return 0
}

+nvmf_wait_for_queue_count() {
+ local subsys_name="$1"
+ local queue_count="$2"
+ local nvmedev
+
+ nvmedev=$(_find_nvme_dev "${subsys_name}")
+
+ queue_count_file="/sys/class/nvme-fabrics/ctl/${nvmedev}/queue_count"
+
+ nvmf_wait_for_state "${subsys_name}" "live" || return 1
+
+ queue_count=$((queue_count + 1))
+ if grep -q "${queue_count}" "${queue_count_file}"; then
+ return 0
+ fi
+
+ echo "expected queue count ${queue_count} not set"
+ return 1
+}
+
set_nvmet_attr_qid_max() {
local nvmet_subsystem="$1"
local qid_max="$2"
@@ -56,10 +76,7 @@ set_qid_max() {
local qid_max="$3"

set_nvmet_attr_qid_max "${subsys_name}" "${qid_max}"
-
- # Setting qid_max forces a disconnect and the reconntect attempt starts
- nvmf_wait_for_state "${subsys_name}" "connecting" || return 1
- nvmf_wait_for_state "${subsys_name}" "live" || return 1
+ nvmf_wait_for_queue_count "${subsys_name}" "${qid_max}" || return 1

return 0
}
@@ -77,12 +94,8 @@ test() {

_setup_nvmet

- hostid="$(uuidgen)"
- if [ -z "$hostid" ] ; then
- echo "uuidgen failed"
- return 1
- fi
- hostnqn="nqn.2014-08.org.nvmexpress:uuid:${hostid}"
+ hostid="${def_hostid}"
+ hostnqn="${def_hostnqn}"

truncate -s "${nvme_img_size}" "${file_path}"

@@ -103,7 +116,7 @@ test() {
echo FAIL
else
set_qid_max "${port}" "${subsys_name}" 1 || echo FAIL
- set_qid_max "${port}" "${subsys_name}" 128 || echo FAIL
+ set_qid_max "${port}" "${subsys_name}" 2 || echo FAIL
fi

_nvme_disconnect_subsys "${subsys_name}"
--
2.41.0



2023-06-20 14:19:50

by Sagi Grimberg

Subject: Re: [PATCH blktests v1 1/3] nvme/048: Check for queue count check directly


> The test monitored the state changes live -> resetting -> connecting ->
> live, to verify that the queue count change was successful.
>
> The fc transport reconnects very fast, so the state transitions
> are not observed by the current approach.
>
> So instead of trying to monitor the state changes, just wait for the
> live state and the correct queue count.
>
> As the queue count depends on the number of online CPUs, explicitly
> use 1 and 2 for qid_max. This means the queue_count value
> needs to reach either 2 or 3 (the admin queue included).
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> tests/nvme/048 | 35 ++++++++++++++++++++++++-----------
> 1 file changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/tests/nvme/048 b/tests/nvme/048
> index 81084f0440c2..3dc5169132de 100755
> --- a/tests/nvme/048
> +++ b/tests/nvme/048
> @@ -42,6 +42,26 @@ nvmf_wait_for_state() {
> return 0
> }
>
> +nvmf_wait_for_queue_count() {
> + local subsys_name="$1"
> + local queue_count="$2"
> + local nvmedev
> +
> + nvmedev=$(_find_nvme_dev "${subsys_name}")
> +
> + queue_count_file="/sys/class/nvme-fabrics/ctl/${nvmedev}/queue_count"
> +
> + nvmf_wait_for_state "${subsys_name}" "live" || return 1
> +
> + queue_count=$((queue_count + 1))
> + if grep -q "${queue_count}" "${queue_count_file}"; then
> + return 0
> + fi
> +
> + echo "expected queue count ${queue_count} not set"
> + return 1
> +}
> +
> set_nvmet_attr_qid_max() {
> local nvmet_subsystem="$1"
> local qid_max="$2"
> @@ -56,10 +76,7 @@ set_qid_max() {
> local qid_max="$3"
>
> set_nvmet_attr_qid_max "${subsys_name}" "${qid_max}"
> -
> - # Setting qid_max forces a disconnect and the reconntect attempt starts
> - nvmf_wait_for_state "${subsys_name}" "connecting" || return 1
> - nvmf_wait_for_state "${subsys_name}" "live" || return 1
> + nvmf_wait_for_queue_count "${subsys_name}" "${qid_max}" || return 1

Why not simply wait for live? The connecting is obviously racy...

>
> return 0
> }
> @@ -77,12 +94,8 @@ test() {
>
> _setup_nvmet
>
> - hostid="$(uuidgen)"
> - if [ -z "$hostid" ] ; then
> - echo "uuidgen failed"
> - return 1
> - fi
> - hostnqn="nqn.2014-08.org.nvmexpress:uuid:${hostid}"
> + hostid="${def_hostid}"
> + hostnqn="${def_hostnqn}"
>
> truncate -s "${nvme_img_size}" "${file_path}"
>
> @@ -103,7 +116,7 @@ test() {
> echo FAIL
> else
> set_qid_max "${port}" "${subsys_name}" 1 || echo FAIL
> - set_qid_max "${port}" "${subsys_name}" 128 || echo FAIL
> + set_qid_max "${port}" "${subsys_name}" 2 || echo FAIL
> fi
>
> _nvme_disconnect_subsys "${subsys_name}"

2023-06-20 18:33:38

by Daniel Wagner

Subject: Re: [PATCH blktests v1 1/3] nvme/048: Check for queue count check directly

On Tue, Jun 20, 2023 at 04:49:45PM +0300, Sagi Grimberg wrote:
> > +nvmf_wait_for_queue_count() {
> > + local subsys_name="$1"
> > + local queue_count="$2"
> > + local nvmedev
> > +
> > + nvmedev=$(_find_nvme_dev "${subsys_name}")
> > +
> > + queue_count_file="/sys/class/nvme-fabrics/ctl/${nvmedev}/queue_count"
> > +
> > + nvmf_wait_for_state "${subsys_name}" "live" || return 1
> > +
> > + queue_count=$((queue_count + 1))
> > + if grep -q "${queue_count}" "${queue_count_file}"; then
> > + return 0
> > + fi
> > +
> > + echo "expected queue count ${queue_count} not set"
> > + return 1
> > +}
> > +
> > set_nvmet_attr_qid_max() {
> > local nvmet_subsystem="$1"
> > local qid_max="$2"
> > @@ -56,10 +76,7 @@ set_qid_max() {
> > local qid_max="$3"
> > set_nvmet_attr_qid_max "${subsys_name}" "${qid_max}"
> > -
> > - # Setting qid_max forces a disconnect and the reconntect attempt starts
> > - nvmf_wait_for_state "${subsys_name}" "connecting" || return 1
> > - nvmf_wait_for_state "${subsys_name}" "live" || return 1
> > + nvmf_wait_for_queue_count "${subsys_name}" "${qid_max}" || return 1
>
> Why not simply wait for live? The connecting is obviously racy...

That is what the new version does: it waits for the live state and then
checks the queue count.

2023-06-21 10:17:13

by Sagi Grimberg

Subject: Re: [PATCH blktests v1 1/3] nvme/048: Check for queue count check directly


>>> +nvmf_wait_for_queue_count() {
>>> + local subsys_name="$1"
>>> + local queue_count="$2"
>>> + local nvmedev
>>> +
>>> + nvmedev=$(_find_nvme_dev "${subsys_name}")
>>> +
>>> + queue_count_file="/sys/class/nvme-fabrics/ctl/${nvmedev}/queue_count"
>>> +
>>> + nvmf_wait_for_state "${subsys_name}" "live" || return 1
>>> +
>>> + queue_count=$((queue_count + 1))
>>> + if grep -q "${queue_count}" "${queue_count_file}"; then
>>> + return 0
>>> + fi
>>> +
>>> + echo "expected queue count ${queue_count} not set"
>>> + return 1
>>> +}
>>> +
>>> set_nvmet_attr_qid_max() {
>>> local nvmet_subsystem="$1"
>>> local qid_max="$2"
>>> @@ -56,10 +76,7 @@ set_qid_max() {
>>> local qid_max="$3"
>>> set_nvmet_attr_qid_max "${subsys_name}" "${qid_max}"
>>> -
>>> - # Setting qid_max forces a disconnect and the reconntect attempt starts
>>> - nvmf_wait_for_state "${subsys_name}" "connecting" || return 1
>>> - nvmf_wait_for_state "${subsys_name}" "live" || return 1
>>> + nvmf_wait_for_queue_count "${subsys_name}" "${qid_max}" || return 1
>>
>> Why not simply wait for live? The connecting is obviously racy...
>
> That is what the new version does: it waits for the live state and then
> checks the queue count.

Maybe don't fold waiting for live into waiting for queue_count.
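The suggested split could look roughly like this. This is only a sketch: the helper names and the stubbed environment are illustrative, not the actual patch, and the temp file stands in for the sysfs queue_count attribute:

```shell
# Sketch of the suggested split: wait for "live" separately, then verify
# the queue count in its own helper. The blktests helpers are stubbed so
# the sketch is self-contained.
queue_count_file=$(mktemp)
echo 3 > "$queue_count_file"            # pretend sysfs reports 3 queues

nvmf_wait_for_state() { return 0; }     # stub: controller is already live

nvmf_check_queue_count() {
	local expected="$1"
	# -x forces a whole-line match, so "2" cannot match "20"
	if ! grep -qx "${expected}" "$queue_count_file"; then
		echo "expected queue count ${expected} not set"
		return 1
	fi
}

set_qid_max_checked() {
	local subsys_name="$1" qid_max="$2"
	nvmf_wait_for_state "$subsys_name" "live" || return 1
	# qid_max I/O queues plus the admin queue
	nvmf_check_queue_count "$((qid_max + 1))" || return 1
}

set_qid_max_checked nvmet-subsys 2 && echo "queue count ok"
```

With this shape, a caller that only cares about reaching "live" can keep using nvmf_wait_for_state on its own.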

2023-06-21 11:38:34

by Daniel Wagner

Subject: Re: [PATCH blktests v1 1/3] nvme/048: Check for queue count check directly

On Wed, Jun 21, 2023 at 12:50:29PM +0300, Sagi Grimberg wrote:
> > > Why not simply wait for live? The connecting is obviously racy...
> >
> > That is what the new version is doing. It's waiting for the live state and then
> > checks the queue count.
>
> Maybe don't fold waiting for live into waiting for queue_count.

Alright, will do!

2023-06-28 08:56:42

by Daniel Wagner

Subject: Re: [PATCH blktests v1 1/3] nvme/048: Check for queue count check directly

On Tue, Jun 27, 2023 at 10:13:48AM +0000, Shinichiro Kawasaki wrote:
> > + nvmf_wait_for_state "${subsys_name}" "live" || return 1
> > +
> > + queue_count=$((queue_count + 1))
> > + if grep -q "${queue_count}" "${queue_count_file}"; then
>
> Does this check work when the number in queue_count_file has more digits than
> queue_count? e.g.) queue_count_file=20, queue_count=2

The idea is that it should be an exact match. Let me figure out if this does
what it is supposed to do.
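For reference, the digit-prefix problem can be reproduced in isolation; `grep -x` is one way to force a whole-line match (the temp file stands in for the sysfs attribute, and this is not necessarily what the next version will use):

```shell
# Demonstrate the substring pitfall Shinichiro points out: with plain
# grep -q, expected value "2" wrongly matches a file containing "20".
queue_count_file=$(mktemp)
echo 20 > "$queue_count_file"

grep -q  "2"  "$queue_count_file" && echo "substring match: 2 matches 20"
grep -qx "2"  "$queue_count_file" || echo "exact match: 2 does not match 20"
grep -qx "20" "$queue_count_file" && echo "exact match: 20 matches 20"
```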

> > - hostnqn="nqn.2014-08.org.nvmexpress:uuid:${hostid}"
> > + hostid="${def_hostid}"
> > + hostnqn="${def_hostnqn}"
>
> I guess it's better to move this hunk to the 3rd patch. Or we can mention it
> in the commit message of this patch.

Good point, I'll move this to the 3rd patch so this change is in one patch.