2019-01-29 00:57:01

by Dexuan Cui

Subject: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list


Add the Hyper-V _DSM command set to the white list of NVDIMM command
sets.

This command set is documented at http://www.uefi.org/RFIC_LIST
(see "Virtual NVDIMM 0x1901").

Thanks Dan Williams <[email protected]> for writing the
comment change.

Signed-off-by: Dexuan Cui <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
---

Changes in v2:
Updated the comment and changelog (Thanks, Dan!)
Rebased to the tag libnvdimm-fixes-5.0-rc4 of the nvdimm tree.

drivers/acpi/nfit/core.c | 17 ++++++++++++++---
drivers/acpi/nfit/nfit.h | 6 +++++-
include/uapi/linux/ndctl.h | 1 +
3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index e18ade5d74e9..a9270c99be72 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1861,9 +1861,17 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
dev_set_drvdata(&adev_dimm->dev, nfit_mem);

/*
- * Until standardization materializes we need to consider 4
- * different command sets. Note, that checking for function0 (bit0)
- * tells us if any commands are reachable through this GUID.
+ * There are 4 "legacy" NVDIMM command sets
+ * (NVDIMM_FAMILY_{INTEL,MSFT,HPE1,HPE2}) that were created before
+ * an EFI working group was established to constrain this
+ * proliferation. The nfit driver probes for the supported command
+ * set by GUID. Note, if you're a platform developer looking to add
+ * a new command set to this probe, consider using an existing set,
+ * or otherwise seek approval to publish the command set at
+ * http://www.uefi.org/RFIC_LIST.
+ *
+ * Note, that checking for function0 (bit0) tells us if any commands
+ * are reachable through this GUID.
*/
for (i = 0; i <= NVDIMM_FAMILY_MAX; i++)
if (acpi_check_dsm(adev_dimm->handle, to_nfit_uuid(i), 1, 1))
@@ -1886,6 +1894,8 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
dsm_mask &= ~(1 << 8);
} else if (nfit_mem->family == NVDIMM_FAMILY_MSFT) {
dsm_mask = 0xffffffff;
+ } else if (nfit_mem->family == NVDIMM_FAMILY_HYPERV) {
+ dsm_mask = 0x1f;
} else {
dev_dbg(dev, "unknown dimm command family\n");
nfit_mem->family = -1;
@@ -3729,6 +3739,7 @@ static __init int nfit_init(void)
guid_parse(UUID_NFIT_DIMM_N_HPE1, &nfit_uuid[NFIT_DEV_DIMM_N_HPE1]);
guid_parse(UUID_NFIT_DIMM_N_HPE2, &nfit_uuid[NFIT_DEV_DIMM_N_HPE2]);
guid_parse(UUID_NFIT_DIMM_N_MSFT, &nfit_uuid[NFIT_DEV_DIMM_N_MSFT]);
+ guid_parse(UUID_NFIT_DIMM_N_HYPERV, &nfit_uuid[NFIT_DEV_DIMM_N_HYPERV]);

nfit_wq = create_singlethread_workqueue("nfit");
if (!nfit_wq)
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 33691aecfcee..4de167b4f76f 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -34,11 +34,14 @@
/* https://msdn.microsoft.com/library/windows/hardware/mt604741 */
#define UUID_NFIT_DIMM_N_MSFT "1ee68b36-d4bd-4a1a-9a16-4f8e53d46e05"

+/* http://www.uefi.org/RFIC_LIST (see "Virtual NVDIMM 0x1901") */
+#define UUID_NFIT_DIMM_N_HYPERV "5746c5f2-a9a2-4264-ad0e-e4ddc9e09e80"
+
#define ACPI_NFIT_MEM_FAILED_MASK (ACPI_NFIT_MEM_SAVE_FAILED \
| ACPI_NFIT_MEM_RESTORE_FAILED | ACPI_NFIT_MEM_FLUSH_FAILED \
| ACPI_NFIT_MEM_NOT_ARMED | ACPI_NFIT_MEM_MAP_FAILED)

-#define NVDIMM_FAMILY_MAX NVDIMM_FAMILY_MSFT
+#define NVDIMM_FAMILY_MAX NVDIMM_FAMILY_HYPERV

#define NVDIMM_STANDARD_CMDMASK \
(1 << ND_CMD_SMART | 1 << ND_CMD_SMART_THRESHOLD | 1 << ND_CMD_DIMM_FLAGS \
@@ -94,6 +97,7 @@ enum nfit_uuids {
NFIT_DEV_DIMM_N_HPE1 = NVDIMM_FAMILY_HPE1,
NFIT_DEV_DIMM_N_HPE2 = NVDIMM_FAMILY_HPE2,
NFIT_DEV_DIMM_N_MSFT = NVDIMM_FAMILY_MSFT,
+ NFIT_DEV_DIMM_N_HYPERV = NVDIMM_FAMILY_HYPERV,
NFIT_SPA_VOLATILE,
NFIT_SPA_PM,
NFIT_SPA_DCR,
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index f57c9e434d2d..de5d90212409 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -243,6 +243,7 @@ struct nd_cmd_pkg {
#define NVDIMM_FAMILY_HPE1 1
#define NVDIMM_FAMILY_HPE2 2
#define NVDIMM_FAMILY_MSFT 3
+#define NVDIMM_FAMILY_HYPERV 4

#define ND_IOCTL_CALL _IOWR(ND_IOCTL, ND_CMD_CALL,\
struct nd_cmd_pkg)
--
2.19.1
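
A side note on the mask arithmetic in the patch above: dsm_mask = 0x1f sets
bits 0 through 4, i.e. it admits _DSM function indices 0-4, with function 0
being the standard "which functions are supported" query mentioned in the
comment. A minimal standalone sketch of the decoding (illustrative only, not
driver code):

        #include <stdio.h>

        int main(void)
        {
                unsigned long dsm_mask = 0x1f; /* the value whitelisted for NVDIMM_FAMILY_HYPERV */
                int func;

                /* list the _DSM function indices the mask admits: 0 through 4 */
                for (func = 0; func < 32; func++)
                        if (dsm_mask & (1UL << func))
                                printf("function %d is reachable\n", func);
                return 0;
        }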



2019-01-30 06:25:25

by Dan Williams

Subject: Re: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

On Mon, Jan 28, 2019 at 4:56 PM Dexuan Cui <[email protected]> wrote:
>
>
> Add the Hyper-V _DSM command set to the white list of NVDIMM command
> sets.
>
> This command set is documented at http://www.uefi.org/RFIC_LIST
> (see "Virtual NVDIMM 0x1901").
>
> Thanks Dan Williams <[email protected]> for writing the
> comment change.
>
> Signed-off-by: Dexuan Cui <[email protected]>
> Reviewed-by: Michael Kelley <[email protected]>
> ---
>
> Changes in v2:
> Updated the comment and changelog (Thanks, Dan!)
> Rebased to the tag libnvdimm-fixes-5.0-rc4 of the nvdimm tree.

Thanks for the re-spin, applied.

2019-02-01 18:06:06

by Dan Williams

Subject: Re: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

On Fri, Feb 1, 2019 at 9:14 AM Dexuan Cui <[email protected]> wrote:
>
> > From: Dan Williams <[email protected]>
> > Sent: Tuesday, January 29, 2019 10:24 PM
> > On Mon, Jan 28, 2019 at 4:56 PM Dexuan Cui <[email protected]> wrote:
> > >
> > >
> > > Add the Hyper-V _DSM command set to the white list of NVDIMM command
> > > sets.
> > >
> > > Thanks Dan Williams <[email protected]> for writing the
> > > comment change.
> > > ---
> > > Changes in v2:
> > > Updated the comment and changelog (Thanks, Dan!)
> > > Rebased to the tag libnvdimm-fixes-5.0-rc4 of the nvdimm tree.
> >
> > Thanks for the re-spin, applied.
>
> Hi Dan,
> Unfortunately, it looks like this commit causes a regression on
> https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=libnvdimm-pending
>
> With the patch, "ndctl list" shows nothing, and /dev/pmem0 can't appear.
> If I revert the patch, it will be back to normal.
>
> I attached the config/logs. In the bad case, "dmesg" shows a line
> [ 5.259017] nd_pmem namespace0.0: 0x0000000000000000, too small must be at least 0x1000
>
> Any idea why this happens? I'm digging into the details and I appreciate your insights.

Looks like it is working as expected. The regression you are seeing is
the fact that the patch enables the kernel to enable
nvdimm-namespace-label reads. Those reads find a namespace index block
and a label. Unfortunately the label has the LOCAL flag set and Linux
explicitly ignores pmem namespace labels with that bit set. The reason
for that is due to the fact that the original definition of the LOCAL
bit from v1.1 of the namespace label implementation [1] explicitly
limited the LOCAL flag to "block aperture" regions. If you clear that
LOCAL flag I expect it will work. To my knowledge Windows pretends
that the v1.1 definition never existed.

The UEFI 2.7 specification for v1.2 labels states that setting the
LOCAL flag is optional when "nlabel", number of labels in the set, is
1. Linux makes that mandatory as LOCAL is redundant when nlabel is 1.

That said, the Robustness Principle makes a case that Linux should
tolerate the bit being set. However, it's just a non-trivial amount of
work to unwind the ingrained block-aperture assumptions of that bit.

[1]: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf
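
A rough illustration of the policy described above (a sketch of the idea only,
not the actual scan_labels() code; the flag values follow the v1.1 label
definitions, and the flags == 0x3 value observed on Hyper-V comes up later in
this thread):

        #include <stdbool.h>
        #include <stdint.h>
        #include <stdio.h>

        #define NSLABEL_FLAG_ROLABEL    0x1     /* label is read-only */
        #define NSLABEL_FLAG_LOCAL      0x2     /* block-aperture (DIMM-local) namespace */

        /* a label is accepted only when its LOCAL bit agrees with the region
         * type, so a LOCAL-flagged label on a pmem region is skipped */
        static bool label_matches_region(uint32_t flags, bool region_is_blk)
        {
                return region_is_blk == !!(flags & NSLABEL_FLAG_LOCAL);
        }

        int main(void)
        {
                /* Hyper-V's labels carry flags == 0x3 (read-only + LOCAL),
                 * so on a pmem region the check fails and the namespace
                 * ends up looking empty */
                printf("pmem region, label flags 0x3: %s\n",
                       label_matches_region(0x3, false) ? "accepted" : "skipped");
                return 0;
        }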

2019-02-01 18:07:00

by Dexuan Cui

Subject: RE: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

> From: Dan Williams <[email protected]>
> Sent: Tuesday, January 29, 2019 10:24 PM
> On Mon, Jan 28, 2019 at 4:56 PM Dexuan Cui <[email protected]> wrote:
> >
> >
> > Add the Hyper-V _DSM command set to the white list of NVDIMM command
> > sets.
> >
> > Thanks Dan Williams <[email protected]> for writing the
> > comment change.
> > ---
> > Changes in v2:
> > Updated the comment and changelog (Thanks, Dan!)
> > Rebased to the tag libnvdimm-fixes-5.0-rc4 of the nvdimm tree.
>
> Thanks for the re-spin, applied.

Hi Dan,
Unfortunately, it looks like this commit causes a regression on
https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=libnvdimm-pending

With the patch, "ndctl list" shows nothing, and /dev/pmem0 can't appear.
If I revert the patch, it will be back to normal.

I attached the config/logs. In the bad case, "dmesg" shows a line
[ 5.259017] nd_pmem namespace0.0: 0x0000000000000000, too small must be at least 0x1000

Any idea why this happens? I'm digging into the details and I appreciate your insights.

BTW, the patch does work fine on linux-next's next-20190107.

Thanks,
-- Dexuan


Attachments:
dmesg.bad.txt (47.51 kB)
dmesg.good.txt (44.91 kB)
kernel.config.txt (91.64 kB)
bad.ndctl.list.vvv.txt (1.92 kB)

2019-02-01 23:19:52

by Dexuan Cui

Subject: RE: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

> From: Dan Williams <[email protected]>
> Sent: Friday, February 1, 2019 9:29 AM
> > Hi Dan,
> > Unfortunately, it looks like this commit causes a regression ...
> > With the patch, "ndctl list" shows nothing, and /dev/pmem0 can't appear.
> > If I revert the patch, it will be back to normal.
> >
> > I attached the config/logs. In the bad case, "dmesg" shows a line
> > [ 5.259017] nd_pmem namespace0.0: 0x0000000000000000, too small
> > must be at least 0x1000
> > Any idea why this happens? I'm digging into the details and I appreciate your
> > insights.
>
> Looks like it is working as expected.

I was working on linux-next tree's next-20190107 and this patch did "work fine"
there. The "regression" happens on djbw/nvdimm.git tree's libnvdimm-pending
branch because we have this recent commit (Jan 19):

11189c1089da ("acpi/nfit: Fix command-supported detection"), which makes this
change in acpi_nfit_ctl():

- if (!test_bit(cmd, &cmd_mask) || !test_bit(func, &dsm_mask))
+ if (cmd == ND_CMD_CALL && !test_bit(func, &dsm_mask))
+ return -ENOTTY;
+ else if (!test_bit(cmd, &cmd_mask))
return -ENOTTY;

So previously ND_CMD_GET_CONFIG_DATA failed with -ENOTTY and we were good.

Now the command succeeds, but it looks like the returned data is invalid, and I see
the "regression".

> The regression you are seeing is the fact that the patch enables the kernel to
> enable nvdimm-namespace-label reads.
Yes.

> Those reads find a namespace index block
> and a label. Unfortunately the label has the LOCAL flag set and Linux
> explicitly ignores pmem namespace labels with that bit set. The reason
Can you please point out the function that ignores the flag?

I checked where NSLABEL_FLAG_LOCAL is used, but it seems I can't find a
related function.

> for that is due to the fact that the original definition of the LOCAL
> bit from v1.1 of the namespace label implementation [1] explicitly
> limited the LOCAL flag to "block aperture" regions. If you clear that
> LOCAL flag I expect it will work. To my knowledge Windows pretends
> that the v1.1 definition never existed.
I'm trying to find out where the flag is used and how to clear it.

> The UEFI 2.7 specification for v1.2 labels states that setting the
> LOCAL flag is optional when "nlabel", number of labels in the set, is
> 1. Linux makes that mandatory as LOCAL is redundant when nlabel is 1.
>
> That said, the Robustness Principle makes a case that Linux should
> tolerate the bit being set. However, it's just a non-trivial amount of
> work to unwind the ingrained block-aperture assumptions of that bit.
Can you please explain this a bit more? Sorry, I'm new to this area...

Thanks,
-- Dexuan

2019-02-01 23:49:00

by Dan Williams

Subject: Re: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

On Fri, Feb 1, 2019 at 3:17 PM Dexuan Cui <[email protected]> wrote:
>
> > From: Dan Williams <[email protected]>
> > Sent: Friday, February 1, 2019 9:29 AM
> > > Hi Dan,
> > > Unfortunately, it looks like this commit causes a regression ...
> > > With the patch, "ndctl list" shows nothing, and /dev/pmem0 can't appear.
> > > If I revert the patch, it will be back to normal.
> > >
> > > I attached the config/logs. In the bad case, "dmesg" shows a line
> > > [ 5.259017] nd_pmem namespace0.0: 0x0000000000000000, too small
> > > must be at least 0x1000
> > > Any idea why this happens? I'm digging into the details and I appreciate your
> > > insights.
> >
> > Looks like it is working as expected.
>
> I was working on linux-next tree's next-20190107 and this patch did "work fine"
> there. The "regression" happens on djbw/nvdimm.git tree's libnvdimm-pending
> branch because we have this recent commit (Jan 19):
>
> 11189c1089da ("acpi/nfit: Fix command-supported detection"), which makes this
> change in acpi_nfit_ctl():
>
> - if (!test_bit(cmd, &cmd_mask) || !test_bit(func, &dsm_mask))
> + if (cmd == ND_CMD_CALL && !test_bit(func, &dsm_mask))
> + return -ENOTTY;
> + else if (!test_bit(cmd, &cmd_mask))
> return -ENOTTY;
>
> So previously ND_CMD_GET_CONFIG_DATA failed with -ENOTTY and we were good.
>
> Now the command succeeds, but it looks like the returned data is invalid, and I see
> the "regression".

I believe it's the same reason. Without 11189c1089da the _LSR method
will fail; otherwise it works and finds the label that it doesn't
like.

I'm not seeing "invalid" data in your failure log. Could you double
check that it's not just the success of _LSR that causes the issue?

> > The regression you are seeing is the fact that the patch enables the kernel to
> > enable nvdimm-namespace-label reads.
> Yes.
>
> > Those reads find a namespace index block
> > and a label. Unfortunately the label has the LOCAL flag set and Linux
> > explicitly ignores pmem namespace labels with that bit set. The reason
> Can you please point out the function that ignores the flag?
>
> I checked where NSLABEL_FLAG_LOCAL is used, but it looks I can't find a
> related function.

scan_labels() is where the namespace label is validated relative to
the region type:

        if (is_nd_blk(&nd_region->dev)
                        == !!(flags & NSLABEL_FLAG_LOCAL))
                /* pass, region matches label type */;
        else
                continue;

It also has meaning for the namespace capacity allocation
implementation that needed that flag to distinguish aliased capacity
between Block Aperture Mode and PMEM Mode access.

> > for that is due to the fact that the original definition of the LOCAL
> > bit from v1.1 of the namespace label implementation [1] explicitly
> > limited the LOCAL flag to "block aperture" regions. If you clear that
> > LOCAL flag I expect it will work. To my knowledge Windows pretends
> > that the v1.1 definition never existed.
> I'm trying to find out where the flag is used and how to clear it.

Assuming Hyper-V implements _LSW, you can recreate / reinitialize the
label area:

ndctl disable-region all
ndctl init-labels -f all
ndctl enable-region all
ndctl create-namespace
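
(For context: init-labels rewrites the namespace index in the label area, so
this flow depends on the DIMM exposing _LSW for label writes; as it turns out
later in the thread, Hyper-V only implements _LSR, so this path is closed.)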

> > The UEFI 2.7 specification for v1.2 labels states that setting the
> > LOCAL flag is optional when "nlabel", number of labels in the set, is
> > 1. Linux makes that mandatory as LOCAL is redundant when nlabel is 1.
> >
> > That said, the Robustness Principle makes a case that Linux should
> > tolerate the bit being set. However, it's just a non-trivial amount of
> > work to unwind the ingrained block-aperture assumptions of that bit.
> Can you please explain this a bit more? Sorry, I'm new to this area...

The short story is that Linux enforces that LOCAL == Block Mode
Namespaces. See section 2.2 Namespace Label Layout in the original
spec [1]. The EFI 2.7 definition tried to allow for LOCAL to be set
when an interleave-set was comprised of a single NVDIMM, but then also
states its optional when Nlabel is 1. It has zero functional use for
interleave-set based namespaces even when the interleave-set-width is
1. So Linux takes the option to never set it, and goes further to
reject it if it's set and the region-type does not match, because that
follows the v1.1 meaning of the flag.

[1]: http://pmem.io/documents/NVDIMM_Namespace_Spec.pdf

2019-02-02 00:44:16

by Dexuan Cui

Subject: RE: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

> From: Dan Williams <[email protected]>
> Sent: Friday, February 1, 2019 3:47 PM
> To: Dexuan Cui <[email protected]>
>
> I believe it's the same reason. Without 11189c1089da the _LSR method
> > will fail; otherwise it works and finds the label that it doesn't
> like.
Exactly.

> I'm not seeing "invalid" data in your failure log. Could you double
> check that it's not just the success of _LSR that causes the issue?

acpi_label_read() never fails for me.

By "invalid", I only mean the messages in the dmesg.bad.txt I previously
attached (I'm just reading the specs to learn the details about NVDIMM
namespace's labels, so my description might be inaccurate) :

[ 4.832367] nvdimm nmem1: nsindex0 labelsize 1 invalid
[ 4.832369] nvdimm nmem1: nsindex1 labelsize 1 invalid
...
[ 5.259017] nd_pmem namespace0.0: 0x0000000000000000, too small must be at least 0x1000

> > > The regression you are seeing is the fact that the patch enables the kernel to
> > > enable nvdimm-namespace-label reads.
> > Yes.
> >
> > > Those reads find a namespace index block
> > > and a label. Unfortunately the label has the LOCAL flag set and Linux
> > > explicitly ignores pmem namespace labels with that bit set. The reason
> > Can you please point out the function that ignores the flag?
> >
> > I checked where NSLABEL_FLAG_LOCAL is used, but it seems I can't find a
> > related function.
>
> scan_labels() is where the namespace label is validated relative to
> the region type:
>
>         if (is_nd_blk(&nd_region->dev)
>                         == !!(flags & NSLABEL_FLAG_LOCAL))
>                 /* pass, region matches label type */;
>         else
>                 continue;
>
> It also has meaning for the namespace capacity allocation
> implementation that needed that flag to distinguish aliased capacity
> between Block Aperture Mode and PMEM Mode access.
Thanks for the pointer! I'm looking at this function.

> > > for that is due to the fact that the original definition of the LOCAL
> > > bit from v1.1 of the namespace label implementation [1] explicitly
> > > limited the LOCAL flag to "block aperture" regions. If you clear that
> > > LOCAL flag I expect it will work. To my knowledge Windows pretends
> > > that the v1.1 definition never existed.
> > I'm trying to find out where the flag is used and how to clear it.
>
> Assuming Hyper-V implements _LSW, you can recreate / reinitialize the
> label area:

I think Hyper-V only implements _LSR:
[ 4.720623] nfit ACPI0012:00: device:00: has _LSR
[ 4.723683] nfit ACPI0012:00: device:01: has _LSR

> > > The UEFI 2.7 specification for v1.2 labels states that setting the
> > > LOCAL flag is optional when "nlabel", number of labels in the set, is
> > > 1. Linux makes that mandatory as LOCAL is redundant when nlabel is 1.
> > >
> > > That said, the Robustness Principle makes a case that Linux should
> > > tolerate the bit being set. However, it's just a non-trivial amount of
> > > work to unwind the ingrained block-aperture assumptions of that bit.
> > Can you please explain this a bit more? Sorry, I'm new to this area...
>
> The short story is that Linux enforces that LOCAL == Block Mode
> Namespaces. See section 2.2 Namespace Label Layout in the original
> spec [1]. The EFI 2.7 definition tried to allow for LOCAL to be set
> when an interleave-set was comprised of a single NVDIMM, but then also
> states its optional when Nlabel is 1. It has zero functional use for
> interleave-set based namespaces even when the interleave-set-width is
> 1. So Linux takes the option to never set it, and goes further to
> reject it if it's set and the region-type does not match, because that
> follows the v1.1 meaning of the flag.
>
> [1]:
Thanks for the link! I'll read it.
BTW, it looks like Hyper-V only supports PMEM namespaces, at least so far.

Thanks,
-- Dexuan

2019-02-02 00:50:26

by Dan Williams

Subject: Re: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

On Fri, Feb 1, 2019 at 4:34 PM Dexuan Cui <[email protected]> wrote:
>
> > From: Dan Williams <[email protected]>
> > Sent: Friday, February 1, 2019 3:47 PM
> > To: Dexuan Cui <[email protected]>
> >
> > I believe it's the same reason. Without 11189c1089da the _LSR method
> > will fail; otherwise it works and finds the label that it doesn't
> > like.
> Exactly.
>
> > I'm not seeing "invalid" data in your failure log. Could you double
> > check that it's not just the success of _LSR that causes the issue?
>
> acpi_label_read() never fails for me.
>
> By "invalid", I only mean the messages in the dmesg.bad.txt I previously
> attached (I'm just reading the specs to learn the details about NVDIMM
> namespace's labels, so my description might be inaccurate) :
>
> [ 4.832367] nvdimm nmem1: nsindex0 labelsize 1 invalid
> [ 4.832369] nvdimm nmem1: nsindex1 labelsize 1 invalid

Oh, those are benign. They are a side effect of Linux probing for v1.2
namespace labels vs v1.1. It will always find that one of those is
"invalid".

> ...
> [ 5.259017] nd_pmem namespace0.0: 0x0000000000000000, too small must be at least 0x1000
>
> > > > The regression you are seeing is the fact that the patch enables the kernel to
> > > > enable nvdimm-namespace-label reads.
> > > Yes.
> > >
> > > > Those reads find a namespace index block
> > > > and a label. Unfortunately the label has the LOCAL flag set and Linux
> > > > explicitly ignores pmem namespace labels with that bit set. The reason
> > > Can you please point out the function that ignores the flag?
> > >
> > > I checked where NSLABEL_FLAG_LOCAL is used, but it seems I can't find a
> > > related function.
> >
> > scan_labels() is where the namespace label is validated relative to
> > the region type:
> >
> >         if (is_nd_blk(&nd_region->dev)
> >                         == !!(flags & NSLABEL_FLAG_LOCAL))
> >                 /* pass, region matches label type */;
> >         else
> >                 continue;
> >
> > It also has meaning for the namespace capacity allocation
> > implementation that needed that flag to distinguish aliased capacity
> > between Block Aperture Mode and PMEM Mode access.
> Thanks for the pointer! I'm looking at this function.
>
> > > > for that is due to the fact that the original definition of the LOCAL
> > > > bit from v1.1 of the namespace label implementation [1] explicitly
> > > > limited the LOCAL flag to "block aperture" regions. If you clear that
> > > > LOCAL flag I expect it will work. To my knowledge Windows pretends
> > > > that the v1.1 definition never existed.
> > > I'm trying to find out where the flag is used and how to clear it.
> >
> > Assuming Hyper-V implements _LSW, you can recreate / reinitialize the
> > label area:
>
> I think Hyper-V only implements _LSR:
> [ 4.720623] nfit ACPI0012:00: device:00: has _LSR
> [ 4.723683] nfit ACPI0012:00: device:01: has _LSR

That's unfortunate...

>
> > > > The UEFI 2.7 specification for v1.2 labels states that setting the
> > > > LOCAL flag is optional when "nlabel", number of labels in the set, is
> > > > 1. Linux makes that mandatory as LOCAL is redundant when nlabel is 1.
> > > >
> > > > That said, the Robustness Principle makes a case that Linux should
> > > > tolerate the bit being set. However, it's just a non-trivial amount of
> > > > work to unwind the ingrained block-aperture assumptions of that bit.
> > > Can you please explain this a bit more? Sorry, I'm new to this area...
> >
> > The short story is that Linux enforces that LOCAL == Block Mode
> > Namespaces. See section 2.2 Namespace Label Layout in the original
> > spec [1]. The EFI 2.7 definition tried to allow for LOCAL to be set
> > when an interleave-set was comprised of a single NVDIMM, but then also
> > states its optional when Nlabel is 1. It has zero functional use for
> > interleave-set based namespaces even when the interleave-set-width is
> > 1. So Linux takes the option to never set it, and goes further to
> > reject it if it's set and the region-type does not match, because that
> > follows the v1.1 meaning of the flag.
> >
> > [1]:
> Thanks for the link! I'll read it.
> BTW, it looks like Hyper-V only supports PMEM namespaces, at least so far.

I don't think it should bother. It only makes sense for bare metal and
even then I know of no NVDIMMs that are shipping it.

2019-02-02 01:08:47

by Dexuan Cui

Subject: RE: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

> From: Linux-nvdimm <[email protected]> On Behalf Of
> Dexuan Cui
> Sent: Friday, February 1, 2019 4:34 PM
> > > > ...
> > > > Those reads find a namespace index block
> > > > and a label. Unfortunately the label has the LOCAL flag set and Linux
> > > > explicitly ignores pmem namespace labels with that bit set. The reason
> > > > for that is due to the fact that the original definition of the LOCAL
> > > > bit from v1.1 of the namespace label implementation [1] explicitly
> > > > limited the LOCAL flag to "block aperture" regions. If you clear that
> > > > LOCAL flag I expect it will work. To my knowledge Windows pretends
> > > > that the v1.1 definition never existed.

On the libnvdimm-pending branch, I get this:

root@decui-gen2-u1904:~/nvdimm# ndctl list
root@decui-gen2-u1904:~/nvdimm# ndctl list --idle
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":0,
"uuid":"00000000-0000-0000-0000-000000000000",
"state":"disabled"
},
{
"dev":"namespace0.0",
"mode":"raw",
"size":0,
"uuid":"00000000-0000-0000-0000-000000000000",
"state":"disabled"
}
]

With the patch that clears the LOCAL flag (BTW, the initial value of flags is 0x3,
meaning a read-only local label):
@@ -2496,6 +2500,7 @@ static int init_active_labels(struct nd_region *nd_region)
if (!label_ent)
break;
label = nd_label_active(ndd, j);
+ label->flags &= ~NSLABEL_FLAG_LOCAL;
label_ent->label = label;

I get this:

root@decui-gen2-u1904:~/nvdimm# ndctl list
root@decui-gen2-u1904:~/nvdimm# ndctl list --idle
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":0,
"uuid":"c258aaab-f72b-e546-bfa5-be5e07761dbc",
"state":"disabled",
"name":"Microsoft Hyper-V NVDIMM 1 Label"
},
{
"dev":"namespace0.0",
"mode":"raw",
"size":0,
"uuid":"9f0497a7-4453-7c40-ad35-21a791e00345",
"state":"disabled",
"name":"Microsoft Hyper-V NVDIMM 0 Label"
}
]

The "size" and "mode" still don't look right, but the improvement is that
now I can see a good descriptive "name", which I suppose is retrieved
from Hyper-V.

With Ubuntu 19.04 (4.18.0-11-generic), I get this:
(Note: the "mode" and "size" are correct. The "uuid" is different from
the above "9f0497a7-4453-7c40-ad35-21a791e00345" -- this is weird.)

root@decui-gen2-u1904:~# ndctl list
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":137438953472,
"blockdev":"pmem1"
},
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":33820770304,
"uuid":"35018886-397e-4fe7-a348-0a4d16eec44d",
"blockdev":"pmem0"
}
]

I'm trying to find out the correct solution. I appreciate your insights!

Thanks,
-- Dexuan

2019-02-02 01:39:03

by Dan Williams

Subject: Re: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

On Fri, Feb 1, 2019 at 5:06 PM Dexuan Cui <[email protected]> wrote:
>
> > From: Linux-nvdimm <[email protected]> On Behalf Of
> > Dexuan Cui
> > Sent: Friday, February 1, 2019 4:34 PM
> > > > > ...
> > > > > Those reads find a namespace index block
> > > > > and a label. Unfortunately the label has the LOCAL flag set and Linux
> > > > > explicitly ignores pmem namespace labels with that bit set. The reason
> > > > > for that is due to the fact that the original definition of the LOCAL
> > > > > bit from v1.1 of the namespace label implementation [1] explicitly
> > > > > limited the LOCAL flag to "block aperture" regions. If you clear that
> > > > > LOCAL flag I expect it will work. To my knowledge Windows pretends
> > > > > that the v1.1 definition never existed.
>
> On the libnvdimm-pending branch, I get this:
>
> root@decui-gen2-u1904:~/nvdimm# ndctl list
> root@decui-gen2-u1904:~/nvdimm# ndctl list --idle
> [
> {
> "dev":"namespace1.0",
> "mode":"raw",
> "size":0,
> "uuid":"00000000-0000-0000-0000-000000000000",
> "state":"disabled"
> },
> {
> "dev":"namespace0.0",
> "mode":"raw",
> "size":0,
> "uuid":"00000000-0000-0000-0000-000000000000",
> "state":"disabled"
> }
> ]
>
> With the patch that clears the LOCAL flag (BTW, the initial value of flags is 0x3,
> meaning a read-only local label):
> @@ -2496,6 +2500,7 @@ static int init_active_labels(struct nd_region *nd_region)
> if (!label_ent)
> break;
> label = nd_label_active(ndd, j);
> + label->flags &= ~NSLABEL_FLAG_LOCAL;
> label_ent->label = label;
>
> I get this:
>
> root@decui-gen2-u1904:~/nvdimm# ndctl list
> root@decui-gen2-u1904:~/nvdimm# ndctl list --idle
> [
> {
> "dev":"namespace1.0",
> "mode":"raw",
> "size":0,
> "uuid":"c258aaab-f72b-e546-bfa5-be5e07761dbc",
> "state":"disabled",
> "name":"Microsoft Hyper-V NVDIMM 1 Label"
> },
> {
> "dev":"namespace0.0",
> "mode":"raw",
> "size":0,
> "uuid":"9f0497a7-4453-7c40-ad35-21a791e00345",
> "state":"disabled",
> "name":"Microsoft Hyper-V NVDIMM 0 Label"
> }
> ]
>
> The "size" and "mode" still don't look right, but the improvement is that
> now I can see a good descriptive "name", which I suppose is retrieved
> from Hyper-V.

Mode is right; there is no way for Hyper-V to create Linux fsdax mode
namespaces, as that requires some setup using variables only Linux knows.
Can you send the output of:

ndctl read-labels -j all

>
> With Ubuntu 19.04 (4.18.0-11-generic), I get this:
> (Note: the "mode" and "size" are correct. The "uuid" is different from
> the above "9f0497a7-4453-7c40-ad35-21a791e00345" -- this is weird.)
>
> root@decui-gen2-u1904:~# ndctl list
> [
> {
> "dev":"namespace1.0",
> "mode":"raw",
> "size":137438953472,
> "blockdev":"pmem1"
> },
> {
> "dev":"namespace0.0",
> "mode":"fsdax",
> "map":"dev",
> "size":33820770304,
> "uuid":"35018886-397e-4fe7-a348-0a4d16eec44d",
> "blockdev":"pmem0"
> }
> ]

This is because the Ubuntu kernel has the bug that causes _LSR to fail
so Linux falls back to a namespace defined by the region boundary. On
that namespace there is an "fsdax" info block located at the region
base +4K. That info block is tagged with the uuid of
"35018886-397e-4fe7-a348-0a4d16eec44d".

> I'm trying to find out the correct solution. I appreciate your insights!

It's a mess. First we need to figure out whether the label is actually
specifying a size of zero, or there is some other bug.

However, the next problem is going to be adding "fsdax" mode support
on top of the read-only defined namespaces. The ndctl reconfiguration
flow:

ndctl create-namespace -e namespace0.0 -m fsdax -f

...will likely fail because deleting the previous namespace in the
labels is part of that flow, which assumes that labels are writable.

Honestly, the quickest path to something functional for Linux is to
simply delete the _LSR support and use raw mode defined namespaces.
Why have labels if they are read-only and the region is sufficient for
defining boundaries?

2019-02-02 02:18:37

by Dexuan Cui

Subject: RE: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

> From: Dan Williams <[email protected]>
> Sent: Friday, February 1, 2019 5:29 PM
> > ...
> > The "size" and "mode" still don't look right, but the improvement is that
> > now I can see a good descriptive "name", which I suppose is retrieved
> > from Hyper-V.
>
> Mode is right; there is no way for Hyper-V to create Linux fsdax mode
> namespaces, as that requires some setup using variables only Linux knows.
> Can you send the output of:
>
> ndctl read-labels -j all

The output is from a kernel built with the libnvdimm-pending branch plus
the one-line patch (label->flags &= ~NSLABEL_FLAG_LOCAL) in
init_active_labels():

root@decui-gen2-u1904:~# ndctl read-labels -j all
[
{
"dev":"nmem1",
"index":[
{
"signature":"NAMESPACE_INDEX",
"major":1,
"minor":2,
"labelsize":256,
"seq":1,
"nslot":2
},
{
"signature":"NAMESPACE_INDEX",
"major":1,
"minor":2,
"labelsize":256,
"seq":2,
"nslot":2
}
],
"label":[
{
"uuid":"c258aaab-f72b-e546-bfa5-be5e07761dbc",
"name":"Microsoft Hyper-V NVDIMM 1 Label",
"slot":0,
"position":0,
"nlabel":1,
"isetcookie":708891662257476870,
"lbasize":0,
"dpa":0,
"rawsize":137438953472,
"type_guid":"79d3f066-f3b4-7440-ac43-0d3318b78cdb",
"abstraction_guid":"00000000-0000-0000-0000-000000000000"
}
]
},
{
"dev":"nmem0",
"index":[
{
"signature":"NAMESPACE_INDEX",
"major":1,
"minor":2,
"labelsize":256,
"seq":1,
"nslot":2
},
{
"signature":"NAMESPACE_INDEX",
"major":1,
"minor":2,
"labelsize":256,
"seq":2,
"nslot":2
}
],
"label":[
{
"uuid":"9f0497a7-4453-7c40-ad35-21a791e00345",
"name":"Microsoft Hyper-V NVDIMM 0 Label",
"slot":0,
"position":0,
"nlabel":1,
"isetcookie":708891619307803909,
"lbasize":0,
"dpa":0,
"rawsize":34359738368,
"type_guid":"79d3f066-f3b4-7440-ac43-0d3318b78cdb",
"abstraction_guid":"00000000-0000-0000-0000-000000000000"
}
]
}
]
read 2 nmems

> > With Ubuntu 19.04 (4.18.0-11-generic), I get this:
> > (Note: the "mode" and "size" are correct. The "uuid" is different from
> > the above "9f0497a7-4453-7c40-ad35-21a791e00345" -- this is weird.)
> >
> > root@decui-gen2-u1904:~# ndctl list
> > [
> > {
> > "dev":"namespace1.0",
> > "mode":"raw",
> > "size":137438953472,
> > "blockdev":"pmem1"
> > },
> > {
> > "dev":"namespace0.0",
> > "mode":"fsdax",
> > "map":"dev",
> > "size":33820770304,
> > "uuid":"35018886-397e-4fe7-a348-0a4d16eec44d",
> > "blockdev":"pmem0"
> > }
> > ]
>
> This is because the Ubuntu kernel has the bug that causes _LSR to fail
> so Linux falls back to a namespace defined by the region boundary. On
> that namespace there is an "fsdax" info block located at the region
> base +4K. That info block is tagged with the uuid of
> "35018886-397e-4fe7-a348-0a4d16eec44d".
Thanks for the explanation!

> > I'm trying to find out the correct solution. I appreciate your insights!
>
> It's a mess. First we need to figure out whether the label is actually
> specifying a size of zero, or there is some other bug.
I agree.

> However, the next problem is going to be adding "fsdax" mode support
> on top of the read-only defined namespaces. The ndctl reconfiguration
> flow:
>
> ndctl create-namespace -e namespace0.0 -m fsdax -f

>
> ...will likely fail because deleting the previous namespace in the
> labels is part of that flow, which assumes that labels are writable.
>
> Honestly, the quickest path to something functional for Linux is to
> simply delete the _LSR support and use raw mode defined namespaces.
> Why have labels if they are read-only and the region is sufficient for
> defining boundaries?

Just now the Hyper-V team confirmed that _LSW is not supported.

But with Ubuntu 19.04 kernel (4.18.0-11-generic), I'm able to run the commands
without any issue:

root@decui-gen2-u1904:~# ndctl list
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":137438953472,
"blockdev":"pmem1"
},
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":33820770304,
"uuid":"35018886-397e-4fe7-a348-0a4d16eec44d",
"blockdev":"pmem0"
}
]

root@decui-gen2-u1904:~# ndctl destroy-namespace "namespace0.0"
Error: namespace0.0 is active, specify --force for re-configuration

error destroying namespaces: Device or resource busy
destroyed 0 namespaces

root@decui-gen2-u1904:~# ndctl destroy-namespace "namespace0.0" --force
destroyed 1 namespace

root@decui-gen2-u1904:~# ndctl list
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":137438953472,
"blockdev":"pmem1"
}
]

root@decui-gen2-u1904:~# ndctl create-namespace -e namespace0.0 -m fsdax -f
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":"31.50 GiB (33.82 GB)",
"uuid":"9e4e819b-e2eb-4796-8f9e-15f96f63b5c2",
"sector_size":512,
"blockdev":"pmem0",
"numa_node":1
}

root@decui-gen2-u1904:~# ndctl list
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":137438953472,
"blockdev":"pmem1"
},
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":33820770304,
"uuid":"9e4e819b-e2eb-4796-8f9e-15f96f63b5c2",
"blockdev":"pmem0"
}
]


The above commands can also run fine with an upstream kernel that
doesn't have
11189c1089da ("acpi/nfit: Fix command-supported detection")
or
1194c4133195 ("nfit: Add Hyper-V NVDIMM DSM command set to white list")

Thanks
-- Dexuan

2019-02-02 03:34:55

by Dexuan Cui

Subject: RE: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

> From: Dan Williams <[email protected]>
> Sent: Friday, February 1, 2019 5:29 PM
> ...
> Honestly, the quickest path to something functional for Linux is to
> simply delete the _LSR support and use raw mode defined namespaces.
> Why have labels if they are read-only and the region is sufficient for
> defining boundaries?
Hyper-V Virtual NVDIMM can already work with an Ubuntu 19.04 virtual machine
running on Hyper-V, i.e. I can create a raw or fsdax namespace, create an
xfs or ext4 file system in /dev/pmem0p1, and mount the file system with and
without "-o dax". The basic functionality is good.

My recent work is mainly for ndctl support, i.e. getting the health info via ndctl
and using ndctl to monitor the error events (if applicable).

Thanks,
-- Dexuan

2019-02-02 05:28:30

by Dan Williams

Subject: Re: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

On Fri, Feb 1, 2019 at 7:32 PM Dexuan Cui <[email protected]> wrote:
>
> > From: Dan Williams <[email protected]>
> > Sent: Friday, February 1, 2019 5:29 PM
> > ...
> > Honestly, the quickest path to something functional for Linux is to
> > simply delete the _LSR support and use raw mode defined namespaces.
> > Why have labels if they are read-only and the region is sufficient for
> > defining boundaries?
> Hyper-V Virtual NVDIMM can already work with an Ubuntu 19.04 virtual machine
> running on Hyper-V, i.e. I can create a raw or fsdax namespace, create an
> xfs or ext4 file system in /dev/pmem0p1, and mount the file system with and
> without "-o dax". The basic functionality is good.

Only in label-less mode apparently.

> My recent work is mainly for ndctl support, i.e. getting the health info via ndctl
> and using ndctl to monitor the error events (if applicable).

Right, but the recent fixes have exposed Linux to a labelled namespace
implementation that violates some of the driver's basic assumptions.

To preserve the level of operation you're currently seeing Linux might
need to add a hyper-v-specific quirk to force label-less operation.

2019-02-02 05:28:36

by Dan Williams

Subject: Re: [PATCH v2] nfit: add Hyper-V NVDIMM DSM command set to white list

On Fri, Feb 1, 2019 at 6:17 PM Dexuan Cui <[email protected]> wrote:
> > From: Dan Williams <[email protected]>
[..]
> > Honestly, the quickest path to something functional for Linux is to
> > simply delete the _LSR support and use raw mode defined namespaces.
> > Why have labels if they are read-only and the region is sufficient for
> > defining boundaries?
>
> Just now the Hyper-V team confirmed that _LSW is not supported.
>
> But with Ubuntu 19.04 kernel (4.18.0-11-generic), I'm able to run the commands
> without any issue:

This is because Ubuntu is running in "label-less" mode. So all the
writes it is performing for namespace reconfiguration are just going
to the data-space of the region, not the labels.

2019-02-03 01:26:50

by Dan Williams

Subject: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

As Dexuan reports, the NVDIMM_FAMILY_HYPERV platform is incompatible with
the existing Linux namespace implementation because it uses
NSLABEL_FLAG_LOCAL for x1-width PMEM interleave sets. Quirk it as a
platform / DIMM that does not provide BLK-aperture access. Allow the
libnvdimm core to assume no potential for aliasing. In case other
implementations make the same mistake, provide a "noblk" module
parameter to force-enable the quirk.

Link: https://lkml.kernel.org/r/PU1P153MB0169977604493B82B662A01CBF920@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM
Reported-by: Dexuan Cui <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 4 ++++
drivers/nvdimm/dimm_devs.c | 7 ++++++-
drivers/nvdimm/label.c | 3 +++
drivers/nvdimm/namespace_devs.c | 6 ++++++
drivers/nvdimm/region_devs.c | 7 +++++++
include/linux/libnvdimm.h | 2 ++
6 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 4a7e8b1fa43b..811c399a3a76 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2016,6 +2016,10 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
cmd_mask |= nfit_mem->dsm_mask & NVDIMM_STANDARD_CMDMASK;
}

+ /* Quirk to ignore LOCAL for labels on HYPERV DIMMs */
+ if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
+ set_bit(NDD_NOBLK, &flags);
+
if (test_bit(NFIT_MEM_LSR, &nfit_mem->flags)) {
set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 4890310df874..186d63f16434 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -11,6 +11,7 @@
* General Public License for more details.
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/moduleparam.h>
#include <linux/vmalloc.h>
#include <linux/device.h>
#include <linux/ndctl.h>
@@ -25,6 +26,10 @@

static DEFINE_IDA(dimm_ida);

+static bool noblk;
+module_param(noblk, bool, 0444);
+MODULE_PARM_DESC(noblk, "force disable BLK / local alias support");
+
/*
* Retrieve bus and dimm handle and return if this bus supports
* get_config_data commands
@@ -551,7 +556,7 @@ struct nvdimm *__nvdimm_create(struct nvdimm_bus *nvdimm_bus,

nvdimm->dimm_id = dimm_id;
nvdimm->provider_data = provider_data;
- nvdimm->flags = flags;
+ nvdimm->flags = flags | (noblk ? (1 << NDD_NOBLK) : 0);
nvdimm->cmd_mask = cmd_mask;
nvdimm->num_flush = num_flush;
nvdimm->flush_wpq = flush_wpq;
diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index 6d6e9a12150b..f3d753d3169c 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -392,6 +392,7 @@ int nd_label_reserve_dpa(struct nvdimm_drvdata *ndd)
return 0; /* no label, nothing to reserve */

for_each_clear_bit_le(slot, free, nslot) {
+ struct nvdimm *nvdimm = to_nvdimm(ndd->dev);
struct nd_namespace_label *nd_label;
struct nd_region *nd_region = NULL;
u8 label_uuid[NSLABEL_UUID_LEN];
@@ -406,6 +407,8 @@ int nd_label_reserve_dpa(struct nvdimm_drvdata *ndd)

memcpy(label_uuid, nd_label->uuid, NSLABEL_UUID_LEN);
flags = __le32_to_cpu(nd_label->flags);
+ if (test_bit(NDD_NOBLK, &nvdimm->flags))
+ flags &= ~NSLABEL_FLAG_LOCAL;
nd_label_gen_id(&label_id, label_uuid, flags);
res = nvdimm_allocate_dpa(ndd, &label_id,
__le64_to_cpu(nd_label->dpa),
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 4b077555ac70..3677b0c4a33d 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -2492,6 +2492,12 @@ static int init_active_labels(struct nd_region *nd_region)
if (!label_ent)
break;
label = nd_label_active(ndd, j);
+ if (test_bit(NDD_NOBLK, &nvdimm->flags)) {
+ u32 flags = __le32_to_cpu(label->flags);
+
+ flags &= ~NSLABEL_FLAG_LOCAL;
+ label->flags = __cpu_to_le32(flags);
+ }
label_ent->label = label;

mutex_lock(&nd_mapping->lock);
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index e2818f94f292..3b58baa44b5c 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1003,6 +1003,13 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,

if (test_bit(NDD_UNARMED, &nvdimm->flags))
ro = 1;
+
+ if (test_bit(NDD_NOBLK, &nvdimm->flags)
+ && dev_type == &nd_blk_device_type) {
+ dev_err(&nvdimm_bus->dev, "%s: %s mapping%d is not BLK capable\n",
+ caller, dev_name(&nvdimm->dev), i);
+ return NULL;
+ }
}

if (dev_type == &nd_blk_device_type) {
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 5440f11b0907..7da406ae3a2b 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -42,6 +42,8 @@ enum {
NDD_SECURITY_OVERWRITE = 3,
/* tracking whether or not there is a pending device reference */
NDD_WORK_PENDING = 4,
+ /* ignore / filter NSLABEL_FLAG_LOCAL for this DIMM, i.e. no aliasing */
+ NDD_NOBLK = 5,

/* need to set a limit somewhere, but yes, this is likely overkill */
ND_IOCTL_MAX_BUFLEN = SZ_4M,
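
A note on the quirk's escape hatch above: module_param(noblk, ...) lives in
dimm_devs.c, which is built into the libnvdimm module, so force-enabling the
quirk would presumably look like booting with:

	libnvdimm.noblk=1

on the kernel command line, or "modprobe libnvdimm noblk=1" when libnvdimm is
built as a module. The exact spelling is an assumption from the usual
module_param() naming convention, not something stated in this thread.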


2019-02-03 14:34:51

by kernel test robot

Subject: Re: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

Hi Dan,

I love your patch! Yet something to improve:

[auto build test ERROR on linux-nvdimm/libnvdimm-for-next]
[also build test ERROR on v5.0-rc4 next-20190201]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Dan-Williams/libnvdimm-dimm-Add-a-no-BLK-quirk-based-on-NVDIMM-family/20190203-213444
base: https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: x86_64-randconfig-x018-201905 (attached as .config)
compiler: gcc-8 (Debian 8.2.0-14) 8.2.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All errors (new ones prefixed by >>):

drivers/acpi//nfit/core.c: In function 'acpi_nfit_register_dimms':
>> drivers/acpi//nfit/core.c:2003:27: error: 'NVDIMM_FAMILY_HYPERV' undeclared (first use in this function); did you mean 'NVDIMM_FAMILY_HPE2'?
if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
^~~~~~~~~~~~~~~~~~~~
NVDIMM_FAMILY_HPE2
drivers/acpi//nfit/core.c:2003:27: note: each undeclared identifier is reported only once for each function it appears in

vim +2003 drivers/acpi//nfit/core.c

1944
1945 static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
1946 {
1947 struct nfit_mem *nfit_mem;
1948 int dimm_count = 0, rc;
1949 struct nvdimm *nvdimm;
1950
1951 list_for_each_entry(nfit_mem, &acpi_desc->dimms, list) {
1952 struct acpi_nfit_flush_address *flush;
1953 unsigned long flags = 0, cmd_mask;
1954 struct nfit_memdev *nfit_memdev;
1955 u32 device_handle;
1956 u16 mem_flags;
1957
1958 device_handle = __to_nfit_memdev(nfit_mem)->device_handle;
1959 nvdimm = acpi_nfit_dimm_by_handle(acpi_desc, device_handle);
1960 if (nvdimm) {
1961 dimm_count++;
1962 continue;
1963 }
1964
1965 if (nfit_mem->bdw && nfit_mem->memdev_pmem)
1966 set_bit(NDD_ALIASING, &flags);
1967
1968 /* collate flags across all memdevs for this dimm */
1969 list_for_each_entry(nfit_memdev, &acpi_desc->memdevs, list) {
1970 struct acpi_nfit_memory_map *dimm_memdev;
1971
1972 dimm_memdev = __to_nfit_memdev(nfit_mem);
1973 if (dimm_memdev->device_handle
1974 != nfit_memdev->memdev->device_handle)
1975 continue;
1976 dimm_memdev->flags |= nfit_memdev->memdev->flags;
1977 }
1978
1979 mem_flags = __to_nfit_memdev(nfit_mem)->flags;
1980 if (mem_flags & ACPI_NFIT_MEM_NOT_ARMED)
1981 set_bit(NDD_UNARMED, &flags);
1982
1983 rc = acpi_nfit_add_dimm(acpi_desc, nfit_mem, device_handle);
1984 if (rc)
1985 continue;
1986
1987 /*
1988 * TODO: provide translation for non-NVDIMM_FAMILY_INTEL
1989 * devices (i.e. from nd_cmd to acpi_dsm) to standardize the
1990 * userspace interface.
1991 */
1992 cmd_mask = 1UL << ND_CMD_CALL;
1993 if (nfit_mem->family == NVDIMM_FAMILY_INTEL) {
1994 /*
1995 * These commands have a 1:1 correspondence
1996 * between DSM payload and libnvdimm ioctl
1997 * payload format.
1998 */
1999 cmd_mask |= nfit_mem->dsm_mask & NVDIMM_STANDARD_CMDMASK;
2000 }
2001
2002 /* Quirk to ignore LOCAL for labels on HYPERV DIMMs */
> 2003 if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
2004 set_bit(NDD_NOBLK, &flags);
2005
2006 if (test_bit(NFIT_MEM_LSR, &nfit_mem->flags)) {
2007 set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
2008 set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
2009 }
2010 if (test_bit(NFIT_MEM_LSW, &nfit_mem->flags))
2011 set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
2012
2013 flush = nfit_mem->nfit_flush ? nfit_mem->nfit_flush->flush
2014 : NULL;
2015 nvdimm = __nvdimm_create(acpi_desc->nvdimm_bus, nfit_mem,
2016 acpi_nfit_dimm_attribute_groups,
2017 flags, cmd_mask, flush ? flush->hint_count : 0,
2018 nfit_mem->flush_wpq, &nfit_mem->id[0],
2019 acpi_nfit_get_security_ops(nfit_mem->family));
2020 if (!nvdimm)
2021 return -ENOMEM;
2022
2023 nfit_mem->nvdimm = nvdimm;
2024 dimm_count++;
2025
2026 if ((mem_flags & ACPI_NFIT_MEM_FAILED_MASK) == 0)
2027 continue;
2028
2029 dev_info(acpi_desc->dev, "%s flags:%s%s%s%s%s\n",
2030 nvdimm_name(nvdimm),
2031 mem_flags & ACPI_NFIT_MEM_SAVE_FAILED ? " save_fail" : "",
2032 mem_flags & ACPI_NFIT_MEM_RESTORE_FAILED ? " restore_fail":"",
2033 mem_flags & ACPI_NFIT_MEM_FLUSH_FAILED ? " flush_fail" : "",
2034 mem_flags & ACPI_NFIT_MEM_NOT_ARMED ? " not_armed" : "",
2035 mem_flags & ACPI_NFIT_MEM_MAP_FAILED ? " map_fail" : "");
2036
2037 }
2038
2039 rc = nvdimm_bus_check_dimm_count(acpi_desc->nvdimm_bus, dimm_count);
2040 if (rc)
2041 return rc;
2042
2043 /*
2044 * Now that dimms are successfully registered, and async registration
2045 * is flushed, attempt to enable event notification.
2046 */
2047 list_for_each_entry(nfit_mem, &acpi_desc->dimms, list) {
2048 struct kernfs_node *nfit_kernfs;
2049
2050 nvdimm = nfit_mem->nvdimm;
2051 if (!nvdimm)
2052 continue;
2053
2054 rc = nvdimm_security_setup_events(nvdimm);
2055 if (rc < 0)
2056 dev_warn(acpi_desc->dev,
2057 "security event setup failed: %d\n", rc);
2058
2059 nfit_kernfs = sysfs_get_dirent(nvdimm_kobj(nvdimm)->sd, "nfit");
2060 if (nfit_kernfs)
2061 nfit_mem->flags_attr = sysfs_get_dirent(nfit_kernfs,
2062 "flags");
2063 sysfs_put(nfit_kernfs);
2064 if (!nfit_mem->flags_attr)
2065 dev_warn(acpi_desc->dev, "%s: notifications disabled\n",
2066 nvdimm_name(nvdimm));
2067 }
2068
2069 return devm_add_action_or_reset(acpi_desc->dev, shutdown_dimm_notify,
2070 acpi_desc);
2071 }
2072

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
.config.gz (29.86 kB)

2019-02-03 16:50:23

by kernel test robot

Subject: Re: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

Hi Dan,

I love your patch! Yet something to improve:

[auto build test ERROR on linux-nvdimm/libnvdimm-for-next]
[also build test ERROR on v5.0-rc4 next-20190201]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Dan-Williams/libnvdimm-dimm-Add-a-no-BLK-quirk-based-on-NVDIMM-family/20190203-213444
base: https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-8 (Debian 8.2.0-14) 8.2.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All errors (new ones prefixed by >>):

drivers/acpi/nfit/core.c: In function 'acpi_nfit_register_dimms':
>> drivers/acpi/nfit/core.c:2003:27: error: 'NVDIMM_FAMILY_HYPERV' undeclared (first use in this function); did you mean 'NVDIMM_FAMILY_HPE1'?
if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
^~~~~~~~~~~~~~~~~~~~
NVDIMM_FAMILY_HPE1
drivers/acpi/nfit/core.c:2003:27: note: each undeclared identifier is reported only once for each function it appears in

vim +2003 drivers/acpi/nfit/core.c

1944
1945 static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
1946 {
1947 struct nfit_mem *nfit_mem;
1948 int dimm_count = 0, rc;
1949 struct nvdimm *nvdimm;
1950
1951 list_for_each_entry(nfit_mem, &acpi_desc->dimms, list) {
1952 struct acpi_nfit_flush_address *flush;
1953 unsigned long flags = 0, cmd_mask;
1954 struct nfit_memdev *nfit_memdev;
1955 u32 device_handle;
1956 u16 mem_flags;
1957
1958 device_handle = __to_nfit_memdev(nfit_mem)->device_handle;
1959 nvdimm = acpi_nfit_dimm_by_handle(acpi_desc, device_handle);
1960 if (nvdimm) {
1961 dimm_count++;
1962 continue;
1963 }
1964
1965 if (nfit_mem->bdw && nfit_mem->memdev_pmem)
1966 set_bit(NDD_ALIASING, &flags);
1967
1968 /* collate flags across all memdevs for this dimm */
1969 list_for_each_entry(nfit_memdev, &acpi_desc->memdevs, list) {
1970 struct acpi_nfit_memory_map *dimm_memdev;
1971
1972 dimm_memdev = __to_nfit_memdev(nfit_mem);
1973 if (dimm_memdev->device_handle
1974 != nfit_memdev->memdev->device_handle)
1975 continue;
1976 dimm_memdev->flags |= nfit_memdev->memdev->flags;
1977 }
1978
1979 mem_flags = __to_nfit_memdev(nfit_mem)->flags;
1980 if (mem_flags & ACPI_NFIT_MEM_NOT_ARMED)
1981 set_bit(NDD_UNARMED, &flags);
1982
1983 rc = acpi_nfit_add_dimm(acpi_desc, nfit_mem, device_handle);
1984 if (rc)
1985 continue;
1986
1987 /*
1988 * TODO: provide translation for non-NVDIMM_FAMILY_INTEL
1989 * devices (i.e. from nd_cmd to acpi_dsm) to standardize the
1990 * userspace interface.
1991 */
1992 cmd_mask = 1UL << ND_CMD_CALL;
1993 if (nfit_mem->family == NVDIMM_FAMILY_INTEL) {
1994 /*
1995 * These commands have a 1:1 correspondence
1996 * between DSM payload and libnvdimm ioctl
1997 * payload format.
1998 */
1999 cmd_mask |= nfit_mem->dsm_mask & NVDIMM_STANDARD_CMDMASK;
2000 }
2001
2002 /* Quirk to ignore LOCAL for labels on HYPERV DIMMs */
> 2003 if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
2004 set_bit(NDD_NOBLK, &flags);
2005
2006 if (test_bit(NFIT_MEM_LSR, &nfit_mem->flags)) {
2007 set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
2008 set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
2009 }
2010 if (test_bit(NFIT_MEM_LSW, &nfit_mem->flags))
2011 set_bit(ND_CMD_SET_CONFIG_DATA, &cmd_mask);
2012
2013 flush = nfit_mem->nfit_flush ? nfit_mem->nfit_flush->flush
2014 : NULL;
2015 nvdimm = __nvdimm_create(acpi_desc->nvdimm_bus, nfit_mem,
2016 acpi_nfit_dimm_attribute_groups,
2017 flags, cmd_mask, flush ? flush->hint_count : 0,
2018 nfit_mem->flush_wpq, &nfit_mem->id[0],
2019 acpi_nfit_get_security_ops(nfit_mem->family));
2020 if (!nvdimm)
2021 return -ENOMEM;
2022
2023 nfit_mem->nvdimm = nvdimm;
2024 dimm_count++;
2025
2026 if ((mem_flags & ACPI_NFIT_MEM_FAILED_MASK) == 0)
2027 continue;
2028
2029 dev_info(acpi_desc->dev, "%s flags:%s%s%s%s%s\n",
2030 nvdimm_name(nvdimm),
2031 mem_flags & ACPI_NFIT_MEM_SAVE_FAILED ? " save_fail" : "",
2032 mem_flags & ACPI_NFIT_MEM_RESTORE_FAILED ? " restore_fail":"",
2033 mem_flags & ACPI_NFIT_MEM_FLUSH_FAILED ? " flush_fail" : "",
2034 mem_flags & ACPI_NFIT_MEM_NOT_ARMED ? " not_armed" : "",
2035 mem_flags & ACPI_NFIT_MEM_MAP_FAILED ? " map_fail" : "");
2036
2037 }
2038
2039 rc = nvdimm_bus_check_dimm_count(acpi_desc->nvdimm_bus, dimm_count);
2040 if (rc)
2041 return rc;
2042
2043 /*
2044 * Now that dimms are successfully registered, and async registration
2045 * is flushed, attempt to enable event notification.
2046 */
2047 list_for_each_entry(nfit_mem, &acpi_desc->dimms, list) {
2048 struct kernfs_node *nfit_kernfs;
2049
2050 nvdimm = nfit_mem->nvdimm;
2051 if (!nvdimm)
2052 continue;
2053
2054 rc = nvdimm_security_setup_events(nvdimm);
2055 if (rc < 0)
2056 dev_warn(acpi_desc->dev,
2057 "security event setup failed: %d\n", rc);
2058
2059 nfit_kernfs = sysfs_get_dirent(nvdimm_kobj(nvdimm)->sd, "nfit");
2060 if (nfit_kernfs)
2061 nfit_mem->flags_attr = sysfs_get_dirent(nfit_kernfs,
2062 "flags");
2063 sysfs_put(nfit_kernfs);
2064 if (!nfit_mem->flags_attr)
2065 dev_warn(acpi_desc->dev, "%s: notifications disabled\n",
2066 nvdimm_name(nvdimm));
2067 }
2068
2069 return devm_add_action_or_reset(acpi_desc->dev, shutdown_dimm_notify,
2070 acpi_desc);
2071 }
2072

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
.config.gz (65.08 kB)

2019-02-03 17:52:39

by Dexuan Cui

Subject: RE: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

> From: Dan Williams <[email protected]>
> Sent: Saturday, February 2, 2019 5:13 PM
> ...
> As Dexuan reports, the NVDIMM_FAMILY_HYPERV platform is incompatible with
> the existing Linux namespace implementation because it uses
> NSLABEL_FLAG_LOCAL for x1-width PMEM interleave sets. Quirk it as a
> platform / DIMM that does not provide BLK-aperture access. Allow the
> libnvdimm core to assume no potential for aliasing. In case other
> implementations make the same mistake, provide a "noblk" module
> parameter to force-enable the quirk.

Hi Dan,
Thanks very much for the patch! With it, "ndctl list" shows the below:

root@decui-gen2-u1904:~/ndctl# ndctl list
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":137438953472,
"uuid":"c258aaab-f72b-e546-bfa5-be5e07761dbc",
"blockdev":"pmem1",
"name":"Microsoft Hyper-V NVDIMM 1 Label"
},
{
"dev":"namespace0.0",
"mode":"raw",
"size":34359738368,
"uuid":"9f0497a7-4453-7c40-ad35-21a791e00345",
"blockdev":"pmem0",
"name":"Microsoft Hyper-V NVDIMM 0 Label"
}
]

And /dev/pmem0 can appear, but /dev/pmem0p1 cannot, and the "mode" of
"namespace0.0" is not correct. With the Ubuntu 19.04 4.18 kernel, I get the below:

root@decui-gen2-u1904:~/ndctl# ndctl list
[
{
"dev":"namespace1.0",
"mode":"raw",
"size":137438953472,
"blockdev":"pmem1"
},
{
"dev":"namespace0.0",
"mode":"fsdax",
"map":"dev",
"size":33820770304,
"uuid":"ef028c4e-2b1f-4bf8-b92a-1109d7a1c914",
"blockdev":"pmem0"
}
]
and /dev/pmem0p1 can appear.

It looks like the namespace created by Ubuntu 19.04 (4.18) is incompatible with
the libnvdimm-pending branch + this patch.

It looks like we need to completely disable label usage for NVDIMM_FAMILY_HYPERV
due to compatibility concerns?

Thanks,
-- Dexuan

2019-02-03 17:54:12

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

> From: kbuild test robot <[email protected]>
> Sent: Sunday, February 3, 2019 6:32 AM
> To: Dan Williams <[email protected]>
> Cc: [email protected]; [email protected]; Dexuan Cui
> <[email protected]>; [email protected]
> Subject: Re: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM
> family
>
> Hi Dan,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on linux-nvdimm/libnvdimm-for-next]
> [also build test ERROR on v5.0-rc4 next-20190201]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system]
> ...
> All errors (new ones prefixed by >>):
>
> drivers/acpi//nfit/core.c: In function 'acpi_nfit_register_dimms':
> >> drivers/acpi//nfit/core.c:2003:27: error: 'NVDIMM_FAMILY_HYPERV'
> undeclared (first use in this function); did you mean 'NVDIMM_FAMILY_HPE2'?
> if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
> ^~~~~~~~~~~~~~~~~~~~
> NVDIMM_FAMILY_HPE2
> drivers/acpi//nfit/core.c:2003:27: note: each undeclared identifier is
> reported only once for each function it appears in

This is a false alarm. The patch is only for
https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=libnvdimm-pending
where we have the dependent commit
1194c4133195 ("nfit: Add Hyper-V NVDIMM DSM command set to white list")

Thanks,
-- Dexuan

2019-02-03 17:55:07

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

> From: kbuild test robot <[email protected]>
> Sent: Sunday, February 3, 2019 6:38 AM
> ...
> Hi Dan,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on linux-nvdimm/libnvdimm-for-next]
> [also build test ERROR on v5.0-rc4 next-20190201]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system]
> ...
> config: x86_64-allmodconfig (attached as .config)
> compiler: gcc-8 (Debian 8.2.0-14) 8.2.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64
>
> All errors (new ones prefixed by >>):
>
> drivers/acpi/nfit/core.c: In function 'acpi_nfit_register_dimms':
> >> drivers/acpi/nfit/core.c:2003:27: error: 'NVDIMM_FAMILY_HYPERV'
> undeclared (first use in this function); did you mean 'NVDIMM_FAMILY_HPE1'?
> if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
> ^~~~~~~~~~~~~~~~~~~~
> NVDIMM_FAMILY_HPE1
> drivers/acpi/nfit/core.c:2003:27: note: each undeclared identifier is
> reported only once for each function it appears in

This is a false alarm, too. The patch is only for
https://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm.git/log/?h=libnvdimm-pending
where we have the dependent commit:
1194c4133195 ("nfit: Add Hyper-V NVDIMM DSM command set to white list")

Thanks,
-- Dexuan

2019-02-03 17:56:17

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

On Sun, Feb 3, 2019 at 9:22 AM Dexuan Cui <[email protected]> wrote:
>
> > From: Dan Williams <[email protected]>
> > Sent: Saturday, February 2, 2019 5:13 PM
> > ...
> > As Dexuan reports the NVDIMM_FAMILY_HYPERV platform is incompatible with
> > the existing Linux namespace implementation because it uses
> > NSLABEL_FLAG_LOCAL for x1-width PMEM interleave sets. Quirk it as a
> > platform / DIMM that does not provide BLK-aperture access. Allow the
> > libnvdimm core to assume no potential for aliasing. In case other
> > implementations make the same mistake, provide a "noblk" module
> > parameter to force-enable the quirk.
>
> Hi Dan,
> Thanks very much for the patch! With it, "ndctl list" shows the following:
>
> root@decui-gen2-u1904:~/ndctl# ndctl list
> [
> {
> "dev":"namespace1.0",
> "mode":"raw",
> "size":137438953472,
> "uuid":"c258aaab-f72b-e546-bfa5-be5e07761dbc",
> "blockdev":"pmem1",
> "name":"Microsoft Hyper-V NVDIMM 1 Label"
> },
> {
> "dev":"namespace0.0",
> "mode":"raw",
> "size":34359738368,
> "uuid":"9f0497a7-4453-7c40-ad35-21a791e00345",
> "blockdev":"pmem0",
> "name":"Microsoft Hyper-V NVDIMM 0 Label"
> }
> ]
>
> And /dev/pmem0 can appear, but /dev/pmem0p1 cannot appear, and the "mode" of
> "namespace0.0" is not correct. With the Ubuntu 19.04 4.18 kernel, I get the following:
>
> root@decui-gen2-u1904:~/ndctl# ndctl list
> [
> {
> "dev":"namespace1.0",
> "mode":"raw",
> "size":137438953472,
> "blockdev":"pmem1"
> },
> {
> "dev":"namespace0.0",
> "mode":"fsdax",
> "map":"dev",
> "size":33820770304,
> "uuid":"ef028c4e-2b1f-4bf8-b92a-1109d7a1c914",
> "blockdev":"pmem0"
> }
> ]
> and /dev/pmem0p1 can appear.
>
> It looks like the namespace created by Ubuntu 19.04 (4.18) is incompatible with
> the libnvdimm-pending branch + this patch.

This is correct; the configuration switched from label-less by default
to labeled.

> It looks like we need to completely disable label usage for NVDIMM_FAMILY_HYPERV
> due to compatibility concerns?

It's either going to be a quirk in Linux or a quirk in the Hyper-V
configuration. I think it's more manageable for Hyper-V to ship a
"label disable" switch than to try to ship a workaround in Linux. The
"noblk" quirk in Linux works around the LOCAL bit in the labels.
However, the namespace mode regression can only be resolved by hiding
the label capability until the administrator manually switches from
label-less to labeled operation.

2019-02-03 18:51:22

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

> From: Dan Williams <[email protected]>
> Sent: Sunday, February 3, 2019 9:49 AM
> > ...
> > It looks like the namespace created by Ubuntu 19.04 (4.18) is incompatible with
> > the libnvdimm-pending branch + this patch.
>
> This is correct; the configuration switched from label-less by default
> to labeled.
Thanks for the confirmation!

> > It looks like we need to completely disable label usage for
> NVDIMM_FAMILY_HYPERV
> > due to compatibility concerns?
>
> It's either going to be a quirk in Linux or a quirk in the Hyper-V
> configuration. I think it's more manageable for Hyper-V to ship a
> "label disable" switch than to try to ship a workaround in Linux. The
> "noblk" quirk in Linux works around the LOCAL bit in the labels.
> However, the namespace mode regression can only be resolved by hiding
> the label capability until the administrator manually switches from
> label-less to labeled operation.

As I understand it, the essence of the issue is: Hyper-V emulates the
label mechanism (i.e. it supports _LSI and _LSR), but doesn't do it
right (i.e. it doesn't support _LSW).

To manage the namespaces, Linux can choose to use label or not.

If _LSI/_LSR are supported, Linux assumes _LSW is supported as well
and chooses to use label (i.e. the label mode), but since Hyper-V doesn't
support _LSW, Linux fails to change the namespace configuration. In
Ubuntu 19.04 (4.18), the kernel is not aware of Hyper-V _LSI/_LSR, so
the created namespace is in "label-less" mode, and hence can't be used
with the libnvdimm-pending branch + this patch, unless we add a quirk
in Linux to explicitly not use the label.
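
For reference, here is my understanding of the probing as a rough
sketch (not the exact core.c code; acpi_nvdimm_has_method() is the
helper the nfit driver uses to test whether the DIMM exposes a given
ACPI method):

	/* labels are readable only if both _LSI and _LSR exist */
	if (acpi_nvdimm_has_method(adev_dimm, "_LSI")
			&& acpi_nvdimm_has_method(adev_dimm, "_LSR"))
		set_bit(NFIT_MEM_LSR, &nfit_mem->flags);

	/* labels are writable only if _LSW exists on top of that */
	if (test_bit(NFIT_MEM_LSR, &nfit_mem->flags)
			&& acpi_nvdimm_has_method(adev_dimm, "_LSW"))
		set_bit(NFIT_MEM_LSW, &nfit_mem->flags);

	/* Hyper-V today: NFIT_MEM_LSR gets set, NFIT_MEM_LSW never does */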

I agree ideally Hyper-V should hide the label capability, and I'll request
Hyper-V team to do that. I hope Hyper-V guys are still able to do that
in time so we won't need a quirk in Linux kernel.

BTW, I suppose Windows VM should be in "label-less" mode.

Thanks for the help, Dan!

Thanks,
-- Dexuan

2019-02-03 19:21:05

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

On Sun, Feb 3, 2019 at 10:13 AM Dexuan Cui <[email protected]> wrote:
>
> > From: Dan Williams <[email protected]>
> > Sent: Sunday, February 3, 2019 9:49 AM
> > > ...
> > > It looks like the namespace created by Ubuntu 19.04 (4.18) is incompatible with
> > > the libnvdimm-pending branch + this patch.
> >
> > This is correct; the configuration switched from label-less by default
> > to labeled.
> Thanks for the confirmation!
>
> > > It looks like we need to completely disable label usage for
> > NVDIMM_FAMILY_HYPERV
> > > due to compatibility concerns?
> >
> > It's either going to be a quirk in Linux or a quirk in the Hyper-V
> > configuration. I think it's more manageable for Hyper-V to ship a
> > "label disable" switch than to try to ship a workaround in Linux. The
> > "noblk" quirk in Linux works around the LOCAL bit in the labels.
> > However, the namespace mode regression can only be resolved by hiding
> > the label capability until the administrator manually switches from
> > label-less to labeled operation.
>
> As I understand it, the essence of the issue is: Hyper-V emulates the
> label mechanism (i.e. it supports _LSI and _LSR), but doesn't do it
> right (i.e. it doesn't support _LSW).
>
> To manage the namespaces, Linux can choose to use label or not.
>
> If _LSI/_LSR are supported, Linux assumes _LSW is supported as well
> and chooses to use label (i.e. the label mode), but since Hyper-V doesn't
> support _LSW, Linux fails to change the namespace configuration.

No, that's not quite right. The reason Linux does not see the fsdax
mode configuration is due to the missing "address abstraction GUID" in
the label produced by the default Hyper-V configuration. In label-less
mode there is no "address abstraction GUID" to validate so it falls
back to just using the info-block directly.
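
Schematically (a simplified sketch, not the exact namespace_devs.c
logic; "have_labels", "mode" and "mode_from_info_block()" are
illustrative names, and nvdimm_dax_guid stands in for NVDIMM_DAX_GUID):

	if (have_labels) {
		/* v1.2 label: the abstraction guid selects the personality */
		if (guid_equal(&nd_label->abstraction_guid, &nvdimm_dax_guid))
			mode = FSDAX;
		else
			mode = RAW;	/* guid_null, as the Hyper-V label has */
	} else {
		/* label-less: probe the info-block at base + 4K directly */
		mode = mode_from_info_block();
	}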

The _LSW issue is if you wanted to reconfigure a raw namespace to
fsdax mode. The current flow tries to delete the label, but that's
only for reconfiguration, not the initial boot-up case that is
currently failing. The deletion would fail on finding no label-write
capability, but to be clear the boot-up case does not perform any
writes.

> In
> Ubuntu 19.04 (4.18), the kernel is not aware of Hyper-V _LSI/_LSR, so
> the created namespace is in "label-less" mode, and hence can't be used
> with the libnvdimm-pending branch + this patch, unless we add a quirk
> in Linux to explicitly not use the label.
>
> I agree ideally Hyper-V should hide the label capability, and I'll request
> Hyper-V team to do that. I hope Hyper-V guys are still able to do that
> in time so we won't need a quirk in Linux kernel.

After some more thought, the "no regressions" rule means that Linux
should ship a quirk for this by default. I think a good heuristic is
to disable label support by default if no _LSW method is detected. An
opt-in can be specified to accept "read-only" configurations, but
that's an exceptional case. I'll send a patch for this.

> BTW, I suppose Windows VM should be in "label-less" mode.

I expect Windows mandates labeled operation. This label-less concept
was something Linux shipped in advance of a specification being
ratified and to support early NVDIMMs that don't advertise label
space.

> Thanks for the help, Dan!

Thanks for the quick feedback and testing.

2019-02-03 19:43:39

by Dan Williams

[permalink] [raw]
Subject: [PATCH] acpi/nfit: Require opt-in for read-only label configurations

Recent fixes to command handling enabled Linux to read label
configurations that it could not before. Unfortunately that means that
configurations that were operating in label-less mode will be broken as
the kernel ignores the existing namespace configuration and tries to
honor the newly found labels.

Fortunately this seems limited to a case where Linux can quirk the
behavior and maintain the existing label-less semantics by default.
When the platform does not emit an _LSW method, disable all label access
methods. Provide a 'force_labels' module parameter to allow read-only
label operation.

Fixes: 11189c1089da ("acpi/nfit: Fix command-supported detection")
Reported-by: Dexuan Cui <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
drivers/acpi/nfit/core.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 811c399a3a76..5b5e802de7b8 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -56,6 +56,10 @@ static bool no_init_ars;
module_param(no_init_ars, bool, 0644);
MODULE_PARM_DESC(no_init_ars, "Skip ARS run at nfit init time");

+static bool force_labels;
+module_param(force_labels, bool, 0444);
+MODULE_PARM_DESC(force_labels, "Opt-in to labels despite missing methods");
+
LIST_HEAD(acpi_descs);
DEFINE_MUTEX(acpi_desc_lock);

@@ -1916,6 +1920,19 @@ static int acpi_nfit_add_dimm(struct acpi_nfit_desc *acpi_desc,
dev_dbg(dev, "%s: has _LSW\n", dev_name(&adev_dimm->dev));
set_bit(NFIT_MEM_LSW, &nfit_mem->flags);
}
+
+ /*
+ * Quirk read-only label configurations to preserve
+ * access to label-less namespaces by default.
+ */
+ if (!test_bit(NFIT_MEM_LSW, &nfit_mem->flags)
+ && !force_labels) {
+ dev_dbg(dev, "%s: No _LSW, disable labels\n",
+ dev_name(&adev_dimm->dev));
+ clear_bit(NFIT_MEM_LSR, &nfit_mem->flags);
+ } else
+ dev_dbg(dev, "%s: Force enable labels\n",
+ dev_name(&adev_dimm->dev));
}

populate_shutdown_status(nfit_mem);


2019-02-04 05:49:19

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH] acpi/nfit: Require opt-in for read-only label configurations

> From: Dan Williams <[email protected]>
> Sent: Sunday, February 3, 2019 11:29 AM
> ...
> Recent fixes to command handling enabled Linux to read label
> configurations that it could not before. Unfortunately that means that
> configurations that were operating in label-less mode will be broken as
> the kernel ignores the existing namespace configuration and tries to
> honor the newly found labels.
>
> Fortunately this seems limited to a case where Linux can quirk the
> behavior and maintain the existing label-less semantics by default.
> When the platform does not emit an _LSW method, disable all label access
> methods. Provide a 'force_labels' module parameter to allow read-only
> label operation.
>
> Fixes: 11189c1089da ("acpi/nfit: Fix command-supported detection")
> Reported-by: Dexuan Cui <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>

Hi Dan,
Thanks for the patch and all the detailed explanation! With this patch,
the "fsdax" namespace can be properly detected now!

Thanks,
-- Dexuan

2019-02-05 16:54:47

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

> From: Dan Williams <[email protected]>
> Sent: Sunday, February 3, 2019 11:14 AM
> > ...
> > As I understand it, the essence of the issue is: Hyper-V emulates the
> > label mechanism (i.e. it supports _LSI and _LSR), but doesn't do it
> > right (i.e. it doesn't support _LSW).
> >
> > To manage the namespaces, Linux can choose to use label or not.
> >
> > If _LSI/_LSR are supported, Linux assumes _LSW is supported as well
> > and chooses to use label (i.e. the label mode), but since Hyper-V doesn't
> > support _LSW, Linux fails to change the namespace configuration.
>
> No, that's not quite right. The reason Linux does not see the fsdax
> mode configuration is due to the missing "address abstraction GUID" in
> the label produced by the default Hyper-V configuration.
Hi Dan,
Do you mean NVDIMM_DAX_GUID?

> In label-less mode there is no "address abstraction GUID" to validate so
> it falls back to just using the info-block directly.
In the case of not using the label storage area, as I understand it the info-block
resides in the regular data storage area. Can you please tell me where exactly
the info-block is and how its location is decided?
And I suppose the info-block contains the NVDIMM_DAX_GUID?

I'm asking because I found I lose ~500MBytes of the 32GBytes Hyper-V
NVDIMM device, when the namespace is in fsdax mode. When it's in
raw mode, I'm able to use all of the 32GB space.

> The _LSW issue is if you wanted to reconfigure a raw namespace to
> fsdax mode. The current flow tries to delete the label, but that's
> only for reconfiguration, not the initial boot-up case that is
> currently failing. The deletion would fail on finding no label-write
> capability, but to be clear the boot-up case does not perform any
> writes.
Thanks for the explanation!

> > In Ubuntu 19.04 (4.18), the kernel is not aware of Hyper-V _LSI/_LSR, so
> > the created namespace is in "label-less" mode, and hence can't be used
> > with the libnvdimm-pending branch + this patch, unless we add a quirk
> > in Linux to explicitly not use the label.
> >
> > I agree ideally Hyper-V should hide the label capability, and I'll request
> > Hyper-V team to do that. I hope Hyper-V guys are still able to do that
> > in time so we won't need a quirk in Linux kernel.
>
> After some more thought, the "no regressions" rule means that Linux
> should ship a quirk for this by default. I think a good heuristic is
> to disable label support by default if no _LSW method is detected. An
> opt-in can be specified to accept "read-only" configurations, but
> that's an exceptional case. I'll send a patch for this.
>
> > BTW, I suppose Windows VM should be in "label-less" mode.
>
> I expect Windows mandates labeled operation. This label-less concept
> was something Linux shipped in advance of a specification being
> ratified and to support early NVDIMMs that don't advertise label
> space.
Since the Hyper-V NVDIMM doesn't support _LSW, IMO a Windows VM
can't update the label storage area either, so I guess a Windows VM can't
use labels either, just like a Linux VM? I guess a Windows VM also stores
an info-block when the namespace is in dax mode, just like a Linux VM?

BTW, it looks like the dax mode configuration in a Linux VM is incompatible
with that in a Windows VM(?) When I create a DAX-enabled NTFS partition
in a Windows VM on the same Hyper-V NVDIMM, the Windows VM is able to
use all of the 32GB space (as I mentioned above, in Linux I lose ~500MB).
And the "dax" mode namespace created in the Windows VM is detected
as "raw" by Linux, and vice versa (I think).

I'm trying to understand why I lose ~500MB in Linux dax mode.
Your insights are appreciated!

Thanks,
--Dexuan

2019-02-05 17:41:54

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

On Tue, Feb 5, 2019 at 8:53 AM Dexuan Cui <[email protected]> wrote:
>
> > From: Dan Williams <[email protected]>
> > Sent: Sunday, February 3, 2019 11:14 AM
> > > ...
> > > As I understand it, the essence of the issue is: Hyper-V emulates the
> > > label mechanism (i.e. it supports _LSI and _LSR), but doesn't do it
> > > right (i.e. it doesn't support _LSW).
> > >
> > > To manage the namespaces, Linux can choose to use label or not.
> > >
> > > If _LSI/_LSR are supported, Linux assumes _LSW is supported as well
> > > and chooses to use label (i.e. the label mode), but since Hyper-V doesn't
> > > support _LSW, Linux fails to change the namespace configuration.
> >
> > No, that's not quite right. The reason Linux does not see the fsdax
> > mode configuration is due to the missing "address abstraction GUID" in
> > the label produced by the default Hyper-V configuration.
> Hi Dan,
> Do you mean NVDIMM_DAX_GUID?

Correct.

>
> > In label-less mode there is no "address abstraction GUID" to validate so
> > it falls back to just using the info-block directly.
> In the case of not using the label storage area, as I understand it the info-block
> resides in the regular data storage area. Can you please tell me where exactly
> the info-block is and how its location is decided?
> And I suppose the info-block contains the NVDIMM_DAX_GUID?

The info-block always lives in the data-storage area, even with
labels. It's placed at namespace base address + 4K.

When labels are present the label gives the namespace a uuid and the
info-block "parent uuid" field must match that value. Without labels
the "parent uuid" field is unused / filled with zero's. Also with v1.2
labels the address abstraction uuid must match the info-block type.
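
In struct form, roughly (an illustrative layout, not the exact
nd_pfn_sb definition in drivers/nvdimm/pfn.h):

	/* lives at namespace base address + 4K */
	struct info_block {
		u8 signature[16];	/* "NVDIMM_PFN_INFO" or "NVDIMM_DAX_INFO" */
		u8 uuid[16];		/* this info-block's own uuid */
		u8 parent_uuid[16];	/* label uuid, or all zeros when label-less */
		/* ... mode, data offset, checksum, etc. */
	};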

> I'm asking because I found I lose ~500MBytes of the 32GBytes Hyper-V
> NVDIMM device, when the namespace is in fsdax mode. When it's in
> raw mode, I'm able to use all of the 32GB space.

Yes. A portion of the capacity is reserved for page structures.

https://lwn.net/Articles/656197/

2019-02-06 02:16:32

by Dexuan Cui

[permalink] [raw]
Subject: RE: [PATCH] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

> From: Dan Williams <[email protected]>
> Sent: Tuesday, February 5, 2019 9:12 AM
> On Tue, Feb 5, 2019 at 8:53 AM Dexuan Cui <[email protected]> wrote:
> >
> > > From: Dan Williams <[email protected]>
> > > Sent: Sunday, February 3, 2019 11:14 AM
> > > > ...
> > > > As I understand it, the essence of the issue is: Hyper-V emulates the
> > > > label mechanism (i.e. it supports _LSI and _LSR), but doesn't do it
> > > > right (i.e. it doesn't support _LSW).
> > > >
> > > > To manage the namespaces, Linux can choose to use label or not.
> > > >
> > > > If _LSI/_LSR are supported, Linux assumes _LSW is supported as well
> > > > and chooses to use label (i.e. the label mode), but since Hyper-V doesn't
> > > > support _LSW, Linux fails to change the namespace configuration.
> > >
> > > No, that's not quite right. The reason Linux does not see the fsdax
> > > mode configuration is due to the missing "address abstraction GUID" in
> > > the label produced by the default Hyper-V configuration.
> > Hi Dan,
> > Do you mean NVDIMM_DAX_GUID?
>
> Correct.
FYI: in create_namespace_pmem(), label0->abstraction_guid is guid_null in
the case of Hyper-V NVDIMM. It looks like Hyper-V doesn't use the guid.

> > > In label-less mode there is no "address abstraction GUID" to validate so
> > > it falls back to just using the info-block directly.
> > In the case of not using the label storage area, as I understand it the info-block
> > resides in the regular data storage area. Can you please tell me where exactly
> > the info-block is and how its location is decided?
> > And I suppose the info-block contains the NVDIMM_DAX_GUID?
>
> The info-block always lives in the data-storage area, even with
> labels. It's placed at namespace base address + 4K.
>
> When labels are present the label gives the namespace a uuid and the
> info-block "parent uuid" field must match that value. Without labels
> the "parent uuid" field is unused / filled with zero's. Also with v1.2
> labels the address abstraction uuid must match the info-block type.
Thanks for elaborating on this!

It looks like the label mechanism is for advanced usages, e.g., a PMEM namespace
can consist of multiple NVDIMMs, or a namespace only uses part of the
capacity of an NVDIMM, or an NVDIMM contains both PMEM and Block-Mode
namespaces.

For a simple usage, e.g. if an NVDIMM only contains one PMEM namespace which
consumes all the storage space of the NVDIMM, it looks like the namespace can
be used normally with the help of the info-block, and we don't really need
the label. IMO this is the case for Hyper-V, i.e. we don't use the label at all,
since _LSW is not implemented.

> > I'm asking because I found I lose ~500MBytes of the 32GBytes Hyper-V
> > NVDIMM device, when the namespace is in fsdax mode. When it's in
> > raw mode, I'm able to use all of the 32GB space.
>
> Yes. A portion of the capacity is reserved for page structures.
Got it. It looks like the size of the info-block (and the related padding?) is 2MB,
and "the overhead is 64-bytes per 4K (16GB per 1TB) on x86", so the total
overhead in my 32GB case is: 2MB + 32GB/(4096/64) = 514MB.
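
Spelling that out:

	32GB / 4KB per page = 8,388,608 pages
	8,388,608 pages * 64 bytes = 512MB of page metadata
	512MB + 2MB info-block/padding = 514MB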

Thanks for sharing the link! Now I realize we can use the -M parameter
to avoid storing the page metadata in the NVDIMM:

-M; --map=
A pmem namespace in “fsdax” or “devdax” mode requires allocation of
per-page metadata. The allocation can be drawn from either:
“mem”: typical system memory
“dev”: persistent memory reserved from the namespace
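
So presumably something like:

	ndctl create-namespace --mode=fsdax --map=mem

would keep the page metadata in regular system memory and leave the
full NVDIMM capacity usable, at the cost of consuming RAM for the
page structures.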

Thanks,
--Dexuan

2019-02-13 09:27:22

by Dan Williams

[permalink] [raw]
Subject: [PATCH v2] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

As Dexuan reports the NVDIMM_FAMILY_HYPERV platform is incompatible with
the existing Linux namespace implementation because it uses
NSLABEL_FLAG_LOCAL for x1-width PMEM interleave sets. Quirk it as a
platform / DIMM that does not provide BLK-aperture access. Allow the
libnvdimm core to assume no potential for aliasing. In case other
implementations make the same mistake, provide a "noblk" module
parameter to force-enable the quirk.

Link: https://lkml.kernel.org/r/PU1P153MB0169977604493B82B662A01CBF920@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM
Reported-by: Dexuan Cui <[email protected]>
Tested-by: Dexuan Cui <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
Changes since v1:
* Fix the default setting for NDD_NOBLK in __nvdimm_create().

drivers/acpi/nfit/core.c | 4 ++++
drivers/nvdimm/dimm_devs.c | 7 +++++++
drivers/nvdimm/label.c | 3 +++
drivers/nvdimm/namespace_devs.c | 6 ++++++
drivers/nvdimm/region_devs.c | 7 +++++++
include/linux/libnvdimm.h | 2 ++
6 files changed, 29 insertions(+)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 4a7e8b1fa43b..811c399a3a76 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2016,6 +2016,10 @@ static int acpi_nfit_register_dimms(struct acpi_nfit_desc *acpi_desc)
cmd_mask |= nfit_mem->dsm_mask & NVDIMM_STANDARD_CMDMASK;
}

+ /* Quirk to ignore LOCAL for labels on HYPERV DIMMs */
+ if (nfit_mem->family == NVDIMM_FAMILY_HYPERV)
+ set_bit(NDD_NOBLK, &flags);
+
if (test_bit(NFIT_MEM_LSR, &nfit_mem->flags)) {
set_bit(ND_CMD_GET_CONFIG_SIZE, &cmd_mask);
set_bit(ND_CMD_GET_CONFIG_DATA, &cmd_mask);
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 4890310df874..553aa78abeee 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -11,6 +11,7 @@
* General Public License for more details.
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/moduleparam.h>
#include <linux/vmalloc.h>
#include <linux/device.h>
#include <linux/ndctl.h>
@@ -25,6 +26,10 @@

static DEFINE_IDA(dimm_ida);

+static bool noblk;
+module_param(noblk, bool, 0444);
+MODULE_PARM_DESC(noblk, "force disable BLK / local alias support");
+
/*
* Retrieve bus and dimm handle and return if this bus supports
* get_config_data commands
@@ -551,6 +556,8 @@ struct nvdimm *__nvdimm_create(struct nvdimm_bus *nvdimm_bus,

nvdimm->dimm_id = dimm_id;
nvdimm->provider_data = provider_data;
+ if (noblk)
+ flags |= 1 << NDD_NOBLK;
nvdimm->flags = flags;
nvdimm->cmd_mask = cmd_mask;
nvdimm->num_flush = num_flush;
diff --git a/drivers/nvdimm/label.c b/drivers/nvdimm/label.c
index 6d6e9a12150b..f3d753d3169c 100644
--- a/drivers/nvdimm/label.c
+++ b/drivers/nvdimm/label.c
@@ -392,6 +392,7 @@ int nd_label_reserve_dpa(struct nvdimm_drvdata *ndd)
return 0; /* no label, nothing to reserve */

for_each_clear_bit_le(slot, free, nslot) {
+ struct nvdimm *nvdimm = to_nvdimm(ndd->dev);
struct nd_namespace_label *nd_label;
struct nd_region *nd_region = NULL;
u8 label_uuid[NSLABEL_UUID_LEN];
@@ -406,6 +407,8 @@ int nd_label_reserve_dpa(struct nvdimm_drvdata *ndd)

memcpy(label_uuid, nd_label->uuid, NSLABEL_UUID_LEN);
flags = __le32_to_cpu(nd_label->flags);
+ if (test_bit(NDD_NOBLK, &nvdimm->flags))
+ flags &= ~NSLABEL_FLAG_LOCAL;
nd_label_gen_id(&label_id, label_uuid, flags);
res = nvdimm_allocate_dpa(ndd, &label_id,
__le64_to_cpu(nd_label->dpa),
diff --git a/drivers/nvdimm/namespace_devs.c b/drivers/nvdimm/namespace_devs.c
index 4b077555ac70..3677b0c4a33d 100644
--- a/drivers/nvdimm/namespace_devs.c
+++ b/drivers/nvdimm/namespace_devs.c
@@ -2492,6 +2492,12 @@ static int init_active_labels(struct nd_region *nd_region)
if (!label_ent)
break;
label = nd_label_active(ndd, j);
+ if (test_bit(NDD_NOBLK, &nvdimm->flags)) {
+ u32 flags = __le32_to_cpu(label->flags);
+
+ flags &= ~NSLABEL_FLAG_LOCAL;
+ label->flags = __cpu_to_le32(flags);
+ }
label_ent->label = label;

mutex_lock(&nd_mapping->lock);
diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index e2818f94f292..3b58baa44b5c 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -1003,6 +1003,13 @@ static struct nd_region *nd_region_create(struct nvdimm_bus *nvdimm_bus,

if (test_bit(NDD_UNARMED, &nvdimm->flags))
ro = 1;
+
+ if (test_bit(NDD_NOBLK, &nvdimm->flags)
+ && dev_type == &nd_blk_device_type) {
+ dev_err(&nvdimm_bus->dev, "%s: %s mapping%d is not BLK capable\n",
+ caller, dev_name(&nvdimm->dev), i);
+ return NULL;
+ }
}

if (dev_type == &nd_blk_device_type) {
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 5440f11b0907..7da406ae3a2b 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -42,6 +42,8 @@ enum {
NDD_SECURITY_OVERWRITE = 3,
/* tracking whether or not there is a pending device reference */
NDD_WORK_PENDING = 4,
+ /* ignore / filter NSLABEL_FLAG_LOCAL for this DIMM, i.e. no aliasing */
+ NDD_NOBLK = 5,

/* need to set a limit somewhere, but yes, this is likely overkill */
ND_IOCTL_MAX_BUFLEN = SZ_4M,
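
In case other platforms need the quirk force-enabled, the module
parameter added above would presumably be given as

	libnvdimm.noblk=1

on the kernel command line (it is declared 0444, so load/boot time
only).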


2019-02-13 13:34:42

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [PATCH v2] libnvdimm/dimm: Add a no-BLK quirk based on NVDIMM family

Looks good,
Reviewed-by: Johannes Thumshirn <[email protected]>
--
Johannes Thumshirn SUSE Labs Filesystems
[email protected] +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850