2017-08-14 15:26:41

by Eddie James

[permalink] [raw]
Subject: [PATCH v3 0/3] hwmon: (pmbus): Add IBM Common Form Factor (CFF) power supply driver

From: "Edward A. James" <[email protected]>

This series adds a hwmon pmbus driver for a POWER System power supply. The
core monitoring functionality is provided by pmbus.

Changes since v2:
* Renamed the driver again...
* Remove debugfs bool from pmbus_driver_info.
* Add comment for returning rc when reading STATUS_MFR_SPECIFIC fails.

Since v1:
* Renamed the driver.
* Removed non-hwmon attributes.
* Simplified word and byte data reads.

Edward A. James (3):
dt-bindings: i2c: Document the IBM CCF power supply version 1
hwmon: (pmbus): Add IBM Common Form Factor (CFF) power supply driver
Documentation: hwmon: Document the IBM CFF power supply

.../devicetree/bindings/i2c/ibm,cffps1.txt | 21 +++
Documentation/hwmon/ibm-cffps | 54 ++++++++
drivers/hwmon/pmbus/Kconfig | 10 ++
drivers/hwmon/pmbus/Makefile | 1 +
drivers/hwmon/pmbus/ibm-cffps.c | 151 +++++++++++++++++++++
5 files changed, 237 insertions(+)
create mode 100644 Documentation/devicetree/bindings/i2c/ibm,cffps1.txt
create mode 100644 Documentation/hwmon/ibm-cffps
create mode 100644 drivers/hwmon/pmbus/ibm-cffps.c

--
1.8.3.1


2017-08-14 15:26:46

by Eddie James

[permalink] [raw]
Subject: [PATCH v3 2/3] hwmon: (pmbus): Add IBM Common Form Factor (CFF) power supply driver

From: "Edward A. James" <[email protected]>

Add the driver to monitor IBM CFF power supplies with hwmon over
pmbus.

Signed-off-by: Edward A. James <[email protected]>
---
drivers/hwmon/pmbus/Kconfig | 10 +++
drivers/hwmon/pmbus/Makefile | 1 +
drivers/hwmon/pmbus/ibm-cffps.c | 151 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 162 insertions(+)
create mode 100644 drivers/hwmon/pmbus/ibm-cffps.c

diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
index 68d717a..f3de18a 100644
--- a/drivers/hwmon/pmbus/Kconfig
+++ b/drivers/hwmon/pmbus/Kconfig
@@ -37,6 +37,16 @@ config SENSORS_ADM1275
This driver can also be built as a module. If so, the module will
be called adm1275.

+config SENSORS_IBM_CFFPS
+ tristate "IBM Common Form Factor Power Supply"
+ default n
+ help
+ If you say yes here you get hardware monitoring support for the IBM
+ Common Form Factor power supply.
+
+ This driver can also be built as a module. If so, the module will
+ be called ibm-cffps.
+
config SENSORS_IR35221
tristate "Infineon IR35221"
default n
diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
index 75bb7ca..737fa4e 100644
--- a/drivers/hwmon/pmbus/Makefile
+++ b/drivers/hwmon/pmbus/Makefile
@@ -5,6 +5,7 @@
obj-$(CONFIG_PMBUS) += pmbus_core.o
obj-$(CONFIG_SENSORS_PMBUS) += pmbus.o
obj-$(CONFIG_SENSORS_ADM1275) += adm1275.o
+obj-$(CONFIG_SENSORS_IBM_CFFPS) += ibm-cffps.o
obj-$(CONFIG_SENSORS_IR35221) += ir35221.o
obj-$(CONFIG_SENSORS_LM25066) += lm25066.o
obj-$(CONFIG_SENSORS_LTC2978) += ltc2978.o
diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c
new file mode 100644
index 0000000..1a6294c
--- /dev/null
+++ b/drivers/hwmon/pmbus/ibm-cffps.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/device.h>
+#include <linux/i2c.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+
+#include "pmbus.h"
+
+/* STATUS_MFR_SPECIFIC bits */
+#define CFFPS_MFR_FAN_FAULT BIT(0)
+#define CFFPS_MFR_THERMAL_FAULT BIT(1)
+#define CFFPS_MFR_OV_FAULT BIT(2)
+#define CFFPS_MFR_UV_FAULT BIT(3)
+#define CFFPS_MFR_PS_KILL BIT(4)
+#define CFFPS_MFR_OC_FAULT BIT(5)
+#define CFFPS_MFR_VAUX_FAULT BIT(6)
+#define CFFPS_MFR_CURRENT_SHARE_WARNING BIT(7)
+
+static int ibm_cffps_read_byte_data(struct i2c_client *client, int page,
+ int reg)
+{
+ int rc, mfr;
+
+ switch (reg) {
+ case PMBUS_STATUS_VOUT:
+ case PMBUS_STATUS_IOUT:
+ case PMBUS_STATUS_TEMPERATURE:
+ case PMBUS_STATUS_FAN_12:
+ rc = pmbus_read_byte_data(client, page, reg);
+ if (rc < 0)
+ return rc;
+
+ mfr = pmbus_read_byte_data(client, page,
+ PMBUS_STATUS_MFR_SPECIFIC);
+ if (mfr < 0)
+ /*
+ * Return the status register instead of an error,
+ * since we successfully read status.
+ */
+ return rc;
+
+ /* Add MFR_SPECIFIC bits to the standard pmbus status regs. */
+ if (reg == PMBUS_STATUS_FAN_12) {
+ if (mfr & CFFPS_MFR_FAN_FAULT)
+ rc |= PB_FAN_FAN1_FAULT;
+ } else if (reg == PMBUS_STATUS_TEMPERATURE) {
+ if (mfr & CFFPS_MFR_THERMAL_FAULT)
+ rc |= PB_TEMP_OT_FAULT;
+ } else if (reg == PMBUS_STATUS_VOUT) {
+ if (mfr & (CFFPS_MFR_OV_FAULT | CFFPS_MFR_VAUX_FAULT))
+ rc |= PB_VOLTAGE_OV_FAULT;
+ if (mfr & CFFPS_MFR_UV_FAULT)
+ rc |= PB_VOLTAGE_UV_FAULT;
+ } else if (reg == PMBUS_STATUS_IOUT) {
+ if (mfr & CFFPS_MFR_OC_FAULT)
+ rc |= PB_IOUT_OC_FAULT;
+ if (mfr & CFFPS_MFR_CURRENT_SHARE_WARNING)
+ rc |= PB_CURRENT_SHARE_FAULT;
+ }
+ break;
+ default:
+ rc = -ENODATA;
+ break;
+ }
+
+ return rc;
+}
+
+static int ibm_cffps_read_word_data(struct i2c_client *client, int page,
+ int reg)
+{
+ int rc, mfr;
+
+ switch (reg) {
+ case PMBUS_STATUS_WORD:
+ rc = pmbus_read_word_data(client, page, reg);
+ if (rc < 0)
+ return rc;
+
+ mfr = pmbus_read_byte_data(client, page,
+ PMBUS_STATUS_MFR_SPECIFIC);
+ if (mfr < 0)
+ /*
+ * Return the status register instead of an error,
+ * since we successfully read status.
+ */
+ return rc;
+
+ if (mfr & CFFPS_MFR_PS_KILL)
+ rc |= PB_STATUS_OFF;
+ break;
+ default:
+ rc = -ENODATA;
+ break;
+ }
+
+ return rc;
+}
+
+static struct pmbus_driver_info ibm_cffps_info = {
+ .pages = 1,
+ .func[0] = PMBUS_HAVE_VIN | PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+ PMBUS_HAVE_PIN | PMBUS_HAVE_FAN12 | PMBUS_HAVE_TEMP |
+ PMBUS_HAVE_TEMP2 | PMBUS_HAVE_TEMP3 | PMBUS_HAVE_STATUS_VOUT |
+ PMBUS_HAVE_STATUS_IOUT | PMBUS_HAVE_STATUS_INPUT |
+ PMBUS_HAVE_STATUS_TEMP | PMBUS_HAVE_STATUS_FAN12,
+ .read_byte_data = ibm_cffps_read_byte_data,
+ .read_word_data = ibm_cffps_read_word_data,
+};
+
+static int ibm_cffps_probe(struct i2c_client *client,
+ const struct i2c_device_id *id)
+{
+ return pmbus_do_probe(client, id, &ibm_cffps_info);
+}
+
+static const struct i2c_device_id ibm_cffps_id[] = {
+ { "ibm_cffps1", 1 },
+ {}
+};
+MODULE_DEVICE_TABLE(i2c, ibm_cffps_id);
+
+static const struct of_device_id ibm_cffps_of_match[] = {
+ { .compatible = "ibm,cffps1" },
+ {}
+};
+MODULE_DEVICE_TABLE(of, ibm_cffps_of_match);
+
+static struct i2c_driver ibm_cffps_driver = {
+ .driver = {
+ .name = "ibm-cffps",
+ .of_match_table = ibm_cffps_of_match,
+ },
+ .probe = ibm_cffps_probe,
+ .remove = pmbus_do_remove,
+ .id_table = ibm_cffps_id,
+};
+
+module_i2c_driver(ibm_cffps_driver);
+
+MODULE_AUTHOR("Eddie James");
+MODULE_DESCRIPTION("PMBus driver for IBM Common Form Factor power supplies");
+MODULE_LICENSE("GPL");
--
1.8.3.1

2017-08-14 15:27:05

by Eddie James

[permalink] [raw]
Subject: [PATCH v3 3/3] Documentation: hwmon: Document the IBM CFF power supply

From: "Edward A. James" <[email protected]>

Signed-off-by: Edward A. James <[email protected]>
---
Documentation/hwmon/ibm-cffps | 54 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)
create mode 100644 Documentation/hwmon/ibm-cffps

diff --git a/Documentation/hwmon/ibm-cffps b/Documentation/hwmon/ibm-cffps
new file mode 100644
index 0000000..e091ff2
--- /dev/null
+++ b/Documentation/hwmon/ibm-cffps
@@ -0,0 +1,54 @@
+Kernel driver ibm-cffps
+=======================
+
+Supported chips:
+ * IBM Common Form Factor power supply
+
+Author: Eddie James <[email protected]>
+
+Description
+-----------
+
+This driver supports IBM Common Form Factor (CFF) power supplies. This driver
+is a client to the core PMBus driver.
+
+Usage Notes
+-----------
+
+This driver does not auto-detect devices. You will have to instantiate the
+devices explicitly. Please see Documentation/i2c/instantiating-devices for
+details.
+
+Sysfs entries
+-------------
+
+The following attributes are supported:
+
+curr1_alarm Output current over-current fault.
+curr1_input Measured output current in mA.
+curr1_label "iout1"
+
+fan1_alarm Fan 1 warning.
+fan1_fault Fan 1 fault.
+fan1_input Fan 1 speed in RPM.
+fan2_alarm Fan 2 warning.
+fan2_fault Fan 2 fault.
+fan2_input Fan 2 speed in RPM.
+
+in1_alarm Input voltage under-voltage fault.
+in1_input Measured input voltage in mV.
+in1_label "vin"
+in2_alarm Output voltage over-voltage fault.
+in2_input Measured output voltage in mV.
+in2_label "vout1"
+
+power1_alarm Input fault.
+power1_input Measured input power in uW.
+power1_label "pin"
+
+temp1_alarm PSU inlet ambient temperature over-temperature fault.
+temp1_input Measured PSU inlet ambient temp in millidegrees C.
+temp2_alarm Secondary rectifier temp over-temperature fault.
+temp2_input Measured secondary rectifier temp in millidegrees C.
+temp3_alarm ORing FET temperature over-temperature fault.
+temp3_input Measured ORing FET temperature in millidegrees C.
--
1.8.3.1

2017-08-14 15:27:32

by Eddie James

[permalink] [raw]
Subject: [PATCH v3 1/3] dt-bindings: i2c: Document the IBM CCF power supply version 1

From: "Edward A. James" <[email protected]>

Signed-off-by: Edward A. James <[email protected]>
---
.../devicetree/bindings/i2c/ibm,cffps1.txt | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
create mode 100644 Documentation/devicetree/bindings/i2c/ibm,cffps1.txt

diff --git a/Documentation/devicetree/bindings/i2c/ibm,cffps1.txt b/Documentation/devicetree/bindings/i2c/ibm,cffps1.txt
new file mode 100644
index 0000000..f68a0a6
--- /dev/null
+++ b/Documentation/devicetree/bindings/i2c/ibm,cffps1.txt
@@ -0,0 +1,21 @@
+Device-tree bindings for IBM Common Form Factor Power Supply Version 1
+----------------------------------------------------------------------
+
+Required properties:
+ - compatible = "ibm,cffps1";
+ - reg = < I2C bus address >; : Address of the power supply on the
+ I2C bus.
+
+Example:
+
+ i2c-bus@100 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ #interrupt-cells = <1>;
+ < more properties >
+
+ power-supply@68 {
+ compatible = "ibm,cffps1";
+ reg = <0x68>;
+ };
+ };
--
1.8.3.1

2017-08-14 18:53:14

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] Documentation: hwmon: Document the IBM CFF power supply

On Mon, Aug 14, 2017 at 10:26:30AM -0500, Eddie James wrote:
> From: "Edward A. James" <[email protected]>
>
> Signed-off-by: Edward A. James <[email protected]>
> ---
> Documentation/hwmon/ibm-cffps | 54 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 54 insertions(+)
> create mode 100644 Documentation/hwmon/ibm-cffps
>
> diff --git a/Documentation/hwmon/ibm-cffps b/Documentation/hwmon/ibm-cffps
> new file mode 100644
> index 0000000..e091ff2
> --- /dev/null
> +++ b/Documentation/hwmon/ibm-cffps
> @@ -0,0 +1,54 @@
> +Kernel driver ibm-cffps
> +=======================
> +
> +Supported chips:
> + * IBM Common Form Factor power supply
> +
> +Author: Eddie James <[email protected]>
> +
> +Description
> +-----------
> +
> +This driver supports IBM Common Form Factor (CFF) power supplies. This driver
> +is a client to the core PMBus driver.
> +
> +Usage Notes
> +-----------
> +
> +This driver does not auto-detect devices. You will have to instantiate the
> +devices explicitly. Please see Documentation/i2c/instantiating-devices for
> +details.
> +
> +Sysfs entries
> +-------------
> +
> +The following attributes are supported:
> +
> +curr1_alarm Output current over-current fault.
> +curr1_input Measured output current in mA.
> +curr1_label "iout1"
> +
> +fan1_alarm Fan 1 warning.
> +fan1_fault Fan 1 fault.
> +fan1_input Fan 1 speed in RPM.
> +fan2_alarm Fan 2 warning.
> +fan2_fault Fan 2 fault.
> +fan2_input Fan 2 speed in RPM.
> +
> +in1_alarm Input voltage under-voltage fault.

Just noticed. Are you sure you mean 'fault' here and below ?
'alarm' attributes normally report an over- or under- condition,
but not a fault. Faults should be reported with 'fault' attributes.
In PMBus lingo (which doesn't distinguish a real 'fault' from
a critical over- or under- condition), the "FAULT" condition
usually maps with the 'crit_alarm' or 'lcrit_alarm' attributes.
Also, under-voltages would normally be reported as min_alarm
or clrit_alarm, not in_alarm.

> +in1_input Measured input voltage in mV.
> +in1_label "vin"
> +in2_alarm Output voltage over-voltage fault.
> +in2_input Measured output voltage in mV.
> +in2_label "vout1"
> +
> +power1_alarm Input fault.

Another example; this maps to PMBUS_PIN_OP_WARN_LIMIT which is an
input power alarm, not an indication of a fault condition.

> +power1_input Measured input power in uW.
> +power1_label "pin"
> +
> +temp1_alarm PSU inlet ambient temperature over-temperature fault.
> +temp1_input Measured PSU inlet ambient temp in millidegrees C.
> +temp2_alarm Secondary rectifier temp over-temperature fault.

Interestingly, PMBus does not distinguish between a critical temperature
alarm and an actual "fault". Makes me wonder if the IBM PS reports
CFFPS_MFR_THERMAL_FAULT if there is an actual fault (chip or sensor failure),
or if it has the same meaning as PB_TEMP_OT_FAULT, ie an excessively high
temperature.

If it is a real fault (a detected sensor failure), we should possibly
consider adding a respective "virtual" temperature status flag. The same
is true for other status bits reported in the manufacturer status
register if any of those reflect a "real" fault, ie a chip failure.

> +temp2_input Measured secondary rectifier temp in millidegrees C.
> +temp3_alarm ORing FET temperature over-temperature fault.
> +temp3_input Measured ORing FET temperature in millidegrees C.
> --
> 1.8.3.1
>

2017-08-14 19:26:29

by Eddie James

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] Documentation: hwmon: Document the IBM CFF power supply



On 08/14/2017 01:53 PM, Guenter Roeck wrote:
> On Mon, Aug 14, 2017 at 10:26:30AM -0500, Eddie James wrote:
>> From: "Edward A. James" <[email protected]>
>>
>> Signed-off-by: Edward A. James <[email protected]>
>> ---
>> Documentation/hwmon/ibm-cffps | 54 +++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 54 insertions(+)
>> create mode 100644 Documentation/hwmon/ibm-cffps
>>
>> diff --git a/Documentation/hwmon/ibm-cffps b/Documentation/hwmon/ibm-cffps
>> new file mode 100644
>> index 0000000..e091ff2
>> --- /dev/null
>> +++ b/Documentation/hwmon/ibm-cffps
>> @@ -0,0 +1,54 @@
>> +Kernel driver ibm-cffps
>> +=======================
>> +
>> +Supported chips:
>> + * IBM Common Form Factor power supply
>> +
>> +Author: Eddie James <[email protected]>
>> +
>> +Description
>> +-----------
>> +
>> +This driver supports IBM Common Form Factor (CFF) power supplies. This driver
>> +is a client to the core PMBus driver.
>> +
>> +Usage Notes
>> +-----------
>> +
>> +This driver does not auto-detect devices. You will have to instantiate the
>> +devices explicitly. Please see Documentation/i2c/instantiating-devices for
>> +details.
>> +
>> +Sysfs entries
>> +-------------
>> +
>> +The following attributes are supported:
>> +
>> +curr1_alarm Output current over-current fault.
>> +curr1_input Measured output current in mA.
>> +curr1_label "iout1"
>> +
>> +fan1_alarm Fan 1 warning.
>> +fan1_fault Fan 1 fault.
>> +fan1_input Fan 1 speed in RPM.
>> +fan2_alarm Fan 2 warning.
>> +fan2_fault Fan 2 fault.
>> +fan2_input Fan 2 speed in RPM.
>> +
>> +in1_alarm Input voltage under-voltage fault.
> Just noticed. Are you sure you mean 'fault' here and below ?
> 'alarm' attributes normally report an over- or under- condition,
> but not a fault. Faults should be reported with 'fault' attributes.
> In PMBus lingo (which doesn't distinguish a real 'fault' from
> a critical over- or under- condition), the "FAULT" condition
> usually maps with the 'crit_alarm' or 'lcrit_alarm' attributes.
> Also, under-voltages would normally be reported as min_alarm
> or clrit_alarm, not in_alarm.

Thanks, I better change this doc to "alarm." The spec reports all these
as "faults" but many of them are merely over-temp or over-voltage, etc,
and should be "alarm" to be consistent with PMBus.

The problem with this power supply is that it doesn't report any
"limits." So unless I set up my read_byte function to return some
limits, we can't get any lower or upper limits and therefore won't get
the crit_alarm, lcrit_alarm, etc. Do you think I should "fake" the
limits in the driver?

>
>> +in1_input Measured input voltage in mV.
>> +in1_label "vin"
>> +in2_alarm Output voltage over-voltage fault.
>> +in2_input Measured output voltage in mV.
>> +in2_label "vout1"
>> +
>> +power1_alarm Input fault.
> Another example; this maps to PMBUS_PIN_OP_WARN_LIMIT which is an
> input power alarm, not an indication of a fault condition.

Hm, with my latest changes to look at the higher byte of STATUS_WORD, it
looks like we now have the same name for both the pin generic alarm
attribute and the pin_limit_attr... So in this device's case, it would
map to PB_STATUS_INPUT bit of STATUS_WORD. Didn't think about that...
any suggestions? Can't really change the name of the limit one without
breaking people's code...

>
>> +power1_input Measured input power in uW.
>> +power1_label "pin"
>> +
>> +temp1_alarm PSU inlet ambient temperature over-temperature fault.
>> +temp1_input Measured PSU inlet ambient temp in millidegrees C.
>> +temp2_alarm Secondary rectifier temp over-temperature fault.
> Interestingly, PMBus does not distinguish between a critical temperature
> alarm and an actual "fault". Makes me wonder if the IBM PS reports
> CFFPS_MFR_THERMAL_FAULT if there is an actual fault (chip or sensor failure),
> or if it has the same meaning as PB_TEMP_OT_FAULT, ie an excessively high
> temperature.

Will change these to "alarm" in the doc too.

>
> If it is a real fault (a detected sensor failure), we should possibly
> consider adding a respective "virtual" temperature status flag. The same
> is true for other status bits reported in the manufacturer status
> register if any of those reflect a "real" fault, ie a chip failure.

Yea, that would probably be helpful. The CFFPS_MFR_THERMAL_FAULT bit is
a fault (so the spec says), but I'm not sure what is triggering it.

Thanks,
Eddie

>
>> +temp2_input Measured secondary rectifier temp in millidegrees C.
>> +temp3_alarm ORing FET temperature over-temperature fault.
>> +temp3_input Measured ORing FET temperature in millidegrees C.
>> --
>> 1.8.3.1
>>

2017-08-14 22:37:22

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] Documentation: hwmon: Document the IBM CFF power supply

On Mon, Aug 14, 2017 at 02:26:20PM -0500, Eddie James wrote:
>
>
> On 08/14/2017 01:53 PM, Guenter Roeck wrote:
> >On Mon, Aug 14, 2017 at 10:26:30AM -0500, Eddie James wrote:
> >>From: "Edward A. James" <[email protected]>
> >>
> >>Signed-off-by: Edward A. James <[email protected]>
> >>---
> >> Documentation/hwmon/ibm-cffps | 54 +++++++++++++++++++++++++++++++++++++++++++
> >> 1 file changed, 54 insertions(+)
> >> create mode 100644 Documentation/hwmon/ibm-cffps
> >>
> >>diff --git a/Documentation/hwmon/ibm-cffps b/Documentation/hwmon/ibm-cffps
> >>new file mode 100644
> >>index 0000000..e091ff2
> >>--- /dev/null
> >>+++ b/Documentation/hwmon/ibm-cffps
> >>@@ -0,0 +1,54 @@
> >>+Kernel driver ibm-cffps
> >>+=======================
> >>+
> >>+Supported chips:
> >>+ * IBM Common Form Factor power supply
> >>+
> >>+Author: Eddie James <[email protected]>
> >>+
> >>+Description
> >>+-----------
> >>+
> >>+This driver supports IBM Common Form Factor (CFF) power supplies. This driver
> >>+is a client to the core PMBus driver.
> >>+
> >>+Usage Notes
> >>+-----------
> >>+
> >>+This driver does not auto-detect devices. You will have to instantiate the
> >>+devices explicitly. Please see Documentation/i2c/instantiating-devices for
> >>+details.
> >>+
> >>+Sysfs entries
> >>+-------------
> >>+
> >>+The following attributes are supported:
> >>+
> >>+curr1_alarm Output current over-current fault.
> >>+curr1_input Measured output current in mA.
> >>+curr1_label "iout1"
> >>+
> >>+fan1_alarm Fan 1 warning.
> >>+fan1_fault Fan 1 fault.
> >>+fan1_input Fan 1 speed in RPM.
> >>+fan2_alarm Fan 2 warning.
> >>+fan2_fault Fan 2 fault.
> >>+fan2_input Fan 2 speed in RPM.
> >>+
> >>+in1_alarm Input voltage under-voltage fault.
> >Just noticed. Are you sure you mean 'fault' here and below ?
> >'alarm' attributes normally report an over- or under- condition,
> >but not a fault. Faults should be reported with 'fault' attributes.
> >In PMBus lingo (which doesn't distinguish a real 'fault' from
> >a critical over- or under- condition), the "FAULT" condition
> >usually maps with the 'crit_alarm' or 'lcrit_alarm' attributes.
> >Also, under-voltages would normally be reported as min_alarm
> >or clrit_alarm, not in_alarm.
>
> Thanks, I better change this doc to "alarm." The spec reports all these as
> "faults" but many of them are merely over-temp or over-voltage, etc, and
> should be "alarm" to be consistent with PMBus.
>
> The problem with this power supply is that it doesn't report any "limits."
> So unless I set up my read_byte function to return some limits, we can't get
> any lower or upper limits and therefore won't get the crit_alarm,
> lcrit_alarm, etc. Do you think I should "fake" the limits in the driver?
>
Good question. Are the limits documented ? If yes, that would make sense.
I am quite sure that limits are word registers, though.

Guenter

> >
> >>+in1_input Measured input voltage in mV.
> >>+in1_label "vin"
> >>+in2_alarm Output voltage over-voltage fault.
> >>+in2_input Measured output voltage in mV.
> >>+in2_label "vout1"
> >>+
> >>+power1_alarm Input fault.
> >Another example; this maps to PMBUS_PIN_OP_WARN_LIMIT which is an
> >input power alarm, not an indication of a fault condition.
>
> Hm, with my latest changes to look at the higher byte of STATUS_WORD, it
> looks like we now have the same name for both the pin generic alarm
> attribute and the pin_limit_attr... So in this device's case, it would map
> to PB_STATUS_INPUT bit of STATUS_WORD. Didn't think about that... any
> suggestions? Can't really change the name of the limit one without breaking
> people's code...
>
> >
> >>+power1_input Measured input power in uW.
> >>+power1_label "pin"
> >>+
> >>+temp1_alarm PSU inlet ambient temperature over-temperature fault.
> >>+temp1_input Measured PSU inlet ambient temp in millidegrees C.
> >>+temp2_alarm Secondary rectifier temp over-temperature fault.
> >Interestingly, PMBus does not distinguish between a critical temperature
> >alarm and an actual "fault". Makes me wonder if the IBM PS reports
> >CFFPS_MFR_THERMAL_FAULT if there is an actual fault (chip or sensor failure),
> >or if it has the same meaning as PB_TEMP_OT_FAULT, ie an excessively high
> >temperature.
>
> Will change these to "alarm" in the doc too.
>
> >
> >If it is a real fault (a detected sensor failure), we should possibly
> >consider adding a respective "virtual" temperature status flag. The same
> >is true for other status bits reported in the manufacturer status
> >register if any of those reflect a "real" fault, ie a chip failure.
>
> Yea, that would probably be helpful. The CFFPS_MFR_THERMAL_FAULT bit is a
> fault (so the spec says), but I'm not sure what is triggering it.
>
> Thanks,
> Eddie
>
> >
> >>+temp2_input Measured secondary rectifier temp in millidegrees C.
> >>+temp3_alarm ORing FET temperature over-temperature fault.
> >>+temp3_input Measured ORing FET temperature in millidegrees C.
> >>--
> >>1.8.3.1
> >>
>

2017-08-15 20:37:06

by Eddie James

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] Documentation: hwmon: Document the IBM CFF power supply



On 08/14/2017 05:37 PM, Guenter Roeck wrote:
> On Mon, Aug 14, 2017 at 02:26:20PM -0500, Eddie James wrote:
>>
>> On 08/14/2017 01:53 PM, Guenter Roeck wrote:
>>> On Mon, Aug 14, 2017 at 10:26:30AM -0500, Eddie James wrote:
>>>> From: "Edward A. James" <[email protected]>
>>>>
>>>> Signed-off-by: Edward A. James <[email protected]>
>>>> ---
>>>> Documentation/hwmon/ibm-cffps | 54 +++++++++++++++++++++++++++++++++++++++++++
>>>> 1 file changed, 54 insertions(+)
>>>> create mode 100644 Documentation/hwmon/ibm-cffps
>>>>
>>>> diff --git a/Documentation/hwmon/ibm-cffps b/Documentation/hwmon/ibm-cffps
>>>> new file mode 100644
>>>> index 0000000..e091ff2
>>>> --- /dev/null
>>>> +++ b/Documentation/hwmon/ibm-cffps
>>>> @@ -0,0 +1,54 @@
>>>> +Kernel driver ibm-cffps
>>>> +=======================
>>>> +
>>>> +Supported chips:
>>>> + * IBM Common Form Factor power supply
>>>> +
>>>> +Author: Eddie James <[email protected]>
>>>> +
>>>> +Description
>>>> +-----------
>>>> +
>>>> +This driver supports IBM Common Form Factor (CFF) power supplies. This driver
>>>> +is a client to the core PMBus driver.
>>>> +
>>>> +Usage Notes
>>>> +-----------
>>>> +
>>>> +This driver does not auto-detect devices. You will have to instantiate the
>>>> +devices explicitly. Please see Documentation/i2c/instantiating-devices for
>>>> +details.
>>>> +
>>>> +Sysfs entries
>>>> +-------------
>>>> +
>>>> +The following attributes are supported:
>>>> +
>>>> +curr1_alarm Output current over-current fault.
>>>> +curr1_input Measured output current in mA.
>>>> +curr1_label "iout1"
>>>> +
>>>> +fan1_alarm Fan 1 warning.
>>>> +fan1_fault Fan 1 fault.
>>>> +fan1_input Fan 1 speed in RPM.
>>>> +fan2_alarm Fan 2 warning.
>>>> +fan2_fault Fan 2 fault.
>>>> +fan2_input Fan 2 speed in RPM.
>>>> +
>>>> +in1_alarm Input voltage under-voltage fault.
>>> Just noticed. Are you sure you mean 'fault' here and below ?
>>> 'alarm' attributes normally report an over- or under- condition,
>>> but not a fault. Faults should be reported with 'fault' attributes.
>>> In PMBus lingo (which doesn't distinguish a real 'fault' from
>>> a critical over- or under- condition), the "FAULT" condition
>>> usually maps with the 'crit_alarm' or 'lcrit_alarm' attributes.
>>> Also, under-voltages would normally be reported as min_alarm
>>> or clrit_alarm, not in_alarm.
>> Thanks, I better change this doc to "alarm." The spec reports all these as
>> "faults" but many of them are merely over-temp or over-voltage, etc, and
>> should be "alarm" to be consistent with PMBus.
>>
>> The problem with this power supply is that it doesn't report any "limits."
>> So unless I set up my read_byte function to return some limits, we can't get
>> any lower or upper limits and therefore won't get the crit_alarm,
>> lcrit_alarm, etc. Do you think I should "fake" the limits in the driver?
>>
> Good question. Are the limits documented ? If yes, that would make sense.
> I am quite sure that limits are word registers, though.

No, no documentation on any limits... I'll leave it as is, as it it's
meeting our requirements for now. I'll just change "fault" to "alarm" in
the doc here.

Thanks,
Eddie

>
> Guenter
>
>>>> +in1_input Measured input voltage in mV.
>>>> +in1_label "vin"
>>>> +in2_alarm Output voltage over-voltage fault.
>>>> +in2_input Measured output voltage in mV.
>>>> +in2_label "vout1"
>>>> +
>>>> +power1_alarm Input fault.
>>> Another example; this maps to PMBUS_PIN_OP_WARN_LIMIT which is an
>>> input power alarm, not an indication of a fault condition.
>> Hm, with my latest changes to look at the higher byte of STATUS_WORD, it
>> looks like we now have the same name for both the pin generic alarm
>> attribute and the pin_limit_attr... So in this device's case, it would map
>> to PB_STATUS_INPUT bit of STATUS_WORD. Didn't think about that... any
>> suggestions? Can't really change the name of the limit one without breaking
>> people's code...
>>
>>>> +power1_input Measured input power in uW.
>>>> +power1_label "pin"
>>>> +
>>>> +temp1_alarm PSU inlet ambient temperature over-temperature fault.
>>>> +temp1_input Measured PSU inlet ambient temp in millidegrees C.
>>>> +temp2_alarm Secondary rectifier temp over-temperature fault.
>>> Interestingly, PMBus does not distinguish between a critical temperature
>>> alarm and an actual "fault". Makes me wonder if the IBM PS reports
>>> CFFPS_MFR_THERMAL_FAULT if there is an actual fault (chip or sensor failure),
>>> or if it has the same meaning as PB_TEMP_OT_FAULT, ie an excessively high
>>> temperature.
>> Will change these to "alarm" in the doc too.
>>
>>> If it is a real fault (a detected sensor failure), we should possibly
>>> consider adding a respective "virtual" temperature status flag. The same
>>> is true for other status bits reported in the manufacturer status
>>> register if any of those reflect a "real" fault, ie a chip failure.
>> Yea, that would probably be helpful. The CFFPS_MFR_THERMAL_FAULT bit is a
>> fault (so the spec says), but I'm not sure what is triggering it.
>>
>> Thanks,
>> Eddie
>>
>>>> +temp2_input Measured secondary rectifier temp in millidegrees C.
>>>> +temp3_alarm ORing FET temperature over-temperature fault.
>>>> +temp3_input Measured ORing FET temperature in millidegrees C.
>>>> --
>>>> 1.8.3.1
>>>>