Currently, IRQs for both falling and rising edges of the GPIO
line connected to the DHT11 device are requested. However, the
low states do not carry information; the 0 and 1 bits can be
determined from the timing of two adjacent falling edges just as
well.

Doing so no longer requires reading the GPIO line value within
the IRQ handler, and it halves the number of IRQs that need to be
handled.
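For illustration, with the timings documented in the driver (and
assuming the nominal 50us low time): the falling-to-falling spacing
of a 0-bit is about 50 + 26 = 76us (72-80us over the documented
range), that of a 1-bit about 50 + 70 = 120us (118-125us), so the
adjusted threshold of 49us + 50us = 99us lies between the two
ranges.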
Signed-off-by: Andreas Feldner <[email protected]>
---
drivers/iio/humidity/dht11.c | 28 +++++++++++-----------------
1 file changed, 11 insertions(+), 17 deletions(-)
diff --git a/drivers/iio/humidity/dht11.c b/drivers/iio/humidity/dht11.c
index c97e25448772..d1cd053c5dd4 100644
--- a/drivers/iio/humidity/dht11.c
+++ b/drivers/iio/humidity/dht11.c
@@ -30,13 +30,13 @@
#define DHT11_DATA_VALID_TIME 2000000000 /* 2s in ns */
-#define DHT11_EDGES_PREAMBLE 2
+#define DHT11_EDGES_PREAMBLE 1
#define DHT11_BITS_PER_READ 40
/*
* Note that when reading the sensor actually 84 edges are detected, but
* since the last edge is not significant, we only store 83:
*/
-#define DHT11_EDGES_PER_READ (2 * DHT11_BITS_PER_READ + \
+#define DHT11_EDGES_PER_READ (DHT11_BITS_PER_READ + \
DHT11_EDGES_PREAMBLE + 1)
/*
@@ -46,6 +46,7 @@
* 1-bit: 68-75uS -- typically 70uS (AM2302)
* The acutal timings also depend on the properties of the cable, with
* longer cables typically making pulses shorter.
+ * Low time is constant 50uS.
*
* Our decoding depends on the time resolution of the system:
* timeres > 34uS ... don't know what a 1-tick pulse is
@@ -63,7 +64,8 @@
#define DHT11_START_TRANSMISSION_MIN 18000 /* us */
#define DHT11_START_TRANSMISSION_MAX 20000 /* us */
#define DHT11_MIN_TIMERES 34000 /* ns */
-#define DHT11_THRESHOLD 49000 /* ns */
+#define DHT11_LOW 50000 /* ns */
+#define DHT11_THRESHOLD (49000 + DHT11_LOW) /* ns */
#define DHT11_AMBIG_LOW 23000 /* ns */
#define DHT11_AMBIG_HIGH 30000 /* ns */
@@ -83,7 +85,7 @@ struct dht11 {
/* num_edges: -1 means "no transmission in progress" */
int num_edges;
- struct {s64 ts; int value; } edges[DHT11_EDGES_PER_READ];
+ struct {s64 ts; } edges[DHT11_EDGES_PER_READ];
};
#ifdef CONFIG_DYNAMIC_DEBUG
@@ -99,7 +101,7 @@ static void dht11_edges_print(struct dht11 *dht11)
for (i = 1; i < dht11->num_edges; ++i) {
dev_dbg(dht11->dev, "%d: %lld ns %s\n", i,
dht11->edges[i].ts - dht11->edges[i - 1].ts,
- dht11->edges[i - 1].value ? "high" : "low");
+ "falling");
}
}
#endif /* CONFIG_DYNAMIC_DEBUG */
@@ -125,14 +127,8 @@ static int dht11_decode(struct dht11 *dht11, int offset)
unsigned char temp_int, temp_dec, hum_int, hum_dec, checksum;
for (i = 0; i < DHT11_BITS_PER_READ; ++i) {
- t = dht11->edges[offset + 2 * i + 2].ts -
- dht11->edges[offset + 2 * i + 1].ts;
- if (!dht11->edges[offset + 2 * i + 1].value) {
- dev_dbg(dht11->dev,
- "lost synchronisation at edge %d\n",
- offset + 2 * i + 1);
- return -EIO;
- }
+ t = dht11->edges[offset + i + 1].ts -
+ dht11->edges[offset + i].ts;
bits[i] = t > DHT11_THRESHOLD;
}
@@ -174,9 +170,7 @@ static irqreturn_t dht11_handle_irq(int irq, void *data)
struct dht11 *dht11 = iio_priv(iio);
if (dht11->num_edges < DHT11_EDGES_PER_READ && dht11->num_edges >= 0) {
- dht11->edges[dht11->num_edges].ts = ktime_get_boottime_ns();
- dht11->edges[dht11->num_edges++].value =
- gpiod_get_value(dht11->gpiod);
+ dht11->edges[dht11->num_edges++].ts = ktime_get_boottime_ns();
if (dht11->num_edges >= DHT11_EDGES_PER_READ)
complete(&dht11->completion);
@@ -224,7 +218,7 @@ static int dht11_read_raw(struct iio_dev *iio_dev,
goto err;
ret = request_irq(dht11->irq, dht11_handle_irq,
- IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING,
+ IRQF_TRIGGER_FALLING,
iio_dev->name, iio_dev);
if (ret)
goto err;
--
2.30.2
On Sun, 29 Jan 2023 19:05:23 +0100
Andreas Feldner <[email protected]> wrote:
> Currently, IRQs for both falling and rising edges of the GPIO
> line connected to the DHT11 device are requested. However, the
> low states do not carry information; the 0 and 1 bits can be
> determined from the timing of two adjacent falling edges just as
> well.
>
> Doing so no longer requires reading the GPIO line value within
> the IRQ handler, and it halves the number of IRQs that need to be
> handled.
>
> Signed-off-by: Andreas Feldner <[email protected]>
+CC Harald. It has not been that many years since Harald last replied,
so the address may still be good.
> ---
> drivers/iio/humidity/dht11.c | 28 +++++++++++-----------------
> 1 file changed, 11 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/iio/humidity/dht11.c b/drivers/iio/humidity/dht11.c
> index c97e25448772..d1cd053c5dd4 100644
> --- a/drivers/iio/humidity/dht11.c
> +++ b/drivers/iio/humidity/dht11.c
> @@ -30,13 +30,13 @@
>
> #define DHT11_DATA_VALID_TIME 2000000000 /* 2s in ns */
>
> -#define DHT11_EDGES_PREAMBLE 2
> +#define DHT11_EDGES_PREAMBLE 1
> #define DHT11_BITS_PER_READ 40
> /*
> * Note that when reading the sensor actually 84 edges are detected, but
> * since the last edge is not significant, we only store 83:
> */
> -#define DHT11_EDGES_PER_READ (2 * DHT11_BITS_PER_READ + \
> +#define DHT11_EDGES_PER_READ (DHT11_BITS_PER_READ + \
> DHT11_EDGES_PREAMBLE + 1)
>
> /*
> @@ -46,6 +46,7 @@
> * 1-bit: 68-75uS -- typically 70uS (AM2302)
> * The acutal timings also depend on the properties of the cable, with
> * longer cables typically making pulses shorter.
> + * Low time is constant 50uS.
> *
> * Our decoding depends on the time resolution of the system:
> * timeres > 34uS ... don't know what a 1-tick pulse is
> @@ -63,7 +64,8 @@
> #define DHT11_START_TRANSMISSION_MIN 18000 /* us */
> #define DHT11_START_TRANSMISSION_MAX 20000 /* us */
> #define DHT11_MIN_TIMERES 34000 /* ns */
> -#define DHT11_THRESHOLD 49000 /* ns */
> +#define DHT11_LOW 50000 /* ns */
> +#define DHT11_THRESHOLD (49000 + DHT11_LOW) /* ns */
> #define DHT11_AMBIG_LOW 23000 /* ns */
> #define DHT11_AMBIG_HIGH 30000 /* ns */
>
> @@ -83,7 +85,7 @@ struct dht11 {
>
> /* num_edges: -1 means "no transmission in progress" */
> int num_edges;
> - struct {s64 ts; int value; } edges[DHT11_EDGES_PER_READ];
> + struct {s64 ts; } edges[DHT11_EDGES_PER_READ];
> };
>
> #ifdef CONFIG_DYNAMIC_DEBUG
> @@ -99,7 +101,7 @@ static void dht11_edges_print(struct dht11 *dht11)
> for (i = 1; i < dht11->num_edges; ++i) {
> dev_dbg(dht11->dev, "%d: %lld ns %s\n", i,
> dht11->edges[i].ts - dht11->edges[i - 1].ts,
> - dht11->edges[i - 1].value ? "high" : "low");
> + "falling");
> }
> }
> #endif /* CONFIG_DYNAMIC_DEBUG */
> @@ -125,14 +127,8 @@ static int dht11_decode(struct dht11 *dht11, int offset)
> unsigned char temp_int, temp_dec, hum_int, hum_dec, checksum;
>
> for (i = 0; i < DHT11_BITS_PER_READ; ++i) {
> - t = dht11->edges[offset + 2 * i + 2].ts -
> - dht11->edges[offset + 2 * i + 1].ts;
> - if (!dht11->edges[offset + 2 * i + 1].value) {
> - dev_dbg(dht11->dev,
> - "lost synchronisation at edge %d\n",
> - offset + 2 * i + 1);
> - return -EIO;
> - }
> + t = dht11->edges[offset + i + 1].ts -
> + dht11->edges[offset + i].ts;
> bits[i] = t > DHT11_THRESHOLD;
> }
>
> @@ -174,9 +170,7 @@ static irqreturn_t dht11_handle_irq(int irq, void *data)
> struct dht11 *dht11 = iio_priv(iio);
>
> if (dht11->num_edges < DHT11_EDGES_PER_READ && dht11->num_edges >= 0) {
> - dht11->edges[dht11->num_edges].ts = ktime_get_boottime_ns();
> - dht11->edges[dht11->num_edges++].value =
> - gpiod_get_value(dht11->gpiod);
> + dht11->edges[dht11->num_edges++].ts = ktime_get_boottime_ns();
>
> if (dht11->num_edges >= DHT11_EDGES_PER_READ)
> complete(&dht11->completion);
> @@ -224,7 +218,7 @@ static int dht11_read_raw(struct iio_dev *iio_dev,
> goto err;
>
> ret = request_irq(dht11->irq, dht11_handle_irq,
> - IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING,
> + IRQF_TRIGGER_FALLING,
> iio_dev->name, iio_dev);
> if (ret)
> goto err;
On 2023-01-30 21:22, Jonathan Cameron wrote:
> On Sun, 29 Jan 2023 19:05:23 +0100
> Andreas Feldner <[email protected]> wrote:
>
>> Currently, IRQs for both falling and rising edges of the GPIO
>> line connected to the DHT11 device are requested. However, the
>> low states do not carry information; the 0 and 1 bits can be
>> determined from the timing of two adjacent falling edges just as
>> well.
This is probably true, but it wasn't obvious from reading the data
sheet how constant the low times actually are. Back then the idea
also was to use the low times for recovery from line noise etc.
In the end the driver works reliably without that, so we could make
this change.
However, aside from the DHT11, there are also a number of different
chips sold as DHT22. I tried to get as many of them as possible
and ran extensive tests to ensure they all work properly, also
with different timer resolutions.
It would be quite an effort to replicate this for your new
algorithm. However, if you want to pursue this, the first step
would be to prove (probably by calculation) that, no matter the
timer resolution (i.e. some systems have 32kHz timers only), the
new algorithm doesn't lead to decoding ambiguity in cases where
the current algorithm is unambiguous.
>> Doing so no longer requires reading the GPIO line value within
>> the IRQ handler, and it halves the number of IRQs that need to
>> be handled.
This seems like a really small benefit. And we would lose the
low state timings in debug output, which I personally find quite
convenient. Unless there is data showing that this change actually
improves something for somebody, I'd reject it.
Also, if we are ever to support shared interrupts, we will need to
read the line value anyway.
>> Signed-off-by: Andreas Feldner <[email protected]>
>
> +CC Harald. It has not been that many years since Harald last replied,
> so the address may still be good.
Thanks for including me in the discussion.
best regards,
Harald
>> ---
>> drivers/iio/humidity/dht11.c | 28 +++++++++++-----------------
>> 1 file changed, 11 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/iio/humidity/dht11.c b/drivers/iio/humidity/dht11.c
>> index c97e25448772..d1cd053c5dd4 100644
>> --- a/drivers/iio/humidity/dht11.c
>> +++ b/drivers/iio/humidity/dht11.c
>> @@ -30,13 +30,13 @@
>>
>> #define DHT11_DATA_VALID_TIME 2000000000 /* 2s in ns */
>>
>> -#define DHT11_EDGES_PREAMBLE 2
>> +#define DHT11_EDGES_PREAMBLE 1
>> #define DHT11_BITS_PER_READ 40
>> /*
>> * Note that when reading the sensor actually 84 edges are detected, but
>> * since the last edge is not significant, we only store 83:
>> */
>> -#define DHT11_EDGES_PER_READ (2 * DHT11_BITS_PER_READ + \
>> +#define DHT11_EDGES_PER_READ (DHT11_BITS_PER_READ + \
>> DHT11_EDGES_PREAMBLE + 1)
>>
>> /*
>> @@ -46,6 +46,7 @@
>> * 1-bit: 68-75uS -- typically 70uS (AM2302)
>> * The acutal timings also depend on the properties of the cable, with
>> * longer cables typically making pulses shorter.
>> + * Low time is constant 50uS.
>> *
>> * Our decoding depends on the time resolution of the system:
>> * timeres > 34uS ... don't know what a 1-tick pulse is
>> @@ -63,7 +64,8 @@
>> #define DHT11_START_TRANSMISSION_MIN 18000 /* us */
>> #define DHT11_START_TRANSMISSION_MAX 20000 /* us */
>> #define DHT11_MIN_TIMERES 34000 /* ns */
>> -#define DHT11_THRESHOLD 49000 /* ns */
>> +#define DHT11_LOW 50000 /* ns */
>> +#define DHT11_THRESHOLD (49000 + DHT11_LOW) /* ns */
>> #define DHT11_AMBIG_LOW 23000 /* ns */
>> #define DHT11_AMBIG_HIGH 30000 /* ns */
>>
>> @@ -83,7 +85,7 @@ struct dht11 {
>>
>> /* num_edges: -1 means "no transmission in progress" */
>> int num_edges;
>> - struct {s64 ts; int value; } edges[DHT11_EDGES_PER_READ];
>> + struct {s64 ts; } edges[DHT11_EDGES_PER_READ];
>> };
>>
>> #ifdef CONFIG_DYNAMIC_DEBUG
>> @@ -99,7 +101,7 @@ static void dht11_edges_print(struct dht11 *dht11)
>> for (i = 1; i < dht11->num_edges; ++i) {
>> dev_dbg(dht11->dev, "%d: %lld ns %s\n", i,
>> dht11->edges[i].ts - dht11->edges[i - 1].ts,
>> - dht11->edges[i - 1].value ? "high" : "low");
>> + "falling");
>> }
>> }
>> #endif /* CONFIG_DYNAMIC_DEBUG */
>> @@ -125,14 +127,8 @@ static int dht11_decode(struct dht11 *dht11, int offset)
>> unsigned char temp_int, temp_dec, hum_int, hum_dec, checksum;
>>
>> for (i = 0; i < DHT11_BITS_PER_READ; ++i) {
>> - t = dht11->edges[offset + 2 * i + 2].ts -
>> - dht11->edges[offset + 2 * i + 1].ts;
>> - if (!dht11->edges[offset + 2 * i + 1].value) {
>> - dev_dbg(dht11->dev,
>> - "lost synchronisation at edge %d\n",
>> - offset + 2 * i + 1);
>> - return -EIO;
>> - }
>> + t = dht11->edges[offset + i + 1].ts -
>> + dht11->edges[offset + i].ts;
>> bits[i] = t > DHT11_THRESHOLD;
>> }
>>
>> @@ -174,9 +170,7 @@ static irqreturn_t dht11_handle_irq(int irq, void *data)
>> struct dht11 *dht11 = iio_priv(iio);
>>
>> if (dht11->num_edges < DHT11_EDGES_PER_READ && dht11->num_edges >= 0) {
>> - dht11->edges[dht11->num_edges].ts = ktime_get_boottime_ns();
>> - dht11->edges[dht11->num_edges++].value =
>> - gpiod_get_value(dht11->gpiod);
>> + dht11->edges[dht11->num_edges++].ts = ktime_get_boottime_ns();
>>
>> if (dht11->num_edges >= DHT11_EDGES_PER_READ)
>> complete(&dht11->completion);
>> @@ -224,7 +218,7 @@ static int dht11_read_raw(struct iio_dev *iio_dev,
>> goto err;
>>
>> ret = request_irq(dht11->irq, dht11_handle_irq,
>> - IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING,
>> + IRQF_TRIGGER_FALLING,
>> iio_dev->name, iio_dev);
>> if (ret)
>> goto err;
Following up on Harald's remark, I can indeed provide some first
comparison data:
On 31.01.23 at 10:44, [email protected] wrote:
> This seems like a really small benefit. And we would lose the
> low state timings in debug output, which I personally find quite
> convenient. Unless there is data showing that this change actually
> improves something for somebody, I'd reject it.
Running the test script against the original kernel module (see below
for details on the setup):
 #     real [s]  user [s]  sys [s]  success  fails  err per succ
 1      222,068     0,515    0,506       83     96      115,66 %
 2      223,152     0,603    0,493       86     99      115,12 %
*3*     223,502     0,563    0,411       91     68       74,73 %
*4*     209,626     0,431    0,189      100     15       15,00 %
*5*     209,689     0,460    0,193      100     19       19,00 %
*6*     220,350     0,413    0,315      100     35       35,00 %
Running the patched module:
 #     real [s]  user [s]  sys [s]  success  fails  err per succ
 1      223,061     0,459    0,258       88     25       28,41 %
 2      222,431     0,561    0,367       75     57       76,00 %
 3      225,675     0,436    0,178       92     19       20,65 %
 4      222,746     0,444    0,194       98     23       23,47 %
 5      222,668     0,416    0,205       97     20       20,62 %
*6*     204,126     0,340    0,138      100      0        0,00 %
*7*     210,495     0,393    0,199      100     16       16,00 %
*8*     212,563     0,447    0,139      100     19       19,00 %
All tests were run on the same board, an Allwinner H3 sold as BananaPi
M2 Zero, under kernel 6.2.0-rc5+. The devicetree overlay sets the
input-debounce property of &pio to 5µs, or, because of the excessive
error rates of the original driver in that configuration, to 1µs (rows
marked with an asterisk).
The test simply tries to read temperature and humidity from the input
files exposed by the IIO dht11 driver every 2 seconds, immediately
retrying after an error.
Real/user/sys times are determined by the good old time command;
successes and failures are counted by the test script.
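In outline, the loop does the equivalent of this little C sketch
(the iio:device0 index, the fixed read count and the sysfs file
names are assumptions for illustration; the real script differs
in details):

#include <stdio.h>
#include <unistd.h>

static int read_channel(const char *path)
{
        FILE *f = fopen(path, "r");
        char buf[32];
        int ok;

        if (!f)
                return -1;
        /* the driver returns EIO or ETIMEDOUT on a failed read */
        ok = fgets(buf, sizeof(buf), f) != NULL;
        fclose(f);
        return ok ? 0 : -1;
}

int main(void)
{
        int success = 0, failure = 0;

        while (success < 100) {
                if (read_channel("/sys/bus/iio/devices/iio:device0/in_temp_input") ||
                    read_channel("/sys/bus/iio/devices/iio:device0/in_humidityrelative_input")) {
                        failure++;      /* retry immediately after an error */
                        continue;
                }
                success++;
                sleep(2);               /* values stay cached for 2s anyway */
        }
        printf("successes: %d, failures: %d\n", success, failure);
        return 0;
}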
Two aspects stand out:
1) The patched version of the driver works satisfactorily even with the
5µs input-debounce filter, whereas the original driver shows more
failed than successful reads in that configuration.
2) The error rate is consistently lower with the patched driver
(average 67,9% vs. 33,8%).
I believe I see similar results, i.e. a noticeable improvement in the
error rate, on my trusty old Raspberry Pi 2B (without any devicetree
fiddling, of course), though without a quantitative comparison and on
some Raspbian patch level rather than kernel 6.2.0-rc5+.
Of course, I only have access to a handful of DHT22 devices, most
probably from the same production batch. But at least I'd like to
stick with the patched version, to be honest.
Hope this helps; let me know if it would pay to work on another
version of the patch!
Best wishes
Andreas
On 2023-02-05 21:41, [email protected] wrote:
> Following up on Harald's remark, I can indeed provide some first
> comparison data:
>
> On 31.01.23 at 10:44, [email protected] wrote:
>> This seems like a really small benefit. And we would lose the
>> low state timings in debug output, which I personally find quite
>> convenient. Unless there is data showing that this change actually
>> improves something for somebody, I'd reject it.
>
> Running the test script against the original kernel module (see below
> for details on the setup):
>
>  #     real [s]  user [s]  sys [s]  success  fails  err per succ
>  1      222,068     0,515    0,506       83     96      115,66 %
>  2      223,152     0,603    0,493       86     99      115,12 %
> *3*     223,502     0,563    0,411       91     68       74,73 %
> *4*     209,626     0,431    0,189      100     15       15,00 %
> *5*     209,689     0,460    0,193      100     19       19,00 %
> *6*     220,350     0,413    0,315      100     35       35,00 %
>
>
> Running the patched module:
>
>  #     real [s]  user [s]  sys [s]  success  fails  err per succ
>  1      223,061     0,459    0,258       88     25       28,41 %
>  2      222,431     0,561    0,367       75     57       76,00 %
>  3      225,675     0,436    0,178       92     19       20,65 %
>  4      222,746     0,444    0,194       98     23       23,47 %
>  5      222,668     0,416    0,205       97     20       20,62 %
> *6*     204,126     0,340    0,138      100      0        0,00 %
> *7*     210,495     0,393    0,199      100     16       16,00 %
> *8*     212,563     0,447    0,139      100     19       19,00 %
>
> All tests were run on the same board, an Allwinner H3 sold as BananaPi
> M2 Zero, under kernel 6.2.0-rc5+. The devicetree overlay sets the
> input-debounce property of &pio to 5µs, or, because of the excessive
> error rates of the original driver in that configuration, to 1µs (rows
> marked with an asterisk).
>
> The test simply tries to read temperature and humidity from the input
> files exposed by the IIO dht11 driver every 2 seconds, immediately
> retrying after an error.
>
> Real/user/sys times are determined by the good old time command;
> successes and failures are counted by the test script.
>
> Two aspects stand out:
>
> 1) The patched version of the driver works satisfactorily even with
> the 5µs input-debounce filter, whereas the original driver shows more
> failed than successful reads in that configuration.
>
> 2) The error rate is consistently lower with the patched driver
> (average 67,9% vs. 33,8%).
>
> I believe I see similar results, i.e. a noticeable improvement in the
> error rate, on my trusty old Raspberry Pi 2B (without any devicetree
> fiddling, of course), though without a quantitative comparison and on
> some Raspbian patch level rather than kernel 6.2.0-rc5+.
>
> Of course, I only have access to a handful of DHT22 devices, most
> probably from the same production batch. But at least I'd like to
> stick with the patched version, to be honest.
Aside from the different chips (and most of all the old DHT11), there
are many things that influence timing: cable length, input
characteristics, etc. It looks good, but we should still be careful.
> Hope this helps; let me know if it would pay to work on another
> version of the patch!
Thanks, these are indeed interesting results. If you want to move this
forward, the next steps would be:
1) Sharing your test script - yes, it's trivial, but still ...
2) A theoretical analysis of possible regressions depending on timer
resolution, as mentioned in an earlier message.
3) Ideally, figuring out why your version performs better than what we
currently have. I have some suspicions, but better understanding might
lead to a better approach. E.g. maybe recording the other edges isn't
the problem so long as we ignore them during decoding?
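For instance, a rough (untested) sketch of that idea: keep requesting
both edges, so the recorded values and the debug output stay as they
are, but let dht11_decode() look only at the spacing of adjacent
falling edges:

        for (i = 0; i < DHT11_BITS_PER_READ; ++i) {
                /* edges[offset + 2 * i] and edges[offset + 2 * i + 2]
                 * are falling edges; the rising edge in between is
                 * kept for diagnostics but ignored for decoding. */
                t = dht11->edges[offset + 2 * i + 2].ts -
                    dht11->edges[offset + 2 * i].ts;
                bits[i] = t > DHT11_THRESHOLD;
        }

where DHT11_THRESHOLD would then have to include the low time, as in
your patch.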
As I see it, the main thing we are losing with your current proposal is
some diagnostic features. If we keep them as much as possible and have
regressions understood and covered, I see no reason to reject your idea.
best regards,
Harald
> Best wishes
>
> Andreas
On 07.02.23 at 11:33, [email protected] wrote:
> Thanks, these are indeed interesting results. If you want to move this
> forward, the next steps would be:
> 1) Sharing your test script - yes, it's trivial, but still ...
That part is fairly easy; you can find the script here:
https://github.com/pelzvieh/banana_resources/tree/main/test_dht11
> 2) A theoretical analysis of possible regressions depending on timer
> resolution, as mentioned in an earlier message.
This sounds as if you did such an analysis for the original
version. Can you share that work, so I can attempt to repeat it
for the modified algorithm?
> 3) Ideally, figuring out why your version performs better than what we
> currently have. I have some suspicions, but better understanding might
> lead to a better approach. E.g. maybe recording the other edges isn't
> the problem so long as we ignore them during decoding?
>
> As I see it, the main thing we are losing with your current proposal is
> some diagnostic features. If we keep them as much as possible and have
> regressions understood and covered, I see no reason to reject your idea.
That's why I changed the script to separately count EIO and ETIMEDOUT.
The latter indicates missed edges, the former a failure to interpret
the data read.
What I see is that the patched driver's errors mostly result from
missed IRQs (note that, in contrast to the last results, I cut the
number of reads):
 #   real [s]  user [s]  sys [s]  success  EIO  timeout  err per succ
 1      20.57      0.25     0.03       10    0        0           0.0
 2      24.74      0.25     0.07       10    0        4           0.4
 3      21.55      0.20     0.07       10    0        0           0.0
 4      25.81      0.25     0.08       10    0        5           0.5
 5      21.56      0.23     0.05       10    0        0           0.0
 6      21.58      0.22     0.05       10    1        0           0.1
 7      25.86      0.24     0.08       10    1        5           0.6
 8      22.69      0.27     0.05       10    1        1           0.2
 9      23.67      0.26     0.04       10    0        2           0.2
10      20.55      0.23     0.04       10    0        0           0.0
Whereas the original driver has more errors resulting from
misinterpreted data:
 #   real [s]  user [s]  sys [s]  success  EIO  timeout  err per succ
 1      24.88      0.26     0.07       10    5        4           0.9
 2      25.91      0.26     0.07       10    4        5           0.9
 3      31.27      0.31     0.10       10    6       10           1.6
 4      29.17      0.32     0.11       10    7        8           1.5
 5      22.73      0.24     0.08       10    4        2           0.6
 6      46.46      0.35     0.25       10   19       24           4.3
 7      23.79      0.23     0.09       10    3        3           0.6
 8      30.17      0.27     0.11       10    6        9           1.5
 9      23.77      0.26     0.06       10    3        2           0.5
10      20.58      0.24     0.06       10    1        0           0.1
I tried a variant that reads falling and rising edges and uses the
redundancy of the information to eliminate some errors. This did
not work out at all. It seems a relevant source of trouble is a
delayed call to the IRQ handler. The problem is that only then do
you try to find out whether this IRQ is due to a rising or a
falling edge, by reading the current GPIO level. When you are too
late, the level might already have changed, and you read a level,
but for the edge of _this_ level you'll receive another IRQ a few
µs later.
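Schematically (a made-up timeline, not to scale):

    line:  ___/############\________
              A            B
    A: the IRQ for the rising edge is raised
    B: the handler finally runs; gpiod_get_value() already returns 0,
       so the rising edge is recorded with value == 0, and the falling
       edge's own IRQ still arrives a few µs later.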
So the reason that this patch is showing lower error rates seems
to be the lower probability of such things happening, by halving
the number of IRQs to be handled, _plus_ the information from the
hardware that this IRQ was due to a falling edge.
Yours,
Andreas.
On 2023-02-11 11:41, [email protected] wrote:
> On 07.02.23 at 11:33, [email protected] wrote:
>> 2) A theoretical analysis of possible regressions depending on timer
>> resolution, as mentioned in an earlier message.
>
> This sounds as if you did such an analysis for the original
> version. Can you share that work, so I can attempt to repeat it
> for the modified algorithm?
The short version is in the comments. The relevant section is:
/*
* Data transmission timing:
* Data bits are encoded as pulse length (high time) on the data line.
* 0-bit: 22-30uS -- typically 26uS (AM2302)
* 1-bit: 68-75uS -- typically 70uS (AM2302)
* The acutal timings also depend on the properties of the cable, with
* longer cables typically making pulses shorter.
*
* Our decoding depends on the time resolution of the system:
* timeres > 34uS ... don't know what a 1-tick pulse is
* 34uS > timeres > 30uS ... no problem (30kHz and 32kHz clocks)
* 30uS > timeres > 23uS ... don't know what a 2-tick pulse is
* timeres < 23uS ... no problem
The long version I probably don't have anymore, but it's not rocket
science - the critical resolutions are just the pulse lengths divided
by small tick counts. E.g.:
34 = 68/2
23 = 68/3
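Applied to your falling-to-falling variant (assuming the constant
50uS low time, i.e. 72-80uS intervals for a 0-bit and 118-125uS for
a 1-bit), the analogous bounds would come out as fractions of 118,
e.g. 59 = 118/2 and 39 ~ 118/3 - but that back-of-the-envelope
arithmetic is no substitute for redoing the analysis properly.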
>> 3) Ideally, figuring out why your version performs better than what we
>> currently have. I have some suspicions, but better understanding might
>> lead to a better approach. E.g. maybe recording the other edges isn't
>> the problem so long as we ignore them during decoding?
>>
>> As I see it, the main thing we are losing with your current proposal is
>> some diagnostic features. If we keep them as much as possible and have
>> regressions understood and covered, I see no reason to reject your idea.
>
> That's why I changed the script to separately count EIO and ETIMEDOUT.
> The latter indicates missed edges, the former a failure to interpret
> the data read.
>
> What I see is that the patched driver's errors mostly result from
> missed IRQs (note that, in contrast to the last results, I cut the
> number of reads):
>
>  #   real [s]  user [s]  sys [s]  success  EIO  timeout  err per succ
>  1      20.57      0.25     0.03       10    0        0           0.0
>  2      24.74      0.25     0.07       10    0        4           0.4
>  3      21.55      0.20     0.07       10    0        0           0.0
>  4      25.81      0.25     0.08       10    0        5           0.5
>  5      21.56      0.23     0.05       10    0        0           0.0
>  6      21.58      0.22     0.05       10    1        0           0.1
>  7      25.86      0.24     0.08       10    1        5           0.6
>  8      22.69      0.27     0.05       10    1        1           0.2
>  9      23.67      0.26     0.04       10    0        2           0.2
> 10      20.55      0.23     0.04       10    0        0           0.0
>
> Whereas the original driver has more errors resulting from
> misinterpreted data:
>
>  #   real [s]  user [s]  sys [s]  success  EIO  timeout  err per succ
>  1      24.88      0.26     0.07       10    5        4           0.9
>  2      25.91      0.26     0.07       10    4        5           0.9
>  3      31.27      0.31     0.10       10    6       10           1.6
>  4      29.17      0.32     0.11       10    7        8           1.5
>  5      22.73      0.24     0.08       10    4        2           0.6
>  6      46.46      0.35     0.25       10   19       24           4.3
>  7      23.79      0.23     0.09       10    3        3           0.6
>  8      30.17      0.27     0.11       10    6        9           1.5
>  9      23.77      0.26     0.06       10    3        2           0.5
> 10      20.58      0.24     0.06       10    1        0           0.1
>
> I tried a variant that reads falling and rising edges and uses the
> redundancy of the information to eliminate some errors. This did
> not work out at all.
That's an interesting data point. Care to share the code?
> It seems a relevant source of trouble is a delayed call to the IRQ
> handler. The problem is that only then do you try to find out whether
> this IRQ is due to a rising or a falling edge, by reading the current
> GPIO level. When you are too late, the level might already have
> changed, and you read a level, but for the edge of _this_ level you'll
> receive another IRQ a few µs later.
I doubt this interpretation. Mostly, I don't think you would even
get a second interrupt in this case: it is just a flag that
indicates something has changed, not a counter.
I expect that you just get one missing edge (which we don't notice,
because we are tolerant about a missing preamble), which would
show up as two consecutive edges recorded with the same value - not
three as your explanation suggests.
I don't see why it wouldn't be possible to recover from that in
cases where the delay is small enough for your version to work.
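Just to sketch the idea (rough and untested): with the recorded
values kept, such a lost edge is detectable, so a decoder could try
to resynchronize instead of bailing out:

        for (i = 1; i < dht11->num_edges; ++i) {
                if (dht11->edges[i].value == dht11->edges[i - 1].value) {
                        /* an edge got lost between i - 1 and i: the
                         * recorded interval spans two real pulses, and
                         * the known 50uS low time could be used to
                         * split it up again instead of giving up
                         */
                        dev_dbg(dht11->dev, "edge lost before %d\n", i);
                }
        }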
> So the reason that this patch is showing lower error rates seems
> to be the lower probability of such things happening, by halving
> the number of IRQs to be handled, _plus_ the information from the
> hardware that this IRQ was due to a falling edge.
The first part is likely true at the moment and seems enough to
explain the data you have shown. I still believe we could be
smart about the second part in software.
Thanks,
Harald