LinuxLists.cc - [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

2005-03-17 13:30:56

Subject: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Hey All,

I'm developing an application on an ARM7-based platform running
Linux 2.4.18 with the latest BlueZ version for that platform. In this
application there is a master device that can connect to up to 7 slaves
(master and slave devices are all the same; the external controller
determines which is master).

The application works as follows. The external controller on the master
can send a single command to the Linux application; the command contains an
address field identifying the slaves to which it must be passed on. Each
addressed slave passes the command on to its external controller, which
responds by sending a reply back to the Linux slave. The slave in turn
passes the reply on to the Linux master, which hands it over to its
external controller. As soon as all addressed slaves have responded,
the sequence is repeated.

Now to what I see happening. When I run this with 5 slaves, I don't see any
problems for more than an hour, but when I run it with 6 or 7 slaves, the
BlueCore resets itself: with 6 slaves after about 40 minutes, and with
7 slaves already after 5 to 10 minutes. When I eliminate the CRC check,
I can run the 7-slave configuration for up to about 40 minutes before it
crashes.

What I think happens is that the BlueCore receives data too fast over the air
and cannot pass it on to the Linux CPU quickly enough, so there is an
internal overflow in the BlueCore.
Communication between the BlueCore and the Linux CPU is done via BCSP at a
speed of 921600 baud.
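
As a rough sanity check on the bandwidth hypothesis, the raw byte rate of
the UART link can be estimated from the baud rate. This is only a sketch
under stated assumptions: the frame sizes (10 bits without parity, 11 bits
with a parity bit) are assumed rather than taken from this setup, the
function name is made up for illustration, BCSP's own framing and CRC
overhead is ignored, and 723.2 kbit/s is the nominal one-way peak ACL rate
of basic-rate Bluetooth, not a measured figure.

```python
def uart_bytes_per_second(baud, bits_per_frame):
    """Raw UART byte rate: each byte costs bits_per_frame bit times."""
    return baud / bits_per_frame

BAUD = 921600                 # link speed quoted in the message
ACL_PEAK_BITS_PER_S = 723200  # nominal one-way peak ACL rate (assumption)

no_parity = uart_bytes_per_second(BAUD, 10)    # start + 8 data + stop
with_parity = uart_bytes_per_second(BAUD, 11)  # start + 8 data + parity + stop
air_peak = ACL_PEAK_BITS_PER_S / 8

print(f"UART, 10-bit frames: {no_parity:.0f} bytes/s")
print(f"UART, 11-bit frames: {with_parity:.0f} bytes/s")
print(f"Air interface peak:  {air_peak:.0f} bytes/s")
```

Under these assumptions the two figures are of the same order of
magnitude, so arithmetic alone neither confirms nor rules out the
hypothesis.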

Has anybody seen behaviour like this before, and is there something I can do
about it?

Greetings,
Han

---
Han Hoekstra
Wireless Value B.V.
Waanderweg 30a
7812 HZ Emmen
Tel: +31-591-633200

2005-03-17 19:06:59

by Marcel Holtmann

Subject: Re: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Hi Steven,

> > What document should I read
> > for more information about the panic/fault code meanings?
>
> HQ Commands (bcore-sp-003Pc). They're in section 6.
>
> In addition to the information in there, you can decode the following:
>
> 0x0080-0x00ff: Debugging codes. These should be used only within CSR
> for testing purposes and, in theory, should not make it
> into released code, but you never know.
>
> 0x0100-0xffff: No panic/fault since last power cycle. When the chip is
> powered on, the panic and fault codes usually end up
> somewhere in this area. In theory they could end up in
> the valid range, but it's unlikely. We don't use codes
> in this area for real panics or faults.
>
> If you want to test them, then the BCCMD variables are writable so you
> should be able to write to them, read them back, and confirm that
> they're preserved across reset but not power cycle.
>
> If you want to test them further, then writing to variables 0x4820 and
> 0x4822 will cause the chip to panic or fault respectively. Each takes a
> single 16-bit argument containing the code to use. In the case of panic,
> this means that the chip will reset (unless the watchdog is disabled).
> In the case of fault, an HCI Hardware_Error event and an HQ PDU should
> be emitted.
>
> If you're writing the codes or provoking the actions via BCCMD then you
> can specify codes outside the normally valid range. This means you
> should be able to test the full range of your decode.

The code for reading the value and displaying it if it is < 0x0100 is
now in CVS. I leave it to someone else to add a {panic|fault}2str()
routine to the files utils/tools/csr.[ch].

Regards

Marcel

-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
Bluez-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bluez-users

2005-03-17 17:50:35

by Steven Singer

Subject: Re: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Marcel Holtmann wrote:
> Steven Singer wrote:
>> would it be a good idea to add reading of the panic and
>> fault codes to the CSR specific section of hcitool revision? Or,
>> get BlueZ to read the panic code when the interface is brought up. If
>> it's non-zero and < 0x0100 write it to an event log and then zero it.
>
> the kernel code should be completely vendor independent and thus adding
> this to "hciconfig hci0 revision" is the only option.

That's OK. At least it will make it easy for users to find out this
information.

> Are both variables UINT16? If yes, then it will be quite easy to add
> this information to the hciconfig command.

Yes.

> What document should I read
> for more information about the panic/fault code meanings?

HQ Commands (bcore-sp-003Pc). They're in section 6.

In addition to the information in there, you can decode the following:

0x0080-0x00ff: Debugging codes. These should be used only within CSR
for testing purposes and, in theory, should not make it
into released code, but you never know.

0x0100-0xffff: No panic/fault since last power cycle. When the chip is
powered on, the panic and fault codes usually end up
somewhere in this area. In theory they could end up in
the valid range, but it's unlikely. We don't use codes
in this area for real panics or faults.

If you want to test them, then the BCCMD variables are writable so you
should be able to write to them, read them back, and confirm that
they're preserved across reset but not power cycle.

If you want to test them further, then writing to variables 0x4820 and
0x4822 will cause the chip to panic or fault respectively. Each takes a
single 16-bit argument containing the code to use. In the case of panic,
this means that the chip will reset (unless the watchdog is disabled).
In the case of fault, an HCI Hardware_Error event and an HQ PDU should
be emitted.

If you're writing the codes or provoking the actions via BCCMD then you
can specify codes outside the normally valid range. This means you
should be able to test the full range of your decode.
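
The ranges described above can be sketched as a small classifier. This is
an illustration rather than BlueZ or CSR code: the function name and the
returned labels are made up, treating 0 as "cleared" is an assumption
based on the suggestion to zero the variable after reading it, and
treating 0x0001-0x007f as the documented range is inferred from the two
ranges listed explicitly.

```python
def classify_csr_code(value):
    """Classify a 16-bit panic or fault code by the ranges described above."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("code must fit in 16 bits")
    if value == 0:
        # Assumption: zero means the variable was cleared after a read.
        return "cleared"
    if value < 0x0080:
        # Assumption: these are the codes listed in the HQ Commands document.
        return "documented"
    if value < 0x0100:
        # 0x0080-0x00ff: debugging codes meant for internal testing.
        return "debug"
    # 0x0100-0xffff: leftover power-up contents, no panic/fault recorded.
    return "none-since-power-cycle"

for sample in (0x0000, 0x007F, 0x0080, 0x00FF, 0x0100, 0xFFFF):
    print(f"0x{sample:04x}: {classify_csr_code(sample)}")
```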

- Steven
--

**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.

**********************************************************************

2005-03-17 16:50:08

by Marcel Holtmann

Subject: Re: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Hi Steven,

> would it be a good idea to add reading of the panic and
> fault codes to the CSR specific section of hcitool revision? Or,
> get BlueZ to read the panic code when the interface is brought up. If
> it's non-zero and < 0x0100 write it to an event log and then zero it.

the kernel code should be completely vendor independent and thus adding
this to "hciconfig hci0 revision" is the only option.

Are both variables UINT16? If yes, then it will be quite easy to add
this information to the hciconfig command. What document should I read
for more information about the panic/fault code meanings?

Regards

Marcel

2005-03-17 16:28:39

by Steven Singer

Subject: Re: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Han Hoekstra wrote:
> What I think happens is that the BlueCore receives data too fast over the air
> and cannot pass it on to the Linux CPU quickly enough, so there is an
> internal overflow in the BlueCore.

This shouldn't happen.

The BlueCore firmware has multiple layers of defences in place to prevent
exactly this sort of problem.

If it's reset, it's possible that the firmware has detected a problem
and has panicked. If it has, a panic code will be stored in RAM and
preserved across the reset. You should be able to read it by using
BCCMD to get variable 0x6805. You might also want to read the fault
code in variable 0x6806.

Panics represent unrecoverable fatal errors (like they do for the
Linux kernel). They cause the chip to reset immediately. Faults
represent recoverable errors. The firmware does its best to continue.
It reports the problem with an HCI Hardware_Error event and an HQ
report.

From the panic code, I should be able to tell you exactly what the
firmware thinks has gone wrong.

One slight snag is that since the panic and fault codes are kept in an
area of RAM that's never reset, on power up they go to random values.
All valid values are less than 0x0100, so usually you can distinguish
the random values from real values. You can use bccmd to set the value
to 0 after you've read it to make sure that you're seeing an up to date
value.
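
The read, check and clear sequence described above might be sketched as
follows. The variable IDs 0x6805 and 0x6806 come from this message;
everything else is assumed for illustration. In particular, read_var and
write_var are hypothetical callables standing in for whatever BCCMD
transport is available, and the values in the in-memory stand-in are
arbitrary.

```python
PANIC_VARID = 0x6805  # BCCMD variable holding the last panic code
FAULT_VARID = 0x6806  # BCCMD variable holding the last fault code

def collect_codes(read_var, write_var):
    """Return plausible panic/fault codes, then zero both variables.

    read_var(varid) -> int and write_var(varid, value) are hypothetical
    helpers; they are not part of any real BlueZ API.
    """
    found = {}
    for name, varid in (("panic", PANIC_VARID), ("fault", FAULT_VARID)):
        code = read_var(varid)
        # Only non-zero values below 0x0100 are treated as real codes.
        if 0 < code < 0x0100:
            found[name] = code
        # Zero the variable so the next read only reflects new events.
        write_var(varid, 0)
    return found

# Demonstration against an in-memory stand-in for the chip (arbitrary values).
fake_chip = {PANIC_VARID: 0x0042, FAULT_VARID: 0xBEEF}
result = collect_codes(fake_chip.get, fake_chip.__setitem__)
print(result)
```

Passing the transport in as two callables keeps the sketch independent of
how BCCMD is actually reached on a given system.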

[Marcel, would it be a good idea to add reading of the panic and
fault codes to the CSR specific section of hcitool revision? Or,
get BlueZ to read the panic code when the interface is brought up. If
it's non-zero and < 0x0100 write it to an event log and then zero it.]

- Steven
--

2005-03-17 15:43:42

by Han Hoekstra

Subject: RE: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Hey Marcel,

We are using a Mitsumi WML-C20, which is a BlueCore 02. I believe it is a
BC02-Flash, but I'm not 100% sure about that. I have checked on the CSR
support website and I do have the latest firmware, so the firmware is
up to date.

Greetings,
Han

---
Han Hoekstra
Wireless Value B.V.
Waanderweg 30a
7812 HZ Emmen
Tel: +31-591-633200

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Marcel
Holtmann
Sent: Thursday, 17 March 2005 15:59
To: BlueZ Mailing List
Subject: RE: [Bluez-users] Has anyone seen these problems with
the CSR BlueCore and BlueZ before ?

Hi Han,

> The info I get from hciconfig -a is :
>
> hci0:   Type: UART
>         BD Address: 00:A0:96:0A:BF:D4 ACL MTU: 192:8 SCO MTU: 64:8
>         UP RUNNING PSCAN ISCAN
>         RX bytes:303 acl:0 sco:0 events:13 errors:0
>         TX bytes:434 acl:0 sco:0 commands:12 errors:0
>         Features: 0xff 0xff 0x8f 0x78 0x18 0x18 0x00 0x80
>         Packet type: DM1 DM3 DM5 DH1 DH3 DH5 HV1 HV2 HV3
>         Link policy: RSWITCH HOLD SNIFF PARK
>         Link mode: SLAVE ACCEPT
>         Name: 'BlueZ (0)'
>         Class: 0x000100
>         Service Classes: Unspecified
>         Device Class: Computer, Uncategorized
>         HCI Ver: 1.2 (0x2) HCI Rev: 0x5df LMP Ver: 1.2 (0x2) LMP Subver: 0x5df
>         Manufacturer: Cambridge Silicon Radio (10)
>
> This info is what I get just after booting the board.

This is HCI 18.2 firmware on a BlueCore02 or BlueCore3 chip. What kind of
chip is it? Is it a ROM or External/Flash version?

Regards

Marcel

2005-03-17 15:35:55

by Erwin Authried

Subject: Re: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

On Thu, 17.03.2005 at 14:30, Han Hoekstra wrote:
> Hey All,
>
> I'm developing an application on an ARM7-based platform running
> Linux 2.4.18 with the latest BlueZ version for that platform. In this
> application there is a master device that can connect to up to 7 slaves
> (master and slave devices are all the same; the external controller
> determines which is master).
>
> The application works as follows. The external controller on the master
> can send a single command to the Linux application; the command contains
> an address field identifying the slaves to which it must be passed on.
> Each addressed slave passes the command on to its external controller,
> which responds by sending a reply back to the Linux slave. The slave in
> turn passes the reply on to the Linux master, which hands it over to its
> external controller. As soon as all addressed slaves have responded,
> the sequence is repeated.
>
> Now to what I see happening. When I run this with 5 slaves, I don't see
> any problems for more than an hour, but when I run it with 6 or 7 slaves,
> the BlueCore resets itself: with 6 slaves after about 40 minutes, and
> with 7 slaves already after 5 to 10 minutes. When I eliminate the CRC
> check, I can run the 7-slave configuration for up to about 40 minutes
> before it crashes.
>
> What I think happens is that the BlueCore receives data too fast over
> the air and cannot pass it on to the Linux CPU quickly enough, so there
> is an internal overflow in the BlueCore.
> Communication between the BlueCore and the Linux CPU is done via BCSP
> at a speed of 921600 baud.
>
> Has anybody seen behaviour like this before, and is there something I
> can do about it?
>
Hi,
I had problems with UART overruns on an embedded ARM7 system that could
be solved by optimisations in the serial driver and hci_bcsp.c. You
should take a look at /proc/tty/driver to see if you get a lot of
overrun errors. What is your CPU, and how many BogoMIPS does the kernel
show?
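
A small sketch of how the overrun counters could be pulled out of such a
file. It assumes the layout used by the standard Linux serial driver,
where each port line starts with its index and carries an "oe:" field
once overruns have occurred. The file name under /proc/tty/driver and the
exact layout depend on the serial driver of the platform, and the sample
text below is made up for illustration.

```python
import re

def overrun_counts(text):
    """Map each port index to its 'oe:' (overrun error) count, if present."""
    counts = {}
    for line in text.splitlines():
        port = re.match(r"\s*(\d+):", line)
        overruns = re.search(r"\boe:(\d+)", line)
        if port and overruns:
            counts[int(port.group(1))] = int(overruns.group(1))
    return counts

# Made-up sample; on a real system, read the text from the driver's file
# under /proc/tty/driver instead.
sample = """serinfo:1.0 driver revision:
0: uart:16550A port:000003F8 irq:4 tx:5120 rx:88211 oe:37 RTS|CTS|DTR
1: uart:unknown port:000002F8 irq:3
"""
print(overrun_counts(sample))
```

Comparing the counts before and after a test run shows whether overruns
grow with the number of connected slaves.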

Regards,
Erwin

2005-03-17 14:58:30

by Marcel Holtmann

Subject: RE: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Hi Han,

> The info I get from hciconfig -a is :
>
> hci0:   Type: UART
>         BD Address: 00:A0:96:0A:BF:D4 ACL MTU: 192:8 SCO MTU: 64:8
>         UP RUNNING PSCAN ISCAN
>         RX bytes:303 acl:0 sco:0 events:13 errors:0
>         TX bytes:434 acl:0 sco:0 commands:12 errors:0
>         Features: 0xff 0xff 0x8f 0x78 0x18 0x18 0x00 0x80
>         Packet type: DM1 DM3 DM5 DH1 DH3 DH5 HV1 HV2 HV3
>         Link policy: RSWITCH HOLD SNIFF PARK
>         Link mode: SLAVE ACCEPT
>         Name: 'BlueZ (0)'
>         Class: 0x000100
>         Service Classes: Unspecified
>         Device Class: Computer, Uncategorized
>         HCI Ver: 1.2 (0x2) HCI Rev: 0x5df LMP Ver: 1.2 (0x2) LMP Subver: 0x5df
>         Manufacturer: Cambridge Silicon Radio (10)
>
> This info is what I get just after booting the board.

This is HCI 18.2 firmware on a BlueCore02 or BlueCore3 chip. What kind
of chip is it? Is it a ROM or External/Flash version?

Regards

Marcel

2005-03-17 14:19:44

by Han Hoekstra

Subject: RE: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Hey Marcel,

The info I get from hciconfig -a is :

hci0:   Type: UART
        BD Address: 00:A0:96:0A:BF:D4 ACL MTU: 192:8 SCO MTU: 64:8
        UP RUNNING PSCAN ISCAN
        RX bytes:303 acl:0 sco:0 events:13 errors:0
        TX bytes:434 acl:0 sco:0 commands:12 errors:0
        Features: 0xff 0xff 0x8f 0x78 0x18 0x18 0x00 0x80
        Packet type: DM1 DM3 DM5 DH1 DH3 DH5 HV1 HV2 HV3
        Link policy: RSWITCH HOLD SNIFF PARK
        Link mode: SLAVE ACCEPT
        Name: 'BlueZ (0)'
        Class: 0x000100
        Service Classes: Unspecified
        Device Class: Computer, Uncategorized
        HCI Ver: 1.2 (0x2) HCI Rev: 0x5df LMP Ver: 1.2 (0x2) LMP Subver: 0x5df
        Manufacturer: Cambridge Silicon Radio (10)

This info is what I get just after booting the board.

Greetings,
Han

---
Han Hoekstra
Wireless Value B.V.
Waanderweg 30a
7812 HZ Emmen
Tel: +31-591-633200

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Marcel
Holtmann
Sent: Thursday, 17 March 2005 15:02
To: BlueZ Mailing List
Subject: Re: [Bluez-users] Has anyone seen these problems with the
CSR BlueCore and BlueZ before ?

Hi Han,

> I'm developing an application on an ARM7-based platform running
> Linux 2.4.18 with the latest BlueZ version for that platform. In this
> application there is a master device that can connect to up to 7 slaves
> (master and slave devices are all the same; the external controller
> determines which is master).
>
> The application works as follows. The external controller on the master
> can send a single command to the Linux application; the command contains
> an address field identifying the slaves to which it must be passed on.
> Each addressed slave passes the command on to its external controller,
> which responds by sending a reply back to the Linux slave. The slave in
> turn passes the reply on to the Linux master, which hands it over to its
> external controller. As soon as all addressed slaves have responded,
> the sequence is repeated.
>
> Now to what I see happening. When I run this with 5 slaves, I don't see
> any problems for more than an hour, but when I run it with 6 or 7 slaves,
> the BlueCore resets itself: with 6 slaves after about 40 minutes, and
> with 7 slaves already after 5 to 10 minutes. When I eliminate the CRC
> check, I can run the 7-slave configuration for up to about 40 minutes
> before it crashes.
>
> What I think happens is that the BlueCore receives data too fast over
> the air and cannot pass it on to the Linux CPU quickly enough, so there
> is an internal overflow in the BlueCore.
> Communication between the BlueCore and the Linux CPU is done via BCSP
> at a speed of 921600 baud.
>
> Has anybody seen behaviour like this before, and is there something I
> can do about it?

What does "hciconfig -a" tell you about the chip?

Regards

Marcel

2005-03-17 14:02:08

by Marcel Holtmann

Subject: Re: [Bluez-users] Has anyone seen these problems with the CSR BlueCore and BlueZ before ?

Hi Han,

> I'm developing an application on an ARM7-based platform running
> Linux 2.4.18 with the latest BlueZ version for that platform. In this
> application there is a master device that can connect to up to 7 slaves
> (master and slave devices are all the same; the external controller
> determines which is master).
>
> The application works as follows. The external controller on the master
> can send a single command to the Linux application; the command contains
> an address field identifying the slaves to which it must be passed on.
> Each addressed slave passes the command on to its external controller,
> which responds by sending a reply back to the Linux slave. The slave in
> turn passes the reply on to the Linux master, which hands it over to its
> external controller. As soon as all addressed slaves have responded,
> the sequence is repeated.
>
> Now to what I see happening. When I run this with 5 slaves, I don't see
> any problems for more than an hour, but when I run it with 6 or 7 slaves,
> the BlueCore resets itself: with 6 slaves after about 40 minutes, and
> with 7 slaves already after 5 to 10 minutes. When I eliminate the CRC
> check, I can run the 7-slave configuration for up to about 40 minutes
> before it crashes.
>
> What I think happens is that the BlueCore receives data too fast over
> the air and cannot pass it on to the Linux CPU quickly enough, so there
> is an internal overflow in the BlueCore.
> Communication between the BlueCore and the Linux CPU is done via BCSP
> at a speed of 921600 baud.
>
> Has anybody seen behaviour like this before, and is there something I
> can do about it?

What does "hciconfig -a" tell you about the chip?

Regards

Marcel
