2005-02-26 04:58:46

by James Bruce

Subject: Potentially dead bttv cards from 2.6.10

Hi, I've read elsewhere that the following message:
"tveeprom(bttv internal): Huh, no eeprom present (err=-121)?"
means that a bttv card is dead. If so, then I've apparently found a way
to kill bttv cards in vanilla 2.6.10. They worked fine a few days ago,
but running some "cleaned up" userspace capture code today led
to a hard lockup. Even after power-cycling the machine, the cards no
longer seem to work, in that captured video is garbled.

I was testing only one card, but two were installed in the machine and
both appear to no longer work, using either S-video or composite input.
I don't use the TV-tuners but they are present on the cards.

I can gather more information, and I even have a third card still in its
box, but obviously I'm hesitant to test anything immediately. Before
proceeding I'd like to know any general pointers about how to find out
what may be wrong, and if anyone knows any potential causes offhand.

I can supply a test case, but it's a large C++ program and kind of
unwieldy. I'd narrow it down to a small test case but I don't have
log(N) cards to fry in the process of searching for the root cause.

I would have expected a dud card, but these cards worked flawlessly just
a few days ago, and both of them died at exactly the same time.

Thanks,
Jim Bruce

#### Capture Card Model ####
V-Stream Xpert TV-PVR 878
On the card, the following chip is visible:
Conexant FUSION 878A
25878-13
E403069.1
0330 TAIWAN

#### Relevant dmesg output ####
Linux video capture interface: v1.00
bttv: driver version 0.9.15 loaded
bttv: using 8 buffers with 2080k (520 pages) each for capture
bttv: Bt8xx card found (0).
PCI: Found IRQ 12 for device 0000:00:0b.0
PCI: Sharing IRQ 12 with 0000:00:0b.1
bttv0: Bt878 (rev 17) at 0000:00:0b.0, irq: 12, latency: 32, mmio:
0xe3001000
bttv0: using: *** UNKNOWN/GENERIC *** [card=0,autodetected]
bttv0: gpio: en=00000000, out=00000000 in=003fffff [init]
tveeprom(bttv internal): Huh, no eeprom present (err=-121)?
bttv0: using tuner=-1
bttv0: i2c: checking for MSP34xx @ 0x80... not found
bttv0: i2c: checking for TDA9875 @ 0xb0... not found
bttv0: i2c: checking for TDA7432 @ 0x8a... not found
bttv0: i2c: checking for TDA9887 @ 0x86... not found
bttv0: registered device video0
bttv0: registered device vbi0
bttv: Bt8xx card found (1).
PCI: Found IRQ 11 for device 0000:00:0c.0
PCI: Sharing IRQ 11 with 0000:00:09.0
PCI: Sharing IRQ 11 with 0000:00:0c.1
bttv1: Bt878 (rev 17) at 0000:00:0c.0, irq: 11, latency: 32, mmio:
0xe3003000
bttv1: using: *** UNKNOWN/GENERIC *** [card=0,autodetected]
bttv1: gpio: en=00000000, out=00000000 in=003fffff [init]
tveeprom(bttv internal): Huh, no eeprom present (err=-121)?
bttv1: using tuner=-1
bttv1: i2c: checking for MSP34xx @ 0x80... not found
bttv1: i2c: checking for TDA9875 @ 0xb0... not found
bttv1: i2c: checking for TDA7432 @ 0x8a... not found
bttv1: i2c: checking for TDA9887 @ 0x86... not found
bttv1: registered device video1
bttv1: registered device vbi1


2005-02-28 13:47:40

by Gerd Knorr

Subject: Re: Potentially dead bttv cards from 2.6.10

On Fri, Feb 25, 2005 at 11:57:49PM -0500, James Bruce wrote:
> Hi I've read elsewhere that the following message:
> "tveeprom(bttv internal): Huh, no eeprom present (err=-121)?"
> Means that a bttv card is dead.

Or i2c communication to the eeprom failed. There used to be some -mm
kernels with experimental i2c stuff causing this ...
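For reference, the err=-121 in that message is a negated errno value, and on Linux errno 121 is EREMOTEIO ("Remote I/O error"), which fits a failed i2c transfer rather than proving the chip is dead. A minimal sketch for pulling the code out of a log line follows; the helper name decode_kernel_err is made up for illustration, and the symbolic name it reports depends on the platform Python runs on.

```python
import errno
import re

# Matches the error field in lines such as:
#   tveeprom(bttv internal): Huh, no eeprom present (err=-121)?
ERR_PATTERN = re.compile(r"\(err=(-?\d+)\)")


def decode_kernel_err(line):
    """Return (code, symbolic_name) for an 'err=-N' field, or None if absent.

    Hypothetical helper, not part of any driver or tool.
    """
    match = ERR_PATTERN.search(line)
    if match is None:
        return None
    # The kernel reports errors as negative errno values.
    code = abs(int(match.group(1)))
    # errno.errorcode reflects the local platform; fall back to "unknown".
    return code, errno.errorcode.get(code, "unknown")


if __name__ == "__main__":
    msg = "tveeprom(bttv internal): Huh, no eeprom present (err=-121)?"
    print(decode_kernel_err(msg))
```

Lines without an err= field, such as the "bttv: readee error" message from 2.6.10, simply yield None.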

Gerd

--
#define printk(args...) fprintf(stderr, ## args)

2005-02-28 14:45:16

by James Bruce

Subject: Re: Potentially dead bttv cards from 2.6.10

Well, are there any theories as to why it would work flawlessly, then
after a hard lockup (due to what I think is a buggy V4L2 application),
that the cards no longer work? That was with 2.6.10, but after they
started failing I tried 2.6.11-rc5 and it doesn't work either. By the
way, I sent the wrong output; what I sent was from 2.6.11-rc5. The
2.6.10 output is below, and looks similar except for generating a
different error message.

An example of the kind of output I get from capture is here:
http://sponge.coral.cs.cmu.edu/~jbruce/temp/img0000.jpg
Which has some of the right colors, but all in the wrong places.
Tracking seems to be off because the capture happens at irregular
intervals. The following is the sort of thing a working card would produce:
http://sponge.coral.cs.cmu.edu/~jbruce/temp/overhead-view.jpg
Note that the two images should not be the same however, as one is from
almost a year ago. I didn't save any of the recent working ones
unfortunately. The camera S-video link still looks fine on a monitor,
and testing with a different camera and component video yields the same
sort of scrambled results as the first image above.

The reason I think my problem is possibly important is that I think I
potentially found a way to PERMANENTLY KILL a bttv card FROM USERSPACE
(emphasis added for any bttv users only half reading at this point).

In our case these are cards bought by our lab, and they were only $40
each, so they can be replaced. It'd be nice to protect other users from
this problem however, since they may not be able to replace their cards
as readily. Well also for me, since to get money for new cards I'd have
to make the case that they wouldn't also blow up after a few days of use[1].

Thanks,
Jim Bruce

[1] The cards are actually >1 year old, but they sat in a running Linux
machine without the bttv drivers loaded. They died after 3 days of
working flawlessly in a new machine where they were actually being used.

Gerd Knorr wrote:
> On Fri, Feb 25, 2005 at 11:57:49PM -0500, James Bruce wrote:
>
>>Hi I've read elsewhere that the following message:
>> "tveeprom(bttv internal): Huh, no eeprom present (err=-121)?"
>>Means that a bttv card is dead.
>
>
> Or i2c communication to the eeprom failed. There used to be some -mm
> kernels with experimental i2c stuff causing this ...
>
> Gerd
>

Linux video capture interface: v1.00
bttv: driver version 0.9.15 loaded
bttv: using 8 buffers with 2080k (520 pages) each for capture
bttv: Bt8xx card found (0).
PCI: Found IRQ 12 for device 0000:00:0b.0
PCI: Sharing IRQ 12 with 0000:00:0b.1
bttv0: Bt878 (rev 17) at 0000:00:0b.0, irq: 12, latency: 32, mmio:
0xe3001000
bttv0: using: *** UNKNOWN/GENERIC *** [card=0,autodetected]
bttv0: gpio: en=00000000, out=00000000 in=003fffff [init]
bttv: readee error
bttv0: using tuner=-1
bttv0: i2c: checking for MSP34xx @ 0x80... not found
bttv0: i2c: checking for TDA9875 @ 0xb0... not found
bttv0: i2c: checking for TDA7432 @ 0x8a... not found
bttv0: i2c: checking for TDA9887 @ 0x86... not found
bttv0: registered device video0
bttv0: registered device vbi0
bttv: Bt8xx card found (1).
PCI: Found IRQ 11 for device 0000:00:0c.0
PCI: Sharing IRQ 11 with 0000:00:09.0
PCI: Sharing IRQ 11 with 0000:00:0c.1
bttv1: Bt878 (rev 17) at 0000:00:0c.0, irq: 11, latency: 32, mmio:
0xe3003000
bttv1: using: *** UNKNOWN/GENERIC *** [card=0,autodetected]
bttv1: gpio: en=00000000, out=00000000 in=003fffff [init]
bttv: readee error
bttv1: using tuner=-1
bttv1: i2c: checking for MSP34xx @ 0x80... not found
bttv1: i2c: checking for TDA9875 @ 0xb0... not found
bttv1: i2c: checking for TDA7432 @ 0x8a... not found
bttv1: i2c: checking for TDA9887 @ 0x86... not found
bttv1: registered device video1
bttv1: registered device vbi1

2005-02-28 16:05:13

by Gerd Knorr

Subject: Re: Potentially dead bttv cards from 2.6.10

James Bruce <[email protected]> writes:

> Well, are there any theories as to why it would work flawlessly, then
> after a hard lockup (due to what I think is a buggy V4L2 application),
> that the cards no longer work?

No idea why the eeprom doesn't respond any more. Maybe it's really
broken. Note that the eeprom is only read at insmod time (and even
that for some cards only), thus there isn't a clear connection between
the crash and the eeprom issue. It could have died earlier unnoticed.

The eeprom holds the PCI Subsystem ID, so without a working eeprom
bttv can't figure out automatically which exact card that is (see the
"unknown/default" card name in the log), and maybe that's why it does
not work any more for the card in question. That should be easily
fixable using the card= insmod option.
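To check what the driver settled on for each board, the "using:" lines in the dmesg output can be parsed. The sketch below assumes the line layout seen in the logs earlier in this thread; the function name detected_cards and the returned dictionary layout are illustrative only.

```python
import re

# Matches lines such as:
#   bttv0: using: *** UNKNOWN/GENERIC *** [card=0,autodetected]
USING_PATTERN = re.compile(r"^(bttv\d+): using: (.*?) \[card=(\d+),([^\]]+)\]")


def detected_cards(dmesg_text):
    """Map each bttv device to the card name, number and detection method.

    Hypothetical helper; it only understands the log format shown above.
    """
    results = {}
    for line in dmesg_text.splitlines():
        match = USING_PATTERN.match(line.strip())
        if match:
            dev, name, card, how = match.groups()
            results[dev] = {"name": name, "card": int(card), "how": how}
    return results


if __name__ == "__main__":
    sample = (
        "bttv0: using: *** UNKNOWN/GENERIC *** [card=0,autodetected]\n"
        "bttv1: using: *** UNKNOWN/GENERIC *** [card=0,autodetected]\n"
    )
    for dev, info in detected_cards(sample).items():
        print(dev, info)
```

A board that shows up as card=0 and "autodetected" is one where the driver fell back to the generic entry, which is the case where forcing card= by hand is worth trying.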

Gerd

--
#define printk(args...) fprintf(stderr, ## args)

2005-02-28 16:45:11

by folkert

Subject: Re: Potentially dead bttv cards from 2.6.10

> > Well, are there any theories as to why it would work flawlessly, then
> > after a hard lockup (due to what I think is a buggy V4L2 application),
> > that the cards no longer work?
> No idea why the eeprom doesn't respond any more. Maybe it's really
> broken. Note that the eeprom is read only at insmod time (and even
> that for some cards only), thus there isn't a clear connection between
> the crash and the eeprom issue. It could have died earlier unnoticed.
> The eeprom holds the PCI Subsystem ID, so without a working eeprom
> bttv can't figure automatically what exact card that is (see the
> "unknown/default" card name in the log) and maybe that's why it does not
> work any more for the card in question. That should be easily
> fixable using the card= insmod option.

I remember something about that you shouldn't use the teletext-decoder
at the same time as viewing regular tv. That would damage the eeprom.
Maybe it is related?


Folkert van Heusden

Op zoek naar een IT of Finance baan? Mail me voor de mogelijkheden!
+------------------------------------------------------------------+
|UNIX admin? Then give MultiTail (http://vanheusden.com/multitail/)|
|a try, it brings monitoring logfiles to a different level! See |
|http://vanheusden.com/multitail/features.html for a feature list. |
+------------------------------------------= http://www.unixsoftware.nl =-+
Phone: +31-6-41278122, PGP-key: 1F28D8AE
Get your PGP/GPG key signed at http://www.biglumber.com!

2005-02-28 16:56:57

by Gerd Knorr

Subject: Re: Potentially dead bttv cards from 2.6.10

> I remember something about that you shouldn't use the teletext-decoder
> at the same time as viewing regular tv. That would damage the eeprom.
> Maybe it is related?

No. That's (a) very old and about two drivers banging on the bt848 card
at the same time, where the second doesn't even exist for 2.4 any more, I
think. And (b) damage in that case means "write random bytes to it",
which is a very different issue. You can still read it in that case.

Gerd

--
#define printk(args...) fprintf(stderr, ## args)

2005-02-28 23:09:18

by Bill Davidsen

Subject: Re: Potentially dead bttv cards from 2.6.10

James Bruce wrote:
> Well, are there any theories as to why it would work flawlessly, then
> after a hard lockup (due to what I think is a buggy V4L2 application),
> that the cards no longer work? That was with 2.6.10, but after they
> started failing I tried 2.6.11-rc5 and it doesn't work either. By the
> way, I sent the wrong output; what I sent was from 2.6.11-rc5. The
> 2.6.10 output is below, and looks similar except for generating a
> different error message.

Is there any chance that the lockup was related to an external event,
like a spike on the line to the video? Or any other outside event? It
seems like a very odd failure mode, but since I'm about to drop in a
bttv card and digitize about a hundred old tapes, I'd like to know.

Did you try the "card=" suggestion?

--
-bill davidsen ([email protected])
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2005-03-01 06:42:28

by James Bruce

Subject: Re: Potentially dead bttv cards from 2.6.10

Thanks for the hints. Unfortunately the cards in question really are
fairly generic and thus don't appear in the list. I tried the first 75
cards as insmod options (using a script of course), and some of them are
different, but none work properly.

I am lucky in that I still have a spare. If you could suggest a very
well tested kernel for bttv (2.6.9?), I can set up another machine with
that kernel and the remaining card. That should allow me to isolate the
problem better. At the very least I could get the right card= option
to use for the broken pair. Hopefully I will be able to generate a
table entry for this card for bttv-cards.c; I'll look some more at this
tomorrow.

I've heard that there is some way to dump eeproms; Is there a way to
write them also? If I could copy the eeprom from the unused cards to
the (now broken) pair that might fix things.

Thanks,
Jim

Gerd Knorr wrote:
> James Bruce <[email protected]> writes:
>
>>Well, are there any theories as to why it would work flawlessly, then
>>after a hard lockup (due to what I think is a buggy V4L2 application),
>>that the cards no longer work?
>
> No idea why the eeprom doesn't respond any more. Maybe it's really
> broken. Note that the eeprom is read only at insmod time (and even
> that for some cards only), thus there isn't a clear connection between
> the crash and the eeprom issue. It could have died earlier unnoticed.
>
> The eeprom holds the PCI Subsystem ID, so without a working eeprom
> bttv can't figure automatically what exact card that is (see the
> "unknown/default" card name in the log) and maybe that's why it does not
> work any more for the card in question. That should be easily
> fixable using the card= insmod option.
>
> Gerd

2005-03-01 07:07:14

by James Bruce

Subject: Re: Potentially dead bttv cards from 2.6.10

I don't think it was a line spike since one of the video cards that went
bad didn't have a video cable attached to it. It could be the computer,
but that one hasn't given us a problem for the almost five years we've
had it. If I did cause it though with my "buggy program of doom", that
shouldn't reflect on using well tested/well behaved V4L2 apps. At most
I might say 2.6.10+bttv might not be a good development platform for new
V4L2 apps. I'll investigate more and hopefully have a better answer soon.

The card= option didn't help in my case since my card is not in the
list. For these cards we went off the recommendation of other people
doing machine vision in Linux. Next time I guess we'll go name brand
again...

We have another machine with clones of the "Matrox Meteor I", which has
"Intel SAA7116" chips on it. It doesn't seem to be supported in 2.6
however by any of the various SAA* drivers. In 2.4 there was an
out-of-tree driver here:
http://www.k-team.com/software/linux.htmldriver
If there is a way to get these cards working I could use them to debug
the "program of doom", and thus find the bugs that potentially caused
the original problem with the bttv cards. Right now said program is 22k
lines with 600 lines of V4L interaction, so it's hardly an efficient test
case to tell us where to look in the driver. Another option is to go buy
Conexant 2388x derivatives for the debugging, but I'm worried they may
be similar enough to bttv chips that the same problem might be triggered.

- Jim Bruce

Bill Davidsen wrote:
> James Bruce wrote:
>
>> Well, are there any theories as to why it would work flawlessly, then
>> after a hard lockup (due to what I think is a buggy V4L2 application),
>> that the cards no longer work? That was with 2.6.10, but after they
>> started failing I tried 2.6.11-rc5 and it doesn't work either. By the
>> way, I sent the wrong output; what I sent was from 2.6.11-rc5. The
>> 2.6.10 output is below, and looks similar except for generating a
>> different error message.
>
> Is there any chance that the lockup was related to an external event,
> like a spike on the line to the video? Or any other outside event? It
> seems like a very odd failure mode, but since I'm about to drop in a
> bttv card and digitize about a hundred old tapes, I'd like to know.
>
> Did you try the "card=" suggestion?

2005-03-01 08:45:14

by Gerd Knorr

Subject: Re: Potentially dead bttv cards from 2.6.10

James Bruce <[email protected]> writes:

> If you could suggest a very well tested kernel for bttv (2.6.9?),

What do you expect? With just one single report and not remotely
being clear what exactly caused it ...

> I've heard that there is some way to dump eeproms; Is there a way to
> write them also?

Yes, you can. That works only if you can still talk to it though.

> If I could copy the eeprom from the unused cards to the (now broken)
> pair that might fix things.

No. The eeprom isn't accessible at all; it's not just that the content
is scrambled.

Gerd

--
#define printk(args...) fprintf(stderr, ## args)

2005-03-01 13:12:02

by James Bruce

Subject: Re: Potentially dead bttv cards from 2.6.10

Forgive me for being annoying; I'm trying to be careful because I get
one more failure in a test and then that's it. The manufacturer no
longer lists that model as being produced. Thus if there's a way to
ruin a bttv card through the V4L2 interface I will no longer be of any
assistance in finding it.

Gerd Knorr wrote:
> James Bruce <[email protected]> writes:
>>If you could suggest a very well tested kernel for bttv (2.6.9?),
> What do you expect? With just one single report and not remotely
> being clear what exactly caused it ...

It goes further than that though; I have about 3+ people a month asking
me what camera and capture card to use with my GPL'd machine vision library:
http://www-2.cs.cmu.edu/~jbruce/cmvision/

Right now, I have to tell them "use 2.4 patched with V4L2", which is
neither what they want to hear nor what I want to tell them. CMVision
can use IEEE1394 cameras, but since that has been "working, not working,
then working again" in relatively recent 2.6.x, I can't sanely tell
people to use that yet. Now that the V4L2 API churn rate has gone down,
I'm trying to get it working properly with the 2.6.x V4L2 API (which
AFAICT doesn't match any 2.4.x V4L2 API variant). If it works I could
just tell people "Use 2.6.x with a commonly available bttv card", which
would be great. However, so far I've gotten that to work for five days
before the cards stopped working. I've been tracking the V4L API since
1999, on something like 10 different cards, and this is the most serious
failure I've had. So I'm back to telling CMVision users "Use 2.4 +
patches", which I would really like to *not* have to tell people.

As far as kernels go, there was only one report on lkml, by
someone overwriting mozilla by running XawTV in 2.6.10-ac8. The
changes between 2.6.11-rc* and 2.6.10 seem to be minor, but I guess I2C
has changed causing the reported 2.6.11-rc* bttv issues. Going back to
2.6.9 there are a lot more changes in the driver. I'd know more if
linux.bkbits didn't seem to be down at the moment.

Google says:
[linux bttv problem "2.6.7"] -> 1230 hits
[linux bttv problem "2.6.8"] -> 1010 hits
[linux bttv problem "2.6.9"] -> 958 hits
[linux bttv problem "2.6.10"] -> 831 hits

That's certainly promising, so I guess I'll try 2.6.9 and 2.6.10 on
another computer with the remaining card. If you want me to also try
some out-of-tree-latest-version patches, now would be the time to speak
up before I've messed up the third card.

>>I've heard that there is some way to dump eeproms; Is there a way to
>>write them also?
>
> Yes, you can. That works only if you can still talk to it though.

I'll gather all the information I can from the remaining card.

>>If I could copy the eeprom from the unused cards to the (now broken)
>>pair that might fix things.
>
> No. It's not accessible, not just the content scrambled.
>
> Gerd

Ok.

Thanks,
Jim Bruce

2005-03-01 14:14:47

by Paulo Marques

Subject: Re: Potentially dead bttv cards from 2.6.10

James Bruce wrote:
> [...]
> The card= option didn't help in my case since my card is not in the
> list; For these cards we went off the recommendation of other people
> doing machine vision in Linux; Next time I guess we'll go name brand
> again...

I think you should try it anyway, using all the options, because it is
very likely that your card is compatible with one of the listed
ones. This is especially true if you don't care about the tuner.

Just modprobe the bttv module with card=X option, test it, rmmod it,
modprobe it again with card=X+1, etc., until you find a number that fits.
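The load/test/unload cycle described above can be scripted. What follows is only a sketch: it assumes the driver is built as a module named bttv, that the script runs as root when dry_run is off, and that the caller supplies its own test function (the names probe_commands and try_cards are invented here). By default it only prints the commands it would run.

```python
import subprocess


def probe_commands(card):
    """Return the (load, unload) commands for one card= value."""
    load = ["modprobe", "bttv", f"card={card}"]
    unload = ["rmmod", "bttv"]
    return load, unload


def try_cards(first, last, test, dry_run=True):
    """Try card=first..last and return the numbers for which test(card) is true.

    `test` is a caller-supplied callable (for example one that grabs a frame
    and checks it); it is an assumption of this sketch, not an existing tool.
    With dry_run=True nothing is loaded and the commands are only printed.
    """
    working = []
    for card in range(first, last + 1):
        load, unload = probe_commands(card)
        if dry_run:
            print(" ".join(load), "&&", " ".join(unload))
            continue
        subprocess.run(load, check=True)
        try:
            if test(card):
                working.append(card)
        finally:
            # Always unload so the next iteration starts from a clean state.
            subprocess.run(unload, check=True)
    return working


if __name__ == "__main__":
    # Dry run over the first few card numbers; prints commands only.
    try_cards(0, 3, test=lambda card: False)
```

The unload step sits in a finally clause so that a failing test does not leave the module loaded with the wrong card type.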

--
Paulo Marques - http://www.grupopie.com

All that is necessary for the triumph of evil is that good men do nothing.
Edmund Burke (1729 - 1797)

2005-03-01 15:49:41

by James Bruce

Subject: Re: Potentially dead bttv cards from 2.6.10

Sorry, I wasn't clear in the previous email; I did try the card= option
anyway. I wrote a looping script and tested the first 70 card= options, and
none worked properly for streaming capture. Some did show different
behavior though. I might try the remaining 50 later today.

I did notice one strange thing though; the card= option is only applied
to the first bttv card. All remaining cards in the system are still
autodetected (which ends up assuming card=0 in my case). Not sure if
this is the intended behavior or not, since someone really could run two
different bttv cards in the same system.

- Jim Bruce

Paulo Marques wrote:
> James Bruce wrote:
>
>> [...]
>> The card= option didn't help in my case since my card is not in the
>> list; For these cards we went off the recommendation of other people
>> doing machine vision in Linux; Next time I guess we'll go name brand
>> again...
>
>
> I think you should try it anyway, using all the options, because it is
> very likely that your card might be compatible with one of the listed
> ones. This is especially true if you don't care about the tuner.
>
> Just modprobe the bttv module with card=X option, test it, rmmod it,
> modprobe it again with card=X+1, etc., until you find a number that fits.

2005-03-01 16:08:47

by Gerd Knorr

Subject: Re: Potentially dead bttv cards from 2.6.10

> I did notice one strange thing though; the card= option is only applied
> to the first bttv card. All remaining cards in the system are still
> autodetected (which ends up assuming card=0 in my case). Not sure if
> this is the intended behavior or not, since someone really could run two
> different bttv cards in the same system.

Intentional. Try "insmod bttv card=3,4" ...

Gerd

2005-03-01 20:47:27

by Bill Davidsen

Subject: Re: Potentially dead bttv cards from 2.6.10

On Tue, 1 Mar 2005, James Bruce wrote:

> Sorry, I wasn't clear in the previous email; I did try the card= option
> anyway. I wrote a looping script and tested the first 70 card= options, and
> none worked properly for streaming capture. Some did show different
> behavior though. I might try the remaining 50 later today.
>
> I did notice one strange thing though; the card= option is only applied
> to the first bttv card. All remaining cards in the system are still
> autodetected (which ends up assuming card=0 in my case). Not sure if
> this is the intended behavior or not, since someone really could run two
> different bttv cards in the same system.

Just for grins, did you try pulling one of the cards? I have to guess that
having multiple cards is a low occurrence configuration, and that you *may*
be following some less traveled path here.

At least now that you know how to set the type for the cards separately
you can test two configurations at a time.
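With the comma-separated form Gerd mentioned ("card=3,4"), a sweep over the card numbers can cover two boards per module load. A small sketch of generating those parameter strings is below; the helper names are invented for illustration, and it assumes the values are applied to the boards in probe order.

```python
def card_param(cards):
    """Format a per-board card= module parameter, e.g. [3, 4] -> 'card=3,4'."""
    return "card=" + ",".join(str(c) for c in cards)


def paired_card_params(first, last, boards=2):
    """Yield card= strings covering first..last, `boards` values at a time."""
    numbers = list(range(first, last + 1))
    for i in range(0, len(numbers), boards):
        yield card_param(numbers[i:i + boards])


if __name__ == "__main__":
    for param in paired_card_params(0, 5):
        print("modprobe bttv", param)
```

If the range has an odd length the last string carries a single value, in which case the second board would fall back to autodetection as described earlier in the thread.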

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2005-03-04 20:45:36

by James Bruce

Subject: Re: Potentially dead bttv cards from 2.6.10

As a final update, I added the third card to another machine and that
doesn't work either. So after trying 3 kernels on two machines with
either one or two cards, and trying the ~120 different card options for
bttv to no avail, I'll just guess this card isn't actually supported
right now.

The strange thing is that it ever worked in the first place, and
amazingly that it worked the first time I tried it with no extra effort,
yet never again after a reboot, nor on any other machines.

I'll take this discussion to the video for linux mailing list and try to
find out how to add support for this card. Once it works, I'll see if
my test program can still lock up the machine from userspace; If so
that'll be a separate issue to debug. Thanks for the help.

Jim Bruce