I'm tracking down the problem, it's been off the air for most of the night it
appears. I may upgrade the processor and motherboard. If I do that, it
will be off the air for a bit longer.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Sat, Jun 21, 2003 at 06:58:12AM -0700, Larry McVoy wrote:
> I'm tracking down the problem, it's been off the air for most of the night it
> appears. I may upgrade the processor and motherboard. If I do that, it
> will be off the air for a bit longer.
I'm still at the office working on this. Both the main and the backup drives
are not checking cleanly. It's going to be several more hours before it is
back up at this rate.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Sat, Jun 21, 2003 at 12:09:44PM -0700, Larry McVoy wrote:
> On Sat, Jun 21, 2003 at 06:58:12AM -0700, Larry McVoy wrote:
> > I'm tracking down the problem, it's been off the air for most of the night it
> > appears. I may upgrade the processor and motherboard. If I do that, it
> > will be off the air for a bit longer.
>
> I'm still at the office working on this. Both the main and the backup drives
> are not checking cleanly. It's going to be several more hours before it is
> back up at this rate.
Still working on it but I'm starting to get somewhere. There is something
very strange or flaky about Samsung 80GB IDE drives. The symptoms are that
on some controllers the drive gets all sorts of errors. Running under
the most recent (on bkbits.net at least) linux-2.5 tree, if I put the drive
on a Serverworks IDE interface (Tyan dual PIII, I think a 2150?) then the
drive looks like it is just trashed, lots of fsck errors. I pulled it and
tried on an ECS el cheapo motherboard and that failed too. I then thought
that maybe the deal was that I had partitioned it under a 3ware and I was
trying to fsck it on a normal IDE (hey, I was grabbing for an answer) so
I stuck a 3ware into the el cheapo box. Still didn't work. OK, I tried
another ECS el cheapo and this one works.
By the way, "didn't work" meant that it would get through the fsck enough
to restart the fsck from the beginning and then somewhere along the way
the fsck would cause the system to reboot. Nice. It took several tries
to figure that out, I eventually resorted to video taping the screen to
find out what happened (it takes an hour to fsck this drive so I'd be
reading mail and looking over my shoulder about every minute trying to
catch it and of course I always missed it) and all I got was stuff that
looked like the kernel was hosed, sendmail started crapping out and I
know it wasn't doing anything. I have the video if someone wants it,
this was Red Hat 8.0 generic, so 2.4.19 I think.
OK, finally clean fsck on the second el cheapo, move it over to the Tyan
and try again. Disk drive go kaboom. I'm starting to get pissed, this
is my bloody Saturday, I promised my kids we'd play together, I'm grumpy,
my wife keeps calling to ask when we are going to the beach, shit this
just sucks, I need a sys admin that I trust to do this stuff. It's
beyond lame that I don't have one. Any good sys admins out there in
the Bay Area? Call me, now is a good time to negotiate a good package.
Deep breath, don't get pissed, that's how you make things worse. OK,
back at it. Pull the drive, plug in a Promise card, stick the drive on
that and pray that it works. Whoops, didn't compile that in, recompile,
reboot, man I hate American Megatrends, it takes forever to do a warm
reboot. Linux BIOS, where are you? OK, it sees it, do the fsck, wait
an hour, whoohoo! We're clean.
Time to think about what to do. I don't trust the Samsung even though it
says it is OK, too many problems. Another deep breath, call the local
suppliers, yeah, they got some 80GB drives, we're at 40GB so that seems
cool, head off to the store to buy some new drives. Shit. The store
lied and they were out of stock. Buy some 120GB? Nah, if we get to
that much data on bkbits.net I want it spread over multiple machines
so I'm not stuck in the machine room for 40 hours the next time this
crap happens. It sucks to be me some days, it really does. Go back,
steal an 80GB Western Digital drive from one of the desktops, stick it in.
By the way, Western Digital, if you ever want an endorsement I'm your man,
every other drive company has screwed me at least once and you never have.
Your drives rock, they behave well under benchmarks and they behave well
in the real world and I have the data to prove it. And, best of all,
your drives fail nicely, blocks start going bad but you can get 99.9%
of the data off, very nice. Good job.
Plug in the WD 80GB, write a script to start cloning the repositories,
that's easy, it's running, and I'm typing in this mail as something to
do while it runs. Hence the verboseness. And in spite of that we are
only up to linux-ajc (who's ajc?). But we're getting there. My guess
is that this is going to run for a few more hours. I've been here since
7am this morning, I'm going out to get plastered and I'll put the rest
of this back together tomorrow.
I wasn't kidding about that sys admin job, I'd love for this to be
someone else's problem. In theory I'm supposed to be a CEO who plays
golf games and cuts multi-million dollar deals for development tools.
I still need to learn how to play golf so I could use some help, right?
The problem is that I want things fixed right so that problems don't come
back and I don't trust lame people to do that so I end up doing a lot of
stuff myself. If you have an ego that won't quit because you could kick
my pathetic sys admin ass all over the place, you're who I want to hire.
Of course you need to be able to take a lot of shit because BK isn't
politically correct in the open source world :-(
I'm outta here to drink some beer, ETA on bkbits being back online is
some time tomorrow. Sorry about the delay. For what it is worth, we are
in the process of setting up an India based development effort which is
going to take this over and make it work better. We really want to be
in a place where when something goes wrong we change some DNS entries
and we're back on line. We're not there yet but we are working on it.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
It should be back up now. Finally. There may be permission problems
that slipped through the cracks. Send mail to [email protected]
if you can't pull or push.
On Sat, Jun 21, 2003 at 05:26:14PM -0700, Larry McVoy wrote:
> On Sat, Jun 21, 2003 at 12:09:44PM -0700, Larry McVoy wrote:
> > On Sat, Jun 21, 2003 at 06:58:12AM -0700, Larry McVoy wrote:
> > > I'm tracking down the problem, it's been off the air for most of the night it
> > > appears. I may upgrade the processor and motherboard. If I do that, it
> > > will be off the air for a bit longer.
> >
> > I'm still at the office working on this. Both the main and the backup drives
> > are not checking cleanly. It's going to be several more hours before it is
> > back up at this rate.
>
> Still working on it but I'm starting to get somewhere. There is something
> very strange or flaky about Samsung 80GB IDE drives. The symptoms are that
> on some controllers the drive gets all sorts of errors. Running under
> the most recent (on bkbits.net at least) linux-2.5 tree, if I put the drive
> on a Serverworks IDE interface (Tyan dual PIII, I think a 2150?) then the
> drive looks like it is just trashed, lots of fsck errors. I pulled it and
> tried on an ECS el cheapo motherboard and that failed too. I then thought
> that maybe the deal was that I had partitioned it under a 3ware and I was
> trying to fsck it on a normal IDE (hey, I was grabbing for an answer) so
> I stuck a 3ware into the el cheapo box. Still didn't work. OK, I tried
> another ECS el cheapo and this one works.
>
> By the way, "didn't work" meant that it would get through the fsck enough
> to restart the fsck from the beginning and then somewhere along the way
> the fsck would cause the system to reboot. Nice. It took several tries
> to figure that out, I eventually resorted to video taping the screen to
> find out what happened (it takes an hour to fsck this drive so I'd be
> reading mail and looking over my shoulder about every minute trying to
> catch it and of course I always missed it) and all I got was stuff that
> looked like the kernel was hosed, sendmail started crapping out and I
> know it wasn't doing anything. I have the video if someone wants it,
> this was Red Hat 8.0 generic, so 2.4.19 I think.
>
> OK, finally clean fsck on the second el cheapo, move it over to the Tyan
> and try again. Disk drive go kaboom. I'm starting to get pissed, this
> is my bloody Saturday, I promised my kids we'd play together, I'm grumpy,
> my wife keeps calling to ask when we are going to the beach, shit this
> just sucks, I need a sys admin that I trust to do this stuff. It's
> beyond lame that I don't have one. Any good sys admins out there in
> the Bay Area? Call me, now is a good time to negotiate a good package.
> Deep breath, don't get pissed, that's how you make things worse. OK,
> back at it. Pull the drive, plug in a Promise card, stick the drive on
> that and pray that it works. Whoops, didn't compile that in, recompile,
> reboot, man I hate American Megatrends, it takes forever to do a warm
> reboot. Linux BIOS, where are you? OK, it sees it, do the fsck, wait
> an hour, whoohoo! We're clean.
>
> Time to think about what to do. I don't trust the Samsung even though it
> says it is OK, too many problems. Another deep breath, call the local
> suppliers, yeah, they got some 80GB drives, we're at 40GB so that seems
> cool, head off to the store to buy some new drives. Shit. The store
> lied and they were out of stock. Buy some 120GB? Nah, if we get to
> that much data on bkbits.net I want it spread over multiple machines
> so I'm not stuck in the machine room for 40 hours the next time this
> crap happens. It sucks to be me some days, it really does. Go back,
> steal an 80GB Western Digital drive from one of the desktops, stick it in.
> By the way, Western Digital, if you ever want an endorsement I'm your man,
> every other drive company has screwed me at least once and you never have.
> Your drives rock, they behave well under benchmarks and they behave well
> in the real world and I have the data to prove it. And, best of all,
> your drives fail nicely, blocks start going bad but you can get 99.9%
> of the data off, very nice. Good job.
>
> Plug in the WD 80GB, write a script to start cloning the repositories,
> that's easy, it's running, and I'm typing in this mail as something to
> do while it runs. Hence the verboseness. And in spite of that we are
> only up to linux-ajc (who's ajc?). But we're getting there. My guess
> is that this is going to run for a few more hours. I've been here since
> 7am this morning, I'm going out to get plastered and I'll put the rest
> of this back together tomorrow.
>
> I wasn't kidding about that sys admin job, I'd love for this to be
> someone else's problem. In theory I'm supposed to be a CEO who plays
> golf games and cuts multi-million dollar deals for development tools.
> I still need to learn how to play golf so I could use some help, right?
> The problem is that I want things fixed right so that problems don't come
> back and I don't trust lame people to do that so I end up doing a lot of
> stuff myself. If you have an ego that won't quit because you could kick
> my pathetic sys admin ass all over the place, you're who I want to hire.
> Of course you need to be able to take a lot of shit because BK isn't
> politically correct in the open source world :-(
>
> I'm outta here to drink some beer, ETA on bkbits being back online is
> some time tomorrow. Sorry about the delay. For what it is worth, we are
> in the process of setting up an India based development effort which is
> going to take this over and make it work better. We really want to be
> in a place where when something goes wrong we change some DNS entries
> and we're back on line. We're not there yet but we are working on it.
> --
> ---
> Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
And the saga continues (I really want to be a real CEO. I've done some
research and I've found that golf isn't the only answer, fly fishing is
cool now too. I already know how to do that, I'm even good at it, so
I should go close some deals :)
Anyway, we put 2.5.70 on bkbits.net which is a Tyan dual PII motherboard
w/ serverworks IDE and we started getting data corruption. So I just
installed 2.4.21 and we'll see if that works better.
Things are going to be a little slow for a while as I run through integrity
checks on all the repos, there are 4.5 million files here so it takes a
while (you guys do generate a pile of data, I'll give you that).
More status as I have it; bkbits is up now and you should be able to use it.
If you hit problems in specific repos let me know, I already know about
the ppc problems.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
We're up to the projects starting with "p" and through all the ppc trees.
So far everything is checking out clean after a few manual fixups.
We "downgraded" to 2.4.21 from ~2.5.70 because of what we think are file
system or IDE corruption problems. If anyone else is running on a
serverworks IDE chipset (Tyan dual PIII MB) and has hit problems with
2.4.21 I'd be deeply grateful for a heads up.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
100% of all projects check clean. So we're cool on the data. I know we
still have BK/Web problems, hope to get those resolved later today.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
Larry McVoy <[email protected]> writes:
>Anyway, we put 2.5.70 on bkbits.net which is a Tyan dual PII motherboard
>w/ serverworks IDE and we started getting data corruption. So I just
>installed 2.4.21 and we'll see if that works better.
I will never understand why you insist on running a _production_ server on
some beta and alpha quality kernels.
Why don't you simply pull a RedHat, SuSE, Mandrake, Debian or any other
release-version distribution, put it on your box and let it
provide the service? And if anything breaks you have a vendor to
ask. Gee, look they would make money from supporting open source
software. :-)
Regards
Henning
--
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen INTERMETA GmbH
[email protected] +49 9131 50 654 0 http://www.intermeta.de/
Java, perl, Solaris, Linux, xSP Consulting, Web Services
freelance consultant -- Jakarta Turbine Development -- hero for hire
--- Quote of the week: "It is pointless to tell people anything when
you know that they won't process the message." --- Jonathan Revusky
On Tue, Jun 24, 2003 at 06:33:02PM -0700, Larry McVoy wrote:
> And the saga continues (I really want to be a real CEO. I've done some
> research and I've found that golf isn't the only answer, fly fishing is
> cool now too. I already know how to do that, I'm even good at it, so
> I should go close some deals :)
>
> Anyway, we put 2.5.70 on bkbits.net which is a Tyan dual PII motherboard
> w/ serverworks IDE and we started getting data corruption. So I just
> installed 2.4.21 and we'll see if that works better.
Eek. Serverworks IDE. I don't think they ever got that bit of their
chipset right.
--
Vojtech Pavlik
SuSE Labs, SuSE CR
On Thu, Jun 26, 2003 at 11:17:52PM +0200, Vojtech Pavlik wrote:
> On Tue, Jun 24, 2003 at 06:33:02PM -0700, Larry McVoy wrote:
>
> > And the saga continues (I really want to be a real CEO. I've done some
> > research and I've found that golf isn't the only answer, fly fishing is
> > cool now too. I already know how to do that, I'm even good at it, so
> > I should go close some deals :)
> >
> > Anyway, we put 2.5.70 on bkbits.net which is a Tyan dual PII motherboard
> > w/ serverworks IDE and we started getting data corruption. So I just
> > installed 2.4.21 and we'll see if that works better.
>
> Eek. Serverworks IDE. I don't think they ever got that bit of their
> chipset right.
Hmm. I could shove in a promise card and put at least the repos on that,
would that be better?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Thu, 26 Jun 2003, Vojtech Pavlik wrote:
> On Tue, Jun 24, 2003 at 06:33:02PM -0700, Larry McVoy wrote:
> > Anyway, we put 2.5.70 on bkbits.net which is a Tyan dual PII motherboard
> > w/ serverworks IDE and we started getting data corruption. So I just
> > installed 2.4.21 and we'll see if that works better.
>
> Eek. Serverworks IDE. I don't think they ever got that bit of their
> chipset right.
I've had sane behavior for most of the later 2.4 series...
Linux pith.uoregon.edu 2.4.21-rc1-ac4 #2 SMP Tue May 6 14:27:52 PDT 2003 i686 unknown
2:39pm up 36 days, 1:26, 2 users, load average: 0.02, 0.12, 0.13
[root@pith root]# cat /proc/ide/svwks
ServerWorks OSB4/CSB5/CSB6
ServerWorks OSB4 Chipset (rev 00)
------------------------------- General Status ---------------------------------
--------------- Primary Channel ---------------- Secondary Channel -------------
                enabled                          enabled
--------------- drive0 --------- drive1 -------- drive0 ---------- drive1 ------
DMA enabled:    yes              no              yes               yes
UDMA enabled:   yes              no              yes               yes
UDMA enabled:   2                0               2                 2
DMA enabled:    2                ?               2                 2
PIO enabled:    4                0               4                 4
PDC20267: not 100% native mode: will probe irqs later
PDC20267: ROM enabled at 0xfea70000
PDC20267: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode.
ide4: BM-DMA at 0xde80-0xde87, BIOS settings: hdi:DMA, hdj:pio
ide5: BM-DMA at 0xde88-0xde8f, BIOS settings: hdk:pio, hdl:pio
PDC20267: IDE controller at PCI slot 00:02.0
PDC20267: chipset revision 2
PDC20267: not 100% native mode: will probe irqs later
PDC20267: ROM enabled at 0xfea60000
PDC20267: (U)DMA Burst Bit DISABLED Primary PCI Mode Secondary PCI Mode.
PDC20267: FORCING BURST BIT 0x00->0x01 ACTIVE
ide6: BM-DMA at 0xdd80-0xdd87, BIOS settings: hdm:DMA, hdn:pio
ide7: BM-DMA at 0xdd88-0xdd8f, BIOS settings: hdo:DMA, hdp:DMA
SvrWks OSB4: IDE controller at PCI slot 00:0f.1
SvrWks OSB4: chipset revision 0
SvrWks OSB4: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:DMA
hda: IBM-DTLA-307015, ATA DISK drive
blk: queue c0361940, I/O limit 4095Mb (mask 0xffffffff)
hdc: IBM-DTLA-307015, ATA DISK drive
hdd: CD-540E, ATAPI CD/DVD-ROM drive
blk: queue c0361dbc, I/O limit 4095Mb (mask 0xffffffff)
hde: IBM-DTLA-307075, ATA DISK drive
blk: queue c0362238, I/O limit 4095Mb (mask 0xffffffff)
hdg: IBM-DTLA-307075, ATA DISK drive
blk: queue c03626b4, I/O limit 4095Mb (mask 0xffffffff)
hdi: IBM-DTLA-307075, ATA DISK drive
blk: queue c0362b30, I/O limit 4095Mb (mask 0xffffffff)
hdm: IBM-DTLA-307015, ATA DISK drive
blk: queue c0363428, I/O limit 4095Mb (mask 0xffffffff)
hdo: IBM-DTLA-307075, ATA DISK drive
blk: queue c03638a4, I/O limit 4095Mb (mask 0xffffffff)
--
--------------------------------------------------------------------------
Joel Jaeggli Academic User Services [email protected]
-- PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E --
In Dr. Johnson's famous dictionary patriotism is defined as the last
resort of the scoundrel. With all due respect to an enlightened but
inferior lexicographer I beg to submit that it is the first.
-- Ambrose Bierce, "The Devil's Dictionary"
At 2:21pm -0700 6/26/03, Larry McVoy wrote:
> > Eek. Serverworks IDE. I don't think they ever got that bit of their
>> chipset right.
>
>Hmm. I could shove in a promise card and put at least the repos on that,
>would that be better?
We routinely disable DMA for ServerWorks+IDE systems. Fortunately our
application doesn't care about disk performance. The symptom is that
a word or so of data gets inserted into a sector (or dropped; I
disremember right now).
A Promise card might be a good idea, though someone else will have to
vouch for it.
--
/Jonathan Lundell.
On Iau, 2003-06-26 at 22:21, Larry McVoy wrote:
> > Eek. Serverworks IDE. I don't think they ever got that bit of their
> > chipset right.
>
> Hmm. I could shove in a promise card and put at least the repos on that,
> would that be better?
Serverworks OSB4 IDE had a few problems that we now deal with.
Serverworks CSB5/CSB6 (Ie anything vaguely current) is great and hasn't
had many problems at all.
There are some small updates from Duncan in the 2.4.21 tree but nothing
"wrong" has been fixed for quite some time.
On Fri, Jun 27, 2003 at 11:53:20AM +0100, Alan Cox wrote:
> On Iau, 2003-06-26 at 22:21, Larry McVoy wrote:
> > > Eek. Serverworks IDE. I don't think they ever got that bit of their
> > > chipset right.
> >
> > Hmm. I could shove in a promise card and put at least the repos on that,
> > would that be better?
>
> Serverworks OSB4 IDE had a few problems that we now deal with.
> Serverworks CSB5/CSB6 (Ie anything vaguely current) is great and hasn't
> had many problems at all.
>
> There are some small updates from Duncan in the 2.4.21 tree but nothing
> "wrong" has been fixed for quite some time.
Is there a PCI EIDE card that you could suggest that would be ultra stable?
Or should I just toss this box and go build up another one?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Fri, 27 Jun 2003, Larry McVoy wrote:
> Is there a PCI EIDE card that you could suggest that would be ultra stable?
> Or should I just toss this box and go build up another one?
What about using one of the 3ware 7000-2/7500-4 cards? You can use it in
software mode if you want.
I believe the 2 port goes for about 147, the 4 port for 260ish
If you have the hardware, the 7500-4 is also 64bit/33mhz capable, along
with their -8 and -12 port models.
Mike
On Gwe, 2003-06-27 at 15:57, Larry McVoy wrote:
> Is there a PCI EIDE card that you could suggest that would be ultra stable?
> Or should I just toss this box and go build up another one?
PCI ones tend to be the most problematic. The on board CSB5/CSB6 should be
very reliable. Failing that you really get to choose between promise, highpoint
and SI (SI/CMD680). The CMD680 driver has had a few problems compared with the
others but docs exist under NDA. I'd use the onboard IDE.
I've always had very good luck with 3ware hardware. As I understand it
Serverworks officially says only to use their IDE for CDRom drives &
similar.
Nick
On Fri, 27 Jun 2003, Larry McVoy wrote:
> On Fri, Jun 27, 2003 at 11:53:20AM +0100, Alan Cox wrote:
> > On Iau, 2003-06-26 at 22:21, Larry McVoy wrote:
> > > > Eek. Serverworks IDE. I don't think they ever got that bit of their
> > > > chipset right.
> > >
> > > Hmm. I could shove in a promise card and put at least the repos on that,
> > > would that be better?
> >
> > Serverworks OSB4 IDE had a few problems that we now deal with.
> > Serverworks CSB5/CSB6 (Ie anything vaguely current) is great and hasn't
> > had many problems at all.
> >
> > There are some small updates from Duncan in the 2.4.21 tree but nothing
> > "wrong" has been fixed for quite some time.
>
> Is there a PCI EIDE card that you could suggest that would be ultra stable?
> Or should I just toss this box and go build up another one?
> --
> ---
> Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Fri, Jun 27, 2003 at 12:28:50PM -0400, [email protected] wrote:
> I've always had very good luck with 3ware hardware. As I understand it
> Serverworks officially says only to use their IDE for CDRom drives &
> similar.
Aye. From what I read when researching them a few years ago, they have
truly excellent motherboards except for one thing: IDE support. There
they suck the bigone. If you're gonna use a serverworks mb, use SCSI
with it. It's what they were designed for.
--
"How can I not love the Americans? They helped me with a flat tire the
other day," he said.
- http://www.toledoblade.com/apps/pbcs.dll/artikkel?SearchID=73139162287496&Avis=TO&Dato=20030624&Kategori=NEWS28&Lopenr=106240111&Ref=AR
On Gwe, 2003-06-27 at 17:37, CaT wrote:
> On Fri, Jun 27, 2003 at 12:28:50PM -0400, [email protected] wrote:
> > I've always had very good luck with 3ware hardware. As I understand it
> > Serverworks officially says only to use their IDE for CDRom drives &
> > similar.
>
> Aye. From what I read when researching them a few years ago, they have
> truly excellent motherboards except for one thing: IDE support. There
> they suck the bigone. If you're gonna use a serverworks mb, use SCSI
> with it. It's what they were designed for.
CSB5/CSB6 is a decent IDE interface; the old OSB4 stuff is not.
Hello Mike & Larry, having just done a search of prices on the
3W 12 port 7500 (7500-12), the lowest price on that card is $515.00.
The 8 port was ~$373.00. Hth, JimL
On Fri, 27 Jun 2003, Mike Dresser wrote:
> On Fri, 27 Jun 2003, Larry McVoy wrote:
> > Is there a PCI EIDE card that you could suggest that would be ultra stable?
> > Or should I just toss this box and go build up another one?
> What about using one of the 3ware 7000-2/7500-4 cards? You can use it in
> software mode if you want.
> I believe the 2 port goes for about 147, the 4 port for 260ish
> If you have the hardware, the 7500-4 is also 64bit/33mhz capable, along
> with their -8 and -12 port models.
--
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+
Well, I almost had everything backed up to the new rackspace server
and we crashed. We're in fsck now. I think the machine room overheated.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Fri, Jun 27, 2003 at 03:12:14PM -0700, Larry McVoy wrote:
> Well, I almost had everything backed up to the new rackspace server
> and we crashed. We're in fsck now. I think the machine room overheated.
Oh, yeah, in the meantime for the repos which made it to the backup machine,
you can get them like so:
OLD: bk://project.bkbits.net/repo
NEW: bk://rack2.bitmover.com/$C/project/repo
where
$C first letter of your project name
Not all of them are there, we were part way through the "l"s which is the
biggest directory (you guys need to be more imaginative in your naming).
All the other letters should be there though, so I want to hear about it
if you are a [a-km-z] project and you can't pull your data. Pushes don't
work.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
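For anyone scripting the switch, the OLD/NEW mapping above can be sketched
roughly as below. This is only a sketch: the function names are made up for
illustration, and it assumes project names are lowercase and contain no dots
or slashes, which the message does not actually promise.

```python
import re

# Matches the OLD form from the message: bk://project.bkbits.net/repo
OLD_URL = re.compile(r"^bk://(?P<project>[^./]+)\.bkbits\.net/(?P<repo>.+)$")


def backup_url(old_url: str) -> str:
    """Rewrite an OLD bkbits.net URL into the NEW rack2.bitmover.com form."""
    m = OLD_URL.match(old_url)
    if m is None:
        raise ValueError(f"not a bkbits.net URL: {old_url!r}")
    project, repo = m.group("project"), m.group("repo")
    # $C is the first letter of the project name.
    return f"bk://rack2.bitmover.com/{project[0]}/{project}/{repo}"


def expected_on_backup(project: str) -> bool:
    """True for [a-km-z] projects; the "l" directory was only partly copied."""
    return re.match(r"[a-km-z]", project) is not None
```

Remember that the backup machine is pull-only for now, so this is useful for
clones and pulls, not pushes.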
On Fri, Jun 27, 2003 at 03:12:14PM -0700, Larry McVoy wrote:
> Well, I almost had everything backed up to the new rackspace server
> and we crashed. We're in fsck now. I think the machine room overheated.
We're back. More cooling has been applied.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
Anyone know what this means? This is from the supposedly superduper
rackspace machine which has a Mylex SCSI RAID (see below):
DAC960#0: Physical Device 0:0 Sense Data Received
DAC960#0: Physical Device 0:0 Request Sense: Sense Key = 3, ASC = 11, ASCQ = 00
DAC960#0: Physical Device 0:0 Request Sense: Information = 0380A6CA 00000000
DAC960#0: Physical Device 0:0 Sense Data Received
DAC960#0: Physical Device 0:0 Request Sense: Sense Key = 3, ASC = 11, ASCQ = 00
DAC960#0: Physical Device 0:0 Request Sense: Information = 0380A6CA 00000000
DAC960#0: Physical Device 0:0 Errors: Parity = 0, Soft = 0, Hard = 0, Misc = 0
DAC960#0: Physical Device 0:0 Errors: Timeouts = 0, Retries = 0, Aborts = 0, Predicted = 0
Boot messages:
DAC960: ***** DAC960 RAID Driver Version 2.4.10 of 23 July 2001 *****
DAC960: Copyright 1998-2001 by Leonard N. Zubkoff <[email protected]>
DAC960#0: Configuring Mylex AcceleRAID 170 PCI RAID Controller
DAC960#0: Firmware Version: 7.01-00, Channels: 1, Memory Size: 32MB
DAC960#0: PCI Bus: 0, Device: 10, Function: 0, I/O Address: Unassigned
DAC960#0: PCI Address: 0xF6000000 mapped at 0xF883F000, IRQ Channel: 18
DAC960#0: Controller Queue Depth: 512, Maximum Blocks per Command: 2048
DAC960#0: Driver Queue Depth: 511, Scatter/Gather Limit: 128 of 257 Segments
DAC960#0: Physical Devices:
DAC960#0: 0:0 Vendor: IBM Model: IC35L036UCD210-0 Revision: S5BS
DAC960#0: Wide Synchronous at 160 MB/sec
DAC960#0: Serial Number: KQZ3S401
DAC960#0: Disk Status: Online, 71651328 blocks
DAC960#0: 0:1 Vendor: IBM Model: IC35L036UCD210-0 Revision: S5BS
DAC960#0: Wide Synchronous at 160 MB/sec
DAC960#0: Serial Number: KQZ3E613
DAC960#0: Disk Status: Online, 71651328 blocks
DAC960#0: 0:2 Vendor: IBM Model: IC35L036UCD210-0 Revision: S5BS
DAC960#0: Wide Synchronous at 160 MB/sec
DAC960#0: Serial Number: KQZ39942
DAC960#0: Disk Status: Online, 71651328 blocks
DAC960#0: 0:7 Vendor: MYLEX Model: AcceleRAID 170 Revision: 0701
DAC960#0: Wide Synchronous at 160 MB/sec
DAC960#0: Serial Number:
DAC960#0: Logical Drives:
DAC960#0: /dev/rd/c0d0: RAID-5, Online, 143302656 blocks
DAC960#0: Logical Device Uninitialized, BIOS Geometry: 255/63
DAC960#0: Stripe Size: 64KB, Segment Size: 8KB
DAC960#0: Read Cache Disabled, Write Cache Disabled
Partition check:
rd/c0d0: rd/c0d0p1 rd/c0d0p2 rd/c0d0p3
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
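As a sanity check on the boot messages above, the logical drive size is what
you would expect from three equal disks in RAID-5 (one disk's worth of space
goes to parity). A small sketch of the arithmetic follows; the 512-byte block
size is an assumption, the log does not state it.

```python
BLOCK_SIZE = 512  # bytes per block; assumed, not stated in the log


def raid5_usable_blocks(disks: int, blocks_per_disk: int) -> int:
    """Usable blocks of a RAID-5 set: one disk's worth is consumed by parity."""
    if disks < 3:
        raise ValueError("RAID-5 needs at least three disks")
    return (disks - 1) * blocks_per_disk


# Three physical devices of 71651328 blocks each, as reported by the driver.
usable = raid5_usable_blocks(3, 71651328)
print(usable)                      # matches the /dev/rd/c0d0 line: 143302656
print(usable * BLOCK_SIZE / 1e9)   # size in decimal GB under the 512-byte assumption
```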
On Fri, Jun 27, 2003 at 04:51:50PM -0700, Larry McVoy wrote:
> Anyone know what this means? This is from the supposedly superduper
> rackspace machine which has a Mylex SCSI RAID (see below):
>
> DAC960#0: Physical Device 0:0 Sense Data Received
> DAC960#0: Physical Device 0:0 Request Sense: Sense Key = 3, ASC = 11, ASCQ = 00
> DAC960#0: Physical Device 0:0 Request Sense: Information = 0380A6CA 00000000
> DAC960#0: Physical Device 0:0 Sense Data Received
> DAC960#0: Physical Device 0:0 Request Sense: Sense Key = 3, ASC = 11, ASCQ = 00
> DAC960#0: Physical Device 0:0 Request Sense: Information = 0380A6CA 00000000
> DAC960#0: Physical Device 0:0 Errors: Parity = 0, Soft = 0, Hard = 0, Misc = 0
> DAC960#0: Physical Device 0:0 Errors: Timeouts = 0, Retries = 0, Aborts = 0, Predicted = 0
Sense key 3 is MEDIUM ERROR. ASC 11 ASCQ 0 is an unrecovered read error.
-- Patrick Mansfield
On Fri, 27 Jun 2003, Larry McVoy wrote:
> Anyone know what this means? This is from the supposedly superduper
> rackspace machine which has a Mylex SCSI RAID (see below):
>
> DAC960#0: Physical Device 0:0 Sense Data Received
> DAC960#0: Physical Device 0:0 Request Sense: Sense Key = 3, ASC = 11, ASCQ = 00
> DAC960#0: Physical Device 0:0 Request Sense: Information = 0380A6CA 00000000
> DAC960#0: Physical Device 0:0 Sense Data Received
> DAC960#0: Physical Device 0:0 Request Sense: Sense Key = 3, ASC = 11, ASCQ = 00
> DAC960#0: Physical Device 0:0 Request Sense: Information = 0380A6CA 00000000
> DAC960#0: Physical Device 0:0 Errors: Parity = 0, Soft = 0, Hard = 0, Misc = 0
> DAC960#0: Physical Device 0:0 Errors: Timeouts = 0, Retries = 0, Aborts = 0, Predicted = 0
3h - MEDIUM ERROR: indicates that the command terminated with a
non-recovered error caused by a flaw in the medium (the medium depends on
the device type)
Have to lookup the asc and ascq with ibm, as it varies by manufacturer.
0Bh 01h DTLPWRSOMCAE Warning - specified temperature exceeded
My guess is that you're seeing a temperature warning on a drive.
Which makes sense with the overheated server room.
Mike
On Fri, 27 Jun 2003, Mike Dresser wrote:
> > DAC960#0: Physical Device 0:0 Request Sense: Sense Key = 3, ASC = 11, ASCQ = 00
>
> 0Bh 01h DTLPWRSOMCAE Warning - specified temperature exceeded
Hum, why did I convert decimal to hex?
http://www.t10.org/lists/asc-num.htm#ASC_03
device write fault.
Bad sectors on that drive?
Which could still be related to the heat ;)
Mike
Wow. So this is a different server, this is the second backup machine.
So in about a week we've had the primary die, the secondary have a bad
disk, and the second backup have a bad disk.
I don't know if you all realize this but at one point we had corrupted
data in several repositories and the backups were also shot. But because
BK replicates the data (as Peter Chubb says "you are lost in a maze of
BitKeeper repositories, all almost the same") I was able to look through
other replicas until I found the missing chunks and put them back.
Maybe we should take a page from Oracle and start advertising. How's this?
BitKeeper makes your source unbreakable
I'm only half joking. If SVN/CVS/Clearcase/anyone else had both the primary
and the backup fail, you are just screwed, there isn't anything you can do.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Fri, 27 Jun 2003, Mike Dresser wrote:
> Hum, why did i convert decimal to hex?
>
> http://www.t10.org/lists/asc-num.htm#ASC_03
>
> device write fault.
>
> Bad sectors on that drive?
>
> Which could still be related to the heat ;)
>
> Mike
Gah. I'm gonna close my mail session now before I screw this up again.
http://www.t10.org/lists/asc-num.htm#ASC_11
At least it's an unrecovered read error, which is somewhat related to a write
fault.
:/
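
The confusion in the messages above comes down to one detail: the thread's conclusion is that the ASC and ASCQ fields in the DAC960 log lines are to be read as hexadecimal, so `ASC = 11` means 0x11 rather than decimal 11 (0x0B). A minimal Python sketch of decoding such a line follows. The lookup tables hold only the handful of codes mentioned in this thread, as listed on the T10 page linked above; the function name and the regular expression are illustrative assumptions, not part of any driver or tool.

```python
import re

# Only the codes discussed in this thread; the full list is on t10.org.
SENSE_KEYS = {0x3: "MEDIUM ERROR"}
ASC_ASCQ = {
    (0x03, 0x00): "PERIPHERAL DEVICE WRITE FAULT",
    (0x0B, 0x01): "WARNING - SPECIFIED TEMPERATURE EXCEEDED",
    (0x11, 0x00): "UNRECOVERED READ ERROR",
}

# Hypothetical pattern for the "Request Sense" lines quoted above.
SENSE_RE = re.compile(
    r"Sense Key = ([0-9A-Fa-f]+), ASC = ([0-9A-Fa-f]+), ASCQ = ([0-9A-Fa-f]+)"
)


def decode_sense(line):
    """Return (sense key text, ASC/ASCQ text) for a log line, or None."""
    match = SENSE_RE.search(line)
    if match is None:
        return None
    # Assumption: all three fields are printed in hex, so parse them base 16.
    key, asc, ascq = (int(field, 16) for field in match.groups())
    return (
        SENSE_KEYS.get(key, "unknown sense key"),
        ASC_ASCQ.get((asc, ascq), "unknown ASC/ASCQ"),
    )


line = ("DAC960#0: Physical Device 0:0 Request Sense: "
        "Sense Key = 3, ASC = 11, ASCQ = 00")
print(decode_sense(line))
```

Parsing the same fields as decimal would look up (0x0B, 0x00) instead, which is how the temperature-warning guess crept in.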
Larry McVoy on Fri 27/06 17:16 -0700:
> I don't know if you all realize this but at one point we
> had corrupted data in several repositories and the backups
> were also shot.
ever hear of tapes?
how about SCSI?
On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> Larry McVoy on Fri 27/06 17:16 -0700:
> > I don't know if you all realize this but at one point we
> > had corrupted data in several repositories and the backups
> > were also shot.
>
> ever hear of tapes?
bkbits is 45GB of data and growing. Tapes are completely impractical,
that's why we have hot spares.
> how about SCSI?
The raid system that failed is SCSI.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
Hello Larry , Foooeeyyyyy ! It was not the SCSI drives that
failed you from what you had posted . It was the drivers which
were puking (afaict) . I'll admit that tape backup of heavily
changing data is a test in futility . But I'll NOT fail in making
my backups , ever !-) . Twyl , JimL
# dmesg | grep -B2 -A5 -i adaptec
Loading Adaptec I2O RAID: Version 2.4 Build 5
Detecting Adaptec I2O RAID controllers...
Adaptec I2O RAID controller 0 at faa48000 size=100000 irq=26
dpti: If you have a lot of devices this could take a few minutes.
dpti0: Reading the hardware resource table.
TID 008 Vendor: ADAPTEC Device: AIC-7899 Rev: 00000001
TID 525 Vendor: ADAPTEC Device: RAID-5 Rev: 380E
scsi2 : Vendor: Adaptec Model: 2110S FW:380E
Vendor: ADAPTEC Model: RAID-5 Rev: 380E
Type: Direct-Access ANSI SCSI revision: 02
# dmesg | grep sdd
SCSI device sdd: 177827840 512-byte hdwr sectors (91048 MB)
# df /home/archive /home/jiml
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/sdd1 57422288 24301792 30156504 45% /home/archive
/dev/sdd2 28703752 19319748 7902412 71% /home/jiml
--------
43621540
root@filesrv1:~ # tapeinfo -f /dev/sg3
Product Type: Tape Drive
Vendor ID: 'COMPAQ '
Product ID: 'TSL-9000 '
Revision: '2.06'
On Fri, 27 Jun 2003, Larry McVoy wrote:
> On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> > Larry McVoy on Fri 27/06 17:16 -0700:
> > > I don't know if you all realize this but at one point we
> > > had corrupted data in several repositories and the backups
> > > were also shot.
> >
> > ever hear of tapes?
>
> bkbits is 45GB of data and growing. Tapes are completely impractical,
> that's why we have hot spares.
>
> > how about SCSI?
>
> The raid system that failed is SCSI.
>
--
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+
On Fri, 2003-06-27 at 20:19, Larry McVoy wrote:
> On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> > Larry McVoy on Fri 27/06 17:16 -0700:
> > > I don't know if you all realize this but at one point we
> > > had corrupted data in several repositories and the backups
> > > were also shot.
> >
> > ever hear of tapes?
>
> bkbits is 45GB of data and growing. Tapes are completely impractical,
> that's why we have hot spares.
Boy you do need a good admin :) Done correctly, tapes are quite
practical for that amount of data. An LTO or SDLT drive would back the
entire 45GB thing up on a single tape, with room for at least one to two
more full backups. Granted, you're not going to have tape act as your
hot backup, but it is a good third line of defense. Plus data backed up
to tape is immune from human or software error that may otherwise affect
the hard-drive based data.
45GB of code is very compressible and I'm sure good chunks of that don't
change on a weekly basis. I'd imagine you could get a weekly or
bi-weekly full backup to tape in the span of about two hours, and then
do nightly differentials which would probably be only 15 minutes in
length. A filesystem capable of doing snapshots would ensure
consistency of the repositories on tape and would prevent you from
having to shut down bkbits while backing up.
--Josh
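
The "weekly full plus nightly differential" scheme described above can be sketched as a file selection step: a differential run picks up everything modified since the last full backup. The Python below is only an illustration under stated assumptions. It uses file modification times, which is not necessarily how any particular backup tool decides, and the function name is hypothetical.

```python
import os


def changed_since(root, last_full_time):
    """Yield paths under root modified after the last full backup.

    last_full_time is a POSIX timestamp (seconds since the epoch).
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime > last_full_time:
                    yield path
            except OSError:
                # File vanished between listing and stat; skip it.
                continue
```

A nightly job would pass the timestamp recorded when the last full backup started and hand the resulting list to whatever writes the tape. Taking the list from a filesystem snapshot, as suggested above, avoids files changing mid-run.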
One thing to consider is that people already whine about bkbits performance,
it's a heavily used (in the disk arm sense) machine. It's a lot easier to
just back the data up with BK itself.
That said, I'm shopping for SCSI RAIDs :) No tapes.
On Fri, Jun 27, 2003 at 09:08:06PM -0700, Joshua Penix wrote:
> On Fri, 2003-06-27 at 20:19, Larry McVoy wrote:
> > On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> > > Larry McVoy on Fri 27/06 17:16 -0700:
> > > > I don't know if you all realize this but at one point we
> > > > had corrupted data in several repositories and the backups
> > > > were also shot.
> > >
> > > ever hear of tapes?
> >
> > bkbits is 45GB of data and growing. Tapes are completely impractical,
> > that's why we have hot spares.
>
> Boy you do need a good admin :) Done correctly, tapes are quite
> practical for that amount of data. A LTO or SDLT drive would back the
> entire 45GB thing up on a single tape, with room for at least one to two
> more full backups. Granted, you're not going to have tape act as your
> hot backup, but it is a good third line of defense. Plus data backed up
> to tape is immune from human or software error that may otherwise affect
> the hard-drive based data.
>
> 45GB of code is very compressible and I'm sure good chunks of that don't
> change on a weekly basis. I'd imagine you could get a weekly or
> bi-weekly full backup to tape in the span of about two hours, and then
> do nightly differentials which would probably be only 15 minutes in
> length. A filesystem capable of doing snapshots would ensure
> consistency of the repositories on tape and would prevent you from
> having to shutdown bkbits while backing up.
>
> --Josh
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Fri, Jun 27, 2003 at 09:08:06PM -0700, Joshua Penix wrote:
> On Fri, 2003-06-27 at 20:19, Larry McVoy wrote:
> > On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> > > Larry McVoy on Fri 27/06 17:16 -0700:
> > > > I don't know if you all realize this but at one point we
> > > > had corrupted data in several repositories and the backups
> > > > were also shot.
> > >
> > > ever hear of tapes?
> >
> > bkbits is 45GB of data and growing. Tapes are completely impractical,
> > that's why we have hot spares.
>
> Boy you do need a good admin :) Done correctly, tapes are quite
> practical for that amount of data.
Totally. 45GB of data is nothing. Even a terabyte is easily backed up
with today's [tape] technology. You can start talking about impractical
when you get to petabytes. :-) (ok, dozens of terabytes)
Hot spares perform a completely different function than backups.
/fc
On Fri, Jun 27, 2003 at 10:42:00PM -0700, Frank Cusack wrote:
> On Fri, Jun 27, 2003 at 09:08:06PM -0700, Joshua Penix wrote:
> > On Fri, 2003-06-27 at 20:19, Larry McVoy wrote:
> > > On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> > > > Larry McVoy on Fri 27/06 17:16 -0700:
> > > > > I don't know if you all realize this but at one point we
> > > > > had corrupted data in several repositories and the backups
> > > > > were also shot.
> > > >
> > > > ever hear of tapes?
> > >
> > > bkbits is 45GB of data and growing. Tapes are completely impractical,
> > > that's why we have hot spares.
> >
> > Boy you do need a good admin :) Done correctly, tapes are quite
> > practical for that amount of data.
>
> Totally. 45GB of data is nothing. Even a terabyte is easily backed up
> with today's [tape] technology. You can start talking about impractical
> when you get to petabytes. :-) (ok, dozens of terabytes)
Sounds great. Send me a tape drive and some media and I'll be happy to
use it. Let's not forget that this is a service we provide for free that
already has a fixed $1400/month cost not counting human costs. If you
are volunteering to donate the hardware and the media that's great, we
appreciate it.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
LTO drives are probably a lot more expensive than the entire computers he
is using as backups (one source listed the individual drives at ~$10K;
I'll bet Larry's backup systems are under $5K, probably under $3K)
David Lang
On 27 Jun 2003, Joshua Penix wrote:
> Date: 27 Jun 2003 21:08:06 -0700
> From: Joshua Penix <[email protected]>
> To: Linux Kernel Mailing List <[email protected]>
> Subject: Re: bkbits.net is down
>
> On Fri, 2003-06-27 at 20:19, Larry McVoy wrote:
> > On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> > > Larry McVoy on Fri 27/06 17:16 -0700:
> > > > I don't know if you all realize this but at one point we
> > > > had corrupted data in several repositories and the backups
> > > > were also shot.
> > >
> > > ever hear of tapes?
> >
> > bkbits is 45GB of data and growing. Tapes are completely impractical,
> > that's why we have hot spares.
>
> Boy you do need a good admin :) Done correctly, tapes are quite
> practical for that amount of data. A LTO or SDLT drive would back the
> entire 45GB thing up on a single tape, with room for at least one to two
> more full backups. Granted, you're not going to have tape act as your
> hot backup, but it is a good third line of defense. Plus data backed up
> to tape is immune from human or software error that may otherwise affect
> the hard-drive based data.
>
> 45GB of code is very compressible and I'm sure good chunks of that don't
> change on a weekly basis. I'd imagine you could get a weekly or
> bi-weekly full backup to tape in the span of about two hours, and then
> do nightly differentials which would probably be only 15 minutes in
> length. A filesystem capable of doing snapshots would ensure
> consistency of the repositories on tape and would prevent you from
> having to shutdown bkbits while backing up.
>
> --Josh
Larry McVoy on Fri 27/06 20:19 -0700:
> > ever hear of tapes?
>
> bkbits is 45GB of data and growing. Tapes are completely
> impractical, that's why we have hot spares.
You've got to be kidding. My AIT2 tapes do 50G each,
uncompressed. That is years-old technology. I have a
Qualstar library, costing only a few thousand dollars,
that has 20 tape slots.
> > how about SCSI?
>
> The raid system that failed is SCSI.
ok, well I stand corrected here, I thought you were using
IDE.
On Fri, Jun 27, 2003 at 03:15:12PM -0700, Larry McVoy wrote:
> Not all of them are there, we were part way through the "l"s which is the
> biggest directory (you guys need to be more imaginative in your naming).
> All the other letters should be there though, so I want to hear about it
> if you are a [a-km-z] project and you can't pull your data. Pushes don't
> work.
Will pushes ever work? It'd save me quite some work.
--
Vojtech Pavlik
SuSE Labs, SuSE CR
On Sat, Jun 28, 2003 at 10:14:22AM +0200, Vojtech Pavlik wrote:
> On Fri, Jun 27, 2003 at 03:15:12PM -0700, Larry McVoy wrote:
>
> > Not all of them are there, we were part way through the "l"s which is the
> > biggest directory (you guys need to be more imaginative in your naming).
> > All the other letters should be there though, so I want to hear about it
> > if you are a [a-km-z] project and you can't pull your data. Pushes don't
> > work.
>
> Will pushes ever work? It'd save me quite some work.
We've been back live on bkbits.net since some time yesterday. Everything's
back to as normal as it gets. Just pull/push as you always have.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Sat, Jun 28, 2003 at 04:08:40AM -0400, Scott McDermott wrote:
> > bkbits is 45GB of data and growing. Tapes are completely
> > impractical, that's why we have hot spares.
>
> You've got to be kidding. My AIT2 tapes do 50G each,
> uncompressed. Those are years old technology. I have a
> Qualstar library only a few thousand dollars that has 20
> tape slots.
I haven't had much luck with tape, I've found them to be fairly unreliable
and slow over the years. I've moved to using disk as backup and it works.
It worked quite nicely in this case, I had a handful of places I needed
random access to in order to fix up the problem. Tape would have sucked.
> > > how about SCSI?
> >
> > The raid system that failed is SCSI.
>
> ok, well I stand corrected here, I thought you were using IDE.
It's a mix: 1 IDE, one 3ware SCSI/IDE, and one real SCSI.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
On Sad, 2003-06-28 at 04:19, Larry McVoy wrote:
> bkbits is 45GB of data and growing. Tapes are completely impractical,
> that's why we have hot spares.
Overhot spares included 8).
Hot spares won't always save you. I've worked at a telco where we lost
all the disks, the hosts and the hot spares to a PSU failure. The
replication has a lot going for it 8)
On Sat, Jun 28, 2003 at 08:14:16PM +0100, Alan Cox wrote:
> On Sad, 2003-06-28 at 04:19, Larry McVoy wrote:
> > bkbits is 45GB of data and growing. Tapes are completely impractical,
> > that's why we have hot spares.
>
> Overhot spares included 8).
>
> Hot spares wont save you always. I've worked at a telco where we lost
> all the disks. the hosts and the hot spares to a PSU failure. The
> replication has a lot going for it 8)
Yup. We're looking at having replicas in at least 2, maybe 3 locations,
one in San Francisco, one in Texas, and one in either North Carolina
or in Oregon. That should cover our butt for a while.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
Tapes are a pain; but in the 40GB range, is it worth considering
a pile of external USB/Firewire hard drives?
(I've also had some bad luck with IDE RAIDs on Linux; now that the
Western Digital 200GBs have the new firmware on I thought they might
be stable - until 800GB of RAID fell on its face last week. We've
now pulled the Promise and HPT cards out and replaced them with
a 3ware 8-port.)
Dave
---------------- Have a happy GNU millennium! ----------------------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
On Sat, Jun 28, 2003 at 08:38:57PM +0100, Dr. David Alan Gilbert wrote:
> Tapes are a pain; but at the type of 40GB range is it worth considering
> a pile of external USB/Firewire hard drives?
Maybe it's not obvious to the non-BK users. BK _replicates_ the database
of revision history.
cd /tmp
bk clone /repos/l/linux/linux-2.5
rm -rf /repos/l/linux/linux-2.5
bk clone /tmp/linux-2.5 /repos/l/linux/linux-2.5
That's a no-op. Nothing was lost. And BK is excellent at incremental
updates, far better than anything else in existence.
And BK does in-file and cross-file integrity checks.
So backing up using BK to another mirror is faster, simpler, and more
reliable.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
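
The recovery described earlier in the thread (looking through other replicas until an intact copy of each damaged piece turns up) can be illustrated generically. The sketch below is not BitKeeper's mechanism and uses no BK commands. It assumes a known-good SHA-256 digest is available for each file, and all names in it are hypothetical.

```python
import hashlib
import os


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_intact_copy(rel_path, expected_digest, replicas):
    """Return the first replica copy whose digest matches, or None."""
    for replica in replicas:
        candidate = os.path.join(replica, rel_path)
        try:
            if sha256_of(candidate) == expected_digest:
                return candidate
        except OSError:
            # Missing or unreadable in this replica; try the next one.
            continue
    return None
```

The point of the design is that any replica can serve as the source of a repair as long as there is an independent integrity check to tell good copies from bad ones.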
On Sat, 28 Jun 2003 07:07:00 PDT, Larry McVoy said:
> I haven't had much luck with tape, I've found them to be fairly unreliable
> and slow over the years. I've moved to using disk as backup and it works.
> It worked quite nicely in this case, I had a handful of places I needed
> random access to in order to fix up the problem. Tape would have sucked.
One thing that tape gives you that most backup-to-disk don't is *OFFSITE*
backup. Unless you're backing up over a fiberchannel or other network to
another machine in a *REMOTE* building, some things can take out the whole
enchilada.
Ask the crew at the Uni of Twente NOC......
(For what it's worth, our offsite vault is about 2 miles from our machine room)
On Sad, 2003-06-28 at 20:38, Dr. David Alan Gilbert wrote:
> Tapes are a pain; but at the type of 40GB range is it worth considering
> a pile of external USB/Firewire hard drives?
I'm testing the USB2 disk idea at the moment. Big problem is performance
- 5Mbytes/second isn't the best backup rate in the world.
On Sat, Jun 28, 2003 at 09:31:30PM +0100, Alan Cox wrote:
> On Sad, 2003-06-28 at 20:38, Dr. David Alan Gilbert wrote:
> > Tapes are a pain; but at the type of 40GB range is it worth considering
> > a pile of external USB/Firewire hard drives?
>
> I'm testing the USB2 disk idea at the moment. Big problem is performance
> - 5Mbytes/second isnt the best backup rate in the world.
Well, still quite a bit faster than a DDS3 anyway, and probably faster than a DDS4...
Cheers,
Willy
On Sat, 28 Jun 2003 12:18:47 PDT, Larry McVoy said:
> Yup. We're looking at having replicas in at least 2, maybe 3 locations,
> one in San Francisco, one in Texas, and one in either North Carolina
> or in Oregon. That should cover our butt for a while.
That's certainly physically diverse enough, although our auditors would *still*
whine because all copies are online, creating a risk of a virus, hack, or
software error nuking all the online copies.
I haven't decided yet if our auditors need tin foil helmets or not. ;)
> On Sad, 2003-06-28 at 20:38, Dr. David Alan Gilbert wrote:
> > Tapes are a pain; but at the type of 40GB range is it worth considering
> > a pile of external USB/Firewire hard drives?
> I'm testing the USB2 disk idea at the moment. Big problem is performance
> - 5Mbytes/second isnt the best backup rate in the world.
If the issue is the time the backup itself takes, 2 hours isn't a big deal,
it'll finish over a long lunch break. If the issue is having to lock out
write access or otherwise stabilize the actual data for the time it takes to
backup, just stage a copy of the backup to a local disk and then backup to
external disk from there.
DS
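
For reference, the "2 hours" figure follows from the numbers already in the thread. A minimal Python sketch of the arithmetic, assuming a sustained transfer rate and decimal units (1 GB = 1000 MB); the function name is illustrative:

```python
def backup_hours(size_gb, rate_mb_per_s):
    """Hours needed to stream size_gb at a sustained rate (1 GB = 1000 MB)."""
    return size_gb * 1000 / rate_mb_per_s / 3600


for label, size_gb, rate in [
    ("40 GB at 5 MB/s", 40, 5),
    ("45 GB at 5 MB/s", 45, 5),
]:
    print(f"{label}: {backup_hours(size_gb, rate):.1f} h")
# 40 GB at 5 MB/s: 2.2 h
# 45 GB at 5 MB/s: 2.5 h
```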
On Sat, Jun 28, 2003 at 01:55:11PM -0700, David Schwartz wrote:
>
> > On Sad, 2003-06-28 at 20:38, Dr. David Alan Gilbert wrote:
>
> > > Tapes are a pain; but at the type of 40GB range is it worth considering
> > > a pile of external USB/Firewire hard drives?
>
> > I'm testing the USB2 disk idea at the moment. Big problem is performance
> > - 5Mbytes/second isnt the best backup rate in the world.
>
> If the issue is the time the backup itself takes, 2 hours isn't a big deal,
> it'll finish over a long lunch break. If the issue is having to lock out
> write access or otherwise stabilize the actual data for the time it takes to
> backup, just stage a copy of the backup to a local disk and then backup to
> external disk from there.
If we're talking about BK, BK already has repository level locking to get
stable snapshots.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
Alan Cox wrote:
> On Sad, 2003-06-28 at 20:38, Dr. David Alan Gilbert wrote:
>
>>Tapes are a pain; but at the type of 40GB range is it worth considering
>>a pile of external USB/Firewire hard drives?
>
>
> I'm testing the USB2 disk idea at the moment. Big problem is performance
> - 5Mbytes/second isnt the best backup rate in the world.
IIRC, some weeks ago I measured 13 Mbytes/second using an el cheapo
external box with both USB2 and IEEE-1394 (tested with the latter), a
Maxtor 6Y120L0 and vanilla Linux 2.4.20.
I don't have the hardware handy now, but I'm almost sure of this number.
--
Abramo Bagnara mailto:[email protected]
Opera Unica Phone: +39.546.656023
Via Emilia Interna, 140
48014 Castel Bolognese (RA) - Italy
* Alan Cox ([email protected]) wrote:
> On Sad, 2003-06-28 at 20:38, Dr. David Alan Gilbert wrote:
> > Tapes are a pain; but at the type of 40GB range is it worth considering
> > a pile of external USB/Firewire hard drives?
>
> I'm testing the USB2 disk idea at the moment. Big problem is performance
> - 5Mbytes/second isnt the best backup rate in the world.
Hmm - why should it suck so badly? Shouldn't USB 2 (yes I mean the
480Mbps) manage 40MByte/s+ ?
Disc struck me as so nice in the sense that it's a file system and you
don't need extra software, and that the recovery time is near instant.
(Incidentally - putting it in a RAID1 with a main disc would seem
an interesting option and just letting RAID take a copy of the file
system)
Dave
---------------- Have a happy GNU millennium! ----------------------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/
On Sunday, 29 June 2003 00:15, Dr. David Alan Gilbert wrote:
> * Alan Cox ([email protected]) wrote:
> > On Sad, 2003-06-28 at 20:38, Dr. David Alan Gilbert wrote:
> > > Tapes are a pain; but at the type of 40GB range is it worth considering
> > > a pile of external USB/Firewire hard drives?
> >
> > I'm testing the USB2 disk idea at the moment. Big problem is performance
> > - 5Mbytes/second isnt the best backup rate in the world.
>
> Hmm - why should it suck so badly? Shouldn't USB 2 (yes I mean the
> 480Mbps) manage 40MByte/s+ ?
It should. 2.5 is a lot closer to that.
Regards
Oliver
On Sad, 2003-06-28 at 23:15, Dr. David Alan Gilbert wrote:
> Hmm - why should it suck so badly? Shouldn't USB 2 (yes I mean the
> 480Mbps) manage 40MByte/s+ ?
I don't think you get the full 480Mbit/sec on a single device.
5Mbyte/sec is a bit low but that may be some of the remaining work on
the USB EHCI drivers. I've not tried 2.5.x which may be way better here.
> From: Alan Cox
> Date: 2003-06-28 23:13:55
>
> On Sad, 2003-06-28 at 23:15, Dr. David Alan Gilbert wrote:
> > Hmm - why should it suck so badly? Shouldn't USB 2 (yes I mean the
> > 480Mbps) manage 40MByte/s+ ?
Custom devices certainly have done that, with drivers that keep
everything busy. Last fall, one person reported 38+ MB/sec
from a VT8235. The theoretical peak bandwidth for bulk traffic
(what most folk want) is 52 MByte/sec.
A Western Digital drive I tried gave me 27 MByte/sec with USB.
And I hate to say that the FireWire mode didn't work at all,
since I was curious how they'd compare! (2.5.71 or so.)
> I don't think you get the full 480Mbit/sec on a single device.
> 5Mbyte/sec is a bit low
Some combinations of EHCI silicon, USB-to-IDE adapter, and IDE
work better than others ... I once switched a drive from one
EHCI controller to another (same host and OS, didn't reboot),
and went from 5 MB/s to 19 MB/sec. That was on 2.4; with the
2.5 usb-storage, both controllers gave the higher speed.
> but that may be some of the remaining work on
> the USB EHCI drivers. I've not tried 2.5.x which may be way better here.
The key difference in 2.5 is that usb-storage queues requests,
no more slow page-at-a-time I/O. It's the same EHCI driver
underneath, lately -- much improved since last September (or so)
when it first started to generate real user feedback.
- Dave
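
To put the reported USB 2 figures side by side, the short Python sketch below expresses each one as a fraction of the 52 MByte/sec bulk peak quoted above. The 480 Mbit/s signalling rate works out to 60 MB/s before any protocol overhead. The labels are only shorthand for the reports in this thread, and the helper name is illustrative.

```python
RAW_MB_PER_S = 480 / 8       # 480 Mbit/s signalling rate, in MB/s
BULK_PEAK_MB_PER_S = 52      # peak for bulk traffic, as quoted above


def percent_of(rate, ceiling):
    """Return rate as a percentage of ceiling."""
    return 100 * rate / ceiling


# USB 2 throughput figures reported in this thread, in MB/s.
reports = {
    "2.4 kernel, slower EHCI controller": 5,
    "2.4 kernel, faster EHCI controller": 19,
    "Western Digital drive, 2.5.71 or so": 27,
    "VT8235 report from last fall": 38,
}

print(f"raw line rate: {RAW_MB_PER_S:.0f} MB/s")
for label, rate in reports.items():
    share = percent_of(rate, BULK_PEAK_MB_PER_S)
    print(f"{label}: {rate} MB/s, {share:.0f}% of bulk peak")
```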
Hi,
a friend of mine talked to a vendor of a USB 2.0/IDE adapter.
They told him that he won't get more than 12 MB/sec (and even some different
adapters we looked at were not faster), and I saw ~8 MB/sec copying a large
file to an elderly hard disk. This hard disk does ~10 MB/sec when connected to
an IDE port. We had no faster hard disk to test, but I wouldn't blame the EHCI
driver (which worked fine with my SiS746 board) but rather the adapter ;o).
Glück Auf
Volker
On Sat, 2003-06-28 at 22:31, Alan Cox wrote:
> I'm testing the USB2 disk idea at the moment. Big problem is performance
> - 5Mbytes/second isnt the best backup rate in the world.
Which is 300 Mbytes/minute, still faster than many tapes.
In my experience, IEEE1394 (aka Firewire/iLink) is also
always faster than USB2.
--
Servus,
Daniel
Hello Daniel ,
On Sun, 29 Jun 2003, Daniel Egger wrote:
> On Sat, 2003-06-28 at 22:31, Alan Cox wrote:
> > I'm testing the USB2 disk idea at the moment. Big problem is performance
> > - 5Mbytes/second isnt the best backup rate in the world.
> Which are 300Mbytes/minute, still faster than many tapes.
^^^^^^^^^^^^^^^^
5MB/sec is faster than MOST tape drives? Or ???
If you are talking older SCSI-2 or SCSI-1 drives, yes.
But on a properly tuned system any of the newer tape drives should be
able to beat that hands down.
> I've also made the experience that IEEE1394 (aka Firewire/iLink) is
> always faster than USB2.
I'd like to see a show of hands from those who have this working at
anywhere near line rate (60% is close enough)?
Tia , JimL
--
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+
On Sun, 2003-06-29 at 12:24, Mr. James W. Laferriere wrote:
> > Which are 300Mbytes/minute, still faster than many tapes.
> ^^^^^^^^^^^^^^^^
> 5MB/Sec is faster than MOST tapes drivs ? Or ???
> If you are talking older scsi-2 or 1 drives yes .
> But on a properly tuned system any of the newer tape drives s/b
> able beat that hands down .
To cite a popular manufacturer directly from the homepage:
"... and a data transfer rate of up to 5 megabytes per second"
Please note the "up to" and that this drive is an affordable latest
generation ADR streamer.
Last time I looked the speed was still specified per minute since they
are (were?) so slow. Also note that an iPod has a data transfer rate of
5MB/s; modern drives (even when crammed into a USB2/Firewire casing)
beat that by an order of magnitude yet are a whole lot cheaper than a good
streamer.
FWIW: The streamer in my office is in the happy 100MB/m (compressed)
league.
Streamers are only interesting for lots of data, for normal use they're
not only too slow but also too expensive.
> I'd like to see a raising hands that have this functional at
> anywhere near line (60% is close enough) rate ?
Check tomshardware or whatever magazine you prefer to read. Modern cases
(like the ones with the Oxford chipsets) deliver almost the same
performance as a built-in controller.
--
Servus,
Daniel
I don't really care which is faster, so please don't cc me on any replies to
this thread; I'm just relating my experience.
I've tried OpenGFS on an external firewire hard drive, and I got 13 MB/s (it
was read, but it shows that the bus can at least handle that much) on a
WD310100 (which is a pretty old 10GB udma33 hard drive). That would probably
be even better with ext2 or some fs other than OpenGFS.
--Brian Jackson
On Sunday 29 June 2003 05:24 am, Mr. James W. Laferriere wrote:
> Hello Daniel ,
>
> On Sun, 29 Jun 2003, Daniel Egger wrote:
> > On Sat, 2003-06-28 at 22:31, Alan Cox wrote:
> > > I'm testing the USB2 disk idea at the moment. Big problem is
> > > performance - 5Mbytes/second isnt the best backup rate in the world.
> >
> > Which are 300Mbytes/minute, still faster than many tapes.
>
> ^^^^^^^^^^^^^^^^
> 5MB/Sec is faster than MOST tapes drivs ? Or ???
> If you are talking older scsi-2 or 1 drives yes .
> But on a properly tuned system any of the newer tape drives s/b
> able beat that hands down .
>
> > I've also made the experience that IEEE1394 (aka Firewire/iLink) is
> > always faster than USB2.
>
> I'd like to see a raising hands that have this functional at
> anywhere near line (60% is close enough) rate ?
> Tia , JimL
--
OpenGFS -- http://opengfs.sourceforge.net
Home -- http://www.brianandsara.net
On Fri, 27 Jun 2003, Larry McVoy wrote:
> Maybe we should take a page from Oracle and start advertising. How's this?
>
> BitKeeper makes your source unbreakable
>
> I'm only half joking. If SVN/CVS/Clearcase/anyone else had both the primary
> and the backup fail, you are just screwed, there isn't anything you can do.
With arch you wouldn't have any problems recovering either. You probably
wouldn't need as much hassle: just pick any mirror and mirror back.
Pau
On Sun, Jun 29, 2003 at 03:14:25PM +0200, Daniel Egger wrote:
>
> > 5MB/Sec is faster than MOST tapes drivs ? Or ???
> > If you are talking older scsi-2 or 1 drives yes .
> > But on a properly tuned system any of the newer tape drives s/b
> > able beat that hands down .
>
> To cite a popular manufacturer directly from the homepage:
> "... and a data transfer rate of up to 5 megabytes per second"
>
> Please note the "up to" and that this drive is an affordable latest
> generation ADR streamer.
>
While a little expensive, LTO drives can do 15MB/sec native. Even if
bkbits does have 2 or 3 backup sites, it is still a good idea to have a tape
backup of the data.
Andy
Does anyone know how to enable caching on a Mylex AcceleRAID 170 (aka
DAC960) SCSI controller? We've got the bkbits.net data mirrored at
rackspace.com but the controllers are read and write cache disabled.
I read the driver source and it doesn't offer this ability via the
/proc configuration space which is where I would have expected it.
Is this a BIOS-only thing?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
This is why all my clients get HP 7504a tape drives in their tape servers.
40/80GB tape that can do disaster recovery is a GOOD thing! :-)
--
/"\ / For information and quotes, email us at
\ / ASCII RIBBON CAMPAIGN / [email protected]
X AGAINST HTML MAIL / http://www.lrsehosting.com/
/ \ AND POSTINGS / [email protected]
-------------------------------------------------------------------------
-----Original Message-----
From: [email protected]
[mailto:[email protected]]On Behalf Of Joshua Penix
Sent: Friday, June 27, 2003 11:08 PM
To: Linux Kernel Mailing List
Subject: Re: bkbits.net is down
On Fri, 2003-06-27 at 20:19, Larry McVoy wrote:
> On Fri, Jun 27, 2003 at 08:51:40PM -0400, Scott McDermott wrote:
> > Larry McVoy on Fri 27/06 17:16 -0700:
> > > I don't know if you all realize this but at one point we
> > > had corrupted data in several repositories and the backups
> > > were also shot.
> >
> > ever hear of tapes?
>
> bkbits is 45GB of data and growing. Tapes are completely impractical,
> that's why we have hot spares.
Boy you do need a good admin :) Done correctly, tapes are quite
practical for that amount of data. A LTO or SDLT drive would back the
entire 45GB thing up on a single tape, with room for at least one to two
more full backups. Granted, you're not going to have tape act as your
hot backup, but it is a good third line of defense. Plus data backed up
to tape is immune from human or software error that may otherwise affect
the hard-drive based data.
45GB of code is very compressible and I'm sure good chunks of that don't
change on a weekly basis. I'd imagine you could get a weekly or
bi-weekly full backup to tape in the span of about two hours, and then
do nightly differentials which would probably be only 15 minutes in
length. A filesystem capable of doing snapshots would ensure
consistency of the repositories on tape and would prevent you from
having to shut down bkbits while backing up.
--Josh
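As a rough check on the backup-window estimate above, here is a minimal
sketch of the arithmetic using only the figures quoted in this thread
(45GB of data, 15MB/sec native for LTO). The function name is
hypothetical, and the calculation assumes the drive streams at its full
native rate with no compression and no filesystem overhead, which is an
idealization.

```python
def transfer_minutes(size_gb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to stream size_gb gigabytes at rate_mb_per_s.

    Assumes 1 GB = 1024 MB and a constant, sustained transfer rate.
    """
    seconds = size_gb * 1024 / rate_mb_per_s
    return seconds / 60


# Figures from the thread: 45GB repository set, 15MB/sec native LTO rate.
full_backup = transfer_minutes(45, 15)
print(f"full backup at native rate: about {full_backup:.0f} minutes")
```

At the native rate the raw transfer comes in under an hour, so a
two-hour window leaves room for the disks failing to keep the tape
streaming.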
Hello Larry, the email inserted below was posted to the list a while
back. See after the .sig. JimL
On Tue, 1 Jul 2003, Larry McVoy wrote:
> Does anyone know how to enable caching on a mylex AcceleRAID 170 (aka
> DAC960) SCSI controller? We've got the bkbits.net data mirrored at
> rackspace.com but the controllers are read and write cache disabled.
> I read the driver source and it doesn't offer this ability via the
> /proc configuration space which is where I would have expected it.
> Is this a bios only thing?
--
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+
On Mon, 5 Feb 2001, Dan Jones wrote:
> Date: Mon, 05 Feb 2001 17:28:05 -0800
> From: Dan Jones <[email protected]>
> To: octave klaba <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: DAC960 & cache
>
> octave klaba wrote:
> >
> > Hello,
> >
> > On Mylex 170 we have, the read/write cache disabled.
> > How can I switch it on ? Since we have it with raid-1
> > and the hd has the cache too, do we lose the data if
> > the server caches ?
> >
> > Thanks
> > Octave
> >
> > DAC960#0: Logical Drives:
> > DAC960#0: /dev/rd/c0d0: RAID-1, Online, 35807232 blocks
> > DAC960#0: Logical Device Initialized, BIOS Geometry:
> > 255/63
> > DAC960#0: Stripe Size: 64KB, Segment Size: 8KB
> > DAC960#0: Read Cache Disabled, Write Cache Disabled
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to [email protected]
>
> The cache policy can be switched via the Mylex firmware. Check out
> Mylex's searchable FAQ. Try searching on enable cache. The
> instructions cover the DACCF firmware. EzAssist will be different:
>
> http://mylex.custhelp.com/cgi-bin/mylex.cfg/php/enduser/home.php
>
> The A170 doesn't have a battery, so remember to allow the cache to
> flush before resetting or power cycling your system:
> --
> Dan Jones, Manager, Storage Products VA Linux Systems
> V:(510)687-6737 F:(510)683-8602 47071 Bayside Parkway
> [email protected] Fremont, CA 94538
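For anyone wanting to check the cache state across several machines
without reading boot logs by hand, here is a minimal sketch that scans
log text for the DAC960 cache line shown in the quoted message above.
The function name is hypothetical, and the "Enabled" wording is an
assumption: only the "Disabled" form appears in this thread.

```python
import re

# Matches the line format quoted in the thread, e.g.
#   DAC960#0: Read Cache Disabled, Write Cache Disabled
# The "Enabled" alternative is assumed, not confirmed by the thread.
CACHE_RE = re.compile(
    r"DAC960#(?P<ctrl>\d+):\s+Read Cache (?P<read>Enabled|Disabled),\s+"
    r"Write Cache (?P<write>Enabled|Disabled)"
)


def cache_status(log_text):
    """Return a list of (controller, read_enabled, write_enabled) tuples."""
    return [
        (int(m.group("ctrl")),
         m.group("read") == "Enabled",
         m.group("write") == "Enabled")
        for m in CACHE_RE.finditer(log_text)
    ]


SAMPLE = """DAC960#0: Stripe Size: 64KB, Segment Size: 8KB
DAC960#0: Read Cache Disabled, Write Cache Disabled
"""

for ctrl, read_on, write_on in cache_status(SAMPLE):
    print(f"controller {ctrl}: read cache {read_on}, write cache {write_on}")
```

This only reports what the driver logged at initialization; per the
reply above, changing the policy is done in the Mylex firmware.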
On Sun, Jun 29, 2003 at 12:13:55AM +0100, Alan Cox wrote:
> On Sad, 2003-06-28 at 23:15, Dr. David Alan Gilbert wrote:
> > Hmm - why should it suck so badly? Shouldn't USB 2 (yes I mean the
> > 480Mbps) manage 40MByte/s+ ?
>
> I don't think you get the full 480Mbit/sec on a single device.
> 5Mbyte/sec is a bit low but that may be some of the remaining work on
> the USB EHCI drivers. I've not tried 2.5.x which may be way better here.
As a random data point, we're using external USB2 Maxtor IDE
drives (they also have firewire support, but the weird grille on the
back of our server prevented the firewire plug from properly inserting
into the combo USB2/firewire card I bought) for our backups on
2.4.21-rc1-rmap15g, and I was seeing around 10Mbyte/sec when I was
monitoring it the first couple times. It's also been much more
reliable than the tape solutions we've tried, and it's hard to beat
the price.
--
Zed Pobre <[email protected]> a.k.a. Zed Pobre <[email protected]>
PGP key and fingerprint available on finger; encrypted mail welcomed.
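To put the USB 2 numbers in this exchange side by side, here is a small
sketch converting the 480Mbit/sec signalling rate to megabytes per
second and comparing the observed rates mentioned above (5Mbyte/sec and
10Mbyte/sec) against it. The names are hypothetical, and the raw
signalling rate is an upper bound only: protocol overhead is ignored,
so real payload throughput will be lower than the converted figure.

```python
USB2_SIGNALLING_MBIT = 480  # raw signalling rate quoted in the thread


def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert megabits/sec to megabytes/sec (8 bits per byte)."""
    return mbit_per_s / 8


RAW_MBYTE = mbit_to_mbyte_per_s(USB2_SIGNALLING_MBIT)


def fraction_of_raw(observed_mbyte_per_s: float) -> float:
    """Observed throughput as a fraction of the raw signalling rate."""
    return observed_mbyte_per_s / RAW_MBYTE


print(f"raw ceiling: {RAW_MBYTE:.0f} MB/s")
for observed in (5, 10):  # rates reported in the thread
    print(f"{observed} MB/s is {fraction_of_raw(observed):.0%} of the raw rate")
```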
I have a 3ware SATA 8500-4 controller with 4x WD 36GB Raptor SATA drives.
Wow. Just "wow". At least a factor of 2 better than what I've seen
before. I think a boatload of that is the 5ms seek time, gotta love
that.
I'm burning them in; if they work, we'll use those for bkbits.net.
If you guys need me to run any tests/kernels against this mix let me
know.
These are nice drives. I got 4 drives and the controller from newegg
for about $950 shipped including tax.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm