[WF-Infra] What next

Jack Cummings jack at needle.mudshark.org
Wed Jun 13 14:22:55 PDT 2001


On Wed, Jun 13, 2001 at 08:00:23PM -0000, anubis  wrote:

> Thanks to everyone who responded so well to the temporary loss of email.  
> It's encouraging to see how quickly you all came up with a workable 
> solution.

It was muchly appreciated. I feel like a bit of a meathead for letting
things fall apart so badly. But like anything, it wasn't just one thing. 
 
> I believe the first step was to get victor to provide a seamless backup mail 
> service in the event of an outage with the primary server.  Is this in 
> place?  If not, what do we need to do to get there?

To provide a seamless backup for the mailing lists may prove to be rather 
difficult, at least with our current infrastructure. 

Providing a backup mailserver, so that mail is not lost, is easy, and
should be done. 

> CVS is another critical service, which currently is running on one machine.  
> Should we set up a secondary cvs server that contains the full cvs tree and 
> history?  I know there are 100 copies of the latest or at least a recent 
> checkout, but that won't be enough to fully restore the service, if I recall 
> previous discussions correctly.

This is another tricky one. The best that can be done, I think, is a 
snapshot of the cvs root on a regular basis .. rsync would do a good job. 
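
A rough sketch of what such a snapshot job could look like, in Python. The
host name, paths, and function names here are made up for illustration;
nothing in this thread fixes them. It assumes rsync over ssh and that the
parent snapshot directory already exists on the backup host.

```python
import subprocess
from datetime import date


def build_rsync_command(cvsroot, backup_host, backup_dir, day=None):
    """Return the rsync argv that mirrors cvsroot into a dated directory.

    All arguments are hypothetical placeholders, e.g.
    cvsroot="/home/cvs", backup_host="backup.example.org".
    """
    day = day or date.today()
    # Trailing slash on the source: copy the contents of the cvs root,
    # not the directory itself.
    source = cvsroot.rstrip("/") + "/"
    # One directory per day, so older snapshots stay available.
    target = f"{backup_host}:{backup_dir.rstrip('/')}/{day.isoformat()}/"
    # -a keeps permissions and timestamps; --delete keeps a re-run on the
    # same day an exact mirror; -e ssh sends the copy over ssh.
    return ["rsync", "-a", "--delete", "-e", "ssh", source, target]


def snapshot(cvsroot, backup_host, backup_dir):
    """Run the mirror; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(cvsroot, backup_host, backup_dir),
                   check=True)
```

Run from cron, something like snapshot("/home/cvs", "backup.example.org",
"/srv/cvs-snapshots") would leave one full copy per day. A copy taken in the
middle of a commit may be inconsistent, so this is a safety net rather than a
guaranteed clean restore point.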
 
> We already have a couple of irc servers.  Is that now a sufficiently 
> redundant service that we can consider it 'done'?

If we lose irc.us.worldforge.org, we lose services. That doesn't kill
the usability of irc, but it does kill some of the features. 

We could always use some more irc servers, and some more irc ops. 

I will be bringing another irc server online when 'ice' comes up, see below. 
 
> DNS is another service that we need to look at.  In some ways it's the most 
> fundamental to infrastructure.  I have more concerns about malicious efforts 
> with dns than other services (someone could point all our names to a porn 
> site, or oracle, or whatever), but I don't think that that concern absolves 
> us of a need to make sure the system has some flexibility.  Up till now, 
> we've relied on jack to handle all the changes.  We've been lucky that he 
> hasn't been on vacation at an inopportune time, and has always worked quickly 
> to help us resolve whatever troubles we've had.  I feel like we are tempting 
> fate too much to assume that will always be the case, though.  Shit happens, 
> and real life can force plans to change no matter how well intentioned.

Agreed. Here is what is happening with DNS. 

1) I'm changing registrars. The InterNIC sucks. 
2) I'm setting up a host 'ice' to be colocated. It will be
    Primary DNS ( authoritative ) 
    Primary mail/news.
    Primary search engine. 
   The rest of the parts should be here by the end of the week, and it should
   be online by the end of the month. ($PANTHEON willing) 
3) I have another host 'impulse' that will serve as authoritative secondary. 

> So I propose that we come up with some method of distributing dns control.  
> I hope that you will have some ideas on how to do this.  We can't be the 
> first project that has similar needs.

> Here are some of the issues as I see it.  Please add your concerns and 
> suggestions as well:

> 1) We need to have one authoritative source of dns information at all times.
> 2) We need to be able to transfer the authority in a relatively short period 
> of time (24-48 hrs?) for whatever reason.
> 3) We need to ensure that authority cannot be hijacked.

Here is a proposal:

1) Authority for worldforge.{org,com,net} should be shared between 
   bryce, jack, and perhaps someone else. 

2) worldforge.* should be split up into separate zone files. 
  ( toplevel worldforge.org, *irc.worldforge.org.
   zones, *www.worldforge.org, *mail.worldforge.org, &c. )

3) The zone files should be in cvs, with different access controls. 

4) Upon checkin of a zonefile, it should be linted, and then installed. 

This should be a fairly robust system. A certain number of 'trusted' users
should have access to do things manually, of course. 
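
To make step 4 concrete, here is a minimal sketch of "lint, then install" in
Python. It assumes the nameserver runs BIND 9, whose named-checkzone and rndc
tools are used below; the thread does not say which server software is in
use, so treat that as an assumption. The function names and directory layout
are hypothetical, and hooking it into a cvs checkin is not shown.

```python
import shutil
import subprocess
from pathlib import Path


def lint_command(zone, zone_file):
    """argv for BIND 9's zone checker: named-checkzone <zonename> <file>."""
    return ["named-checkzone", zone, str(zone_file)]


def install_zone(zone, zone_file, install_dir, run=subprocess.run):
    """Lint zone_file; only if the lint passes, install it and reload.

    `run` is injectable so the logic can be exercised without BIND
    installed. Returns True only if the lint and the reload both succeed.
    """
    result = run(lint_command(zone, zone_file))
    if result.returncode != 0:
        # Lint failed: leave the currently installed copy untouched.
        return False
    # Copy the checked file into the (hypothetical) directory named reads.
    shutil.copy(zone_file, Path(install_dir) / Path(zone_file).name)
    # Ask the running server to reload just this zone.
    return run(["rndc", "reload", zone]).returncode == 0
```

The point of the ordering is that a broken checkin never replaces a working
zone file; the 'trusted' users can still copy things into place by hand.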

> I think we might also want to reconsider using granitecanyon for primary dns 
> service.  It has occasionally taken many days for records to be updated, and 
> we can do better.

Indeed, they are going to get the boot when 'ice' comes online. 
 
--Jack 

--
John ( Jack ) Cummings                     http://mudshark.org/jack 
Key fingerprint: 945A 89EB CA2D 6D93 7F15  5C97 1028 194E 5E7A 62DD
Now playing: Tool - sober


