Coral is down

Ed Myers edmyers at stanford.edu
Fri Jul 17 16:50:06 PDT 2009


John and Bill,

I tried to contact Pat Burke, but he was also on vacation.  Somehow 
Joe Little found out about the problem and had already begun 
restarting the computers when I found him.  I watched Joe cycle 
through the KVM switch to verify each computer was coming up.  The 
one stumbling point was the sequence of the restart.  Once Joe 
indicated everything was online, I tried to log on to the sunrays 
and remote Coral and tested the email system, and none of them were 
online.  This is when we found that the restart sequence had a 
problem.  I think Joe lucked out when he put a couple of the systems 
back into restart.  After a couple of system restarts, we were back online.
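
For future reference, the ordering problem can be written down as a small dependency map. This is only a rough sketch based on John's note quoted below, which says flare and SNF should come up after the machine that holds the Coral server, database, and shared file system. The name "coral-server" is a placeholder I made up for that machine, the dependency map itself is an assumption, and the script only prints an order; it does not restart anything.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed dependency map: each host lists the hosts that must be up first.
# "flare" and "snf" are named in the thread; "coral-server" is a
# hypothetical placeholder for the Coral/database/file-system machine.
DEPENDS_ON = {
    "coral-server": set(),
    "snf": {"coral-server"},
    "flare": {"coral-server"},
}


def restart_order(depends_on):
    """Return hosts in an order where every host follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, host in enumerate(restart_order(DEPENDS_ON), start=1):
        print(f"{step}. bring up {host} and wait for its login screen")
```

Keeping the order in one place like this would make it easy to print and tape next to the KVM; whether SNF or flare goes second does not appear to matter, as long as both follow the first machine.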

I heard about the problem just before noon and we were back just 
before 2pm.  I had the maintenance staff in the fab enabling the 
tools with the black box, so I expect we will have some funny billing 
for today.

The reason we went down was a maintenance event.  The plumbing shop 
was working on the leak in the air conditioning units.  They plugged 
their vacuum cleaner into a power strip and took out a circuit.  This 
shut down all the systems.

Ed


At 03:46 PM 7/17/2009, Bill Murray wrote:
>Ed,
>
>I just got back from Santa Rosa.  Everything looks good with Coral.
>However, there did appear to be a huge number of sun ray sessions 
>running on flare so I just restarted the sunray server software there.
>
>Bill
>
>John D Shott wrote:
>>Ed:
>>
>>While I've got no ability to kick start anything from here, here is 
>>what I would suggest:
>>
>>1. Go up to our aisle (closest to the air conditioner).
>>
>>2. In about the center of the row, we have one monitor, and a KVM 
>>switch above it.
>>By pushing on the switch on the front of the black KVM you should 
>>be able to move from machine to machine and see which have a login window.
>>
>>It's a 16 channel KVM and there are 3 things that need to be up.
>>
>>Here's what I'd do ....On the second row from the top, the 
>>rightmost 2 machines are critical ... the third machine from the 
>>left is the one that is both the Coral server and database and also 
>>the file system for the other machines.  If you move the switch to 
>>that machine and shake the mouse, does it show a login 
>>screen? If so, it should have running database servers and running 
>>Coral servers.
>>
>>Then move the switch so that it is on the third row from the top on 
>>the rightmost machine.  That is SNF.  Shake the mouse, does it have 
>>a login screen?  Even if it does, power down and power back up that 
>>machine.  It is the Dell that is at about waist height in the rack 
>>to the left of the KVM.  I think that the power switch may be on the back.
>>
>>Finally, move back to the second row from the top on the rightmost 
>>machine.  That is "flare", the machine that runs the sunrays.  Does 
>>it have a login window?  Even if it does, we are going to reboot 
>>it.  It is (I think) the topmost of the 3 machines on the row to 
>>the left of the KVM at nearly eye level.  To power cycle this, you 
>>need a pencil or ballpoint pen.  On the left of the machine is a 
>>little white on-off button. Hold it in for about 2 seconds to (I 
>>suspect) power down the machine.  Then wait for 30 seconds.  Then 
>>push it in for a second or so until you hear the high-speed fans come on.
>>
>>Hopefully, this will bring up flare and SNF after the first machine 
>>that you checked is fully operational .... and things should, 
>>hopefully, be good to go.
>>
>>Of course, if Bill is in the area, he'll do a better job of 
>>resolving things than my explanations.
>>
>>I'll try to check back later tonight .... but, by then, I suspect 
>>things will be resolved.
>>
>>Talk to you later,
>>
>>John
>>
>>----- Original Message -----
>>From: "Ed Myers" <edmyers at stanford.edu>
>>To: coral at snf.stanford.edu
>>Sent: Friday, July 17, 2009 12:58:05 PM GMT -06:00 Central America
>>Subject: Coral is down
>>
>>John and Bill,
>>
>>It's noon on Friday and both the sunrays and remote Coral are not working.
>>
>>Ed
>>
>>




