Please read this announcement in full. Reading from the bottom up will give you the correct timeline of events. The offer to move any user to the cloud is at the bottom, and we can action that next week for anyone who prefers to move to a shared server on our cloud platform.
Sunday 12 February 2012 - 9.49pm
The server is still on the workbench in the data centre following the upgrade. The server powered off - the technicians are unsure why this happened and have told me the workbench is not the most stable place. The logs show no reason for the server to have powered off, so we assume it was human error. We have scheduled a move back to the rack at 6am UK time tomorrow morning, which is a more stable environment.
Sunday 12 February 2012 - 12.32am
The server is back online and has been for 20 minutes. There was a networking issue where the new server was identifying an incorrect network interface - eth1 instead of eth0. The issue has now been corrected.
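For anyone curious how a mix-up like eth1 vs eth0 can be spotted, here is a minimal sketch - our illustration, not the data centre's actual tooling - that lists interface names by parsing the output of `ip -o link show`, so a post-maintenance check can confirm which interfaces the new server actually sees:

```python
# Illustrative sketch: parse `ip -o link show` output to list the network
# interfaces a server reports, e.g. to confirm eth0 is present after a
# hardware swap. The parsing function and sample output are our own example.
def interfaces(ip_o_link_output):
    """Return interface names from `ip -o link show` output lines."""
    names = []
    for line in ip_o_link_output.strip().splitlines():
        # each line looks like: "2: eth0: <BROADCAST,...> mtu 1500 ..."
        parts = line.split(": ", 2)
        if len(parts) >= 2:
            names.append(parts[1].split("@")[0])  # drop any "@parent" suffix
    return names

sample = ("1: lo: <LOOPBACK,UP> mtu 65536\n"
          "2: eth0: <BROADCAST,UP> mtu 1500\n"
          "3: eth1: <BROADCAST> mtu 1500")
print(interfaces(sample))  # ['lo', 'eth0', 'eth1']
```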
Saturday 11 February 2012 - 11.46pm
The server is not coming back online. We have escalated this to the data centre technicians with the highest priority.
Saturday 11 February 2012 - 11.32pm
The server is coming back online now. It may take a few minutes for all services to start, and there will be elevated load initially while this happens.
Saturday 11 February 2012 - 10.49pm
fsck is at 79%. We believe we know the issue: when the new file system was created, a corrupted journal was somehow created with it. This will be corrected through the fsck.
Saturday 11 February 2012 - 10.43pm
The server's RAID array went read-only. It is all brand new hardware (12 hours old), so we are as frustrated by this as you are. The server is being rebooted and an fsck file system check is currently running (this is essential on a Linux system for this kind of problem and there is no avoiding it). This should complete within the hour, and we will then check to see what caused it.
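For context on how a read-only array shows up on a Linux box: the kernel remounts the filesystem read-only and records "ro" in its mount options, which monitoring can watch for. A minimal sketch of that kind of check - our illustration, with made-up sample data, not the host's actual monitoring:

```python
# Hedged sketch: flag filesystems that have gone read-only by parsing
# /proc/mounts-style text. Field 1 is the mount point, field 3 the
# comma-separated mount options. Sample input is invented for illustration.
def read_only_mounts(proc_mounts_text):
    """Return mount points whose options include the 'ro' flag."""
    hits = []
    for line in proc_mounts_text.strip().splitlines():
        fields = line.split()
        if len(fields) >= 4 and "ro" in fields[3].split(","):
            hits.append(fields[1])  # the mount point
    return hits

sample = ("/dev/sda1 / ext4 ro,relatime 0 0\n"
          "proc /proc proc rw,nosuid 0 0\n")
print(read_only_mounts(sample))  # ['/']
```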
Saturday 11 February 2012 - 9.24am
The data copy to the new server is progressing and should be completed in a few more hours. Thanks for your extended patience.
Thursday 9 February 2012 - 6.42pm
The upgrade will happen overnight on Friday 10 February, and this will mean server 12 is offline for a period of time. We apologise in advance for this outage.
Thursday 9 February 2012 - 6.27pm
The upgrade did not happen as we discovered the RAID card in the old server 12 is not compatible. We need to replace the old 3ware card with a new Adaptec card. This will mean new drives, but also a 7-hour outage while we physically copy the data from the old RAID array to the new one. We are having internal discussions about this right now, but it looks as if we will go ahead and do it overnight, as we cannot let this situation continue with server 12 unstable. We fully appreciate this will cause some frustration, but please remember this server has had 18 months of virtually continuous uptime.
Thanks for being so understanding on this so far we really do appreciate it.
Please refresh this page for the latest update on when this will happen.
Thursday 9 February 2012 - 5.10pm
This is to let you know server 12 is being replaced with a brand new server: a new Xeon processor (with better benchmark performance) replacing the older Xeon currently in use, faster RAM, a new motherboard, etc. We are keeping the old RAID 1 array to prevent a massive period of downtime; we will monitor the I/O on the array and can decide at a later date whether it needs to be changed to RAID 5 or RAID 10. There will be approximately 30 minutes of downtime while the server maintenance is done, and we are doing this shortly.
Thanks for your extended patience.
Wednesday 8 February 2012 - 5.32pm
This is an update on the ongoing issues with server 12. The server has been pretty stable; it is still off the rack and being monitored. Approximately ten minutes ago the server load spiked again. The technician on duty was able to remove the network cable, lowering the load enough to console into the server. He is now checking for the cause of these random load spikes.
We will update this announcement once we have more information.
Tuesday 7 February 2012 - 8.00am
Good Morning
This is a post about server 12 in our USA partner data centre. Server 12, whilst one of our older shared servers, has been rock solid for 18 months. Yesterday the server went unresponsive five times and required five reboots. Each time this happened the server load went sky high. We are closely monitoring the server to try to determine the cause of this issue. At this stage we are pretty confident it is not hardware related.
Overnight the server was removed from the rack and is now on the technician's workbench in the data centre, where staff are monitoring it closely - this explains the five-minute outage overnight while the server was moved from the rack onto the workbench. The latest report from the technician on the night shift came in two hours ago and is pasted below.
=================
Everything looks good on your server except the north bridge seems to be getting very hot, hotter than I can touch. We have the server on our workbench and we left the lid off.
=================
We are continuing to monitor the server. We did find one user who, over the past 24 hours, has been using more CPU than our Terms of Service permit for their website, and this is a relatively new user on this server. Whilst their usage would technically merit a suspension under our Terms of Service, we want to be sure it is this site causing the issues before suspending what is essentially a new client. That said, if removing one site will ensure continued server stability for everyone else, we will not hesitate to do it.
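For the curious, a check like the one above can be as simple as totalling CPU usage per account. This is a minimal sketch of the idea - the function name, sample output, and the 25% threshold are all invented for illustration and are not our actual Terms of Service limit or tooling:

```python
# Illustrative sketch: aggregate CPU usage per account by parsing
# `ps -eo user,pcpu`-style output, the kind of check used to spot one
# site exceeding a shared-server CPU allowance. Threshold is a made-up
# example value.
def cpu_by_user(ps_output, threshold=25.0):
    """Return {user: total %CPU} for users above the threshold."""
    totals = {}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        user, pcpu = line.split()
        # round at each step so the totals stay tidy for display
        totals[user] = round(totals.get(user, 0.0) + float(pcpu), 1)
    return {u: t for u, t in totals.items() if t > threshold}

sample = "USER  %CPU\nweb01 18.5\nweb01 14.2\nweb02  3.1\n"
print(cpu_by_user(sample))  # {'web01': 32.7}
```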
We are also looking at upgrading this server and are discussing possible options. The quickest option would be to place the RAID array of server 12 into a new chassis with better processors and more RAM. Ideally, if we upgrade, we would prefer to have the benefit of RAID 5 or RAID 10 on this server, as presently this is an older server with RAID 1. The implication of changing RAID level would be a number of hours of downtime, which we want to avoid if possible; simply placing the disks in a new chassis would prevent many hours of downtime. We will make a decision on that within the next 24-48 hours.
UK Cloud
We are presently building a new server on our Cloud in the UK, and if any user on server 12 would like a free move today to a shared cloud server in the UK, just open a support ticket and we can arrange a move at a time to suit you. The cloud has the benefit that if we need to increase CPU power or RAM quickly, we can do so without any downtime. We do anticipate a few users from server 12 moving to the cloud, and we will move one account at a time, so if you do put in a ticket to have that done please be patient. I will handle these transfers personally today from our Belfast office.
If you have any concerns please contact me personally. My work email is sk@bwf.co
Regards
Stephen K
BWF Hosting