VibroAxe
Junior Administrator
This is a long post, so I'm going to put the tl;dr; at the top. If you are interested in the latest series of problems behind the scenes then I recommend reading on.
tl;dr;
stuff moved;
stuff broke;
stuff apparently got fixed;
stuff didn't;
stuff got switched off;
stuff got fixed;
stuff is being stress tested;
stuff will be back online new and shiny asap;
So, I'm writing this post after the recent troubles with TeamSpeak, Minecraft and the forum software to try to give people a bit of a view of what goes on behind the scenes at THN to keep the wheels turning (or in this case, to fix the wheels and get them turning again).
Our story begins back in summer last year. To save money, the THN admins and community made the decision to take our workhorse server out of dedicated hosting and use a virtual server, "destiny", instead. At this point it was also realised that Promethius could be hosted by members at home, giving us better gaming capacity whilst still saving on costs. Having the highest upload of the THN volunteers, I drove off to pick up prom from her datacentre near Reading and bring her back to (not so) sunny Stevenage.
Performing open heart surgery on Promethius revealed that the northbridge fan had been broken for some time. As this wasn't an immediate issue (everything had worked in the datacentre), I made the decision to connect prom to the network and worry about the fan if and when a problem occurred. It took a little while to get prom up and running again, but between Haven, Wol and me we managed to get the Minecraft server happily hosted on Promethius. (A big thanks to luc once again for stopping the gap here.)
Some time later the decision was made to move TeamSpeak onto Promethius. The TeamSpeak instance on destiny was continually falling over (much as it is at the moment) and a move back to Promethius seemed logical. In addition, to optimise Minecraft backups we deployed Flashcache onto the server (technical: flashcache creates a ram cache in between the system and your target hard drive). During this transition there were some issues with DNS caching (some of you may remember), with people's clients still trying to connect to destiny rather than Promethius. Because they were hard to reproduce, these were tricky to track down, but they did all eventually resolve once the DNS caches expired.
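For anyone curious what a stale-DNS check can look like in practice, here is a minimal Python sketch (an illustration, not what we actually ran). It asks the machine's own resolver, cache and all, where a hostname currently points and compares that with the address you expect. The function names are made up for this example, and the hostname and addresses in the usage note below are placeholders, not real THN records.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver
    currently gives for hostname (so cached answers are included)."""
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


def points_at(hostname, expected_ip, resolver=socket.getaddrinfo):
    """True if hostname currently resolves to expected_ip on this machine."""
    return expected_ip in resolve_ipv4(hostname, resolver)
```

Calling something like `points_at("ts.example.org", "192.0.2.10")` on an affected client would return False for as long as that client's cache still holds the old address, and flip to True once the cached record expires. The `resolver` argument is only there so the lookup can be swapped out for testing.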
Several software updates later, fast forward to just before Christmas, and Promethius started experiencing random crashes under high load. These were completely taking out the server, including TeamSpeak and Minecraft. Unable to resolve the crashes at first, we took the decision to move TeamSpeak back to destiny temporarily and then, shortly after, to take the Minecraft server completely offline as well. It was decided that perhaps the northbridge fan could be at fault, and two weeks ago a replacement was obtained and fitted to Promethius, along with a shiny new SSD for certain services.
It's taken a long time to get our confidence in Promethius back, and neither Haven nor I want to put her back into active service until we are entirely sure that she is stable again. We have spent the last week running stress tests and have found several issues in the config, which we have resolved or removed (I'm looking at you, flashcache!).
So where do we stand now?
I've been running a Minecraft map render stress test on prom for the last 3 days and am pleased to report that, after a few false starts, prom now seems to be returning to stable. The SSD is functioning appropriately and, having run at 200% CPU utilisation since 7am this morning, we appear to have solved the memory and overheating problems we had before!
Where do we go from here?
MINECRAFT
There are a few final config options to set up on Minecraft before I fully declare the doors open, but I'm not closing the server. Use it at your own risk: we should be OK, but don't come crying if we break anything in the next few days.
TEAM FORTRESS 2
In addition to getting Minecraft back up and running, wol and I are working on getting THN Purple back up and running. There's been a pretty consistent TF2 contingent at THN in the last few weeks and we figured we should have our own server back to work with, so if you want to play, try searching for THN in the server browser!
TEAMSPEAK 3
Yeah, we know, TS3 on destiny is broken...
We are doing something about it though! We know that the current TS3 install has an issue with permissions, so we're starting from scratch. At the same time, we don't want to repeat the mistakes we made last time, so it's not going to be an instant fix. As of today (tonight?) haven and I are cleaning out the old TS3 install on Promethius and will be letting thatbloke and luc start configuring the new TS3 permission structure. We've had a few thoughts on this in the past as a community, but I'm sure bloke and luc will do a good job. Give 'em time and let 'em get on with it. I will also say now that you shouldn't expect to end up with server admin just because you had it last time. We should end up with a permissions system that gives you what you need, without giving you all full control!
vBulletin & Shoutbox
Also, yeah, we know, blank posts
I'm going to stay relatively quiet on this one; suffice to say that the problems are caused by us upgrading PHP and Apache. vBulletin 3 is quite an old piece of software now and doesn't play ball with the newer versions of our caching and optimisation stuff. We have something in the pipe and it should happen relatively soon but, as I said, I'll let someone else announce that unless I hear otherwise!
The main aim of this post is to let you know that we are taking action on the current spate of problems. We are definitely concerned about the current state of THN services and are working behind the scenes as quickly as possible to resolve it. A huge amount of work has gone in recently from Haven, Wol, Ronin and me, trying to keep the wheels on and build a replacement cart at the same time. Please bear with us, as we all have jobs and personal lives. Keep reporting stuff when it's broken and we will get it ironed out as soon as we can!
Yours in gaming
VibroAxe