Service status updates

Subscribe via RSS | Twitter: @SRCFstatus

See also our Nagios automated test output.

Urgent maintenance midnight tonight

Tonight, soon after midnight, there will be a short disruption to most SRCF services as we will be installing an urgent software patch on our servers.

This will involve a reboot, so please make sure that you save your work and exit any applications you may have running either on the shell service (pip) or on the desktop service (cyclone).

Apologies for the short notice; this patch is important for the continued safety and security of the facility.

Server maintenance tonight

Tonight we will be replacing and upgrading hardware in the SRCF virtualised cluster.  This means that, once again, there will be an interruption to the following services:

  • IRC
  • Usenet news
  • Desktop

These services will be shut down for anything from half an hour to multiple hours each whilst their data is copied onto the new platform.

Arrangements have been made to maintain a basic IRC service during this maintenance; those of you wanting to remain on IRC may connect your IRC clients to drought.srcf.net.  Make sure to change them back to irc.srcf.net tomorrow after the outage.

The outage will not affect the IRC server on zeus, but that will be netsplit from drought.  If you don’t know what zeus is, ignore this and just use drought.srcf.net.

Server reboots, 23:00 today

As per Andrew’s earlier announcement, we will be rebooting a few servers this evening from 23:00 in order to install security patches.

This will affect:

  • IRC service (flood)
  • Usenet news service (flame)
  • Desktop service (cyclone)

Connections to these services will fail for a few minutes each.  Anything left running on the Desktop service will be terminated.

This maintenance will not affect the shell or web services (pip).

Electrical maintenance work completed without downtime

Although our electrical supply was interrupted for a brief period today as expected, we survived on battery power and the maintenance was completed without any downtime for our services.

Possible disruption due to electrical work, 2013-02-23

On Saturday the building which houses the SRCF servers will be affected by electrical work.  This work will start at 07:30 and continue throughout the day, and will involve short interruptions to the SRCF’s power supply.

The SRCF has UPS (battery backup) systems to power the servers during short electrical outages, and an additional UPS is arriving today to increase capacity and resilience.  However, it is still possible that the power disruption(s) will last longer than the batteries, in which case we may have to shut down some or all servers at short notice.

Apologies for any inconvenience caused.

Linux distribution upgrades, 2013-01-10

This Thursday we will be upgrading the Linux distribution on the SRCF servers in order to provide a wide variety of new features and bug fixes.

This means that during the afternoon and evening of 10th January there will be intermittent disruption to some or all of our services, including the websites we host.  We will try to keep this to a minimum, but some brief service outages are inevitable.

Furthermore, if you maintain a website on the SRCF, there is a slight chance that the upgrade will change the behaviour of system components that your site relies upon (for example, MySQL will be upgraded from version 5.1.66 to 5.5.28; PHP from 5.3.2 to 5.3.10; PostgreSQL from 8.4.13 to 9.1; Python 2 from 2.6.5 to 2.7.3; Perl from 5.10.1 to 5.14.2).  You should check on Friday that your website still functions correctly; get in touch if you need any assistance.
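If you want a rough idea of which of these upgrades is most likely to affect your site, the sketch below compares the version numbers quoted above.  It is only an illustration: the function names are made up for this example, and the "major/minor/patch" labels are an assumption about how to read the version numbers, since not all of these projects follow the same versioning scheme.

```python
# Versions copied from the announcement above: (before, after).
UPGRADES = {
    "MySQL": ("5.1.66", "5.5.28"),
    "PHP": ("5.3.2", "5.3.10"),
    "PostgreSQL": ("8.4.13", "9.1"),
    "Python 2": ("2.6.5", "2.7.3"),
    "Perl": ("5.10.1", "5.14.2"),
}


def parse_version(text):
    """Turn '5.3.10' into (5, 3, 10) so that versions compare numerically.

    Comparing the raw strings would be wrong: '5.3.10' < '5.3.2' as text.
    """
    return tuple(int(part) for part in text.split("."))


def change_level(old, new):
    """Label an upgrade by the first version component that differs.

    Hypothetical helper: the labels are a rule of thumb, not a guarantee
    of how much behaviour actually changed.
    """
    old_v, new_v = parse_version(old), parse_version(new)
    if old_v[0] != new_v[0]:
        return "major"
    if old_v[1] != new_v[1]:
        return "minor"
    return "patch"


if __name__ == "__main__":
    for name, (old, new) in UPGRADES.items():
        print(f"{name}: {old} -> {new} ({change_level(old, new)})")
```

By this rule of thumb the PostgreSQL upgrade is the largest jump and the PHP upgrade the smallest, but checking your own site on Friday remains the only reliable test.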

Please accept our apologies for any inconvenience caused and for the short notice; we felt it best to complete this work before the start of term.

If you have any questions please get in touch.

Hardware upgrade for pip, 2012-11-10 16:00

We will be installing hardware upgrades (additional memory) in the main SRCF server (pip) tomorrow afternoon, starting at 16:00, in order to improve system responsiveness and performance.

Most SRCF services, including websites hosted on the SRCF, will be unavailable for up to about an hour (probably much less if all goes well).

Apologies for the inconvenience and the short notice; following a couple of recent spells of heavy load we have decided to perform this upgrade as soon as possible.

Unplanned reboot of pip, 2012-08-14 00:19

We had to reboot pip (our main server) just now as it had become unresponsive.  This is usually caused by a user’s runaway process using up all the memory; sysadmins are investigating the exact cause of this incident.

Server reboots, 2012-08-11

All SRCF services will be unavailable for a short period during the scheduled maintenance window tonight (2am-3am) so that we can install important system security updates.

Any applications left running on the shell or desktop services will be terminated.  Please save your work and log out before 2am.

Reboot, 2012-07-01 04:02

The leap second caused several further problems from 1am, with both the Linux kernel and several users’ processes sat in tight loops hours later unable to cope with a recent minute having contained 61 seconds.  I resolved this with a further emergency reboot.

Fingers crossed that either we as a species learn to test software properly and not make assumptions about the length of a minute, or that leap seconds are abolished before one is next deemed necessary.
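To illustrate what "not making assumptions about the length of a minute" might look like, here is a minimal sketch.  It is not the fix applied to the kernel or to anyone's processes; the table and function names are invented for this example, and the table lists only the leap second in question (inserted at the end of 2012-06-30 UTC, i.e. just before 1am local time here).

```python
from datetime import datetime

# UTC minutes that ended with an inserted leap second.
# Illustrative table only: a real program would need the full,
# regularly updated list rather than this single entry.
LEAP_MINUTES = {datetime(2012, 6, 30, 23, 59)}


def seconds_in_minute(minute_start):
    """Return how many seconds the given UTC minute contained."""
    minute_start = minute_start.replace(second=0, microsecond=0)
    return 61 if minute_start in LEAP_MINUTES else 60


def is_valid_utc_time(year, month, day, hour, minute, second):
    """Check a seconds value against the real length of that minute,
    instead of hard-coding the range 0..59."""
    limit = seconds_in_minute(datetime(year, month, day, hour, minute))
    return 0 <= second < limit
```

With this, 23:59:60 on 2012-06-30 is accepted as a real moment in time, while a seconds value of 60 in any other minute is still rejected.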
