[Linux-aus] Recent Linux Australia website issues - VPAC Closure

Linux Australia Secretary secretary at linux.org.au
Thu Apr 28 23:13:48 AEST 2016


Dear Linux Australia Colleagues,

In line with our values of openness and transparency, we'd like to communicate
recent developments regarding the VPAC/V3 Alliance closure [^1]. The closure
has affected Linux Australia’s infrastructure, and we'd like to describe the
impacts, what we're doing about them, and what you might observe.

### Background
In November 2015, Linux Australia was made aware of the announcement of the
closure of the Victorian-based eResearch co-operative 'V3 Alliance'. Two of
Linux Australia's servers at the time were hosted by the Victorian Partnership
for Advanced Computing (VPAC), a member of the co-operative. Linux Australia
contacted VPAC, who indicated that funding existed for VPAC to continue to
operate through to 30 June 2016.

Following discussions between the Linux Australia Council and the Admin team,
the Admin team was authorised to locate new hosting for these servers as a
matter of priority. Discussions were started with a department at a Victorian
university to seek hosting space and support as a matter of urgency.

In early December, contacts at VPAC advised Linux Australia that, as the
majority of staff had moved on to other organisations, VPAC would now be
auspiced by the University that had hosted their offices and equipment rooms,
effective 1 January 2016.

As a result, the pace of negotiations increased; however, the negotiations were
subsequently postponed due to the impending Christmas and New Year holiday
periods. The Admin team ensured the servers located at VPAC were clearly marked
with identifying information, and backups of the devices continued. New
equipment was purchased to facilitate the rapid deployment of new hardware once
a hosting agreement could be negotiated.

In March 2016, the conversations regarding hosting were restarted, and planning
began for a staged and managed migration away from the old hardware to the new
server hardware when the equipment had a new home.

#### April 19 2016
At 3pm, the Admin team received notifications from the monitoring system that
the two servers at VPAC, and the virtual machines that run on them, were no
longer reachable. As the VPAC website was also offline, the Admin team advised
Council that it was highly likely the servers had been powered off. Through
conversations with previous staff of VPAC, it was confirmed the site was in a
decommissioning process, and the number for the contractor decommissioning the
room was obtained. The contractor was notified of the equipment in the room,
and negotiations began for its recovery. The contractor worked with the project
manager from the hosting institution, and an agreement was reached to permit
the servers to be brought back online for 2 hours the following morning. It was
immediately identified that, with the loss of these servers, the functional
email accounts, such as hostmaster, council and domain admin, were offline
until service could be restored.

The Admin team subsequently undertook the following steps:

- Council was updated on the status of the servers.
- Backups of the servers hosted at VPAC were analysed, and a number of files
were found to be missing from backups due to permissions on the host server.
- Several of the files found to be missing were ones required to update the
root DNS glue for the linux.org.au and linux.conf.au domains. Replacement codes
were requested with the expectation these would be delivered the next morning
when the servers came online.
- Plans were made to change the delegation of the two domains; however, due to
the nature of change requests on the linux.conf.au domain, the Admin team
reached out to Jo Lim, our contact at AUDA, to determine the best alternate
method to change the delegation should it be needed.
- New VMs were prepared on LA servers in Canberra to take over the roles
previously provided by the servers in Melbourne.
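
The backup analysis step above, finding files that exist on a host but never
made it into its backup, can be illustrated with a short Python sketch. This is
not the Admin team's actual tooling; the function names and the idea of
comparing two directory trees directly are assumptions made for illustration
only.

```python
import os


def relative_files(root):
    """Collect the paths of all regular files under root, relative to root."""
    found = set()
    # os.walk silently skips directories it cannot read, so a backup job
    # running without sufficient permissions can omit files without erroring.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def missing_from_backup(source_root, backup_root):
    """Return sorted relative paths present in the source but not the backup."""
    return sorted(relative_files(source_root) - relative_files(backup_root))
```

Running a comparison like this regularly, with enough privilege to read the
whole source tree, would surface permission-related gaps before the backup is
needed.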

#### April 20 2016
At 8.30am, the Admin team were notified the servers were back online, and
backup jobs were started to recover the missing files. Once these were
complete, entire directory trees on both servers were compressed and copied to
a remote server. While these jobs were running, a DNS server was stood up and
configured with the zone files from the previous DNS server hosted in Victoria.
The codes required for DNS glue changes were also recovered at this time.

The last files were recovered from the servers at 11.05am, and the equipment
was powered down by the contractors and removed from the racks. Backups of the
Linux Australia web, DNS and mail systems were prioritised over other sites,
which meant that the LCA2010 and LCA2012 sites were not backed up from their
host servers before power was removed.

Around 2pm, the replacement DNS server was ready, and root glue for the DNS
server 'russell.linux.org.au' was changed with the registrar. Once this was
live, the secondary DNS was updated around 3.30pm to pull from the new server,
and DNS service was restored for the linux.org.au and linux.conf.au domains.

Whilst plans had been well underway to migrate the @linux.org.au email to a new
system, the Admin team proposed, and Council wholeheartedly agreed, that the
risk of migrating at this point was too high, and that the legacy configuration
would be stood up on a new VM.

The Admin team focussed their work on this service for the remainder of the
night.

#### April 21 2016
The legacy mail server was brought online at 10.30am, and the majority of
normal mail flow returned. The logs were monitored and issues were resolved as
they were identified. Around 6pm the Admin team started to restore the
community and LUG websites from backups; however, following discussion within
the team, the decision was made to postpone this work to give team members a
break, to prevent burnout or critical mistakes, and to let them deal with
personal matters that required their attention.

Linux Australia reached out to our friends at AARNet to recover the servers
from the Datacentre on our behalf at the same time they recovered their
equipment.

#### April 22 2016
Work re-commenced on the recovery of websites, and continues at this time. The
Linux Australia blog aggregation site “planet.linux.org.au” is managed by a
different team from the Admin team, and they will work on returning this
functionality over the coming week. AARNet notified the Admin team that the
servers have been recovered, and plans are underway to have the equipment
shipped to the Admin team so that a final backup can be taken.

### Current situation - April 28 2016
During Wednesday 20th April you may have observed the following symptoms:

- Slow or non-delivery of email to some Linux Australia mailing lists
- Slow or non-delivery of email to Linux Australia email addresses (e.g.
president at linux.org.au)
- Sporadic outages of Linux Australia web based properties such as
www.linux.org.au, www.linux.conf.au and MemberDB.

At the time of writing, the current status of infrastructure is as follows:

- DNS services have been restored onto new infrastructure.
- DNS secondary services, kindly provided by Andrew Pollock, have now been
updated and are retrieving records correctly.
- linux.org.au is resolving and online.
- linux.conf.au is resolving and online.
- Non-core sites, such as some Linux Australia hosted LUG sites, Linux
Australia Planet, radio and other hosted sites, are offline as their respective
backends are still in the process of being restored.

### How was this allowed to occur?
Whilst Linux Australia had taken all possible steps to ensure the equipment was
identified as belonging to the organisation, ownership was mistakenly
attributed to a Victorian Linux users group, who also had equipment hosted in
the room. The room was decommissioned 9 weeks ahead of the last shutdown date
given, which meant that migration works had not yet been completed.

### Was any data lost?
It is too early to determine this. Once the servers have been received by the
Admin team, a complete analysis of the servers will be undertaken and an update
given to Council.

### Was any personal data leaked?
Linux Australia believes the chain of ownership (VPAC to HPC Contractor to
AARNet to LA) has protected any personal information held on the servers.

### Why does this keep happening to Linux Australia?
Linux Australia has previously relied on the good-will of the community or
commercial organisations to host the server infrastructure. These agreements
are often non-binding, and, in some previous situations, have been revoked at
short notice following a change in business ownership. Starting in 2012,
hosting agreements were prepared for new servers that ensure Linux Australia
receives plenty of advance notice, and that outline a clear communication path
for any changes to the hosting situation. All new LA servers are clearly marked
with the organisational details, as well as at least 2 contact numbers should
an issue arise.

### What can I do to help LA?
Linux Australia servers are currently hosted at sites connected to the AARNet
network, as these sites are connected to a high speed white space network, and
transfers between these sites are considered to be unmetered. If you work at a
higher-education institution that is connected to the AARNet network, are open
to hosting several rack-units of equipment, and have IP space available, please
contact the Council.

### Planned future actions
The Admin Team, with Council's approval, is not going to migrate mail to the
new system immediately. Instead, a new server environment has been built to
host the legacy mail configuration, and the migration planning has been picked
up from where it was left.

Linux Australia has long pursued a strategy of controlling our own virtual
machines and the underlying hardware. This approach gives us the ability to
resolve hardware issues much faster than a hosting company. It also provides
greater flexibility for scheduling maintenance. Linux Australia has usually
partnered with Universities for hosting, given that transfers on the AARNet
network are unmetered, reducing the cost of data transfers. Our current needs
are for 2RU of rack space, a /28 block of IP space (14 usable addresses)
without filtering, and a large amount of data transfer.
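
For anyone assessing whether they have suitable IP space, the "14 usable
addresses" figure for a /28 can be checked with Python's standard `ipaddress`
module. This is a minimal sketch: the prefix 192.0.2.0/28 is a documentation
range used as a placeholder, not an actual Linux Australia allocation, and
`usable_hosts` is a name chosen here for illustration.

```python
import ipaddress


def usable_hosts(cidr):
    """Return the number of assignable host addresses in an IPv4 block."""
    network = ipaddress.ip_network(cidr)
    # hosts() leaves out the network and broadcast addresses of the block.
    return sum(1 for _ in network.hosts())


if __name__ == "__main__":
    block = ipaddress.ip_network("192.0.2.0/28")
    print(block.num_addresses, usable_hosts("192.0.2.0/28"))
```

A /28 leaves 4 host bits, giving 16 addresses in total; removing the network
and broadcast addresses leaves the 14 quoted above.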

The Admin Team is currently liaising with some Universities located in Victoria
to identify if it's possible for them to host the Linux Australia server
equipment. Council and Admin Team will review alternative options for hosting
in due course to ensure alignment with Linux Australia's ongoing needs in this
space. Any transition will be well planned.

### Feedback
As always, we warmly welcome your comments, queries and feedback.

[^1]:
https://web.archive.org/web/20160307103442/http://www.vpac.org/node/1318

Kind Regards,

Sae Ra

-- 

Sae Ra Germaine
Secretary
Linux Australia

secretary at linux.org.au
http://linux.org.au

Linux Australia Inc
GPO Box 4788
Sydney NSW 2001
Australia

ABN 56 987 117 479


