From secretary at linux.org.au Thu Apr 28 23:13:48 2016
From: secretary at linux.org.au (Linux Australia Secretary)
Date: Thu, 28 Apr 2016 23:13:48 +1000
Subject: [Announce] Recent Linux Australia website issues - VPAC Closure
Message-ID: <57220C8C.3000004@linux.org.au>

Dear Linux Australia Colleagues,

In line with our values of openness and transparency, we'd like to share developments regarding the recent VPAC/V3 Alliance closure [^1]. The closure has had some impact on Linux Australia's infrastructure, and we'd like to describe what the impacts are, what we're doing about them, and what you might observe.

### Background

In November 2015, Linux Australia was made aware of the announcement of the closure of the Victorian-based eResearch co-operative 'V3 Alliance'. Two of Linux Australia's servers at the time were hosted by the Victorian Partnership for Advanced Computing (VPAC), a member of the co-operative. Linux Australia contacted VPAC, who indicated that funding existed for VPAC to continue to operate through to 30 June 2016.

Following discussions between the Linux Australia Council and the Admin team, the Admin team was authorised to locate new hosting for these servers as a matter of priority. Discussions were started with a department within a university located within Victoria to seek hosting space and support as a matter of urgency.

In early December, contacts at VPAC advised Linux Australia that, as the majority of staff had moved on to other organisations, VPAC would now be auspiced by the University that had hosted their offices and equipment rooms, effective 1 January 2016. As a result, the pace of negotiations increased; however, they were subsequently postponed due to the impending Christmas and New Year holiday periods.

The Admin team ensured the servers located at VPAC were clearly marked with identifying information, and backups of the devices continued.
New equipment was purchased to facilitate the rapid deployment of new hardware once a hosting agreement could be negotiated.

In March 2016, the conversations regarding hosting were restarted, and planning began for a staged and managed migration away from the old hardware to the new server hardware once the equipment had a new home.

#### April 19 2016

At 3pm, the Admin team received notifications from the monitoring system that the two servers at VPAC, and the virtual machines that run on them, were no longer reachable. As the VPAC website was also offline, the Admin team advised Council that it was highly likely the servers had been powered off.

Through conversations with previous staff of VPAC, it was confirmed the site was in a decommissioning process, and the number for the contractor decommissioning the room was obtained. The contractor was notified of the equipment in the room, and negotiations began for its recovery. The contractor worked with the project manager from the hosting institution, and an agreement was reached to permit the servers to be brought back online for 2 hours the following morning.

With the loss of these servers, it was immediately identified that the functional email accounts, such as hostmaster, council and domain admin, would be offline until service could be restored.

The Admin team subsequently undertook the following steps:

- Council was updated on the status of the servers.
- Backups of the servers hosted at VPAC were analysed, and a number of files were found to be missing from backups due to permissions on the host server.
- Several of the files found to be missing were ones required to update the root DNS glue for the linux.org.au and linux.conf.au domains. Replacement codes were requested, with the expectation these would be delivered the next morning when the servers came online.
- Plans were made to change the delegation of the two domains; however, due to the nature of change requests on the linux.conf.au domain, the Admin team reached out to Jo Lim, our contact at AUDA, to determine the best alternate method to change the delegation should it be needed.
- New VMs were prepared on LA servers in Canberra to take over the roles previously provided by the servers in Melbourne.

#### April 20 2016

At 8.30am, the Admin team were notified the servers were back online, and backup jobs were started to recover the missing files. Once these were complete, entire directory trees on both servers were compressed and copied to a remote server. While these jobs were running, a DNS server was stood up and configured with the zone files from the previous DNS server hosted in Victoria. The codes required for DNS glue changes were also recovered at this time.

The last files were recovered from the servers at 11.05am, and the equipment was powered down by the contractors and removed from the racks. Backups of the Linux Australia web, DNS and mail systems were prioritised over other sites, which meant that the LCA2010 and LCA2012 sites were not backed up from their host servers before power was removed.

Around 2pm, the replacement DNS server was ready, and root glue for the DNS server 'russell.linux.org.au' was changed with the registrar. Once this was live, the secondary DNS was updated around 3.30pm to pull from the new server, and DNS was returned for the linux.org.au and linux.conf.au domains.

Whilst plans had been well underway to migrate the @linux.org.au email to a new system, the Admin team proposed, and Council wholeheartedly agreed, that the risk at this point was too high, and that the legacy configuration would be stood up on a new VM. The Admin team focussed their work on this service for the remainder of the night.

#### April 21 2016

The legacy mail server was brought online at 10.30am, and the majority of normal mail flow returned.
The logs were monitored and issues were resolved as they were identified.

Around 6pm the Admin team started to restore the community and LUG websites from backups. However, following discussion within the team, the decision was made to postpone this work to give team members a break, to prevent burnout or critical mistakes from occurring, and to let them deal with personal matters that required their attention.

Linux Australia reached out to our friends at AARNet to recover the servers from the datacentre on our behalf at the same time they recovered their equipment.

#### April 22 2016

Work recommenced on the recovery of websites, and continues at this time. The Linux Australia blog aggregation site 'planet.linux.org.au' is managed by a different team than the Admin team, and they will work on returning this functionality over the coming week.

AARNet notified the Admin team that the servers have been recovered, and plans are underway to have the equipment shipped to the Admin team so that a final backup can be taken.

### Current situation - April 28 2016

During Wednesday 20th April you may have observed the following symptoms:

- Slow or non-delivery of email to some Linux Australia mailing lists
- Slow or non-delivery of email to Linux Australia email addresses (e.g. president at linux.org.au)
- Sporadic outages of Linux Australia web-based properties such as www.linux.org.au, www.linux.conf.au and MemberDB.

At the time of writing, the current status of infrastructure is as follows:

- DNS services have been restored onto new infrastructure.
- DNS secondary services, kindly provided by Andrew Pollock, have now been updated and are retrieving records correctly.
- linux.org.au is resolving and online.
- linux.conf.au is resolving and online.
- Non-core sites such as some Linux Australia hosted LUG sites, Linux Australia Planet, radio and hosted sites etc are offline, as their respective backends are still in the process of being restored.

### How was this allowed to occur?
Whilst Linux Australia had undertaken all possible means to ensure the equipment was identified as belonging to the organisation, ownership was mistakenly attributed to a Victorian Linux users group, who also had equipment hosted in the room. The room was decommissioned 9 weeks ahead of the last shutdown date given, which meant that migration works had not yet been completed.

### Was any data lost?

It is too early to determine this. Once the servers have been received by the Admin team, a complete analysis of the servers will be undertaken and an update given to Council.

### Was any personal data leaked?

Linux Australia believes the chain of ownership (VPAC to HPC Contractor to AARNet to LA) has protected any personal information held on the servers.

### Why does this keep happening to Linux Australia?

Linux Australia has previously relied on the goodwill of the community or commercial organisations to host the server infrastructure. These agreements are often non-binding and, in some previous situations, have been revoked at short notice following a change in business ownership.

Starting in 2012, hosting agreements were prepared for new servers that ensure Linux Australia receives plenty of advance notice, and that also outline a clear communication path for any changes to the hosting situation. All new LA servers are clearly marked with the organisational details, as well as at least 2 contact numbers should an issue arise.

### What can I do to help LA?

Linux Australia servers are currently hosted at sites connected to the AARNet network, as these sites are connected onto a high-speed white space network, and transfers between these sites are considered to be unmetered.

If you work at a higher-education institution that is connected to the AARNet network, are open to hosting several rack-units of equipment, and have IP space available, please contact the Council.
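
For anyone checking whether they have enough IP space to offer: the requirement stated under 'Planned future actions' is a /28 block, which works out to 14 usable addresses. The following is a minimal Python sketch of that arithmetic only. The block 192.0.2.0/28 is taken from the reserved documentation range and is purely an assumption for illustration; it is not an address Linux Australia uses.

```python
import ipaddress

# Hypothetical example block from the TEST-NET-1 documentation range;
# it stands in for whatever /28 a hosting site might allocate.
block = ipaddress.ip_network("192.0.2.0/28")

# A /28 leaves 32 - 28 = 4 host bits, so 2 ** 4 = 16 addresses in total.
total = block.num_addresses

# hosts() skips the network and broadcast addresses, leaving the usable ones.
usable = list(block.hosts())

print(f"{block}: {total} addresses, {len(usable)} usable")
print(f"first usable: {usable[0]}, last usable: {usable[-1]}")
```

Substituting a real prefix for the example block gives the same counts for any /28.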
### Planned future actions

The Admin Team, with Council's approval, is not going to migrate mail to the new system immediately. Instead, a new server environment has been built to host the legacy mail configuration, and the migration planning has been picked up from where it was left.

Linux Australia has long pursued a strategy of controlling our own virtual machines and the underlying hardware. This approach gives us the ability to resolve hardware issues much faster than a hosting company could. It also provides greater flexibility for scheduling maintenance. Linux Australia has usually partnered with Universities for hosting, given that transfers on the AARNet network are unmetered, reducing the cost of data transfers.

Our current needs are for 2RU of rack space, a /28 block of IP space (14 usable addresses) without filtering, and a large amount of data transfer.

The Admin Team is currently liaising with some Universities located in Victoria to identify whether it's possible for them to host the Linux Australia server equipment.

The Council and Admin Team will review alternative options for hosting in due course to ensure alignment with Linux Australia's ongoing needs in this space. Any transition will be well planned.

### Feedback

As always, we warmly welcome your comments, queries and feedback.

[^1]: https://web.archive.org/web/20160307103442/http://www.vpac.org/node/1318

Kind Regards,

Sae Ra

--
Sae Ra Germaine
Secretary
Linux Australia
secretary at linux.org.au
http://linux.org.au

Linux Australia Inc
GPO Box 4788
Sydney NSW 2001
Australia
ABN 56 987 117 479