[Linux-aus] Linux Australia mail/list server
Hugh Blemings
hugh at blemings.org
Sun Feb 28 20:50:54 AEDT 2016
Dear Linux-Aus,
I write to brief you on a recent issue with Linux Australia's mail server.
If you are among the majority of folk that haven't experienced any
problems, then you can safely ignore this email. However, in the
interests of transparency and openness, we're providing the following
details as it is an experience that many in our community may find
interesting.
Early on the morning of the 20th Feb, previous council members contacted
members of the new LA council to advise that they appeared to have been
re-subscribed to the council mailing list.
In addition to this, a number of community members noted the apparent
intermittent absence of the Linux Australia mailing list archives.
While investigating these reports on the 22nd, the Admin team was
contacted by other users of the Linux Australia mail system to advise
that it appeared their mailboxes had been rolled back to a version from
December 2015.
During the investigation, the system intermittently returned the correct
mailboxes and Mailman subscriptions, then immediately reverted to the
old information.
As the last change to the system had been a reboot following a patch for
the recent glibc buffer overflow vulnerability (CVE-2015-7547) which was
applied on the 19th, the initial path of investigation was related to
the possibility of an as yet unknown bug related to virtualisation disk
caching and the recently applied update.
On the 23rd February, whilst gathering hardware information on one of
the virtualisation chassis that Linux Australia uses so as to lodge a
maintenance case, a member of the Admin team discovered that this
system, which had previously been removed from the storage pool and
shut down, had recently been powered on, and a duplicate copy of the
mailhost virtual machine had been started on boot.
Once this virtual machine was powered down, all users reported that the
correct mailbox contents were now available and mailing lists had the
correct subscribers. The chassis has subsequently been removed from the
datacentre, and will be re-installed once the existing maintenance issues
have been resolved.
No mail was lost during this period, as it was delivered to the
now powered-down virtual machine. The Admin team is working to recover
this mail and insert it into the relevant archives of the Linux
Australia and other lists, as well as the user mailboxes.
We expect to have this recovery concluded within the fortnight.
** Was this a malicious act?
No. We believe this was unintentional. The server was powered down and
removed from the cluster partly because one of its power supplies had
failed.
These machines are set to power up on return of power, and we believe
the server received a power "blip" on the functional power supply that
had a sufficient duration to trigger the "power return" state action
within the system controller, resulting in the machine booting into a
running OS.
** How were two instances of the same machine allowed to be running?
As the second server was not part of the cluster, there was no file lock
to inhibit the starting of the virtual machine from local storage, and
no ability for the operational system to know the second system had
started a duplicate machine.
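As a rough illustration of why the lock did not help here, the sketch
below models a start-up lock as a file created exclusively on whatever
storage a host can see. It is not the locking mechanism of the
virtualisation platform Linux Australia runs; the function name, host
names and directory layout are all hypothetical. Two hosts sharing one
storage directory contend for the same lock, while a host reading its
own local copy never sees that lock at all.

```python
import os
import tempfile


def try_start_vm(storage_dir, vm_name, host):
    """Try to claim vm_name by creating a lock file in storage_dir.

    Returns True if this host obtained the lock (and may start the VM),
    False if a lock file already exists there. Hypothetical helper, for
    illustration only.
    """
    lock_path = os.path.join(storage_dir, vm_name + ".lock")
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(host)  # record which host holds the lock
    return True


def demo():
    # "shared" stands in for the cluster storage pool, "local" for the
    # disks of the chassis that had been removed from the pool.
    with tempfile.TemporaryDirectory() as shared, \
            tempfile.TemporaryDirectory() as local:
        return {
            "cluster_a": try_start_vm(shared, "mailhost", "cluster-a"),
            "cluster_b": try_start_vm(shared, "mailhost", "cluster-b"),
            "removed": try_start_vm(local, "mailhost", "removed-chassis"),
        }


if __name__ == "__main__":
    print(demo())
```

In this model the second cluster host is refused, but the removed
chassis succeeds because its lock lives on storage nobody else checks.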
** What has been done to stop this happening again?
The Admin team has changed the procedure for decommissioning a machine
for maintenance to include a request for physical removal of power, and
to add the machine to a monitoring check that will alert on power-up.
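One way such a power-up alert could work is sketched below, assuming a
simple TCP probe against each decommissioned host. This is not the Admin
team's actual monitoring configuration; the function names, the choice
of port 22 and the timeout are assumptions made for illustration.

```python
import socket


def is_reachable(host, port=22, timeout=2.0):
    """Return True if host accepts a TCP connection on port.

    Port 22 is an assumption; any service the machine starts on boot
    would serve the same purpose.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def powered_on_unexpectedly(decommissioned, probe=is_reachable):
    """List the decommissioned hosts that answer the probe.

    A machine that is meant to be off should never respond, so any
    entry in the result is a reason to raise an alert.
    """
    return [host for host in decommissioned if probe(host)]
```

Run periodically against the list of machines that are meant to be off,
a non-empty result would be the trigger for the alert. The probe is
passed in as a parameter so the check can be exercised without touching
the network.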
My apologies for any inconvenience caused by the above and thanks to the
Admin team for acting quickly on the reports of errant behaviour.
Kind Regards,
Hugh
--
Hugh Blemings,
President, Linux Australia