There was an upcoming change to a BlueCoat ProxySG over a weekend swing shift (Saturday, 3pm to midnight). All we had to do was upgrade from version 220.127.116.11 to version 18.104.22.168. At the time, we had two BlueCoat devices, a primary and a secondary, used together to provide high availability. If the primary proxy were to stop functioning, the secondary would take over, keeping the proxied traffic available.
With this particular brand of proxy device, if you make a configuration change on the primary, it won't replicate to the secondary. You have to log in to the secondary and make the exact same change there. This is why it's important to make changes using a sound, well-documented change management process: every change has to be made on both devices, not just one.
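Because nothing replicates automatically, one way to catch a change that landed on only one device is to compare the two configurations after each change window. The following is a minimal sketch, not how we actually did it: the function names, the device-specific prefixes, and the sample config lines are all hypothetical and are not real SGOS syntax. It assumes you have already exported each device's configuration as plain text.

```python
import difflib

# Hypothetical prefixes for settings that are expected to differ per device.
DEVICE_SPECIFIC_PREFIXES = ("hostname ", "ip-address ")


def normalize(config_text):
    """Drop blank lines and device-specific lines before comparing."""
    lines = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(DEVICE_SPECIFIC_PREFIXES):
            continue
        lines.append(line)
    return lines


def config_drift(primary_text, secondary_text):
    """Return the lines that exist on only one of the two devices.

    Lines prefixed "- " appear only on the primary,
    lines prefixed "+ " appear only on the secondary.
    """
    diff = difflib.ndiff(normalize(primary_text), normalize(secondary_text))
    return [d for d in diff if d.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Made-up config text, for illustration only.
    primary = "hostname proxy-a\npolicy allow-list v2\nntp server 10.0.0.1\n"
    secondary = "hostname proxy-b\npolicy allow-list v1\nntp server 10.0.0.1\n"
    for line in config_drift(primary, secondary):
        print(line)
```

An empty result means the two devices agree on everything except the settings you have declared device-specific. Anything else is a change that still needs to be made, or explained, on the other device.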
As always, a few days before a major change with a high-profile customer, we make sure that everything is set to go for the engineer on shift. As the team lead, it was my job to make sure everyone involved in this upgrade knew what they were supposed to do, how to log in, whom to call in case of a failure, and how to roll back the changes if something went wrong.
Well, it's a good thing we did our due diligence beforehand, because we found that while we were able to log in to the primary BlueCoat proxy with the password... the password for the secondary device was not working.
It was our responsibility to know the passwords to both devices. Nobody else should know the passwords. So now we were in a position where we might have had to tell the customer we couldn't go forward with the change because we didn't know the password to one of their devices, devices they pay us a large sum of money to manage.
So what do we do now? Everyone was looking at the team lead (me!) for answers. It is at times like these that security professionals are made. I had to draw on all my CISSP studies and experience.
This was the plan in chronological order:
Make sure to copy and paste the password myself just in case someone else was making a mistake. I know it's a simple thing to copy and paste, but weird things can happen with colleagues. It's best to trust, but verify.
Ask if any of the other engineers in the security operations center knew the password (sometimes the simple method works). Sometimes an engineer has to change the password during an emergency (not the best change management practice, but sometimes a necessity for troubleshooting).
Check the ticketing system to see if there was a change request in the past by the customer to change the password. This could be due to their internal compliance, ISO, or password retention policy.
Attempt to use an old password for the system just in case.
Contact the vendor and see if there is a way to reset the password without formatting the device or other aggressive procedures.
Finally, after we exhausted all efforts, it was best to just contact the customer and tell the truth. It wouldn't make us look too good, but I wanted to follow the second canon of the ISC2 Code of Ethics, "Act honorably, honestly, justly, responsibly, and legally."
That's the thing about the truth: it will always defend itself. Lying takes too much work and effort, and it is also unbecoming of a security professional. We're the ones who are supposed to be telling the truth.
And that's what worked in the end. We just told the customer that somehow the password to the secondary proxy device was not working, and that we would need to reset it. The customer didn't complain, they didn't contact senior management for a root cause analysis, and they did not pull their contract with us. They completely understood that things like this happen. We got lucky, as it would not be the same response if it happened a second time.
Senior management proceeded to assure them that when we changed the password to something new, we would open a change request ticket to properly document what happened and to have a place to store the new password (encrypted). The change management process would help us keep track of changes to the device and let the customer know we performed the work. This is why having a change management program is so important to an organization. It gives your organization proper tracking and auditing, and it shows the customer all the changes made to their security devices. It is a way to prevent unauthorized changes to a network security device, changes which could potentially bring down the entire network or remove a critical security function.
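To make the idea concrete, here is a minimal sketch of what a change record might capture and how you could flag devices with no documented change behind them. This is an illustration and not our actual ticketing system: the ChangeRecord fields, the ticket ID format, and the vault-style reference are all hypothetical. Note that the record holds a pointer to where the encrypted password lives, never the password itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeRecord:
    ticket_id: str         # change request ticket (hypothetical ID format)
    device: str            # device the change was made on
    summary: str           # what was changed and why
    secret_ref: str        # pointer to the encrypted entry, not the password
    performed_by: str      # engineer who made the change
    performed_at: datetime


def undocumented_devices(devices, records):
    """Return the devices that have no change record tied to a ticket."""
    documented = {r.device for r in records if r.ticket_id}
    return [d for d in devices if d not in documented]


if __name__ == "__main__":
    records = [
        ChangeRecord(
            ticket_id="CHG-1001",
            device="proxy-secondary",
            summary="Reset console password after login failure",
            secret_ref="vault://proxies/secondary",
            performed_by="engineer-a",
            performed_at=datetime(2020, 1, 4, tzinfo=timezone.utc),
        )
    ]
    print(undocumented_devices(["proxy-primary", "proxy-secondary"], records))
```

If a password ever stops working again, a record like this is what lets you answer who changed it, when, and under which ticket.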
We never did find out how the secondary password was changed or why it didn't work. We accepted the risk of this unknown factor and remained vigilant to make sure it never happened again.
While I can't go into how we changed the password, these documents may shed some light on the process:
"How do I change the ProxySG appliance CLI enable password?"
"Reset the console user or enable password in SGOS"
CISSP Take-Away Concepts
Domain 1: Security and Risk Management
Code of Ethics
If you noticed, it was senior management who "proceeded to assure them" of the various ways we would keep this incident from happening again. This is an example of senior management being ultimately responsible. Even if I made the mistake, or another engineer, or my supervisor... it is always the ones who provide direction for the company who have to answer for any mistakes or errors. That's why they get paid the big bucks.
Domain 4: Network Security
Domain 7: Security Operations
High-availability upholds...well, availability. The term itself is a high-level concept, and how we deal with it in the "Security Operations" domain is the low-level implementation of the high-level concept.
Honesty is the best policy, especially as security professionals.
Thanks for reading.