As I mentioned in At Doorsteps of a New Engagement, I have a new vendor to deal with. It is a company that has been working with my team for over two years, so it's only new to me. It took just a few days for me to encounter the first set of issues. And that set came from an area so common that it's worth a post by itself – software environments. Below is an email I was cc'ed on –
Subject: RE: WCM Publish failed
Please explain to me how the production environment does not match what is in UAT. This is unacceptable and must stop. This is not the first time a production turnover did not match UAT.
We need to review our build, turnover, and documentation procedures. This pattern cannot continue.
Look familiar? I am sure it does…
If you are in the business of delivering software as a service, or something close to it, chances are you will have the following environments: development, QA, staging, production, and disaster recovery. You may also have dedicated environments such as build, UAT, sandbox(es), a performance lab, etc. If you work with an offshore team, chances are some of those environments are duplicated in the offshore offices.
A number of issues arise as environments proliferate:
- keeping environments in synch;
- individual configuration and corresponding changes to build & deploy;
- external access to environments;
- scheduling access;
- backup, restore and disaster recovery;
- synching up with 3rd parties, especially integration partners;
- patching, updates, etc.
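The first item on that list – keeping environments in synch – is exactly where the trouble in the email above started. A minimal sketch of an automated drift check, assuming each environment can export a simple component-to-version manifest (the component names and versions here are hypothetical, purely for illustration):

```python
def find_drift(reference: dict, candidate: dict) -> dict:
    """Compare two environment manifests (component -> version).

    Returns a dict of differences: components missing from one side
    or present in both with mismatched versions.
    """
    drift = {}
    for name in sorted(set(reference) | set(candidate)):
        ref_ver = reference.get(name)
        cand_ver = candidate.get(name)
        if ref_ver != cand_ver:
            drift[name] = {"uat": ref_ver, "prod": cand_ver}
    return drift

# Hypothetical manifests exported from UAT and production
uat = {"wcm-core": "2.3.1", "wcm-publish": "1.8.0", "search": "4.2"}
prod = {"wcm-core": "2.3.1", "wcm-publish": "1.7.4"}

for component, versions in find_drift(uat, prod).items():
    print(f"{component}: UAT={versions['uat']} PROD={versions['prod']}")
```

Run nightly against every environment pair and wired into an alert, even a check this naive catches the "production does not match UAT" surprise before a failed publish does.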
Unless you have a solid plan for dealing with each of these issues – and follow that plan – you will face a serious environmental crisis which could grow into a global meltdown… well, as global as your organization's footprint.
The good part about these types of environmental issues is that dealing with them is not necessarily rocket science. Of course, if you are running projects of Google proportions, with thousands of check-ins per hour and a large distributed team, dealing with these issues becomes fairly complex; but hopefully in that case you have a Google-size budget as well.
While none of the items on the list falls into the category of "set and forget," environments tend to be static or slow moving. That allows for planning / design / execution with some sanity, even with offshore ingredients. However, never forget the Second Fundamental Law of outsourcing – left to themselves, things go from bad to worse. In the context of environmental issues, consistent degradation of quality of service in the absence of non-stop energy applied from onshore typically translates first into environments going out of synch.
Here are a few fairly obvious tips for preventing an environmental crisis:
- Establish SOPs that cover all aspects / issues outlined above. Follow them.
- A zero-tolerance policy for breaking SOPs, even if randomly enforced, may go a long way toward preventing these issues (serious penalties, up to and including termination, have worked quite well in my experience).
- Automate as many aspects of environment creation / update / patching as you can reasonably afford. The less human intervention, the better.
- Use frequent (automated) ghosting with a well-designed (e.g. grandfather-father-son) backup process.
- Use virtualization to broadest reasonable extent.
- Minimize access to production / critical systems; it's the same as with information access – a "need to know" basis. The fewer people who have access to production, the better.
- Use strong change control process with logs and audits for any modification on production.
- Test restores of backups, ghosts, and virtual images on a regular basis.
- Maintain up-to-date documentation for zero base recovery of all critical / important systems.
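To make the grandfather-father-son rotation mentioned above concrete, here is a rough sketch of the retention logic – deciding which backups to keep – assuming daily "sons," weekly "fathers" taken on Sundays, and monthly "grandfathers" taken on the first of the month. The schedule and retention counts are illustrative design choices, not a prescription:

```python
from datetime import date, timedelta

def gfs_keep(backup_dates, sons=7, fathers=4, grandfathers=12):
    """Return the subset of backup dates to retain under a
    grandfather-father-son rotation:
      - sons: the most recent `sons` daily backups
      - fathers: the most recent `fathers` Sunday backups
      - grandfathers: the most recent `grandfathers` first-of-month backups
    """
    backups = sorted(backup_dates, reverse=True)
    keep = set(backups[:sons])                                        # daily
    keep |= set([d for d in backups if d.weekday() == 6][:fathers])   # weekly
    keep |= set([d for d in backups if d.day == 1][:grandfathers])    # monthly
    return keep

# Example: daily backups from Jan 1 through Mar 31, 2024
backups = [date(2024, 1, 1) + timedelta(days=i) for i in range(91)]
keep = gfs_keep(backups)
```

The point of the tiered scheme is that you can restore to yesterday, to any recent week, or to any recent month-end, while the total number of retained images stays small and predictable.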