Cloud Migration | Every Cloud Has A Silver Lining
Disaster Recovery | June 16
If you think about it from a purely conceptual point of view, the cloud can seem like the perfect disaster recovery platform. I have been around many clients who have complained at length about maintaining a dedicated disaster recovery location. Most have moved past a "cold" disaster recovery model (meaning they might just have some hardware there and perhaps their backups already replicated) because of the very long Recovery Time Objective (RTO) it implies; in other words, how long it would take them to get back to operations if they ever had to declare a disaster. A "warm" model is still pretty common: some automation is built in, and at least the Tier 1-2 applications can be recovered in a 4-12 hour window. Finally, a truly "hot" approach, where we switch over to the secondary location in a matter of a few seconds to a few minutes, does exist. Prior to the advent of resources hosted fully in the cloud (production on premises and DR in a secondary co-lo), a solution like this could easily cost a few hundred thousand dollars a month to support. Our company has helped deploy solutions like this for a number of clients. The ironic thing is that some ended up moving back to a "warm" approach, not only because of the cost but because of the complexity of managing these solutions.
So back to thinking about how to take a traditional onsite workload and leverage the cloud for DR. There are "native" options from Microsoft with Azure Site Recovery (ASR), which gives you a virtual machine level replication approach (think of the model DoubleTake perfected years ago). This is nice and can be an option, but you have to keep in mind how you are going to handle all of the supporting services like routing and DNS (did you fail over the database server when the web server failed, even though the database was fine, because performance across the internet VPN was not going to work well?). ASR has the ability to automate a lot of items on the target side, but this does add complexity. Microsoft is at least ahead of the game, or in a different game altogether, because Amazon does not provide any "native" options, though the AWS Marketplace has plenty.
Note: AWS would tell you your application design is flawed and you should have architected at the application level so that the loss of a machine in either location does not cause an outage. The reality is some applications just don't work that way, and an application rewrite can take a lot of time and dollars.
This is where a 3rd party option can really start to make sense. My company provides one option based on Zerto which looks at the problem from a subnet perspective. In short, the optimal unit of failover is the subnet. If your subnet design is configured properly (or corrected as part of onboarding), you are able to fail the servers over as is and have them retain their IP addresses on the other side. This is a great solution because it keeps our clients' applications from asking, over and over, for an IP address which is no longer available.
The other nice thing with this approach is that if you have legacy operating systems which are not supported in a provider like Azure (anyone have a few legacy Windows 2003 servers out there?), you can fail these over to our solution as is. The key to a solid disaster recovery plan is to make it as simple as possible. Removing a bunch of reconfiguration events really improves the odds that exercising your DR strategy will be seen as at least a "good job" and not become a "resume generating event."
Cloud Migration Seminar – June 29
ivision cordially invites IT decision makers facing challenges with their cloud strategy or management approach to attend our seminar on June 29.
IT executives from the Atlanta Braves, LendingPoint, Recall, Manhattan Associates and ivision will share the biggest challenges and best approaches for cloud migration and managing applications on cloud infrastructures, utilizing services to solve modern business problems.
Network Routing Matters | June 10
For most hyperscalers, the default network you deploy when you spin up your first virtual machine could very likely overlap with your internal network, making that newly deployed machine very difficult to integrate into your corporate environment when you decide to stand up your site-to-site VPN connection.
Even if you have a limited number of locations and a simple corporate network design, I advise my clients to really think about their overall networking design before they deploy their first resource. The reason is that with the cloud you can quickly start spinning up additional VPCs/vNets for things like disaster recovery, dev/test, and staging. Just as you likely have a server naming strategy, you want a network naming strategy and policy in place. Most enterprises will end up with multiple cloud providers as well, making this all the more important.
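One way to catch the overlap problem early is to check every candidate cloud address block against your existing corporate ranges before you stand up the VPN. A minimal sketch using Python's standard ipaddress module (the CIDR blocks shown are just examples):

```python
import ipaddress

def find_overlaps(corporate_cidrs, cloud_cidrs):
    """Return (corporate, cloud) CIDR pairs whose address ranges overlap."""
    overlaps = []
    for corp in corporate_cidrs:
        corp_net = ipaddress.ip_network(corp)
        for cloud in cloud_cidrs:
            if corp_net.overlaps(ipaddress.ip_network(cloud)):
                overlaps.append((corp, cloud))
    return overlaps

# A proposed cloud subnet colliding with an internal corporate range:
print(find_overlaps(["10.1.0.0/16", "192.168.0.0/24"],
                    ["172.16.0.0/16", "10.1.20.0/24"]))
# [('10.1.0.0/16', '10.1.20.0/24')]
```

Running a check like this against your full address plan every time a new VPC/vNet is proposed is cheap insurance against a painful re-IP project later.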
A simple design which has worked well looks something like the following.
                  | Corporate    | Production     | Test/Development
Base/VPC/vNet     | 10.1.0.0/16  | 172.16.0.0/16  | 172.26.0.0/16
Remote Access     | 10.1.10.0/24 | 172.16.10.0/24 | 172.26.10.0/24
DMZ               | 10.1.15.0/24 | 172.16.15.0/24 | 172.26.15.0/24
Web               | 10.1.20.0/24 | 172.16.20.0/24 | 172.26.20.0/24
Application       | 10.1.25.0/24 | 172.16.25.0/24 | 172.26.25.0/24
Database          | 10.1.30.0/24 | 172.16.30.0/24 | 172.26.30.0/24
User/Workstations | 10.1.35.0/24 | 172.16.35.0/24 | 172.26.35.0/24
The reality is the corporate network does not usually line up so neatly, and with multiple locations the design is most often more complex. However, when you start moving into the cloud you need to start thinking about how you are going to route all of these networks together, and perhaps even more importantly, how you are going to overlay a tight security structure to make sure confusion does not create opportunities for would-be hackers or curious employees to gain access to information they should not.
It’s Lonely at the Top | June 08
When it comes to access to your cloud, or Active Directory for that matter, it pays to be lonely. I can't tell you the number of environments I go into to complete a health check that have half of the IT department in the Domain Admins, Enterprise Admins, or even Schema Admins groups. What a bad idea. "Hi, here you go, please destroy my entire company at will (or by mistake)." Generally this is because they didn't want to take the time to figure out the right set of rights needed for their role. Or they just needed to get Application X up and running as soon as possible, so the easiest path was to give it full rights to everything. Now everyone is afraid to change anything.
Well, the cloud is no different. The first account created generally has rights to everything, and when the next person needs access, a lot of people simply copy and paste to make the new one. This is a really bad idea. There is a concept called Role Based Access Control (RBAC), which simply means you assign the user or service the rights they need for their specific tasks and nothing more. In AWS you do this in IAM; in Azure this entails creating custom roles and resource groups. You really need to work with someone who understands not only your cloud of choice but how to integrate your existing security approach with the cloud (or better yet, fix it before you move forward). They say all publicity is good publicity, but I would say those folks have never been responsible for a security breach or mass outage.
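To make the contrast concrete, here is a sketch of what "rights for a specific task and nothing more" looks like as an AWS IAM policy document: read-only access to a single S3 bucket, rather than a copy of the first all-powerful account. The bucket name is hypothetical; the document shape follows the standard IAM policy format.

```python
import json

def read_only_bucket_policy(bucket_name):
    """Build a least-privilege IAM policy: read-only on one S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Only the actions this role's task requires, nothing more.
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{bucket_name}",       # the bucket itself
                f"arn:aws:s3:::{bucket_name}/*",     # objects within it
            ],
        }],
    }

# "reports-bucket" is an example name, not a real resource.
print(json.dumps(read_only_bucket_policy("reports-bucket"), indent=2))
```

A document like this can then be attached to a user or role (for example via IAM in the console, or boto3's create_policy) instead of granting "*" on "*" because it was faster.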
Knock Knock, "Who's there?"…"It's me"…"Prove it" | June 07
When it comes to moving to the cloud, many companies take the path of least resistance. This means making access to their environments and services as simple, and often as insecure, as possible. I understand the need to provide simple and secure access to web applications running in AWS or Azure. We have an App Dev team who focuses on making sure this happens. However, what I strive to help my customers understand is that often the single fastest and least secure path to your data is through your front door.
Many clients are choosing to architect their environments in a way that makes access as easy as possible. This means connecting their cloud and production networks together at the network level. If done properly, this is a great thing; if not, you should be concerned. Really, look around and forget wireless: do you have a port in your conference room that is connected to your corporate network, and therefore also to your cloud services? You know how often a client asks to even look at my laptop before I jump on their wireless or plug into their network? On average, about 1 out of 5 times, and I work with some really big clients.
Today it is really easy to take advantage of two-factor authentication with AWS or Azure. Quite simply, when your internal folks log into that key application, or even the management console, they log in with their username and password and are then asked to provide a rotating PIN supplied by a free authenticator app on their phone. It is simple to use, and it makes sure that when I sit down at Debby's desk and see a sticky note, watch your salesperson type their password three times in front of me, or simply have the rights internally (think help desk) to see your passwords, I still won't be able to log in with credentials alone. I need that phone too (which hopefully has its own PIN on it; if you are not enforcing a corporate policy on this, we really should talk, like today, and fix that). We all know that people are more likely to forget just about anything other than their mobile phone. This is really not that hard; it just takes someone to help lead the way who has been there and done that before.
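That rotating PIN is usually a time-based one-time password (TOTP, RFC 6238): the app and the server share a secret, and both derive the same short-lived code from it and the current time. A minimal standard-library sketch of the algorithm, verified against an RFC 6238 test vector:

```python
import hashlib
import hmac
import struct

def totp(secret, for_time, step=30, digits=6):
    """RFC 6238 TOTP: derive a rotating PIN from a shared secret."""
    counter = int(for_time) // step                 # 30-second time window
    msg = struct.pack(">Q", counter)                # counter as 8-byte big-endian
    digest = hmac.new(secret, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test vector: this secret at T=59 seconds yields 94287082.
print(totp(b"12345678901234567890", for_time=59, digits=8))  # 94287082
```

In practice you would never roll this yourself; AWS and Azure both accept standard authenticator apps. The point is that the mechanism is simple, standardized, and essentially free to turn on.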
Don’t Leave the Lights On | June 06
I come across too many clients who spend $100K+ on a set of highly redundant firewalls and every flavor of IPS system out there to control and manage traffic and security coming into their environment. They generally have a team who is responsible for network security (who may or may not be BFFs with the rest of IT), and no one outside that team would think of logging into the firewall and making a change.
Now comes the web. In about 5-10 minutes I can have a machine up and running. Given another half day, I can have full routing up and running between production and AWS or Azure with a site-to-site VPN. Do you see where this is going? The cloud is designed to remove barriers and make it easy to get things running in a hurry. What this also means is that it is very easy to stand up a machine and, in about the same 10 minutes, present a big fat "come get me" to the internet. It doesn't matter whether an attacker comes in through the big shiny Cisco ASAs or the new RDP connection on your $29/month server; they are in just the same. Care really needs to be taken to make sure your security and your roles and responsibilities are well planned out to prevent a security breach from shutting you down (or worse yet, happening and then operating under the radar). There are ways to leverage security logs with the native tools in AWS and Azure, as well as 3rd party tools like Splunk, to help you understand what is happening, but the first step is to come up with an architecture that prevents it from happening in the first place.
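Catching that "come get me" machine does not have to wait for a breach. A hedged sketch of an audit that flags security-group rules exposing RDP or SSH to the whole internet; the dicts mirror the shape EC2's DescribeSecurityGroups API returns (fetching them, for example with boto3, is left out, and the group ID shown is made up):

```python
SENSITIVE_PORTS = (22, 3389)  # SSH and RDP

def wide_open_rules(security_groups):
    """Flag (group_id, port) pairs exposing sensitive ports to 0.0.0.0/0."""
    findings = []
    for sg in security_groups:
        for perm in sg.get("IpPermissions", []):
            from_port = perm.get("FromPort")
            to_port = perm.get("ToPort", from_port)
            for ip_range in perm.get("IpRanges", []):
                if ip_range.get("CidrIp") != "0.0.0.0/0":
                    continue  # rule is scoped to a specific range; fine
                for port in SENSITIVE_PORTS:
                    if from_port is not None and from_port <= port <= to_port:
                        findings.append((sg["GroupId"], port))
    return findings

# Example data in the DescribeSecurityGroups shape:
groups = [{"GroupId": "sg-0123", "IpPermissions": [
    {"FromPort": 3389, "ToPort": 3389,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]}]
print(wide_open_rules(groups))  # [('sg-0123', 3389)]
```

Run on a schedule, a check like this turns "operating under the radar" into a ticket in your queue the same day the rule appears.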
Don’t Act Like You Own It | June 02
Many of my clients are used to working in a model where they own the whole environment. This means they get to tightly control the environment in order to ensure performance. They can reliably predict performance because they can measure it, and therefore they can be confident it will work the same way in the future as it does in testing.
The cloud brings a lot of things, but in general, unless you want to invest heavily in dedicated hosts with AWS, you will need to be very careful in your configuration. Because these servers host multiple clients at one time, there is a concept of a "noisy neighbor": another instance running on your server could start sucking up more than its fair share of resources. There are ways to architect around this in AWS through instance selection and provisioned IOPS. But if you have to have complete confidence in the performance of your virtual machines, it may make sense to go with a hybrid model: run some of your machines on your own equipment, such as a small hyper-converged infrastructure in a co-location like Equinix, and take advantage of cloud providers like Azure and AWS for other workloads. The best thing is that if you architect it properly, they can all act seamlessly together as one environment. There are a couple of key things you need to make sure are in place to fully support this.
Rethink How You Apply Security | June 01
Do you think you are going to use the same security principles and architectures you use in your traditional data center? The principles apply, but the application of those principles is different.
In the traditional world, we look at securing subnets or VLAN’s and protecting traffic at the edge. In the cloud world, the focus of security is often around the Azure virtual machine and AWS instance itself. It is less about NACLs and more about security groups and restricting specific IP/port traffic to a particular machine.
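What "restricting specific IP/port traffic to a particular machine" looks like in practice is a narrow security-group ingress rule. A sketch in the IpPermissions shape that EC2's AuthorizeSecurityGroupIngress accepts (the office CIDR and description are hypothetical):

```python
def https_from_office(office_cidr):
    """Build an ingress spec: HTTPS to one machine, one source range only."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": 443,       # single port, not a wide range
        "ToPort": 443,
        "IpRanges": [{
            "CidrIp": office_cidr,            # one known source network
            "Description": "HQ office only",  # document intent in the rule
        }],
    }]

rule = https_from_office("203.0.113.0/24")
print(rule[0]["IpRanges"][0]["CidrIp"])  # 203.0.113.0/24
# With boto3 this spec could be passed to
# ec2.authorize_security_group_ingress(GroupId="sg-...", IpPermissions=rule)
```

The unit of enforcement here is the instance's security group, not a subnet boundary, which is exactly the shift in thinking this post is describing.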
You can deploy traditional security technologies like virtual Cisco ASAs or 1000v's, but the way they work is different in the cloud. I work with so many clients who don't fully understand how to integrate their two worlds together, and they end up being frustrated or exposed. There are practical ways to achieve security, and I would encourage you to work with a provider who understands both worlds.