 

Roll out hybrid clouds with Ansible automation

Mixing Allowed

Designing your own hybrid IT structure as a digital mix of your servers and public or private clouds might be technically elegant and cost effective, but setup is time consuming. Thanks to Ansible, it might take less work than you think.

By Konstantin Agouros

According to reliable estimates, half of all corporations will be operating a hybrid cloud architecture by next year. Obviously, the advantages of mixing the historic IT landscape, which physically resides in-house, with public or private clouds from external service providers are something that appeals to admins. Lower running costs are often the deciding factor in favor of hybrid setups. In many cases, however, the combination also might have functional advantages. Companies with hybrid IT structures are said to be more successful than their competitors [1].

The reason the hybrid cloud is not yet widely used is probably because of the great effort needed to set it up. In this article, I seek to demonstrate that every admin can take a considerable step toward achieving a hybrid cloud with little effort, while gaining enough experience to outsource more parts of the local IT structure. With Amazon Web Services (AWS) as an example, I show how you can use two Linux virtual machines (VMs) to build a secure infrastructure that is even capable of multitenancy. The best part is that you do not need to do all this work yourself; instead, Ansible will carry out the setup steps. (For more information on Ansible, refer to earlier articles in this magazine [2][4].)

Many users still think that VMs in the cloud are always publicly available. However, administrators who have been involved with virtual network infrastructures, such as AWS or Azure, know that private networks, VPNs, and routers are also common in the cloud. AWS and Azure even offer VPNs to the private corporate network as a (commercial) service (see the "Amazon Network Glossary" box).

Target Architecture

Figure 1 shows the architecture I set up for the project here. In the local data center on the right, a VPN gateway establishes the connection to the gateway in the AWS universe. This can be implemented with a firewall or through a separate gateway. Ansible also configures a local OpenVPN instance; since I have already built VPN playbooks for Libreswan, Fortinet, and Juniper firewalls, I could easily replace the component.

Figure 1: This hybrid architecture is my goal. On the right, you can see the relevant part of the in-house IT; on the left, you can see the setup to be created in the Amazon cloud.

On the AWS side, the structure is a bit more complicated, because I want to develop a multitenant solution – a VM facing the Internet that forwards to different closed environments. The VPN data in the closed environments will not be known to the front end; however, incoming access control based on IP addresses still takes place there. Therefore, the AWS VM uses socat to forward the incoming packets to the actual VPN service in the closed network so that an encrypted end-to-end connection is still established. If you do not want multitenant capability, you can omit this redirection from the structure.

The AWS subnet has a routing table that ensures the routing between the local networks of your data center and the AWS VM with OpenVPN occurs correctly.

Networks and Peering

As with other virtualization environments, you start by setting up the virtual network infrastructure so that the VMs ultimately end up on the correct networks. For AWS, you first create two VPCs: the inner one for the closed environment, the outer one for the gateway VM (Figure 2). Both have a private IP network.

Figure 2: Two VPCs to be created form the basis on the AWS side.

The inner VPC's address range should be chosen so that it does not collide with the network in your data center, because on the local side, the routes into this network area must point to the local VPN gateway. For the external block, a small network area is sufficient, because it only serves as a transition point. AWS also needs subnet objects whose IP networks each have to be a subset of their VPC's total range.

In the next step, you set up the VPC peering connection, which is like adding a virtual cable between your two VPCs (Figure 3). You need to accept this peering in a separate step. So that the VMs can find their way out of the outer VPC later on, you now need to create an Internet gateway.

Figure 3: This dialog lets you set up the peering connection; you must confirm this in a separate step.

VPN Routing

The configurations up to now simulate Layer 2 cabling, which would also be necessary in a physical setup. Now the IP routing follows. In the routing table for the inner VPC, you define a route to the outer subnet. Because the VPN connections are forwarded at the application level (socat), there is no direct IP connection between the VPN gateway and the Internet. The target of these routes is the VPC peering connection – comparable to setting an interface route with:

ip route add 1.2.3.0/24 dev eth0

Conversely, the external VPC must know how to find the internal VPC, which requires a route to the internal subnet that again points to the VPC peering connection. Additionally, the external VPC's default route must point to the previously created Internet gateway.

The security configuration completes the setup of the virtual network: The VPN gateway in the internal VPC needs the VPN port (for OpenVPN, port 1194/UDP; for IPsec with NAT traversal, ports 500/UDP and 4500/UDP). To allow the admin to access the VM itself, SSH access is also required (i.e., incoming port 22/TCP). The same applies to the external gateway computer. Here, the VPN input can then be limited to the local VPN gateways' static IPs – if available. This does not offer very strong protection (especially for UDP), but it at least limits the Internet background noise a bit.

Two VMs

Now, you need just two VM instances to do the work. The intended software is installed and configured on each: socat as a service on the external VM and OpenVPN as VPN software on the internal VM.

To create a VM, you need an image ID (ami-id in AWS-speak) and an instance size. The small t2.micro instances are perfectly sufficient for the setup here, because their performance matches the Internet connections of most companies. If you have a 10Gbps Internet connection, you should adjust the instance size accordingly. Then, assign the correct VPCs and subnets, as well as the previously set up security groups, to both instances.

Both VMs are based on an Ubuntu 16.04 server image (Figure 4). On the front VM, socat acts as a port forwarder; systemd launches it as a daemon. This would be enough for the VPN, but the internal VM has no Internet access, so it needs a proxy to install its packages. Therefore, the front VM also gets a Squid web proxy, which is restricted so that only clients on the internal subnet can use it. To match this, you need to extend the security group to include port 3128/TCP from this network.

Figure 4: Both VMs are based on an Ubuntu 16.04 server image provided by Amazon.
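
The Squid template itself is not reproduced in this article. A minimal sketch of what templates/squid.conf.j2 might contain, assuming the internal subnet reaches the template in a variable named subnet, would be:

# Sketch of templates/squid.conf.j2; the variable name is an assumption
acl localnet src {{ subnet }}
http_access allow localnet
http_access deny all
http_port 3128

This allows only clients from the internal subnet to use the proxy and denies everything else, matching the restriction described above.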

OpenVPN is installed on the inner VM in server mode. VMs on the inner network need a route through the OpenVPN gateway to the local data center network.
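
On such a VM, this corresponds to an entry of the following kind (the addresses are placeholders: 192.168.10.0/24 stands for the data center LAN and 10.99.1.10 for the OpenVPN VM's private address):

ip route add 192.168.10.0/24 via 10.99.1.10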

Ansible

Configuring all of this manually would require quite a bit of work in the AWS web GUI. Additionally, the parameters must all match, such as the IP addresses of the networks and the corresponding entries in the routing tables. Later, the addresses also feed into the configuration of the services on the VMs. This leaves the admin with many opportunities for error, especially since asking "What was the address again?" interrupts the workflow in the web GUI.

Ansible's cloud modules, on the other hand, cover everything necessary for the configuration. When creating one component in the playbook, always store the results, because the IDs of the individual components are required to create the next one. For example, you also need to enter the inner VM IP address assigned by Amazon in the socat configuration on the outer VM.

Thanks to Ansible, the configuration becomes a single coherent process, resulting in fewer errors. The whole ensemble comprises two roles for the configuration of the inner and outer VMs. Before they run, the playbook builds the entire infrastructure.

Some data is parameterized, like the AWS region, the address space for the inner VPC and the subnet, and the subnet on the local LAN so that the VPN configuration can be generated. The playbook reads this data from a YAML file at the outset.
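
A parameter file of this kind could look like the following sketch; the variable names region, netname, cidr_master, and subnet match those used in the listings, whereas the values and the name local_lan are purely illustrative:

# Hypothetical contents of the file passed in as datafile
region: eu-central-1
netname: customer1
cidr_master: 10.99.0.0/16    # address space of the inner VPC
subnet: 10.99.1.0/24         # inner subnet within that VPC
local_lan: 192.168.10.0/24   # LAN behind the local VPN gateway

A call such as ansible-playbook hybrid.yml -e datafile=customer1.yml (the filenames are again placeholders) would then hand this file to the include_vars task at the top of Listing 1.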

To allow the playbook to access AWS at all, you need either a file in ~/.aws/credentials containing the aws_access_key_id and aws_secret_access_key entries, or you can set the values in the playbook as variables (preferably via Ansible Vault for security reasons) or store them in environment variables. The cloud module documentation [4] explains the variants.
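
For the first variant, the credentials file uses the usual AWS format, for example:

[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY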

The only way to access a VM in AWS is with an SSH connection. Unlike other hosting service providers, there is no console. Therefore, the admin SSH key also needs to be stored in AWS. Listing 1 shows the list of tasks that upload the key (only do this if the key is not already in place), create the VPCs with one subnet each, and generate the VPC peering.

Listing 1: Base and Network Tasks

01 - name: Load Data
02   include_vars: "{{ datafile }}"
03   tags: getdata
04
05 - name: SSH Key
06   ec2_key:
07     name: ansible-admin-key
08     key_material: "{{ item }}"
09     state: present
10     region: "{{ region }}"
11   with_file: /home/user/.ssh/id_rsa.pub
12   register: sshkey
13   tags: sshkey
14
15 - name: Create VPC INT
16   ec2_vpc_net:
17     name: "{{ netname }}-int"
18     cidr_block: "{{ cidr_master }}"
19     region: "{{ region }}"
20   tags: create_vpc_int
21   register: myvpcint
22
23 - name: Create Subnet INT
24   ec2_vpc_subnet:
25     cidr: "{{ subnet }}"
26     vpc_id: "{{ myvpcint.vpc.id }}"
27     region: "{{ region }}"
28     state: present
29   tags: create_subnet_int
30   register: mysubnetint
31
32 - name: Create VPC Ext
33   ec2_vpc_net:
34     name: "{{ netname }}-ext"
35     cidr_block: 172.25.0.0/28
36     region: "{{ region }}"
37   tags: create_vpc_ext
38   register: myvpcext
39
40 - name: Create Subnet Ext
41   ec2_vpc_subnet:
42     cidr: 172.25.0.0/28
43     vpc_id: "{{ myvpcext.vpc.id }}"
44     region: "{{ region }}"
45     state: present
46   tags: create_subnet_ext
47   register: mysubnetext
48
49 - name: Create VPC Peering
50   ec2_vpc_peer:
51     region: "{{ region }}"
52     vpc_id: "{{ myvpcint.vpc.id }}"
53     peer_vpc_id: "{{ myvpcext.vpc.id }}"
54     state: present
55   register: myvpcpeering
56   tags: createvpcpeering
57
58 - name: Accept VPC Peering
59   ec2_vpc_peer:
60     region: "{{ region }}"
61     peering_id: "{{ myvpcpeering.peering_id }}"
62     state: accept
63   register: action_peer
64
65 - name: Create Internet Gateway
66   ec2_vpc_igw:
67     vpc_id: "{{ myvpcext.vpc.id }}"
68     region: "{{ region }}"
69     state: present
70   register: igw
71   tags: igw

At the end of each ec2 instruction is a register block that stores the operation's result in a variable. To create a subnet, you need the VPC's ID in which the subnet is to be located. The same thing happens starting on line 49, first to create and then accept VPC peering. The final task creates the Internet gateway.

Creating Routing and VMs

Listing 2 shows the second part of the playbook. The Gather Route tables task searches the routing table in the internal VPC to enter the route on the outer network. Then, the playbook sets the route from the inside out and in the opposite direction. The next two tasks create the security groups for both VMs.

Listing 2: Routes and Filter Rules

01 - name: Gather Route tables
02   ec2_vpc_route_table_facts:
03     region: "{{ region }}"
04     filters:
05       vpc-id: "{{ myvpcint.vpc.id }}"
06   register: inttables
07   tags: gatherroutes
08
09 - name: Set Route out
10   ec2_vpc_route_table:
11     vpc_id: "{{ myvpcint.vpc.id }}"
12     region: "{{ region }}"
13     route_table_id: "{{ inttables.route_tables[0].id }}"
14     tags:
15       Name: "{{ netname }}-int"
16     subnets:
17       - "{{ mysubnetint.subnet.id }}"
18     routes:
19       - dest: 172.25.0.0/28
20         vpc_peering_connection_id: "{{ myvpcpeering.peering_id }}"
21   register: outboundroutetable
22   tags: routeout
23
24 - name: Set Route in
25   ec2_vpc_route_table:
26     vpc_id: "{{ myvpcext.vpc.id }}"
27     region: "{{ region }}"
28     tags:
29       Name: "{{ netname }}-ext"
30     subnets:
31       - "{{ mysubnetext.subnet.id }}"
32     routes:
33       - dest: "{{ subnet }}"
34         vpc_peering_connection_id: "{{ myvpcpeering.peering_id }}"
35       - dest: 0.0.0.0/0
36         gateway_id: igw
37   register: outboundroutetable
38   tags: routein
39
40 - name: internal Secgroup
41   ec2_group:
42     name: "{{ netname }}-int-secgroup"
43     vpc_id: "{{ myvpcint.vpc.id }}"
44     region: "{{ region }}"
45     purge_rules: true
46     description: Ansible-Generated internal rule
47     rules:
48       - proto: udp
49         from_port: 12345
50         to_port: 12345
51         cidr_ip: 0.0.0.0/0
52       - proto: tcp
53         from_port: 22
54         to_port: 22
55         cidr_ip: 0.0.0.0/0
56       - proto: tcp
57         from_port: 443
58         to_port: 443
59         cidr_ip: 0.0.0.0/0
60   register: intsecg
61   tags: internalsec
62
63 - name: external Secgroup
64   ec2_group:
65     name: "{{ netname }}-ext-secgroup"
66     vpc_id: "{{ myvpcext.vpc.id }}"
67     region: "{{ region }}"
68     purge_rules: true
69     description: Ansible-Generated internal rule
70     rules:
71       - proto: udp
72         from_port: 12345
73         to_port: 12345
74         cidr_ip: 0.0.0.0/0
75       - proto: tcp
76         from_port: 22
77         to_port: 22
78         cidr_ip: 0.0.0.0/0
79       - proto: tcp
80         from_port: 443
81         to_port: 443
82         cidr_ip: 0.0.0.0/0
83       - proto: tcp
84         from_port: 3128
85         to_port: 3128
86         cidr_ip: "{{ subnet }}"
87   register: extsecg
88   tags: externalsec
89
90 - name: Update Auto
91   ec2_auto_assign_public_ip_for_subnet:
92     subnet: "{{ mysubnetext.subnet.id }}"
93     region: "{{ region }}"
94     state: present

The last task in Listing 2 (line 90) is necessary because of an Ansible peculiarity: The AWS API can tell a subnet whether hosts on this subnet should always be assigned a public IP automatically, but this setting can only be made from Ansible with the extra task shown in Listing 2.

Listing 3 creates the two VMs. The tasks create the VMs and add them to groups; the following plays then use these groups to install and configure the software via SSH. The last task of the first play waits until the external VM is accessible via SSH.

Listing 3: Rolling Out the VMs

01 - name: Deploy Backend
02   ec2:
03     key_name: ansible-user-key
04     instance_type: t2.micro
05     image: ami-d15a75c7
06     region: "{{ region }}"
07     wait: yes
08     id: test-backend
09     assign_public_ip: no
10     vpc_subnet_id: "{{ mysubnetint.subnet.id }}"
11     group_id: "{{ intsecg.group_id }}"
12   register: backendvm
13   tags: createbackend
14
15 - name: add backend to group
16   add_host:
17     hostname: "{{ item.private_ip }}"
18     groupname: backend
19   with_items: "{{ backendvm.instances }}"
20
21
22 - name: Deploy Frontend
23   ec2:
24     key_name: ansible-user-key
25     instance_type: t2.micro
26     image: ami-d15a75c7
27     region: "{{ region }}"
28     wait: yes
29     id: test-frontend
30     assign_public_ip: yes
31     vpc_subnet_id: "{{ mysubnetext.subnet.id }}"
32     group_id: "{{ extsecg.group_id }}"
33   register: frontendvm
34   tags: createfrontend
35
36 - name: add frontend to group
37   add_host:
38     hostname: "{{ item.public_ip }}"
39     groupname: frontend
40   with_items: "{{ frontendvm.instances }}"
41
42 - name: Wait for ssh of frontend
43   wait_for:
44     host: "{{ item.public_dns_name }}"
45     port: 22
46     state: started
47   with_items: "{{ frontendvm.instances }}"

Role-Play

The playbook's second and third plays use roles to configure the software on the two VMs. The last play in Listing 4 is special: Because the inner VM can only be reached from the outer VM, it reroutes the SSH connection via ansible_ssh_common_args so that the connection to the inner VM uses the outer VM as a proxy. I only packed the configuration into roles because I wanted the configuration to be reusable.

Listing 4: The Last Two Plays

01 - name: Configure Frontend
02   hosts: frontend
03   remote_user: ubuntu
04   gather_facts: no
05   roles:
06     - agouros.aws.frontend
07
08 - name: Configure Backend
09   hosts: backend
10   remote_user: ubuntu
11   gather_facts: no
12   vars:
13     ansible_ssh_common_args: "-o ProxyCommand='ssh -W %h:%p -q ubuntu@{{ hostvars['localhost']['frontendvm']['instances'][0]['public_ip'] }}'"
14   roles:
15     - agouros.aws.backend

Listing 5 shows the tasks from the front-end role. The first task installs Python in raw mode, because Python was not present on the AWS Ubuntu image I used. Then Ansible installs and configures the socat and squid packages. The special case here is socat, which is not really a service: Only a systemd unit file turns it into one, listening on the VPN's UDP port and on 443/TCP. The target address of the inner VM for connecting is taken from the previous part of the playbook run. The task starting in line 43 (Listing 5) enables the services and starts them.

Listing 5: Installing the Front End

01 - name: Install Python
02   raw: test -e /usr/bin/python || (sudo -s apt-get -y install python)
03
04 - name: Install Software
05   become: true
06   become_method: sudo
07   apt:
08     name: "{{ item }}"
09     state: present
10     cache_valid_time: 86400
11   with_items:
12     - socat
13     - squid
14
15 - name: Configure Squid
16   become: true
17   become_method: sudo
18   template:
19     src: templates/squid.conf.j2
20     dest: /etc/squid/squid.conf
21   notify: restart squid
22
23 - name: Configure socat
24   become: true
25   become_method: sudo
26   template:
27     src: templates/socat.service.j2
28     dest: /etc/systemd/system/socat.service
29   notify:
30     - daemon-reload
31     - start socat
32
33 - name: Configure socat443
34   become: true
35   become_method: sudo
36   template:
37     src: templates/socat443.service.j2
38     dest: /etc/systemd/system/socat443.service
39   notify:
40     - daemon-reload
41     - start socat
42
43 - name: Start and enable Services
44   become: true
45   become_method: sudo
46   systemd:
47     enabled: yes
48     state: started
49     daemon_reload: yes
50     name: "{{ item }}"
51   with_items:
52     - squid
53     - socat
54     - socat443
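
The unit file templates are not printed here. A minimal sketch of what templates/socat.service.j2 might look like, assuming the backend's private IP and the VPN port reach the template as variables named backend_ip and vpn_port, follows; the socat443 unit would forward TCP port 443 in the same way:

# Sketch of templates/socat.service.j2; the variable names are assumptions
[Unit]
Description=socat forwarder to the internal VPN gateway
After=network.target

[Service]
ExecStart=/usr/bin/socat UDP4-LISTEN:{{ vpn_port }},fork UDP4:{{ backend_ip }}:{{ vpn_port }}
Restart=always

[Install]
WantedBy=multi-user.target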

Listing 6 handles the internal VM's configuration. Unlike the front end, it cannot start with the installation of Python, because apt first has to learn about the upstream proxy; this again happens in raw mode.

Listing 6: Internal VM Configuration

01 - name: Debug1
02   debug: msg="{{ inventory_hostname }}"
03
04 - name: Set Proxy
05   raw: test -e /etc/apt/apt.conf.d/80proxy || (sudo sh -c 'echo "Acquire::http::proxy \"http://{{ hostvars['localhost']['frontendvm']['instances'][0]['private_ip']  }}:3128\";" >  /etc/apt/apt.conf.d/80proxy')
06
07 - name: Install Python
08   raw: test -e /usr/bin/python || (sudo -s apt-get -y install python)
09
10 - name: Install Software
11   become: true
12   become_method: sudo
13   apt:
14     name: "{{ item }}"
15     state: present
16     cache_valid_time: 86400
17   with_items:
18     - openvpn
19
20 - name: Set OpenVPN Name
21   become: true
22   become_method: sudo
23   lineinfile:
24     path: /etc/default/openvpn
25     regexp: "^AUTOSTART"
26     line: "AUTOSTART=\"{{ hostvars['localhost']['netname'] }} {{ hostvars['localhost']['netname'] }}-443\""
27
28 - name: Send SSL Files
29   become: true
30   become_method: sudo
31   copy:
32     src: "{{ item }}"
33     dest: /etc/openvpn
34     owner: root
35     mode: 0600
36   with_fileglob:
37     - "files/*"
38
39 - name: ccd-dir
40   become: true
41   become_method: sudo
42   file:
43     dest: /etc/openvpn/ccd
44     state: directory
45     mode: 0755
46     owner: root
47     group: root
48
49 - name: clientfile
50   become: true
51   become_method: sudo
52   template:
53     src: templates/clientfile.j2
54     dest: /etc/openvpn/ccd/clientfile
55
56 - name: OpenVPN Group
57   become: true
58   become_method: sudo
59   group:
60     name: openvpn
61     state: present
62
63 - name: OpenVPN user
64   become: true
65   become_method: sudo
66   user:
67     name: openvpn
68     state: present
69     groups: openvpn
70     system: yes
71
72 - name: Configure Openvpn
73   become: true
74   become_method: sudo
75   template:
76     src: templates/openvpn.conf.j2
77     dest: "/etc/openvpn/{{ hostvars['localhost']['netname']}}.conf"
78
79 - name: Configure Openvpn
80   become: true
81   become_method: sudo
82   template:
83     src: templates/openvpn.conf-443.j2
84     dest: "/etc/openvpn/{{ hostvars['localhost']['netname']}}-443.conf"
85
86 - name: Enable and start service
87   become: true
88   become_method: sudo
89   systemd:
90     name: openvpn
91     enabled: true
92     state: restarted

In terms of software packages, the role only installs OpenVPN but configures two instances: one on the UDP port and one on TCP/443. An entry in the /etc/default/openvpn file sets both instances to AUTOSTART. The last task in the listing enables and starts the OpenVPN service. The SSL keys in this example were already preconfigured; it is up to the admin to generate and upload certificates with Ansible.
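
The OpenVPN templates are likewise not reproduced. A rough sketch of what templates/openvpn.conf.j2 could contain follows; all addresses, file names, and routes here are purely illustrative, and the -443 variant would differ mainly in using proto tcp and port 443:

# Sketch of templates/openvpn.conf.j2 (illustrative values only)
port 1194
proto udp
dev tun
ca /etc/openvpn/ca.crt
cert /etc/openvpn/server.crt
key /etc/openvpn/server.key
dh /etc/openvpn/dh2048.pem
server 10.8.0.0 255.255.255.0
client-config-dir /etc/openvpn/ccd
push "route 10.99.1.0 255.255.255.0"
route 192.168.10.0 255.255.255.0
keepalive 10 120
user openvpn
group openvpn
persist-key
persist-tun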

Objective Achieved

Extending your data center with resources from the cloud can significantly reduce total costs. The benefit is greatest for services that would be very expensive to purchase or that are only needed occasionally, such as additional instances of your own web store or web server for a foreseeable rush of customers.

To ensure that the cloud services are not freely available on the Internet (unless that is what you want), the infrastructure presented here connects AWS almost like part of your own data center. For temporary resources, however, manual assembly and disassembly would be labor intensive. Automation with Ansible, as shown in this article, or with a tool like HashiCorp's Terraform, lets the admin quickly deploy and dismantle a data center infrastructure.