Using Ansible to Deploy Neo4j HA Cluster on AWS/Eucalyptus

As a follow-up to my last Neo4j, AWS/Eucalyptus blog, this entry demonstrates another great example of AWS/Eucalyptus fidelity by using Ansible to deploy a Neo4j High Available cluster.

Neo4j - Graph Database
Neo4j – Graph Database
Amazon AWS EC2
Amazon AWS EC2
Eucalyptus Systems Inc.
Eucalyptus Systems Inc.

Pre-requisites

In order to use this Ansible playbook on AWS/Eucalyptus, the following is needed:

Before deploying the cluster, a security group needs to be created that the cluster will use.  The security group must allow the following:

  • port 22 (SSH)
  • all instances part of the security group allowed to community with each other (ports 0 – 65535)

To create the security group and authorize the ports, make sure the user’s access key, secret access key, and EC2 URL are noted, and do the following:

  1. Create the security group

    ec2-create-group --aws-access-key <EC2_ACCESS_KEY> 
    --aws-secret-key <EC2_SECRET_KEY> 
    --url <EC2_URL> -g neo4j-cluster -d "Neo4j HA Cluster"

  2. Authorize port for SSH in neo4j-cluster security group

    ec2-authorize 
    --aws-access-key <EC2_ACCESS_KEY> 
    --aws-secret-key <EC2_SECRET_KEY> 
    --url <EC2_URL> -P tcp -p 22 -s 0.0.0.0/0 neo4j-cluster

  3. Authorize all port communication between cluster members 

    ec2-authorize 
    --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> 
    --url <EC2_URL> -P tcp -o neo4j-cluster -p -1 neo4j-cluster

After completing these steps, use

ec2-describe-group

to view the security group:

ec2-describe-group --aws-access-key <EC2_ACCESS_KEY> 
--aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> neo4j-cluster

GROUP sg-1cbc5777 986451091583 neo4j-cluster Neo4j HA Cluster
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 0 65535 FROM 
USER 986451091583 NAME neo4j-cluster ID sg-1cbc5777 ingress
PERMISSION 986451091583 neo4j-cluster 
ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0 ingress

Neo4j HA Cluster Deployment

Once the security group is created with the correct ports authorized, the cluster can be deployed.  To deploy the cluster, do the following:

  1. Obtain Ansible from git and setup the environment by following the instructions mentioned here – http://ansible.cc/docs/gettingstarted.html#getting-ansible
  2. Obtain the Ansible Playbook for Neo4j HA Cluster using git

    git clone https://github.com/hspencer77/ansible-neo4j-cluster.git

  3. Change directory into ansible-neo4j-cluster  

    cd ansible-neo4j-cluster

  4. Set up /etc/ansible/hosts with the following information:
    [local]
    127.0.0.1
  5. Populate vars/ec2-config with either Eucalyptus/AWS information. vars/ec2-config contains the following variables:
    keypair: <EC2/Eucalyptus Keypair>
    ec2_access_key: <EC2_ACCESS_KEY>
    ec2_secret_key: <EC2_SECRET_KEY>
    ec2_url: <EC2_URL>
    instance_type: m1.small
    security_group: <AWS/Eucalyptus Security Group>
    image: <AMI/EMI>
  6. 
    

    Execute the following command:

    ansible-playbook neo4j-cluster.yml \
     --private-key=<AWS/Eucalyptus Private Key file> --extra-vars "node_count=3"
  7. After the playbook finishes, there will be an URL provided to access the cluster – similar to the example below:
    TASK: [Display HAProxy URL] *********************
    changed: [23.22.248.75] => {"changed": true, "cmd": 
    "echo \"HAProxy URL for Neo4j -
     http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/\" ",
     "delta": "0:00:00.006835", "end": "2013-03-30 19:54:31.104320", 
    "rc": 0, "start": "2013-03-30 19:54:31.097485", "stderr": "", 
    "stdout": 
    "HAProxy URL for Neo4j - 
    http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/"}

    To view the status of cluster in the browser, open up http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/.

  8. To get the status of the cluster, use curl:
    curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' 
    http://ec2-23-22-248-75.compute-1.amazonaws.com/db/manage/server/jmx/query

Thats it!  A Neo4j HA cluster with an HA Proxy server serving as an endpoint is available to be used.   If a bigger cluster is desired, just change the

node_count

value.   For additional information regarding this playbook, and how it handles the cluster membership, please refer to the following URL – https://github.com/hspencer77/ansible-neo4j-cluster/blob/master/README.md.

Hope you enjoy!  As always, questions/comments/suggestions are always welcome.

Using Ansible to Deploy Neo4j HA Cluster on AWS/Eucalyptus

Another Great Example of AWS Fidelity – Neo4j, Cloud-Init and Eucalyptus

I recently ran across a blog entry entitled Neo4j 1.9.M01 – Self-managed HA.  I found the concept of graph databases storing data really interesting and reached out to the guys at Neo4j to get some insight on how to deploy their HA solution on Eucalyptus.   Amongst the resources that they provided,  they shared this little gem – how to deploy Neo4j on EC2.  In order to run first, you need to know how to walk – so before going down the path of standing up HA Neo4j, I decided to be influenced by the DIY on EC2 article provided by Neo4j and deploy Neo4j on Eucalyptus  – with a little help from Cloud-Init.  The follow-up blog will show how to use the same setup, and deploy an HA Neo4j environment.

The Setup

Eucalyptus

The Eucalyptus cloud I used is configured using Eucalyptus High-Availability.  Its running on CentOS 6.3, running KVM.  Its also running in Managed networking mode, so that we can take advantage of network isolation of the VMs, and the use of security groups  – interacting very much in the same way as its done in the security groups provided in AWS EC2.

Ubuntu Cloud Image – 12.04 LTS Precise Pangolin

The image that we will use is the Ubuntu 12.04 LTS Cloud image.  The reasons for using this image is as follows:

  • Ubuntu cloud images come pre-packaged with cloud-init, which helps with bootstrapping the instance.
  • I wanted to have the solution work on AWS EC2 and Eucalyptus; since Ubuntu cloud images work on both, its a great choice.

Registering the Ubuntu Cloud Image with Eucalyptus

In order for us to get started, we need to get the Ubuntu Cloud image into Eucalyptus so that we can use it for our instance.  To upload, bundle and register the Ubuntu Cloud image, ramdisk and kernel, do the following:

  1. Download current version of  Ubuntu Precise Server AMD64 from the Ubuntu Cloud Image – Precise page, then unpack (ungzip, unarchive) the tar-gzipped file.  

    $ tar -zxvf precise-server-cloudimg-amd64.tar.gz
    x precise-server-cloudimg-amd64.img
    x precise-server-cloudimg-amd64-vmlinuz-virtual
    x precise-server-cloudimg-amd64-loader
    x precise-server-cloudimg-amd64-floppy
    x README.files

  2. Make sure to download and source your Eucalyptus credentials.
  3. We need to bundle, upload, and register precise-server-cloudimg-amd64-loader (ERI), precise-server-cloudimg-amd64-vmlinuz-virtual (EKI), and precise-server-cloudimg-amd64.img (EMI).  For more information regarding this, please refer to the “Image Overview” section of the Eucalyptus 3.1 User Guide.  

    $ euca-bundle-image -i precise-server-cloudimg-amd64-loader --ramdisk true
    $ euca-upload-bundle -b latest-ubuntu-precise -m /tmp/precise-server-cloudimg-amd64-loader.manifest.xml
    $ euca-register -a x86_64 latest-ubuntu-precise/precise-server-cloudimg-amd64-loader.manifest.xml
    $ euca-bundle-image -i precise-server-cloudimg-amd64-vmlinuz-virtual --kernel true
    $ euca-upload-bundle -b latest-ubuntu-precise -m /tmp/precise-server-cloudimg-amd64-vmlinuz-virtual.manifest.xml
    $ euca-register -a x86_64 latest-ubuntu-precise/precise-server-cloudimg-amd64-vmlinuz-virtual.manifest.xml
    $ euca-bundle-image -i precise-server-cloudimg-amd64.img
    $ euca-upload-bundle -b latest-ubuntu-precise -m /tmp/precise-server-cloudimg-amd64.img.manifest.xml
    $ euca-register -a x86_64 latest-ubuntu-precise/precise-server-cloudimg-amd64.img.manifest.xml

After bundling, uploading and registering the ramdisk, kernel and image, the latest-ubuntu-precise bucket in Walrus should have the following images:

$ euca-describe-images | grep latest-ubuntu-precise
IMAGE eki-0F3937E9 latest-ubuntu-precise/precise-server-cloudimg-amd64-vmlinuz-virtual.manifest.xml 345590850920 available public x86_64 kernel instance-store

IMAGE emi-C1613E67 latest-ubuntu-precise/precise-server-cloudimg-amd64.img.manifest.xml 345590850920 available public x86_64 machine instance-store

IMAGE eri-0BE53BFD latest-ubuntu-precise/precise-server-cloudimg-amd64-loader.manifest.xml 345590850920 available public x86_64 ramdisk instance-store

Cloud-init Config File

Now that we have the image ready to go, we need to create a cloud-init config file to pass in using the –user-data-file option that is part of euca-run-instances.  For more examples of different cloud-init files, please refer to the cloud-init-dev/cloud-init repository on bazaar.launchpad.net.  Below is the cloud-init.config file I created for bootstrapping the instance with an install of Neo4j, using ephemeral disk for the application storage, and installing some other packages (i.e. latest euca2ools, mlocate, less, etc.). The script can be also accessed from github as well – under the eucalptus/recipes repo.

#cloud-config
apt_update: true
apt_upgrade: true
disable_root: true
package_reboot_if_required: true
packages:
 - less
 - bind9utils
 - dnsutils
 - mlocate
cloud_config_modules:
 - ssh
 - [ apt-update-upgrade, always ]
 - updates-check
 - runcmd
runcmd:
 - [ sh, -xc, "if [ -b /dev/sda2 ]; then tune2fs -L ephemeral0 /dev/sda2;elif [ -b /dev/vda2 ]; then tune2fs -L ephemeral0 /dev/vda2;elif [ -b /dev/xvda2 ]; then tune2fs -L ephemeral0 /dev/xvda2;fi" ]
 - [ sh, -xc, "mkdir -p /var/lib/neo4j" ]
 - [ sh, -xc, "mount LABEL=ephemeral0 /var/lib/neo4j" ]
 - [ sh, -xc, "if [ -z `ls /var/lib/neo4j/*` ]; then sed --in-place '$ iMETA_HOSTNAME=`curl -s http://169.254.169.254/latest/meta-data/local-hostname`\\nMETA_IP=`curl -s http://169.254.169.254/latest/meta-data/local-ipv4`\\necho ${META_IP}   ${META_HOSTNAME} >> /etc/hosts; hostname ${META_HOSTNAME}; sysctl -w kernel.hostname=${META_HOSTNAME}\\nif [ -d /var/lib/neo4j/ ]; then mount LABEL=ephemeral0 /var/lib/neo4j; service neo4j-service restart; fi' /etc/rc.local; fi" ] 
 - [ sh, -xc, "META_HOSTNAME=`curl -s http://169.254.169.254/latest/meta-data/local-hostname`; META_IP=`curl -s http://169.254.169.254/latest/meta-data/local-ipv4`; echo ${META_IP}   ${META_HOSTNAME} >> /etc/hosts" ]
 - [ sh, -xc, "META_HOSTNAME=`curl -s http://169.254.169.254/latest/meta-data/local-hostname`; hostname ${META_HOSTNAME}; sysctl -w kernel.hostname=${META_HOSTNAME}" ]
 - [ sh, -xc, "wget -O c1240596-eucalyptus-release-key.pub http://www.eucalyptus.com/sites/all/files/c1240596-eucalyptus-release-key.pub" ]
 - [ apt-key, add, c1240596-eucalyptus-release-key.pub ]
 - [ sh, -xc, "echo 'deb http://downloads.eucalyptus.com/software/euca2ools/2.1/ubuntu precise main' > /etc/apt/sources.list.d/euca2ools.list" ]
 - [ sh, -xc, "echo 'deb http://debian.neo4j.org/repo stable/' > /etc/apt/sources.list.d/neo4j.list" ]
 - [ apt-get, update ]
 - [ apt-get, install, -y, --force-yes, euca2ools ]
 - [ apt-get, install, -y, --force-yes, neo4j ]
 - [ sh, -xc, "sed --in-place 's/#org.neo4j.server.webserver.address=0.0.0.0/org.neo4j.server.webserver.address=0.0.0.0/' /etc/neo4j/neo4j-server.properties" ]
 - [ sh, -xc, "service neo4j-service restart" ]
 - [ sh, -xc, "export LANGUAGE=en_US.UTF-8" ]
 - [ sh, -xc, "export LANG=en_US.UTF-8" ]
 - [ sh, -xc, "export LC_ALL=en_US.UTF-8" ]
 - [ locale-gen, en_US.UTF-8 ]
 - [ dpkg-reconfigure, locales ]
 - [ updatedb ]
mounts:
 - [ ephemeral0, /var/lib/neo4j, auto, "defaults,noexec" ]

Now, we are ready to launch the instance.

Putting It All Together

Before launching the instance, we need to set up our keypair and security group that we will use with the instance.

  1. To create a keypair, run euca-create-keypair.  *NOTE* Make sure you change the permissions of the keypair to 0600 after its been created.

    euca-create-keypair  neo4j-user > neo4j-user.priv; chmod 0600 neo4j-user.priv

  2. Next, we need to create a security group for our instance.  To create a security group, use euca-create-group.  To open any ports you  need for the application, use euca-authorize.  The ports we will open up for the Neo4j application are SSH (22), ICMP, HTTP( 7474), and HTTPS (7473).
    • Create security group:

      # euca-create-group neo4j-test -d "Security for Neo4j Instances"

    • Authorize SSH:

      # euca-authorize -P tcp -p 22 -s 0.0.0.0/0 neo4j-test

    • Authorize HTTP:

      # euca-authorize -P tcp -p 7474 -s 0.0.0.0/0 neo4j-test

    • Authorize HTTPS:

      # euca-authorize -P tcp -p 7473 -s 0.0.0.0/0 neo4j-test

    • Authorize ICMP:

      # euca-authorize -P icmp -t -1:-1 -s 0.0.0.0/0 neo4j-test

  3. Finally, we use euca-run-instances to launch the Ubuntu Precise image, and use cloud-init to install Neo4j:

    # euca-run-instances -k neo4j-user --user-data-file cloud-init-neo4j.config emi-C1613E67 --kernel eki-0F3937E9 --ramdisk eri-0BE53BFD --group neo4j-test

To check the status of the instance, use euca-describe-instances.

# euca-describe-instances i-A9EF448C
RESERVATION r-ED8E4699 345590850920 neo4j-test
INSTANCE i-A9EF448C emi-C1613E67 euca-192-168-55-104.wu-tang.euca-hasp.eucalyptus-systems.com 
euca-10-106-69-154.wu-tang.internal running admin 0 m1.small 2012-12-04T03:13:13.869Z 
enter-the-wu eki-0F3937E9 eri-0BE53BFD monitoring-disable 
euca-192-168-55-104.wu-tang.euca-hasp.eucalyptus-systems.com euca-10-106-69-154.wu-tang.internal instance-store

Because I added in the cloud-init config file to do an “apt-get upgrade”, it takes about 5 to 7 minutes until the instance is fully configured and Neo4j is running.  Once you have it running, go to https://<ip-address of instance>:7473.  It will direct you to the web administration page for monitoring and management of the Neo4j instance.  In this example, the URL will be https://euca-192-168-55-104.wu-tang.euca-hasp.eucalyptus-systems.com:7473

Neo4j Monitoring and Management Tool
Neo4j Monitoring and Management Tool

Thats it!  The cool thing about this too, is that you can find an Ubuntu Precise AMI on AWS EC2, use the same cloud-init script, use euca2ools, and follow these instructions to get the same deployment on AWS EC2.

As mentioned before, the follow-up blog will be how to deploy the HA solution of Neo4j on Eucalyptus. Enjoy!

Another Great Example of AWS Fidelity – Neo4j, Cloud-Init and Eucalyptus