Cloud Image Management on Eucalyptus: Creating a CentOS 6.6 EMI With ZFS Support

ZFS is a filesystem designed by Sun Microsystems that focuses on data integrity.  What makes this such an attractive filesystem to use in the cloud is that a cloud user can easily do the following:

  • set up the equivalent of an LVM + RAID filesystem for storing large amounts of data (e.g. database information)
  • expand the filesystem by adding more storage (i.e. EBS volumes)
  • back up the filesystem without taking it offline or unmounting it
  • restore the filesystem

This blog entry will focus on how a cloud user can create their own Eucalyptus Machine Image (EMI) that has ZFS support.  The CentOS 6.5 EMI on the Eucalyptus Machine Image Catalog will be used as the base image.

Before Starting…

Before following the steps in this blog, make sure the following is in place:

Once these requirements have been met, everything should be ready to go.

Set Up Base Image/Instance

To begin, follow the ‘Quick Start’ instructions mentioned on the Eucalyptus Machine Image Catalog page.  This will install all the images provided by the catalog.  When the process has finished, list the CentOS 6.5 EMI.  For example:

# euca-describe-images emi-bdcec010 
IMAGE emi-bdcec010 centos-6.5-x86_64-20140917/centos.raw.manifest.xml 094999295155 available public x86_64 machine instance-store hvm

Once the CentOS 6.5 EMI has been listed, launch an instance from the EMI.  For example:

# euca-run-instances -k account2-user11 -t m1.medium emi-bdcec010 
RESERVATION r-a22f0201 325271821652 default
INSTANCE i-b9fccf9f emi-bdcec010 pending account2-user11 0 m1.medium 2014-12-03T22:52:41.522Z Honest monitoring-disabled 0.0.0.0 0.0.0.0 instance-store hvm sg-6ef9907f x86_64
# euca-describe-instances i-b9fccf9f
RESERVATION r-a22f0201 325271821652 default
INSTANCE i-b9fccf9f emi-bdcec010 euca-10-104-7-15.future.future.euca-hasp.cs.prc.eucalyptus-systems.com euca-172-17-248-178.future.internal running account2-user11 0 m1.medium 2014-12-03T22:52:41.522Z Honest monitoring-disabled 10.104.7.15 172.17.248.178 instance-store hvm sg-6ef9907f x86_64

Once the instance is running, it's ready to be customized.

Adding ZFS Support to the Instance

Now that the instance is running, SSH into the instance so the following ZFS repository can be added:

[root@odc-f-13 ~]# ssh -i account2-user11.priv root@euca-10-104-7-15.future.future.euca-hasp.cs.prc.eucalyptus-systems.com
[root@euca-172-17-248-178 ~]# yum localinstall --nogpgcheck https://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@euca-172-17-248-178 ~]# yum localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el6.noarch.rpm
[root@euca-172-17-248-178 ~]# yum upgrade -y
[root@euca-172-17-248-178 ~]# yum install kernel-devel zfs -y

After all the packages have been installed, reboot the instance:

[root@euca-172-17-248-178 ~]# reboot

Preparing the Instance For EMI Creation

After rebooting the instance, SSH back into the instance and prepare the instance for EMI creation.  First, load the zfs module:

[root@odc-f-13 ~]# ssh -i account2-user11.priv root@euca-10-104-7-15.future.future.euca-hasp.cs.prc.eucalyptus-systems.com
[root@euca-172-17-248-178 ~]# modprobe zfs
[root@euca-172-17-248-178 ~]# lsmod | grep zfs
zfs 1195522 0
zcommon 46278 1 zfs
znvpair 80974 2 zfs,zcommon
zavl 6925 1 zfs
zunicode 323159 1 zfs
spl 266655 5 zfs,zcommon,znvpair,zavl,zunicode

After confirming that the ZFS module is loaded, clear the network udev rules, and confirm PERSISTENT_DHCLIENT is set to “yes” in the /etc/sysconfig/network-scripts/ifcfg-eth0 file:

[root@euca-172-17-248-178 ~]# echo "" > /etc/udev/rules.d/70-persistent-net.rules
[root@euca-172-17-248-178 ~]# echo "" > /lib/udev/rules.d/75-persistent-net-generator.rules
[root@euca-172-17-248-178 ~]# echo "PERSISTENT_DHCLIENT=yes" >> /etc/sysconfig/network-scripts/ifcfg-eth0

Confirm that the instance has been upgraded to CentOS 6.6, then exit the instance.

[root@euca-172-17-248-178 ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)
[root@euca-172-17-248-178 ~]# exit

Create the CentOS 6.6 EMI with ZFS Support

The instance is now ready to be bundled.  Bundle the instance using the euca-bundle-instance command.  This command was originally intended for bundling Windows instances, but Eucalyptus extended it to work with Linux instances as well.  Use euca-describe-bundle-tasks to monitor the bundling status:

[root@odc-f-13 ~]# euca-bundle-instance --bucket centos6.6-zfs --prefix centos6.6-zfs i-b9fccf9f
BUNDLE bun-b9fccf9f i-b9fccf9f centos6.6-zfs centos6.6-zfs 2014-12-03T23:54:51.644Z 2014-12-03T23:54:51.644Z pending 0 centos6.6-zfs/centos6.6-zfs.manifest.xml
..
[root@odc-f-13 ~]# euca-describe-bundle-tasks
BUNDLE bun-b9fccf9f i-b9fccf9f centos6.6-zfs centos6.6-zfs 2014-12-03T23:54:51.644Z 2014-12-03T23:57:37.517Z complete 0 centos6.6-zfs/centos6.6-zfs.manifest.xml

Once the bundle task completes, register the instance store-backed HVM image using the euca-register command:

[root@odc-f-13 ~]# euca-register -a x86_64 -n centos6.6-zfs centos6.6-zfs/centos6.6-zfs.manifest.xml --virtualization-type hvm 
IMAGE emi-5e63f02c

The custom image has been registered.  Now let's test it out.

ZFS Test

To test the image out, we will do the following:

  • Launch an instance from the new EMI
  • Create 5 volumes and attach them to the instance
  • Create a ZFS storage pool and dataset

To launch the instance, use the euca-run-instances command.  To create the 5 EBS volumes, use the euca-create-volume command.  After the volumes are created, use euca-attach-volume to attach them to the instance.  Once the volumes are attached, the output of euca-describe-instances should look similar to the following:

# euca-describe-instances i-0cd3b6b8
RESERVATION r-cf7c5c73 325271821652 default
INSTANCE i-0cd3b6b8 emi-5e63f02c euca-10-104-7-3.future.future.euca-hasp.cs.prc.eucalyptus-systems.com euca-172-17-248-184.future.internal running account2-user11 0 m1.medium 2014-12-04T00:16:52.887Z Honest monitoring-disabled 10.104.7.3 172.17.248.184 instance-store hvm sg-6ef9907f x86_64
BLOCKDEVICE /dev/sdd vol-a23cfb1f 2014-12-04T01:45:59.730Z false
BLOCKDEVICE /dev/sdh vol-a27b75a5 2014-12-04T01:47:31.162Z false
BLOCKDEVICE /dev/sdf vol-2a971204 2014-12-04T01:46:54.575Z false
BLOCKDEVICE /dev/sdg vol-b33e9890 2014-12-04T01:47:13.346Z false
BLOCKDEVICE /dev/sde vol-dcc8b6ac 2014-12-04T01:46:15.011Z false
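
The volume creation and attachment commands themselves were not shown above.  A rough sketch of them is below; the availability zone name (one-zone) is an assumption for this cloud, euca-create-volume is run once per volume (5 GiB each), and each euca-attach-volume call pairs the instance with one of the volume IDs listed in the output above:

# euca-create-volume -s 5 -z one-zone
# euca-attach-volume -i i-0cd3b6b8 -d /dev/sdd vol-a23cfb1f
# euca-attach-volume -i i-0cd3b6b8 -d /dev/sde vol-dcc8b6ac
# euca-attach-volume -i i-0cd3b6b8 -d /dev/sdf vol-2a971204
# euca-attach-volume -i i-0cd3b6b8 -d /dev/sdg vol-b33e9890
# euca-attach-volume -i i-0cd3b6b8 -d /dev/sdh vol-a27b75a5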

SSH into the instance and check what block devices are associated with the EBS volumes using the lsblk command:

# ssh -i account2-user11.priv root@euca-10-104-7-3.future.future.euca-hasp.cs.prc.eucalyptus-systems.com
[root@euca-172-17-248-184 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 4.9G 0 disk
├─vda1 252:1 0 500M 0 part /boot
└─vda2 252:2 0 4.4G 0 part
 ├─VolGroup-lv_root (dm-0) 253:0 0 3.9G 0 lvm /
 └─VolGroup-lv_swap (dm-1) 253:1 0 500M 0 lvm [SWAP]
vdb 252:16 0 5.1G 0 disk
vdc 252:32 0 5G 0 disk
vdd 252:48 0 5G 0 disk
vde 252:64 0 5G 0 disk
vdf 252:80 0 5G 0 disk
vdg 252:96 0 5G 0 disk

The EBS volumes are /dev/vdc, /dev/vdd, /dev/vde, /dev/vdf, and /dev/vdg.  Use these devices to create the ZFS storage pool by using the zpool command:

[root@euca-172-17-248-184 ~]# zpool create -f app-pool vdc vdd vde vdf vdg
[root@euca-172-17-248-184 ~]# zpool status
 pool: app-pool
 state: ONLINE
 scan: none requested
config:
 NAME STATE READ WRITE CKSUM
 app-pool ONLINE 0 0 0
 vdc1 ONLINE 0 0 0
 vdd1 ONLINE 0 0 0
 vde1 ONLINE 0 0 0
 vdf1 ONLINE 0 0 0
 vdg1 ONLINE 0 0 0
errors: No known data errors
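
As promised in the bullet list at the top of this post, the pool can also be grown online later by attaching another EBS volume and adding it to the pool.  A quick sketch, assuming the new volume shows up inside the instance as /dev/vdh:

[root@euca-172-17-248-184 ~]# zpool add app-pool vdh
[root@euca-172-17-248-184 ~]# zpool list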

Next, we need to create a ZFS dataset.  For this example, this instance will end up being a MySQL server, so we will create a dataset for storing the MySQL data.

[root@euca-172-17-248-184 ~]# zfs create app-pool/mysql
[root@euca-172-17-248-184 ~]# zfs list
NAME USED AVAIL REFER MOUNTPOINT
app-pool 152K 24.5G 30K /app-pool
app-pool/mysql 30K 24.5G 30K /app-pool/mysql

The mount point of the dataset can be adjusted by setting the mountpoint option:

[root@euca-172-17-248-184 ~]# zfs set mountpoint=/opt/mysql app-pool/mysql
[root@euca-172-17-248-184 ~]# zfs list
NAME USED AVAIL REFER MOUNTPOINT
app-pool 162K 24.5G 31K /app-pool
app-pool/mysql 30K 24.5G 30K /opt/mysql

That's it!  Notice how this only required two commands to set up the equivalent of an LVM + RAID filesystem, compared to around seven commands using mdadm, pvcreate, vgcreate, mkfs, mkdir and mount.  The instance is now ready to utilize the ZFS filesystem for the MySQL server.
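
For comparison, here is a rough sketch of what the traditional route would look like for the same five devices.  This is illustrative only; the RAID level, volume group name, logical volume name and mount point are arbitrary choices, and lvcreate is needed in addition to the commands listed above:

[root@euca-172-17-248-184 ~]# mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/vdc /dev/vdd /dev/vde /dev/vdf /dev/vdg
[root@euca-172-17-248-184 ~]# pvcreate /dev/md0
[root@euca-172-17-248-184 ~]# vgcreate app-vg /dev/md0
[root@euca-172-17-248-184 ~]# lvcreate -l 100%FREE -n mysql app-vg
[root@euca-172-17-248-184 ~]# mkfs.ext4 /dev/app-vg/mysql
[root@euca-172-17-248-184 ~]# mkdir -p /opt/mysql
[root@euca-172-17-248-184 ~]# mount /dev/app-vg/mysql /opt/mysql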

Online Backup Example to OSG Bucket using s3cmd

As mentioned earlier, a slick feature of using ZFS is being able to perform backups online.  This section will show the following:

  • Set up and configure s3cmd
  • Create a ZFS snapshot, and use ZFS send with s3cmd to place the snapshot on an OSG bucket

To get started, in the instance, install the following packages:

[root@euca-172-17-248-184 ~]# yum install -y git python-dateutil.noarch xz

Next, clone the s3tools/s3cmd repository from Github:

[root@euca-172-17-248-184 ~]# git clone https://github.com/s3tools/s3cmd.git

If the instance was launched with an instance profile that assumes a role with OSG (S3) API access, s3cmd will pick up the temporary credentials and token through the Eucalyptus instance metadata service, as if the instance was launched on AWS EC2.  This wasn’t the case here, so we need to provide the Access Key ID and Secret Key manually:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: AKIRAGCHAGFE6IIX9BYF
Secret Key: GMdrL97AqcybhfyyxOpNmVUnBtiMenag3ju82L7L

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
 Access Key: AKIRAGCHAGFE6IIX9BYF
 Secret Key: GMdrL97AqcybhfyyxOpNmVUnBtiMenag3ju82L7L
 Encryption password:
 Path to GPG program: /usr/bin/gpg
 Use HTTPS protocol: False
 HTTP Proxy server name:
 HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] n
Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'

Edit the .s3cfg file so that s3cmd points to the OSG on your Eucalyptus 4.0.2 cloud.  For example, change the following:

host_base = s3.amazonaws.com

to

host_base = objectstorage.future.euca-hasp.cs.prc.eucalyptus-systems.com:8773

and

host_bucket = %(bucket)s.s3.amazonaws.com

to

host_bucket = %(bucket)s.objectstorage.future.euca-hasp.cs.prc.eucalyptus-systems.com:8773
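
Both values can be changed with a single sed command against the configuration file written earlier.  A sketch, using the endpoint for this example cloud:

[root@euca-172-17-248-184 ~]# sed -i 's/s3\.amazonaws\.com/objectstorage.future.euca-hasp.cs.prc.eucalyptus-systems.com:8773/g' /root/.s3cfg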

Confirm that s3cmd is configured correctly.  For example:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd ls
2014-11-05 21:45 s3://centos-images
2014-12-03 23:54 s3://centos6.6-zfs
2014-10-08 01:50 s3://instance-profile-testing
2014-12-01 22:27 s3://mongodb-snapshots
2014-10-10 20:01 s3://new-ubuntu-bundled-image
2014-09-17 18:31 s3://s3cmd-testing
2014-09-30 01:58 s3://ubuntu-bundled-vol
2014-10-22 14:47 s3://ubuntu-docker-template
2014-10-08 13:39 s3://ubuntu-images
2014-10-02 01:42 s3://ubuntu-trusty-imported-20141001
2014-10-30 18:25 s3://ubuntu-trusty-imported-20141030
2014-10-29 02:18 s3://ubuntu-trusty-server-10282014
2014-10-01 00:28 s3://wrong-s3-url-test

To perform a ZFS snapshot of the app-pool/mysql dataset, do the following:

[root@euca-172-17-248-184 ~]# zfs snapshot app-pool/mysql@wednesday
[root@euca-172-17-248-184 ~]# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
app-pool/mysql@wednesday 0 - 30K -

After creating a bucket for the backup, send the ZFS snapshot to the bucket:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd mb s3://mysql-backups
[root@euca-172-17-248-184 ~]# zfs send app-pool/mysql@wednesday | xz | ./s3cmd/s3cmd put - s3://mysql-backups/mysql-backup-wednesday.img.xz
<stdin> -> s3://mysql-backups/mysql-backup-wednesday.img.xz [part 1, 1440B]
 1440 of 1440 100% in 2s 561.67 B/s done

To confirm that the snapshot is in the bucket, use s3cmd:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd ls s3://mysql-backups
2014-12-04 02:22 1440 s3://mysql-backups/mysql-backup-wednesday.img.xz
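
Restoring works the same way in reverse.  A minimal sketch, assuming the backup should be received into a new dataset named app-pool/mysql-restored:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd get s3://mysql-backups/mysql-backup-wednesday.img.xz
[root@euca-172-17-248-184 ~]# xz -dc mysql-backup-wednesday.img.xz | zfs receive app-pool/mysql-restored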

That's all, folks.  We have successfully created a CentOS 6.6 EMI with ZFS support.  For more information regarding ZFS (and the inspirations for this blog), check out the following resources:


12 Steps To EBS-Backed EMI Bliss on Eucalyptus

In previous posts, I shared how to use Ubuntu Cloud Images and eustore with Eucalyptus and AWS.  This blog entry will focus on how to use these assets to create EBS-backed EMIs in 12 steps.  These steps can be used on AWS as well, except that there is no need to create an instance store-backed AMI first: Ubuntu already provides AMIs that can be used as the building-block instance on AWS.  Let's get started.

Prerequisites

On both Eucalyptus and AWS, the user must have an appropriate IAM policy in order to perform these steps.  The policy should contain the following EC2 Actions at a minimum:

  • RunInstances
  • AttachVolume
  • AuthorizeSecurityGroupEgress
  • AuthorizeSecurityGroupIngress
  • CreateKeyPair
  • CreateSnapshot
  • CreateVolume
  • DescribeImages
  • DescribeInstances
  • DescribeInstanceStatus
  • DescribeSnapshots
  • DetachVolume
  • RegisterImage

In addition, the user needs an access key ID and secret key.  For more information, check out the following resources:

This entry also assumes that the Eucalyptus euca2ools are installed on the client machine.

The 12 Steps

Although the Ubuntu Cloud Image used in this entry is Ubuntu Precise (12.04) LTS, any of the maintained Ubuntu Cloud Images can be used.

  1. Use wget to download tar-gzipped precise-server-cloudimg:
    $ wget http://cloud-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64.tar.gz
  2. After setting the EC2_ACCESS_KEY, EC2_SECRET_KEY, and EC2_URL, use eustore-install-image to create an instance store-backed EMI:
    $ eustore-install-image -t precise-server-cloudimg-amd64.tar.gz \
    -b ubuntu-latest-precise-x86_64 --hypervisor universal \
    -s "Ubuntu Cloud Image - Precise Pangolin - 12.04 LTS"
  3. Create a keypair using euca-create-keypair (see the sketch after this list), then use euca-run-instances to launch an instance from the EMI returned by eustore-install-image. For example:
    $ euca-run-instances -t m1.medium \
    -k account1-user01 emi-5C8C3909
  4. Use euca-create-volume to create a volume sized to match how big you want the root filesystem to be.  The availability zone (-z option) will depend on whether you are using Eucalyptus or AWS:
    $ euca-create-volume -s 6 \
    -z LayinDaSmackDown
  5. Using euca-attach-volume, attach the resulting volume to the running instance. For example:
    $ euca-attach-volume -d /dev/vdd \
    -i i-839E3FB0 vol-B5863B3B
  6. Use euca-authorize to open SSH access to the instance (see the sketch after this list), SSH into the instance, then use wget to download the Ubuntu Precise Cloud Image (qcow2 format):
    $ ssh -i account1-user01.priv ubuntu@euca-10-104-7-10.eucalyptus.euca-hasp.eucalyptus-systems.com
    # sudo -s
    # wget http://cloud-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img
  7. Install qemu-utils:
    # apt-get install -y qemu-utils
  8. Use qemu-img to convert image from qcow2 to raw:
    # qemu-img convert \
    -O raw precise-server-cloudimg-amd64-disk1.img precise-server-cloudimg-amd64-disk1-raw.img
  9. Use dd to write the raw image to the block device where the volume is attached (dmesg makes it easy to figure out which device that is):
    # dmesg | tail
    [ 7026.943212] virtio-pci 0000:00:05.0: using default PCI settings
    [ 7026.943249] pci 0000:00:07.0: no hotplug settings from platform
    [ 7026.943251] pci 0000:00:07.0: using default PCI settings
    [ 7026.945964] virtio-pci 0000:00:07.0: enabling device (0000 -> 0003)
    [ 7026.955143] virtio-pci 0000:00:07.0: PCI INT A -> Link[LNKC] -> GSI 10 (level, high) -> IRQ 10
    [ 7026.955180] virtio-pci 0000:00:07.0: setting latency timer to 64
    [ 7026.955429] virtio-pci 0000:00:07.0: irq 45 for MSI/MSI-X
    [ 7026.955456] virtio-pci 0000:00:07.0: irq 46 for MSI/MSI-X
    [ 7026.986990] vdb: unknown partition table
    [10447.093426] virtio-pci 0000:00:07.0: PCI INT A disabled
    # dd if=/mnt/precise-server-cloudimg-amd64-disk1-raw.img of=/dev/vdb bs=1M
  10. Log out of the instance, and use euca-detach-volume to detach the volume:
    $ euca-detach-volume vol-B5863B3B
  11. Use euca-create-snapshot to create a snapshot of the volume:
    $ euca-create-snapshot vol-B5863B3B
  12. Use euca-register to register the resulting snapshot to create the EBS-backed EMI:
    $ euca-register --name ebs-precise-x86_64-sda \
    --snapshot snap-EFDB40A1 --root-device-name /dev/sda
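
Steps 3 and 6 mention euca-create-keypair and euca-authorize without showing them.  A minimal sketch of those two commands is below; the keypair name matches the one used in step 3, while the security group name (default, since no group was specified at launch) and the 0.0.0.0/0 CIDR are assumptions:

    $ euca-create-keypair account1-user01 > account1-user01.priv
    $ chmod 0600 account1-user01.priv
    $ euca-authorize -P tcp -p 22 -s 0.0.0.0/0 default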

That's it!  You have successfully created an EBS-backed EMI/AMI.  As mentioned earlier, these steps can be used on AWS just as well (just skip steps 1 & 2, and use one of the Ubuntu Cloud Images in the AWS region of your choice).  Enjoy!


Bind DNS + OpenLDAP MDB == Dynamic Domain and Fully Delegated Sub-Domain Configuration of DNS

This blog post was driven by the need to make it easier to test Eucalyptus DNS in a lab environment.   The goal was to have a scriptable way to add/delete fully delegated sub-domains without having to reload/restart DNS when Eucalyptus clouds were being deployed/destroyed.   This was tested on a CentOS 6 instance running in a Eucalyptus 3.3 HA Cloud.

Prerequisites

This entry will not cover setting up Eucalyptus HA, creating a Eucalyptus user, using eustore to register the image, or opening up ports in security groups.  It's assumed the reader understands these concepts.  The focus will be on configuring and deploying Bind9 and OpenLDAP.

In addition to using a CentOS 6.4 image, the following is needed:

  • ports open for DNS (tcp and udp 53)
  • port open for OpenLDAP (tcp 389)
  • port open for SSH (tcp 22)

Now that the prereqs have been covered, let's jump into setting up the environment.

Base Software Installation

A series of packages is needed to install OpenLDAP (since we are building it from source) and Bind9.  Once the instance is launched and running, SSH into the instance and run the following commands:

# sudo yum -y upgrade
# sudo yum install -y git cyrus-sasl gcc glibc-devel libtool-ltdl \
db4-devel openssl-devel unixODBC-devel libtool-ltdl-devel libtool \
cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-lib cyrus-sasl-md5 \
make bind-dyndb-ldap

All the packages except for bind-dyndb-ldap are needed to build OpenLDAP from source.  The reason we are building OpenLDAP from source is to take advantage of its powerful MDB backend.  Check out my previous blogs on this topic from this listing.

The key package that lets Bind9 DNS communicate with OpenLDAP as a backend is bind-dyndb-ldap.  This plug-in is used by the FreeIPA Identity/Policy Management application to leverage 389 Directory Server (which is based on OpenLDAP) for storing domain name information.

OpenLDAP Installation and Configuration

Since all the base packages are installed, we can now grab and install the latest source version of OpenLDAP.  While still logged into the instance, run the following command:

# git clone git://git.openldap.org/openldap.git ~/openldap

Since we are working with an instance based on the CentOS 6 image on eustore, we will use the ephemeral store (which is mounted under /media/ephemeral0) for the location of our OpenLDAP installation.  Create a directory for installing OpenLDAP on the ephemeral store by running the command below:

# mkdir /media/ephemeral0/openldap

Next, configure OpenLDAP:

# cd ~/openldap
# ./configure --prefix=/media/ephemeral0/openldap --enable-debug=yes \
--enable-syslog --enable-dynamic --enable-slapd --enable-dynacl \
--enable-spasswd --enable-modules --enable-rlookups --enable-mdb \
--enable-monitor --enable-overlays --with-cyrus-sasl --with-threads \
--with-tls=openssl CC="gcc" LDFLAGS="-L/usr/lib64/sasl2" CPPFLAGS="-I/usr/include/sasl"

After that completes successfully, compile and install OpenLDAP:

# make depend
# make
# sudo make install

After the installation is complete, create the openldap user that will be responsible for running OpenLDAP:

# sudo useradd -m -U -c "OpenLDAP User" -s /bin/bash openldap
# sudo passwd -l openldap

Since the bind-dyndb-ldap package was installed earlier, copy the schema to where OpenLDAP stores its schemas, so that it can be added to the OpenLDAP configuration:

# sudo cp /usr/share/doc/bind-dyndb-ldap-2.3/schema \
/media/ephemeral0/openldap/etc/openldap/schema/bind-dyndb-ldap.schema

Next, create the LDAP password for the cn=admin,cn=config user.  This user is responsible for managing the configuration of the OpenLDAP server using OLC:

# /media/ephemeral0/openldap/sbin/slappasswd -h {SSHA}

Modify the slapd.conf file located under /media/ephemeral0/openldap/etc/openldap/, to set up the base configuration structure (cn=config) for OpenLDAP.  When completed, it should look like the following:

#######################################################################
# Config database definitions
#######################################################################
pidfile /media/ephemeral0/openldap/var/run/slapd.pid
argsfile /media/ephemeral0/openldap/var/run/slapd.args
database config
rootdn cn=admin,cn=config
rootpw {SSHA}xxxxxxxxxxxxxxxxxxxxxxxx - (password created for cn=admin,cn=config user)
# Schemas, in order
include /media/ephemeral0/openldap/etc/openldap/schema/core.schema
include /media/ephemeral0/openldap/etc/openldap/schema/cosine.schema
include /media/ephemeral0/openldap/etc/openldap/schema/inetorgperson.schema
include /media/ephemeral0/openldap/etc/openldap/schema/collective.schema
include /media/ephemeral0/openldap/etc/openldap/schema/corba.schema
include /media/ephemeral0/openldap/etc/openldap/schema/duaconf.schema
include /media/ephemeral0/openldap/etc/openldap/schema/dyngroup.schema
include /media/ephemeral0/openldap/etc/openldap/schema/misc.schema
include /media/ephemeral0/openldap/etc/openldap/schema/nis.schema
include /media/ephemeral0/openldap/etc/openldap/schema/openldap.schema
include /media/ephemeral0/openldap/etc/openldap/schema/ppolicy.schema
include /media/ephemeral0/openldap/etc/openldap/schema/bind-dyndb-ldap.schema

Create the slapd.d directory under /media/ephemeral0/openldap/etc/openldap/.  This will contain all the directory information:

# sudo chown -R openldap:openldap /media/ephemeral0/openldap/* 
# su - openldap -c "mkdir /media/ephemeral0/openldap/etc/openldap/slapd.d"

Populate the slapd.d directory with the base configuration by running the following command:

su - openldap -c "/media/ephemeral0/openldap/sbin/slaptest \
-f /media/ephemeral0/openldap/etc/openldap/slapd.conf \
-F /media/ephemeral0/openldap/etc/openldap/slapd.d"
config file testing succeeded

For this example, we will be setting up the directory to use dc=eucalyptus,dc=com as the LDAP base.  Create the cn=Directory Manager,dc=eucalyptus,dc=com LDAP password:

# /media/ephemeral0/openldap/sbin/slappasswd -h {SSHA}

Create an LDIF that defines the configuration of the database that will hold the DNS entries.  For this example, the LDIF will be called directory-layout.ldif.  It should look like the following:

#######################################################################
# MDB database definitions
#######################################################################
#
dn: olcDatabase=mdb,cn=config
changetype: add
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: mdb
olcSuffix: dc=eucalyptus,dc=com
olcRootDN: cn=Directory Manager,dc=eucalyptus,dc=com
olcRootPW: {SSHA}xxxx - (password of cn=Directory Manager,dc=eucalyptus,dc=com user)
olcDbDirectory: /media/ephemeral0/openldap/var/openldap-data/dns
olcDbIndex: objectClass eq
olcAccess: to attrs=userPassword by dn="cn=Directory Manager,dc=eucalyptus,dc=com"
 write by anonymous auth by self write by * none
olcAccess: to attrs=shadowLastChange by self write by * read
olcAccess: to dn.base="" by * read
olcAccess: to * by dn="cn=Directory Manager,dc=eucalyptus,dc=com" write by * read
olcDbMaxReaders: 0
olcDbMode: 0600
olcDbSearchStack: 16
olcDbMaxSize: 4294967296
olcAddContentAcl: FALSE
olcLastMod: TRUE
olcMaxDerefDepth: 15
olcReadOnly: FALSE
olcSyncUseSubentry: FALSE
olcMonitoring: TRUE
olcDbNoSync: FALSE
olcDbEnvFlags: writemap
olcDbEnvFlags: nometasync

Make sure to create the directory where the DB information will be stored:

# su - openldap -c "mkdir /media/ephemeral0/openldap/var/openldap-data/dns"

Start up the OpenLDAP directory:

# sudo /media/ephemeral0/openldap/libexec/slapd -h "ldap:/// ldapi:///" \
-u openldap -g openldap

After OpenLDAP has been started successfully,  upload the directory-layout.ldif as the cn=admin,cn=config user:

# /media/ephemeral0/openldap/bin/ldapadd -D cn=admin,cn=config -W \
-f directory-layout.ldif
Enter LDAP Password:
adding new entry "olcDatabase=mdb,cn=config"

To allow search access to the directory, create an LDIF called frontend.ldif, that contains the following:

dn: olcDatabase={-1}frontend,cn=config
changetype: modify
replace: olcAccess
olcAccess: to dn.base="" by * read
olcAccess: to dn.base="cn=Subschema" by * read
olcAccess: to * by self write by users read by anonymous auth

Upload the LDIF using the ldapmodify command:

# /media/ephemeral0/openldap/bin/ldapmodify -D cn=admin,cn=config -W -f frontend.ldif
Enter LDAP Password:
modifying entry "olcDatabase={-1}frontend,cn=config"

To check the results of these changes, use ldapsearch:

# /media/ephemeral0/openldap/bin/ldapsearch -D cn=admin,cn=config -W -b cn=config
Enter LDAP Password:
(ldapsearch results...)

After confirming that the base configurations have been stored, create an LDIF called dns-domain.ldif that lays out the directory structure for the database.  As seen previously in the directory-layout.ldif, the base is dc=eucalyptus,dc=com. The dns-domain.ldif file should look like the following:

dn: dc=eucalyptus,dc=com
objectClass: top
objectClass: dcObject
objectclass: organization
o: Eucalyptus Systems Inc - QA DNS Domain
dc: eucalyptus
description: Test LDAP+DNS Setup
 
dn: ou=dns,dc=eucalyptus,dc=com
objectClass: organizationalUnit
ou: dns

After creating the dns-domain.ldif file, upload the file using the cn=Directory Manager,dc=eucalyptus,dc=com user:

# /media/ephemeral0/openldap/bin/ldapadd -H ldap://localhost \
-D "cn=Directory Manager,dc=eucalyptus,dc=com" -W -f dns-domain.ldif
Enter LDAP Password:
adding new entry "dc=eucalyptus,dc=com"
adding new entry "ou=dns,dc=eucalyptus,dc=com"

To allow the cn=Directory Manager,dc=eucalyptus,dc=com user to see what updates are being done to the Directory, enable the Access Log overlay.  To enable this option, create an LDIF file called access-log.ldif.  The contents should look like the following:

dn: olcDatabase={2}mdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcMdbConfig
olcDatabase: {2}mdb
olcDbDirectory: /media/ephemeral0/openldap/var/openldap-data/access
olcSuffix: cn=log
olcDbIndex: reqStart eq
olcDbMaxSize: 1073741824
olcDbMode: 0600
olcAccess: {1}to * by dn="cn=Directory Manager,dc=eucalyptus,dc=com" read

dn: olcOverlay={1}accesslog,olcDatabase={3}mdb,cn=config
objectClass: olcOverlayConfig
objectClass: olcAccessLogConfig
olcOverlay: {1}accesslog
olcAccessLogDB: cn=log
olcAccessLogOps: all
olcAccessLogPurge: 7+00:00 1+00:00
olcAccessLogSuccess: TRUE
olcAccessLogOld: (objectclass=idnsRecord)

After creating the access-log.ldif, create the directory for storing the access database:

# su - openldap -c "mkdir /media/ephemeral0/openldap/var/openldap-data/access"

Upload the LDIF using the cn=admin,cn=config user:

# /media/ephemeral0/openldap/bin/ldapadd -D cn=admin,cn=config -W -f access-log.ldif

Now that OpenLDAP is ready to go, let’s work on configuring Bind9 DNS.

Bind9 DNS Configuration

Since bind-dyndb-ldap has bind as a dependency, named has already been installed on the instance.  The only thing left to do is edit /etc/named.conf so that we are able to use the dynamic LDAP backend module.  Edit /etc/named.conf so that it looks like the following:

options {
 listen-on port 53 { <private IP address of the instance>; };
 listen-on-v6 port 53 { ::1; };
 directory "/var/named";
 dump-file "/var/named/data/cache_dump.db";
 statistics-file "/var/named/data/named_stats.txt";
 memstatistics-file "/var/named/data/named_mem_stats.txt";
 recursion yes;

 dnssec-enable yes;
 dnssec-validation yes;
 dnssec-lookaside auto;

/* Path to ISC DLV key */
 bindkeys-file "/etc/named.iscdlv.key";
managed-keys-directory "/var/named/dynamic";
 allow-recursion { any; };
};

dynamic-db "qa_dns_test" {
 library "ldap.so";
 arg "uri ldap://localhost";
 arg "base ou=dns,dc=eucalyptus, dc=com";
 arg "auth_method none";
 arg "cache_ttl 10";
 arg "zone_refresh 1";
 arg "dyn_update yes";
};

logging {
 channel default_debug {
 file "data/named.run";
 severity debug;
 print-time yes;
 };
};

zone "." IN {
 type hint;
 file "named.ca";
};

include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";

As seen above, the dynamic-db section is the configuration for connecting to the OpenLDAP server.  For more advanced configurations, please reference the README in the bind-dyndb-ldap repository on git.fedorahosted.org.

Now we are ready to start named (the Bind DNS server).  Before starting the server, make sure to create the rndc key, then start named:

# rndc-confgen -a -r /dev/urandom
# service named start

You have successfully created a Bind9 DNS + OpenLDAP deployment.  Let's run a quick test.

Test the Deployment

To test the deployment, I created an LDIF called test-cloud.ldif.  The configuration sets up a domain called eucalyptus-systems.com.  It also creates a sub-domain that forwards requests for euca-hasp.eucalyptus-systems.com to the CLCs of the Eucalyptus HA deployment that has been set up.  The contents of the file are as follows:

dn: idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: top
objectClass: idnsZone
objectClass: idnsRecord
idnsName: eucalyptus-systems.com
idnsUpdatePolicy: grant EUCALYPTUS-SYSTEMS.COM krb5-self * A;
idnsZoneActive: TRUE
idnsSOAmName: server.eucalyptus-systems.com
idnsSOArName: root.server.eucalyptus-systems.com
idnsAllowQuery: any;
idnsAllowDynUpdate: TRUE
idnsSOAserial: 1
idnsSOArefresh: 10800
idnsSOAretry: 900
idnsSOAexpire: 604800
idnsSOAminimum: 86400
NSRecord: ns
ARecord: 192.168.55.103 - (the public IP of the instance)

dn: idnsName=ns,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsRecord
objectClass: top
idnsName: ns
aRecord: 192.168.55.103

dn: idnsName=server,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsRecord
objectClass: top
idnsName: server
CNAMERecord: eucalyptus-systems.com.

dn: idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com
objectClass: idnszone
objectClass: idnsrecord
objectClass: top
idnsName: 39.168.192.in-addr.arpa.
idnsSOAmName: server.eucalyptus-systems.com
idnsSOArName: root.server.eucalyptus-systems.com
idnsSOAserial: 1350039556
idnsSOArefresh: 10800
idnsSOAretry: 900
idnsSOAexpire: 604800
idnsSOAminimum: 86400
idnsZoneActive: TRUE
idnsAllowDynUpdate: TRUE
idnsAllowQuery: any;
idnsAllowTransfer: none;
idnsUpdatePolicy: grant EUCALYPTUS-SYSTEMS.COM krb5-subdomain 39.168.192.in-addr.arpa. PTR;
nSRecord: server.eucalyptus-systems.com.

dn: idnsName=_ldap._tcp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsRecord
objectClass: top
idnsName: _ldap._tcp
SRVRecord: 0 100 389 server

dn: idnsName=_ntp._udp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsRecord
objectClass: top
idnsName: _ntp._udp
SRVRecord: 0 100 123 server

# The DNS entries for the CLCs of the cloud - viking-01 and viking-02

dn: idnsName=viking-02,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsRecord
objectClass: top
idnsName: viking-02
aRecord: 192.168.39.102

dn: idnsname=102,idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsrecord
objectClass: top
idnsName: 102
pTRRecord: viking-02.eucalyptus-systems.com.

dn: idnsName=viking-01,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsRecord
objectClass: top
idnsName: viking-01
aRecord: 192.168.39.101

dn: idnsname=101,idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com
objectClass: idnsrecord
objectClass: top
idnsName: 101
pTRRecord: viking-01.eucalyptus-systems.com.

# The delegated zone - euca-hasp.eucalyptus-systems.com

dn: idnsName=euca-hasp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com
objectClass: top
objectClass: idnsRecord
objectClass: idnsZone
idnsForwardPolicy: first
idnsAllowDynUpdate: FALSE
idnsZoneActive: TRUE
idnsAllowQuery: any;
idnsForwarders: 192.168.39.101
idnsForwarders: 192.168.39.102
idnsName: euca-hasp
idnsSOAmName: server.eucalyptus-systems.com
idnsSOArName: root.server.eucalyptus-systems.com
idnsSOAretry: 15
idnsSOAserial: 1
idnsSOArefresh: 80
idnsSOAexpire: 120
idnsSOAminimum: 30
nSRecord: viking-01.eucalyptus-systems.com
nSRecord: viking-02.eucalyptus-systems.com

Since this was intended for a lab environment where sub-domains (and possibly domains) would be added/deleted on a regular basis, the SOA records were not set to the RFC 1912 standards defined for production DNS use.

After creating this LDIF,  it was uploaded to the LDAP server as the cn=Directory Manager,dc=eucalyptus,dc=com user:

# /media/ephemeral0/openldap/bin/ldapadd -H ldap://localhost \
-D "cn=Directory Manager,dc=eucalyptus,dc=com" -W -f test-cloud.ldif
Enter LDAP Password:
adding new entry "idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsName=ns,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsName=server,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsName=_ldap._tcp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsName=_ntp._udp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsName=viking-02,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsname=102,idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsName=viking-01,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsname=101,idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com"
adding new entry "idnsName=euca-hasp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"

To test out the setup, lookups were run against the public IP address of the instance to resolve the various entries:

# nslookup viking-01.eucalyptus-systems.com 192.168.55.103
Server: 192.168.55.103
Address: 192.168.55.103#53

Name: viking-01.eucalyptus-systems.com
Address: 192.168.39.101

# nslookup 192.168.39.101 192.168.55.103
Server: 192.168.55.103
Address: 192.168.55.103#53

101.39.168.192.in-addr.arpa name = viking-01.eucalyptus-systems.com.

# nslookup eucalyptus.euca-hasp.eucalyptus-systems.com 192.168.55.103
Server: 192.168.55.103
Address: 192.168.55.103#53

Non-authoritative answer:
Name: eucalyptus.euca-hasp.eucalyptus-systems.com
Address: 192.168.39.102

# nslookup walrus.euca-hasp.eucalyptus-systems.com 192.168.55.103
Server: 192.168.55.103
Address: 192.168.55.103#53

Non-authoritative answer:
Name: walrus.euca-hasp.eucalyptus-systems.com
Address: 192.168.39.101

As seen above, not only did the resolution come back correct for the machines under the eucalyptus-systems.com domain, but the requests for the hosts under euca-hasp.eucalyptus-systems.com were also forwarded correctly and returned the correct responses.

To delete the setup, create an LDIF called delete-test-cloud.ldif from test-cloud.ldif as follows:

# tac test-cloud.ldif | grep dn: > delete-test-cloud.ldif

Open up delete-test-cloud.ldif and add the following lines after each dn: line:

changetype: delete
(empty line)
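
Since every line in delete-test-cloud.ldif is a dn: line (that is how the file was generated above), this edit can also be scripted.  A sketch using GNU sed, which appends the two lines after every entry:

# sed -i 's/$/\nchangetype: delete\n/' delete-test-cloud.ldif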

Now, use ldapmodify as the cn=Directory Manager,dc=eucalyptus,dc=com user to delete the entries:

# /media/ephemeral0/openldap/bin/ldapmodify -H ldap://localhost -D "cn=Directory Manager,dc=eucalyptus,dc=com" -W -f delete-test-cloud.ldif
Enter LDAP Password:
deleting entry "idnsName=euca-hasp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsname=101,idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsName=viking-01,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsname=102,idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsName=viking-02,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsName=_ntp._udp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsName=_ldap._tcp,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsname=39.168.192.in-addr.arpa.,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsName=server,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsName=ns,idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"
deleting entry "idnsName=eucalyptus-systems.com,ou=dns,dc=eucalyptus,dc=com"

To confirm, do a lookup against one of the entries to see if it still exists:

# nslookup viking-01.eucalyptus-systems.com 192.168.55.103
Server: 192.168.55.103
Address: 192.168.55.103#53

** server can't find viking-01.eucalyptus-systems.com: NXDOMAIN

There you have it!  A successful Bind9 DNS + OpenLDAP deployment is ready to be used.

Enjoy!  And as always, questions/suggestions/comments are always welcome.


Big Data on the Cloud using Ansible, RHadoop, AppScale, and AWS/Eucalyptus

Background

Big Data has been a hot topic over the last few years.  Big Data on public clouds, such as AWS’s Elastic MapReduce, has been gaining even more popularity as cloud computing becomes more of an industry standard.

R is an open source project for statistical computing and graphics.  It has been growing in popularity at various universities and companies for linear and nonlinear modeling, classical statistical tests, time-series analysis, and other workloads.

RHadoop was developed by Revolution Analytics to interface R with Hadoop.  Revolution Analytics builds analytic software solutions using R.

AppScale is an open source PaaS that implements the Google AppEngine API on IaaS environments.  One of the Google AppEngine APIs that is implemented is AppEngine MapReduce.  The back-end support for this API in AppScale uses Cloudera's Distribution for Apache Hadoop.

Ansible is open source orchestration software that uses SSH to handle configuration management for physical machines, virtual machines, and machines running in the cloud.

Amazon Web Services is a public IaaS that provides infrastructure and application services in the cloud.  Eucalyptus is an open source software solution that provides the AWS APIs for EC2, S3, and IAM for on-premise cloud environments.

This blog entry will cover how to deploy AppScale (either on AWS or Eucalyptus), then use Ansible to configure each AppScale node with R and the RHadoop packages, in order to allow programs written in R to utilize MapReduce in the cloud.

Pre-requisites

To get started, the following is needed on a desktop/laptop computer:

*NOTE:  These variables are used by AppScale Tools version 1.6.9.  Check the AWS and Eucalyptus documentation regarding obtaining user credentials. 
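
As a reference, pointing AppScale Tools and euca2ools at an AWS or Eucalyptus endpoint typically means exporting credentials along these lines; the values are placeholders, and the variable names follow the convention used elsewhere in this series (an assumption here):

$ export EC2_ACCESS_KEY="<access key id>"
$ export EC2_SECRET_KEY="<secret key>"
$ export EC2_URL="<EC2 endpoint URL>"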

Deployment

AppScale

After installing AppScale Tools and Ansible, the AppScale cluster needs to be deployed.  After defining the AWS/Eucalyptus variables,  initialize the creation of the AppScale cluster configuration file – AppScalefile.

$ ./appscale-tools/bin/appscale init cloud

Edit the AppScalefile, providing information for the keypair, security group, and AppScale AMI/EMI.  The keypair and security group do not need to be pre-created. AppScale will handle this.  The AppScale AMI on AWS (us-east-1) is ami-4e472227.  The Eucalyptus EMI will be unique based upon the Eucalyptus cloud that is being used.  In this example, the AWS AppScale AMI will be used, and the AppScale cluster size will be 3 nodes.  Here is the example AppScalefile:

---
group : 'appscale-rmr'
infrastructure : 'ec2'
instance_type : 'm1.large'
keyname : 'appscale-rmr'
machine : 'ami-4e472227'
max : 3
min : 3
table : 'hypertable'

After editing the AppScalefile, start up the AppScale cluster by running the following command:

$ ./appscale-tools/bin/appscale up

Once the cluster finishes setting up, the status of the cluster can be seen by running the command below:

$ ./appscale-tools/bin/appscale status

R, RHadoop Installation Using Ansible

Now that the cluster is up and running, grab the Ansible playbook that installs R and the RHadoop rmr2 and rhdfs packages onto the AppScale nodes.  The playbook can be downloaded from GitHub using git:

$ git clone https://github.com/hspencer77/ansible-r-appscale-playbook.git

After downloading the playbook, the ansible-r-appscale-playbook/production file needs to be populated with the AppScale cluster node information.  Grab the cluster node information by running the following command:

$ ./appscale-tools/bin/appscale status | grep amazon | grep Status | awk '{print $5}' | cut -d ":" -f 1
ec2-50-17-96-162.compute-1.amazonaws.com
ec2-50-19-45-193.compute-1.amazonaws.com
ec2-67-202-23-157.compute-1.amazonaws.com

Add those DNS entries to the ansible-r-appscale-playbook/production file.  After editing, the file will look like the following:

[appscale-nodes]
ec2-50-17-96-162.compute-1.amazonaws.com
ec2-50-19-45-193.compute-1.amazonaws.com
ec2-67-202-23-157.compute-1.amazonaws.com

Now the playbook can be executed.  The playbook requires the SSH private key to the nodes.  This key will be located under the ~/.appscale folder.  In this example, the key file is named appscale-rmr.key.  To execute the playbook, run the following command:

$ ansible-playbook -i ansible-r-appscale-playbook/production \
 --private-key=~/.appscale/appscale-rmr.key -v ansible-r-appscale-playbook/site.yml

Testing Out The Deployment – Wordcount.R

Once the playbook has finished running, the AppScale cluster is now ready to be used.  To test out the setup, SSH into the head node of the AppScale cluster.  To find out the head node of the cluster, execute the following command:

$ ./appscale-tools/bin/appscale status

After discovering the head node, SSH into the head node using the private key located in the ~/.appscale directory:

$ ssh -i ~/.appscale/appscale-rmr.key root@ec2-50-17-96-162.compute-1.amazonaws.com

To test out the R setup on all the nodes, grab the wordcount.R program:

root@appscale-image0:~# tar zxf rmr2_2.0.2.tar.gz rmr2/tests/wordcount.R

In the wordcount.R file, the following lines are present:

rmr2:::hdfs.put("/etc/passwd", "/tmp/wordcount-test")
out.hadoop = from.dfs(wordcount("/tmp/wordcount-test", pattern = " +"))

When the wordcount.R program is executed, it grabs the /etc/passwd file from the head node, copies it to the HDFS filesystem, then runs wordcount against /etc/passwd looking for the pattern " +".   NOTE: wordcount.R can be edited to use any file and pattern desired.
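
For example, to run the same word count against a different file and pattern, those two lines could be changed along the lines below (the file path and pattern here are arbitrary illustrations):

rmr2:::hdfs.put("/var/log/dmesg", "/tmp/wordcount-test")
out.hadoop = from.dfs(wordcount("/tmp/wordcount-test", pattern = "error"))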

Run wordcount.R:

root@appscale-image0:~# R

R version 2.15.3 (2013-03-01) -- "Security Blanket"
Copyright (C) 2013 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> source('rmr2/tests/wordcount.R')
Loading required package: Rcpp
Loading required package: RJSONIO
Loading required package: digest
Loading required package: functional
Loading required package: stringr
Loading required package: plyr
13/04/05 02:33:41 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:33:43 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
packageJobJar: [/tmp/RtmprcYtsu/rmr-local-env19811a7afd54, /tmp/RtmprcYtsu/rmr-global-env1981646cf288, /tmp/RtmprcYtsu/rmr-streaming-map198150b6ff60, /tmp/RtmprcYtsu/rmr-streaming-reduce198177b3496f, /tmp/RtmprcYtsu/rmr-streaming-combine19813f7ea210, /var/appscale/hadoop/hadoop-unjar5632722635192578728/] [] /tmp/streamjob8198423737782283790.jar tmpDir=null
13/04/05 02:33:44 WARN snappy.LoadSnappy: Snappy native library is available
13/04/05 02:33:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/05 02:33:44 INFO snappy.LoadSnappy: Snappy native library loaded
13/04/05 02:33:44 INFO mapred.FileInputFormat: Total input paths to process : 1
13/04/05 02:33:44 INFO streaming.StreamJob: getLocalDirs(): [/var/appscale/hadoop/mapred/local]
13/04/05 02:33:44 INFO streaming.StreamJob: Running job: job_201304042111_0015
13/04/05 02:33:44 INFO streaming.StreamJob: To kill this job, run:
13/04/05 02:33:44 INFO streaming.StreamJob: /root/appscale/AppDB/hadoop-0.20.2-cdh3u3/bin/hadoop job  -Dmapred.job.tracker=10.77.33.247:9001 -kill job_201304042111_0015
13/04/05 02:33:44 INFO streaming.StreamJob: Tracking URL: http://appscale-image0:50030/jobdetails.jsp?jobid=job_201304042111_0015
13/04/05 02:33:45 INFO streaming.StreamJob:  map 0%  reduce 0%
13/04/05 02:33:51 INFO streaming.StreamJob:  map 50%  reduce 0%
13/04/05 02:33:52 INFO streaming.StreamJob:  map 100%  reduce 0%
13/04/05 02:33:59 INFO streaming.StreamJob:  map 100%  reduce 33%
13/04/05 02:34:02 INFO streaming.StreamJob:  map 100%  reduce 100%
13/04/05 02:34:04 INFO streaming.StreamJob: Job complete: job_201304042111_0015
13/04/05 02:34:04 INFO streaming.StreamJob: Output: /tmp/RtmprcYtsu/file1981524ee1a3
13/04/05 02:34:05 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:07 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:08 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:10 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Deleted hdfs://10.77.33.247:9000/tmp/wordcount-test
>quit("yes")

That's it!  The AppScale cluster is ready for additional R programs that utilize MapReduce.   Enjoy the world of Big Data on public/private IaaS.


Using Ansible to Deploy Neo4j HA Cluster on AWS/Eucalyptus

As a follow-up to my last Neo4j AWS/Eucalyptus blog, this entry demonstrates another great example of AWS/Eucalyptus fidelity by using Ansible to deploy a Neo4j High Availability (HA) cluster.

Pre-requisites

In order to use this Ansible playbook on AWS/Eucalyptus, the following is needed:

Before deploying the cluster, a security group needs to be created that the cluster will use.  The security group must allow the following:

  • port 22 (SSH)
  • all instances in the security group allowed to communicate with each other (ports 0 – 65535)

To create the security group and authorize the ports, make sure the user’s access key, secret access key, and EC2 URL are noted, and do the following:

  1. Create the security group

    ec2-create-group --aws-access-key <EC2_ACCESS_KEY> \
    --aws-secret-key <EC2_SECRET_KEY> \
    --url <EC2_URL> -g neo4j-cluster -d "Neo4j HA Cluster"

  2. Authorize port for SSH in neo4j-cluster security group

    ec2-authorize \
    --aws-access-key <EC2_ACCESS_KEY> \
    --aws-secret-key <EC2_SECRET_KEY> \
    --url <EC2_URL> -P tcp -p 22 -s 0.0.0.0/0 neo4j-cluster

  3. Authorize all port communication between cluster members 

    ec2-authorize \
    --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> \
    --url <EC2_URL> -P tcp -o neo4j-cluster -p -1 neo4j-cluster

After completing these steps, use ec2-describe-group to view the security group:

ec2-describe-group --aws-access-key <EC2_ACCESS_KEY> \
--aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> neo4j-cluster

GROUP sg-1cbc5777 986451091583 neo4j-cluster Neo4j HA Cluster
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 0 65535 FROM USER 986451091583 NAME neo4j-cluster ID sg-1cbc5777 ingress
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0 ingress

Neo4j HA Cluster Deployment

Once the security group is created with the correct ports authorized, the cluster can be deployed.  To deploy the cluster, do the following:

  1. Obtain Ansible from git and set up the environment by following the instructions mentioned here – http://ansible.cc/docs/gettingstarted.html#getting-ansible
  2. Obtain the Ansible Playbook for Neo4j HA Cluster using git

    git clone https://github.com/hspencer77/ansible-neo4j-cluster.git

  3. Change directory into ansible-neo4j-cluster  

    cd ansible-neo4j-cluster

  4. Set up /etc/ansible/hosts with the following information:
    [local]
    127.0.0.1
  5. Populate vars/ec2-config with either Eucalyptus/AWS information. vars/ec2-config contains the following variables:
    keypair: <EC2/Eucalyptus Keypair>
    ec2_access_key: <EC2_ACCESS_KEY>
    ec2_secret_key: <EC2_SECRET_KEY>
    ec2_url: <EC2_URL>
    instance_type: m1.small
    security_group: <AWS/Eucalyptus Security Group>
    image: <AMI/EMI>
  6. Execute the following command:

    ansible-playbook neo4j-cluster.yml \
     --private-key=<AWS/Eucalyptus Private Key file> --extra-vars "node_count=3"
  7. After the playbook finishes, there will be a URL provided to access the cluster, similar to the example below:
    TASK: [Display HAProxy URL] *********************
    changed: [23.22.248.75] => {"changed": true, "cmd": 
    "echo \"HAProxy URL for Neo4j -
     http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/\" ",
     "delta": "0:00:00.006835", "end": "2013-03-30 19:54:31.104320", 
    "rc": 0, "start": "2013-03-30 19:54:31.097485", "stderr": "", 
    "stdout": 
    "HAProxy URL for Neo4j - 
    http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/"}

    To view the status of the cluster in the browser, open up http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/.

  8. To get the status of the cluster, use curl:
    curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' 
    http://ec2-23-22-248-75.compute-1.amazonaws.com/db/manage/server/jmx/query

That's it!  A Neo4j HA cluster with an HA Proxy server serving as an endpoint is available to be used.   If a bigger cluster is desired, just change the node_count value.   For additional information regarding this playbook, and how it handles the cluster membership, please refer to the following URL – https://github.com/hspencer77/ansible-neo4j-cluster/blob/master/README.md.
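
For example, a five-node cluster could be requested by re-running the playbook from step 6 with a larger count:

    ansible-playbook neo4j-cluster.yml \
     --private-key=<AWS/Eucalyptus Private Key file> --extra-vars "node_count=5"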

Hope you enjoy!  As always, questions/comments/suggestions are always welcome.


Test Drive: Drupal Deployment on Eucalyptus using Stackato, Amazon Route 53 and the Eucalyptus Community Cloud

Recently, I did a blog discussing how to deploy a Jenkins server using Stackato running on Eucalyptus.  At the end of that blog, I mentioned how the Eucalyptus Community Cloud (ECC) could be used for testing out the Stackato Microcloud image on Eucalyptus.   The previous blog – I felt – was more for DevOps administrators who had access to their own on-premise Eucalyptus clouds.  The inspiration for this blog comes from the ActiveBlog post entitled "Deploy & Scale Drupal on Any Cloud with Stackato"; this entry shows some love to web developers and demonstrates the power of Amazon's Route 53.

Test Drive Pre-Reqs

The prerequisites for this blog are the same as those mentioned in my previous blog about using Stackato on Eucalyptus (for the Eucalyptus pre-reqs, make sure the ECC is being used).  In addition to the prerequisites mentioned above, the following is needed:

After the prerequisites have been met, it's time to set up the Drupal environment.

Test Drive Engage!

Since the ECC is being used, there is no need to worry about bundling, uploading and registering the Stackato image.  The Stackato image used for this blog is as follows:

IMAGE emi-859B3D5C stackato_v2.6.6/stackato-cloudinit.manifest.xml
150820662310 available public x86_64 machine eki-6FBE3D2D eri-67463B77 instance-store

Next, let's make sure the user has an elastic IP that will be used in AWS Route 53, and a security group to allow the proper network traffic to the instance.  Do the following:

  1. Make sure the user credentials are sourced correctly, and euca2ools is installed correctly.
  2. Grab an elastic IP using euca-allocate-address (in this example 173.205.188.105 was allocated):

    # euca-allocate-address
    ADDRESS 173.205.188.105

  3. If the user doesn't already have a keypair, create one using euca-create-keypair, and make sure the permissions of the file are 0600:

    # euca-create-keypair hspencer-stackato > hspencer-stackato.priv 
    # chmod 0600 hspencer-stackato.priv

  4. Create a security group for the instance to use:

    # euca-create-group stackato-test -d "Test Security Group for Stackato PaaS"
    GROUP stackato-test Test Security Group for Stackato PaaS

  5. Authorize ping, ssh, http, and https ports:

    # euca-authorize -P icmp -t -1:-1 -s 0.0.0.0/0 stackato-test
    GROUP stackato-test
    PERMISSION stackato-test ALLOWS icmp -1 -1 FROM CIDR 0.0.0.0/0
    
    # euca-authorize -P tcp -p 22 -s 0.0.0.0/0 stackato-test
    GROUP stackato-test
    PERMISSION stackato-test ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0
    
    # euca-authorize -P tcp -p 80 -s 0.0.0.0/0 stackato-test
    GROUP stackato-test
    PERMISSION stackato-test ALLOWS tcp 80 80 FROM CIDR 0.0.0.0/0
    
    # euca-authorize -P tcp -p 443 -s 0.0.0.0/0 stackato-test
    GROUP stackato-test
    PERMISSION stackato-test ALLOWS tcp 443 443 FROM CIDR 0.0.0.0/0

  6. Now, launch the instance, specifying the keypair name to use, and a VM type.  On the ECC, only m1.xlarge and c1.xlarge meet the requirements of launching the Stackato image:

    # euca-run-instances -k hspencer-stackato -t c1.xlarge emi-859B3D5C -g stackato-test
    RESERVATION r-66EE4030 628376682871 stackato-test
    INSTANCE i-E85843C4 emi-859B3D5C euca-0-0-0-0.eucalyptus.ecc.eucalyptus.com euca-0-0-0-0.eucalyptus.internal
     pending hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D
     eri-67463B77 monitoring-disabled 0.0.0.0 0.0.0.0 instance-store

  7. Once the instance gets to a running state, associate the elastic IP that the user owns to the instance:

    # euca-describe-instances
    RESERVATION r-66EE4030 628376682871 stackato-test
    INSTANCE i-E85843C4 emi-859B3D5C euca-173-205-188-106.eucalyptus.ecc.eucalyptus.com
     euca-10-9-190-24.eucalyptus.internal running hspencer-stackato 0 c1.xlarge 
    2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled
     173.205.188.10 10.9.190.24 instance-store
    
    # euca-associate-address -i i-E85843C4 173.205.188.105
    ADDRESS 173.205.188.105 i-E85843C4
    
    # euca-describe-instances
    RESERVATION r-66EE4030 628376682871 stackato-test
    INSTANCE i-E85843C4 emi-859B3D5C euca-173-205-188-105.eucalyptus.ecc.eucalyptus.com
     euca-10-9-190-24.eucalyptus.internal running hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z
 partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled 173.205.188.105 10.9.190.24 instance-store

  8. Log into the AWS management console, select Route 53, and set up the A and CNAME records in your domain as described in the Stackato documentation on detailed DNS configuration (a command-line sketch of these records is shown after this list).  In this example, the DNS name associated with the elastic IP 173.205.188.105 is stackato-dev.mindspew-age.com.
  9. Next, SSH into the instance and follow the steps for setting up the Stackato instance from my previous blog, under the section Configuration of the Stackato Instance.  Make sure the DNS name set up in AWS Route 53 is used in the “kato rename public-DNS-name” and “kato setup core api.public-DNS-name” configuration steps.
  10. After the instance is configured, just open up the browser and go to the DNS name set up for the Stackato instance in AWS Route 53, as mentioned in the Stackato Documentation regarding configuration via the Management Console.
  11. Once logged into the Stackato Management Console, select “App Store” in the left-hand menu and select “Drupal” to install it.

    App Store – Drupal Application

  12. After Drupal has been installed, start the application.  Once it has started successfully, select the URL that shows up in the right-hand menu box.  The Drupal log-in page will appear in your browser.

    Drupal Landing Page
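
For reference, the Route 53 records from step 8 can also be created from the command line with the AWS CLI.  The sketch below uses this example’s stackato-dev.mindspew-age.com name and the 173.205.188.105 elastic IP; the hosted zone ID is a placeholder, and the wildcard CNAME is what lets application URLs under the Stackato endpoint resolve:

aws route53 change-resource-record-sets --hosted-zone-id <hosted-zone-id> \
  --change-batch '{"Changes": [
    {"Action": "CREATE", "ResourceRecordSet": {"Name": "stackato-dev.mindspew-age.com",
      "Type": "A", "TTL": 300, "ResourceRecords": [{"Value": "173.205.188.105"}]}},
    {"Action": "CREATE", "ResourceRecordSet": {"Name": "*.stackato-dev.mindspew-age.com",
      "Type": "CNAME", "TTL": 300, "ResourceRecords": [{"Value": "stackato-dev.mindspew-age.com"}]}}
  ]}'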

That’s it!  Now Drupal is ready for any web developer to test out on the ECC.  If there are any questions, comments, or suggestions, please feel free to leave a comment.  Enjoy!


Jenkins, Stackato, Cloud-Init and Eucalyptus == Potent Combination for an On-Premise Continuous Integration Environment

The Ingredients

Jenkins

An extendable open source continuous integration server.

Stackato

The Enterprise Private PaaS that makes it easy to deploy, manage, and monitor applications on any cloud.

Cloud-init

The Ubuntu package that handles early initialization of a cloud instance. It is installed in the Ubuntu Cloud Images and also in the official Ubuntu images available on EC2.

Eucalyptus

Allows you to build production-ready, AWS-compatible private and hybrid clouds by leveraging your existing virtualized infrastructure to create on-demand cloud resource pools.

What happens when you combine all of these tools?  A potent combination for continuous integration on an easy-to-configure PaaS and an on-premise, AWS-compatible IaaS.  With this combination, developers can take advantage of the easy configuration that Stackato brings to the table, running on top of Eucalyptus and bringing an AWS-like cloud environment into your datacenter.

This blog entry will discuss the steps that I took to get Jenkins installed on a Stackato instance-store-backed instance running on Eucalyptus.  But before I get started, I would like to thank the folks from ActiveState for all their guidance.  Their support staff is really top notch and very helpful.  Check them out in #stackato on freenode.net.  They can also be found on Twitter at @ActiveState.  Now on to the dirty work…

The Recipe for Success

The Stackato Microcloud Image and Cloud-Init

To begin, the following is needed:

After downloading the Stackato VM for KVM and unzipping the file, we will need to pull out the root filesystem, the kernel, and the ramdisk.  These will be bundled, uploaded, and registered as the EMI, EKI, and ERI.  To extract the root filesystem, do the following:

  1. Use parted to locate the root filesystem as follows:    
    # parted stackato-img-kvm-v2.6.6.img
    GNU Parted 2.1
    Using /root/images/Stackato-VM/stackato-img-kvm-v2.6.6.img
    Welcome to GNU Parted! Type 'help' to view a list of commands.
    (parted) U
    Unit? [compact]? b
    (parted) p
    Model: (file)
    Disk /root/images/Stackato-VM/stackato-img-kvm-v2.6.6.img: 10737418240B
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    
    Number Start End Size Type File system Flags
    1 1048576B 200278015B 199229440B primary ext3 boot
    3 200278016B 1237319679B 1037041664B primary linux-swap(v1)
    2 1237319680B 10736369663B 9499049984B primary ext4
    
    (parted) quit
  2. In this example, the root filesystem is partition 2.  The values for “Start” and “Size” will be needed.  Next, run dd to extract the root filesystem (a faster variant of this command is sketched after this list):
    dd if=stackato-img-kvm-v2.6.6.img of=stackato-rootfs.img
     bs=1 skip=1237319680 count=9499049984
    
  3. Once it has completed, mount  stackato-rootfs.img to the loopback device:
    mount -o loop stackato-rootfs.img /mnt/
  4. Copy out initrd.img-3.2.0-27-virtual  and vmlinuz-3.2.0-27-virtual from /mnt/boot.
  5. In /mnt/etc/fstab, replace the UUID entry with LABEL.  The LABEL entry will look similar to the following:
    LABEL=cloudimg-rootfs    /               ext4   defaults     1       1
  6. Chroot into /mnt (there may be a need to do a mount -o bind for sys, dev, and proc).
  7. Run “dpkg-reconfigure cloud-init”, and make sure that the EC2 Data Source is selected.
  8. Unmount stackato-rootfs.img (if sys, dev, and proc were mounted, unmount them before unmounting stackato-rootfs.img).  After it has been unmounted, run tune2fs to change the label of the image:
    tune2fs -L cloudimg-rootfs stackato-rootfs.img
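
As noted in step 2, dd with bs=1 copies the partition one byte at a time, which is very slow for a roughly 9.5 GB filesystem.  Since the start offset and size reported by parted are both multiples of 512 in this example, the same extraction can be done in 512-byte blocks (a sketch using the offsets from the parted output above):

dd if=stackato-img-kvm-v2.6.6.img of=stackato-rootfs.img bs=512 \
 skip=$((1237319680 / 512)) count=$((9499049984 / 512))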

After following these steps, the following should be available:

  • initrd.img-3.2.0-27-virtual – to be bundled, uploaded and registered as the ERI
  • vmlinuz-3.2.0-27-virtual – to be bundled, uploaded and registered as the EKI
  • stackato-rootfs.img – to be bundled, uploaded and registered as the EMI

Go through the steps of bundling, uploading and registering the ERI, EKI, and EMI.  For additional information, please refer to the Add an Image section of the Eucalyptus 3.2 User Guide.
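
For reference, a minimal sketch of that workflow with euca2ools is shown below.  The bucket names are illustrative, and the EKI and ERI IDs returned by the first two registrations are substituted into the last step:

# kernel (EKI)
euca-bundle-image -i vmlinuz-3.2.0-27-virtual --kernel true
euca-upload-bundle -b stackato-kernel -m /tmp/vmlinuz-3.2.0-27-virtual.manifest.xml
euca-register stackato-kernel/vmlinuz-3.2.0-27-virtual.manifest.xml

# ramdisk (ERI)
euca-bundle-image -i initrd.img-3.2.0-27-virtual --ramdisk true
euca-upload-bundle -b stackato-ramdisk -m /tmp/initrd.img-3.2.0-27-virtual.manifest.xml
euca-register stackato-ramdisk/initrd.img-3.2.0-27-virtual.manifest.xml

# root filesystem (EMI), referencing the EKI and ERI IDs returned above
euca-bundle-image -i stackato-rootfs.img --kernel <eki-id> --ramdisk <eri-id>
euca-upload-bundle -b stackato-image -m /tmp/stackato-rootfs.img.manifest.xml
euca-register -a x86_64 stackato-image/stackato-rootfs.img.manifest.xml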

Launching the Stackato Image

Now it’s time to launch the Stackato image on Eucalyptus.  Since cloud-init now has the EC2 data source enabled, the instance will grab SSH keys and mount the ephemeral storage when the image is launched.  Additional configuration can also be passed in using the user-data file option; more information can be found in Stackato’s documentation on using cloud-init.  The key thing to remember here is that the minimum RAM requirement for the Stackato image is 2 GB, so make sure the VM type used to launch it has at least 2 GB of RAM.  In this example, the image ID is emi-DAB23A8A, and the ramdisk and kernel are registered as eri-9B453C09 and eki-ADF640B0.  The VM type c1.xlarge is used, which has 4 CPUs, 4096 MB of RAM, and 50 GB of disk space.

euca-run-instances -k admin emi-DAB23A8A -t c1.xlarge 
--kernel eki-ADF640B0 --ramdisk eri-9B453C09

Use euca-describe-instances to check to see when the instance reaches a running state:

euca-describe-instances i-100444EF
RESERVATION r-CC69438B 345590850920 default
INSTANCE i-100444EF emi-DAB23A8A euca-192-168-55-104.wu-tang.euca-hasp.eucalyptus-systems.com 
euca-10-106-101-17.wu-tang.internal running admin 0 c1.xlarge 2013-02-23T00:34:07.436Z enter-the-wu eki-ADF640B0 
eri-9B453C09 monitoring-disabled 192.168.55.104 10.106.101.17 instance-store

The key thing for running a Stackato instance is setting up the correct DNS entries.  For more information on DNS setup for a Stackato instance, please read the Detailed Configuration section on DNS in the Stackato online documentation.  For this example, instead of configuring the A and CNAME records through an external DNS service with a tool like nsupdate, we will use xip.io, a magic domain name that provides wildcard DNS for any IP address.  Next, it’s time to configure the Stackato instance.
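
Because xip.io provides wildcard DNS, any hostname under an IP address on xip.io resolves back to that address, so no records need to be created by hand.  A quick check (assuming dig is installed and the resolver can reach xip.io):

dig +short api.192.168.55.104.xip.io     # should return 192.168.55.104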

Configuration of the Stackato Instance

To configure the Stackato instance, do the following:

  1. SSH into the instance.  

    ssh -i creds/admin.priv 
    stackato@euca-192-168-55-104.wu-tang.euca-hasp.eucalyptus-systems.com
  2. Make note of the IP address and netmask associated with eth0 using ifconfig.  Also note the gateway IP using the route command.
  3. Run “kato op static_ip” to configure the static IP address for eth0.  Make sure to add 127.0.0.1 as the first entry in the nameservers list, and add “containers.” as the first entry under the search domains.
  4. Run “kato rename <public DNS name>”, where the public DNS name includes the public IP of the instance using xip.io (e.g. 192.168.55.104.xip.io).
  5. Run “kato disable mdns”, then run “sudo reboot” to reboot the instance.
  6. Once the instance has come back up, SSH back into it and run “kato setup core api.<public DNS name>”, where the public DNS name is the same value used in the “kato rename” step (e.g. 192.168.55.104.xip.io).
  7. Next, edit /etc/resolv.conf and make sure that the value for the search option is “containers.”, and the first entry for the nameservers is 127.0.0.1.
  8. Finally, run “kato enable --all-but mdns”.
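
Putting those steps together, the command sequence on the instance looks roughly like this (a sketch using the 192.168.55.104.xip.io name from this example):

kato op static_ip
kato rename 192.168.55.104.xip.io
kato disable mdns
sudo reboot
# after the instance comes back up, SSH back in and run:
kato setup core api.192.168.55.104.xip.io
kato enable --all-but mdns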

That’s it!  Now go to the public DNS name that was used, in your favorite browser.  For this example, 192.168.55.104.xip.io was used.  The landing page should look similar to what is shown in the Stackato documentation on accessing the instance through the management console.

Setting Up Jenkins

After setting up the admin account, navigate to the “App Store” in the left-hand menu.  Once there, find the Jenkins application:

Jenkins Application

After choosing to install Jenkins, select “Yes” to install.  Once the installation completes, select “Applications” in the left-hand menu.  From there, select the Jenkins application and click “Start” (it’s the green arrow to the right of the application window).  Once it has started, you will see the following:

Jenkins Running in Stackato

Now Jenkins is ready to be used.

If anyone wants to test this out on Eucalyptus but doesn’t have access to their own Eucalyptus cloud, fear not: the Eucalyptus Community Cloud has the Stackato image available.  After applying for access to the Community Cloud, follow the steps above.  The image for Stackato is as follows:

IMAGE emi-859B3D5C stackato_v2.6.6/stackato-cloudinit.manifest.xml 150820662310 available public x86_64 machine eki-6FBE3D2D eri-67463B77 instance-store

And as always, this image and these steps can be used on AWS EC2 as well. 🙂

Let me know if there are any questions.  Feedback is always welcome.  Enjoy!


AWS EBS-backed AMI to Eucalyptus Walrus-backed EMI

Preface

A few weeks back, I was doing some testing with the guys from AppScale to get a Eucalyptus Machine Image (EMI) to run on Eucalyptus.  The image that was provided to me was an EBS-backed Amazon Machine Image (AMI), based on a published EC2 Lucid Ubuntu Cloud image.  This blog entry describes the procedure for converting an EBS-backed AMI to a Walrus-backed EMI.  The goal here is to demonstrate how easy it is to use Ubuntu Cloud images to set up AppScale on both AWS and Eucalyptus as a hybrid cloud use case.  There are many other hybrid cloud use cases that can be built on this setup, but this blog entry will focus on the migration of AMI images to EMI images.

*NOTE* This entry assumes that a user is experienced with both Amazon Web Services and Eucalyptus.  For additional information, please refer to the following resources:

Prerequisites

Before getting started, the following is needed:

*NOTE* Make sure there is an understanding of the IAM policies on AWS and Eucalyptus.  These are key in making sure that the user on both AWS and Eucalyptus can perform all the steps covered in this topic.

Work in AWS…

After setting up the command-line tools for AWS EC2 and adding the necessary EC2 and S3 IAM policies, everything is in place to start working with AWS instances and images.  *NOTE* For help with setting up the IAM policies, check out the AWS Policy Generator.  To make sure things looked good, I tested my EC2 access by running ec2-describe-availability-zones:

$ ec2-describe-availability-zones 
AVAILABILITYZONE us-east-1a available us-east-1 
AVAILABILITYZONE us-east-1b available us-east-1 
AVAILABILITYZONE us-east-1c available us-east-1 
AVAILABILITYZONE us-east-1d available us-east-1

With everything looking good, I went ahead and checked out the AMI that I was asked to test.  Below is the AMI that was given to me:

$ ec2-describe-images ami-2e4bf22f --region ap-northeast-1
IMAGE ami-2e4bf22f 839953741869/appscale-lite-1.6.3-testing 839953741869 available public x86_64 machine aki-d409a2d5 ebs paravirtual xen
BLOCKDEVICEMAPPING EBS /dev/sda1 snap-7953a059 8 true standard
BLOCKDEVICEMAPPING EPHEMERAL /dev/sdb ephemeral0

As you can see, the AMI given to me is an EBS-backed image, and it is in a different region (ap-northeast-1).  I could have done all my work in the ap-northeast-1 region, but I wanted to test out region-to-region migration of images on AWS S3 using ec2-migrate-manifest.  In order to access the EBS-backed instance once it is launched, I set up a keypair and SSH access for instances launched within the default security group:

$ ec2-create-keypair hspencer-appscale --region ap-northeast-1 > hspencer-appscale.pem
$ ec2-authorize -P tcp -p 22 -s 0.0.0.0/0 default --region ap-northeast-1

Now that I have my image, keypair and security group access, I am ready to launch an instance so that I can use the ec2-bundle-vol command to create an image of it.  To launch the instance, I ran the following:

$ ec2-run-instances -k hspencer-appscale ami-2e4bf22f --region ap-northeast-1

After the instance was up and running, I scp’d my EC2_PRIVATE_KEY and EC2_CERT to the instance using the keypair created (hspencer-appscale.pem).  The instance already had the latest versions of ec2-api-tools and ec2-ami-tools as part of the AppScale installation.  Similar to the instructions provided by AWS for creating an instance store-backed AMI from an existing AMI, I used ec2-bundle-vol to bundle a new image and used /mnt/ (which is ephemeral storage) to store the manifest information.

root@ip-10-156-123-126:~# ec2-bundle-vol -u 9xxxxxxx3 -k pk-XXXXXXXXXXXXXXXX.pem -c cert-XXXXXXXXXXXXXXXXX.pem -d /mnt/ -e /mnt/
Please specify a value for arch [x86_64]: x86_64
Copying / into the image file /mnt/image...
Excluding: 
/dev
/sys
/sys/kernel/security
/sys/kernel/debug
/proc
/dev/pts
/dev
/media
/mnt
/proc
/sys
/mnt/
/mnt/image
/mnt/img-mnt
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00990555 s, 106 MB/s
mke2fs 1.41.11 (14-Mar-2010)
Bundling image file...
Splitting /mnt/image.tar.gz.enc...
Created image.part.000
……………..

Next, I needed to update the manifest so that us-west-1, not ap-northeast-1, is used as the region for storing the image.  To do this, I used ec2-migrate-manifest.  *NOTE* This tool can only be used in the following regions: EU, US, us-gov-west-1, us-west-1, us-west-2, ap-southeast-1, ap-southeast-2, ap-northeast-1, sa-east-1.

root@ip-10-156-123-126:~# ec2-migrate-manifest -m /mnt/image.manifest.xml -c cert-XXXXXXXXX.pem -k pk-XXXXXXXXXXXX.pem -a XXXXXXXXXX -s XXXXXXXXX --region us-west-1
Backing up manifest...
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
Successfully migrated /mnt/image.manifest.xml
It is now suitable for use in us-west-1.

Time to upload the bundle to S3 using ec2-upload-bundle:

root@ip-10-156-123-126:~# ec2-upload-bundle -b appscale-lite-1.6.3-testing -m /mnt/image.manifest.xml -a XXXXXXXXXX -s XXXXXXXXXX --location us-west-1
You are bundling in one region, but uploading to another. If the kernel
or ramdisk associated with this AMI are not in the target region, AMI
registration will fail.
You can use the ec2-migrate-manifest tool to update your manifest file
with a kernel and ramdisk that exist in the target region.
Are you sure you want to continue? [y/N]y
Creating bucket...
Uploading bundled image parts to the S3 bucket appscale-lite-1.6.3-testing ...
Uploaded image.part.000
Uploaded image.part.001
Uploaded image.part.002
Uploaded image.part.003
………….

After the image has been uploaded successfully, all that is left to do is register the image.

root@ip-10-156-123-126:~# export JAVA_HOME=/usr
root@ip-10-156-123-126:~# ec2-register -K pk-XXXXXXXXXXXX.pem -C cert-XXXXXXXXXX.pem --region us-west-1 appscale-lite-1.6.3-testing/image.manifest.xml --name appscale1.6.3-testing
IMAGE ami-705d7c35
$ ec2-describe-images ami-705d7c35 --region us-west-1
IMAGE ami-705d7c35 986451091583/appscale1.6.3-testing 986451091583 available private x86_64 machine aki-9ba0f1de instance-store paravirtual xen
BLOCKDEVICEMAPPING EPHEMERAL /dev/sdb ephemeral0

Work in Eucalyptus…

Now that we have the image registered, we can use ec2-download-bundle and ec2-unbundle to get the machine image to an instance running on Eucalyptus, so that we can bundle, upload and register the image to Eucalyptus.

To start off, I followed the instructions for setting up my command-line environment and Eucalyptus IAM policies on Eucalyptus, similar to what was done for AWS.

Next, I downloaded the lucid-server-cloudimg-amd64.tar.gz file from the Ubuntu Cloud Images (Lucid) site.  After that, I bundled, uploaded and registered the following images:

  • lucid-server-cloudimg-amd64-loader (ramdisk)
  • lucid-server-cloudimg-amd64-vmlinuz-virtual (kernel)
  • lucid-server-cloudimg-amd64.img (root image)
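
For reference, a sketch of how the kernel and ramdisk pieces can be bundled, uploaded and registered with euca2ools (the bucket names here are illustrative):

euca-bundle-image -i lucid-server-cloudimg-amd64-vmlinuz-virtual --kernel true
euca-upload-bundle -b lucid-kernel -m /tmp/lucid-server-cloudimg-amd64-vmlinuz-virtual.manifest.xml
euca-register lucid-kernel/lucid-server-cloudimg-amd64-vmlinuz-virtual.manifest.xml

euca-bundle-image -i lucid-server-cloudimg-amd64-loader --ramdisk true
euca-upload-bundle -b lucid-ramdisk -m /tmp/lucid-server-cloudimg-amd64-loader.manifest.xml
euca-register lucid-ramdisk/lucid-server-cloudimg-amd64-loader.manifest.xml

The root image (lucid-server-cloudimg-amd64.img) is handled the same way, with the resulting EKI and ERI IDs passed to euca-bundle-image via --kernel and --ramdisk.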

After bundling, uploading and registering those images, I created a keypair and SSH access for instances launched within the default security group:

euca-add-keypair hspencer-euca > hspencer-euca.pem
euca-authorize -P tcp -p 22 -s 0.0.0.0/0 default

Now, I run an instance of the Lucid EMI that was registered:

euca-run-instances -k hspencer-euca --user-data-file cloud-init.config -t m1.large emi-29433329

I used the m1.large VM type so that I could use the ephemeral storage space to hold the image pulled from AWS.

Once the instance is running, I scp’d my EC2_PRIVATE_KEY and EC2_CERT to the instance using the keypair created (hspencer-euca.pem).  After installing the ec2-ami-tools on the instance, I used ec2-download-bundle to download the bundle to /media/ephemeral0, and ec2-unbundle to extract the image:

# ec2-download-bundle -b appscale-lite-1.6.3-testing -d /media/ephemeral0/ -a XXXXXXXXXXX -s XXXXXXXXXXXX -k pk-XXXXXXXXX.pem --url http://s3-us-west-1.amazonaws.com
# ec2-unbundle -m /media/ephemeral0/image.manifest.xml -s /media/ephemeral0/ -d /media/ephemeral0/ -k pk-XXXXXXXXXX.pem

Now that I have the root image from AWS, I just need to bundle, upload and register it to Eucalyptus.  To do so, I scp’d my Eucalyptus user credentials to the instance.  After copying the Eucalyptus credentials over, I SSH’ed into the instance and sourced the Eucalyptus credentials.
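
A minimal sketch of that step (the credentials archive and host names are illustrative; the credentials package downloaded from Eucalyptus contains an eucarc file):

scp -i hspencer-euca.pem euca-credentials.zip ubuntu@<instance-public-dns>:~/
ssh -i hspencer-euca.pem ubuntu@<instance-public-dns>
unzip euca-credentials.zip -d ~/creds && source ~/creds/eucarc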

Since I had already bundled the kernel and ramdisk for the Ubuntu Cloud Lucid image, I just needed to bundle, upload and register the image I unbundled from AWS.  To do so, I did the following:

euca-bundle-image -i image  
euca-upload-bundle -b appscale-1.6.3-x86_64 -m /tmp/image.manifest.xml
euca-register -a x86_64 appscale-1.6.3-x86_64/image.manifest.xml

Now the image is ready to be launched on Eucalyptus.

Conclusion

As demonstrated above, the AWS fidelity that Eucalyptus provides makes it possible to set up hybrid cloud environments spanning Eucalyptus and AWS that can be leveraged by applications like AppScale.

Other examples of AMI to EMI conversions can be found here:

https://github.com/eucalyptus/ami2emi

Enjoy!

OpenLDAP Sandbox in the Clouds

Background

I really enjoy OpenLDAP.  I think folks really don’t appreciate the power of OpenLDAP with regard to its robustness (e.g. the ability to use multiple back-ends), speed, and efficiency.

I think it’s important to have sandboxes to test various technologies, and the “cloud” is the best place for this.  To test out the latest builds provided by OpenLDAP (via git), I created a cloud-init script that allows me to configure, build, and install an OpenLDAP sandbox environment in the cloud (on-premise and/or public).  This script has been tested on AWS and Eucalyptus using Ubuntu Precise 12.04 LTS.  This blog entry is a complement to my past blog regarding overlays, MDB and OpenLDAP.

Lean Requirements – Script, Image, and Cloud

When thinking about this setup, there were three goals in mind:

  1. Ease of configuration – this is why cloud-init was used.  It’s very powerful when it comes to bootstrapping instances as they boot up.  You could use Puppet, Chef or others (e.g. Salt Stack, Juju, etc.), but I decided to go with cloud-init.  The script does the following (a stripped-down sketch of such a script appears after this list):
    • Downloads all the prerequisites for building OpenLDAP from source, including euca2ools.
    • Downloads OpenLDAP using Git.
    • Sets up ephemeral storage to be the installation point for OpenLDAP (e.g. configuration, storage, etc.).
    • Adds information to /etc/rc.local to make sure the ephemeral storage gets re-mounted on instance reboots and the hostname is set.
    • Configures, builds and installs OpenLDAP.
  2. Cloud image that is ready to go – Ubuntu has done a wonderful job with their cloud images.  They have made it really easy to access them on AWS. These images can be used on Eucalyptus as well.
  3. Public and Private Cloud Deployment – Since Eucalyptus follows the AWS EC2 API very closely, it makes it really easy to test on both AWS and Eucalyptus.
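
For illustration only, a heavily stripped-down user-data script in the same spirit might look like the following.  The real cloud-init-openldap.config lives in the Eucalyptus recipes repository; the package list, the ephemeral device name, and the paths below are assumptions (the git URL is the one currently published on openldap.org):

#!/bin/bash
# illustrative sketch only -- see the eucalyptus/recipes repository for the real script
apt-get update
apt-get install -y build-essential git                # build prerequisites (partial list)
mkdir -p /opt/openldap
mount /dev/vda2 /opt/openldap                         # ephemeral device name is an assumption
sed -i '/^exit 0/i mount /dev/vda2 /opt/openldap' /etc/rc.local   # re-mount on reboot
git clone https://git.openldap.org/openldap/openldap.git /opt/openldap/source
cd /opt/openldap/source
./configure --prefix=/opt/openldap
make depend && make && make install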

Now that the background has been covered a bit, the next section will cover deploying the sandbox on AWS and/or Eucalyptus.

Deploy the Sandbox

To set up the sandbox, use the following steps:

  1. Make sure you have an account on AWS and/or Eucalyptus (and that the correct AWS/Eucalyptus IAM policies are in place so that you can bundle, upload and register images to AWS S3 and Eucalyptus Walrus).
  2. Make sure you have access to a registered AMI/EMI that runs Ubuntu Precise 12.04 LTS.  *NOTE* If you are using AWS, you can just go to the Ubuntu Precise Cloud Image download page, and select the AMI in the region that you have access to.
  3. Download the openldap cloud-init recipe from Eucalyptus/recipes repository.
  4. Download and install the latest euca2ools (I used the command-line tool euca-run-instances to run these instances).
  5. After you have downloaded your credentials from AWS/Eucalyptus, define your global environments by either following the documentation for AWS EC2 or the documentation for Eucalyptus.
  6. Use euca-run-instances with the --user-data-file option to launch the instance:

    euca-run-instances -k hspencer.pem ....
     --user-data-file cloud-init-openldap.config [AMI | EMI]

After the instance is launched, ssh into the instance, and you will see something similar to the following:

ubuntu@euca-10-106-69-149:~$ df -ah
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 1.4G 1.2G 188M 86% /
proc 0 0 0 - /proc
sysfs 0 0 0 - /sys
none 0 0 0 - /sys/fs/fuse/connections
none 0 0 0 - /sys/kernel/debug
none 0 0 0 - /sys/kernel/security
udev 494M 12K 494M 1% /dev
devpts 0 0 0 - /dev/pts
tmpfs 200M 232K 199M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 498M 0 498M 0% /run/shm
/dev/vda2 8.0G 159M 7.5G 3% /opt/openldap

Your sandbox environment is now set up.  From here, just follow the instructions in the OpenLDAP Administrator’s Guide on configuring your OpenLDAP server, or continue from the “Setup – OLC and MDB” section in my previous blog.  *NOTE* As you configure your OpenLDAP server, make sure to use euca-authorize to control access to your instance.
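
For example, to allow LDAP and LDAPS traffic only from a trusted network (the CIDR below is illustrative), something like the following can be used:

euca-authorize -P tcp -p 389 -s 203.0.113.0/24 default
euca-authorize -P tcp -p 636 -s 203.0.113.0/24 default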

Enjoy!
