Archive
Big Data on the Cloud using Ansible, RHadoop, AppScale, and AWS/Eucalyptus
Background
Big Data has been a hot topic over the last few years. Big Data on public clouds, such as AWS’s Elastic MapReduce, has been gaining even more popularity as cloud computing becomes more of an industry standard.
R is an open source project for statistical computing and graphics. It has been growing in popularity at various universities and companies for linear and nonlinear modeling, classical statistical tests, time-series analysis, and more.
RHadoop was developed by Revolution Analytics to interface with Hadoop. Revolution Analytics builds analytic software solutions using R.
AppScale is an open source PaaS that implements the Google AppEngine API on IaaS environments. One of the Google AppEngine APIs that is implemented is AppEngine MapReduce. AppScale’s back-end support for this API uses Cloudera’s Distribution for Apache Hadoop.
Ansible is open source orchestration software that uses SSH to handle configuration management for physical and virtual machines, including machines running in the cloud.
Amazon Web Services is a public IaaS that provides infrastructure and application services in the cloud. Eucalyptus is an open source software solution that provides the AWS APIs for EC2, S3, and IAM for on-premise cloud environments.
This blog entry will cover how to deploy AppScale (either on AWS or Eucalyptus), then use Ansible to configure each AppScale node with R and the RHadoop packages, in order to allow programs written in R to utilize MapReduce in the cloud.
Pre-requisites
To get started, the following is needed on a desktop/laptop computer:
- AppScale Tools installed.
- Ansible installed.
- The following AWS/Eucalyptus variables exported as environment variables in your shell:
- EC2_ACCESS_KEY
- EC2_SECRET_KEY
- EC2_URL
*NOTE: These variables are used by AppScale Tools version 1.6.9. Check the AWS and Eucalyptus documentation regarding obtaining user credentials.
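One way to confirm that the three variables are in place before running the AppScale Tools is a small shell check. This is a minimal sketch, not part of AppScale Tools; the function name missing_ec2_vars is made up for illustration:

```shell
# Print the name of each required credential variable that is unset or empty.
# missing_ec2_vars is a hypothetical helper, not an AppScale command.
missing_ec2_vars() {
    for name in EC2_ACCESS_KEY EC2_SECRET_KEY EC2_URL; do
        # Indirect lookup that works in any POSIX shell.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Warn if anything is missing from the current shell environment.
missing=$(missing_ec2_vars)
if [ -n "$missing" ]; then
    echo "Export these before running appscale:"
    echo "$missing"
fi
```

If nothing is printed, all three variables are set in the current shell.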
Deployment
AppScale
After installing AppScale Tools and Ansible, the AppScale cluster needs to be deployed. After defining the AWS/Eucalyptus variables, initialize the creation of the AppScale cluster configuration file – AppScalefile.
$ ./appscale-tools/bin/appscale init cloud
Edit the AppScalefile, providing information for the keypair, security group, and AppScale AMI/EMI. The keypair and security group do not need to be pre-created. AppScale will handle this. The AppScale AMI on AWS (us-east-1) is ami-4e472227. The Eucalyptus EMI will be unique based upon the Eucalyptus cloud that is being used. In this example, the AWS AppScale AMI will be used, and the AppScale cluster size will be 3 nodes. Here is the example AppScalefile:
---
group : 'appscale-rmr'
infrastructure : 'ec2'
instance_type : 'm1.large'
keyname : 'appscale-rmr'
machine : 'ami-4e472227'
max : 3
min : 3
table : 'hypertable'
After editing the AppScalefile, start up the AppScale cluster by running the following command:
$ ./appscale-tools/bin/appscale up
Once the cluster finishes setting up, the status of the cluster can be seen by running the command below:
$ ./appscale-tools/bin/appscale status
R, RHadoop Installation Using Ansible
Now that the cluster is up and running, grab the Ansible playbook for installing R and the RHadoop rmr2 and rhdfs packages onto the AppScale nodes. The playbook can be downloaded from GitHub using git:
$ git clone https://github.com/hspencer77/ansible-r-appscale-playbook.git
After downloading the playbook, the ansible-r-appscale-playbook/production file needs to be populated with the information of the AppScale cluster. Grab the cluster node information by running the following command:
$ ./appscale-tools/bin/appscale status | grep amazon | grep Status | awk '{print $5}' | cut -d ":" -f 1
ec2-50-17-96-162.compute-1.amazonaws.com
ec2-50-19-45-193.compute-1.amazonaws.com
ec2-67-202-23-157.compute-1.amazonaws.com
Add those DNS entries to the ansible-r-appscale-playbook/production file. After editing, the file will look like the following:
[appscale-nodes]
ec2-50-17-96-162.compute-1.amazonaws.com
ec2-50-19-45-193.compute-1.amazonaws.com
ec2-67-202-23-157.compute-1.amazonaws.com
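The lookup and edit steps above could be combined into one pipeline. The sketch below assumes the host names arrive one per line on standard input; make_inventory is a hypothetical helper name, and the group header simply mirrors the [appscale-nodes] group used in this example:

```shell
# Turn a list of host names on stdin into an Ansible inventory group.
# make_inventory is a hypothetical helper, not part of Ansible or AppScale.
make_inventory() {
    printf '[appscale-nodes]\n'
    # Drop blank lines and duplicate hosts while keeping the original order.
    awk 'NF && !seen[$0]++'
}

# Demonstration using the three nodes from this example.
make_inventory <<'EOF'
ec2-50-17-96-162.compute-1.amazonaws.com
ec2-50-19-45-193.compute-1.amazonaws.com
ec2-67-202-23-157.compute-1.amazonaws.com
EOF

# Against a live cluster, the status pipeline could feed the helper directly:
#   ./appscale-tools/bin/appscale status | grep amazon | grep Status \
#     | awk '{print $5}' | cut -d ":" -f 1 \
#     | make_inventory > ansible-r-appscale-playbook/production
```

Writing the file this way avoids copy-and-paste mistakes when the cluster has more than a handful of nodes.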
Now the playbook can be executed. The playbook requires the SSH private key to the nodes. This key will be located under the ~/.appscale folder. In this example, the key file is named appscale-rmr.key. To execute the playbook, run the following command:
$ ansible-playbook -i ansible-r-appscale-playbook/production \
--private-key=~/.appscale/appscale-rmr.key -v ansible-r-appscale-playbook/site.yml
Testing Out The Deployment – Wordcount.R
Once the playbook has finished running, the AppScale cluster is now ready to be used. To test out the setup, SSH into the head node of the AppScale cluster. To find out the head node of the cluster, execute the following command:
$ ./appscale-tools/bin/appscale status
After discovering the head node, SSH into the head node using the private key located in the ~/.appscale directory:
$ ssh -i ~/.appscale/appscale-rmr.key root@ec2-50-17-96-162.compute-1.amazonaws.com
To test out the R setup on all the nodes, grab the wordcount.R program:
root@appscale-image0:~# tar zxf rmr2_2.0.2.tar.gz rmr2/tests/wordcount.R
In the wordcount.R file, the following lines are present:
rmr2:::hdfs.put("/etc/passwd", "/tmp/wordcount-test")
out.hadoop = from.dfs(wordcount("/tmp/wordcount-test", pattern = " +"))
When the wordcount.R program is executed, it will grab the /etc/passwd file from the head node, copy it to the HDFS filesystem, then run wordcount on it, using the pattern " +" (one or more spaces) to split each line into words. NOTE: wordcount.R can be edited to use any file and pattern desired.
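To get a feel for what the job computes, the same count can be approximated locally without Hadoop. This is only a rough sketch using standard Unix tools, not the rmr2 implementation; local_wordcount is a hypothetical name, and unlike the R version it ignores empty tokens:

```shell
# Count how often each space-separated token appears in a file.
# local_wordcount is a hypothetical helper that runs on one machine only.
local_wordcount() {
    # Turn every run of spaces into a single newline, one token per line,
    # then tally the tokens and sort the result by token.
    tr -s ' ' '\n' < "$1" \
        | awk 'NF { count[$0]++ } END { for (w in count) print w, count[w] }' \
        | sort
}

# Show the first few counts for the same input file the test job uses.
if [ -r /etc/passwd ]; then
    local_wordcount /etc/passwd | head -n 5
fi
```

Because most lines in /etc/passwd contain few spaces, many tokens are whole lines with a count of 1, which is also what the MapReduce job should report.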
Run wordcount.R:
root@appscale-image0:~# R
R version 2.15.3 (2013-03-01) -- "Security Blanket"
Copyright (C) 2013 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
> source('rmr2/tests/wordcount.R')
Loading required package: Rcpp
Loading required package: RJSONIO
Loading required package: digest
Loading required package: functional
Loading required package: stringr
Loading required package: plyr
13/04/05 02:33:41 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:33:43 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
packageJobJar: [/tmp/RtmprcYtsu/rmr-local-env19811a7afd54, /tmp/RtmprcYtsu/rmr-global-env1981646cf288, /tmp/RtmprcYtsu/rmr-streaming-map198150b6ff60, /tmp/RtmprcYtsu/rmr-streaming-reduce198177b3496f, /tmp/RtmprcYtsu/rmr-streaming-combine19813f7ea210, /var/appscale/hadoop/hadoop-unjar5632722635192578728/] [] /tmp/streamjob8198423737782283790.jar tmpDir=null
13/04/05 02:33:44 WARN snappy.LoadSnappy: Snappy native library is available
13/04/05 02:33:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/05 02:33:44 INFO snappy.LoadSnappy: Snappy native library loaded
13/04/05 02:33:44 INFO mapred.FileInputFormat: Total input paths to process : 1
13/04/05 02:33:44 INFO streaming.StreamJob: getLocalDirs(): [/var/appscale/hadoop/mapred/local]
13/04/05 02:33:44 INFO streaming.StreamJob: Running job: job_201304042111_0015
13/04/05 02:33:44 INFO streaming.StreamJob: To kill this job, run:
13/04/05 02:33:44 INFO streaming.StreamJob: /root/appscale/AppDB/hadoop-0.20.2-cdh3u3/bin/hadoop job -Dmapred.job.tracker=10.77.33.247:9001 -kill job_201304042111_0015
13/04/05 02:33:44 INFO streaming.StreamJob: Tracking URL: http://appscale-image0:50030/jobdetails.jsp?jobid=job_201304042111_0015
13/04/05 02:33:45 INFO streaming.StreamJob: map 0% reduce 0%
13/04/05 02:33:51 INFO streaming.StreamJob: map 50% reduce 0%
13/04/05 02:33:52 INFO streaming.StreamJob: map 100% reduce 0%
13/04/05 02:33:59 INFO streaming.StreamJob: map 100% reduce 33%
13/04/05 02:34:02 INFO streaming.StreamJob: map 100% reduce 100%
13/04/05 02:34:04 INFO streaming.StreamJob: Job complete: job_201304042111_0015
13/04/05 02:34:04 INFO streaming.StreamJob: Output: /tmp/RtmprcYtsu/file1981524ee1a3
13/04/05 02:34:05 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:07 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:08 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:10 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Deleted hdfs://10.77.33.247:9000/tmp/wordcount-test
>quit("yes")
That's it! The AppScale cluster is ready for additional R programs that utilize MapReduce. Enjoy the world of Big Data on public/private IaaS.
What's new in Ansible 1.1 for AWS and Eucalyptus users?
Reblogged from Take that to the bank and cash it!:
I thought the Ansible 1.0 development cycle was busy but 1.1 is crammed full of orchestration goodness. On Tuesday, 1.1 was released and you can read more about it here: http://blog.ansibleworks.com/2013/04/02/ansible-1-1-released/
For those working on AWS and Eucalyptus, 1.1 brings some nice module improvements as well as a new cloudformation and s3 module. It's great to see the AWS-related modules becoming so popular so quickly.
Using Ansible to Deploy Neo4j HA Cluster on AWS/Eucalyptus
As a follow-up to my last Neo4j, AWS/Eucalyptus blog, this entry demonstrates another great example of AWS/Eucalyptus fidelity by using Ansible to deploy a Neo4j High Availability cluster.
Pre-requisites
In order to use this Ansible playbook on AWS/Eucalyptus, the following is needed:
- An AWS or Eucalyptus account, with a user’s access key and secret access key.
- EC2 IAM Policy to allow launching of instances, and authorize ports in security group
- Ubuntu Cloud Image (Precise 12.04)
- EC2 API Client Tools
- git repository tools
Before deploying the cluster, a security group needs to be created that the cluster will use. The security group must allow the following:
- port 22 (SSH)
- all instances that are part of the security group allowed to communicate with each other (ports 0 - 65535)
To create the security group and authorize the ports, make sure the user’s access key, secret access key, and EC2 URL are noted, and do the following:
- Create the security group
ec2-create-group --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> -g neo4j-cluster -d "Neo4j HA Cluster"
- Authorize port for SSH in neo4j-cluster security group
ec2-authorize --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> -P tcp -p 22 -s 0.0.0.0/0 neo4j-cluster
- Authorize all port communication between cluster members
ec2-authorize --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> -P tcp -o neo4j-cluster -p -1 neo4j-cluster
After completing these steps, use ec2-describe-group to view the security group:
ec2-describe-group --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> neo4j-cluster
GROUP sg-1cbc5777 986451091583 neo4j-cluster Neo4j HA Cluster
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 0 65535 FROM USER 986451091583 NAME neo4j-cluster ID sg-1cbc5777 ingress
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0 ingress
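The listing can also be checked mechanically. The sketch below assumes the output has one GROUP or PERMISSION record per line; check_group_rules is a hypothetical helper that only looks for the two rules this cluster needs:

```shell
# Succeed only if the describe-group output on stdin contains both the SSH
# rule and the all-ports rule between group members.
# check_group_rules is a hypothetical helper, not part of the EC2 API tools.
check_group_rules() {
    awk '
        $1 == "PERMISSION" && /ALLOWS tcp 22 22 FROM CIDR/    { ssh = 1 }
        $1 == "PERMISSION" && /ALLOWS tcp 0 65535 FROM USER/  { intra = 1 }
        END { exit !(ssh && intra) }
    '
}

# Demonstration using the output shown in this example.
check_group_rules <<'EOF' && echo "both rules present"
GROUP sg-1cbc5777 986451091583 neo4j-cluster Neo4j HA Cluster
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 0 65535 FROM USER 986451091583 NAME neo4j-cluster ID sg-1cbc5777 ingress
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0 ingress
EOF
```

In practice the output of ec2-describe-group would be piped into the helper instead of the here-document.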
Neo4j HA Cluster Deployment
Once the security group is created with the correct ports authorized, the cluster can be deployed. To deploy the cluster, do the following:
- Obtain Ansible from git and setup the environment by following the instructions mentioned here - http://ansible.cc/docs/gettingstarted.html#getting-ansible
- Obtain the Ansible Playbook for Neo4j HA Cluster using git
git clone https://github.com/hspencer77/ansible-neo4j-cluster.git
- Change directory into ansible-neo4j-cluster
cd ansible-neo4j-cluster
- Set up /etc/ansible/hosts with the following information:
[local]
127.0.0.1
- Populate vars/ec2-config with either Eucalyptus or AWS information. vars/ec2-config contains the following variables:
keypair: <EC2/Eucalyptus Keypair>
ec2_access_key: <EC2_ACCESS_KEY>
ec2_secret_key: <EC2_SECRET_KEY>
ec2_url: <EC2_URL>
instance_type: m1.small
security_group: <AWS/Eucalyptus Security Group>
image: <AMI/EMI>
- Execute the following command:
ansible-playbook neo4j-cluster.yml \
--private-key=<AWS/Eucalyptus Private Key file> --extra-vars "node_count=3"
- After the playbook finishes, a URL will be provided to access the cluster, similar to the example below:
TASK: [Display HAProxy URL] *********************
changed: [23.22.248.75] => {"changed": true, "cmd": "echo \"HAProxy URL for Neo4j - http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/\" ", "delta": "0:00:00.006835", "end": "2013-03-30 19:54:31.104320", "rc": 0, "start": "2013-03-30 19:54:31.097485", "stderr": "", "stdout": "HAProxy URL for Neo4j - http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/"}
To view the status of the cluster in the browser, open up http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/.
- To get the status of the cluster, use curl:
curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' http://ec2-23-22-248-75.compute-1.amazonaws.com/db/manage/server/jmx/query
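A common stumbling block in the steps above is running the playbook while vars/ec2-config still holds template placeholders. The sketch below lists any keys whose value is still wrapped in angle brackets; unfilled_keys is a hypothetical helper, and it assumes the simple "key: value" layout of the file:

```shell
# Print every key in a "key: value" file whose value is still a <placeholder>.
# unfilled_keys is a hypothetical helper, not part of the playbook.
unfilled_keys() {
    awk -F': *' '$2 ~ /^<.*>[ \t]*$/ { print $1 }' "$1"
}

# Check the playbook configuration if it is present in the current directory.
if [ -f vars/ec2-config ]; then
    unfilled_keys vars/ec2-config
fi
```

An empty result suggests every variable has been given a real value; any key that is printed still needs to be filled in.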
That's it! A Neo4j HA cluster with an HAProxy server serving as an endpoint is available to be used. If a bigger cluster is desired, just change the node_count value. For additional information regarding this playbook, and how it handles the cluster membership, please refer to the following URL - https://github.com/hspencer77/ansible-neo4j-cluster/blob/master/README.md.
Hope you enjoy! As always, questions/comments/suggestions are welcome.
Run Appscale on Eucalyptus
Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). - Wikipedia
According to Wikipedia, a few popular service models currently exist:
1. Infrastructure as a service (IaaS)
2. Platform as a service (PaaS)
3. Software as a service (SaaS)
So, I have a Eucalyptus cloud, which is great and serves as an AWS-like IaaS platform.
Test Drive: Drupal Deployment on Eucalyptus using Stackato, Amazon Route 53 and the Eucalyptus Community Cloud
Recently, I did a blog discussing how to deploy a Jenkins server using Stackato, running on Eucalyptus. At the end of that blog, I mentioned how the Eucalyptus Community Cloud (ECC) could be used for testing out the Stackato Microcloud image on Eucalyptus. The previous blog – I felt – was more for DevOps administrators who had access to their own on-premise Eucalyptus clouds. The inspiration of this blog comes from the blog on ActiveBlog entitled “Deploy & Scale Drupal on Any Cloud with Stackato” to show love to Web Developers, and show the power of Amazon’s Route 53.
Test Drive Pre-Reqs
The prerequisites for this blog are the same that are mentioned in my previous blog regarding using Stackato on Eucalyptus (for the Eucalyptus pre-reqs, make sure the ECC is being used). In addition to the prerequisites mentioned above, the following is needed:
After the prerequisites have been met, it’s time to set up the Drupal environment.
Test Drive Engage!
Since the ECC is being used, there is no need to worry about bundling, uploading and registering the Stackato image. The Stackato image used for this blog is as follows:
IMAGE emi-859B3D5C stackato_v2.6.6/stackato-cloudinit.manifest.xml 150820662310 available public x86_64 machine eki-6FBE3D2D eri-67463B77 instance-store
Next, let’s make sure the user has an elastic IP that will be used in AWS Route 53, and a security group to allow proper network traffic to the instance. Do the following:
- Make sure the user credentials are sourced correctly, and euca2ools is installed correctly.
- Grab an elastic IP using euca-allocate-address (in this example 173.205.188.105 was allocated):
# euca-allocate-address
ADDRESS 173.205.188.105
- If the user doesn’t already have a keypair, create one using euca-create-keypair, and make sure the permission of the file is 0600:
# euca-create-keypair hspencer-stackato > hspencer-stackato.priv
# chmod 0600 hspencer-stackato.priv
- Create a security group for the instance to use:
# euca-create-group stackato-test -d "Test Security Group for Stackato PaaS"
GROUP stackato-test Test Security Group for Stackato PaaS
- Authorize ping, ssh, http, and https ports:
# euca-authorize -P icmp -t -1:-1 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS icmp -1 -1 FROM CIDR 0.0.0.0/0
# euca-authorize -P tcp -p 22 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0
# euca-authorize -P tcp -p 80 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS tcp 80 80 FROM CIDR 0.0.0.0/0
# euca-authorize -P tcp -p 443 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS tcp 443 443 FROM CIDR 0.0.0.0/0
- Now, launch the instance, specifying the keypair name to use, and a VM type. On the ECC, only m1.xlarge and c1.xlarge meet the requirements of launching the Stackato image:
# euca-run-instances -k hspencer-stackato -t c1.xlarge emi-859B3D5C -g stackato-test
RESERVATION r-66EE4030 628376682871 stackato-test
INSTANCE i-E85843C4 emi-859B3D5C euca-0-0-0-0.eucalyptus.ecc.eucalyptus.com euca-0-0-0-0.eucalyptus.internal pending hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled 0.0.0.0 0.0.0.0 instance-store
- Once the instance gets to a running state, associate the elastic IP that the user owns to the instance:
# euca-describe-instances
RESERVATION r-66EE4030 628376682871 stackato-test
INSTANCE i-E85843C4 emi-859B3D5C euca-173-205-188-106.eucalyptus.ecc.eucalyptus.com euca-10-9-190-24.eucalyptus.internal running hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled 173.205.188.10 10.9.190.24 instance-store
# euca-associate-address -i i-E85843C4 173.205.188.105
ADDRESS 173.205.188.105 i-E85843C4
# euca-describe-instances
RESERVATION r-66EE4030 628376682871 stackato-test
INSTANCE i-E85843C4 emi-859B3D5C euca-173-205-188-105.eucalyptus.ecc.eucalyptus.com euca-10-9-190-24.eucalyptus.internal running hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled 173.205.188.10 10.9.190.24 instance-store
- Log into the AWS management console, select Route 53, and set up the A and CNAME records in your domain as mentioned here under the Stackato Documentation regarding detailed DNS configuration. In this example, the DNS name associated with the elastic IP 173.205.188.105 is stackato-dev.mindspew-age.com.
- Next, SSH into the instance, and follow the steps for setting up the Stackato instance mentioned in my previous blog under the section Configuration of the Stackato Instance. Make sure the DNS name set up in AWS Route 53 is used with the “kato rename public-DNS-name” and “kato setup core api.public-DNS-name” configuration steps.
- After the instance is configured, just open up the browser and go to the DNS name set up for the Stackato instance in AWS Route 53, as mentioned in the Stackato Documentation regarding configuration via the Management Console.
- Once logged into the Stackato Management Console, select “App Store” in the left-hand menu and select “Drupal” to install.
- After Drupal has installed, start the application. Once it has started successfully, select the URL that shows up in the right-hand menu box. The Drupal log-in page will appear in your browser.
That’s it! Now Drupal is ready for any web developer to test out on the ECC. If there are any questions/comments/suggestions, please feel free to leave comments. Enjoy!
Jenkins, Stackato, Cloud-Init and Eucalyptus == Potent Combination for an On-Premise Continuous Integration Environment
The Ingredients
Jenkins
An extendable open source continuous integration server.
Stackato
The Enterprise Private PaaS that makes it easy to deploy, manage, and monitor applications on any cloud.
Cloud-init
The Ubuntu package that handles early initialization of a cloud instance. It is installed in the Ubuntu Cloud Images and also in the official Ubuntu images available on EC2.
Eucalyptus
Allows you to build production-ready, AWS-compatible private and hybrid clouds by leveraging your existing virtualized infrastructure to create on-demand cloud resource pools.
What happens when you combine all four of these tools? A potent combination for continuous integration on an easy-to-configure PaaS and an on-premise, AWS-compatible IaaS. With this combination, developers can take advantage of the easy configuration that Stackato brings to the table, running on top of Eucalyptus, which brings an AWS-like cloud environment into your datacenter.
This blog entry will discuss the steps that I took to get Jenkins installed on a Stackato instance store-backed instance running on Eucalyptus. But before I get started, I would like to thank the folks from ActiveState for all their guidance. Their support staff is really top notch, and very helpful. Check them out in #stackato on freenode.net. They can also be checked out on Twitter at @ActiveState. Now on to the dirty work…..
The Recipe for Success
The Stackato Microcloud Image and Cloud-Init
To begin, the following is needed:
- A running Eucalyptus cloud
- User credentials and proper Eucalyptus IAM policies to allow uploading of images and launching instances. (If there is more information needed here, please check out the Managing Access section in Eucalyptus 3.2 Administrator’s Guide.)
- The Stackato Microcloud VM for KVM
- A Linux desktop with euca2ools installed.
After downloading the Stackato VM for KVM and unzipping the file, we will need to pull out the root file system, the kernel and ramdisk. These will be uploaded, bundled and registered as the EMI, EKI, and ERI. To extract the root filesystem, do the following:
- Use parted to locate the root filesystem as follows:
# parted stackato-img-kvm-v2.6.6.img
GNU Parted 2.1
Using /root/images/Stackato-VM/stackato-img-kvm-v2.6.6.img
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) U
Unit? [compact]? b
(parted) p
Model: (file)
Disk /root/images/Stackato-VM/stackato-img-kvm-v2.6.6.img: 10737418240B
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start        End           Size         Type     File system     Flags
1       1048576B     200278015B    199229440B   primary  ext3            boot
3       200278016B   1237319679B   1037041664B  primary  linux-swap(v1)
2       1237319680B  10736369663B  9499049984B  primary  ext4

(parted) quit
- In this example, the root filesystem is partition 2. The value for “Start” and “Size” will need to be used. Next, run dd to extract the root filesystem:
dd if=stackato-img-kvm-v2.6.6.img of=stackato-rootfs.img bs=1 skip=1237319680 count=9499049984
- Once it has completed, mount stackato-rootfs.img to the loopback device:
mount -o loop stackato-rootfs.img /mnt/
- Copy out initrd.img-3.2.0-27-virtual and vmlinuz-3.2.0-27-virtual from /mnt/boot.
- In /mnt/etc/fstab, replace the UUID entry with LABEL. The LABEL will look similar to the following:
LABEL=cloudimg-rootfs / ext4 defaults 1 1
- Chroot to /mnt – there may be a need to do a mount -o bind for sys, dev, and proc.
- Run “dpkg-reconfigure cloud-init”, and make sure that the EC2 Data Source is selected.
- Unmount stackato-rootfs.img (if sys, dev, and proc were mounted, unmount them before unmounting stackato-rootfs.img). After it has been unmounted, run tune2fs to change the label of the image:
tune2fs -L cloudimg-rootfs stackato-rootfs.img
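The dd step above copies one byte at a time (bs=1), which is correct but slow for a 9 GB partition. Since parted reports byte offsets, the same extraction can be expressed in larger blocks as long as both Start and Size are multiples of the block size. The sketch below does that arithmetic; dd_args is a hypothetical helper name:

```shell
# Convert parted's byte-based Start and Size into dd skip/count values
# for a given block size. dd_args is a hypothetical helper name.
dd_args() {
    start=$1; size=$2; bs=$3
    # dd can only skip and copy whole blocks, so both values must divide evenly.
    if [ $((start % bs)) -ne 0 ] || [ $((size % bs)) -ne 0 ]; then
        echo "start and size must be multiples of bs=$bs" >&2
        return 1
    fi
    echo "skip=$((start / bs)) count=$((size / bs))"
}

dd_args 1237319680 9499049984 512      # skip=2416640 count=18552832
dd_args 1237319680 9499049984 1048576  # skip=1180 count=9059

# With GNU dd, the 1 MiB result corresponds to:
#   dd if=stackato-img-kvm-v2.6.6.img of=stackato-rootfs.img bs=1M skip=1180 count=9059
```

For this image both values happen to be aligned to 1 MiB, so the larger block size should produce the same root filesystem image in far fewer read and write calls.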
After following these steps, the following should be available:
- initrd.img-3.2.0-27-virtual – to be bundled, uploaded and registered as the ERI
- vmlinuz-3.2.0-27-virtual – to be bundled, uploaded and registered as the EKI
- stackato-rootfs.img – to be bundled, uploaded and registered as the EMI
Go through the steps of bundling, uploading and registering the ERI, EKI, and EMI. For additional information, please refer to the Add an Image section of the Eucalyptus 3.2 User Guide.
Launching the Stackato Image
Now it’s time to launch the Stackato image on Eucalyptus. Since cloud-init now has the EC2 data source enabled, the instance will grab SSH keys and mount the ephemeral storage when it is launched. Also, additional configuration can be passed using the user-data file option. More information regarding this can be found in Stackato’s documentation on using cloud-init. The key thing to remember here is that the minimum RAM requirement for the Stackato image is 2 GB. Make sure the VM type used for launching the Stackato image has at least 2 GB of RAM. In this example, the image ID is emi-DAB23A8A. The ramdisk and kernel are registered as eri-9B453C09 and eki-ADF640B0. The VM type c1.xlarge is used, which has 4 CPUs, 4096 MB of RAM, and 50 GB of disk space.
euca-run-instances -k admin emi-DAB23A8A -t c1.xlarge --kernel eki-ADF640B0 --ramdisk eri-9B453C09
Use euca-describe-instances to check to see when the instance reaches a running state:
euca-describe-instances i-100444EF
RESERVATION r-CC69438B 345590850920 default
INSTANCE i-100444EF emi-DAB23A8A euca-192-168-55-104.wu-tang.euca-hasp.eucalyptus-systems.com euca-10-106-101-17.wu-tang.internal running admin 0 c1.xlarge 2013-02-23T00:34:07.436Z enter-the-wu eki-ADF640B0 eri-9B453C09 monitoring-disabled 192.168.55.104 10.106.101.17 instance-store
The key thing for running a Stackato instance is setting up the correct DNS entries. For more information regarding setting up DNS with regards to a Stackato instance, please read the Detail Configuration section on DNS in the Stackato online documentation. For this example, instead of configuring the A and CNAME records in an external DNS service with a tool like nsupdate, we will use xip.io. xip.io is a magic domain name that provides wildcard DNS for any IP address. Next, it’s time to configure the Stackato instance.
Configuration of the Stackato Instance
To configure the Stackato instance, do the following:
- SSH into the instance.
ssh -i creds/admin.priv stackato@euca-192-168-55-104.wu-tang.euca-hasp.eucalyptus-systems.com
- Make note of the ip address associated with eth0 and the netmask using ifconfig. Also note the gateway IP by using the route command.
- Run “kato op static_ip” to configure the static IP address for eth0. Make sure and add 127.0.0.1 as the first entry as part of the nameservers, and add “containers.” as the first entry under the search domains.
- Run “kato rename public DNS name “, where public DNS name includes the public IP of the instance, using xip.io (e.g. 192.168.55.104.xip.io)
- Run “kato disable mdns”, then run “sudo reboot” to reboot the instance.
- Once the instance has come back up, SSH into the instance, and run the command “kato setup core api.public DNS name”, where public DNS name is the same value used for the “kato rename” step (e.g. 192.168.55.104.xip.io).
- Next, edit /etc/resolv.conf and make sure that the value for the search option is “containers.”, and the first entry for the nameservers is 127.0.0.1.
- Finally, run “kato enable --all-but mdns”
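The two DNS names used in the kato steps above follow directly from the instance's public IP. As a small sketch (stackato_dns_names is a hypothetical helper, and the check is only a loose sanity test, not full IPv4 validation):

```shell
# Print the xip.io names for a public IP: first the value for "kato rename",
# then the API endpoint for "kato setup core".
# stackato_dns_names is a hypothetical helper, not a kato command.
stackato_dns_names() {
    case $1 in
        # Reject empty input or anything containing characters other than digits and dots.
        ''|*[!0-9.]*) echo "not an IPv4 address: $1" >&2; return 1 ;;
    esac
    printf '%s.xip.io\napi.%s.xip.io\n' "$1" "$1"
}

stackato_dns_names 192.168.55.104
```

For the instance in this example, the first line printed is the name passed to “kato rename” and the second is the name passed to “kato setup core”.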
That’s it! Now open the public DNS name in your favorite browser. For this example, 192.168.55.104.xip.io was used. The landing page should look similar to what you see here in the Stackato documentation regarding accessing the instance through the management console.
Setting Up Jenkins
After setting up the admin account, navigate to the “App Store” on the lefthand menu. Once selected, navigate to find the Jenkins application:
After selecting to install Jenkins, select “Yes” to install. After the installation takes place, select “Applications” in the left-hand menu. From there, select the Jenkins application, and select “Start” (it’s the green arrow to the right of the application window). Once it has started, you will see the following:
Now Jenkins is ready to be used.
If anyone wants to test this out on Eucalyptus but doesn’t have access to their own Eucalyptus cloud, fear not, the Eucalyptus Community Cloud has the Stackato image available. After applying to get access to the Community Cloud, follow the steps above. The image for Stackato is as follows:
IMAGE emi-859B3D5C stackato_v2.6.6/stackato-cloudinit.manifest.xml 150820662310 available public x86_64 machine eki-6FBE3D2D eri-67463B77 instance-store
And as always, this image and steps can be used on AWS EC2 as well.
Let me know if there are any questions. Feedback is always welcome. Enjoy!
DIY Debian Packages for Eucalyptus 3.2
Just read this latest blog from Brian Thomason, Engineer at Eucalyptus Systems, Inc. He leads us to the promised land on how to create your own Debian packages for Eucalyptus 3.2.
Who knows, maybe he will follow up with a blog discussing how to use Walrus buckets to serve up the APT repository – similar to what can be done with Amazon S3. Make sure to visit Brian’s blog entry. Any feedback will be greatly appreciated! Keep up the good work Brian!
Using the Eucalyptus User Console with AWS
Reblogged from Coders Like Us:
At the end of last year, we (Eucalyptus) released version 3.2 which included our user console. This feature finally allowed regular users to login to a web UI to manage their resources. Because this was our first release, we had a lot of catching up to do. I would say that is still the case, but the point here is that we were able to test all of our features against Eucalyptus.
2012 in Review
Happy New Year! I want to thank everyone who promoted, followed, commented and inspired my blog entries in 2012. I look forward to providing more material in 2013.
The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.
Here’s an excerpt:
600 people reached the top of Mt. Everest in 2012. This blog got about 9,300 views in 2012. If every person who reached the top of Mt. Everest viewed this blog, it would have taken 16 years to get that many views.
AWS EBS-backed AMI to Eucalyptus Walrus-backed EMI
Preface
A few weeks back, I was doing some testing with the guys from AppScale to get a Eucalyptus Machine Image (EMI) to run on Eucalyptus. The image that was provided to me was an EBS-backed Amazon Machine Image (AMI), using a published EC2 Lucid Ubuntu Cloud image. This blog entry describes the procedure to convert an EBS-backed AMI to a Walrus-backed EMI. The goal here is to demonstrate how easy it is to use Ubuntu Cloud images to set up AppScale on both AWS and Eucalyptus as a hybrid cloud use case. There are many other hybrid cloud use cases that can be done with this setup, but this blog entry will focus on the migration of AMI images to EMI images.
*NOTE* This entry assumes that a user is experienced with both Amazon Web Services and Eucalyptus. For additional information, please refer to the following resources:
- Amazon Elastic Compute Cloud User Guide
- AWS Identity and Access Management – Using IAM Guide
- Amazon Elastic Compute Cloud CLI Guide
- Eucalyptus 3.2 Administrator’s Guide
- Eucalyptus 3.2 User Guide
Prerequisites
Before getting started, the following is needed:
- For Amazon Web Services
- For Eucalyptus
*NOTE* Make sure there is an understanding of the IAM policies on AWS and Eucalyptus. These are key in making sure that the user on both AWS and Eucalyptus can perform all the steps covered in this topic.
Work in AWS…
After setting up the command-line tools for AWS EC2 and adding the necessary EC2 and S3 IAM policies, everything is in place to start working with the AWS instances and images. *NOTE* To get help with setting up the IAM policies, check out the AWS Policy Generator. To make sure things look good, I tested my EC2 access by running ec2-describe-availability-zones:
$ ec2-describe-availability-zones
AVAILABILITYZONE us-east-1a available us-east-1
AVAILABILITYZONE us-east-1b available us-east-1
AVAILABILITYZONE us-east-1c available us-east-1
AVAILABILITYZONE us-east-1d available us-east-1
After that, I set up a keypair and SSH access for any instance that is launched within the default security group:
$ ec2-create-keypair hspencer-appscale --region ap-northeast-1 > hspencer-appscale.pem
$ ec2-authorize -P tcp -p 22 -s 0.0.0.0/0 default --region ap-northeast-1
With everything looking good, I went ahead and checked out the AMI that I was asked to test. Below is the AMI that was given to me:
$ ec2-describe-images ami-2e4bf22f --region ap-northeast-1
IMAGE ami-2e4bf22f 839953741869/appscale-lite-1.6.3-testing 839953741869 available public x86_64 machine aki-d409a2d5 ebs paravirtual xen
BLOCKDEVICEMAPPING EBS /dev/sda1 snap-7953a059 8 true standard
BLOCKDEVICEMAPPING EPHEMERAL /dev/sdb ephemeral0
As you can see, the AMI given to me is an EBS-backed image, and it is in a different region (ap-northeast-1). I could have done all my work in the ap-northeast-1 region, but I wanted to test out region-to-region migration of images on AWS S3 using ec2-migrate-manifest. In order to access the EBS-backed instance that is launched, I set up a keypair and SSH access for any instance that is launched within the default security group:
$ ec2-create-keypair hspencer-appscale --region ap-northeast-1 > hspencer-appscale.pem
$ ec2-authorize -P tcp -p 22 -s 0.0.0.0/0 default --region ap-northeast-1
Now that I have my image, keypair and security group access, I am ready to launch an instance, so I can use the ec2-bundle-vol command to create an image of the instance. To launch the instance, I ran the following:
$ ec2-run-instances -k hspencer-appscale ami-2e4bf22f --region ap-northeast-1
After the instance is up and running, I scp’d my EC2_PRIVATE_KEY and EC2_CERT to the instance using the keypair created (hspencer-appscale.pem). The instance already had the latest version of ec2-api-tools and ec2-ami-tools as part of the installation of AppScale. Similar to the instructions provided by AWS for creating an instance-store backed AMI from an existing AMI, I used ec2-bundle-vol to bundle a new image and used /mnt/ (which is ephemeral storage) to store the manifest information.
root@ip-10-156-123-126:~# ec2-bundle-vol -u 9xxxxxxx3 -k pk-XXXXXXXXXXXXXXXX.pem -c cert-XXXXXXXXXXXXXXXXX.pem -d /mnt/ -e /mnt/
Please specify a value for arch [x86_64]: x86_64
Copying / into the image file /mnt/image...
Excluding:
  /dev
  /sys
  /sys/kernel/security
  /sys/kernel/debug
  /proc
  /dev/pts
  /dev
  /media
  /mnt
  /proc
  /sys
  /mnt/
  /mnt/image
  /mnt/img-mnt
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.00990555 s, 106 MB/s
mke2fs 1.41.11 (14-Mar-2010)
Bundling image file...
Splitting /mnt/image.tar.gz.enc...
Created image.part.000
……………..
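The scp step above relies on EC2_PRIVATE_KEY and EC2_CERT pointing at real files on the workstation. A minimal sketch of a pre-flight check is shown below; the function name check_ec2_creds is hypothetical, and only the two variable names come from this entry.

```shell
#!/bin/sh
# Hypothetical helper: confirm that EC2_PRIVATE_KEY and EC2_CERT are set
# and point at readable files before copying them to an instance.
check_ec2_creds() {
    for var in EC2_PRIVATE_KEY EC2_CERT; do
        # Look up the value of the variable whose name is stored in $var.
        eval "path=\${$var:-}"
        if [ -z "$path" ]; then
            echo "$var is not set" >&2
            return 1
        fi
        if [ ! -r "$path" ]; then
            echo "$var points at an unreadable file: $path" >&2
            return 1
        fi
    done
    return 0
}
```

One possible use is to chain it in front of the copy, for example check_ec2_creds && scp ..., so that the copy is skipped when either file is missing.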
Next, I need to update the manifest so that it targets us-west-1 as the region to store the image, rather than ap-northeast-1. To do this, I used ec2-migrate-manifest. *NOTE* This tool can only be used in the following regions: EU, US, us-gov-west-1, us-west-1, us-west-2, ap-southeast-1, ap-southeast-2, ap-northeast-1, sa-east-1.
root@ip-10-156-123-126:~# ec2-migrate-manifest -m /mnt/image.manifest.xml -c cert-XXXXXXXXX.pem -k pk-XXXXXXXXXXXX.pem -a XXXXXXXXXX -s XXXXXXXXX --region us-west-1
Backing up manifest...
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
Successfully migrated /mnt/image.manifest.xml
It is now suitable for use in us-west-1.
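Since ec2-migrate-manifest only accepts the regions listed in the note above, a small guard can catch a bad target region before the tool is invoked. This is a sketch only: the function name migrate_region_supported is hypothetical, and the region list is copied from the note in this entry, so it may not match newer versions of the tools.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the region is in the list quoted
# in the note above (the list is an assumption for other tool versions).
migrate_region_supported() {
    case "$1" in
        EU|US|us-gov-west-1|us-west-1|us-west-2|ap-southeast-1|ap-southeast-2|ap-northeast-1|sa-east-1)
            return 0 ;;
        *)
            return 1 ;;
    esac
}
```

For example, migrate_region_supported us-west-1 succeeds, so the migration shown above would proceed.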
Time to upload the bundle to S3 using ec2-upload-bundle:
root@ip-10-156-123-126:~# ec2-upload-bundle -b appscale-lite-1.6.3-testing -m /mnt/image.manifest.xml -a XXXXXXXXXX -s XXXXXXXXXX --location us-west-1
You are bundling in one region, but uploading to another. If the kernel or ramdisk associated with this AMI are not in the target region, AMI registration will fail.
You can use the ec2-migrate-manifest tool to update your manifest file with a kernel and ramdisk that exist in the target region.
Are you sure you want to continue? [y/N]y
Creating bucket...
Uploading bundled image parts to the S3 bucket appscale-lite-1.6.3-testing ...
Uploaded image.part.000
Uploaded image.part.001
Uploaded image.part.002
Uploaded image.part.003
………….
After the image has been uploaded successfully, all that is left to do is register the image.
root@ip-10-156-123-126:~# export JAVA_HOME=/usr
root@ip-10-156-123-126:~# ec2-register -K pk-XXXXXXXXXXXX.pem -C cert-XXXXXXXXXX.pem --region us-west-1 appscale-lite-1.6.3-testing/image.manifest.xml --name appscale1.6.3-testing
IMAGE ami-705d7c35
$ ec2-describe-images ami-705d7c35 --region us-west-1
IMAGE ami-705d7c35 986451091583/appscale1.6.3-testing 986451091583 available private x86_64 machine aki-9ba0f1de instance-store paravirtual xen
BLOCKDEVICEMAPPING EPHEMERAL /dev/sdb ephemeral0
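The two ec2-describe-images listings in this section differ in one key field: the original AMI reports ebs, while the newly registered one reports instance-store. The sketch below pulls that field out of the IMAGE line so the conversion can be confirmed in a script. The function name root_device_type is hypothetical, and it searches for the two known values rather than relying on a column position, since the column layout of the output is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: read ec2-describe-images output on stdin and print
# the root device type ("ebs" or "instance-store") from the IMAGE line.
root_device_type() {
    awk '$1 == "IMAGE" {
        # Scan the fields instead of assuming a fixed column number.
        for (i = 2; i <= NF; i++) {
            if ($i == "ebs" || $i == "instance-store") {
                print $i
                exit
            }
        }
    }'
}

# Example usage (requires the EC2 API tools and credentials):
#   ec2-describe-images ami-705d7c35 --region us-west-1 | root_device_type
```

The function prints nothing when no IMAGE line carries either value, which a caller can treat as an unexpected result.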
Work in Eucalyptus…
Now that we have the image registered, we can use ec2-download-bundle and ec2-unbundle to get the machine image to an instance running on Eucalyptus, so that we can bundle, upload and register the image to Eucalyptus.
To start off, I followed the instructions for setting up my command-line environment, and Eucalyptus IAM policies on Eucalyptus – similar to what was done for AWS.
Next, I downloaded the lucid-server-cloudimg-amd64.tar.gz file from the Ubuntu Cloud Images (Lucid) site. After that, I bundled, uploaded and registered the following images:
- lucid-server-cloudimg-amd64-loader (ramdisk)
- lucid-server-cloudimg-amd64-vmlinuz-virtual (kernel)
- lucid-server-cloudimg-amd64.img (root image)
After bundling, uploading and registering those images, I created a keypair and authorized SSH access for any instance that is launched within the default security group:
euca-add-keypair hspencer-euca > hspencer-euca.pem
euca-authorize -P tcp -p 22 -s 0.0.0.0/0 default
Now, I run the EMI for the Lucid image that was registered:
euca-run-instances -k hspencer-euca --user-data-file cloud-init.config -t m1.large emi-29433329
I used the m1.large VM type so that I can use the ephemeral storage to hold the image that I will pull from AWS.
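Because the ephemeral disk has to hold both the downloaded bundle parts and the unbundled image, it may be worth checking free space on the instance before pulling anything from S3. The sketch below is one way to do that with df; the function name has_free_mb is hypothetical, and the amount of space to ask for is an assumption that depends on the size of the image being migrated.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem that holds directory $1
# has at least $2 megabytes available.
has_free_mb() {
    # df -P prints one line per filesystem; column 4 is available space
    # in 1024-byte blocks when -k is given.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$(( $2 * 1024 ))" ]
}

# Example usage on the instance (the 20000 MB figure is illustrative):
#   has_free_mb /media/ephemeral0 20000 || echo "not enough space" >&2
```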
Once the instance is running, I scp’d my EC2_PRIVATE_KEY and EC2_CERT to the instance using the keypair created (hspencer-euca.pem). After installing the ec2-ami-tools on the instance, I used ec2-download-bundle to download the bundle to /media/ephemeral0, and ec2-unbundle to unbundle the image:
# ec2-download-bundle -b appscale-lite-1.6.3-testing -d /media/ephemeral0/ -a XXXXXXXXXXX -s XXXXXXXXXXXX -k pk-XXXXXXXXX.pem --url http://s3-us-west-1.amazonaws.com
# ec2-unbundle -m /media/ephemeral0/image.manifest.xml -s /media/ephemeral0/ -d /media/ephemeral0/ -k pk-XXXXXXXXXX.pem
Now that I have the root image from AWS, I just need to bundle, upload and register the root image to Eucalyptus. To do so, I scp’d my Eucalyptus user credentials to the instance. After copying the Eucalyptus credentials to the instance, I ssh’ed into the instance and sourced the Eucalyptus credentials.
Since I have already bundled the kernel and ramdisk for the Ubuntu Cloud Lucid image, I just need to bundle, upload and register the image I unbundled from AWS. To do so, I did the following:
euca-bundle-image -i image
euca-upload-bundle -b appscale-1.6.3-x86_64 -m /tmp/image.manifest.xml
euca-register -a x86_64 appscale-1.6.3-x86_64/image.manifest.xml
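The three commands above share two derived values: the manifest name, which follows the image file name, and the bucket-relative manifest path passed to euca-register. The sketch below chains the steps so that a failure stops the sequence. The function name bundle_upload_register and the RUN variable are hypothetical, and the /tmp manifest location is an assumption taken from the command shown in this entry.

```shell
#!/bin/sh
# Hypothetical wrapper around the three euca2ools steps shown above.
# Arguments: image file, bucket name, architecture.
# Set RUN=echo for a dry run that only prints the commands.
bundle_upload_register() {
    image=$1
    bucket=$2
    arch=$3
    # The manifest is named after the image file; /tmp is assumed to be
    # where euca-bundle-image writes it, as in the commands above.
    manifest="$(basename "$image").manifest.xml"
    ${RUN:-} euca-bundle-image -i "$image" &&
    ${RUN:-} euca-upload-bundle -b "$bucket" -m "/tmp/$manifest" &&
    ${RUN:-} euca-register -a "$arch" "$bucket/$manifest"
}

# Example dry run:
#   RUN=echo bundle_upload_register image appscale-1.6.3-x86_64 x86_64
```

With RUN=echo the function prints the same three commands listed above, which makes it easy to review the derived paths before running them for real.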
Now the image is ready to be launched on Eucalyptus.
Conclusion
As demonstrated above, the AWS fidelity that Eucalyptus provides makes it possible to set up hybrid cloud environments with Eucalyptus and AWS that applications like AppScale can leverage.
Other examples of AMI to EMI conversions can be found here:
https://github.com/eucalyptus/ami2emi
Enjoy!























