Archive
Introducing Micro QA
Reblogged from Testing Clouds at 128bpm:
I have devoted my last 2 years to testing Eucalyptus. In that period the QA team and I have gone through many iterations of tools to find those that make us most efficient. It has become a never ending and enjoyable quest.
We have evolved our testing processes through the following stages:
- Using command line tools exclusively
- Writing scripts that call command line tools and parsing their output…
Demo of Eucalyptus hotness: 3.3 milestone 6
Reblogged from Greg DeKoenigsberg Speaks:
Our demo day for milestone 6 was yesterday, and it was choice. We're at feature completeness at this point, and we're now on final approach for release sometime Soon-ish, as soon as we shake out all the code nasties. We've got some good stuff to show off on Vimeo. The basic transcript:
- 0:00 Eric Choi, Product Mktg Manager, with agenda/housekeeping.
Big Data on the Cloud using Ansible, RHadoop, AppScale, and AWS/Eucalyptus
Background
Big Data has been a hot topic over the last few years. Big Data on public clouds, such as AWS’s Elastic MapReduce, has been gaining even more popularity as cloud computing becomes more of an industry standard.
R is an open source project for statistical computing and graphics. It has been growing in popularity at various universities and companies for linear and nonlinear modeling, classical statistical tests, time-series analysis, and more.
RHadoop was developed by Revolution Analytics to interface with Hadoop. Revolution Analytics builds analytic software solutions using R.
AppScale is an open source PaaS that implements the Google AppEngine API on IaaS environments. One of the Google AppEngine APIs that it implements is AppEngine MapReduce. AppScale's back-end support for this API uses Cloudera’s Distribution for Apache Hadoop.
Ansible is open source orchestration software that uses SSH to handle configuration management for physical and virtual machines, including machines running in the cloud.
Amazon Web Services is a public IaaS that provides infrastructure and application services in the cloud. Eucalyptus is an open source software solution that provides the AWS APIs for EC2, S3, and IAM for on-premise cloud environments.
This blog entry will cover how to deploy AppScale (either on AWS or Eucalyptus), then use Ansible to configure each AppScale node with R and the RHadoop packages in order to allow programs written in R to utilize MapReduce in the cloud.
Pre-requisites
To get started, the following is needed on a desktop/laptop computer:
- AppScale Tools installed.
- Ansible installed.
- The following AWS/Eucalyptus variables exported as environment variables in your shell:
- EC2_ACCESS_KEY
- EC2_SECRET_KEY
- EC2_URL
*NOTE: These variables are used by AppScale Tools version 1.6.9. Check the AWS and Eucalyptus documentation regarding obtaining user credentials.
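Before running any AppScale command, it can help to confirm that all three variables are actually set. The following is a minimal sketch, not part of AppScale Tools; the function name check_ec2_env is made up for this example.

```shell
# Hypothetical helper (not shipped with AppScale Tools): report which of
# the three credential variables are unset or empty in the current shell.
# Prints "ok" and returns 0 when all are set; otherwise prints the missing
# names and returns 1.
check_ec2_env() {
    missing=""
    for name in EC2_ACCESS_KEY EC2_SECRET_KEY EC2_URL; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Example use before deploying:
#   check_ec2_env && ./appscale-tools/bin/appscale init cloud
```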
Deployment
AppScale
After installing AppScale Tools and Ansible, the AppScale cluster needs to be deployed. After defining the AWS/Eucalyptus variables, initialize the creation of the AppScale cluster configuration file – AppScalefile.
$ ./appscale-tools/bin/appscale init cloud
Edit the AppScalefile, providing information for the keypair, security group, and AppScale AMI/EMI. The keypair and security group do not need to be pre-created. AppScale will handle this. The AppScale AMI on AWS (us-east-1) is ami-4e472227. The Eucalyptus EMI will be unique based upon the Eucalyptus cloud that is being used. In this example, the AWS AppScale AMI will be used, and the AppScale cluster size will be 3 nodes. Here is the example AppScalefile:
---
group : 'appscale-rmr'
infrastructure : 'ec2'
instance_type : 'm1.large'
keyname : 'appscale-rmr'
machine : 'ami-4e472227'
max : 3
min : 3
table : 'hypertable'
After editing the AppScalefile, start up the AppScale cluster by running the following command:
$ ./appscale-tools/bin/appscale up
Once the cluster finishes setting up, the status of the cluster can be seen by running the command below:
$ ./appscale-tools/bin/appscale status
R, RHadoop Installation Using Ansible
Now that the cluster is up and running, grab the Ansible playbook for installing R and the RHadoop rmr2 and rhdfs packages onto the AppScale nodes. The playbook can be downloaded from GitHub using git:
$ git clone https://github.com/hspencer77/ansible-r-appscale-playbook.git
After downloading the playbook, the ansible-r-appscale-playbook/production file needs to be populated with the information of the AppScale cluster. Grab the cluster node information by running the following command:
$ ./appscale-tools/bin/appscale status | grep amazon | grep Status | awk '{print $5}' | cut -d ":" -f 1
ec2-50-17-96-162.compute-1.amazonaws.com
ec2-50-19-45-193.compute-1.amazonaws.com
ec2-67-202-23-157.compute-1.amazonaws.com
Add those DNS entries to the ansible-r-appscale-playbook/production file. After editing, the file will look like the following:
[appscale-nodes]
ec2-50-17-96-162.compute-1.amazonaws.com
ec2-50-19-45-193.compute-1.amazonaws.com
ec2-67-202-23-157.compute-1.amazonaws.com
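Copying the hostnames by hand works, but the same file can be generated from the command output. This is a rough sketch under the assumption that the hostnames arrive one per line on standard input; make_inventory is a name invented for this example, not something the playbook provides.

```shell
# Hypothetical helper: build Ansible inventory text from hostnames read on
# stdin, one per line. The group name defaults to appscale-nodes, the group
# used in this post.
make_inventory() {
    group="${1:-appscale-nodes}"
    printf '[%s]\n' "$group"
    # Skip blank lines and repeated hosts while keeping the input order.
    awk 'NF && !seen[$1]++ { print $1 }'
}

# Example use with the status pipeline shown earlier:
#   ./appscale-tools/bin/appscale status | grep amazon | grep Status \
#     | awk '{print $5}' | cut -d ":" -f 1 \
#     | make_inventory > ansible-r-appscale-playbook/production
```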
Now the playbook can be executed. The playbook requires the SSH private key to the nodes. This key will be located under the ~/.appscale folder. In this example, the key file is named appscale-rmr.key. To execute the playbook, run the following command:
$ ansible-playbook -i ansible-r-appscale-playbook/production \
--private-key=~/.appscale/appscale-rmr.key -v ansible-r-appscale-playbook/site.yml
Testing Out The Deployment – Wordcount.R
Once the playbook has finished running, the AppScale cluster is now ready to be used. To test out the setup, SSH into the head node of the AppScale cluster. To find out the head node of the cluster, execute the following command:
$ ./appscale-tools/bin/appscale status
After discovering the head node, SSH into the head node using the private key located in the ~/.appscale directory:
$ ssh -i ~/.appscale/appscale-rmr.key root@ec2-50-17-96-162.compute-1.amazonaws.com
To test out the R setup on all the nodes, grab the wordcount.R program:
root@appscale-image0:~# tar zxf rmr2_2.0.2.tar.gz rmr2/tests/wordcount.R
In the wordcount.R file, the following lines are present:
rmr2:::hdfs.put("/etc/passwd", "/tmp/wordcount-test")
out.hadoop = from.dfs(wordcount("/tmp/wordcount-test", pattern = " +"))
When the wordcount.R program is executed, it will grab the /etc/passwd file from the head node, copy it to HDFS, then run wordcount on that copy using the pattern " +" (one or more spaces). NOTE: wordcount.R can be edited to use any file and pattern desired.
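For a sense of what the job computes, here is a local, single-machine approximation in shell. It is only a sketch: it assumes the pattern is used to split each line into words, it does not touch Hadoop or rmr2, and the function name local_wordcount is made up for this example.

```shell
# Hypothetical local stand-in for the MapReduce job: count how often each
# space-separated word appears in a file. Runs of spaces are treated as a
# single separator, similar to the " +" pattern.
local_wordcount() {
    tr -s ' ' '\n' < "$1" \
        | awk 'NF { count[$0]++ } END { for (w in count) print w, count[w] }' \
        | sort
}

# Example use on the same input file the test program copies to HDFS:
#   local_wordcount /etc/passwd
```

Comparing this output with the result held in out.hadoop is a quick sanity check, keeping in mind that empty tokens may be handled differently by the R code.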
Run wordcount.R:
root@appscale-image0:~# R
R version 2.15.3 (2013-03-01) -- "Security Blanket"
Copyright (C) 2013 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
> source('rmr2/tests/wordcount.R')
Loading required package: Rcpp
Loading required package: RJSONIO
Loading required package: digest
Loading required package: functional
Loading required package: stringr
Loading required package: plyr
13/04/05 02:33:41 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:33:43 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
packageJobJar: [/tmp/RtmprcYtsu/rmr-local-env19811a7afd54, /tmp/RtmprcYtsu/rmr-global-env1981646cf288, /tmp/RtmprcYtsu/rmr-streaming-map198150b6ff60, /tmp/RtmprcYtsu/rmr-streaming-reduce198177b3496f, /tmp/RtmprcYtsu/rmr-streaming-combine19813f7ea210, /var/appscale/hadoop/hadoop-unjar5632722635192578728/] [] /tmp/streamjob8198423737782283790.jar tmpDir=null
13/04/05 02:33:44 WARN snappy.LoadSnappy: Snappy native library is available
13/04/05 02:33:44 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/05 02:33:44 INFO snappy.LoadSnappy: Snappy native library loaded
13/04/05 02:33:44 INFO mapred.FileInputFormat: Total input paths to process : 1
13/04/05 02:33:44 INFO streaming.StreamJob: getLocalDirs(): [/var/appscale/hadoop/mapred/local]
13/04/05 02:33:44 INFO streaming.StreamJob: Running job: job_201304042111_0015
13/04/05 02:33:44 INFO streaming.StreamJob: To kill this job, run:
13/04/05 02:33:44 INFO streaming.StreamJob: /root/appscale/AppDB/hadoop-0.20.2-cdh3u3/bin/hadoop job -Dmapred.job.tracker=10.77.33.247:9001 -kill job_201304042111_0015
13/04/05 02:33:44 INFO streaming.StreamJob: Tracking URL: http://appscale-image0:50030/jobdetails.jsp?jobid=job_201304042111_0015
13/04/05 02:33:45 INFO streaming.StreamJob: map 0% reduce 0%
13/04/05 02:33:51 INFO streaming.StreamJob: map 50% reduce 0%
13/04/05 02:33:52 INFO streaming.StreamJob: map 100% reduce 0%
13/04/05 02:33:59 INFO streaming.StreamJob: map 100% reduce 33%
13/04/05 02:34:02 INFO streaming.StreamJob: map 100% reduce 100%
13/04/05 02:34:04 INFO streaming.StreamJob: Job complete: job_201304042111_0015
13/04/05 02:34:04 INFO streaming.StreamJob: Output: /tmp/RtmprcYtsu/file1981524ee1a3
13/04/05 02:34:05 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:07 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:08 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
13/04/05 02:34:10 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
Deleted hdfs://10.77.33.247:9000/tmp/wordcount-test
>quit("yes")
That's it! The AppScale cluster is ready for additional R programs that utilize MapReduce. Enjoy the world of Big Data on public/private IaaS.
What's new in Ansible 1.1 for AWS and Eucalyptus users?
Reblogged from Take that to the bank and cash it!:
I thought the Ansible 1.0 development cycle was busy but 1.1 is crammed full of orchestration goodness. On Tuesday, 1.1 was released and you can read more about it here: http://blog.ansibleworks.com/2013/04/02/ansible-1-1-released/
For those working on AWS and Eucalyptus, 1.1 brings some nice module improvements as well as a new cloudformation and s3 module. It's great to see the AWS-related modules becoming so popular so quickly.
Using Ansible to Deploy Neo4j HA Cluster on AWS/Eucalyptus
As a follow-up to my last Neo4j, AWS/Eucalyptus blog, this entry demonstrates another great example of AWS/Eucalyptus fidelity by using Ansible to deploy a Neo4j High Availability cluster.
Pre-requisites
In order to use this Ansible playbook on AWS/Eucalyptus, the following is needed:
- An AWS or Eucalyptus account, with a user’s access key and secret access key.
- EC2 IAM Policy to allow launching of instances, and authorize ports in security group
- Ubuntu Cloud Image (Precise 12.04)
- EC2 API Client Tools
- git repository tools
Before deploying the cluster, a security group needs to be created that the cluster will use. The security group must allow the following:
- port 22 (SSH)
- all instances that are part of the security group allowed to communicate with each other (ports 0 - 65535)
To create the security group and authorize the ports, make sure the user’s access key, secret access key, and EC2 URL are noted, and do the following:
- Create the security group
ec2-create-group --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> -g neo4j-cluster -d "Neo4j HA Cluster"
- Authorize port for SSH in neo4j-cluster security group
ec2-authorize --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> -P tcp -p 22 -s 0.0.0.0/0 neo4j-cluster
- Authorize all port communication between cluster members
ec2-authorize --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> -P tcp -o neo4j-cluster -p -1 neo4j-cluster
After completing these steps, use ec2-describe-group to view the security group:
ec2-describe-group --aws-access-key <EC2_ACCESS_KEY> --aws-secret-key <EC2_SECRET_KEY> --url <EC2_URL> neo4j-cluster
GROUP sg-1cbc5777 986451091583 neo4j-cluster Neo4j HA Cluster
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 0 65535 FROM USER 986451091583 NAME neo4j-cluster ID sg-1cbc5777 ingress
PERMISSION 986451091583 neo4j-cluster ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0 ingress
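Rather than reading the output by eye, the two required rules can be checked with a small filter. This is a sketch that assumes the PERMISSION lines look like the ones shown in this post; group_rules_ok is a hypothetical name, not part of the EC2 API tools.

```shell
# Hypothetical helper: read ec2-describe-group output on stdin and succeed
# only if it contains both an SSH rule from a CIDR and an all-ports TCP
# rule between members of the group.
group_rules_ok() {
    # Normalize tabs to single spaces so the patterns below match either way.
    rules=$(tr -s '\t' ' ')
    printf '%s\n' "$rules" | grep -q 'ALLOWS tcp 22 22 FROM CIDR' || return 1
    printf '%s\n' "$rules" | grep -q 'ALLOWS tcp 0 65535 FROM USER' || return 1
}

# Example use:
#   ec2-describe-group ... neo4j-cluster | group_rules_ok && echo "rules look right"
```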
Neo4j HA Cluster Deployment
Once the security group is created with the correct ports authorized, the cluster can be deployed. To deploy the cluster, do the following:
- Obtain Ansible from git and set up the environment by following the instructions here: http://ansible.cc/docs/gettingstarted.html#getting-ansible
- Obtain the Ansible Playbook for Neo4j HA Cluster using git
git clone https://github.com/hspencer77/ansible-neo4j-cluster.git
- Change directory into ansible-neo4j-cluster
cd ansible-neo4j-cluster
- Set up /etc/ansible/hosts with the following information:
[local]
127.0.0.1
- Populate vars/ec2-config with either Eucalyptus or AWS information. vars/ec2-config contains the following variables:
keypair: <EC2/Eucalyptus Keypair>
ec2_access_key: <EC2_ACCESS_KEY>
ec2_secret_key: <EC2_SECRET_KEY>
ec2_url: <EC2_URL>
instance_type: m1.small
security_group: <AWS/Eucalyptus Security Group>
image: <AMI/EMI>
- Execute the following command:
ansible-playbook neo4j-cluster.yml \
--private-key=<AWS/Eucalyptus Private Key file> --extra-vars "node_count=3"
- After the playbook finishes, a URL is provided to access the cluster, similar to the example below:
TASK: [Display HAProxy URL] *********************
changed: [23.22.248.75] => {"changed": true, "cmd": "echo \"HAProxy URL for Neo4j - http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/\" ", "delta": "0:00:00.006835", "end": "2013-03-30 19:54:31.104320", "rc": 0, "start": "2013-03-30 19:54:31.097485", "stderr": "", "stdout": "HAProxy URL for Neo4j - http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/"}
- To view the status of the cluster in the browser, open up http://ec2-23-22-248-75.compute-1.amazonaws.com/webadmin/#/info/org.neo4j/High%20Availability/.
- To get the status of the cluster, use curl:
curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' http://ec2-23-22-248-75.compute-1.amazonaws.com/db/manage/server/jmx/query
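The hostname for the curl call can be pulled out of the playbook output instead of being copied by hand. The sketch below assumes the output contains the "HAProxy URL for Neo4j" text in the form shown in this post; jmx_url_from_output is a hypothetical helper name.

```shell
# Hypothetical helper: read ansible-playbook output on stdin and print the
# JMX query endpoint for the host named in the "HAProxy URL for Neo4j" line.
jmx_url_from_output() {
    # Capture the hostname that follows "http://" and stop at the first
    # slash, quote, or backslash; then append the JMX query path.
    sed -n 's|.*HAProxy URL for Neo4j - http://\([^/"\\]*\)/.*|http://\1/db/manage/server/jmx/query|p' \
        | head -n 1
}

# Example use, with playbook.log as an assumed file holding saved output:
#   curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' \
#     "$(jmx_url_from_output < playbook.log)"
```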
That's it! A Neo4j HA cluster with an HAProxy server serving as an endpoint is available to be used. If a bigger cluster is desired, just change the node_count value. For additional information regarding this playbook, and how it handles the cluster membership, please refer to the following URL - https://github.com/hspencer77/ansible-neo4j-cluster/blob/master/README.md.
Hope you enjoy! As always, questions, comments, and suggestions are welcome.
Exclusive: Startup AnsibleWorks pitches open-source IT configuration, deployment tool
A couple of former Red Hat veterans think there's an easier way to configure, deploy and manage IT across an organization and founded AnsibleWorks to attack that problem.
Systems administrators and developers want one tool for deployment, configuration and management -- they don't want to deal with agents and add-ons, said Said Siouani, CEO of Santa Barbara, Calif.-based AnsibleWorks.
Deploying Eucalyptus via Ansible playbook(s)
Reblogged from Take that to the bank and cash it!:
The first cut of the Ansible deployment playbook for deploying Eucalyptus private clouds is ready. I've merged the first "release" into the master branch here: https://github.com/lwade/eucalyptus-playbook. Feedback and contributions are very welcome, please file issues against the project.
This playbook allows a user to deploy a single front-end cloud (i.e. all components on a single system) and as many NCs as they want.
Run Appscale on Eucalyptus
Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). - Wikipedia
According to Wikipedia, a few popular service models currently exist:
1. Infrastructure as a service (IaaS)
2. Platform as a service (PaaS)
3. Software as a service (SaaS)
So, I have a Eucalyptus cloud, which is great and serves as an AWS-like IaaS platform.
Using Scalr for Automation of your Eucalyptus Cloud
Reblogged from Testing Clouds at 128bpm:
Introduction
I have been using Eucalyptus heavily (as a quality engineer it is my day to day) for the past 1.5 years. I know the ins and outs of the system and am constantly tracking new features and bug fixes that arrive. This knowledge makes me a prime candidate to find out how other pieces of the cloud story can integrate with Eucalyptus.
Test Drive: Drupal Deployment on Eucalyptus using Stackato, Amazon Route 53 and the Eucalyptus Community Cloud
Recently, I did a blog discussing how to deploy a Jenkins server using Stackato, running on Eucalyptus. At the end of that blog, I mentioned how the Eucalyptus Community Cloud (ECC) could be used for testing out the Stackato Microcloud image on Eucalyptus. The previous blog – I felt – was more for DevOps administrators who had access to their own on-premise Eucalyptus clouds. The inspiration of this blog comes from the blog on ActiveBlog entitled “Deploy & Scale Drupal on Any Cloud with Stackato” to show love to Web Developers, and show the power of Amazon’s Route 53.
Test Drive Pre-Reqs
The prerequisites for this blog are the same that are mentioned in my previous blog regarding using Stackato on Eucalyptus (for the Eucalyptus pre-reqs, make sure the ECC is being used). In addition to the prerequisites mentioned above, the following is needed:
After the prerequisites have been met, it's time to set up the Drupal environment.
Test Drive Engage!
Since the ECC is being used, there is no need to worry about bundling, uploading and registering the Stackato image. The Stackato image used for this blog is as follows:
IMAGE emi-859B3D5C stackato_v2.6.6/stackato-cloudinit.manifest.xml 150820662310 available public x86_64 machine eki-6FBE3D2D eri-67463B77 instance-store
Next, let's make sure the user has an elastic IP that will be used in AWS Route 53, and a security group to allow proper network traffic to the instance. Do the following:
- Make sure the user credentials are sourced correctly, and euca2ools is installed correctly.
- Grab an elastic IP using euca-allocate-address (in this example 173.205.188.105 was allocated):
# euca-allocate-address
ADDRESS 173.205.188.105
- If the user doesn’t already have a keypair, create one by using euca-create-keypair, and make sure the permission of the file is 0600:
# euca-create-keypair hspencer-stackato > hspencer-stackato.priv
# chmod 0600 hspencer-stackato.priv
- Create a security group for the instance to use:
# euca-create-group stackato-test -d "Test Security Group for Stackato PaaS"
GROUP stackato-test Test Security Group for Stackato PaaS
- Authorize ping, ssh, http, and https ports:
# euca-authorize -P icmp -t -1:-1 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS icmp -1 -1 FROM CIDR 0.0.0.0/0
# euca-authorize -P tcp -p 22 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0
# euca-authorize -P tcp -p 80 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS tcp 80 80 FROM CIDR 0.0.0.0/0
# euca-authorize -P tcp -p 443 -s 0.0.0.0/0 stackato-test
GROUP stackato-test
PERMISSION stackato-test ALLOWS tcp 443 443 FROM CIDR 0.0.0.0/0
- Now, launch the instance, specifying the keypair name to use, and a VM type. On the ECC, only m1.xlarge and c1.xlarge meet the requirements of launching the Stackato image:
# euca-run-instances -k hspencer-stackato -t c1.xlarge emi-859B3D5C -g stackato-test
RESERVATION r-66EE4030 628376682871 stackato-test
INSTANCE i-E85843C4 emi-859B3D5C euca-0-0-0-0.eucalyptus.ecc.eucalyptus.com euca-0-0-0-0.eucalyptus.internal pending hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled 0.0.0.0 0.0.0.0 instance-store
- Once the instance gets to a running state, associate the elastic IP that the user owns to the instance:
# euca-describe-instances
RESERVATION r-66EE4030 628376682871 stackato-test
INSTANCE i-E85843C4 emi-859B3D5C euca-173-205-188-106.eucalyptus.ecc.eucalyptus.com euca-10-9-190-24.eucalyptus.internal running hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled 173.205.188.10 10.9.190.24 instance-store
# euca-associate-address -i i-E85843C4 173.205.188.105
ADDRESS 173.205.188.105 i-E85843C4
# euca-describe-instances
RESERVATION r-66EE4030 628376682871 stackato-test
INSTANCE i-E85843C4 emi-859B3D5C euca-173-205-188-105.eucalyptus.ecc.eucalyptus.com euca-10-9-190-24.eucalyptus.internal running hspencer-stackato 0 c1.xlarge 2013-02-24T19:40:35.516Z partner01 eki-6FBE3D2D eri-67463B77 monitoring-disabled 173.205.188.10 10.9.190.24 instance-store
- Log into the AWS management console, select Route 53, and set up the A and CNAME records in your domain as mentioned here under the Stackato Documentation regarding detailed DNS configuration. In this example, the DNS name associated with the elastic IP 173.205.188.105 is stackato-dev.mindspew-age.com.
- Next, SSH into the instance and follow the steps for setting up the Stackato instance that are mentioned in my previous blog under the section Configuration of the Stackato Instance. Make sure the DNS name set up in AWS Route 53 is used with the “kato rename public-DNS-name” and “kato setup core api.public-DNS-name” configuration steps.
- After the instance is configured, just open up the browser and go to the DNS name set up for the Stackato instance in AWS Route 53, as mentioned in the Stackato Documentation regarding configuration via the Management Console.
- Once logged into the Stackato Management Console, select “App Store” in the left-hand menu and select “Drupal” to install.
- After Drupal has installed, start the application. Once it has started successfully, select the URL that shows up in the right-hand menu box. The Drupal log-in page will appear in your browser.
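The step above that waits for the instance to reach a running state can be scripted. This is a minimal sketch that assumes the state is the sixth field of the INSTANCE line, as in the euca-describe-instances output shown in this post; instance_state is a name invented for this example and is not part of euca2ools.

```shell
# Hypothetical helper: read euca-describe-instances output on stdin and
# print the state (for example pending or running) of the given instance.
# Prints nothing if the instance ID is not found.
instance_state() {
    awk -v id="$1" '$1 == "INSTANCE" && $2 == id { print $6; exit }'
}

# Example use: poll every 10 seconds, then associate the elastic IP.
#   until [ "$(euca-describe-instances | instance_state i-E85843C4)" = "running" ]; do
#       sleep 10
#   done
#   euca-associate-address -i i-E85843C4 173.205.188.105
```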
That's it! Now Drupal is ready for any web developer to test out on the ECC. If there are any questions, comments, or suggestions, please feel free to leave them in the comments. Enjoy!

























