Adding Eucalyptus Load Balancer Access Logging for Eucalyptus Cloud Users

Preface

Eucalyptus continues to strive to be the best on-premises AWS-compatible Infrastructure as a Service (IaaS).  One of the great things about Eucalyptus being an open source platform is that if there is an AWS feature a cloud administrator or developer wants, they have the ability to add it.  This blog entry will cover how to give cloud users access to the Eucalyptus Load Balancer access logs – similar to how this is accomplished with the Amazon Web Services Elastic Load Balancer service.

Before we dive in, I would like to give special thanks to the members of the Eucalyptus Engineering Team whose hard work made this blog possible, and who continue to contribute to the Eucalyptus software.

Overview

Currently, when a cloud user launches a Eucalyptus Load Balancer, they will see something similar to the following:

# eulb-create-lb hasp-euca-elb --listener "lb-port=80, protocol=http, instance-port=8888, instance-protocol=http" --availability-zone Honest
# eulb-describe-lbs
LOAD_BALANCER hasp-euca-elb hasp-euca-elb-325271821652.eulb.future.euca-hasp.cs.prc.eucalyptus-systems.com 2014-12-11T23:34:35.397Z

Notice the DNS name of the load balancer.  It has the following format:

{load balancer name}-{Account ID}.{Load Balancer DNS Subdomain}.{Eucalyptus Cloud DNS Domain}

The “{load balancer name}-{Account ID}” string is the important part of this value.

From the cloud administrator’s perspective, the load balancer is an Auto Scaling group of worker instances running under the ‘eucalyptus’ account.  More information can be found in the Eucalyptus Load Balancer documentation.

If the cloud administrator describes the instances running under the ‘eucalyptus‘ account while the load balancer above is running, something like the following is displayed:

# euca-describe-instances 
RESERVATION r-278c161e 094999295155 euca-internal-325271821652-hasp-euca-elb
INSTANCE i-135b4b0a emi-7a4367b8 euca-10-104-7-21.future.future.euca-hasp.cs.prc.eucalyptus-systems.com euca-172-17-156-121.future.internal running euca-elb 0 c1.medium 2014-12-11T23:34:44.428Z Honest monitoring-enabled 10.104.7.21 172.17.156.121 instance-store hvm c4946e25-64ed-4453-808c-9ff2ab831b47_Honest_1 sg-da911c98 arn:aws:iam::094999295155:instance-profile/internal/loadbalancer/loadbalancer-vm-325271821652-hasp-euca-elb x86_64
TAG instance i-135b4b0a Name loadbalancer-resources
TAG instance i-135b4b0a aws:autoscaling:groupName asg-euca-internal-elb-325271821652-hasp-euca-elb
TAG instance i-135b4b0a euca:node 10.104.1.218

Notice the ‘RESERVATION’ line that contains the security group that the instance is using.  If the ‘euca-internal-‘ prefix is removed, the security group has the following format:

{Account ID}-{load balancer Name}

This information matches the load balancer launched by the cloud user and will be the basis for the solution.
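
As a quick illustration (this is not part of the solution’s script, just a minimal sketch under the assumptions above), the owning account ID and load balancer name can be recovered from inside the load balancer instance by reading the ‘security-groups’ entry from the instance metadata service and stripping the ‘euca-internal-’ prefix:

# sketch: recover the owning account ID and load balancer name from the
# instance metadata 'security-groups' category (Python 2, as shipped on the
# load balancer image)
import urllib2

METADATA_URL = 'http://169.254.169.254/latest/meta-data/security-groups'

def get_lb_identity():
    # e.g. 'euca-internal-325271821652-hasp-euca-elb'
    group = urllib2.urlopen(METADATA_URL).read().strip().splitlines()[0]
    stripped = group.replace('euca-internal-', '', 1)
    # remaining format: {Account ID}-{load balancer name}
    account_id, lb_name = stripped.split('-', 1)
    return account_id, lb_name

if __name__ == '__main__':
    print get_lb_identity()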

Building the Foundation

In order to get started, the solution needs to be applied from the cloud administrator’s perspective (i.e. the admin user in the ‘eucalyptus’ account).  This solution cannot be applied by any other type of cloud user.  In addition to the cloud administrator requirement, the following is needed:

Once these requirements are met, the environment is ready to go.

Create ELB Access Log User

A user (e.g. ‘elb-osg-logger’) needs to be created under the ‘eucalyptus’ account; its credentials will be used by the custom Python script to store the load balancer access logs in the OSG bucket.  To create the user, after sourcing the cloud administrator credentials, use euare-usercreate:

# euare-usercreate -u elb-osg-logger -k 
AKILXXXXXXXXXXXXXX
PS6nXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Store these credentials in a safe place.  Next, customize the load balancer instance.

Customize the Load Balancer

To begin, a Eucalyptus Load Balancer needs to be launched so that it can be modified.  The goal here is to build an image from this instance using euca-bundle-instance.  We will start with the load balancer mentioned earlier:

# euca-describe-instances 
RESERVATION r-5e1d4d17 094999295155 euca-internal-325271821652-hasp-euca-lb
INSTANCE i-315dd646 emi-7a4367b8 euca-10-104-7-9.future.future.euca-hasp.cs.prc.eucalyptus-systems.com euca-172-17-177-235.future.internal running euca-elb 0 c1.medium 2014-12-11T04:23:04.441Z Honest monitoring-enabled 10.104.7.9 172.17.177.235 instance-store hvm b134a0bc-cfc4-4c6e-84ba-4fd1df160407_Honest_1 sg-b6cc605e arn:aws:iam::094999295155:instance-profile/internal/loadbalancer/loadbalancer-vm-325271821652-hasp-euca-lb x86_64
TAG instance i-315dd646 Name loadbalancer-resources
TAG instance i-315dd646 aws:autoscaling:groupName asg-euca-internal-elb-325271821652-hasp-euca-lb
TAG instance i-315dd646 euca:node 10.104.1.218

To access the load balancer, authorize SSH to the instance:

# euca-authorize -P tcp -p 22 euca-internal-325271821652-hasp-euca-elb

Next, SSH into the ELB instance:

# ssh -i euca-elb.priv root@euca-10-104-7-9.future.future.euca-hasp.cs.prc.eucalyptus-systems.com

Once inside the instance, install the EPEL package repository:

# yum localinstall --nogpgcheck https://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm -y

After the package has been installed, use yum to install the python-pip package:

# yum install python-pip -y

Next, use pip to upgrade and install the ‘boto‘ and ‘argparse‘ modules:

# pip install --upgrade boto argparse

Now it’s time to add the custom Python script.

Add Access Logs Script

The Access Log Script performs the following actions (a rough sketch of this logic is shown after the list):

  • Creates a bucket with a READ bucket ACL for the Account ID that launched the Eucalyptus Load Balancer
    • the bucket is created with the following naming format – s3://access_logs-{LB name}_{public IPv4 of LB}_{LB instance numeric ID}
  • Places a copy of “/var/log/load-balancer-access.log.1” in the bucket with a READ object ACL for the Account ID that owns the Eucalyptus Load Balancer
    • the file in the bucket has the following naming format – elb-access-{timestamp DDMMYYYY-HourMinSec}.log
  • Bonus – since Eucalyptus 4.0.0, the OSG has supported object lifecycle management.  If a lifecycle value is passed and it is greater than 0, the object lifecycle is applied to all objects in the bucket.
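
Below is a minimal, hedged sketch of the core logic such a script might implement using boto (the actual access-log-transfer-s3.py downloaded in the next step is the authoritative version; the helper name upload_access_log and its parameters are hypothetical).  The s3 argument is a boto S3Connection pointed at the OSG, as shown in the configuration snippet further below, and the account ID and load balancer name are recovered from the instance metadata as sketched earlier:

import time
from boto.s3.key import Key
from boto.s3.lifecycle import Lifecycle

LOG_FILE = '/var/log/load-balancer-access.log.1'

def upload_access_log(s3, account_id, lb_name, public_ip, instance_num_id, lifecycle_days=0):
    # bucket name format: access_logs-{LB name}_{public IPv4 of LB}_{LB instance numeric ID}
    bucket_name = 'access_logs-%s_%s_%s' % (lb_name, public_ip, instance_num_id)
    bucket = s3.lookup(bucket_name) or s3.create_bucket(bucket_name)
    # READ bucket ACL for the account that owns the load balancer
    # (on Eucalyptus the account ID is used as the grantee ID)
    bucket.add_user_grant('READ', account_id)
    # bonus: expire all objects in the bucket after N days
    if lifecycle_days > 0:
        lifecycle = Lifecycle()
        lifecycle.add_rule('expire-access-logs', prefix='', status='Enabled',
                           expiration=lifecycle_days)
        bucket.configure_lifecycle(lifecycle)
    # object name format: elb-access-{DDMMYYYY-HourMinSec}.log
    key = Key(bucket)
    key.key = 'elb-access-%s.log' % time.strftime('%d%m%Y-%H%M%S')
    key.set_contents_from_filename(LOG_FILE)
    # READ object ACL for the same account
    key.add_user_grant('READ', account_id)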

To add the script to the instance, use curl:

# curl http://euca-elb-access-log-blog.s3.amazonaws.com/access-log-transfer-s3.py -o access-log-transfer-s3.py

Once the script has been downloaded, edit the script and add the ‘elb-osg-logger’ user credentials, the S3_URL, and the EC2_URL in the following locations:

EC2Connection.DefaultRegionEndpoint = '<EC2_URL - Eucalyptus Cloud Compute API DNS Name>'
ec2conn = EC2Connection(aws_access_key_id="<elb-osg-logger user Access Key ID>",
                        aws_secret_access_key="<elb-osg-logger user Secret Access Key>",
                        is_secure=False, port="8773")
s3 = S3Connection(aws_access_key_id="<elb-osg-logger user Access Key ID>",
                  aws_secret_access_key="<elb-osg-logger user Secret Access Key>",
                  host="<S3_URL - Eucalyptus Cloud OSG API DNS Name>",
                  is_secure=False, port=8773, calling_format=OrdinaryCallingFormat())

Set the script to be executable using chmod:

# chmod a+x /root/access-log-transfer-s3.py

Now it’s time to configure HAProxy to log information.

Enable HAProxy Logging

The Eucalyptus Load Balancer uses HAProxy to perform load balancing.  To enable logging, the following files need to be edited (a sketch of the resulting haproxy_template.conf sections is shown further below):

  • /etc/load-balancer-servo/haproxy_template.conf
    • under the ‘global’ section add – log 127.0.0.1 local3 info
    • under the ‘defaults’ section add – log global
  • /usr/lib/python2.6/site-packages/servo/haproxy/haproxy_conf.py
    • change the following section:
if protocol == 'http' or protocol == 'https':
    self.__content_map[section_name].append('log-format httplog\ %f\ %b\ %s\ %ST\ %ts\ %Tq\ %Tw\ %Tc\ %Tr\ %Tt')
elif protocol == 'tcp' or protocol == 'ssl':
    self.__content_map[section_name].append('log-format tcplog\ %f\ %b\ %s\ %ts\ %Tw\ %Tc\ %Tt')

to

if protocol == 'http' or protocol == 'https':
    self.__content_map[section_name].append('log-format httplog\ %f\ %b\ %s\ %ST\ %ts\ %Tq\ %Tw\ %Tc\ %Tr\ %Tt\ %{+Q}r\ %ci:%cp\ %fi:%fp\ %si:%sp\ req_size=%U\ resp_size=%B')
elif protocol == 'tcp' or protocol == 'ssl':
    self.__content_map[section_name].append('log-format tcplog\ %f\ %b\ %s\ %ts\ %Tw\ %Tc\ %Tt\ %{+Q}r\ %ci:%cp\ %fi:%fp\ %si:%sp\ req_size=%U\ resp_size=%B')

For more information about the log-format in HAProxy, reference the HAProxy documentation on log format. The information that can be logged is highly customizable.  Reference the AWS ELB documentation regarding Access Log Entries to get a better sense of the logging experience on AWS.
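
For reference, and assuming the stock template layout, the relevant portions of /etc/load-balancer-servo/haproxy_template.conf would end up looking roughly like the following after the edits above (only the two ‘log’ lines are additions; the rest is placeholder):

global
    # ... existing settings from the stock template ...
    log 127.0.0.1 local3 info

defaults
    # ... existing settings from the stock template ...
    log global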

Logging for HAProxy is complete.  Next, rsyslog and logrotate need to be configured.

Log Management

Storing the HAProxy logs and rotating them is very important to this solution.  The script takes the rotated log and stores it in the OSG bucket for the access logs.  The purpose of this is to make sure the file is not being written to while it is being sent to the OSG bucket.  To start, download the load-balancer.conf file to use with logrotate using curl:

# curl http://euca-elb-access-log-blog.s3.amazonaws.com/load-balancer.conf -o load-balancer.conf

This is the logrotate configuration file that the cronjob script will call to rotate the log file and then execute the access-log-transfer-s3.py script with a 1-day object lifecycle.  To change the lifecycle, just change the value of the --lifecycle option in the load-balancer.conf file.  A sketch of what such a configuration might look like follows.
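
The downloaded load-balancer.conf is the authoritative version; purely as an illustration, a logrotate configuration that rotates the access log and then hands the rotated file to the transfer script might look roughly like this (only the log path, the script path, and the --lifecycle option come from this walkthrough; the remaining directives are assumptions):

/var/log/load-balancer-access.log {
    missingok
    notifempty
    rotate 1
    postrotate
        /root/access-log-transfer-s3.py --lifecycle 1
    endscript
}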

Next, update rsyslog to make sure the latest is running on the instance:

# yum upgrade rsyslog -y

After this has completed, add the following to the /etc/rsyslog.d/load-balancer.conf file:

local3.*       /var/log/load-balancer-access.log

Follow this up by uncommenting (or adding) the following lines in /etc/rsyslog.conf:

$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514

To wrap up, we need to add a script that will be kicked off by the cronjob.

Cronjob Script

To kick off the log rotation, add the ‘elb-logrotate‘ script to the instance using curl:

# curl http://euca-elb-access-log-blog.s3.amazonaws.com/elb-logrotate -o elb-logrotate

Using ‘crontab -e’, set up a cron entry for every 5 minutes (or however often the access log information should be uploaded to the bucket):

*/5 * * * * /root/elb-logrotate
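
The downloaded elb-logrotate script is the authoritative version; conceptually, all it needs to do is force a rotation using the load-balancer.conf configuration downloaded earlier, along these lines (the state file path is an assumption):

#!/bin/bash
# hypothetical sketch: force a rotation so the postrotate step uploads the
# rotated access log to the OSG bucket
/usr/sbin/logrotate -f -s /root/load-balancer.status /root/load-balancer.conf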

Clean Up

After completing all the customizations, the instance needs to be prepared for bundling.  Run the following commands to prepare the instance:

# echo "" > /etc/udev/rules.d/70-persistent-net.rules
# echo "" > /lib/udev/rules.d/75-persistent-net-generator.rules

If PERSISTENT_DHCLIENT is not already set in the /etc/sysconfig/network-scripts/ifcfg-eth0 file, then add it:

# grep PERSISTENT_DHCLIENT /etc/sysconfig/network-scripts/ifcfg-eth0
# echo "PERSISTENT_DHCLIENT=yes" >> /etc/sysconfig/network-scripts/ifcfg-eth0

Now we can exit the instance.

Creating the New Eucalyptus Load Balancer EMI

After finishing with the instance customizations, the instance is ready to be bundled and registered.  First, use euca-bundle-instance to bundle and upload the instance.  Use euca-describe-bundle-tasks to check on the status of the bundling operation.  Once the bundling operation has been completed, use euca-register to register the new ELB EMI:

# euca-bundle-instance -b load-balancer-access-logs -p eucalyptus-load-balancer-image-access-log i-315dd646
BUNDLE bun-315dd646 i-315dd646 load-balancer-access-logs eucalyptus-load-balancer-image-access-log 2014-12-11T04:07:59.835Z 2014-12-11T04:07:59.835Z pending 0 load-balancer-access-logs/eucalyptus-load-balancer-image-access-log.manifest.xml
....
# euca-describe-bundle-tasks
BUNDLE bun-315dd646 i-315dd646 load-balancer-access-logs eucalyptus-load-balancer-image-access-log 2014-12-11T04:07:59.835Z 2014-12-11T04:09:57.671Z complete 0 load-balancer-access-logs/eucalyptus-load-balancer-image-access-log.manifest.xml
# euca-register -a x86_64 -n load-balancer-access-logs load-balancer-access-logs/eucalyptus-load-balancer-image-access-log.manifest.xml --virtualization-type hvm
IMAGE emi-7a4367b8

Now that the new Eucalyptus Load Balancer EMI is registered, update the cloud property ‘loadbalancing.loadbalancer_emi‘ to point to the new ELB EMI:

# euca-modify-property -p loadbalancing.loadbalancer_emi=emi-7a4367b8
PROPERTY loadbalancing.loadbalancer_emi emi-7a4367b8 was emi-cf4fb988

Now, let’s test out the changes.

Testing Out the ELB with Access Logging

To test it out, you can use either the cloud administrator or a user from a ‘non-eucalyptus’ account.  In the example below, a user from a ‘non-eucalyptus’ account was used.  If a ‘non-eucalyptus’ account user is used, make sure the user has the appropriate IAM access policies for EC2 (Compute), S3 (OSG), and ELB (Eucalyptus Load Balancer).

First, create the Eucalyptus Load Balancer:

# eulb-create-lb hasp-euca-lb --listener "lb-port=80, protocol=http, instance-port=80, instance-protocol=http" --availability-zone Honest --region account2-user11@

Next, launch an instance that has a web service running on port 80.  In this example, I used a cloud-init configuration file to install nginx on an Ubuntu 14.04 (Trusty Tahr) Cloud Image:

# euca-run-instances -k account2-user11 -t m1.medium emi-59a742d0 --user-data-file nginx-cloudinit.config --region account2-user11@
....
# euca-describe-instances --region account2-user11@
RESERVATION r-5c16c716 325271821652 default
INSTANCE i-45c1ebd1 emi-59a742d0 euca-10-104-7-29.future.future.euca-hasp.cs.prc.eucalyptus-systems.com euca-172-17-248-189.future.internal running account2-user11 0 m1.medium 2014-12-05T21:53:51.197Z Honest monitoring-disabled 10.104.7.29 172.17.248.189 instance-store hvm sg-6ef9907f x86_64

Register the instance with the ELB:

# eulb-register-instances-with-lb --instances i-45c1ebd1 hasp-euca-lb --region account2-user11@
INSTANCE i-45c1ebd1

Generate some traffic to the ELB using curl or some other tool to populate the HAProxy log file.  Then, based upon how often the cronjob was set to execute, use s3cmd to see the bucket created in the ‘eucalyptus’ account (i.e. cloud administrator) for the access logs.  For information regarding s3cmd configuration files, refer to my previous blog on the topic:

# ./s3cmd/s3cmd --config=.s3cfg-cloud-admin ls
2014-09-18 02:59 s3://51c700-download-manifests
2014-12-11 04:31 s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646
2014-09-18 02:43 s3://centos-6.5-x86_64-20140917
2014-09-18 02:46 s3://centos-7-x86_64-20140917
2014-11-05 22:05 s3://centos6.4-kernel
2014-11-05 21:54 s3://centos6.4-ramdisk
2014-11-05 22:08 s3://centos6.4-test
2014-09-18 02:52 s3://debian-7-x86_64-20140917
# ./s3cmd/s3cmd --config=.s3cfg-cloud-admin ls s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646
2014-12-11 04:31 817 s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-043122.log
2014-12-11 05:13 78764 s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-051353.log
2014-12-11 05:20 58202 s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-052002.log

Once that has been confirmed, create another s3cmd configuration file for the ‘non-eucalyptus’ user, and confirm the user can list the contents of the bucket:

# ./s3cmd/s3cmd --config=.s3cfg-acct2-user11 ls s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646
2014-12-11 04:31 817 s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-043122.log
2014-12-11 05:13 78764 s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-051353.log
2014-12-11 05:20 58202 s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-052002.log

After that has been confirmed, download one of the log files and confirm the contents:

# ./s3cmd/s3cmd --config=.s3cfg-acct2-user11 get s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-051353.log .
s3://access_logs-hasp-euca-lb_10.104.7.9_315dd646/elb-access-11122014-051353.log -> ./elb-access-11122014-051353.log [1 of 1]
 78764 of 78764 100% in 0s 238.84 kB/s done
 
# cat elb-access-11122014-051353.log
Dec 11 04:32:11 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 0 0 0 1 1 "HEAD / HTTP/1.1" 10.5.1.70:49960 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
Dec 11 05:05:35 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 0 0 0 1 1 "HEAD / HTTP/1.1" 10.5.1.70:50216 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
Dec 11 05:05:36 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 4 0 1 1 6 "HEAD / HTTP/1.1" 10.5.1.70:50217 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
Dec 11 05:05:38 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 5 0 0 1 6 "HEAD / HTTP/1.1" 10.5.1.70:50218 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
Dec 11 05:05:39 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 0 0 0 1 1 "HEAD / HTTP/1.1" 10.5.1.70:50219 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
Dec 11 05:05:40 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 0 0 0 1 1 "HEAD / HTTP/1.1" 10.5.1.70:50220 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
Dec 11 05:05:41 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 0 0 0 1 1 "HEAD / HTTP/1.1" 10.5.1.70:50221 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
Dec 11 05:05:42 localhost haproxy[1070]: httplog http-80 backend-http-80 http-80 200 -- 4 0 1 1 6 "HEAD / HTTP/1.1" 10.5.1.70:50222 172.17.177.235:80 10.104.7.29:8888 req_size=142 resp_size=241
.....

How is this ‘non-eucalyptus’ user able to see and download the contents of this bucket?  It is because of the script that creates the access log bucket and uploads the logs to it.  By grabbing the account ID from the instance metadata ‘security-groups’ category, the script adds bucket and object READ ACLs for that account ID.  The only issue here is that the cloud administrator will still need to tell the cloud user which bucket holds the logs.  With the extra bonus of using the object lifecycle, the cloud administrator doesn’t have to worry about managing the buckets; the objects will remove themselves after the defined period of time.

Conclusion

Even though this solution isn’t exactly like the AWS ELB Access Logs feature, it does provide something very similar.  The only thing missing is the service API interaction to enable/disable the access logging feature, set the interval, and define the bucket that will be used.  Hopefully, this is a feature we will see in the not-too-distant future.  Thanks for hanging in there with me.  I hope you enjoy!  Feedback is always welcome.

Cheers!

Cloud Image Management on Eucalyptus: Creating a CentOS 6.6 EMI With ZFS Support

ZFS is a filesystem designed by Sun Microsystems that focuses on data integrity.  What makes this such an attractive filesystem to use in the cloud is that a cloud user can easily do the following:

  • set up an LVM + RAID filesystem for storing large amounts of data (e.g. database information)
  • expand the filesystem by adding more storage (i.e. EBS volumes)
  • backup the filesystem without taking the filesystem offline/unmounting
  • restore the filesystem

This blog entry will focus on how a cloud user can create their own Eucalyptus Machine Image (EMI) that has ZFS support.  The CentOS 6.5 EMI on the Eucalyptus Machine Image Catalog will be used as the base image.

Before Starting…

Before following the steps in this blog, make sure the following is in place:

Once these requirements have been met, everything should be ready to go.

Set Up Base Image/Instance

To begin, follow the ‘Quick Start’ instructions mentioned on the Eucalyptus Machine Image Catalog page.  This will install all the images provided by the catalog.  When the process has finished, list the CentOS 6.5 EMI.  For example:

# euca-describe-images emi-bdcec010 
IMAGE emi-bdcec010 centos-6.5-x86_64-20140917/centos.raw.manifest.xml 094999295155 available public x86_64 machine instance-store hvm

Once the CentOS 6.5 EMI has been listed, launch an instance from the EMI.  For example:

# euca-run-instances -k account2-user11 -t m1.medium emi-bdcec010 
RESERVATION r-a22f0201 325271821652 default
INSTANCE i-b9fccf9f emi-bdcec010 pending account2-user11 0 m1.medium 2014-12-03T22:52:41.522Z Honest monitoring-disabled 0.0.0.0 0.0.0.0 instance-store hvm sg-6ef9907f x86_64
# euca-describe-instances i-b9fccf9f
RESERVATION r-a22f0201 325271821652 default
INSTANCE i-b9fccf9f emi-bdcec010 euca-10-104-7-15.future.future.euca-hasp.cs.prc.eucalyptus-systems.com euca-172-17-248-178.future.internal running account2-user11 0 m1.medium 2014-12-03T22:52:41.522Z Honest monitoring-disabled 10.104.7.15 172.17.248.178 instance-store hvm sg-6ef9907f x86_64

Once the instance is running, it’s ready to be customized.

Adding ZFS Support to the Instance

Now that the instance is running, SSH into the instance so the EPEL and ZFS repositories can be added and the needed packages installed:

[root@odc-f-13 ~]# ssh -i account2-user11.priv root@euca-10-104-7-15.future.future.euca-hasp.cs.prc.eucalyptus-systems.com
[root@euca-172-17-248-178 ~]# yum localinstall --nogpgcheck https://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@euca-172-17-248-178 ~]# yum localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el6.noarch.rpm
[root@euca-172-17-248-178 ~]# yum upgrade -y
[root@euca-172-17-248-178 ~]# yum install kernel-devel zfs -y

After all the packages have been installed, reboot the instance:

[root@euca-172-17-248-178 ~]# reboot

Preparing the Instance For EMI Creation

After the reboot, SSH back into the instance and prepare it for EMI creation.  First, load the zfs module:

[root@odc-f-13 ~]# ssh -i account2-user11.priv root@euca-10-104-7-15.future.future.euca-hasp.cs.prc.eucalyptus-systems.com
[root@euca-172-17-248-178 ~]# modprobe zfs
[root@euca-172-17-248-178 ~]# lsmod | grep zfs
zfs 1195522 0
zcommon 46278 1 zfs
znvpair 80974 2 zfs,zcommon
zavl 6925 1 zfs
zunicode 323159 1 zfs
spl 266655 5 zfs,zcommon,znvpair,zavl,zunicode

After confirming that the ZFS module is loaded, clear the network udev rules, and confirm PERSISTENT_DHCLIENT is set to “yes” in the /etc/sysconfig/network-scripts/ifcfg-eth0 file:

[root@euca-172-17-248-178 ~]# echo "" > /etc/udev/rules.d/70-persistent-net.rules
[root@euca-172-17-248-178 ~]# echo "" > /lib/udev/rules.d/75-persistent-net-generator.rules
[root@euca-172-17-248-178 ~]# echo "PERSISTENT_DHCLIENT=yes" >> /etc/sysconfig/network-scripts/ifcfg-eth0

Confirm that the instance has been upgraded to CentOS 6.6, then exit the instance:

[root@euca-172-17-248-178 ~]# cat /etc/redhat-release
CentOS release 6.6 (Final)
[root@euca-172-17-248-178 ~]# exit

Create the CentOS 6.6 EMI with ZFS Support

The instance is now ready to be bundled.  Bundle the instance using the euca-bundle-instance command.  This command is used to bundle Windows instances; however, Eucalyptus extended it to work with Linux instances as well.  Use euca-describe-bundle-tasks to monitor the bundling status:

[root@odc-f-13 ~]# euca-bundle-instance --bucket centos6.6-zfs --prefix centos6.6-zfs i-b9fccf9f
BUNDLE bun-b9fccf9f i-b9fccf9f centos6.6-zfs centos6.6-zfs 2014-12-03T23:54:51.644Z 2014-12-03T23:54:51.644Z pending 0 centos6.6-zfs/centos6.6-zfs.manifest.xml
..
[root@odc-f-13 ~]# euca-describe-bundle-tasks
BUNDLE bun-b9fccf9f i-b9fccf9f centos6.6-zfs centos6.6-zfs 2014-12-03T23:54:51.644Z 2014-12-03T23:57:37.517Z complete 0 centos6.6-zfs/centos6.6-zfs.manifest.xml

Once the bundle task completes, register the instance store-backed HVM image using the euca-register command:

[root@odc-f-13 ~]# euca-register -a x86_64 -n centos6.6-zfs centos6.6-zfs/centos6.6-zfs.manifest.xml --virtualization-type hvm 
IMAGE emi-5e63f02c

The custom image has been registered.  Now let’s test it out.

ZFS Test

To test the image out, we will do the following:

  • Launch an instance from the new EMI
  • Create 5 volumes and attach them to the instance
  • Create a ZFS storage pool and dataset

To launch the instance, use the euca-run-instances command.  To create the 5 EBS volumes, use the euca-create-volume command.  After the volumes are created, use euca-attach-volume to attach the volumes to the instance.  Once the volumes are attached, the output of euca-describe-instances should look similar to the following:

# euca-describe-instances i-0cd3b6b8
RESERVATION r-cf7c5c73 325271821652 default
INSTANCE i-0cd3b6b8 emi-5e63f02c euca-10-104-7-3.future.future.euca-hasp.cs.prc.eucalyptus-systems.com euca-172-17-248-184.future.internal running account2-user11 0 m1.medium 2014-12-04T00:16:52.887Z Honest monitoring-disabled 10.104.7.3 172.17.248.184 instance-store hvm sg-6ef9907f x86_64
BLOCKDEVICE /dev/sdd vol-a23cfb1f 2014-12-04T01:45:59.730Z false
BLOCKDEVICE /dev/sdh vol-a27b75a5 2014-12-04T01:47:31.162Z false
BLOCKDEVICE /dev/sdf vol-2a971204 2014-12-04T01:46:54.575Z false
BLOCKDEVICE /dev/sdg vol-b33e9890 2014-12-04T01:47:13.346Z false
BLOCKDEVICE /dev/sde vol-dcc8b6ac 2014-12-04T01:46:15.011Z false

SSH into the instance and check what block devices are associated with the EBS volumes using the lsblk command:

# ssh -i account2-user11.priv root@euca-10-104-7-3.future.future.euca-hasp.cs.prc.eucalyptus-systems.com
[root@euca-172-17-248-184 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 4.9G 0 disk
├─vda1 252:1 0 500M 0 part /boot
└─vda2 252:2 0 4.4G 0 part
 ├─VolGroup-lv_root (dm-0) 253:0 0 3.9G 0 lvm /
 └─VolGroup-lv_swap (dm-1) 253:1 0 500M 0 lvm [SWAP]
vdb 252:16 0 5.1G 0 disk
vdc 252:32 0 5G 0 disk
vdd 252:48 0 5G 0 disk
vde 252:64 0 5G 0 disk
vdf 252:80 0 5G 0 disk
vdg 252:96 0 5G 0 disk

The EBS volumes are /dev/vdc, /dev/vdd, /dev/vde, /dev/vdf, and /dev/vdg.  Use these devices to create the ZFS storage pool by using the zpool command:

[root@euca-172-17-248-184 ~]# zpool create -f app-pool vdc vdd vde vdf vdg
[root@euca-172-17-248-184 ~]# zpool status
 pool: app-pool
 state: ONLINE
 scan: none requested
config:
 NAME STATE READ WRITE CKSUM
 app-pool ONLINE 0 0 0
 vdc1 ONLINE 0 0 0
 vdd1 ONLINE 0 0 0
 vde1 ONLINE 0 0 0
 vdf1 ONLINE 0 0 0
 vdg1 ONLINE 0 0 0
errors: No known data errors

Next, we need to create a ZFS dataset.  For this example, this instance will end up being a MySQL server, so we will create a dataset for storing the MySQL data.

[root@euca-172-17-248-184 ~]# zfs create app-pool/mysql
[root@euca-172-17-248-184 ~]# zfs list
NAME USED AVAIL REFER MOUNTPOINT
app-pool 152K 24.5G 30K /app-pool
app-pool/mysql 30K 24.5G 30K /app-pool/mysql

The mount point of the dataset can be adjusted by setting the mountpoint option:

[root@euca-172-17-248-184 ~]# zfs set mountpoint=/opt/mysql app-pool/mysql
[root@euca-172-17-248-184 ~]# zfs list
NAME USED AVAIL REFER MOUNTPOINT
app-pool 162K 24.5G 31K /app-pool
app-pool/mysql 30K 24.5G 30K /opt/mysql

That’s it!  Notice how this only required 2 commands to set up the equivalent of an LVM + RAID filesystem, compared to around 7 commands using mdadm, pvcreate, vgcreate, mkfs, mkdir, and mount (see the sketch below).  The instance is now ready to utilize the ZFS filesystem for the MySQL server.
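
Purely for comparison (a sketch, not something run in this walkthrough), building a similar striped volume with the traditional tools named above, plus lvcreate, might look roughly like this:

mdadm --create /dev/md0 --level=0 --raid-devices=5 /dev/vdc /dev/vdd /dev/vde /dev/vdf /dev/vdg
pvcreate /dev/md0
vgcreate app-vg /dev/md0
lvcreate -l 100%FREE -n mysql app-vg
mkfs.ext4 /dev/app-vg/mysql
mkdir -p /opt/mysql
mount /dev/app-vg/mysql /opt/mysql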

Online Backup Example to OSG Bucket using s3cmd

As mentioned earlier, a slick feature of using ZFS is being able to perform backups online.  This section will show the following:

  • Set up and configure s3cmd
  • Create a ZFS snapshot, and use ZFS send with s3cmd to place the snapshot on an OSG bucket

To get started, in the instance, install the following packages:

[root@euca-172-17-248-184 ~]# yum install -y git python-dateutil.noarch xz

Next, clone the s3tools/s3cmd repository from Github:

[root@euca-172-17-248-184 ~]# git clone https://github.com/s3tools/s3cmd.git

If the instance was launched with an instance profile that assumes a role with OSG (S3) API access, s3cmd will pick up the temporary credentials and token through the Eucalyptus instance metadata service, as if the instance was launched on AWS EC2.  This wasn’t the case here, so we need to provide the Access Key ID and Secret Key manually:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: AKIRAGCHAGFE6IIX9BYF
Secret Key: GMdrL97AqcybhfyyxOpNmVUnBtiMenag3ju82L7L

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
 Access Key: AKIRAGCHAGFE6IIX9BYF
 Secret Key: GMdrL97AqcybhfyyxOpNmVUnBtiMenag3ju82L7L
 Encryption password:
 Path to GPG program: /usr/bin/gpg
 Use HTTPS protocol: False
 HTTP Proxy server name:
 HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] n
Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'

Edit the .s3cfg file to make sure it points to the OSG on your Eucalyptus 4.0.2 cloud.  For example, change the following:

host_base = s3.amazonaws.com

to

host_base = objectstorage.future.euca-hasp.cs.prc.eucalyptus-systems.com:8773

and

host_bucket = %(bucket)s.s3.amazonaws.com

to

host_bucket = %(bucket)s.objectstorage.future.euca-hasp.cs.prc.eucalyptus-systems.com:8773

Confirm that s3cmd is configured correctly.  For example:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd ls
2014-11-05 21:45 s3://centos-images
2014-12-03 23:54 s3://centos6.6-zfs
2014-10-08 01:50 s3://instance-profile-testing
2014-12-01 22:27 s3://mongodb-snapshots
2014-10-10 20:01 s3://new-ubuntu-bundled-image
2014-09-17 18:31 s3://s3cmd-testing
2014-09-30 01:58 s3://ubuntu-bundled-vol
2014-10-22 14:47 s3://ubuntu-docker-template
2014-10-08 13:39 s3://ubuntu-images
2014-10-02 01:42 s3://ubuntu-trusty-imported-20141001
2014-10-30 18:25 s3://ubuntu-trusty-imported-20141030
2014-10-29 02:18 s3://ubuntu-trusty-server-10282014
2014-10-01 00:28 s3://wrong-s3-url-test

To perform a ZFS snapshot of the app-pool/mysql dataset, do the following:

[root@euca-172-17-248-184 ~]# zfs snapshot app-pool/mysql@wednesday
[root@euca-172-17-248-184 ~]# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
app-pool/mysql@wednesday 0 - 30K -

After creating a bucket for the backup, send the ZFS snapshot to the bucket:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd mb s3://mysql-backups
[root@euca-172-17-248-184 ~]# zfs send app-pool/mysql@wednesday | xz | ./s3cmd/s3cmd put - s3://mysql-backups/mysql-backup-wednesday.img.xz
<stdin> -> s3://mysql-backups/mysql-backup-wednesday.img.xz [part 1, 1440B]
 1440 of 1440 100% in 2s 561.67 B/s done

To confirm that the snapshot is located in the bucket, use s3cmd:

[root@euca-172-17-248-184 ~]# ./s3cmd/s3cmd ls s3://mysql-backups
2014-12-04 02:22 1440 s3://mysql-backups/mysql-backup-wednesday.img.xz
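
Although restoring isn’t covered in this walkthrough, a rough sketch of pulling the backup back down and receiving it into a new dataset (the dataset name here is hypothetical) would look something like this:

./s3cmd/s3cmd get s3://mysql-backups/mysql-backup-wednesday.img.xz
xz -dc mysql-backup-wednesday.img.xz | zfs receive app-pool/mysql-restore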

That’s all, folks!  We have successfully created a CentOS 6.6 EMI with ZFS support.  For more information regarding ZFS (and the inspirations for this blog), check out the ZFS on Linux project documentation and other community resources.
