Sunday, December 16, 2012

Introducing cfndsl

Last posting I ranted a little about what I like and don't like about [AWS CloudFormation](http://aws.amazon.com/cloudformation/). This time, I am going to do something about it.

AWS CloudFormation Templates

If you are using AWS for anything substantial and you are not using CloudFormation, you should think about it. It gives you a place to launch a whole bunch of AWS resources in a well defined and repeatable fashion. In my mind, there is really only one drawback: the template language is awful.

What do I mean?

Here is a template I have been playing with this afternoon:

{
   "AWSTemplateFormatVersion" : "2010-09-09"
   "Parameters" : {
      "BucketName" : {
         "Type" : "String",
         "Default" : "MyBucket",
         "Description" : "Name of the bucket to grant read access to."
      },
      "Folder" : {
         "Type" : "String",
         "Default" : "myFolder",
         "MinLength" : 2,
         "Description" : "Name of a folder in the bucket to grant read access to."
      }
   },
   "Resources" : {
      "ReadBucketIProfile" : {
         "Type" : "AWS::IAM::InstanceProfile",
         "Properties" : {
            "Roles" : [{"Ref" : "ReadBucketRole"}],
            "Path" : "/"
         }
      },
      "ReadBucketRole" : {
         "Type" : "AWS::IAM::Role",
         "Properties" : {
            "AssumeRolePolicyDocument" : {
               "Statement" : [
                  {
                     "Effect" : "Allow",
                     "Action" : ["sts:AssumeRole"                     ],
                     "Principal" : { "Service" : ["ec2.amazonaws.com"]
                     }
                  }
               ]
            },
            "Policies" : [
               {
                  "PolicyDocument" : {
                     "Statement" : [
                        {
                           "Effect" : "Allow",
                           "Resource" : {
                              "Fn::Join" : [
                                 "",
                                 [
                                    "arn:aws:s3:::",
                                    {
                                       "Ref" : "BucketName"
                                    },
                                    "/",
                                    {
                                       "Ref" : "Folder"
                                    },
                                    "/*"
                                 ]
                              ]
                           },
                           "Action" : [
                              "s3:GetObject",
                              "s3:GetObjectVersion"
                           ]
                        }
                     ]
                  },
                  "PolicyName" : "readBucket"
               }
            ],
            "Path" : "/"
         }
      }
   }
}

As you can see, the language is ghastly. This particular template sets up an IAM Role that allows the machines that it is associated with to have read access to the files stored in a particular folder in a particular bucket in S3. It has two input parameters, and it creates two resources. The input parameters allow you to tell it what bucket and folder you want the created role to access. The resources are an IAM Role and an instance profile. The role contains some policy definitions, allowing access to an S3 bucket, while the profile object gathers roles together to associate with EC2 instances. This probably is not a template that is going to be of much use to you, or anyone else, but when you run it, it does create resources in AWS (I think that they are all free!).

Anyway, here goes launching a stack with command line tools:

 
chris@frankentstein:~$ cfn-create-stack TestStack --template-file rr.template -p "BucketName=deathbydatadata;Folder=cfn" -n arn:aws:sns:us-east-1:99999999999:deathbydatadata
arn:aws:cloudformation:us-east-1:999999999999:stack/TestStack/e9dae100-47d8-11e2-b6f5-5081c366858d

One thing I have learned about setting up CFN stacks is that it is really useful to set up an SNS topic for CloudFormation to talk to using the "-n arn:aws:sns:..." notation, especially when you are writing a new stack. CloudFormation will send very detailed notifications about what is going on, and sometimes this is about the only way to diagnose what went wrong when a stack folds on creation because of some failure to create a resource. You can also, of course, watch the progression of stack creation through the events tab on the CloudFormation web panel, or you can periodically ask for updates from the API:

chris@frankentstein:~$ cfn-describe-stack-events -s TestStack
STACK_EVENT  TestStack  TestStack           AWS::CloudFormation::Stack  2012-12-16T23:35:47Z  CREATE_COMPLETE     
STACK_EVENT  TestStack  ReadBucketIProfile  AWS::IAM::InstanceProfile   2012-12-16T23:35:46Z  CREATE_COMPLETE     
STACK_EVENT  TestStack  ReadBucketIProfile  AWS::IAM::InstanceProfile   2012-12-16T23:33:26Z  CREATE_IN_PROGRESS  
STACK_EVENT  TestStack  ReadBucketRole      AWS::IAM::Role              2012-12-16T23:33:26Z  CREATE_COMPLETE     
STACK_EVENT  TestStack  ReadBucketRole      AWS::IAM::Role              2012-12-16T23:33:13Z  CREATE_IN_PROGRESS  
STACK_EVENT  TestStack  TestStack           AWS::CloudFormation::Stack  2012-12-16T23:32:53Z  CREATE_IN_PROGRESS  User Initiated

Yay, it looks as if the stack worked. (The CREATE_COMPLETE event for the AWS::CloudFormation::Stack at the top of the list is a good indicator of this.) One of the nice things about parameterizing stacks is that you can often update the parameters without destroying everything (although sometimes updates can be pretty ugly in terms of rebuilding resources - read the docs about your resources carefully before you do it to a production-level system...). Here is an update, changing the Folder parameter to "dsl" instead of "cfn":

chris@frankentstein:~$ cfn-update-stack TestStack --template-file rr.template -p "BucketName=deathbydatadata;Folder=dsl"
arn:aws:cloudformation:us-east-1:999999999999:stack/TestStack/e9dae100-47d8-11e2-b6f5-5081c366858d
chris@frankentstein:~$ cfn-describe-stack-events -s TestStack
STACK_EVENT  TestStack  TestStack           AWS::CloudFormation::Stack  2012-12-16T23:56:36Z  UPDATE_COMPLETE                      
STACK_EVENT  TestStack  TestStack           AWS::CloudFormation::Stack  2012-12-16T23:56:32Z  UPDATE_COMPLETE_CLEANUP_IN_PROGRESS  
STACK_EVENT  TestStack  ReadBucketRole      AWS::IAM::Role              2012-12-16T23:56:28Z  UPDATE_COMPLETE                      
STACK_EVENT  TestStack  ReadBucketRole      AWS::IAM::Role              2012-12-16T23:56:11Z  UPDATE_IN_PROGRESS                   
STACK_EVENT  TestStack  TestStack           AWS::CloudFormation::Stack  2012-12-16T23:55:58Z  UPDATE_IN_PROGRESS                   User Initiated

Note that in this case, all we needed to touch was the Role object, and CloudFormation was able to update it in place. Nifty. If I had machines associated with this role, they would now be able to read the "dsl" folder, but not the "cfn" folder of the deathbydatadata bucket. Outstanding!
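To make the effect of that update concrete, here is a minimal Ruby sketch of the resource ARN that the policy ends up with for each value of Folder. The helper name resource_arn is hypothetical (it is not part of CloudFormation or of any tool mentioned here); it simply concatenates the same pieces that the template's policy concatenates.

```ruby
# Hypothetical helper: concatenates the same pieces that the template's
# policy joins together to build its "Resource" string.
def resource_arn(bucket, folder)
  ["arn:aws:s3:::", bucket, "/", folder, "/*"].join("")
end

# Before the update (Folder=cfn) and after it (Folder=dsl).
puts resource_arn("deathbydatadata", "cfn")
puts resource_arn("deathbydatadata", "dsl")
```

Only the folder segment of the string changes, which fits with the role policy being the only thing CloudFormation had to touch.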


A Snippet of CfnDsl

Now, go back to the template a minute. The list of parameters is not too bad, but the resources part is ugly. Even though there are only two resources present, I get lost looking at the resource definitions. The real culprit here is the "readBucket" policy object, and it is especially bad because we are building up an ARN string out of its components. The template language has a tool that is sufficient for this kind of work in the form of the built-in function Fn::Join. It works a lot like you would expect a join command to work if you have used JavaScript, Perl or Ruby - it builds a string by concatenating together an array of strings, interspersed by a separator string. Here it is in detail:

"Fn::Join" : [ "",
                 [ "arn:aws:s3:::",
                   { "Ref" : "BucketName" },
                   "/",
                   { "Ref" : "Folder" },
                   "/*"
                 ]
               ]
}

You know, the "Ref"s don't help a whole lot either when you are trying to read this thing. So, what does this same thing look like in cfndsl?

FnFormat("arn:aws:s3:::%0/%1/*", Ref("BucketName"), Ref("Folder") )

Ref("BucketName") is ultimately going to turn into a "Ref" style JSON object. What's up with the FnFormat? It will ultimately resolve to a string with instances of %0 replaced with the value of the first parameter after the format string, %1 replaced by the second parameter after the format string, etc. AWS doesn't have one of those! Of course, it doesn't need one, as you can do the same thing with Fn::Join. If you use FnFormat, the ruby DSL will take care of figuring out how to write it in Fn::Join notation. So far, FnFormat is the only extra function that I have written for the DSL. The AWS built-in functions are all available by their Amazon names, with the "::" removed: FnJoin(...) produces {"Fn::Join": ...}.
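As a rough illustration of that rewriting, here is a minimal Ruby sketch of how a format string with %0, %1, ... placeholders could be turned into Fn::Join notation. This is not the code inside cfndsl: the names ref_sketch and fn_format_sketch are hypothetical, and the sketch only handles numbered placeholders.

```ruby
require "json"

# Hypothetical stand-in for the DSL's Ref(...): builds a "Ref" style hash.
def ref_sketch(name)
  { "Ref" => name }
end

# Hypothetical sketch of the idea behind FnFormat (not cfndsl's own code).
# Splits the format string on %N placeholders and substitutes the N-th
# argument for each one, then wraps the pieces in an Fn::Join with an
# empty separator.
def fn_format_sketch(format, *args)
  # The capture group makes String#split keep the placeholders in the result.
  pieces = format.split(/(%\d+)/).reject(&:empty?)
  parts = pieces.map do |piece|
    match = piece.match(/\A%(\d+)\z/)
    match ? args.fetch(match[1].to_i) : piece
  end
  { "Fn::Join" => ["", parts] }
end

joined = fn_format_sketch("arn:aws:s3:::%0/%1/*",
                          ref_sketch("BucketName"),
                          ref_sketch("Folder"))
puts JSON.generate(joined)
```

The literal text between placeholders becomes plain strings in the joined array, and each placeholder becomes whatever object was passed for it, so a Ref survives as a "Ref" hash that CloudFormation resolves at stack creation time.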


A whole Template in CfnDsl

Ok, so now that you have had a taste of it, here is the whole read bucket template written in cfndsl:

CloudFormation {
  AWSTemplateFormatVersion "2010-09-09"
  Parameter("BucketName") {
    Type :String
    Default "MyBucket"
    Description "Name of the bucket to grant read access to."
  }

  Parameter("Folder") {
    Type :String
    Default "myFolder"
    MinLength 2
    Description "Name of a folder in the bucket to grant read access to."
  }
  
  Resource("ReadBucketRole") {
    Type "AWS::IAM::Role"
    Property( "AssumeRolePolicyDocument", {
                "Statement" => [ {
                                   "Effect" => "Allow",
                                   "Principal"=> 
                                   {
                                     "Service" => [ "ec2.amazonaws.com" ]
                                   },
                                   "Action" => [ "sts:AssumeRole" ]
                                 } ]
              })
    Property("Path", "/")
    Property("Policies", 
             [ 
              { "PolicyName"=> "readBucket",
                "PolicyDocument"=> 
                {
                  "Statement" => 
                  [ 
                   {
                     "Effect" => "Allow",
                     "Action" => ["s3:GetObject","s3:GetObjectVersion"],
                     "Resource" => FnFormat("arn:aws:s3:::%0/%1/*", 
                                            Ref("BucketName"),
                                            Ref("Folder") )
                   }
                  ]
                }
              }
             ]
             )
  }
  
  Resource( "ReadBucketIProfile") {
    Type "AWS::IAM::InstanceProfile"
    Property( "Path", "/")
    Property( "Roles", [ Ref("ReadBucketRole") ] )
  }
}

Easier to read? I think so, but it is probably a matter of opinion. I like that the resources are declared individually, rather than as a long list. Resource properties are usually pretty simple, so I declare them here by just giving the value as a second parameter to the Property constructor keyword - you could actually use the block form just as easily. 

I have defined special objects for handling most of the top level stuff in a template - the template itself, Parameters, Resources, Mappings, Outputs, Metadata, and Resource Properties. I also have in place a means for dealing with function calls, discussed previously. There is of course a whole lot more to a template than these things, as many of the resources have complicated and dedicated data types used to specify their inner workings. While I eventually plan to capture some more of these in the same object notation, it is not always convenient to do so. When a particular type structure has not been explicitly implemented in the dsl, template authors can always fall back on creating ruby hashes and arrays that parallel the JSON notation for the structure that they are creating, and the dsl will handle it appropriately.
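The fallback works because ruby hashes and arrays map directly onto JSON objects and arrays. The sketch below shows only that mapping, using ruby's standard json library; it says nothing about how cfndsl does its own serialization. The constant name STATEMENT is made up for the example.

```ruby
require "json"

# A plain ruby hash that parallels the JSON for one policy statement.
# (STATEMENT is an illustrative name, not something cfndsl defines.)
STATEMENT = {
  "Effect" => "Allow",
  "Action" => ["s3:GetObject", "s3:GetObjectVersion"]
}.freeze

# The standard library turns hashes into JSON objects and arrays into
# JSON arrays, keeping the keys in insertion order.
puts JSON.generate(STATEMENT)
```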


Running CfnDsl

How do you turn this ruby thing into something that AWS understands? Ah - simplicity itself (assuming that you have ruby 1.9). First, you need to get yourself set up with the cfndsl ruby gem:

chris@frankentstein:~$ sudo gem install cfndsl
Fetching: cfndsl-0.0.4.gem (100%)
Successfully installed cfndsl-0.0.4
1 gem installed
Installing ri documentation for cfndsl-0.0.4...
Installing RDoc documentation for cfndsl-0.0.4...

Then you just run cfndsl on the ruby file:

chris@frankentstein:~$ cfndsl rr.rb 
{"AWSTemplateFormatVersion":"2010-09-09","Parameters":{"BucketName":{"Type":"String","Default":"MyBucket","Description":"Name of the bucket to grant read access to."},"Folder":{"Type":"String","Default":"myFolder","Description":"Name of a folder in the bucket to grant read access to.","MinLength":2}},"Resources":{"ReadBucketRole":{"Type":"AWS::IAM::Role","Properties":{"AssumeRolePolicyDocument":{"Statement":[{"Effect":"Allow","Principal":{"Service":["ec2.amazonaws.com"]},"Action":["sts:AssumeRole"]}]},"Path":"/","Policies":[{"PolicyName":"readBucket","PolicyDocument":{"Statement":[{"Effect":"Allow","Action":["s3:GetObject","s3:GetObjectVersion"],"Resource":{"Fn::Join":["",["arn:aws:s3:::",{"Ref":"BucketName"},"/",{"Ref":"Folder"},"/*"]]}}]}}]}},"ReadBucketIProfile":{"Type":"AWS::IAM::InstanceProfile","Properties":{"Path":"/","Roles":[{"Ref":"ReadBucketRole"}]}}}}

There it is, ready to build resources with! How do you build a stack with it? Well, so far I have just been redirecting the output of cfndsl into a text file and then running "cfn-create-stack" referencing the result. There may be better ways to hook these tools together.
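Since the generated template comes out as a single long line, it can be handy to pretty-print it before saving it. Here is a minimal sketch using only ruby's standard json library; the helper name pretty_template and the shortened sample string are made up for illustration, and in practice the input would be whatever cfndsl printed.

```ruby
require "json"

# Hypothetical helper: re-indents a compact JSON template for reading.
def pretty_template(compact_json)
  JSON.pretty_generate(JSON.parse(compact_json))
end

# A shortened sample in the same shape as the generated output.
sample = '{"AWSTemplateFormatVersion":"2010-09-09",' \
         '"Parameters":{"Folder":{"Type":"String","MinLength":2}}}'
puts pretty_template(sample)
```

The pretty-printed text is still the same JSON document, so it can be saved to a file and passed to cfn-create-stack in the same way as the single-line version.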

It could be that these few improvements are enough to justify having a ruby dsl behind what is effectively a JSON dsl. As I said before, I believe that the dsl representation of my template is a little nicer. However, ruby allows some other things that we have not explored yet. Not the least of these is comments - sometimes a small (or a large) number of comments in a piece of code keeps it maintainable. Ruby lets you do much more than that, of course, but I will save it for next time.
