
In my previous post, Building Java application over AWS Cloudformation, I gave an overview of how I built the CloudFormation stack for my application.

As mentioned in that post, much of the logic lives in the UserData script of the CloudFormation template.

What I found was that Amazon would start the machine, but the machine did not always fully boot. The user script would sometimes get stuck, and Amazon would still think that everything was OK.

So what is the solution?

Create file at end of script

For local debugging, I made the user script end by creating a file that records all the installation parameters. This gives me an easy way to check whether the script finished and to see what type of installation was run.

For example, I create a file named allok.txt that contains the following information:

TIME: 198

LIQUIBASE_VERSION: 3.5.3

JAVA_VERSION: 8u111

KAFKA_VERSION: 2.11-0.10.1.0

KAFKA_MANAGER_VERSION: 1.3.2.1

TRIFECTA_VERSION: 0.21.3

INSTANCE_ID: i-zzyyyxxx

APP_VERSION: 1.0.1-2107


elasticip: a.b.c.d

privateip: 172.31.34.4
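
As a minimal sketch, the end of the user script could write such a file as shown here. The variable names, the output path, and the assumption that TIME is the elapsed run time in seconds are all illustrative, not taken from my actual script.

```shell
#!/bin/bash
# Hypothetical example: record the start time at the top of the user script.
START_TIME=$(date +%s)

# Hypothetical values; a real script would set these during installation.
LIQUIBASE_VERSION="3.5.3"
JAVA_VERSION="8u111"
APP_VERSION="1.0.1-2107"

# Illustrative location for the status file (an assumption).
STATUS_FILE="${TMPDIR:-/tmp}/allok.txt"

write_status_file() {
  # Elapsed seconds since the script started.
  local elapsed=$(( $(date +%s) - START_TIME ))
  {
    echo "TIME: ${elapsed}"
    echo "LIQUIBASE_VERSION: ${LIQUIBASE_VERSION}"
    echo "JAVA_VERSION: ${JAVA_VERSION}"
    echo "APP_VERSION: ${APP_VERSION}"
  } > "${STATUS_FILE}"
}

# Call this as the very last step, so the file only exists if everything ran.
write_status_file
```

Because the file is written last, its presence shows that the script reached the end, and its absence points to a script that got stuck or failed earlier.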


Fail CloudFormation if not fully booted

While this was nice, it does not solve the real problem: if the server did not boot properly, CloudFormation still reports the stack as created even though it is not running correctly.

CloudFormation has a feature for this, the CreationPolicy attribute. It lets you require that a signal be sent from the machine, and only once the signal is received is the resource considered created. You also set a timeout: if the signal has not arrived by the time it expires, creation of the resource fails, and the stack creation fails with it.

For example, add the following section to the AutoScalingGroup resource:

"CreationPolicy": {
  "ResourceSignal": {
    "Count": "1",
    "Timeout": "PT10M"
  }
}

This tells the Auto Scaling group that it must receive one success signal within 10 minutes (PT10M is an ISO 8601 duration). The count should be set to the minimum number of nodes that need to be running.

Signaling after boot

To signal after boot, add a call to cfn-signal at the end of the user script in the LaunchConfiguration section.

For example:

"BatchLaunchConfiguration" : {
  "Type" : "AWS::AutoScaling::LaunchConfiguration",
  "Properties" : {
    "UserData" : {
      "Fn::Base64" : {
        "Fn::Join" : ["\n", [
          "#!/bin/bash -x",
          "exec >& /var/log/cloud-output.log",
          { "Fn::Join" : [" ", [
            "curl -s --retry 30",
            { "Ref" : "UserDataS3URL" },
    ...
            "bash"
          ]]},
          { "Fn::Join" : ["", [
            "/opt/aws/bin/cfn-signal -e $? ",
            "  --stack ", { "Ref" : "AWS::StackName" },
            "  --resource BatchAutoScalingGroup ",
            "  --region ", { "Ref" : "AWS::Region" }
          ]]}
        ]]
      }
    }
  }
}

This runs the cfn-signal command to let CloudFormation know that the script has finished. The -e $? argument passes along the exit code of the preceding command, so a non-zero exit code is reported as a failure.

To check that the command ran correctly, look at the following log file on the machine: /var/log/cloud-output.log
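
One caveat worth illustrating, assuming the elided part of the user script pipes the download into bash: by default, the exit code of a pipeline is the exit code of its last command only. If the download fails but bash exits cleanly on empty input, $? is 0 and a success signal is sent. The sketch below uses false and cat as stand-ins for the download and the interpreter (demo_install is a hypothetical name) and shows how bash's pipefail option changes the result.

```shell
#!/bin/bash
# Stand-in for "curl ... | bash": the producer (false) fails,
# while the consumer (cat) succeeds on empty input.
demo_install() {
  false | cat
}

# Default behavior: only the last command's exit code counts.
status_without=0
demo_install || status_without=$?

# With pipefail, the pipeline fails if any command in it fails.
set -o pipefail
status_with=0
demo_install || status_with=$?

echo "without pipefail: ${status_without}"
echo "with pipefail: ${status_with}"
```

With pipefail enabled near the top of the user script, a failed download would reach cfn-signal as a non-zero exit code rather than being masked.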