On-Call: Using Lambda, SNS and CloudWatch

I HATE on-call. That’s why, when I decided to start my own side hustle, I wanted to make sure there were no 4 AM wake-ups unless necessary. I designed a simple, cheap process for monitoring websites using AWS services. The basic process is as follows:

  • List the Hosted Zones we have in Route 53.
  • Go through each Hosted Zone and read its comment to find the primary A record (subdomain) to monitor.
  • Make a web request to each URL.
    • If the site is up (HTTP 200), we log “1” to that site’s CloudWatch metric, which an alarm watches.
    • If the site is down (any other HTTP code), we log “0” to that site’s CloudWatch metric.

All of the alarms publish to an SNS topic that sends text messages if a site is down. We also receive an alert if the Lambda function isn’t working (i.e. no data is reaching the alarms).
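The check-and-log step above can be sketched in Python roughly as follows. This is a minimal illustration, not the repository's actual code: the function names, the metric name `SiteUp`, the dimension `Domain`, and the namespace `WebsiteMonitoring` are all hypothetical, and `publish` assumes boto3 and AWS credentials are available (as they would be inside Lambda).

```python
import urllib.error
import urllib.request


def site_status(url, timeout=10, opener=urllib.request.urlopen):
    """Return 1 if the URL answers with HTTP 200, otherwise 0."""
    try:
        with opener(url, timeout=timeout) as response:
            return 1 if response.status == 200 else 0
    except (urllib.error.URLError, OSError):
        # urlopen raises for 4xx/5xx responses, DNS failures and timeouts;
        # this sketch treats all of them as "down".
        return 0


def metric_datum(domain, value):
    """Build one MetricData entry for CloudWatch (names are hypothetical)."""
    return {
        "MetricName": "SiteUp",
        "Dimensions": [{"Name": "Domain", "Value": domain}],
        "Value": value,
        "Unit": "Count",
    }


def publish(datum, namespace="WebsiteMonitoring"):
    """Send the datum to CloudWatch. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above run without it

    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace, MetricData=[datum]
    )
```

Inside the Lambda handler this would be called once per monitored site, e.g. `publish(metric_datum("hammo.io", site_status("https://www.hammo.io/")))`. Note that `urlopen` follows redirects, so in this sketch a redirect that ends on a 200 page counts as up.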

Getting Up and Running

I’ve created a simple solution to enable this monitoring for each Hosted Zone. We go into Route 53 and add a comment in the following format to each of our Hosted Zones:

ON_OR_OFF;SUBDOMAIN;AVAILABILITY_HOURS

e.g. on;www;24x7
  • ON_OR_OFF: lowercase “on” or “off”. If set to “on”, the script will monitor your website.
  • SUBDOMAIN: the subdomain to monitor. E.g. if my subdomain is www and my Hosted Zone is hammo.io, then it will check the URL https://www.hammo.io/.
  • AVAILABILITY_HOURS: Not yet implemented.
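To make the comment format concrete, here is a hedged sketch of how such a comment could be parsed into a URL. `MonitorConfig` and `parse_zone_comment` are hypothetical names, and the handling of malformed comments (returning `None`) is an assumption rather than the script's documented behaviour.

```python
from typing import NamedTuple, Optional


class MonitorConfig(NamedTuple):
    enabled: bool
    url: str
    availability_hours: str  # carried through as-is; not yet implemented


def parse_zone_comment(comment: str, zone_name: str) -> Optional[MonitorConfig]:
    """Parse 'ON_OR_OFF;SUBDOMAIN;AVAILABILITY_HOURS' for one Hosted Zone."""
    parts = [part.strip() for part in comment.split(";")]
    if len(parts) != 3 or parts[0] not in ("on", "off"):
        # Assumption: zones without a well-formed comment are skipped.
        return None
    on_or_off, subdomain, hours = parts
    # Route 53 returns zone names with a trailing dot, e.g. "hammo.io."
    domain = zone_name.rstrip(".")
    host = f"{subdomain}.{domain}" if subdomain else domain
    return MonitorConfig(on_or_off == "on", f"https://{host}/", hours)
```

With boto3, the comment and name for each zone are available from `list_hosted_zones()` as `zone["Config"]["Comment"]` and `zone["Name"]`, so `parse_zone_comment("on;www;24x7", "hammo.io.")` would yield the URL `https://www.hammo.io/`.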

Installing the System

git clone https://github.com/HammoTime/aws-www-monitoring.git
cd aws-www-monitoring/
terraform init
terraform apply

That’s it, you’re all done! You will receive an initial ALARM text, but within five minutes you should receive the OK text (unless, of course, your site is down).

On-call message received from solution

Use Cases

In most circumstances for simple websites, I would use Route 53 Health Checks. However, I designed this solution as a simple project that someone could fork as a starting point for more complex monitoring. Some websites need specific checks against their APIs, and this Lambda-based solution provides the framework for that.
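As an illustration of the kind of API-specific check a fork might add, the sketch below treats a site as up only when it returns HTTP 200 and a JSON body reporting a healthy status. The function name and the `{"status": "ok"}` response shape are purely hypothetical; a real API would define its own health contract.

```python
import json


def api_is_healthy(status_code: int, body: bytes) -> int:
    """Return 1 only for HTTP 200 with a JSON object containing status 'ok'."""
    if status_code != 200:
        return 0
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON (e.g. an HTML error page served with a 200).
        return 0
    if isinstance(payload, dict) and payload.get("status") == "ok":
        return 1
    return 0
```

The return value uses the same 1/0 convention as the basic uptime check, so it could feed the same kind of metric and alarm.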

If you have any questions or problems, leave me a comment here. Thanks for reading, and see you for the next article which will be the first in our WordPress series.
