AWS CloudFormation Resource Specification Auditor

How wanting to generate code from CloudFormation specifications led to auditing AWS documentation with Python and GitHub Actions

Derek Ardolf


NOTE: This post expands on some content previously mentioned in Scraping Docs to Generate PowerShell Help in VaporShell, but that post doesn’t need to be read to understand what’s happening here.

“This looks cool.”

When I first saw the AWS CloudFormation Resource Specification files in either 2016 or 2017, which represent the services CFN supports in each region, I imagined they would be of great use to InfraCoders. I love making and improving automation tools. How else could one possibly keep up with the number of new services supported by AWS CloudFormation?

I was curious about how code was generated for development kits that assisted in creating CFN templates like SparkleFormation, Troposphere, probably the AWS Cloud Development Kit, and VaporShell.

I revisited the idea in a recent post, and it led to a list of gaps that I wanted to create tooling around.

Some Thoughts on AWS CloudFormation Documentation

AWS documentation seems to be out-of-sync across a variety of places when it comes to:

  • What AWS services are supported in each region
  • What AWS services are supported by AWS CloudFormation in each region

Instead of living in a single place, this information is spread across a spectrum of sources that can contradict each other at times.

The solution has been left to individuals, who create their own source of truth that quickly falls out of date without automation. I may make additional tools in the future that scrape more pages and try to generate a single consensus landing page that also notes where the pages contradict each other.

Another thing: the documentation source for the AWS CloudFormation User Guide, on GitHub, isn’t so much the source as a duplicate of the live CFN User Guide, and it looks to be updated manually and inconsistently. As a result, the documentation source does not reflect the documentation that is live, and it also seems to cause confusion around how to approach PRs to the duplicated source documentation:

  • There are currently >90 PRs
  • Merged changes push the “source” on GitHub further out-of-sync with the live docs, since it already does not reflect the live/public documentation it is supposedly the “source” for. Managing this manually must be a headache.
  • Sometimes, there is no documentation existing in the GitHub source repository for a resource or property type. This can be the case even if the documentation exists on the live website.
  • PRs are routinely closed, without merging, after an assignee mentions that a ticket was made internally to make the correction to the resource specification files. This happens because chunks of the user guide documentation are auto-generated from the spec files.

The spec files in question are not tracked on GitHub by AWS, yet they are JSON and are used to auto-generate large portions of the AWS CFN User Guide. They are also used by third-party tools made to assist in CFN template creation/linting/etc. What to do?

Resource Specification Auditor

I have made my own repository for tracking and auditing these spec files in the meantime: AWS CloudFormation Resource Specification Auditor.

This led to automating it all with GitHub Actions. I signed up for the beta, and now use GitHub Actions with Python scripts to:

  • Look for new cfn spec updates, and download/push to the repo if discovered
  • Catalog what regions are supported for each service in CFN
  • Audit documentation links and AWS CFN User Guide source repo for consistency
    • What links are broken links?
    • What documentation exists on the live documentation website, but not the GitHub source for CFN documentation?

Implementation

GitHub Actions are in beta, and I’ve only run into one bug, where a job looks stuck in a workflow execution loop without console output or a way to close it out. I opened a support ticket with GitHub, and they got back to me about it being a known bug where the workflows appear to be hung, but nothing is actually happening (?). As of today, I still have previously run workflows that show they haven’t stopped executing.

[Screenshot: WorkflowBroken]

To work around this, I had to rename and push a new workflow YAML file. This essentially creates a new GitHub Actions workflow under the new name, allowing for new runs to execute without depending on the completion of the hung workflow.

[Screenshot: WorkflowBroken]

To get a nice workflow output like this, what does the workflow YAML look like?

For GitHub Actions, your workflows need to exist in the following location: .github/workflows/myworkflow.yml

I used the default for a daily-scheduled Python Application workflow, and modified it to work with Pipenv and my Python auditing scripts (the cron expression below runs it at 02:00 UTC, Monday through Friday):

name: CFN Specification and User Guide Audit

on:
  schedule:
  - cron: '0 2 * * 1-5'

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v1
    - name: Set up Python 3.7
      uses: actions/setup-python@v1
      with:
        python-version: 3.7
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip setuptools
        git submodule update --init --recursive --remote
        pip install pipenv
        pipenv install --dev
    - name: Lint with flake8
      run: |
        # stop the build if there are Python syntax errors or undefined names
        pipenv run flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
        pipenv run flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
    - name: Look for and update new CFN specs if found
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
        AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      run: |
        pipenv run python tools/cfn-resource-list.py
    - name: Audit documentation links and cfn user guide
      run: |
        pipenv run python tools/cfn-supported-region-generator.py
    - name: Create pull request if any files were updated
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: |
        pipenv run python tools/create-pull-request.py
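
To give a feel for the "look for new specs" step in the workflow, here is a minimal sketch of one way to decide whether a downloaded spec is newer than the copy in the repo. It is not the code in the tools/ scripts. It assumes both specs are already loaded as dicts and relies on the `ResourceSpecificationVersion` key holding a dotted numeric version; the function names are hypothetical.

```python
def parse_version(version):
    """Turn a dotted version string such as '6.3.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def spec_needs_update(local_spec, remote_spec):
    """Return True when the remote spec reports a higher version than the local copy."""
    local = parse_version(local_spec["ResourceSpecificationVersion"])
    remote = parse_version(remote_spec["ResourceSpecificationVersion"])
    # Comparing tuples of ints avoids the string-comparison trap
    # where "10.0.0" would sort before "9.9.0".
    return remote > local
```

A script could call `spec_needs_update(local, remote)` per region and only write the new file (and later open a pull request) when it returns True.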

As seen above, GitHub Actions allows access to secrets that can be stored in the repository settings on GitHub.

I have seen many examples and tutorials online that use the AmazonS3ReadOnlyAccess policy to grant read access to public S3 buckets. It looks like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": "*"
        }
    ]
}

Don’t do this.

This allows anyone using the policy to list/download ANYTHING stored in S3 on your own account too, unless bucket-specific policies prevent it, in addition to being able to access public S3 buckets owned by other AWS accounts.

It is best practice to go with the most explicit, restricted route. Since I know what public buckets I want to access, the policy looks like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::cfn-resource-specifications-us-east-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-us-east-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-eu-north-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-eu-north-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-ap-south-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-ap-south-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-eu-west-3-prod",
                "arn:aws:s3:::cfn-resource-specifications-eu-west-3-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-eu-west-2-prod",
                "arn:aws:s3:::cfn-resource-specifications-eu-west-2-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-eu-west-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-eu-west-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-ap-northeast-3-prod",
                "arn:aws:s3:::cfn-resource-specifications-ap-northeast-3-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-ap-northeast-2-prod",
                "arn:aws:s3:::cfn-resource-specifications-ap-northeast-2-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-ap-northeast-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-ap-northeast-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-sa-east-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-sa-east-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-ca-central-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-ca-central-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-ap-southeast-2-prod",
                "arn:aws:s3:::cfn-resource-specifications-ap-southeast-2-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-ap-southeast-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-ap-southeast-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-eu-central-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-eu-central-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-us-east-2-prod",
                "arn:aws:s3:::cfn-resource-specifications-us-east-2-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-us-west-1-prod",
                "arn:aws:s3:::cfn-resource-specifications-us-west-1-prod/*",
                "arn:aws:s3:::cfn-resource-specifications-us-west-2-prod",
                "arn:aws:s3:::cfn-resource-specifications-us-west-2-prod/*"
            ]
        }
    ]
}
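
Since the bucket names above all follow the same cfn-resource-specifications-REGION-prod pattern, a policy like this does not have to be maintained by hand. The following is a minimal sketch, assuming that naming pattern holds for every region you care about; `build_spec_read_policy` is a hypothetical helper, not part of the auditor repository.

```python
import json


def build_spec_read_policy(regions):
    """Build a read-only IAM policy limited to the CFN spec buckets for the given regions."""
    resources = []
    for region in regions:
        bucket_arn = f"arn:aws:s3:::cfn-resource-specifications-{region}-prod"
        resources.append(bucket_arn)         # the bucket itself (for listing)
        resources.append(f"{bucket_arn}/*")  # the objects inside it (for downloads)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:Get*", "s3:List*"],
                "Resource": resources,
            }
        ],
    }


# Two regions shown for brevity; pass the full region list in practice.
policy = build_spec_read_policy(["us-east-1", "eu-north-1"])
print(json.dumps(policy, indent=4))
```

Adding a region then becomes a one-line change to the region list rather than two hand-typed ARNs.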

The Future

The AWS CloudFormation Resource Specification Auditor should continue to update itself over time, and the resulting JSON files may be helpful to people wanting to easily see:

  • What resources are supported in what regions in CloudFormation?
  • What documentation links are broken within the CFN Resource Spec files provided by AWS?
  • What resources, if any, are mentioned in specification files for other regions but not in the us-east-1 file? Certain tools have depended on the us-east-1 spec file as the master source file, though my tooling has discovered cases where resources that are supported were missing from it.
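
The questions above can be answered with a fairly small amount of code once the per-region spec files are loaded. Here is a minimal sketch, assuming a dict that maps each region name to its parsed spec. The function names and the `AWS::Example::Thing` resource type are hypothetical and exist only for illustration; this is not the code in the auditor repository.

```python
def resources_by_region(specs_by_region):
    """Map each resource type to the sorted list of regions whose spec file includes it."""
    support = {}
    for region, spec in specs_by_region.items():
        for resource_type in spec.get("ResourceTypes", {}):
            support.setdefault(resource_type, set()).add(region)
    return {name: sorted(regions) for name, regions in sorted(support.items())}


def missing_from_reference(support, reference_region="us-east-1"):
    """List resource types that appear in some region's spec but not in the reference region's."""
    return sorted(
        name for name, regions in support.items() if reference_region not in regions
    )


# Hypothetical, trimmed-down specs used only for illustration.
specs = {
    "us-east-1": {"ResourceTypes": {"AWS::S3::Bucket": {}}},
    "eu-west-1": {"ResourceTypes": {"AWS::S3::Bucket": {}, "AWS::Example::Thing": {}}},
}

support = resources_by_region(specs)
missing = missing_from_reference(support)
```

With the sample data, `support` lists both regions for `AWS::S3::Bucket`, and `missing` contains only the made-up `AWS::Example::Thing`, which is the kind of entry a tool relying solely on the us-east-1 file would never see.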

I’ll continue to expand it when I run into more issues.

Bonus Thoughts

I thought a good feature for the AWS CFN User Guide documentation would be to list all supported regions in the documentation for each resource and property type, so I have created the following as a PR for Hacktoberfest:

Join the conversations on them if they seem worthwhile or could use some feedback!

Final Notes

Want help improving your CloudFormation template workflow? Take a look at:

  • cfn-python-lint (lint your CFN templates; the tool includes region-specific checks!)
  • AWS CDK: Use Amazon’s official tooling to generate CFN templates via your favorite coding language instead of raw JSON/YAML
  • The AWS CDK and cfn-python-lint project both have base directories that have been used for on-the-fly patches against the base CFN spec files, and are good places to keep an eye on if creating cfn-spec-dependent workflows (or if just running into issues):

This article can also be viewed on dev.to
