Validating CfCT Configurations Before They Break Your Pipeline…or Your Spirit

A couple of weeks ago, I attended a Revolutionary War battle re-enactment at Guilford Courthouse National Military Park, and my oldest, a history lover, really embraced the atmosphere and began talking like someone living in 1780s America. He’s pretty good at it…I am not. I was going to open this post with a simple “I got tired of waiting,” and though that statement is true, it doesn’t carry enough weight or really capture the pain I experienced recently while working with CfCT and CodePipeline.

If you’ve ever worked with AWS Customizations for Control Tower (CfCT), or the AWS Landing Zone Accelerator (LZA), or perhaps anything using a multi-stage CodePipeline workflow, you know exactly what I mean.

You make a change to your manifest or tweak a parameter file, push it, and then sit there watching CodePipeline slowly work through its stages, hoping and praying with all your might that it completes, only for it to fail ten minutes later because of a typo or a malformed JSON file. After banging your head upon your desk, you fix the problem, push the changes to your repo, rerun the pipeline, and wait again.

It can be a painful loop, and after going through it one too many times, I could take it no longer. In my best 1780s English, “My patience, long tried, was at last exhausted.”

In an attempt to maintain my sanity, I created a GitHub Actions workflow that validates the contents of the CfCT repository before the pipeline ever runs. When CodePipeline kicks off, I feel more confident that the CfCT configuration is already known-good. The slow feedback loop doesn’t disappear and the pipeline isn’t any faster, but by validating at pull request time, you spend far less time waiting on runs that were doomed to fail.

The Problem with CfCT and Slow Feedback Loops

You can read more about it here, but in a nutshell, CfCT lets you manage Control Tower customizations — SCPs, CloudFormation StackSets, account-level resources — through a Git-based workflow backed by a CodePipeline. The pipeline itself is the validation mechanism, and I’ll say it’s slow. A failed run means waiting for the pipeline to start, execute, and fail before you find out something was wrong.

The general categories of errors that I have encountered while using CfCT are shown below:

  • manifest.yaml has invalid structure — missing required fields, wrong indentation, unsupported keys
  • JSON parameter files are malformed — a trailing comma, a missing bracket, something that trips up the parser
  • Parameter files reference keys that don’t exist in the CloudFormation template — or skip required parameters that have no default
  • CloudFormation templates have security issues — things cfn_nag would catch, like overly permissive IAM policies or missing encryption settings

The CfCT validation workflow aims to address these issues.

Step 1: Validate manifest.yaml Structure with pykwalify

The first thing the workflow does on every push or pull request is validate the structure of manifest.yaml against a local schema file using pykwalify.


- name: Validate manifest.yaml
  run: |
    pip install pykwalify pyyaml
    pykwalify -d manifest.yaml -s schema/cfct-schema.yaml

The schema (cfct-schema.yaml) defines what a valid manifest looks like – required keys, allowed values, data types, and so on. If manifest.yaml doesn’t conform, the workflow fails immediately with a clear error message pointing to exactly what’s wrong.
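To give a feel for the format, here is a hypothetical excerpt of what such a pykwalify schema might look like. This is only a sketch: the key names under `resources` mirror common CfCT manifest fields, but it is nowhere near the full CfCT schema.

```yaml
# Hypothetical excerpt of schema/cfct-schema.yaml (pykwalify format).
# Illustrative only — the real CfCT manifest supports more keys than shown.
type: map
mapping:
  region:
    type: str
    required: true
  version:
    type: str
    required: true
  resources:
    type: seq
    required: true
    sequence:
      - type: map
        mapping:
          name:
            type: str
            required: true
          resource_file:
            type: str
          parameter_file:
            type: str
          deploy_method:
            type: str
            enum: [stack_set, scp]
```

With a schema like this, a manifest missing `region` or using an unsupported `deploy_method` fails fast, with pykwalify reporting the offending path.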

One practical note: the CfCT schema is technically available online, but the URL to access it has never worked reliably in practice, so the schema lives directly in the repo. Some might say a local schema is a good thing, as it means validation works consistently without depending on an external resource being available. Others may say, “blahhhh…why isn’t this referencing an online schema?” Personally, I’m a mix of both. I would prefer to reference the online schema, but in the short term I wanted the workflow to work, so I copied it locally into the repo.

Step 2: Discover and Validate JSON Parameter Files

CfCT resources can reference external JSON parameter files via parameter_file entries in the manifest. The workflow extracts these dynamically using a small Python script rather than hardcoding paths.

- name: Discover parameter files from manifest.yaml
  id: params
  run: |
    python3 << 'EOF'
    import os
    import yaml

    manifest = yaml.safe_load(open("manifest.yaml"))

    # Collect every parameter_file referenced by a resource
    param_files = []
    for resource in manifest.get("resources", []):
        pf = resource.get("parameter_file")
        if pf:
            param_files.append(pf)

    # Expose the list as a step output for the next step
    github_output = os.environ.get("GITHUB_OUTPUT")
    if github_output:
        with open(github_output, "a") as f:
            f.write(f"param_files={','.join(param_files)}\n")
    EOF

This outputs the list of parameter files as a step output, which the next step picks up and loops over to check each one for valid JSON syntax.

- name: Validate JSON Parameter Files
  if: steps.params.outputs.param_files != ''
  run: |
    IFS=',' read -ra FILES <<< "${{ steps.params.outputs.param_files }}"
    for file in "${FILES[@]}"; do
      if ! python3 -m json.tool < "$file" > /dev/null 2>&1; then
        echo "❌ Invalid JSON in: $file"
        exit 1
      fi
      echo "✅ Valid JSON: $file"
    done

This catches syntax errors — malformed JSON that would silently cause a pipeline failure — before anything else runs.

Step 3: Validate Parameters Against CloudFormation Templates

Valid JSON isn’t the same as correct JSON. A parameter file can be perfectly well-formed but still reference a key that doesn’t exist in the CloudFormation template, or omit a required parameter that has no default value. Either will cause a stack deployment to fail.

The workflow handles this with a custom Python script (validate_cf_parameters.py) that compares the parameter file against the actual template. The script does two things:

  1. Checks for extra parameters — keys in the JSON file that aren’t declared as Parameters in the template
  2. Checks for missing required parameters — parameters declared in the template that have no Default value and aren’t present in the JSON file
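The core of that comparison is simple set logic. Below is a minimal sketch of the two checks — not the actual validate_cf_parameters.py — which assumes the template has already been parsed into a dict and the parameter file uses CloudFormation’s standard `[{"ParameterKey": ..., "ParameterValue": ...}]` list format.

```python
# Sketch of the two checks, assuming the template is already parsed into a
# dict and the parameter file uses CloudFormation's standard
# [{"ParameterKey": ..., "ParameterValue": ...}] list format.

def check_parameters(template: dict, param_entries: list) -> list:
    """Return a list of human-readable problems (empty list == valid)."""
    declared = template.get("Parameters", {})
    supplied = {p["ParameterKey"] for p in param_entries}

    problems = []
    # 1. Extra parameters: supplied keys the template never declares
    for key in sorted(supplied - set(declared)):
        problems.append(f"extra parameter not in template: {key}")
    # 2. Missing required parameters: declared, no Default, not supplied
    for key, spec in declared.items():
        if "Default" not in spec and key not in supplied:
            problems.append(f"missing required parameter: {key}")
    return problems
```

A typo like `BucketNam` instead of `BucketName` trips both checks at once: the misspelled key is flagged as extra, and the real key is flagged as missing.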

One thing worth calling out: CloudFormation templates use intrinsic functions like !Ref, !Sub, and !GetAtt in YAML. The script registers custom constructors for all of these so they don’t trip up the YAML parser.
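One way to do that with PyYAML — a sketch, not necessarily how the script registers them — is a single catch-all multi-constructor for every `!`-prefixed tag. The exact long-form mapping produced doesn’t matter for validation; we only need the template to parse without errors.

```python
import yaml

# Catch-all constructor for CloudFormation short-form intrinsics
# (!Ref, !Sub, !GetAtt, ...) so SafeLoader doesn't choke on unknown tags.
# We only need the template to parse, not to resolve the functions, so the
# raw node value wrapped under a synthetic key is good enough.
class CfnLoader(yaml.SafeLoader):
    pass

def _cfn_tag(loader, tag_suffix, node):
    if isinstance(node, yaml.ScalarNode):
        return {f"Fn::{tag_suffix}": loader.construct_scalar(node)}
    if isinstance(node, yaml.SequenceNode):
        return {f"Fn::{tag_suffix}": loader.construct_sequence(node)}
    return {f"Fn::{tag_suffix}": loader.construct_mapping(node)}

# The "!" prefix catches every short-form tag in one registration
CfnLoader.add_multi_constructor("!", _cfn_tag)

template = yaml.load("BucketName: !Ref MyBucket", Loader=CfnLoader)
# → {"BucketName": {"Fn::Ref": "MyBucket"}}
```

Without the registration, `yaml.safe_load` raises a `ConstructorError` on the first `!Ref` it meets, which would make every real-world template fail validation for the wrong reason.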

Step 4: Run cfn_nag on Changed Templates

The last step runs cfn_nag, a static analysis tool that scans CloudFormation templates for security and compliance issues. Rather than scanning everything on every run, the workflow detects which templates have been changed and only scans those.

cfn_nag catches things like IAM policies with wildcards, missing S3 bucket encryption, and security groups open to 0.0.0.0/0. Getting this feedback on a PR rather than discovering it post-deployment is a meaningful improvement.
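The change detection itself can be as simple as `git diff --name-only` against the base branch plus a path filter. Here is a sketch of the filtering half — the `templates/` prefix and the extension list are assumptions about repo layout, not something the workflow mandates.

```python
# Sketch: given the file list from `git diff --name-only origin/main...HEAD`,
# keep only CloudFormation templates so cfn_nag scans just what changed.
# The "templates/" prefix and extension tuple are assumptions about layout.

TEMPLATE_EXTS = (".yaml", ".yml", ".json", ".template")

def changed_templates(changed_files):
    return [
        f for f in changed_files
        if f.startswith("templates/") and f.endswith(TEMPLATE_EXTS)
    ]
```

On a PR that only touches `manifest.yaml` and a README, this returns an empty list and the cfn_nag step can be skipped entirely, keeping the workflow fast.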

Putting It Together

The sequence is intentional, each step building on the last: is the manifest valid? Are the parameter files valid JSON? Do the parameters match what the templates expect? Do the templates pass security checks?

If any step fails, you get a clear error immediately. No waiting on CodePipeline. No cryptic pipeline logs. Decreased aggravation. A win for everyone? We’ll see.

If you’d like to try out the CfCT validation workflow, you can find it here. If you are able to get it to work with a publicly available schema file, let me know. 🙂
