AWX in Practice

In my previous post, I got AWX running on k3s with a custom Execution Environment for Cisco collections. Time to actually use it for something practical: a self-service VLAN provisioning job that lets anyone on the team provision a VLAN on specific switches by filling in a form — no CLI, no SSH, no risk of typos in config mode.

This post covers building it end to end, including the surprising number of gotchas I hit with surveys, host targeting, and variable scoping.

The Goal

A one-click (well, one-form) workflow where the operator:

1. Opens AWX
2. Fills in three fields: VLAN ID, VLAN name, and target switches
3. Hits Launch

AWX does the rest — validates, provisions, verifies — and logs who did what and when.

The Inventory

My master inventory in AWX has two groups:

switches
  ├── ios-switch-1
  └── ios-switch-2

routers
  ├── ios-router-1
  └── ios-router-2
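
For testing outside AWX, the same grouping can be written as a static YAML inventory. This is only a sketch: the file name inventory.yaml and the connection variables are my assumptions, not something AWX exports, and credentials still have to be supplied separately.

```yaml
# inventory.yaml (hypothetical file name): mirrors the two AWX groups above
all:
  children:
    switches:
      hosts:
        ios-switch-1:
        ios-switch-2:
      vars:
        # Assumed connection settings for the cisco.ios collection
        ansible_connection: ansible.netcommon.network_cli
        ansible_network_os: cisco.ios.ios
    routers:
      hosts:
        ios-router-1:
        ios-router-2:
```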

The Playbook

# vlan-provision.yaml
---
- name: Provision VLAN on Cisco switches
  hosts: "{{ limit }}"
  gather_facts: false

  vars:
    vlan_id: "{{ survey_vlan_id | default(0) | int }}"
    vlan_name: "{{ survey_vlan_name | default('') }}"

  tasks:
    - name: Validate VLAN ID range
      ansible.builtin.assert:
        that:
          - vlan_id >= 2
          - vlan_id <= 4094
          - vlan_id not in [1002, 1003, 1004, 1005]
        fail_msg: "VLAN ID is invalid or reserved (must be 2-4094, excluding 1002-1005)"

    - name: Create VLAN
      cisco.ios.ios_vlans:
        config:
          - vlan_id: "{{ vlan_id }}"
            name: "{{ vlan_name }}"
            state: active
        state: merged

    - name: Verify VLAN was created
      cisco.ios.ios_vlans:
        state: gathered
      register: vlan_verify

    - name: Assert VLAN exists
      ansible.builtin.assert:
        that: >
          vlan_verify.gathered
          | selectattr('vlan_id', 'equalto', vlan_id)
          | list | length == 1
        fail_msg: "VLAN was not found after provisioning"

    - name: Show result
      ansible.builtin.debug:
        msg: "VLAN {{ vlan_id }} ({{ vlan_name }}) successfully provisioned on {{ inventory_hostname }}"

Two things worth noting here before we get to the gotchas:

hosts: "{{ limit }}": the host targeting comes from the survey rather than being hardcoded. More on this below.
| default(0) and | default('') on the vars: these keep vlan_id and vlan_name from resolving to undefined when a survey answer is missing. More on this too.

AWX Setup

Job Template

Templates → Add → Job Template

Field	Value
Name	`vlan-provisioning`
Inventory	`master`
Project	your project
Playbook	`vlan-provision.yaml`
Execution Environment	`awx-ee-cisco`
Credentials	your switch Machine credential
Limit	(empty, prompt on launch: unchecked)

Survey

Templates → vlan-provisioning → Survey → Add

Question 1 — VLAN ID

Field	Value
Question	VLAN ID
Answer Variable Name	`survey_vlan_id`
Answer Type	Integer
Minimum	2
Maximum	4094
Required	✅

Question 2 — VLAN Name

Field	Value
Question	VLAN Name
Answer Variable Name	`survey_vlan_name`
Answer Type	Text
Required	✅

Question 3 — Which switches?

Field	Value
Question	Which switches?
Answer Variable Name	`limit`
Answer Type	Multiple Choice (single select)
Choices	`switches`, `ios-switch-1`, `ios-switch-2`
Required	✅

Enable the survey toggle → Save.
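
The same template and survey can also be kept as code instead of clicked together. The sketch below uses the awx.awx.job_template module. It assumes the awx.awx collection is installed, that authentication comes from the CONTROLLER_HOST and CONTROLLER_OAUTH_TOKEN environment variables, and that the project is called lab-playbooks (a placeholder). The survey_spec field names reflect my reading of the AWX survey format, so check them against your AWX version before relying on this.

```yaml
---
- name: Define the vlan-provisioning template as code (sketch)
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Create or update the job template with its survey
      awx.awx.job_template:
        name: vlan-provisioning
        job_type: run
        inventory: master
        project: lab-playbooks          # placeholder project name
        playbook: vlan-provision.yaml
        execution_environment: awx-ee-cisco
        ask_limit_on_launch: false      # the survey supplies the target
        survey_enabled: true
        survey_spec:
          name: ""
          description: ""
          spec:
            - question_name: VLAN ID
              variable: survey_vlan_id
              type: integer
              min: 2
              max: 4094
              required: true
            - question_name: VLAN Name
              variable: survey_vlan_name
              type: text
              required: true
            - question_name: Which switches?
              variable: limit
              type: multiplechoice      # single select
              choices:
                - switches
                - ios-switch-1
                - ios-switch-2
              required: true
        state: present
```

The credential is left out on purpose; it would be attached through the module's credentials list using whatever name your switch credential has.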

The Gotchas (there were many)

Gotcha 1 — `hosts:` can’t use arbitrary survey variables

My first instinct was:

hosts: "{{ target_switches }}"

With a survey variable target_switches. This fails immediately:

[ERROR]: Error processing keyword 'hosts': 'target_switches' is undefined

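The hosts: field is templated when the play starts, before any host or group variables are in scope, because no host has been selected yet. Only extra vars, play vars, and a few magic variables exist at that point. In my job, target_switches was not among them, hence the error. If a survey variable shows up as undefined here, check the variables passed to the job to confirm that the answer actually arrived under the name the playbook expects.

The fix that worked for me: name the survey variable limit and reference that in hosts:. AWX passes survey answers to ansible-playbook as --extra-vars, so the value is already defined when hosts: is templated. Note that this limit is an ordinary extra var. It is not the --limit option, whose value Ansible exposes as the magic variable ansible_limit.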

hosts: "{{ limit }}"
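
To turn a missing or malformed answer into a readable failure rather than a templating error on hosts:, a small pre-flight play can run first. This is a sketch I am suggesting, not part of the playbook above. It assumes the implicit localhost is usable in your Execution Environment, and it relies on Ansible ending the run when every host in a play has failed.

```yaml
---
# Placed before the provisioning play in the same playbook file
- name: Pre-flight check on survey input
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Ensure the survey supplied a usable host pattern
      ansible.builtin.assert:
        that:
          - limit is defined
          - limit is string
          - limit | length > 0
        fail_msg: "Survey variable 'limit' is missing or is not a string"
```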

Gotcha 2 — Multi-select survey passes a list, not a string

I set up Question 3 as Multiple Choice (multiple select). The job ran but targeted all switches regardless of what was selected. The variables passed to the job showed why:

{
  "limit": ["ios-switch-1"]
}

A list ["ios-switch-1"] instead of the string "ios-switch-1" that the templated hosts: line was written for. In my runs the list did not narrow the play, and the job ran against the full group.

The fix: change Answer Type to Multiple Choice (single select). If you need to target multiple specific switches, add combined options to the choices:

switches
ios-switch-1
ios-switch-2
ios-switch-1,ios-switch-2

A comma-separated string is a valid Ansible host pattern.
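
An alternative to listing every combination is to normalise the answer in the play itself, so that both a single string and a multi-select list end up as one comma-separated pattern. This is a sketch I have not run against this lab; it uses only the built-in string test and join filter.

```yaml
---
- name: Provision VLAN on Cisco switches
  # A string passes through unchanged; a list such as
  # ["ios-switch-1", "ios-switch-2"] becomes "ios-switch-1,ios-switch-2"
  hosts: "{{ limit if limit is string else (limit | join(',')) }}"
  gather_facts: false

  tasks:
    - name: Show which host this run reached
      ansible.builtin.debug:
        msg: "Targeted {{ inventory_hostname }}"
```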

Gotcha 3 — Hardcoded Limit in Job Template overrides the survey

After switching to single select, the job still hit all switches. The culprit: the Job Template had switches hardcoded in the Limit field. The template's Limit is passed to ansible-playbook as --limit, which is a separate mechanism from the survey's extra var, and a survey variable that happens to be named limit does not replace it.

The fix: clear the Limit field in the Job Template completely, and make sure Prompt on launch is unchecked. The survey drives everything.
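
When it is unclear which of the two mechanisms is steering a run, a temporary debug task at the top of tasks: makes both visible. This is a diagnostic sketch; ansible_limit and ansible_play_hosts are Ansible magic variables, and ansible_limit is only defined when --limit was actually passed.

```yaml
- name: Show both limit mechanisms (temporary diagnostic)
  ansible.builtin.debug:
    msg:
      - "extra var limit (from the survey): {{ limit | default('(undefined)') }}"
      - "--limit option (ansible_limit): {{ ansible_limit | default('(not set)') }}"
      - "hosts in this play: {{ ansible_play_hosts | join(', ') }}"
  run_once: true
```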

Gotcha 4 — `fail_msg` with variable interpolation crashes on undefined vars

My original assert:

fail_msg: "VLAN ID {{ vlan_id }} is invalid or reserved"

This caused:

Error while resolving value for 'fail_msg': 'survey_vlan_id' is undefined

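Even though vlan_id is defined in vars: as "{{ survey_vlan_id | int }}", Ansible resolves vars lazily: the expression is only templated when something references vlan_id, here the fail_msg. If survey_vlan_id was not passed to that run (for example, a launch without the survey), the reference fails with an undefined-variable error instead of a clean assertion failure.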

Two fixes together:

First, add | default() to the var definitions so they never resolve to undefined:

vars:
  vlan_id: "{{ survey_vlan_id | default(0) | int }}"
  vlan_name: "{{ survey_vlan_name | default('') }}"

Second, remove the variable interpolation from fail_msg to break the evaluation chain:

fail_msg: "VLAN ID is invalid or reserved (must be 2-4094, excluding 1002-1005)"

The | default(0) also gives you a clean assertion failure (0 >= 2 evaluates to false) rather than an undefined variable crash — so the error message is actually useful.
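
If you still want the offending value in the message, one option is to reference the raw survey variable with its own default rather than going through vlan_id. The play below is a standalone sketch for trying this on localhost. The extra vlan_name check and the '<missing>' placeholder are my additions, not part of the original playbook.

```yaml
---
- name: Try the validation logic in isolation (sketch)
  hosts: localhost
  gather_facts: false

  vars:
    vlan_id: "{{ survey_vlan_id | default(0) | int }}"
    vlan_name: "{{ survey_vlan_name | default('') }}"

  tasks:
    - name: Validate survey input
      ansible.builtin.assert:
        that:
          - survey_vlan_id is defined
          - vlan_id >= 2
          - vlan_id <= 4094
          - vlan_id not in [1002, 1003, 1004, 1005]
          - vlan_name | length > 0
        # default() keeps the message renderable even when the answer is absent
        fail_msg: >-
          VLAN ID {{ survey_vlan_id | default('<missing>') }} is invalid or reserved
          (must be 2-4094, excluding 1002-1005), or the VLAN name is empty
```

Running it with no extra vars should fail on the first condition, while passing -e survey_vlan_id=105 -e survey_vlan_name=sales should pass.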

Gotcha 5 — Job failing silently with `rc=None`

Early on, jobs were completing in ~4 seconds with no output at all. The awx-task logs showed:

job 28 (failed) encountered an error (rc=None)

rc=None means Ansible never ran — the Execution Environment container failed before the playbook started. The cause was an architecture mismatch: the EE image was built on Apple Silicon (linux/arm64) but the k3s nodes are amd64. The container silently failed to start.

The fix: when building on Apple Silicon for amd64 nodes, always pass --platform linux/amd64:

docker buildx build \
  --platform linux/amd64 \
  -f context/Containerfile \
  -t forgejo.uclab.dev/affragak/awx-ee-cisco:latest \
  --push \
  context/

The Final Working Flow

After all of the above, the launch sequence is a clean single survey with no extra prompts:

┌─────────────────────────────────┐
│  Launch: vlan-provisioning      │
├─────────────────────────────────┤
│  VLAN ID?        [ 105        ] │
│  VLAN Name?      [ sales      ] │
│  Which switches? [ ios-switch-1 ]│
└─────────────────────────────────┘

And the output:

TASK [Validate VLAN ID range] ✓
TASK [Create VLAN] ✓
TASK [Verify VLAN was created] ✓
TASK [Assert VLAN exists] ✓
TASK [Show result] ✓
  "msg": "VLAN 105 (sales) successfully provisioned on ios-switch-1"
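
The same launch can be reproduced outside AWX by putting the survey answers into an extra-vars file. The file name is hypothetical; the variable names are the ones the survey defines.

```yaml
# survey-answers.yaml (hypothetical file name)
# Mirrors what the survey passes to the job as extra vars
survey_vlan_id: 105
survey_vlan_name: sales
limit: ios-switch-1
```

With an inventory that defines ios-switch-1, this would be passed as ansible-playbook -i <inventory> vlan-provision.yaml -e @survey-answers.yaml.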

Gotcha Summary

#	Problem	Fix
1	Arbitrary variable in `hosts:` is undefined	Use `limit` as the survey variable name
2	Multi-select passes a list, not a string	Use single select; add combined choices for multi-target
3	Hardcoded Job Template Limit overrides survey	Clear the Limit field in the template
4	`fail_msg` with vars crashes on undefined	Add `| default()` to vars; remove vars from fail_msg
5	`rc=None`, job fails silently with no output	Build EE with `--platform linux/amd64` on Apple Silicon

What’s Next

With the pattern working, the same survey-driven approach applies to:

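- VLAN deletion: same structure, with a removal state on cisco.ios.ios_vlans (deleted or purged, depending on the collection version) in place of merged; the resource module has no absent state
- Interface assignment: assign a port to a VLAN on a specific switch
- Switch config backup: scheduled nightly, committed to Forgejo
- Compliance audit: gather running config, diff against Git baseline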
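
As a rough idea of what the deletion variant could look like, here is a sketch that reuses the survey variables from this post. I have not run it. It assumes a cisco.ios collection version in which ios_vlans supports state: purged, so confirm the available states for your version before using it.

```yaml
---
- name: Remove VLAN from Cisco switches (sketch)
  hosts: "{{ limit }}"
  gather_facts: false

  vars:
    vlan_id: "{{ survey_vlan_id | default(0) | int }}"

  tasks:
    - name: Refuse to touch default or reserved VLANs
      ansible.builtin.assert:
        that:
          - vlan_id >= 2
          - vlan_id <= 4094
          - vlan_id not in [1002, 1003, 1004, 1005]
        fail_msg: "VLAN ID is invalid or reserved (must be 2-4094, excluding 1002-1005)"

    - name: Remove the VLAN
      cisco.ios.ios_vlans:
        config:
          - vlan_id: "{{ vlan_id }}"
        state: purged   # assumption: supported by the installed collection
```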

The foundation is solid. Each new playbook is just another Job Template pointing at a different file in the same repo.

my DevOps Odyssey

“As you set out on the journey to Ithaka, hope that the road is long, full of adventures, full of knowledge.” - Kavafis’ Ithaka.

AWX in Practice: Self-Service VLAN Provisioning with Surveys


2026-03-26

Series:lab

Categories:network-automation

Tags:#ansible, #awx, #network-automation, #lab
