In Octopus Server 2025.1.5751, a bug caused deployments to fail when Kubernetes config maps containing multi-line variables were created through the Configure and apply Kubernetes resources step (the built-in Kubernetes step for deploying containers). Config maps created using the dedicated Kubernetes config map step, as well as those generated with Raw YAML or Helm steps, were unaffected. This issue impacted Cloud customers, whose previously successful deployments began to fail.
The bug was a regression caused by a change supporting manifest reporting for Kubernetes deployment steps, part of an upcoming feature. This change mistakenly caused line breaks in multi-line Octopus variable values not to be escaped when the values were substituted into the config map's key-value pairs. The problem surfaced when customers inserted PEM certificates or JSON blobs into config maps: the values were substituted verbatim in Calamari, and the unescaped line breaks produced invalid YAML.
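To make the failure mode concrete, here is a minimal sketch (our own illustration, not Octopus code) of why splicing a raw multi-line value into a YAML document breaks it, and why escaping the value first does not. We use json.dumps as a stand-in for proper YAML escaping, since a JSON string is also a valid YAML double-quoted scalar.

```python
import json

# A PEM-like multi-line value, similar to what customers stored in variables.
pem = "-----BEGIN CERTIFICATE-----\nMIIB...\n-----END CERTIFICATE-----"

# Naive substitution: the raw line breaks push the certificate's later
# lines to column 0, where a YAML parser treats them as new (invalid)
# top-level content.
naive = f"data:\n  tls.crt: {pem}\n"
print(naive)

# Escaping first keeps the whole value on one line as a quoted scalar,
# so the document stays well-formed.
escaped = f"data:\n  tls.crt: {json.dumps(pem)}\n"
print(escaped)
```

The key property is that the escaped value contains the two characters `\` and `n` rather than a literal line break.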
The Configure and apply Kubernetes resources step deploys a combination of Kubernetes Deployment, Service, and Ingress resources. It also allows the optional configuration and deployment of an associated Kubernetes ConfigMap and Secret for reference by the Deployment.
To support Rolling Update and Blue/Green deployment strategies, ConfigMap and Secret resources must have unique names for each Deployment version. These resources are assigned computed names, which, by default, combine the resource name with the Octopus deployment ID, and are determined only at deployment time.
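A rough sketch of the naming scheme described above, assuming the default behavior of combining the resource name with the Octopus deployment ID (the exact format is our assumption; Octopus's real implementation may differ):

```python
def computed_name(resource_name: str, deployment_id: str) -> str:
    """Build a per-deployment resource name so each deployment version
    gets its own ConfigMap/Secret (hypothetical sketch)."""
    # Kubernetes resource names must be lowercase DNS-1123 names.
    return f"{resource_name}-{deployment_id}".lower()

print(computed_name("app-settings", "Deployments-123"))
# → "app-settings-deployments-123"
```

Because the deployment ID only exists once a deployment starts, names like this can only be resolved at deployment time, which is what forces the second substitution pass discussed later.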
(All dates and times below are shown in UTC)
Began receiving customer reports of an increase in failing Kubernetes deployments. The failures were observed across various projects, with similar errors related to parsing config maps. Our support team worked with our customers to troubleshoot the causes of the failures.
Our support team escalated the issue to our engineering teams.
Our internal incident response process was initiated.
Our engineers logged on and began identifying the cause of the incident.
The fix for the bug was merged, and our Status Page was updated to Identified.
Our Status Page was updated to Monitoring as we began expediting release 2025.1.7128, containing the fix, to our affected Cloud customers.
Our Status Page was updated to Resolved.
Before the change to support manifest reporting, the Kubernetes container deployment step created associated Kubernetes config maps (and secrets) using the kubectl create command with the --from-file flag, where each config map key-value pair was sent to Calamari as an individual file. This process was updated to use the more standard kubectl apply -f method, where Octopus now sends Calamari a single YAML manifest representing the config map. The YAML is generated from a config map resource that we build as an in-memory C# object.
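The new flow can be sketched as follows. Octopus does this in C#; the Python below is only an illustration of the shape of the work: build the config map as an in-memory object, then serialize it to a single manifest suitable for kubectl apply -f. We serialize with json.dumps (JSON is valid YAML), which escapes any newlines inside values.

```python
import json

def build_config_map_manifest(name: str, data: dict) -> str:
    """Assemble a standard Kubernetes ConfigMap manifest from an
    in-memory object, then serialize it (illustrative sketch)."""
    manifest = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }
    # The serializer escapes line breaks inside values, so multi-line
    # data survives the round trip intact.
    return json.dumps(manifest, indent=2)

print(build_config_map_manifest("app-settings", {"cert.pem": "line1\nline2"}))
```

The important design point is that escaping is the serializer's job: values pass through it once, as object fields, rather than being spliced into manifest text.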
The bug was introduced when the config map object was built from raw, unevaluated Octopus variable values. The issue wasn't identified during testing because the deployment step involves two stages of variable substitution: the first on Octopus Server, and the second inside Calamari during deployment. The two substitution passes are necessary to support computed names, ensuring that each deployment version has its own unique resources.
The change didn't account for multi-line strings as potential variable values, so newline characters were not escaped before serialization. The escaping has to happen on Octopus Server before the object is serialized into YAML, because the second substitution in Calamari operates directly on the YAML file. The fix was to evaluate the values before serialization so that newline characters are handled correctly.
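The ordering described above can be sketched as follows. The substitute helper and the #{Var} handling are our simplified stand-ins for Octopus variable substitution, not its real implementation:

```python
import json

def substitute(template: str, variables: dict) -> str:
    """Minimal stand-in for Octopus #{Var} substitution (hypothetical)."""
    for key, value in variables.items():
        template = template.replace("#{%s}" % key, value)
    return template

variables = {"Cert": "line1\nline2"}

# The fix: evaluate the variable value FIRST...
value = substitute("#{Cert}", variables)
# ...then serialize, so the serializer escapes the embedded newline.
manifest = json.dumps({"kind": "ConfigMap", "data": {"tls.crt": value}})
assert "\\n" in manifest  # newline is escaped in the serialized output
print(manifest)
```

Serializing first and substituting afterwards (the buggy order) would splice the raw newline directly into the finished YAML text, which is exactly the failure the regression introduced.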
At Octopus, we take deployment reliability very seriously. After this incident, we conducted a thorough review to identify areas where we can improve our processes, in light of the lessons learned.
We’ve identified a complex and unconventional area of the code—specifically script-based Kubernetes deployments—that requires further attention. Given the distinctive challenges these deployments present, we are committed to enhancing this area with additional tests to ensure better reliability.