Attack Paths Into VMs in the Cloud

Category: Cloud

Executive Summary

This post reviews strategies for identifying and mitigating potential attack vectors against virtual machine (VM) services in the cloud. Organizations can use this information to understand the potential risks associated with their VM services and strengthen their defense mechanisms. This research focuses on VM services offered by three major cloud service providers (CSPs): Amazon Web Services (AWS), Azure and Google Cloud Platform (GCP).

VMs are among the most frequently deployed resources in every cloud environment. Their widespread use also makes them a prime target for attackers. Our research shows that 11% of cloud hosts exposed to the internet contain vulnerabilities rated Critical or High severity.

A compromised VM can provide attackers with access to not only the data within the VM instance but also the permissions assigned to it. As compute workloads like VMs are generally ephemeral and immutable, the risk posed by a compromised identity is arguably greater than that of compromised data within a VM.

All the attack paths discussed in this post are intended features with legitimate use cases, such as streamlining the configuration, updating and monitoring of VMs across hybrid or multi-cloud environments. They are not vulnerabilities. However, if security best practices are not followed, accounts are not protected or the architecture is not carefully designed, malicious users could misuse these services or features. The responsibility for protecting against and mitigating these attack paths falls on cloud users and administrators.

Palo Alto Networks customers are better protected from the threats discussed above through the following products:

  • Prisma Cloud attack path policies continuously monitor and alert on potential attack paths.
  • Cortex XDR detects and blocks exploits and evasive cloud-based attacks.
  • Cortex Xpanse can detect shadow IT running in public cloud providers and help bring these resources under management.
  • The Unit 42 Incident Response team can also be engaged to help with a compromise or to provide a proactive assessment to lower your risk.

Table of Contents

Summary of the VM Attack Paths
Understanding VMs in the Cloud
Vulnerability Exploitation
Startup Script Manipulation
AWS: Modify Startup Scripts in User Data
Azure: Modify Startup Scripts in Custom Data
GCP: Modify Startup Scripts in Metadata
SSH Key Push
AWS: Use EC2 Instance Connect to Push SSH Keys
Azure: Use VMAccess Extension to Push SSH Keys
GCP: Update Metadata to Push SSH Keys
GCP: Use OSLogin to Push SSH Keys
Direct Code Execution
AWS: Use SSM Run Command to Execute Code
Azure: Use Virtual Machine Run Command to Execute Code
Azure: Use a Custom Script Extension to Run Scripts
GCP: Use VM Manager to Execute Code
Run Pre-Patch or Post-Patch Scripts
Run Scripts in OS Policies
SSH Over Middleware
AWS: Use SSM Session Manager to Log into a VM
Serial Console Access
AWS: Login via Serial Ports
Azure: Login via Serial Ports
GCP: Login via Serial Ports
Conclusion
Palo Alto Networks Protection and Mitigation
Additional Resources

Summary of the VM Attack Paths

We explore the conditions and permissions for each attack path into a running VM instance to assist organizations in fine-tuning their detection and mitigation mechanisms. Table 1 provides an overview of all the attack paths we discuss.

| Attack Path | AWS | Azure | GCP |
|---|---|---|---|
| Vulnerability Exploitation | Feasible: Yes. Complexity: Depends | Feasible: Yes. Complexity: Depends | Feasible: Yes. Complexity: Depends |
| Startup Script Manipulation | Feasible: Yes. Feature: EC2 User Data. Complexity: Low | Feasible: only VM Scale Sets. Feature: VM custom data. Complexity: Low | Feasible: Yes. Feature: Metadata Startup Scripts. Complexity: Low |
| SSH Key Push | Feasible: Yes. Feature: EC2 Instance Connect. Complexity: Low | Feasible: Yes. Feature: VMAccess extension. Complexity: Medium | Feasible: Yes. Feature: Metadata, OSLogin. Complexity: Low |
| Direct Code Execution | Feasible: Yes. Feature: SSM Run Command. Complexity: Medium | Feasible: Yes. Feature: Run Command, Custom Script Extension. Complexity: Low | Feasible: Yes. Feature: VM Manager. Complexity: Medium |
| SSH Over Middleware | Feasible: Yes. Feature: SSM Session Manager. Complexity: Low | Feasible: No | Feasible: No |
| Serial Console Access | Feasible: Yes. Feature: EC2 Serial Console. Complexity: High | Feasible: Yes. Feature: Azure Serial Console. Complexity: High | Feasible: Yes. Feature: Metadata/Serial Console. Complexity: Low |

Table 1. Summary of VM attack paths.
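
For teams that want to triage these paths programmatically, the summary in Table 1 can be kept as plain data and queried. The following is a minimal sketch: the AttackPath type and the paths_for helper are illustrative names that do not come from any tool, and the rows are simply transcribed from the table.

```python
from typing import NamedTuple, Optional


class AttackPath(NamedTuple):
    technique: str
    provider: str
    feasible: bool
    feature: Optional[str]     # None when Table 1 names no feature
    complexity: Optional[str]  # None when the path is not feasible


# Rows transcribed from Table 1. The Azure startup script path is
# feasible only for VM Scale Sets, which the boolean cannot express.
TABLE_1 = [
    AttackPath("Vulnerability Exploitation", "AWS", True, None, "Depends"),
    AttackPath("Vulnerability Exploitation", "Azure", True, None, "Depends"),
    AttackPath("Vulnerability Exploitation", "GCP", True, None, "Depends"),
    AttackPath("Startup Script Manipulation", "AWS", True, "EC2 User Data", "Low"),
    AttackPath("Startup Script Manipulation", "Azure", True, "VM custom data", "Low"),
    AttackPath("Startup Script Manipulation", "GCP", True, "Metadata Startup Scripts", "Low"),
    AttackPath("SSH Key Push", "AWS", True, "EC2 Instance Connect", "Low"),
    AttackPath("SSH Key Push", "Azure", True, "VMAccess extension", "Medium"),
    AttackPath("SSH Key Push", "GCP", True, "Metadata, OSLogin", "Low"),
    AttackPath("Direct Code Execution", "AWS", True, "SSM Run Command", "Medium"),
    AttackPath("Direct Code Execution", "Azure", True, "Run Command, Custom Script Extension", "Low"),
    AttackPath("Direct Code Execution", "GCP", True, "VM Manager", "Medium"),
    AttackPath("SSH Over Middleware", "AWS", True, "SSM Session Manager", "Low"),
    AttackPath("SSH Over Middleware", "Azure", False, None, None),
    AttackPath("SSH Over Middleware", "GCP", False, None, None),
    AttackPath("Serial Console Access", "AWS", True, "EC2 Serial Console", "High"),
    AttackPath("Serial Console Access", "Azure", True, "Azure Serial Console", "High"),
    AttackPath("Serial Console Access", "GCP", True, "Metadata/Serial Console", "Low"),
]


def paths_for(provider: str, complexity: str) -> list:
    """Return the techniques Table 1 rates at the given complexity for a provider."""
    return [
        p.technique
        for p in TABLE_1
        if p.provider == provider and p.feasible and p.complexity == complexity
    ]
```

Calling paths_for("GCP", "Low"), for example, lists the techniques that Table 1 rates as low complexity on GCP, which may be a reasonable place to start tightening permissions.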

Understanding VMs in the Cloud

VMs are among the oldest and most widely used infrastructure-as-a-service (IaaS) offerings across all cloud service providers. They offer a swift and straightforward method to “lift and shift” on-premises applications to the cloud, maintaining the same user experience at the operating system level and above. Modern VM services support a broad spectrum of operating systems, from Linux to Windows to macOS, enabling virtually any application to be deployed in the cloud.

While VMs might not be the most novel cloud technology today, they continue to host many vital cloud workloads. If a VM is compromised, attackers can not only exfiltrate sensitive data and hijack computational resources but also gain access to all the cloud permissions granted to the VM.

As the tactics, techniques and procedures (TTPs) employed by attackers in the cloud largely depend on the permissions they have managed to obtain, one common method of gaining more permissions is to compromise a compute resource, such as a VM, and hijack its workload identity. As a result, each VM instance can potentially serve as a stepping stone towards an attacker's goal, making it crucial to meticulously manage the VM's attack surface.

We define a VM attack path as a series of steps and conditions that could potentially allow an attacker to log in or execute commands in a VM instance. We assume that attackers possess basic information about the targeted VM, such as its unique identifier (UID), IP address, virtual private cloud (VPC) and region.

This information, which is not typically considered confidential, can be sourced from logs, code, or low-privileged read permissions. However, attackers do not possess the login credentials for VMs. The majority of the attack paths discussed in this post rely on control plane application programming interfaces (APIs) to gain access to a VM.

The following sections each cover a specific technique and explore the related attack paths in each CSP. We outline the preconditions for each attack path, noting that while these conditions are necessary, they might not be sufficient. For instance, we assume the attackers have already obtained the required permissions through means such as credential leaks or phishing.

We will focus on the most relevant permissions or configurations that result in these attack paths. Although most of the techniques described are not specific to any particular VM operating system, for simplicity, the references and examples provided will primarily be based on Linux systems.

Vulnerability Exploitation

Our research reveals that 11% of the cloud hosts exposed to the internet contain Critical or High severity vulnerabilities. Exploiting these vulnerabilities is one of the most common ways attackers gain initial access to cloud environments.

Because modern applications are bundled with hundreds of dependent packages, new vulnerabilities are emerging faster than ever. Regardless of the instance type and cloud provider, if attackers can identify a remotely exploitable vulnerability exposed by a VM, they could potentially compromise and take control of it.

Conditions:

  • The target VM has a vulnerability exposed to the network that can be exploited remotely.
  • The vulnerability allows remote code execution, file access or file overwriting.

Mitigations:

Startup Script Manipulation

A startup script is a file that executes tasks during the initialization process of a VM instance. These scripts are typically used to set up the environment, download dependencies, initialize services and fetch updates. If attackers gain permissions to alter a VM's startup script, they could exploit this feature to inject malicious code into the VMs.

AWS: Modify Startup Scripts in User Data

When launching a new EC2 instance, users can optionally pass parameters or scripts in user data. Any scripts in user data are run when the instance is launched. By default, the scripts are only executed during the first boot of the instance. However, it is possible to configure the cloud-init directives to force scripts to execute at every restart.

Conditions:

  • The Amazon Machine Image (AMI) used to create the EC2 VM must support user data and cloud-init.
  • The principals have the following permissions to alter a VM’s user data and restart the VM:
    • ec2:StopInstances
    • ec2:ModifyInstanceAttribute
    • ec2:StartInstances

Mitigations:

  • Restrict and monitor the use of the ec2:ModifyInstanceAttribute permission.
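
One way to monitor for this condition is to scan identity-based policies for the combination of permissions listed above. The following is a minimal sketch, not a full IAM evaluator: it only inspects Allow statements and their Action patterns, and it ignores Deny statements, NotAction, Resource scoping and Condition blocks. The function names are illustrative.

```python
from fnmatch import fnmatchcase

# Permissions this post lists as needed to rewrite user data and restart a VM.
REQUIRED_ACTIONS = (
    "ec2:StopInstances",
    "ec2:ModifyInstanceAttribute",
    "ec2:StartInstances",
)


def allowed_action_patterns(policy: dict) -> list:
    """Collect the Action patterns from every Allow statement in a policy document."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    patterns = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        patterns.extend(actions)
    return patterns


def grants(patterns: list, action: str) -> bool:
    """IAM action names are case-insensitive and may contain * and ? wildcards."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def can_rewrite_user_data(policy: dict) -> bool:
    """True if the policy allows all three actions needed for this attack path."""
    patterns = allowed_action_patterns(policy)
    return all(grants(patterns, action) for action in REQUIRED_ACTIONS)
```

A result of True only indicates that the policy is worth a closer look. A complete assessment would also have to account for explicit denies, permission boundaries and service control policies.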

Azure: Modify Startup Scripts in Custom Data

The startup scripts are stored and passed to an Azure VM via its custom data. For a single VM, its custom data can only be set once at boot time and can’t be updated subsequently. However, custom data of a VM Scale Set, a group of VMs, can be updated.

Newly initiated VMs will receive the updated custom data. Existing VMs, on the other hand, need to be reimaged to receive the new custom data.

Conditions:

  • The principals have the following permission to update the state of a VM scale set and reimage a VM:
    • Microsoft.Compute/virtualMachineScaleSets/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachineScaleSets/write permission.

GCP: Modify Startup Scripts in Metadata

Compute Engine’s metadata service offers a mechanism for storing and retrieving metadata in the form of key-value pairs, including startup and shutdown scripts, SSH keys and numerous feature flags. Metadata can be set at the instance level for an individual VM or at the project level for all VMs within the project. Each VM is then configured according to its respective metadata.

The startup-script metadata key contains the commands that run when a VM instance boots.

Conditions:

  • The guest agent is installed and activated.
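  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)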
  • The principals have the following permissions to restart or reboot a VM:
    • compute.instances.stop
    • compute.instances.start
    • compute.instances.reset

Mitigations:

  • Restrict and monitor the use of the compute.instances.setMetadata and compute.projects.setCommonInstanceMetadata permissions.
  • Store startup scripts in cloud storage rather than directly in metadata, and use the startup-script-url metadata key to point to them. This better protects potentially sensitive information in the startup script through change control and additional access controls, and it also allows scripts larger than 256 KB.
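
Monitoring metadata updates can be as simple as comparing two snapshots of a VM's metadata and reporting changes to the keys that matter. The sketch below assumes snapshots in the shape the Compute Engine API uses for metadata (an "items" list of key/value objects). The set of sensitive keys is our own selection of keys named in this post plus shutdown-script, and the function names are illustrative.

```python
# Metadata keys that can change how a VM boots or who can log in.
SENSITIVE_KEYS = {
    "startup-script",
    "startup-script-url",
    "shutdown-script",
    "ssh-keys",
    "enable-oslogin",
    "serial-port-enable",
}


def to_dict(metadata: dict) -> dict:
    """Flatten {"items": [{"key": ..., "value": ...}]} into a plain dict."""
    return {item["key"]: item.get("value", "") for item in metadata.get("items", [])}


def sensitive_changes(before: dict, after: dict) -> dict:
    """Return {key: (old, new)} for every sensitive key that was added, removed or altered."""
    old, new = to_dict(before), to_dict(after)
    changes = {}
    for key in sorted(SENSITIVE_KEYS):
        if old.get(key) != new.get(key):
            changes[key] = (old.get(key), new.get(key))
    return changes
```

A missing key is reported as None, so an added startup-script shows up as (None, "new script") and a removed one as ("old script", None).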

SSH Key Push

Given that each organization typically hosts various applications on hundreds (if not thousands) of VM instances in their cloud environments, managing the SSH keys for all these VMs can be a daunting task. To help streamline the process of credential management and access control, most CSPs offer features that allow for the easy insertion of SSH public keys into running VMs.

This process usually involves an agent running within a VM, fetching a public key from a cloud API endpoint, modifying the SSH daemon (sshd) configuration file and overwriting the authorized_keys file on the VM. If attackers gain permissions to push SSH keys, they could exploit this feature to gain unauthorized access to VMs.

AWS: Use EC2 Instance Connect to Push SSH Keys

EC2 Instance Connect provides a simple and secure method to manage SSH access to Linux VMs using identity and access management (IAM). When a user needs to SSH into a VM, Instance Connect pushes a temporary public key to the VM, allowing the user to authenticate with the SSH daemon.

Conditions:

  • The EC2 Instance Connect agent is installed and activated. The VM itself doesn’t require any permissions.
  • The principals have the following permission to push SSH keys:
    • ec2-instance-connect:SendSSHPublicKey

Mitigations:

  • Restrict and monitor the use of the ec2-instance-connect:SendSSHPublicKey permission.
  • Uninstall EC2 Instance Connect if the feature is not needed.
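
Key pushes through EC2 Instance Connect are recorded by CloudTrail, which makes them straightforward to review. The following sketch filters already-parsed CloudTrail records for SendSSHPublicKey calls made by callers outside an approved list. The approved_arns allowlist and the function name are hypothetical, and the requestParameters field names (instanceId, instanceOSUser) are an assumption that should be checked against your own trail.

```python
def find_key_pushes(records: list, approved_arns: set) -> list:
    """Return (caller ARN, instance ID, OS user) for each unapproved SSH key push."""
    hits = []
    for rec in records:
        if rec.get("eventSource") != "ec2-instance-connect.amazonaws.com":
            continue
        if rec.get("eventName") != "SendSSHPublicKey":
            continue
        arn = rec.get("userIdentity", {}).get("arn", "unknown")
        if arn in approved_arns:
            continue
        # requestParameters can be null in some records, so fall back to {}.
        params = rec.get("requestParameters") or {}
        hits.append((arn, params.get("instanceId"), params.get("instanceOSUser")))
    return hits
```

The input is expected to be the list found under the "Records" key of a CloudTrail log file after JSON decoding.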

Azure: Use VMAccess Extension to Push SSH Keys

VM Extensions are small applications that facilitate post-deployment configuration and automation on VM instances. These extensions offer functions such as system configuration, system monitoring and system backup.

The VMAccess extension allows the management of administrative users on Linux VMs for tasks like setting a user’s password, pushing an SSH public key or creating a new sudo user. The az vm user command relies on the VMAccess extension to manage user accounts in a VM.

Conditions:

  • The Azure VM agent is installed and activated.
  • The principals have the following permissions to install an extension and update user accounts:
    • Microsoft.Compute/virtualMachines/extensions/write
    • Microsoft.Compute/virtualMachines/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachines/extensions/write and Microsoft.Compute/virtualMachines/write permissions.
  • Restrict the type of extension that can be installed on VMs.
  • Remove the VMAccess extension if it is not needed.
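
To find which roles can install extensions, role definitions can be checked against the operation names above. The following sketch assumes a simplified role shape (a "permissions" list whose entries hold "actions" and "notActions"), treats * as the only wildcard and compares case-insensitively. It does not consider role assignment scope or deny assignments, and the helper names are illustrative.

```python
import re


def _matches(pattern: str, operation: str) -> bool:
    """Match an Azure operation name against a pattern where * is the only wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, operation, flags=re.IGNORECASE) is not None


def role_allows(role: dict, operation: str) -> bool:
    """True if any permission block grants the operation and does not exclude it."""
    for perm in role.get("permissions", []):
        allowed = any(_matches(p, operation) for p in perm.get("actions", []))
        excluded = any(_matches(p, operation) for p in perm.get("notActions", []))
        if allowed and not excluded:
            return True
    return False


def can_push_keys_via_extension(role: dict) -> bool:
    """Check the two permissions this post lists for the VMAccess path."""
    return role_allows(role, "Microsoft.Compute/virtualMachines/extensions/write") and \
        role_allows(role, "Microsoft.Compute/virtualMachines/write")
```

Broad roles that grant "*" with a few exclusions pass this check, which is the point: wildcard grants on compute resources implicitly include extension management.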

GCP: Update Metadata to Push SSH Keys

Compute Engine’s metadata service offers a mechanism for storing and retrieving metadata in the form of key-value pairs. By updating the SSH keys metadata key, one can add SSH public keys to a VM instance.

Conditions:

  • The guest agent is installed and activated.
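  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)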

Mitigations:

  • Restrict and monitor the use of the compute.instances.setMetadata and compute.projects.setCommonInstanceMetadata permissions.
  • Block metadata-based SSH Keys at the project level.
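
Where metadata-based keys cannot be blocked outright, the ssh-keys value can be audited for unexpected users. The sketch below assumes the documented one-entry-per-line USERNAME:KEY format and an expected-user list that you maintain yourself. The function names are illustrative.

```python
def key_owners(ssh_keys_value: str) -> list:
    """Extract the username from each USERNAME:KEY line of an ssh-keys metadata value."""
    owners = []
    for line in ssh_keys_value.splitlines():
        line = line.strip()
        if not line:
            continue
        username, sep, key = line.partition(":")
        if not sep or not key.strip():
            continue  # skip malformed entries that have no key material
        owners.append(username)
    return owners


def unexpected_users(ssh_keys_value: str, expected: set) -> list:
    """Return usernames that hold a key in metadata but are not on the expected list."""
    return sorted(set(key_owners(ssh_keys_value)) - set(expected))
```

Running the check against both instance-level and project-level metadata covers the two places where keys can be set.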

GCP: Use OSLogin to Push SSH Keys

OSLogin automatically manages SSH keys and user accounts in VM instances using Google Cloud Identity and Access Management (IAM) policies. This is the recommended way to manage SSH keys in VMs.

OSLogin can be enabled by updating the enable-oslogin metadata key in the metadata service. It is important to note that metadata-based SSH keys and OSLogin are two mutually exclusive features that can’t both be enabled.

Conditions:

  • The OSLogin agent is activated.
  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)
  • The principals are associated with the compute.osLogin role to connect to VMs using OSLogin.
  • The principals, if outside of the organization, have the following permission:
    • compute.oslogin.updateExternalUser

Mitigations:

  • Restrict and monitor the use of the compute.instances.osLogin and compute.oslogin.updateExternalUser permissions.
  • Enforce OS Login with 2FA at the project level.
  • Enforce physical security keys for OS Login at the project level.

Direct Code Execution

To streamline the management and configuration of a fleet of VM instances, most CSPs offer features that allow the execution of commands or scripts across a set of VMs. This eliminates the need for VMs to have exposed management ports, bastion hosts or even an active sshd running, increasing their security and cost-effectiveness.

These features usually rely on agents running in the VMs that fetch and execute commands from the cloud API endpoints. If attackers gain the necessary permissions to perform these actions, they could exploit these features to execute malicious code within the VMs.

AWS: Use SSM Run Command to Execute Code

SSM Run Command allows users to execute commands on nodes where the AWS Systems Manager (SSM) agent is installed. The feature offers an easy way to perform one-time configurations or status checks across nodes in single-cloud, multi-cloud or hybrid cloud environments.

Conditions:

  • The SSM agent is installed and activated.
  • The VM has the permissions specified in the AmazonSSMManagedInstanceCore policy.
  • The principals have the following permission:
    • ssm:SendCommand

Mitigations:

  • Restrict and monitor the use of the ssm:SendCommand permission.
  • Restrict the SSM documents that Run Command can execute.
  • Revoke SSM permissions from VMs that are not managed by SSM.
  • Deactivate the Default Host Management Configuration if it is not needed. This feature allows AWS Systems Manager to manage all qualified EC2 instances.
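
Restricting which SSM documents Run Command can execute is typically done through the Resource element of the statement that allows ssm:SendCommand. The sketch below flags policies whose resource patterns would still match a shell-execution document. The document ARN is an example built around the AWS-RunShellScript document name, the function names are illustrative, and the check ignores Deny statements and Condition blocks.

```python
from fnmatch import fnmatchcase

# Example ARN for the AWS-managed document that runs arbitrary shell commands.
SHELL_DOC_ARN = "arn:aws:ssm:us-east-1::document/AWS-RunShellScript"


def _as_list(value) -> list:
    """IAM allows a single string or dict where a list is otherwise expected."""
    return [value] if isinstance(value, (str, dict)) else list(value)


def allows_shell_document(policy: dict, doc_arn: str = SHELL_DOC_ARN) -> bool:
    """True if an Allow statement grants ssm:SendCommand on a resource matching doc_arn."""
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(stmt.get("Action", []))]
        if not any(fnmatchcase("ssm:sendcommand", a) for a in actions):
            continue
        resources = _as_list(stmt.get("Resource", []))
        if any(fnmatchcase(doc_arn, r) for r in resources):
            return True
    return False
```

A policy that lists only specific, purpose-built documents returns False, while "Resource": "*" returns True and leaves every document, including shell execution, available to the principal.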

Azure: Use Virtual Machine Run Command to Execute Code

The Run Command feature in Azure uses the VM agent within a VM to execute scripts. It can be used for application management, system diagnostics or troubleshooting when the RDP or SSH service is unavailable.

Conditions:

  • The Azure VM agent is installed and activated.
  • The principals have the following permission to perform the Run Command:
    • Microsoft.Compute/virtualMachines/runCommands/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachines/runCommands/write permission.

Azure: Use a Custom Script Extension to Run Scripts

VM Extensions are small applications that can perform post-deployment configuration and automation on VM instances. The custom script extension allows for the downloading and execution of scripts within VMs.

Conditions:

  • The Azure VM agent is installed and activated.
  • The principals have the following permissions to install an extension and run the custom script extension:
    • Microsoft.Compute/virtualMachines/extensions/write
    • Microsoft.Compute/virtualMachines/write

Mitigations:

  • Restrict and monitor the use of the Microsoft.Compute/virtualMachines/extensions/write and Microsoft.Compute/virtualMachines/write permissions.
  • Restrict the type of extension that can be installed.

GCP: Use VM Manager to Execute Code

VM Manager is a suite of tools that can help manage a group of VMs. It is primarily used for applying patches, collecting OS information and installing or removing software packages.

Run Pre-Patch or Post-Patch Scripts

The Patch feature can apply OS patches across a set of VM instances using OS package managers like the Advanced Packaging Tool (APT) and Yellowdog Updater, Modified (YUM). During the creation of a patch job, optional pre-patch or post-patch scripts can be executed to either prepare for or test the patch.

Run Scripts in OS Policies

The OS Policy feature allows users to maintain a consistent OS configuration across multiple VMs. Each policy file contains the declarative configuration for resources such as packages, repositories or files. One way to configure resources in an OS is by executing scripts.

Conditions:

  • The guest agent is installed and activated.
  • OS Config is enabled in the metadata.
  • The OS Config agent is installed and activated.
  • The VM must have an attached service account, although the service account doesn’t need any permissions.
  • The principals need the following permissions to run a patch job:
    • osconfig.patchJobs.exec
    • osconfig.patchJobs.get
    • osconfig.patchJobs.list
  • The principals need the following permissions to manage OS policy assignments:
    • osconfig.osPolicyAssignments.update
    • osconfig.osPolicyAssignments.get
    • osconfig.osPolicyAssignments.list

Mitigations:

  • Restrict and monitor the use of the osconfig.patchJobs.exec and osconfig.osPolicyAssignments.update permissions.
  • Disable the Patch and OS policies feature by setting the osconfig-disabled-features metadata key at the project level.
  • Disable OS Config in the metadata at the project level if it is not needed.
  • Uninstall the OS Config agent if it is not needed.
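
The metadata-based mitigations above can be verified by computing each VM's effective OS Config settings. The sketch below rests on several assumptions that should be confirmed against current GCP documentation: that instance-level metadata takes precedence over project-level metadata, that the enabling key is named enable-osconfig, and that the value "tasks" in osconfig-disabled-features is what turns off Patch and OS policies. The function names are illustrative.

```python
def effective_metadata(project_md: dict, instance_md: dict) -> dict:
    """Merge metadata, letting instance-level keys override project-level keys (assumed)."""
    merged = dict(project_md)
    merged.update(instance_md)
    return merged


def is_true(value) -> bool:
    """Treat only the string 'true' (any case) as enabled; other spellings are not handled."""
    return str(value).strip().lower() == "true"


def exec_features_active(project_md: dict, instance_md: dict) -> bool:
    """True if OS Config is enabled and the script-running features are not disabled."""
    md = effective_metadata(project_md, instance_md)
    if not is_true(md.get("enable-osconfig", "")):
        return False
    disabled = {
        feature.strip().lower()
        for feature in md.get("osconfig-disabled-features", "").split(",")
        if feature.strip()
    }
    # "tasks" is assumed to be the feature name covering Patch and OS policies.
    return "tasks" not in disabled
```

If the precedence assumption holds, a project-level setting alone may not be enough: a principal who can set instance metadata could replace the disabled-features value on a single VM, which is one more reason to restrict compute.instances.setMetadata.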

SSH Over Middleware

AWS: Use SSM Session Manager to Log into a VM

AWS SSM Session Manager provides a secure and auditable way to log into nodes using IAM. Nodes with Session Manager don’t need to have open inbound ports and users logging into the nodes don’t need to manage the private keys.

Session Manager can also be configured on nodes in multi-cloud or hybrid cloud environments. If attackers gain permissions to perform Session Manager actions, they could potentially abuse the feature to log into VMs where Session Manager is running.

Conditions:

  • The SSM agent is installed and activated.
  • The VM’s instance profile has the following permissions:
    • ssmmessages:CreateControlChannel
    • ssmmessages:CreateDataChannel
    • ssmmessages:OpenControlChannel
    • ssmmessages:OpenDataChannel
    • ssm:UpdateInstanceInformation
  • The principals have the following permissions to connect to the VM:
    • ssm:StartSession
    • ssm:ResumeSession
    • ssm:TerminateSession

Mitigations:

  • Restrict and monitor the use of the ssm:StartSession and ssm:ResumeSession permissions.
  • Revoke ssmmessages permissions from VMs that are not managed by Session Manager.
  • Deactivate Default Host Management Configuration if it is not needed.
  • Uninstall the SSM agent if it is not needed.

Serial Console Access

Most cloud service providers offer serial console access as a feature to troubleshoot boot and network configuration issues in VMs. This feature provides text-based console access to VMs, independent of the network and operating system state.

Because network-based access control does not apply to serial console access, attackers with serial console permissions could potentially abuse this feature to bypass network-based firewall restrictions and gain unauthorized access to VMs. Note that serial console access does not bypass user authentication. Valid passwords or private keys are still needed to log into a VM.

AWS: Login via Serial Ports

Amazon EC2 Serial Console provides access to the serial port of EC2 instances.

Conditions:

  • Serial console access must be enabled at the account level.
    • Principals with the ec2:EnableSerialConsoleAccess permission can enable access.
  • The VM must be one of the supported instance types. Most instances built on the Nitro System are supported.
    • Principals with the ec2:ModifyInstanceAttribute permission can change a VM’s instance type.
  • The principals must have a valid password or permission to push an SSH public key to the instance.
    • Principals with the ec2-instance-connect:SendSerialConsoleSSHPublicKey permission can push an SSH key into a VM via the serial console.

Mitigations:

  • Disable serial console access at the account level.
  • Restrict and monitor the use of ec2:EnableSerialConsoleAccess and ec2-instance-connect:SendSerialConsoleSSHPublicKey permissions.

Azure: Login via Serial Ports

Azure Serial Console provides text-based console access to VM instances.

Conditions:

  • Serial Console is enabled at the subscription level. (It is enabled by default.)
  • The VM’s boot diagnostic is enabled.
  • The VM Guest OS has the terminal management service enabled (e.g., getty in Linux and SAC in Windows). (They are enabled by default.)
  • The VM Guest OS has text-based user authentication configured for local logins (e.g., a valid username and password).
  • The principals have the following permissions to connect to a VM via serial port:
    • Microsoft.Compute/virtualMachines/start/action
    • Microsoft.Compute/virtualMachines/read
    • Microsoft.Compute/virtualMachines/write
    • Microsoft.Resources/subscriptions/resourceGroups/read
    • Microsoft.Storage/storageAccounts/listKeys/action
    • Microsoft.Storage/storageAccounts/read
    • Microsoft.SerialConsole/serialPorts/connect/action

Mitigations:

  • Restrict and monitor the use of the Microsoft.SerialConsole/serialPorts/connect/action permission.
  • Disable serial console access at the subscription level.
  • Disable boot diagnostic for individual VMs.
  • Disable terminal management service for individual VMs.
  • Disable text-based authentication for local logins, such as user/password for individual VMs.

GCP: Login via Serial Ports

GCP provides an alternative way to connect to a VM over a serial port. Compute Engine serial port access can be enabled in the metadata service by updating the serial-port-enable metadata key.

Conditions:

  • The guest agent is installed and activated on the VM.
  • The principals have the following permissions to update a VM’s metadata:
    • compute.instances.setMetadata (via VM’s instance metadata)
    • compute.projects.setCommonInstanceMetadata (via project-wide metadata)
    • iam.serviceAccountUser role on the Instance’s service account

Mitigations:

  • Restrict and monitor the use of the compute.instances.setMetadata and compute.projects.setCommonInstanceMetadata permissions.
  • Restrict and monitor the use of the iam.serviceAccountUser role.
  • Disable serial port access through organization policy.
  • Disable the OS Config agent on the VM.

Conclusion

This post provides an overview of potential attack paths into VMs and outlines the mitigation strategies that organizations can implement to enhance their cloud security. Maintaining the security posture of VMs in cloud environments is crucial.

Due to their widespread use and the permissions they inherit from workload identities, VMs are attractive targets for attackers. All the attack paths discussed throughout this post are intended features with legitimate use cases. However, if these features are not properly secured, adversaries can abuse them with malicious intent. The responsibility for safeguarding these attack paths and mitigating potential risks lies with cloud users.

IAM configuration plays a pivotal role in both enabling these attack paths and mitigating their associated risks. To ensure robust cloud security, it is vital to continuously identify these attack paths and monitor the use of risky permissions.

As cloud environments continue to evolve, so too will the TTPs employed by cyberattackers. Organizations must remain vigilant and proactive in their cloud security efforts, adapting their strategies to counter evolving threats.

Palo Alto Networks Protection and Mitigation

Palo Alto Networks customers are better protected from the threats discussed above through the following products:

  • Prisma Cloud customers are better protected by the attack path policies that continuously monitor and alert on potential attack paths.
  • Cortex XDR detects and blocks exploits and evasive cloud-based attacks.
  • Cortex Xpanse can detect shadow IT running in public cloud providers and help bring these resources under management.

If you think you may have been compromised or have an urgent matter, get in touch with the Unit 42 Incident Response team or call:

  • North America Toll-Free: 866.486.4842 (866.4.UNIT42)
  • EMEA: +31.20.299.3130
  • APAC: +65.6983.8730
  • Japan: +81.50.1790.0200

Palo Alto Networks has shared these findings with our fellow Cyber Threat Alliance (CTA) members. CTA members use this intelligence to rapidly deploy protections to their customers and to systematically disrupt malicious cyber actors. Learn more about the Cyber Threat Alliance.

Additional Resources

Palo Alto Networks

AWS

Azure

GCP

Others