
Fixing AWS Backup Failure Due to UEFI Data Size Exceeding 64 KB on EC2 Instances

aws ec2 uefi

Recently, we encountered an issue where AWS Backup failed to back up one of our EC2 instances. The backup job failed with the following error message:

Message: An AWS Backup job failed. 
Resource ARN: arn:aws:ec2:ap-southeast-2:1234567890:instance/i-0a1a2b3c4d5f5f 
BackupJob ID: 293A311A-787H-027E-H7F6-E878768GHG 
Status Message: uefiData in the instance exceeds maximum size of 64k bytes.

After investigation, we discovered that this issue occurred after the instance experienced a kernel panic. Further checks showed that the UEFI data for the instance was approximately 80 KB when base64-encoded (the form returned by get-instance-uefi-data), exceeding the 64 KB limit imposed by AWS.

Root Cause: Linux “pstore” Mechanism Filling the UEFI Variable Store

Linux includes a mechanism called pstore (persistent storage) that can store crash dumps in the UEFI variable store. This allows the system to retain crash logs across reboots for debugging purposes.

In AWS, the space available for UEFI variables is limited. When AWS Backup or create-image tries to copy UEFI data from the instance to an AMI, it fails if the data exceeds 64 KB.
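To see how much of the variable store the crash dumps occupy, the sizes of the dump-type* variables can be summed. The following is a minimal sketch, not part of the original procedure: dump_vars_bytes is a hypothetical helper name, and it assumes efivarfs is mounted at /sys/firmware/efi/efivars and that the files are readable. Each file size includes the 4-byte attribute header that efivarfs prepends.

```shell
# Hypothetical helper: print the total size in bytes of the pstore crash-dump
# variables (dump-type*) found in an efivarfs directory.
# The directory defaults to the usual efivarfs mount point (an assumption).
dump_vars_bytes() {
    dir=${1:-/sys/firmware/efi/efivars}
    total=0
    for f in "$dir"/dump-type*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}
```

Running dump_vars_bytes with no argument on the affected instance prints the number of bytes held by crash-dump variables; a value in the tens of kilobytes would be consistent with the failure described above.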

Step-by-Step Workaround

To resolve this issue, we need to clear the pstore and optionally disable it to prevent recurrence.

  1. Temporarily Clear the “pstore”

Connect to the affected instance and run:

# rm -rf /sys/fs/pstore/*

This removes all crash dump files from the pstore and frees up UEFI space.

  2. Permanently Disable pstore to Prevent Future Issues

To ensure this doesn’t happen again, disable pstore by adding the kernel parameter efi_pstore.pstore_disable=1.

Edit /etc/default/grub:

# vi /etc/default/grub

Then make sure the configuration includes efi_pstore.pstore_disable=1:

GRUB_CMDLINE_LINUX_DEFAULT="console=tty0 console=ttyS0,115200n8 nvme_core.io_timeout=4294967295 rd.emergency=poweroff rd.shell=0 selinux=1 security=selinux quiet efi_pstore.pstore_disable=1"
GRUB_TIMEOUT=0
GRUB_DISABLE_RECOVERY="true"
GRUB_TERMINAL="ec2-console"
GRUB_ENABLE_BLSCFG="true"
GRUB_X86_USE_32BIT="true"
GRUB_DEFAULT=saved
GRUB_UPDATE_DEFAULT_KERNEL=true

After editing, regenerate the GRUB configuration. The new kernel parameter only takes effect at the next boot:

# grub2-mkconfig -o /boot/grub2/grub.cfg
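After the next reboot, it is worth confirming that the parameter actually reached the kernel. The sketch below is one way to check, assuming /proc/cmdline is available; pstore_disabled is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: succeeds if the kernel command line contains the
# exact parameter efi_pstore.pstore_disable=1.
# With no argument it reads the running kernel's command line.
pstore_disabled() {
    cmdline=${1:-$(cat /proc/cmdline)}
    # Pad with spaces so the match is on a whole parameter, not a prefix.
    case " $cmdline " in
        *" efi_pstore.pstore_disable=1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Calling pstore_disabled && echo "efi_pstore disabled" on the instance prints the message only when the running kernel was booted with the parameter; no output means the change has not been picked up yet.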

If UEFI Data Still Exceeds 64 KB

If simply clearing /sys/fs/pstore doesn’t reduce UEFI data enough, perform the following deeper cleanup:

  1. Verify Current UEFI Data Size

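Run these commands to measure the size before making changes. The first reports the decoded size of the UEFI data in bytes; the second reports the length of the base64-encoded string returned by the API: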

# eval echo -n $(aws ec2 get-instance-uefi-data --instance-id i-0a1a2b3c4d5f5f --query UefiData --region ap-southeast-2) | base64 -d | wc -c
60650

# eval echo -n $(aws ec2 get-instance-uefi-data --instance-id i-0a1a2b3c4d5f5f --query UefiData --region ap-southeast-2) | wc -c
80868

  2. Clear Dump Variables from EFI Vars

# rm -f /sys/firmware/efi/efivars/dump-type*

  3. Unmount and Remount the pstore

# umount /sys/fs/pstore
# mount -v -t pstore pstore /sys/fs/pstore

  4. Add and Remove a Large Zeroed Variable

This step helps reclaim leftover fragmented space in the EFI variable store.

# (echo -ne '\x7'; dd if=/dev/zero bs=55k count=1 status=none) | dd bs=1M iflag=fullblock of=/sys/firmware/efi/efivars/dump-type0-1-1-1704266701-D-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 status=none
# rm -f /sys/firmware/efi/efivars/dump-type0-1-1-1704266701-D-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0

This pair of commands looks intimidating, but it is a clever workaround to reclaim fragmented or stale space inside the UEFI variable store. Let’s unpack it:

echo -ne '\x7' -> Writes a single byte (0x07). In efivarfs, the first four bytes of every variable file are a little-endian 32-bit attribute bitmask (non-volatile, boot service access, runtime access, and so on). This byte, followed by the first three zero bytes from the next command, forms the value 0x00000007, which sets those three attributes.

dd if=/dev/zero bs=55k count=1 status=none -> Generates 55 KB of zero bytes. Apart from the three bytes that complete the attribute field, this is just filler data, enough to make a large EFI variable that occupies significant space.

( ... ) | dd ... of=/sys/firmware/efi/efivars/... -> Pipes the generated data into another dd command that writes it directly as a new EFI variable file under /sys/firmware/efi/efivars/.

The filename dump-type0-1-1-1704266701-D-cfc8fc79-be2e-4ddc-97f0-9f98bfe298a0 acts as the EFI variable name and GUID, just like all other variables in that directory.

So, after this command runs, the system has a large fake EFI variable stored in firmware.

Deleting that file removes the variable from the UEFI store, freeing the previously occupied space.

This appears to prompt the firmware to reclaim and consolidate the variable store, reducing fragmentation and restoring usable variable space.
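To see exactly what the write command sends, the same byte stream can be built into a scratch file instead of efivarfs and inspected there. This is only an illustrative sketch: it touches no firmware, the variable names are made up for the example, and it uses printf '\007' in place of echo -ne '\x7' for portability (both emit the single byte 0x07).

```shell
# Build the same payload as the write command, but into a temporary file,
# so it can be examined without creating an EFI variable.
payload=$(mktemp)
(printf '\007'; dd if=/dev/zero bs=55k count=1 status=none) > "$payload"

# Total size: 1 attribute byte + 55 * 1024 zero bytes.
payload_bytes=$(wc -c < "$payload" | tr -d ' ')

# The first four bytes are what efivarfs interprets as the attribute field.
attr_bytes=$(od -An -tx1 -N4 "$payload" | tr -d ' \n')

echo "payload size: $payload_bytes bytes"     # 56321
echo "attribute bytes (hex): $attr_bytes"     # 07000000, i.e. 0x00000007 little-endian
rm -f "$payload"
```

The attribute bytes 07 00 00 00 show why a single leading 0x07 is enough: the zeros that follow complete the 32-bit field.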

  5. Verify UEFI Data Size After Cleanup

# eval echo -n $(aws ec2 get-instance-uefi-data --instance-id i-0a1a2b3c4d5f5f --query UefiData --region ap-southeast-2) | base64 -d | wc -c
8082

# eval echo -n $(aws ec2 get-instance-uefi-data --instance-id i-0a1a2b3c4d5f5f --query UefiData --region ap-southeast-2) | wc -c
10776

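If the base64-encoded size (the second command) is below the 64 KB limit, the create-image or AWS Backup job should now succeed. Note that before the cleanup the decoded size (60,650 bytes) was already under 64 KB while the backup still failed, which suggests the limit applies to the encoded data.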
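The two figures reported by the verification commands are related by plain base64 arithmetic: every 3 input bytes become 4 output characters, with the last group padded. A small sketch (b64_len is a hypothetical helper name) reproduces the numbers measured above:

```shell
# Hypothetical helper: length of the base64 encoding of N raw bytes.
# Base64 emits 4 characters per 3-byte group, rounding the last group up.
b64_len() {
    echo $(( ($1 + 2) / 3 * 4 ))
}

b64_len 60650   # 80868, the encoded size before cleanup
b64_len 8082    # 10776, the encoded size after cleanup
```

The encoded form is therefore about a third larger than the raw data, which is worth keeping in mind when comparing either figure against the limit.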

Final Thoughts

This issue highlights a lesser-known side effect of the Linux pstore feature in cloud environments. While pstore is useful for post-mortem debugging, on EC2 instances with UEFI firmware it can unintentionally interfere with image creation and backup operations.
