For AI agents: Documentation index at /llms.txt

Skip to content

Secure Upgrades

Canister upgrades are one of the highest-risk operations in production. A bad upgrade can corrupt state, make the canister permanently non-upgradeable, or break clients. This guide covers the patterns and checks you need to upgrade safely.

Use this before every production upgrade:

  • Take a snapshot immediately before upgrading
  • Run the upgrade locally first with icp deploy
  • Verify data survives: write → upgrade → read
  • Check Candid interface compatibility. No removed methods, no breaking type changes
  • Avoid pre_upgrade hooks that serialize large state (use stable structures instead)
  • In Motoko, use persistent actor (which eliminates the need for pre_upgrade hooks): avoid manual pre_upgrade/post_upgrade
  • Confirm you have a backup controller (cannot recover from a trapped post_upgrade without one)
  • Add a rollback plan: snapshot ID recorded, restore procedure tested

When you run icp deploy on an existing canister, the IC executes the following sequence:

  1. Stop the canister (waits for in-flight messages to complete)
  2. Run pre_upgrade on the old code (if defined)
  3. Replace the Wasm module with the new code
  4. Run post_upgrade on the new code (if defined)
  5. Restart the canister

Stable memory is preserved through steps 2–4. Heap memory is cleared when the new Wasm loads. If pre_upgrade or post_upgrade traps, the upgrade fails with different consequences:

HookTrap result
pre_upgradeUpgrade cancelled. Old code still running. State intact but may need attention.
post_upgradeNew Wasm installed but initialization failed. Canister may be in an inconsistent state.

Both scenarios leave the canister in a difficult state. Prevention is far better than recovery.

The persistent actor declaration automatically stores all let and var fields in stable memory. No serialization, no upgrade hooks, no instruction-limit traps.

persistent actor Counter {
var count : Nat = 0;
public func increment() : async Nat {
count += 1;
count;
};
public query func get() : async Nat { count };
// transient: resets to [] on each upgrade: correct for caches, transient logs, and reset-on-upgrade counters
transient var recentCallers : [Principal] = [];
};

Key rules:

  • All let/var fields persist automatically. No stable keyword needed
  • transient var for caches or counters that should reset on upgrade
  • Do not write manual pre_upgrade/post_upgrade hooks. The runtime handles everything
  • If a persistent field’s type changes incompatibly, the upgrade traps. See Schema evolution.

In Rust, use ic-stable-structures to store data directly in stable memory. Data lives there from the start. No serialization step on upgrade.

use ic_stable_structures::{
memory_manager::{MemoryId, MemoryManager, VirtualMemory},
DefaultMemoryImpl, StableBTreeMap, StableCell,
};
use std::cell::RefCell;
type Memory = VirtualMemory<DefaultMemoryImpl>;
// Each structure must have its own unique MemoryId: never reuse IDs
const USERS_MEM_ID: MemoryId = MemoryId::new(0);
const COUNTER_MEM_ID: MemoryId = MemoryId::new(1);
thread_local! {
static MEMORY_MANAGER: RefCell<MemoryManager<DefaultMemoryImpl>> =
RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));
static USERS: RefCell<StableBTreeMap<u64, Vec<u8>, Memory>> =
RefCell::new(StableBTreeMap::init(
MEMORY_MANAGER.with(|m| m.borrow().get(USERS_MEM_ID))
));
static COUNTER: RefCell<StableCell<u64, Memory>> =
RefCell::new(StableCell::init(
MEMORY_MANAGER.with(|m| m.borrow().get(COUNTER_MEM_ID)),
0u64,
).expect("Failed to init counter"));
}
#[ic_cdk::post_upgrade]
fn post_upgrade() {
// Stable structures auto-restore: no deserialization needed.
// Re-initialize timers or transient state here if required.
}

Warning: Each MemoryId must map to exactly one data structure for the lifetime of the canister. Reusing a MemoryId for a different structure after an upgrade corrupts both. Keep a written record of your MemoryId allocations and never reorder them.

The serialization-based upgrade pattern is common in older Rust code but is fundamentally fragile:

// DO NOT DO THIS in production
#[ic_cdk::pre_upgrade]
fn pre_upgrade() {
// If STATE is large, this hits the instruction limit and traps.
// A trapped pre_upgrade prevents the upgrade: canister stays on old code.
ic_cdk::storage::stable_save((STATE.with(|s| s.borrow().clone()),)).unwrap();
}

When pre_upgrade traps due to instruction exhaustion, the canister cannot be upgraded. The skip_pre_upgrade flag (an emergency escape hatch via the management canister’s install_code API (see Management canister reference) bypasses the hook) but anything the hook would have saved is lost. Use stable structures so the upgrade path cannot brick itself under load.

The IC checks your new Wasm module’s Candid interface against the old one before completing the upgrade. If the new interface is not backward-compatible, the upgrade is rejected.

Safe changes:

ChangeWhy it is safe
Add a new methodExisting clients don’t call it
Add optional parameters to an existing methodOld clients send no value; IC substitutes null
Remove trailing parameters from an existing methodOld clients send extra values; IC ignores them
Return additional values from a methodOld clients ignore extra return values
Change a parameter type to a supertypeOld values remain valid inputs
Change a return type to a subtypeNew values remain valid for old clients

Breaking changes (upgrade rejected or clients break):

ChangeWhy it breaks
Remove a methodClients calling it get errors
Add a required (non-optional) parameterOld clients don’t send it
Change a parameter type to an incompatible typeOld clients send invalid values

Example: safe evolution:

// Before
service counter : {
add : (nat) -> ();
get : () -> (int) query;
}
// After: safe: optional param added, new return value, new method
service counter : {
add : (nat, label : opt text) -> (new_val : nat);
get : () -> (nat, last_change : nat) query;
reset : () -> ();
}

icp-cli checks Candid compatibility during deploy and prompts for confirmation if it detects a potentially breaking change. Use --yes in automated pipelines after manually verifying compatibility:

Terminal window
icp deploy my-canister -e ic --yes

Always take a snapshot immediately before a risky upgrade. If the upgrade causes unexpected behavior, you can restore the previous state within minutes.

Terminal window
# 1. Stop the canister and create a snapshot
icp canister stop my-canister -e ic
icp canister snapshot create my-canister -e ic
# Note the snapshot ID printed in the output
icp canister start my-canister -e ic
# 2. Deploy the upgrade
icp deploy my-canister -e ic
# 3. Verify correctness
icp canister call my-canister health_check -e ic
# 4a. If everything works, clean up when no longer needed
icp canister snapshot delete my-canister <snapshot-id> -e ic
# 4b. If something is wrong, stop and restore
icp canister stop my-canister -e ic
icp canister snapshot restore my-canister <snapshot-id> -e ic
icp canister start my-canister -e ic

Snapshots capture the full canister state: Wasm module, Wasm heap memory, stable memory, and chunk store. Restoring from a snapshot brings back all of this state atomically.

See Canister snapshots for listing, downloading, and the state transfer workflow.

Upgrading canister code sometimes requires changing the shape of stored data. The rules differ by language.

When upgrading a persistent actor, the runtime checks that every persistent field’s current type is compatible with the value stored in stable memory. Incompatible changes cause the upgrade to trap.

Safe changes:

  • Add new let or var fields with initial values. The runtime initializes them on upgrade
  • Add optional record fields (e.g., change { name : Text } to { name : Text; email : ?Text })
  • Widen a field’s type (e.g., NatInt)

Unsafe changes (upgrade traps):

  • Remove or rename a persistent field
  • Narrow a field’s type (e.g., IntNat)
  • Change a non-optional field to an incompatible type

If you need to make an unsafe change, migrate the data in two upgrades: add the new field alongside the old one, upgrade once (both fields present), then upgrade again to remove the old field. Test this two-step process locally before deploying to mainnet.

Rust stable structures use serialized bytes on disk. Schema evolution safety depends on the serialization format and versioning strategy.

Adding fields safely with Candid encoding:

use candid::{CandidType, Decode, Deserialize, Encode};
use ic_stable_structures::storable::{Bound, Storable};
use std::borrow::Cow;
#[derive(CandidType, Deserialize, Clone)]
struct UserV2 {
id: u64,
name: String,
created: u64,
// New optional field: safe to add: old records deserialize with None
email: Option<String>,
}
impl Storable for UserV2 {
// Unbounded avoids write failures when struct grows.
// Bounded requires a fixed max_size; if encoded size exceeds it after
// adding fields, writes trap.
const BOUND: Bound = Bound::Unbounded;
fn to_bytes(&self) -> Cow<'_, [u8]> {
Cow::Owned(Encode!(self).expect("failed to encode UserV2"))
}
fn from_bytes(bytes: Cow<'_, [u8]>) -> Self {
Decode!(&bytes, Self).expect("failed to decode UserV2")
}
}

Rules:

  • Use Option<T> for new fields: Candid deserializes absent fields as None, so old records remain readable after the upgrade
  • Use Bound::Unbounded unless you have a strict size requirement
  • Never reorder MemoryId allocations across upgrades: same effect as changing a field type
  • For breaking schema changes, use a versioned enum and migrate records lazily on read

Never upgrade on mainnet without first verifying locally that data written before the upgrade is still readable after.

Motoko:

Terminal window
# Start local network
icp network start -d
# Deploy initial version
icp deploy backend
# Write data
icp canister call backend increment '()'
icp canister call backend increment '()'
icp canister call backend get '()'
# Returns: (2 : nat)
# Modify source code, then redeploy
icp deploy backend
# Verify data survived
icp canister call backend get '()'
# Must still return: (2 : nat)

Rust:

Terminal window
# Start local network
icp network start -d
# Deploy initial version
icp deploy backend
# Write data
icp canister call backend add_user '("Alice")'
icp canister call backend get_user_count '()'
# Returns: (1 : nat64)
# Modify source code, then upgrade
icp deploy backend
# Verify data survived
icp canister call backend get_user_count '()'
# Must still return: (1 : nat64)

If the count drops to zero after upgrade, your data is not in stable memory: review your storage declarations before touching mainnet.

For advanced scenarios (upgrade rollbacks, schema migrations, concurrent call safety), use PocketIC to script multi-step upgrade scenarios in a controlled environment.

You cannot upgrade a canister without a valid controller. Losing all controller keys leaves the canister permanently frozen at its current code: there is no recovery path on the IC.

Terminal window
# Check current controllers
icp canister settings show my-canister -e ic
# Add a backup controller before any risky upgrade
icp canister settings update my-canister --add-controller <backup-principal> -e ic

For production canisters:

  • Maintain at least two controllers (primary identity + hardware wallet or multisig)
  • For fully onchain governance, add an SNS or DAO canister as controller and remove personal principals

See Access management for detailed controller management patterns.