After years of working as a Database Architect and handling escalations around MySQL environments on AWS, I noticed one topic kept coming back: Data Dictionary inconsistencies. Teams would panic, escalate, and often the fix was something they could have handled themselves — if they understood what was actually happening under the hood.
This post is my attempt to save you that escalation. Let’s break it down.
Table of Contents
Open Table of Contents
What Is the Data Dictionary?
Every relational database engine maintains a Data Dictionary — a collection of metadata that tracks objects like tables, indexes, columns, and constraints. Think of it as the database’s internal map of itself.
Here’s where it gets interesting: the implementation changed significantly between MySQL versions.
MySQL 5.7 (and Aurora MySQL v2, RDS MySQL 5.7)
MySQL 5.7 maintains two separate dictionaries that must stay in sync:
┌─────────────────────────────────────────────────┐
│ MySQL Server Layer │
│ │
│ Filesystem-based Dictionary │
│ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │
│ │.FRM │ │.TRN │ │.TRG │ │.OPT │ │
│ └─────┘ └─────┘ └─────┘ └─────┘ │
│ │
├─────────────────────────────────────────────────┤
│ InnoDB Engine Layer │
│ │
│ InnoDB Internal Dictionary │
│ (inside system tablespace) │
│ │
└─────────────────────────────────────────────────┘
Both must agree for a table to be accessible.
MySQL 8.0 (and Aurora MySQL v3, RDS MySQL 8.0)
MySQL 8.0 introduced a single, unified, transactional data dictionary stored entirely in InnoDB tables. This was a major architectural improvement:
┌─────────────────────────────────────────────────┐
│ MySQL 8.0 Unified Dictionary │
│ │
│ ┌─────────────────────────────────────────┐ │
│ │ Single InnoDB-based Data Dictionary │ │
│ │ • Atomic DDL support │ │
│ │ • Crash-safe schema changes │ │
│ │ • No more .FRM files │ │
│ └─────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────┘
DDL operations are now atomic — they either fully
succeed or fully roll back.
Why this matters: On MySQL 8.0 / Aurora v3, data dictionary inconsistencies are significantly less common because DDL operations are atomic. They still can happen, but the window is much narrower.
Why Does the Data Dictionary Become Inconsistent?
Let’s be precise with language here. We say “inconsistency” rather than “corruption.” True data corruption is rare and involves damaged data pages. Dictionary inconsistency means the metadata layers disagree about what exists — and that’s a design consequence, not a storage failure.
The Root Cause (MySQL 5.7 / Aurora v2)
Here’s the fundamental problem:
- DML operations (INSERT, UPDATE, DELETE) on InnoDB tables are fully transactional — protected by ACID guarantees (when
sync_binlogandinnodb_flush_log_at_trx_commitare properly configured). - DDL and DCL operations (CREATE, ALTER, DROP, GRANT) are not transactional in MySQL 5.7.
MySQL 5.7 does not guarantee ACID properties for schema changes. If something goes wrong mid-DDL, you can end up with one dictionary knowing about a table and the other not.
Note on Aurora MySQL v2: While Aurora’s distributed storage layer provides superior crash safety for data, the MySQL-layer dictionary operations still follow the same non-transactional DDL behavior as community MySQL 5.7.
The Root Cause (MySQL 8.0 / Aurora v3)
MySQL 8.0 introduced Atomic DDL, which means DDL statements are now crash-safe — they either complete fully or roll back. This dramatically reduces inconsistency risk, but doesn’t eliminate it entirely (edge cases around foreign_key_checks still apply).
Common Symptoms
When your data dictionary is inconsistent, you’ll typically see:
User-reported symptoms:
- “I can’t access a table”
- “A table disappeared”
- “I can’t drop this table”
Error log messages:
ERROR 1146 (42S02): Table 'mydb.orders' doesn't exist
ERROR 1050 (42S01): Table 'mydb.orders' already exists
ERROR 1051 (42S02): Unknown table 'mydb.orders'
[Warning] InnoDB: Cannot open table mydb/orders from the internal
data dictionary of InnoDB though the .frm file for the table exists.
[ERROR] InnoDB: Trying to load index GEN_CLUST_INDEX for table
mydb.orders, but the index tree has been freed!
The paradox of a table both “existing” and “not existing” is the hallmark of a dictionary inconsistency — one layer sees it, the other doesn’t.
What Triggers These Inconsistencies?
1. Foreign Key Checks Disabled During DDL
This is the most common cause I’ve seen in production.
-- Developer does this to speed up migrations:
SET foreign_key_checks = 0;
-- Then runs DDL that changes table structure:
ALTER TABLE orders DROP INDEX idx_customer_id;
-- This index was required by a foreign key!
SET foreign_key_checks = 1;
-- Now BOTH tables (parent and child) "disappear"
The MySQL documentation explicitly warns about this:
“With foreign_key_checks=0, dropping an index required by a foreign key constraint places the table in an inconsistent state and causes the foreign key check that occurs at table load to fail.”
When MySQL detects the FK relationship is broken, it refuses to load both tables involved in the relationship.
This applies to both RDS MySQL and Aurora MySQL, all versions.
2. Crash During DDL Operations
If MySQL crashes (OOM, storage full, hardware failure, or even a MySQL bug) while a DDL is in progress:
| Version | Risk Level | Why |
|---|---|---|
| MySQL 5.7 / Aurora v2 | High | DDL is non-transactional. Crash = partial state. |
| MySQL 8.0 / Aurora v3 | Low | Atomic DDL rolls back incomplete operations. |
On MySQL 5.7, a crash mid-DDL can leave behind:
- Orphaned tablespace files (
.ibd) without matching.frmfiles .frmfiles pointing to non-existent tablespaces- InnoDB dictionary entries with no filesystem counterpart
Aurora-specific note: Aurora’s storage layer is inherently crash-safe for data writes, which means storage-level issues causing mid-DDL crashes are rarer than on RDS MySQL or self-hosted. However, MySQL-layer crashes (OOM, bugs) can still trigger the same dictionary inconsistencies.
Diagnosis and Recovery Flowchart
┌─────────────────────────────────────┐
│ Table inaccessible / missing? │
└──────────────┬──────────────────────┘
│
▼
┌─────────────────────────────────────┐
│ Does the table use Foreign Keys? │
└──────┬───────────────────────┬──────┘
│ YES │ NO
▼ ▼
┌──────────────────┐ ┌──────────────────────────┐
│ SET SESSION │ │ Was there a recent crash │
│ foreign_key_ │ │ or DDL operation? │
│ checks = 0; │ └────────┬─────────────────┘
│ │ │
│ Can you access │ ▼
│ the table now? │ ┌──────────────────────────┐
└───┬──────────┬───┘ │ Consider PITR to before │
│YES │NO │ the DDL/crash event │
▼ ▼ └──────────────────────────┘
┌────────┐ ┌──────────────────────────┐
│ Fix FK │ │ Check error logs for │
│ defs & │ │ orphaned tablespace or │
│indexes │ │ .frm mismatch messages │
└────────┘ └──────────────────────────┘
Step-by-Step Recovery
Step 1: Check Foreign Keys First
This resolves the majority of cases I’ve seen:
-- In a session, disable FK checks
SET SESSION foreign_key_checks = 0;
-- Try to access the "missing" table
SELECT COUNT(*) FROM mydb.orders;
If the table comes back: The issue is a broken FK relationship. Fix it by:
- Reviewing the MySQL Foreign Key documentation — specifically the “Conditions and Restrictions” section
- Ensuring both parent and child tables have the required indexes
- Ensuring column definitions match exactly between FK columns
Prevention: Never run DDL statements that modify indexes or columns involved in foreign key relationships while foreign_key_checks = 0.
Step 2: Assess the Situation
Before attempting any recovery, answer these questions:
| Question | Why It Matters |
|---|---|
| Is this a production database? | Determines urgency and risk tolerance |
| Do you have recent backups/snapshots? | Determines recovery options |
| When did the issue start? | Helps identify the triggering event |
| What DDL ran before the issue? | Points to root cause |
Step 3: Protect Your Recovery Options
Do this immediately, regardless of which recovery path you choose:
# Extend backup retention to maximum (35 days)
# This preserves your PITR window
aws rds modify-db-instance \
--db-instance-identifier your-instance \
--backup-retention-period 35 \
--apply-immediately
# For Aurora clusters:
aws rds modify-db-cluster \
--db-cluster-identifier your-cluster \
--backup-retention-period 35 \
--apply-immediately
Step 4: Choose Your Recovery Path
Option A — Point-in-Time Recovery (Best option when available)
Restore to a point just before the problematic DDL:
# RDS MySQL
aws rds restore-db-instance-to-point-in-time \
--source-db-instance-identifier your-instance \
--target-db-instance-identifier your-instance-recovered \
--restore-time "2026-05-18T10:00:00Z"
# Aurora MySQL
aws rds restore-db-cluster-to-point-in-time \
--source-db-cluster-identifier your-cluster \
--db-cluster-identifier your-cluster-recovered \
--restore-to-time "2026-05-18T10:00:00Z"
Option B — Restore from Snapshot
If you have a snapshot from before the issue:
# Create a new instance from the good snapshot
aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier your-instance-recovered \
--db-snapshot-identifier your-good-snapshot
Option C — Manual Recovery (Advanced, MySQL 5.7 only)
For orphaned tablespace issues on MySQL 5.7, you may be able to:
- Identify the orphaned
.ibdfile - Use
ALTER TABLE ... DISCARD TABLESPACEandIMPORT TABLESPACEto reattach
This is risky and should only be attempted on non-production instances or copies.
Prevention Checklist
| Practice | Applies To |
|---|---|
Never run structural DDL with foreign_key_checks = 0 | All versions |
| Monitor disk space — avoid storage-full during DDL | RDS MySQL especially |
Set appropriate innodb_flush_log_at_trx_commit = 1 | RDS MySQL 5.7 |
Set sync_binlog = 1 if using binlog | RDS MySQL 5.7 |
| Upgrade to MySQL 8.0 / Aurora v3 for Atomic DDL | All |
| Test DDL operations in non-production first | All |
| Maintain adequate backup retention | All |
| Monitor error logs for early warning signs | All |
RDS MySQL vs Aurora MySQL: Key Differences
| Aspect | RDS MySQL | Aurora MySQL |
|---|---|---|
| Storage crash safety | EBS-based, relies on MySQL crash recovery | Distributed storage with built-in crash safety |
| DDL crash risk | Higher — storage-level issues more likely | Lower — storage layer is more resilient |
sync_binlog importance | Critical for durability | Less critical — Aurora handles durability at storage layer |
| Dictionary inconsistency risk | Standard MySQL risk profile | Slightly lower due to storage architecture |
| Recovery options | PITR, snapshots, manual tablespace recovery | PITR, snapshots, backtrack (Aurora-specific) |
| Atomic DDL (8.0) | Supported | Supported (Aurora v3) |
Key Takeaways
-
Data dictionary inconsistency ≠ data corruption. Your data is likely fine — the metadata just disagrees with itself.
-
Foreign keys are the #1 culprit. Always check
foreign_key_checks = 0as your first diagnostic step. -
MySQL 8.0 / Aurora v3 dramatically reduces risk through Atomic DDL. If you’re still on 5.7, this is one more reason to upgrade.
-
Protect your PITR window immediately when you discover an issue. Extend backup retention before doing anything else.
-
These issues stem from MySQL’s architecture, not from the managed platform. Understanding the “why” helps you prevent recurrence.
Special thanks to Valter for contributing ideas and review to this article.