Appendix C: AAAK Dialect Complete Reference
This appendix consolidates the AAAK_SPEC constant from mcp_server.py and the complete encoding tables from dialect.py, providing a searchable reference for the AAAK dialect. Source baseline: the current MemPalace source snapshot discussed in this book.
Overview
AAAK is a compressed shorthand format designed for AI agents. It is not meant for human reading -- it is meant for LLMs. Any model that can read English (Claude, GPT, Gemini, Llama, Mistral) can understand AAAK directly, without a decoder or fine-tuning.
Format Structure
Line Types
| Prefix | Meaning | Format |
|---|---|---|
| 0: | Header line | FILE_NUM\|PRIMARY_ENTITY\|DATE\|TITLE |
| Z + number | Zettel entry | ZID:ENTITIES\|topic_keywords\|"key_quote"\|WEIGHT\|EMOTIONS\|FLAGS |
| T: | Tunnel (cross-entry link) | T:ZID<->ZID\|label |
| ARC: | Emotion arc | ARC:emotion->emotion->emotion |
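The prefixes in the table above are enough to route a line to the right handler. The following is a minimal Python sketch, not code from dialect.py: classify_line, parse_zettel, and the field names are hypothetical, and the naive split assumes the quoted key_quote contains no pipe character.

```python
import re
from typing import Dict

# Field order follows the Zettel row of the table; the names are illustrative.
ZETTEL_FIELDS = ["entities", "topics", "quote", "weight", "emotions", "flags"]


def classify_line(line: str) -> str:
    """Map an AAAK line to its line type using only the prefix."""
    if line.startswith("0:"):
        return "header"
    if line.startswith("T:"):
        return "tunnel"
    if line.startswith("ARC:"):
        return "arc"
    if re.match(r"Z\d+:", line):
        return "zettel"
    return "unknown"


def parse_zettel(line: str) -> Dict[str, str]:
    """Split a Zettel line into named fields (naive pipe split)."""
    zid, _, rest = line.partition(":")
    record = {"zid": zid}
    record.update(zip(ZETTEL_FIELDS, rest.split("|")))
    return record
```

A line such as `Z12:PRI+KAI|auth_migration|"Clerk over Auth0"|4|determ|DECISION` (an invented example, including the `+` and `_` sub-separators) would be classified as a Zettel and split into six named fields.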
Field Separators
- Pipe | separates different fields within the same line
- Arrow → denotes causal or transformational relationships
- Stars ★ to ★★★★★ indicate importance (1-5 scale)
Entity Encoding
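Entity names are encoded as three-letter uppercase codes, either taken from a pre-defined mapping or derived automatically as the first three letters in uppercase:
| Name | Code | Rule |
|---|---|---|
| Alice | ALC | Pre-defined mapping (name[:3].upper() alone would give ALI) |
| Jordan | JOR | name[:3].upper() |
| Riley | RIL | |
| Max | MAX | |
| Ben | BEN | |
| Priya | PRI | |
| Kai | KAI | |
| Soren | SOR | |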
Source location: dialect.py:367-379 (encode_entity method)
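The two encoding paths (pre-defined mapping and automatic coding) can be sketched as follows. This is an illustration under stated assumptions, not the body of encode_entity: the function name, the `known` parameter, and the rule that explicit mappings win over the prefix rule are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical pre-defined mapping; ALC=Alice is the pair listed in AAAK_SPEC.
PREDEFINED = {"Alice": "ALC"}


def encode_entity_sketch(name: str, known: Optional[Dict[str, str]] = None) -> str:
    """Return a three-letter uppercase code for a name."""
    known = known or {}
    if name in known:          # assumption: explicit mappings take precedence
        return known[name]
    return name[:3].upper()    # automatic fallback: first three letters
```

With the mapping supplied, Alice encodes to ALC; without it, the prefix rule yields ALI, which is why ALC cannot come from name[:3].upper() alone.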
Emotion Encoding Table
AAAK uses standardized short codes to represent emotional states.
Core Emotion Codes
| English | Code | Meaning |
|---|---|---|
| vulnerability | vul | Vulnerability |
| joy | joy | Joy |
| fear | fear | Fear |
| trust | trust | Trust |
| grief | grief | Grief |
| wonder | wonder | Wonder |
| rage | rage | Rage |
| love | love | Love |
| hope | hope | Hope |
| despair | despair | Despair |
| peace | peace | Peace |
| humor | humor | Humor |
| tenderness | tender | Tenderness |
| raw_honesty | raw | Raw honesty |
| self_doubt | doubt | Self-doubt |
| relief | relief | Relief |
| anxiety | anx | Anxiety |
| exhaustion | exhaust | Exhaustion |
| conviction | convict | Conviction |
| quiet_passion | passion | Quiet passion |
| warmth | warmth | Warmth |
| curiosity | curious | Curiosity |
| gratitude | grat | Gratitude |
| frustration | frust | Frustration |
| confusion | confuse | Confusion |
| satisfaction | satis | Satisfaction |
| excitement | excite | Excitement |
| determination | determ | Determination |
| surprise | surprise | Surprise |
Source location: dialect.py:47-88 (EMOTION_CODES dictionary)
Shorthand Markers in the MCP Server
The AAAK_SPEC in mcp_server.py uses *marker* format to annotate emotional context:
| Marker | Meaning |
|---|---|
| *warm* | Warmth / Joy |
| *fierce* | Determination / Resolve |
| *raw* | Vulnerability / Raw honesty |
| *bloom* | Tenderness / Blossoming |
Emotion Signal Detection
dialect.py automatically detects emotions in text via keyword matching:
| Keyword | Mapped Code |
|---|---|
| decided | determ |
| prefer | convict |
| worried | anx |
| excited | excite |
| frustrated | frust |
| confused | confuse |
| love | love |
| hate | rage |
| hope | hope |
| fear | fear |
| happy | joy |
| sad | grief |
| surprised | surprise |
| grateful | grat |
| curious | curious |
| anxious | anx |
| relieved | relief |
| concern | anx |
Source location: dialect.py:91-114 (_EMOTION_SIGNALS dictionary)
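Keyword-based detection of this kind can be sketched in a few lines. The dictionary below copies a subset of the table; the function name, the plain substring test, and the cap of three codes are assumptions about how such a detector might work rather than a copy of dialect.py.

```python
from typing import List

# Subset of the keyword table above, in table order.
EMOTION_SIGNALS = {
    "decided": "determ",
    "worried": "anx",
    "excited": "excite",
    "frustrated": "frust",
    "hate": "rage",
    "anxious": "anx",
    "relieved": "relief",
}


def detect_emotions(text: str, limit: int = 3) -> List[str]:
    """Collect emotion codes whose keyword occurs anywhere in the text."""
    lowered = text.lower()
    codes: List[str] = []
    for keyword, code in EMOTION_SIGNALS.items():
        # Plain substring test: no tokenization, no negation handling.
        if keyword in lowered and code not in codes:
            codes.append(code)
    return codes[:limit]
```

One consequence of substring matching is worth noting: in this sketch the word "whatever" triggers the rage code, because it contains the letters "hate". A word-boundary match would avoid that at the cost of missing inflected forms.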
Semantic Flags
Flags mark the type of factual assertion, aiding retrieval and classification.
| Flag | Meaning | Trigger Keywords |
|---|---|---|
| DECISION | Explicit decision or choice | decided, chose, switched, migrated, replaced, instead of, because |
| ORIGIN | Origin moment | founded, created, started, born, launched, first time |
| CORE | Core belief or identity pillar | core, fundamental, essential, principle, belief, always, never forget |
| PIVOT | Emotional turning point | turning point, changed everything, realized, breakthrough, epiphany |
| TECHNICAL | Technical architecture or implementation detail | api, database, architecture, deploy, infrastructure, algorithm, framework, server, config |
| SENSITIVE | Content requiring careful handling | (manually annotated) |
| GENESIS | Directly led to the creation of something that still exists | (inferred from context) |
Source location: dialect.py:117-152 (_FLAG_SIGNALS dictionary)
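The five keyword-triggered flags can be sketched the same way. The keyword lists are taken from the table; detect_flags and its substring test are hypothetical, and SENSITIVE and GENESIS are omitted because the table describes them as annotated or inferred rather than keyword-driven.

```python
from typing import Dict, List

FLAG_SIGNALS: Dict[str, List[str]] = {
    "DECISION": ["decided", "chose", "switched", "migrated", "replaced",
                 "instead of", "because"],
    "ORIGIN": ["founded", "created", "started", "born", "launched", "first time"],
    "CORE": ["core", "fundamental", "essential", "principle", "belief",
             "always", "never forget"],
    "PIVOT": ["turning point", "changed everything", "realized",
              "breakthrough", "epiphany"],
    "TECHNICAL": ["api", "database", "architecture", "deploy", "infrastructure",
                  "algorithm", "framework", "server", "config"],
}


def detect_flags(text: str) -> List[str]:
    """Return every flag with at least one trigger keyword in the text."""
    lowered = text.lower()
    return [flag for flag, keywords in FLAG_SIGNALS.items()
            if any(keyword in lowered for keyword in keywords)]
```

Under these assumptions, "We switched the database to Postgres" yields DECISION and TECHNICAL. Short keywords such as "api" are the fragile ones: a substring test also fires on "rapid".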
Palace Structure Identifiers
| Element | Format | Example |
|---|---|---|
| Wing | wing_ + name | wing_user, wing_code, wing_myproject |
| Hall | hall_ + type | hall_facts, hall_events, hall_discoveries, hall_preferences, hall_advice |
| Room | Hyphenated slug | chromadb-setup, gpu-pricing, auth-migration |
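The identifier formats in the table can be produced with small helpers. This is a sketch only: wing_id, hall_id, and room_slug are hypothetical names, the hall list is the one given in AAAK_SPEC, and rejecting unknown hall types is an assumption rather than documented behavior.

```python
import re

# Hall types listed in AAAK_SPEC.
KNOWN_HALLS = {"facts", "events", "discoveries", "preferences", "advice"}


def wing_id(name: str) -> str:
    """wing_ + name, e.g. wing_code."""
    return "wing_" + name


def hall_id(kind: str) -> str:
    """hall_ + type; unknown types are rejected (an assumption of this sketch)."""
    if kind not in KNOWN_HALLS:
        raise ValueError(f"unknown hall type: {kind}")
    return "hall_" + kind


def room_slug(title: str) -> str:
    """Lowercase the title and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
```

For instance, room_slug("ChromaDB Setup") produces chromadb-setup, matching the example in the table.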
Full Example
Original English (~70 tokens)
Priya manages the Driftwood team: Kai (backend, 3 years), Soren (frontend),
Maya (infrastructure), and Leo (junior, started last month). They're building
a SaaS analytics platform. Current sprint: auth migration to Clerk.
Kai recommended Clerk over Auth0 based on pricing and DX.
AAAK Encoding (~35 tokens)
TEAM: PRI(lead) | KAI(backend,3yr) SOR(frontend) MAY(infra) LEO(junior,new)
PROJ: DRIFTWOOD(saas.analytics) | SPRINT: auth.migration→clerk
DECISION: KAI.rec:clerk>auth0(pricing+dx) | ★★★★
Factual Assertion Verification
| # | Assertion | AAAK Counterpart | Preserved |
|---|---|---|---|
| 1 | Priya is the team lead | PRI(lead) | Yes |
| 2 | Kai does backend | KAI(backend,3yr) | Yes |
| 3 | Kai has 3 years of experience | KAI(backend,3yr) | Yes |
| 4 | Soren does frontend | SOR(frontend) | Yes |
| 5 | Maya does infrastructure | MAY(infra) | Yes |
| 6 | Leo is a junior engineer | LEO(junior,new) | Yes |
| 7 | Leo started last month | LEO(junior,new) | Yes |
| 8 | The project is called Driftwood | DRIFTWOOD | Yes |
| 9 | It is a SaaS analytics platform | saas.analytics | Yes |
| 10 | Current sprint is auth migration | SPRINT: auth.migration→clerk | Yes |
| 11 | Migration target is Clerk | →clerk | Yes |
| 12 | Kai recommended Clerk | KAI.rec:clerk | Yes |
| 13 | Reasons are pricing and developer experience | pricing+dx | Yes |
For this short structured example, all 13 factual assertions are preserved, at a compression ratio of about 2x (~70 tokens down to ~35). The ratio is modest because the example is short and already information-dense.
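The verification table can be turned into a mechanical check: each assertion is paired with the fragment listed as its AAAK counterpart, and the check confirms the fragment appears in the encoding. Substring presence is only a weak proxy for preserved meaning, and the names AAAK, CHECKS, and missing_fragments are introduced here purely for illustration.

```python
from typing import List, Tuple

AAAK = (
    "TEAM: PRI(lead) | KAI(backend,3yr) SOR(frontend) MAY(infra) LEO(junior,new)\n"
    "PROJ: DRIFTWOOD(saas.analytics) | SPRINT: auth.migration→clerk\n"
    "DECISION: KAI.rec:clerk>auth0(pricing+dx) | ★★★★"
)

# (assertion, AAAK counterpart) pairs copied from the table above.
CHECKS: List[Tuple[str, str]] = [
    ("Priya is the team lead", "PRI(lead)"),
    ("Kai does backend", "KAI(backend,3yr)"),
    ("Kai has 3 years of experience", "KAI(backend,3yr)"),
    ("Soren does frontend", "SOR(frontend)"),
    ("Maya does infrastructure", "MAY(infra)"),
    ("Leo is a junior engineer", "LEO(junior,new)"),
    ("Leo started last month", "LEO(junior,new)"),
    ("The project is called Driftwood", "DRIFTWOOD"),
    ("It is a SaaS analytics platform", "saas.analytics"),
    ("Current sprint is auth migration", "SPRINT: auth.migration→clerk"),
    ("Migration target is Clerk", "→clerk"),
    ("Kai recommended Clerk", "KAI.rec:clerk"),
    ("Reasons are pricing and developer experience", "pricing+dx"),
]


def missing_fragments(encoded: str, checks: List[Tuple[str, str]]) -> List[str]:
    """Return the assertions whose AAAK fragment is absent from the encoding."""
    return [assertion for assertion, fragment in checks if fragment not in encoded]
```

Running missing_fragments(AAAK, CHECKS) on the encoding above returns an empty list, which corresponds to the 13/13 result in the table.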
AAAK_SPEC in the MCP Server
The following is the complete specification passed to the AI via the mempalace_status tool, found at mcp_server.py:102-119:
AAAK is a compressed memory dialect that MemPalace uses for efficient storage.
It is designed to be readable by both humans and LLMs without decoding.
FORMAT:
ENTITIES: 3-letter uppercase codes. ALC=Alice, JOR=Jordan, RIL=Riley, MAX=Max, BEN=Ben.
EMOTIONS: *action markers* before/during text. *warm*=joy, *fierce*=determined,
*raw*=vulnerable, *bloom*=tenderness.
STRUCTURE: Pipe-separated fields. FAM: family | PROJ: projects | ⚠: warnings/reminders.
DATES: ISO format (2026-03-31). COUNTS: Nx = N mentions (e.g., 570x).
IMPORTANCE: ★ to ★★★★★ (1-5 scale).
HALLS: hall_facts, hall_events, hall_discoveries, hall_preferences, hall_advice.
WINGS: wing_user, wing_agent, wing_team, wing_code, wing_myproject,
wing_hardware, wing_ue5, wing_ai_research.
ROOMS: Hyphenated slugs representing named ideas (e.g., chromadb-setup, gpu-pricing).
EXAMPLE:
FAM: ALC→♡JOR | 2D(kids): RIL(18,sports) MAX(11,chess+swimming) | BEN(contributor)
Read AAAK naturally — expand codes mentally, treat *markers* as emotional context.
When WRITING AAAK: use entity codes, mark emotions, keep structure tight.
Under the current protocol, the AI receives this specification when it explicitly calls mempalace_status and a palace already exists. This is not an out-of-band automatic injection path.
Compression Pipeline
The compress() method in dialect.py performs five-stage processing:
graph TD
A[Raw Text] --> B["1. Entity Detection<br/>name[:3].upper()"]
B --> C["2. Topic Extraction<br/>Remove stopwords + frequency sort"]
C --> D["3. Key Sentence Selection<br/>_extract_key_sentence()"]
D --> E["4. Emotion/Flag Detection<br/>Keyword → code mapping"]
E --> F["5. AAAK Assembly<br/>Pipe-separated + header line"]
For the current dialect.compress() plain-text path, a more accurate description is that all five stages contain heuristic selection, not only Stage 3. Entities, topics, emotions, and flags are all detected and truncated; key_sentence is simply the most obvious selection step. The current pipeline is closer to a high-compression index generator than to a strictly lossless encoder.
The "lossless AAAK" discussed elsewhere in the README and book is best understood as a design goal: truly preserving fact-by-fact structure would require a stronger alignment between the compressor and the original text than the current heuristic plain-text pipeline provides.
Source location: dialect.py:539-602 (compress method)
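To make the lossy character of the five stages concrete, here is a deliberately small toy pipeline. It is not the compress() method: the signal tables are tiny stand-ins, toy_compress and its known_entities parameter are hypothetical, and the `+` and `_` sub-separators are invented for the sketch. Only the stage order, the top-3 caps, and the 55-character truncation follow the description in this appendix.

```python
import re
from collections import Counter
from typing import List

STOP_WORDS = {"the", "a", "to", "and", "of", "for", "was", "about", "use", "we"}
EMOTION_SIGNALS = {"decided": "determ", "worried": "anx", "relieved": "relief"}
FLAG_SIGNALS = {"DECISION": ["decided", "chose", "switched"],
                "TECHNICAL": ["api", "database", "deploy"]}
DECISION_WORDS = ("decided", "chose", "switched", "because")


def toy_compress(text: str, known_entities: List[str]) -> str:
    lowered = text.lower()

    # 1. Entity detection: known names only, coded as name[:3].upper(), top 3.
    entities = [n[:3].upper() for n in known_entities if n in text][:3]

    # 2. Topic extraction: drop stop words and entity names, keep top 3 by count.
    names = {n.lower() for n in known_entities}
    words = [w for w in re.findall(r"[a-z]+", lowered)
             if len(w) > 2 and w not in STOP_WORDS and w not in names]
    topics = [w for w, _ in Counter(words).most_common(3)]

    # 3. Key sentence: the sentence with the most decision words, cut to 55 chars.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    key = max(sentences,
              key=lambda s: sum(w in s.lower() for w in DECISION_WORDS),
              default="")[:55]

    # 4. Emotion and flag detection: keyword to code mapping, emotions capped at 3.
    emotions: List[str] = []
    for keyword, code in EMOTION_SIGNALS.items():
        if keyword in lowered and code not in emotions:
            emotions.append(code)
    flags = [f for f, kws in FLAG_SIGNALS.items() if any(k in lowered for k in kws)]

    # 5. Assembly: pipe-separated fields.
    return "|".join(["+".join(entities), "_".join(topics), f'"{key}"',
                     "+".join(emotions[:3]), "+".join(flags)])
```

Every stage in the sketch discards something: names outside the known list, topics beyond the third, all sentences but one, and any characters past position 55. None of that can be reconstructed from the output alone.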
AAAK Dialect Completeness Assessment
Implemented Capabilities
| Capability | Source Location | Completeness |
|---|---|---|
| Entity encoding | encode_entity() :367-379 | Complete — name[:3].upper(), supports pre-defined mappings and auto-coding |
| Emotion encoding | EMOTION_CODES :47-88 | Complete — 28 emotions → short code mapping |
| Emotion detection | _EMOTION_SIGNALS :91-114 | Basic: 24 keyword triggers, plain substring (in) matching, no context awareness |
| Flag detection | _FLAG_SIGNALS :117-152 | Basic — 7 flag types, 36 keywords, simple matching |
| Topic extraction | _extract_topics() :430-455 | Basic — word frequency + capitalization/camelCase weighting, top-3 |
| Key sentence | _extract_key_sentence() :457-508 | Basic — 18 decision words scored, short sentences weighted, truncated to 55 chars |
| Entity detection | _detect_entities_in_text() :510-537 | Basic — known entity matching + capitalized word fallback, top-3 |
| Compression assembly | compress() :539-602 | Complete — pipe-separated output format |
| Stop words | _STOP_WORDS :155-289 | Complete — ~135 English stop words |
| Config persistence | from_config() / save_config() | Complete |
| Zettel format | encode_zettel() / compress_file() | Complete — backward-compatible with legacy format |
| Layer1 generation | generate_layer1() | Complete — batch compression + aggregation |
| Compression stats | compression_stats() | Complete — original/compressed token counting |
Missing Critical Capabilities
As a "language," AAAK lacks key linguistic infrastructure:
| Missing | Impact | Severity |
|---|---|---|
| No formal grammar definition | No BNF/EBNF/PEG specification; "grammar" exists only in compress() code logic | High |
| No decoder/decompressor | Only encoding direction; no decompress() method to verify reversibility | High |
| No roundtrip tests | No assert decompress(compress(text)) ≈ text verification | High |
| No token-level precision | count_tokens() uses len(text)//3 estimation, not a real tokenizer | Medium |
| No multilingual support | Stop words, signal words, entity detection all hardcoded for English | Medium |
| No versioning | Encoding format has no version marker; cannot distinguish between different AAAK output versions | Medium |
| Truncation is irrecoverable | key_sentence truncated to 55 chars (:506-507), topics capped at top-3, emotions capped at top-3 — anything beyond is discarded | High |
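The token-precision row refers to a character-based estimate. A sketch of that estimate and the ratio derived from it is shown below; only the len(text)//3 rule comes from the table, while the function names and the guard against division by zero are assumptions of this illustration.

```python
def estimate_tokens(text: str) -> int:
    """Character-based estimate from the table: about three characters per token."""
    return len(text) // 3


def compression_ratio(original: str, compressed: str) -> float:
    """Ratio of estimated token counts (guarded against an empty output)."""
    return estimate_tokens(original) / max(estimate_tokens(compressed), 1)
```

Because the estimate depends only on length, two strings with the same character count always receive the same token count, whatever a real tokenizer would report. Ratios computed this way are approximate for AAAK output in particular, which mixes punctuation, abbreviations, and symbols such as ★.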
Core Qualitative Judgment
AAAK is not a language — it is a compression function.
A true language requires three elements:
- Syntax — what constitutes a valid AAAK string. AAAK partially has this (pipe separation, header line format), but without formal definition.
- Semantics — the meaning definition of each symbol. AAAK has this (emotion code table has clear semantics).
- Roundtrip capability — information is preserved after encode→decode. AAAK completely lacks this.
compress() is a one-way function — it compresses text into AAAK format, but there is no corresponding decompress() to verify whether information was actually preserved. The README's "lossless" claim relies on "LLMs can read AAAK" — this delegates verification responsibility to the model's reasoning capability rather than the format's own reversibility guarantee.
In Fairness
- The design intuition is correct — "extremely abbreviated English, let the LLM be the decoder" genuinely works, because LLM language understanding can fill in omitted information.
- Engineering-sufficient — as a Closet-layer index (not the sole storage), AAAK does not need strict losslessness — Drawers preserve the originals.
- Cross-model readability is real: any English-capable model can indeed understand KAI(backend,3yr); this property does not depend on AAAK's formal completeness.
- 950 lines of code achieved usability: for a v3.0.0 project, this implementation sufficiently supports the benchmark results.
Overall Ratings
| Dimension | Score | Notes |
|---|---|---|
| Design concept | 8/10 | "LLM as decoder" is an original and effective insight |
| Implementation completeness | 5/10 | Encoder is complete, but lacks decoder and roundtrip verification |
| Formal language completeness | 3/10 | No BNF, no versioning, no formal semantics |
| Engineering utility | 7/10 | Sufficient as an index layer with Drawer as safety net |
| "30x lossless" claim | 3/10 | Over-promises — actually lossy index generation |
The most honest positioning: AAAK is an AI-oriented shorthand index format that enables any LLM to quickly understand context summaries through extremely abbreviated English, while relying on the Drawer layer to preserve complete original text as a safety net. Its core value lies not in "lossless compression" but in "cross-model-readable efficient indexing."