Optimizing Source for LLMs: NERD's Token-Efficient, LLVM-Native Language

NERD is a terse, English-word-based language that cuts token use by 50–70% and compiles natively to LLVM. It treats humans as reviewers while optimizing source for LLM tokenizers and AI-driven development.

TL;DR

  • About 40% of code is now written by LLMs, with that share increasing.
  • Token efficiency: claims of 50–70% fewer tokens vs comparable TypeScript; example reports ~67% reduction.
  • Native compilation: compiles to LLVM with a C bootstrap compiler and no runtime dependencies.
  • Syntax uses short English words instead of punctuation, producing more compact output under model tokenizers while keeping code human-observable as a read-only artifact.
  • Workflow reorients humans as requesters/auditors: natural-language prompts → LLM emits/modifies NERD → compile to native → humans validate; iterative refinement via NL.
  • Debugging and compliance reframed as NL-driven diagnostics and generated audit views; project presented as an experiment testing whether LLM-optimized languages scale.
    Explore: https://github.com/Nerd-Lang/nerd-lang-core — Original: https://www.nerd-lang.org/about?

When code is written by models, source looks different

An argument emerging from the NERD project starts with a striking premise: about 40% of code is now written by LLMs, and that share is increasing. If models become the primary authors, the constraints that shaped programming languages — readability for human authors, symbol-heavy terseness, historical idioms — deserve re-evaluation. NERD (No Effort Required, Done) proposes a language and workflow designed around LLM tokenization and native compilation rather than humans typing symbols.

Design goals and core ideas

NERD is intentionally dense and machine-optimized, yet human-observable for audit and review. Its design emphasizes:

  • Token efficiency: claims of 50–70% fewer tokens versus comparable TypeScript implementations by replacing punctuation and operators with English words where LLM tokenizers are more compact.
  • Native compilation: compiles to LLVM with a bootstrap compiler written in C and no runtime dependencies.
  • Auditability over authorship: code is intended to be read-only for humans, serving as a verifiable artifact rather than the primary editing surface.

The wider rationale leans on how LLM tokenizers treat English words and symbols: many symbols fragment into multiple tokens, whereas short English words often map to single tokens. The result is a terser representation for models that still communicates intent clearly.

Syntax snapshot

An example of the NERD style:

    fn add a b
    ret a plus b

    fn calc a b op
    if op eq zero ret ok a plus b
    if op eq one ret ok a minus b
    ret err "unknown"

This snippet illustrates the shift from punctuation-heavy code to a compact, English-word-driven form. The project reports roughly 67% fewer tokens for the example compared with a TypeScript equivalent.
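To make the tokenizer argument concrete, here is a rough sketch in Python. The TypeScript version of `add` is my own rendering, not the project's benchmark, and the token counter is a crude proxy (each word or punctuation character counts as one token), not a real BPE tokenizer; counts from GPT-style tokenizers will differ, but the direction of the effect is the same.

```python
import re

# Hypothetical TypeScript equivalent of the NERD `add` function
# (my own rendering, not taken from the project).
typescript_src = '''
function add(a: number, b: number): number {
  return a + b;
}
'''

nerd_src = '''
fn add a b
ret a plus b
'''

def rough_tokens(src: str) -> int:
    # Crude proxy for a BPE tokenizer: each word is ~1 token and each
    # punctuation/operator character is ~1 token. Real tokenizers merge
    # and split differently, but symbols do tend to fragment.
    return len(re.findall(r"\w+|[^\w\s]", src))

ts = rough_tokens(typescript_src)  # punctuation-heavy: many symbol tokens
nd = rough_tokens(nerd_src)        # word-only: far fewer proxy tokens
print(ts, nd)
```

Under this proxy the TypeScript form needs 20 tokens and the NERD form 8, a reduction in the same ballpark as the ~67% the project reports for its larger example.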

Workflow rethought

The proposed workflow places humans as stakeholders rather than line-by-line authors:

  • A human requests a change in natural language.
  • An LLM emits or modifies NERD source.
  • NERD compiles to native code.
  • Humans observe and validate the compiled behavior (read-only).
  • Further natural-language prompts refine behavior.

This aligns development with LLM strengths: generation, rapid iteration, and natural-language constraints, while preserving human oversight.
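As a minimal sketch, the loop above can be written out in Python. Every function here (llm_generate_nerd, nerd_compile, human_validates) is a hypothetical stand-in for the roles described above, not a real API from the project.

```python
# Hypothetical sketch of the NERD workflow loop; none of these functions
# exist in the project. They stand in for the roles described above.

def llm_generate_nerd(request: str, source: str) -> str:
    """Stand-in for an LLM call that emits or modifies NERD source."""
    return (source + "\n" + request).strip()

def nerd_compile(source: str) -> str:
    """Stand-in for the C bootstrap compiler lowering NERD to native code."""
    return f"binary({len(source)} bytes)"

def human_validates(binary: str) -> bool:
    """Humans observe compiled behavior (read-only) and accept or reject."""
    return True  # placeholder: accept every change

def refine(requests: list[str]) -> str:
    """Drive the request -> generate -> compile -> validate loop."""
    source = ""
    for request in requests:                    # human asks in natural language
        candidate = llm_generate_nerd(request, source)  # LLM emits/modifies NERD
        binary = nerd_compile(candidate)                # compile to native
        if human_validates(binary):                     # read-only human review
            source = candidate                          # accept the change
        # otherwise the human would issue a refining prompt (not modeled here)
    return source

final = refine(["add retry logic", "log failures"])
```

The point of the sketch is the control flow: the human never edits `source` directly; every change enters through a natural-language request and leaves through a compiled, observable artifact.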

Addressing common objections

  • Debugging: The argument reframes debugging as operating at the chosen abstraction, much as most developers work above JVM bytecode or engine internals rather than editing them directly. If the abstraction is natural language, diagnostics and fixes become natural-language prompts to the model rather than manual edits to low-level code.
  • Compliance and auditability: Readability for auditors does not require human authorship. Translated views of data flow, constraints, and security properties can be generated from NERD artifacts, potentially improving traceability compared with tangled hand-written code.

The longer bet

NERD stakes a bet: within a few years, most production code will be AI-authored, and conventional languages developed for human typing will feel like vestiges of a different era. The project exists as an experiment — a C bootstrap compiler targeting LLVM with no dependencies — and invites examination of whether optimizing for LLM tokenization produces practical advantages.

Explore the project on GitHub: https://github.com/Nerd-Lang/nerd-lang-core

Original source: https://www.nerd-lang.org/about?
