Meow Programming Language Compiler Internals
This document describes the internal architecture of the Meow Programming Language compiler for contributors who want to understand or modify the compilation pipeline.
Pipeline Overview
flowchart TD
src[".nyan source"]
lexer["Lexer<br/>pkg/lexer"]
parser["Parser<br/>pkg/parser"]
checker["Checker<br/>pkg/checker"]
codegen["Codegen<br/>pkg/codegen"]
gobuild["go build"]
bin(["binary"])
src --> lexer
lexer -- "iter.Seq[Token]" --> parser
parser -- "AST" --> checker
checker -- "TypeInfo" --> codegen
codegen -- "Go source" --> gobuild
gobuild --> bin
The pipeline is orchestrated by compiler/compiler.go:
- Lexer tokenizes .nyan source into a stream of tokens
- Parser builds an AST from the token stream
- Checker performs type checking and collects type information
- Codegen transforms the AST into Go source code
- go build compiles the Go source to a native binary
Lexer (pkg/lexer/)
Design
The lexer produces an iter.Seq[Token], a push-based iterator from Go's standard iter package (available since Go 1.23; the compiler is written against Go 1.26). As a result, the lexer does not allocate a slice of all tokens upfront; it yields tokens lazily as they are consumed.
Token Emission
func Lex(source, filename string) iter.Seq[token.Token] {
	return func(yield func(token.Token) bool) {
		// scan characters, yield tokens
	}
}
Scanning
The lexer operates character-by-character:
- Skips whitespace (spaces, tabs, carriage returns)
- Recognizes single/multi-character operators (==, !=, |=|, ~>, .., =>)
- Scans identifiers and looks them up in the keyword table (token.LookupIdent)
- Scans numeric literals (integers and floats)
- Scans string literals (double-quoted, with escape sequences)
- Handles line comments (#) and block comments (-~ ... ~-)
- Emits NEWLINE tokens as statement separators
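One common way to recognize multi-character operators is to try the longest candidates first and fall back to a single character. The sketch below assumes that approach; `scanOperator`, `multiCharOps`, and the `singleCharOps` set are hypothetical names, and the real lexer may well use a hand-written switch instead.

```go
package main

import (
	"fmt"
	"strings"
)

// multiCharOps holds the multi-character operators listed above,
// longest first so that "|=|" is tried before any shorter candidate.
var multiCharOps = []string{"|=|", "==", "!=", "~>", "..", "=>"}

// singleCharOps is an assumed set of one-character operators.
const singleCharOps = "+-*/%<>=!|."

// scanOperator returns the operator that starts at src[pos], or ""
// if no operator starts there. pos must be a valid offset into src.
func scanOperator(src string, pos int) string {
	rest := src[pos:]
	for _, op := range multiCharOps {
		if strings.HasPrefix(rest, op) {
			return op
		}
	}
	if rest != "" && strings.IndexByte(singleCharOps, rest[0]) >= 0 {
		return rest[:1]
	}
	return ""
}

func main() {
	fmt.Println(scanOperator("a == b", 2))
	fmt.Println(scanOperator("x |=| f", 2))
	fmt.Println(scanOperator("a = b", 2))
}
```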
Position Tracking
Every token carries a Position with file name, 1-based line number, and column number.
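A small sketch of how such a position can be maintained while scanning. The field names and the choice of 1-based, rune-counted columns are assumptions made for illustration; the source only states that lines are 1-based.

```go
package main

import "fmt"

// Position mirrors the description above; field names are assumed.
type Position struct {
	File   string
	Line   int // 1-based
	Column int // assumed 1-based, counted in runes
}

func (p Position) String() string {
	return fmt.Sprintf("%s:%d:%d", p.File, p.Line, p.Column)
}

// advance returns the position of the character that follows r.
func advance(p Position, r rune) Position {
	if r == '\n' {
		p.Line++
		p.Column = 1
		return p
	}
	p.Column++
	return p
}

// PositionAt computes the position of the character at byte offset.
func PositionAt(file, src string, offset int) Position {
	p := Position{File: file, Line: 1, Column: 1}
	for i, r := range src {
		if i >= offset {
			break
		}
		p = advance(p, r)
	}
	return p
}

func main() {
	fmt.Println(PositionAt("a.nyan", "ab\ncd", 4))
}
```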
Parser (pkg/parser/)
Design
The parser uses Pratt parsing (top-down operator precedence) for expressions, with recursive descent for statements. The token stream arrives as iter.Seq[Token], which is converted to a pull-based iterator via iter.Pull:
func New(tokens iter.Seq[token.Token]) *Parser {
	next, stop := iter.Pull(tokens)
	p := &Parser{next: next, stop: stop}
	p.advance() // fills peek
	p.advance() // moves peek into cur and reads the next peek
	return p
}
The parser maintains two tokens: cur (current) and peek (lookahead).
Precedence Levels
const (
	precNone  = iota
	precCatch // ~>
	precOr    // ||
	precAnd   // &&
	precEq    // == !=
	precCmp   // < > <= >=
	precPipe  // |=|
	precAdd   // + -
	precMul   // * / %
	precUnary // ! -
	precCall  // () [] .
)
Expression Parsing
The core of Pratt parsing:
func (p *Parser) parseExpr(minPrec int) ast.Expr {
	left := p.parsePrefix() // Parse prefix (literal, ident, unary, etc.)
	for {
		prec := p.infixPrec(p.cur.Type)
		if prec <= minPrec {
			break
		}
		left = p.parseInfix(left, prec) // Parse infix (binary, pipe, catch)
	}
	return left
}
Prefix parsers handle: literals, identifiers, unary operators, lambdas, lists, maps, match expressions, and grouped expressions (...).
Infix parsers handle: binary operators, pipe |=|, and catch ~>.
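The same loop can be exercised end to end on a toy grammar. The following sketch evaluates integer arithmetic directly instead of building an AST, with only two precedence levels; every name in it is illustrative, and it assumes the infix parser recurses with the operator's own precedence, which gives left associativity under the `prec <= minPrec` test.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const (
	precNone = iota
	precAdd  // + -
	precMul  // *
)

type parser struct {
	toks []string
	pos  int
}

// cur returns the current token, or "" at end of input.
func (p *parser) cur() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return ""
}

func infixPrec(tok string) int {
	switch tok {
	case "+", "-":
		return precAdd
	case "*":
		return precMul
	}
	return precNone // not an infix operator (includes ")" and end of input)
}

func apply(op string, l, r int) int {
	switch op {
	case "+":
		return l + r
	case "-":
		return l - r
	default:
		return l * r
	}
}

func (p *parser) parseExpr(minPrec int) int {
	left := p.parsePrefix()
	for {
		prec := infixPrec(p.cur())
		if prec <= minPrec {
			break
		}
		op := p.cur()
		p.pos++
		// Recursing with the operator's own precedence stops at the next
		// operator of equal precedence, so "10 - 4 - 3" groups to the left.
		left = apply(op, left, p.parseExpr(prec))
	}
	return left
}

func (p *parser) parsePrefix() int {
	tok := p.cur()
	p.pos++
	if tok == "(" {
		v := p.parseExpr(precNone)
		p.pos++ // skip ")"
		return v
	}
	n, _ := strconv.Atoi(tok) // malformed numbers evaluate to 0 in this sketch
	return n
}

// Eval parses and evaluates a space-separated expression.
func Eval(src string) int {
	p := &parser{toks: strings.Fields(src)}
	return p.parseExpr(precNone)
}

func main() {
	fmt.Println(Eval("1 + 2 * 3"))
	fmt.Println(Eval("10 - 4 - 3"))
	fmt.Println(Eval("( 1 + 2 ) * 3"))
}
```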
Statement Parsing
parseStmt() dispatches on the current token type:
| Token | Parser |
|---|---|
| NYAN | parseVarStmt |
| MEOW | parseFuncStmt |
| BRING | parseReturnStmt |
| SNIFF | parseIfStmt |
| PURR | parsePurrStmt |
| NAB | parseFetchStmt |
| KITTY | parseKittyStmt |
| other | parseExprStmtOrAssign |
Newline Handling
Newlines are significant as statement terminators. The parser skips consecutive newlines and comments between statements via skipNewlines(). Inside an expression, newlines enclosed in brackets [...], braces {...}, or parentheses (...) are ignored.
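One way to picture this rule is as a filter that tracks nesting depth. The sketch below is only a model of the behavior: the real parser most likely handles newlines inline while parsing rather than in a separate pass, and `significantNewlines` is a hypothetical helper.

```go
package main

import "fmt"

// significantNewlines drops "\n" tokens that appear inside (), [] or {}
// and collapses runs of consecutive newlines outside them into one.
func significantNewlines(toks []string) []string {
	depth := 0
	var out []string
	for _, t := range toks {
		switch t {
		case "(", "[", "{":
			depth++
		case ")", "]", "}":
			if depth > 0 {
				depth--
			}
		case "\n":
			if depth > 0 {
				continue // newline inside a bracketed expression
			}
			if len(out) > 0 && out[len(out)-1] == "\n" {
				continue // already have a terminator for this statement
			}
		}
		out = append(out, t)
	}
	return out
}

func main() {
	in := []string{"[", "1", "\n", "2", "]", "\n", "\n", "x"}
	fmt.Printf("%q\n", significantNewlines(in))
}
```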
AST (pkg/ast/)
Node Hierarchy
classDiagram
class Node {
<<interface>>
}
class Expr {
<<interface>>
produces a value
}
class Stmt {
<<interface>>
performs an action
}
class Pattern {
<<interface>>
for pattern matching
}
class TypeExpr {
<<interface>>
type annotations
}
Node <|-- Expr
Node <|-- Stmt
Node <|-- Pattern
Node <|-- TypeExpr
Expr <|-- IntLit
Expr <|-- FloatLit
Expr <|-- StringLit
Expr <|-- BoolLit
Expr <|-- NilLit
Expr <|-- Ident
Expr <|-- UnaryExpr
Expr <|-- BinaryExpr
Expr <|-- CallExpr
Expr <|-- LambdaExpr
Expr <|-- ListLit
Expr <|-- MapLit
Expr <|-- IndexExpr
Expr <|-- PipeExpr
Expr <|-- CatchExpr
Expr <|-- MatchExpr
Expr <|-- MemberExpr
Stmt <|-- VarStmt
Stmt <|-- FuncStmt
Stmt <|-- ReturnStmt
Stmt <|-- IfStmt
Stmt <|-- RangeStmt
Stmt <|-- FetchStmt
Stmt <|-- KittyStmt
Stmt <|-- ExprStmt
Pattern <|-- LiteralPattern
Pattern <|-- RangePattern
Pattern <|-- WildcardPattern
TypeExpr <|-- BasicType
Key Nodes
- PipeExpr: Left |=| Right, desugared to a function call in codegen
- CatchExpr: Left ~> Right, desugared to GagOr in codegen
- RangeStmt: Supports both count form (Start=nil) and range form (Start!=nil, Inclusive=true)
- KittyStmt: Defines struct types; collected before code generation so constructors can be generated
Type Checker (pkg/checker/)
Two-Pass Design
The checker performs two passes over the AST:
- Declaration registration: Scans all function declarations and kitty definitions, recording their type signatures in FuncTypes
- Type checking: Walks the AST, verifying type annotations, checking function calls, and recording expression types in ExprTypes
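The practical benefit of registering declarations first is that a function can be called before its definition appears in the file. The sketch below models that with a drastically simplified "signature" (just a parameter count); `FuncDecl`, `Call`, and `Check` here are hypothetical and unrelated to the real `checker.Check`.

```go
package main

import "fmt"

// Call records a call site inside a function body.
type Call struct {
	Callee string
	Args   int
}

// FuncDecl is a toy declaration: a name, a parameter count, and calls.
type FuncDecl struct {
	Name   string
	Params int
	Calls  []Call
}

// Check runs two passes and returns a list of error messages.
func Check(decls []FuncDecl) []string {
	// Pass 1: register every declaration so later lookups see all of them.
	funcTypes := make(map[string]int, len(decls))
	for _, d := range decls {
		funcTypes[d.Name] = d.Params
	}

	// Pass 2: verify each call against the registered signatures.
	var errs []string
	for _, d := range decls {
		for _, c := range d.Calls {
			want, ok := funcTypes[c.Callee]
			switch {
			case !ok:
				errs = append(errs, fmt.Sprintf("%s: undefined function %s", d.Name, c.Callee))
			case want != c.Args:
				errs = append(errs, fmt.Sprintf("%s: %s expects %d argument(s), got %d", d.Name, c.Callee, want, c.Args))
			}
		}
	}
	return errs
}

func main() {
	decls := []FuncDecl{
		{Name: "main", Calls: []Call{{Callee: "helper", Args: 1}}},
		{Name: "helper", Params: 1}, // declared after its first use
	}
	fmt.Println(len(Check(decls)))
}
```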
TypeInfo
The checker produces a TypeInfo struct passed to codegen:
type TypeInfo struct {
	FuncTypes map[string]types.FuncType // function name → type signature
	ExprTypes map[ast.Expr]types.Type   // expression → inferred type
	VarTypes  map[string]types.Type     // variable name → declared type
}
Gradual Typing
The type system is gradual — untyped code coexists with typed code. The AnyType represents dynamically-typed values. Functions are considered “fully typed” only when all parameters and the return type have concrete types.
Scope Stack
Variables are tracked in a scope stack. Function bodies push a new scope containing the parameters. The checker resolves variable references by walking up the scope chain.
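A scope chain of this kind is often modeled as a linked list of maps. The following is a minimal sketch under that assumption; the `scope` type and its methods are hypothetical, and types are represented as plain strings to keep it short.

```go
package main

import "fmt"

// scope maps variable names to type names and links to its parent.
type scope struct {
	vars   map[string]string
	parent *scope
}

func newScope(parent *scope) *scope {
	return &scope{vars: make(map[string]string), parent: parent}
}

func (s *scope) declare(name, typ string) { s.vars[name] = typ }

// lookup walks from the innermost scope outward until name is found.
func (s *scope) lookup(name string) (string, bool) {
	for cur := s; cur != nil; cur = cur.parent {
		if t, ok := cur.vars[name]; ok {
			return t, true
		}
	}
	return "", false
}

func main() {
	global := newScope(nil)
	global.declare("x", "int")

	// Entering a function body pushes a scope holding the parameters.
	fn := newScope(global)
	fn.declare("x", "string") // parameter shadows the global
	fn.declare("n", "int")

	t, _ := fn.lookup("x")
	fmt.Println(t)
	_, ok := global.lookup("n")
	fmt.Println(ok)
}
```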
Codegen (pkg/codegen/)
Value Boxing
In untyped mode, all values are boxed as meow.Value:
| Meow | Generated Go |
|---|---|
| 42 | meow.NewInt(42) |
| 3.14 | meow.NewFloat(3.14) |
| "hello" | meow.NewString("hello") |
| yarn | meow.NewBool(true) |
| catnap | meow.NewNil() |
| [1, 2] | meow.NewList(meow.NewInt(1), meow.NewInt(2)) |
Typed Code Generation
When a function is “fully typed” (all params and return have concrete types), codegen generates native Go types:
| Meow Type | Go Type |
|---|---|
| int | int64 |
| float | float64 |
| string | string |
| bool | bool |
The typed path avoids boxing overhead:
meow add(a int, b int) int { bring a + b }
Generates:
func add(a int64, b int64) int64 {
	return (a + b)
}
When typed functions are called from untyped contexts, values are unboxed at call sites and re-boxed for the return value.
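The unbox/re-box step at such a call site might look roughly like the sketch below. The `Value` and `Int` types, `unboxInt`, and `callAddBoxed` are simplified, hypothetical stand-ins; the code the real codegen emits is not shown in this document.

```go
package main

import "fmt"

// Value is a cut-down version of the runtime value interface.
type Value interface{ Type() string }

// Int is a boxed integer (assumed representation).
type Int struct{ Val int64 }

func (Int) Type() string { return "Int" }

// Str exists only to demonstrate a type mismatch.
type Str struct{ Val string }

func (Str) Type() string { return "String" }

// add is the natively typed function, as in the generated example above.
func add(a int64, b int64) int64 {
	return (a + b)
}

// unboxInt extracts the int64 or panics in the runtime's error style.
func unboxInt(v Value) int64 {
	i, ok := v.(Int)
	if !ok {
		panic("Hiss! expected Int, got " + v.Type() + ", nya~")
	}
	return i.Val
}

// callAddBoxed is what an untyped call site could expand to:
// unbox the arguments, call the typed function, re-box the result.
func callAddBoxed(a, b Value) Value {
	return Int{Val: add(unboxInt(a), unboxInt(b))}
}

func main() {
	fmt.Println(callAddBoxed(Int{Val: 2}, Int{Val: 3}).(Int).Val)
}
```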
Stdlib Import Resolution
The stdPackages map defines available packages:
var stdPackages = map[string]string{
	"file":    "github.com/135yshr/meow/runtime/file",
	"http":    "github.com/135yshr/meow/runtime/http",
	"testing": "github.com/135yshr/meow/runtime/testing",
}
nab "file" registers the import, and member calls like file.snoop(x) are generated as meow_file.Snoop(x) — the function name is capitalized by capitalizeFirst.
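The name mapping is simple enough to sketch. The `capitalizeFirst` body below is a guess at a reasonable implementation (the real one is not shown here), and `memberCall` is a hypothetical helper that applies the `meow_` alias prefix seen in the example above.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// capitalizeFirst upper-cases the first rune so that the generated
// identifier is exported in Go. Assumed implementation.
func capitalizeFirst(s string) string {
	if s == "" {
		return s
	}
	r, size := utf8.DecodeRuneInString(s)
	return string(unicode.ToUpper(r)) + s[size:]
}

// memberCall renders the Go-side qualified name for pkg.fn.
func memberCall(pkg, fn string) string {
	return "meow_" + pkg + "." + capitalizeFirst(fn)
}

func main() {
	fmt.Println(memberCall("file", "snoop") + "(x)")
}
```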
Pipe Desugaring
The pipe |=| is desugared to a function call:
x |=| f(y) → f(x, y)
x |=| f → f(x)
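Both rewrite rules amount to prepending the left operand to an argument list. The sketch below applies them to a toy AST; `Ident`, `CallExpr`, `PipeExpr`, `desugarPipe`, and `render` are simplified stand-ins for the real `pkg/ast` nodes and codegen routines.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is a toy expression node.
type Expr interface{}

type Ident struct{ Name string }

type CallExpr struct {
	Fn   Expr
	Args []Expr
}

type PipeExpr struct{ Left, Right Expr }

// desugarPipe turns Left |=| Right into a plain call.
func desugarPipe(p PipeExpr) CallExpr {
	if call, ok := p.Right.(CallExpr); ok {
		// x |=| f(y) becomes f(x, y): Left goes first.
		args := append([]Expr{p.Left}, call.Args...)
		return CallExpr{Fn: call.Fn, Args: args}
	}
	// x |=| f becomes f(x).
	return CallExpr{Fn: p.Right, Args: []Expr{p.Left}}
}

// render prints an expression in call syntax for inspection.
func render(e Expr) string {
	switch n := e.(type) {
	case Ident:
		return n.Name
	case CallExpr:
		parts := make([]string, len(n.Args))
		for i, a := range n.Args {
			parts[i] = render(a)
		}
		return render(n.Fn) + "(" + strings.Join(parts, ", ") + ")"
	}
	return "?"
}

func main() {
	x, y, f := Ident{"x"}, Ident{"y"}, Ident{"f"}
	fmt.Println(render(desugarPipe(PipeExpr{Left: x, Right: CallExpr{Fn: f, Args: []Expr{y}}})))
	fmt.Println(render(desugarPipe(PipeExpr{Left: x, Right: f})))
}
```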
Catch Desugaring
The catch ~> is desugared to GagOr:
expr ~> fallback
Becomes:
meow.GagOr(meow.NewFunc("~>", func(args ...meow.Value) meow.Value {
	return <expr>
}), <fallback>)
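Wrapping the expression in a function delays its evaluation until `GagOr` can guard it. The document does not show how `GagOr` decides to use the fallback, so the sketch below assumes it does so when the wrapped function panics or returns a `Furball`; the simplified signature (a plain thunk instead of a `Func` value) is also an assumption.

```go
package main

import "fmt"

// Value is a cut-down runtime value interface.
type Value interface{ String() string }

type Str string

func (s Str) String() string { return string(s) }

// Furball is the error value described in the runtime section.
type Furball struct{ Message string }

func (f Furball) String() string { return "Furball: " + f.Message }

// GagOr evaluates thunk and substitutes fallback on failure.
// Assumed semantics: a panic or a Furball result both count as failure.
func GagOr(thunk func() Value, fallback Value) (result Value) {
	defer func() {
		if r := recover(); r != nil {
			result = fallback
		}
	}()
	v := thunk()
	if _, isErr := v.(Furball); isErr {
		return fallback
	}
	return v
}

func main() {
	ok := GagOr(func() Value { return Str("fish") }, Str("kibble"))
	bad := GagOr(func() Value { panic("Hiss! no fish, nya~") }, Str("kibble"))
	fmt.Println(ok, bad)
}
```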
Kitty (Struct) Handling
Kitty definitions are collected in a pre-pass (collectKittyDefs). They don’t generate Go struct types — instead, they use the runtime Kitty value with dynamic field lookup:
Cat("Nyantyu", 3)
Generates:
meow.NewKitty("Cat", []string{"name", "age"}, meow.NewString("Nyantyu"), meow.NewInt(3))
Field access cat.name generates cat.(*meow.Kitty).GetField("name").
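A dynamic struct of this kind can be sketched as a map keyed by field name. The struct fields below follow the runtime description (`TypeName`, `FieldNames`, `Fields`), but the method bodies, the panic messages, and the `Value` return type of `NewKitty` are assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// Value is a cut-down runtime value interface.
type Value interface{ String() string }

type Str string

func (s Str) String() string { return string(s) }

type Int int64

func (i Int) String() string { return strconv.FormatInt(int64(i), 10) }

// Kitty is a dynamic struct: fields live in a map, not in Go fields.
type Kitty struct {
	TypeName   string
	FieldNames []string
	Fields     map[string]Value
}

func (k *Kitty) String() string { return k.TypeName } // simplified

// NewKitty pairs positional arguments with the declared field names.
func NewKitty(typeName string, fieldNames []string, args ...Value) Value {
	if len(args) != len(fieldNames) {
		panic(fmt.Sprintf("Hiss! %s expects %d fields, got %d, nya~", typeName, len(fieldNames), len(args)))
	}
	fields := make(map[string]Value, len(fieldNames))
	for i, name := range fieldNames {
		fields[name] = args[i]
	}
	return &Kitty{TypeName: typeName, FieldNames: fieldNames, Fields: fields}
}

// GetField looks a field up by name at run time.
func (k *Kitty) GetField(name string) Value {
	v, ok := k.Fields[name]
	if !ok {
		panic(fmt.Sprintf("Hiss! %s has no field %s, nya~", k.TypeName, name))
	}
	return v
}

func main() {
	cat := NewKitty("Cat", []string{"name", "age"}, Str("Nyantyu"), Int(3))
	fmt.Println(cat.(*Kitty).GetField("name"))
}
```

The trade-off of this design is that a misspelled field name is only detected when the access executes, not when the Go code is compiled.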
Test Mode
In test mode (GenerateTest), the codegen:
- Auto-imports the testing package
- Collects test_-prefixed functions and wraps them with meow_testing.Run()
- Collects catwalk_-prefixed functions and wraps them with meow_testing.Catwalk()
- Appends meow_testing.Report() at the end of main()
Compiler Orchestration (compiler/)
The Compiler struct ties the pipeline together:
func (c *Compiler) CompileToGo(source, filename string) string {
	tokens := lexer.Lex(source, filename)
	p := parser.New(tokens) // named p to avoid shadowing the parser package
	prog, errs := p.Parse()
	// ... error handling ...
	typeInfo := checker.Check(prog)
	gen := codegen.New()
	gen.SetTypeInfo(typeInfo)
	goCode, err := gen.Generate(prog)
	// ...
	return goCode
}
For Build and Run, the compiler:
- Creates a temporary directory
- Writes a go.mod and main.go with the generated code
- Runs go build in the temp directory
- Copies or executes the resulting binary
Runtime (runtime/meowrt/)
Value Interface
All Meow values implement:
type Value interface {
	Type() string   // "Int", "Float", "String", "Bool", etc.
	String() string // String representation
	IsTruthy() bool // Truthiness for conditions
}
Concrete Types
- Int: wraps int64
- Float: wraps float64
- String: wraps string
- Bool: wraps bool
- NilValue: singleton nil
- Func: wraps a Go function func(args ...Value) Value
- Furball: error value with Message string
- List: wraps []Value with helper methods
- Map: wraps map[string]Value
- Kitty: dynamic struct with TypeName, FieldNames, Fields map[string]Value
Operator Dispatch
Operators in operators.go use type switches to dispatch on operand types. All arithmetic requires same-type operands. Type mismatches panic with "Hiss! ...".
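A type-switch dispatcher for a single operator could look like the sketch below. The `Add` function, the cut-down `Value` interface, and the exact wording of the panic message are assumptions; only the "Hiss!" prefix and the same-type rule come from the text above.

```go
package main

import "fmt"

// Value is a cut-down runtime value interface.
type Value interface{ Type() string }

type Int int64

func (Int) Type() string { return "Int" }

type Float float64

func (Float) Type() string { return "Float" }

// Add dispatches on the left operand's type and requires the right
// operand to have the same type; anything else is a runtime error.
func Add(a, b Value) Value {
	switch x := a.(type) {
	case Int:
		if y, ok := b.(Int); ok {
			return x + y
		}
	case Float:
		if y, ok := b.(Float); ok {
			return x + y
		}
	}
	panic(fmt.Sprintf("Hiss! cannot add %s and %s, nya~", a.Type(), b.Type()))
}

// tryAdd converts a panic from Add into a message for inspection.
func tryAdd(a, b Value) (v Value, msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	return Add(a, b), ""
}

func main() {
	fmt.Println(tryAdd(Int(1), Int(2)))
	fmt.Println(tryAdd(Int(1), Float(2)))
}
```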
Error Convention
Runtime errors panic with strings matching "Hiss! <message>, nya~". Test assertion failures use a distinct testFailure panic type (not prefixed with “Hiss!”) so the test runner can distinguish assertion failures from runtime errors.
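The distinction pays off in the recover handler of a test runner, which can switch on the recovered value's type. The sketch below assumes `testFailure` is a struct carrying a message and that runtime errors are panicked as plain strings; `runTest` and the outcome labels are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// testFailure marks an assertion failure (assumed shape).
type testFailure struct{ msg string }

// runTest runs fn and classifies how it ended.
func runTest(fn func()) (outcome string) {
	defer func() {
		switch r := recover().(type) {
		case nil:
			// fn returned normally; keep "PASS"
		case testFailure:
			outcome = "FAIL: " + r.msg
		case string:
			if !strings.HasPrefix(r, "Hiss!") {
				panic(r) // not a Meow runtime error; propagate
			}
			outcome = "ERROR: " + r
		default:
			panic(r) // unknown panic; propagate
		}
	}()
	fn()
	return "PASS"
}

func main() {
	fmt.Println(runTest(func() {}))
	fmt.Println(runTest(func() { panic(testFailure{msg: "expected 1, got 2"}) }))
	fmt.Println(runTest(func() { panic("Hiss! division by zero, nya~") }))
}
```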