Iroh: A Unified Programming Language for the Entire Stack
Why Iroh?
Building production apps today tends to involve multiple languages, e.g. Go for the backend, SQL for the database, TypeScript for the frontend, CSS for styling, etc.
Worse, despite there being hundreds of languages, if you want high performance with strong memory safety and minimal runtime overhead, your options are limited, e.g. to Rust and Ada.
But what if we could have it all? A single, minimal language that:
- Was comparable to C, C++, Rust, and Zig in terms of performance.
- Provided the memory safety of Rust but without having to constantly work around the constraints of the borrow checker.
- Was designed for fast compilations and developer productivity like Pascal, Go, and Zig.
- Was as easy to learn as Python and TypeScript.
- Could be used across the entire stack: infrastructure, apps, data engineering, styling, and even smart contracts.
This is what we’re creating with Iroh. We will be using it to build both the Espra browser and Hyperchain. Developers will be able to use it to write both onchain scripts and user-facing apps on Espra.
Our only non-goal is running directly on hardware without an OS, e.g. on embedded systems or when writing kernels or drivers. We’ll leave that to Ada, C, C++, Forth, Rust, and Zig.
Execution Modes
Iroh is intended to support 6 primary execution modes:
- standard
  - The default mechanism that produces an executable binary for an OS and target architecture, e.g. macos-arm64.
  - This is pretty similar to what you would expect if you were to write code in languages like Go or Rust.
  - This mode has full access to make whatever calls the underlying OS allows, e.g. calling into C functions, making system calls, etc.
  - The main function within the root package serves as the entrypoint for the program being run.
- onchain-script
  - A sandboxed execution mechanism with automatic metering of every operation.
  - This mode will have restricted data types, e.g. floating-point types won’t be available as they will necessitate complex circuits when producing zero-knowledge proofs.
  - This mode will also set certain defaults, e.g. treat all decimal-point literals as fixed128 values, check arithmetic operations for wrapping, default to an arena allocator, etc.
- onchain-ui-app
  - A sandboxed execution mechanism that can only call into the underlying system based on explicit functions that have been passed in by the “host”, e.g. to load and save data.
  - This is somewhat similar to how WASM modules are executed by web browsers.
  - The App defined within the root package serves as the entrypoint for the onchain app being run.
- declarative
  - A constrained mode that only allows the declarative aspects of the language.
  - This allows for Iroh to be used in settings like config files, whilst still retaining the richness of expressivity, type validation, IDE support, etc.
- gpu-compute
  - A constrained mode for executing code on a GPU or AI chip.
  - This mode will not allow for certain operations, e.g. dynamic allocation of memory, and will have restricted data types to match what the underlying chip supports.
- script
  - An interpreted mode where Iroh behaves like a dynamic “scripting” language.
  - This will be available when the iroh executable is run with no subcommands. This provides the user with a REPL combined with an Iroh-based shell interface.
  - The fast feedback mechanism of the REPL will allow for high developer productivity, rapid prototyping, as well as make it easier for people to learn and discover features.
  - Scripts can be run with no compilation by just passing the script name, e.g.
    iroh <script-name>.iroh
    Or if the script files start with a shebang, i.e.
    #!/usr/bin/env iroh
    They can be made executable with chmod +x and run directly, e.g.
    ./script-name.iroh
  - This mode will also be useful in embedded contexts, e.g. scripting for games, notebooks, etc.
While the execution mechanism and available functionality will be slightly different in each mode, they will all support the same core language, making life easier for developers everywhere.
Structured Code Editing
Iroh code is edited through a custom editor that gives the impression of editing text. For example, here’s what “hello world” looks like:
func main() {
    print("Hello world")
}
But despite looking like text, behind the scenes, Iroh’s editor updates various data structures about the code as it is being edited. In fact this is where Iroh gets its name from:
- Iroh stands for “intermediate representation of hypergraphs”.
Hypergraphs are just graphs where edges can connect any number of nodes, i.e. are not limited to just connecting two nodes. This gives us a powerful base data structure that can capture:
- Data flow dependencies.
- Control flow relationships.
- Type relationships.
- Package dependencies.
- Multi-dimensional relationships that typical structures like trees and normal graphs can’t express.
Perhaps most importantly, where the typical compiler loses semantic information at each step:
Source → Parse → AST → Semantic Analysis → IR → Optimize → Machine Code
Iroh maintains full semantic information throughout. This allows us to improve the developer experience in ways that are not possible with other languages:
- Instant semantic feedback as you type, going well beyond what IDEs can do today.
- Refactoring that understands intent, not just syntax.
- Intelligent error messages and debugging based on the full context.
It also makes Iroh itself a lot simpler to develop:
- Outside of single-line expressions, we don’t need to worry about any ambiguous or conflicting grammars in the language. We can use simpler syntax without worrying about parsing.
- There’s no need to build bespoke tooling like LSPs, code formatters, etc. Everything from how the code is presented and edited is just a transformation of the hypergraph.
This gives Iroh a massive competitive advantage:
- Compiling becomes superfast as most of the work that a typical compiler would do is already done at “edit time”. In essence, compilation is just:
  Hypergraph → Optimize → Machine Code
- All tools work from the same source of truth, providing consistency across the board.
- As new tools are just new hypergraph transformations, the system can be easily extended with richer and richer functionality.
- There’s a lot less maintenance work as it’s all effectively just one system instead of lots of separate tools.
The fact that a developer can only be editing one bit of code at any point in time allows us to:
- Do deep analysis on just the specific area that is being edited. Since we’re not starting from scratch like a typical compiler, we have the full context to help guide the developer.
- Provide real-time inference, e.g. create sets of all the different errors that might be generated within a function.
- Automatically refactor, e.g. detect that a deeply nested component needs an extra field to be passed to it, and make sure that all of its parent callers pass it through.
- Provide custom editing views for different contexts, e.g. a live simulation interface for UI statecharts, a graph editor for dataflow systems, a picker for changing colour values, etc.
- Avoid issues like those in Swift, where the compiler errors with “expression too complex to be solved in reasonable time” because type inference from scratch takes too long.
In particular, the combination of textual code with domain-specific views can create really intuitive experiences, e.g.
- Decision tables result in less buggy code than nested boolean logic/conditions like:
  if !user.is_active and (!user.has_paid or !user.email_verified) {
      // Block access
  } else if (!user.phone_verified and user.has_paid) or (!user.is_active and !user.email_verified) {
      // Ask to verify phone
  } else if user.is_active and user.has_paid and user.email_verified and user.phone_verified {
      // Grant access
  } else {
      // Manual review required
  }
- Many map/dictionary values can be cleanly expressed as tables.
- Complex array/matrix operations can be represented as mathematical equations providing greater clarity on what’s happening.
We also happen to sidestep the issue of agreeing on coding styles within teams. Each developer can configure their own “theme” that defines exactly how code should be rendered.
In addition, the iroh command-line tool provides Iroh-aware awk, cat,
diff, grep, head, less, sed, tail, and wc, so that developers can
use them on Iroh code bases.
These tools work by:
- Finding all package files that need to be processed by the tool call that is being made.
- Rendering them from their hypergraph representation to Iroh Standard Syntax.
- Writing the rendered output to a temporary/cache directory.
- Calling the underlying tool on those files.
- Rewriting the output to reference the original files, e.g. so that the grep output points to the actual source file and “line”.
All in all, Iroh can provide a fundamentally better development experience, whilst still keeping it accessible and familiar with its default text-like representation.
Edit Calculus
Iroh’s editor builds on the fantastic work that Jonathan Edwards has been doing from Subtext onwards. At the heart of our editor, we have an edit calculus that:
- Codifies a broad set of operations on the underlying hypergraph that preserves the intent of changes, whilst providing efficient in-memory and on-disk representations of the data.
- Allows for “time-travelling” backwards and forwards across changes, whilst maintaining consistency with any changes made concurrently by others.
- Unlike CRDTs, natively ensures semantic validity and coherence when concurrent changes are merged.
- Unlike git’s text-based diffs, provides semantic diffs that preserve the meaning of changes made by developers.
This allows for the kind of collaboration on code, intelligent merging, enriched code reviews, and advanced debugging that’s not available in mainstream languages.
Numeric Data Types
Iroh implements the full range of numeric data types that one would expect:
- Integer types
- Floating-point types
- Complex number types
- Decimal types
- Uncertainty types
- Fraction types
All numeric values support the typical arithmetic operators:
- Addition: x + y
- Subtraction: x - y
- Multiplication: x * y
- Division: x / y
- Negation: -x
- Modulus/Remainder: x % y
- Exponentiation: x ** y
- Percentage: x%
Signed integer types are available in the usual bit-widths:
| Type | Min Value | Max Value |
|---|---|---|
| int8 | -128 | 127 |
| int16 | -32768 | 32767 |
| int32 | -2147483648 | 2147483647 |
| int64 | -9223372036854775808 | 9223372036854775807 |
| int128 | -170141183460469231731687303715884105728 | 170141183460469231731687303715884105727 |
Likewise for the unsigned integer types:
| Type | Min Value | Max Value |
|---|---|---|
| uint8 | 0 | 255 |
| uint16 | 0 | 65535 |
| uint32 | 0 | 4294967295 |
| uint64 | 0 | 18446744073709551615 |
| uint128 | 0 | 340282366920938463463374607431768211455 |
Some integer types have aliases like in Go:
- byte is an alias for uint8 to indicate that the data being processed represents bytes, e.g. in a byte slice: []byte.
- int is aliased to the signed integer type corresponding to the underlying architecture’s bit-width, e.g. int64 on 64-bit platforms, int32 on 32-bit platforms, etc.
- uint is aliased to the unsigned integer type corresponding to the underlying architecture’s bit-width, e.g. uint64 on 64-bit platforms, uint32 on 32-bit platforms, etc.
Like Zig, Iroh supports arbitrary-width integers for bit-widths between 1 and
65535 when the type name int or uint is followed by a bit-width, e.g.
// 7-bit signed integer
a = int7(20)
// 4096-bit unsigned integer
b = uint4096(10715086071862641821530)
If an arbitrary-precision integer is desired, then that is available via the
built-in bigint type that automatically expands to fit the necessary
precision, e.g.
a = bigint(10715086071862641821530) ** 5000 // would need 365,911 bits
Integer literals can be represented in various formats:
// Decimal
24699848519483
// With underscores for legibility
24_699_848_519_483
// Hex
0x1676e1b26f3b
0X1676E1B26F3B
// Octal
0o547334154467473
// Binary
0b101100111011011100001101100100110111100111011
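As a quick sanity check that the literals above all denote the same number, here is a small sketch in Python, which happens to accept the same prefixes and underscore separators. It says nothing about how Iroh parses literals; it only confirms the arithmetic.

```python
# The same value written as decimal, underscore-separated decimal,
# lower/upper-case hex, octal, and binary.
values = [
    24699848519483,
    24_699_848_519_483,
    0x1676e1b26f3b,
    0X1676E1B26F3B,
    0o547334154467473,
    0b101100111011011100001101100100110111100111011,
]

# A set collapses duplicates, so one element means all literals agree.
all_equal = len(set(values)) == 1
print(all_equal)
```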
Integer types support the typical bit operators:
- Bitwise AND: a & b
- Bitwise OR: a | b
- Bitwise XOR: a ^ b
- Bitwise NOT: ^a
- Bitwise AND NOT: a &^ b
- Left shift: a << b
- Right shift: a >> b
Various methods exist on integer types to support bit manipulation, e.g.
- x.count_one_bits: counts the number of bits with the value 1.
- x.leading_zero_bits: counts the number of leading zeros.
- x.bit_length: returns the minimum number of bits needed to represent the value.
- x.reverse_bits: reverses the bits.
- x.rotate_bits_left: rotates the value left by n bits.
- x.to_big_endian: converts the value from little endian to big endian.
- x.to_little_endian: converts the value from big endian to little endian.
- x.trailing_zero_bits: counts the number of trailing zeros.
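To make the intended semantics of a few of these methods concrete, here is a minimal Python sketch for an 8-bit unsigned value. The function names mirror the method names above, but the fixed WIDTH and the behaviour for a zero input are assumptions made for illustration, not a description of Iroh's implementation.

```python
WIDTH = 8                 # assumed bit-width for this sketch, i.e. a uint8
MASK = (1 << WIDTH) - 1   # 0xFF


def count_one_bits(x: int) -> int:
    """Number of bits set to 1."""
    return bin(x & MASK).count("1")


def leading_zero_bits(x: int) -> int:
    """Zeros above the highest set bit, within WIDTH bits."""
    return WIDTH - (x & MASK).bit_length()


def trailing_zero_bits(x: int) -> int:
    """Zeros below the lowest set bit (WIDTH when x is zero)."""
    x &= MASK
    if x == 0:
        return WIDTH
    return (x & -x).bit_length() - 1  # x & -x isolates the lowest set bit


def reverse_bits(x: int) -> int:
    """Mirror the bit pattern within WIDTH bits."""
    return int(format(x & MASK, f"0{WIDTH}b")[::-1], 2)


def rotate_bits_left(x: int, n: int) -> int:
    """Rotate left by n, wrapping the high bits around to the low end."""
    n %= WIDTH
    x &= MASK
    return ((x << n) | (x >> (WIDTH - n))) & MASK


x = 0b00010110
print(count_one_bits(x), leading_zero_bits(x), trailing_zero_bits(x))
print(format(reverse_bits(x), "08b"), format(rotate_bits_left(x, 5), "08b"))
```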
Iroh provides the typical floating-point types:
| Type | Implementation |
|---|---|
| float16 | IEEE-754-2008 binary16 |
| float32 | IEEE-754-2008 binary32 |
| float64 | IEEE-754-2008 binary64 |
| float80 | IEEE-754-2008 80-bit extended precision |
| float128 | IEEE-754-2008 binary128 |
As well as some additional floating-point types for use cases like machine learning:
| Type | Implementation |
|---|---|
| float8_e4 | E4M3 8-bit storage-only format |
| float8_e5 | E5M2 8-bit storage-only format |
| bfloat16 | Brain floating-point format |
Non-finite floating-point values can be constructed using methods on the computational floating-point types, e.g.
float64.nan() // NaN
float64.inf() // Positive infinity
float64.neg_inf() // Negative infinity
Additional methods on the floating-point values support further manipulation:
- x.copy_sign: copies the sign of a given value.
- x.is_nan: checks if a floating-point value is a NaN.
- x.is_inf: checks if a floating-point value is infinity.
- x.is_finite: checks if a floating-point value is not a NaN or infinity.
- x.nan_payload: returns the NaN payload data for debugging.
The with construct can be used to control how subnormals are treated in a
scope, e.g.
with {
    .float_subnormals = .flush_to_zero
} {
    // Overrides the default gradual underflow and flushes
    // subnormals to zero within any performance-critical
    // code.
}
The following float-specific control options are supported:
- exceptions: .ignore, .trap
- inf_handling: .preserve, .saturate_to_max, .saturate_to_zero, .trap
- nan_handling: .quiet, .canonical, .preserve_payload, .to_max, .to_min, .to_zero, .trap
- optimize_for: .throughput, .latency, .memory, .power
- overflow: .to_infinity, .saturate_to_max, .saturate_to_zero, .saturate_to_signed_zero, .to_nan, .trap, .wrap_to_opposite_inf
- rounding: .half_even, .half_up, .half_away, .half_down, .truncate, .ceiling, .floor, .processor_default, .stochastic
- quality: .strict, .approximate, .fast_math
- signed_zero: .strict, .ignore
- subnormals: .preserve, .flush_to_min_normal, .flush_to_zero, .trap
- track: one or more of .divide_by_zero, .inexact, .invalid, .overflow, .underflow
While casting to larger bit-widths is generally safe, e.g.
x = float32(3.14)
y = float64(x)
Downcasting to smaller bit-widths takes optional parameters, e.g.
x = float64(3.14)
y = float32(x,
    exceptions: .ignore,
    optimize_for: .throughput,
    overflow: .saturate_to_max,
    quality: .approximate,
    rounding: .stochastic,
    subnormals: .flush_to_zero,
)
These parameters also work with array conversions. When available, hardware accelerations like SIMD instructions are automatically used for bulk conversions, e.g.
x = [2]float64{3.14, 1.23}
y = [2]float32(x,
    exceptions: .trap,
    optimize_for: .latency,
    overflow: .to_infinity,
    quality: .fast_math,
    rounding: .half_even,
    subnormals: .preserve,
)
Relatedly, certain operations on float arrays will automatically take advantage of SIMD instructions, e.g.
a = [4]float32{1.0, 2.0, 3.0, 4.0}
b = [4]float32{10.0, 20.0, 30.0, 40.0}
c = a .+ b // Element-wise addition is auto-vectorized
c == {11.0, 22.0, 33.0, 44.0}
Iroh supports posit types for simpler handling of special cases compared to
IEEE floats and better precision for values near 1.0 at comparable bit-widths.
Posit types are of the form posit<bit-width>_<exponent-size> and currently
support a maximum bit-width of 128, e.g.
-
posit8_0— an 8-bit posit with no explicit exponent. -
posit16_1— a 16-bit posit with 1-bit exponent. -
posit32_2— a 32-bit posit with 2-bit exponent.
Posits work just like normal numbers:
x = posit32_2(1.5)
y = posit32_2(2.25)
z = x + y
z == 3.75 // true
To ensure numeric stability during calculations, wide accumulator registers called quires can be used, e.g.
acc = quire32_2()
// The literals below are interpreted as the matching posit
// type, i.e. posit32_2
acc += 12.3 * 0.4
acc -= 6.3 * 8.4
// The .posit() method can be called on a quire to get the
// resulting value:
result = acc.posit()
result == -48.0 // true
Complex numbers can be represented using Iroh’s complex number types:
| Type | Description |
|---|---|
| complex64 | Real and imaginary parts are float32 |
| complex128 | Real and imaginary parts are float64 |
Complex number values can be constructed using the complex constructor or
literals, i.e.
// Using the complex constructor
x = complex(1.2, 3.4)
// Using literals
x = 1.2 + 3.4i
Similarly, quaternion types are provided for use in domains like computer graphics and physics:
| Type | Description |
|---|---|
| quaternion128 | Real and imaginary parts are float32 |
| quaternion256 | Real and imaginary parts are float64 |
Quaternion values can be constructed using the quaternion constructor
or literals, i.e.
// Using the quaternion constructor
x = quaternion(1.2, 3.4, 5.6, 7.8)
// Using literals
x = 1.2 + 3.4i + 5.6j + 7.8k
Iroh provides 2 fixed-point signed data types for dealing with things like monetary values:
| Type | Max Value |
|---|---|
| fixed128 | 170141183460469231731.687303715884105727 |
| fixed256 | 57896044618658097711785492504343953926634992332820282019728.792003956564819967 |
Unlike floating-point values, these fixed-point data types can represent decimal
values exactly, and support up to 18 decimal places of precision.
The smallest number that can be represented is:
0.000000000000000001
Our fixed-point types do not have any notion of NaNs or infinities. All invalid operations will automatically generate an error, e.g. overflow, underflow, etc.
The fixed-point types are augmented with a third decimal type:
| Type | Implementation |
|---|---|
| bigdecimal | Supports arbitrary-precision decimal calculations |
Values of this type are a lot slower than fixed-point values as they are allocated on the heap, but they allow for unbounded scale, i.e. the number of digits after the decimal point, and unlimited range, e.g.
// Scale sets the number of digits after the decimal point:
x = bigdecimal(3.2, scale: 30)
y = x ** 50
y == 18092513943330655534932966.407607485602073435104006338131 // true
When even that’s not enough, and you need exact calculations, there’s a
fraction type for representing a rational number, i.e. a quotient a/b of
arbitrary-precision integers, e.g.
x = fraction(1, 3)
y = fraction(2, 5)
z = x + y // results in 11/15 exactly
The numerator and denominator of the fraction can be directly accessed:
x = fraction(1, 3)
x.num // 1
x.den // 3
The fraction value can be converted to a decimal value, e.g. using
bigdecimal:
x = fraction(1, 3)
bigdecimal(x) // 0.333333333333333333
The exact scale can also be specified, e.g.
x = fraction(1, 3)
bigdecimal(x, scale: 5) // 0.33333
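The exact-rational behaviour described above can be checked with Python's standard fractions and decimal modules. This is only a sketch of the arithmetic; the to_decimal helper and its half-even rounding are assumptions chosen to mirror the scale parameter, not Iroh's actual conversion routine.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from fractions import Fraction

x = Fraction(1, 3)
y = Fraction(2, 5)
z = x + y  # stays exact: 5/15 + 6/15 = 11/15


def to_decimal(f: Fraction, scale: int) -> Decimal:
    """Render a fraction with `scale` digits after the decimal point."""
    quantum = Decimal(1).scaleb(-scale)  # e.g. scale=5 gives 0.00001
    quotient = Decimal(f.numerator) / Decimal(f.denominator)
    return quotient.quantize(quantum, rounding=ROUND_HALF_EVEN)


print(z.numerator, z.denominator)
print(to_decimal(x, 18))
print(to_decimal(x, 5))
```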
Iroh also supports uncertainty values that are useful for things like simulations, financial risk analysis, scientific calculations, engineering tolerance analysis, etc. These types:
-
Include a component that represents the level of uncertainty of its value.
-
Propagate the uncertainty level through calculations.
Each of the floating-point and decimal types has an uncertainty variant where
the name of the variant is the underlying type’s name prefixed with u, e.g.
ufloat64, ufixed128, etc.
Uncertainty values can be instantiated using the type constructor or using the
± literal, e.g.
x = ufloat64(1.2, 0.2)
y = 1.2 ± 0.2
The uncertainty bounds are propagated across calculations, e.g.
x = 1.0 ± 0.1
y = 2.0 ± 0.05
z = x + y // 3.0 ± ~0.11
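The ~0.11 in the example above is what you get when independent uncertainties are combined in quadrature. The Python sketch below assumes that rule for addition; whether Iroh uses exactly this propagation model is an assumption on our part here, and add_uncertain is a hypothetical helper.

```python
import math


def add_uncertain(a, b):
    """Add two (value, uncertainty) pairs, assuming independent errors."""
    value = a[0] + b[0]
    # Independent uncertainties add in quadrature: sqrt(ua**2 + ub**2).
    uncertainty = math.hypot(a[1], b[1])
    return value, uncertainty


z = add_uncertain((1.0, 0.1), (2.0, 0.05))
print(z)
```

Note that the combined uncertainty is smaller than the plain sum 0.1 + 0.05, which is what a worst-case interval would give.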
The value and uncertainty of the resulting value can be accessed directly, e.g.
x = 1.2 ± 0.2
x.value // 1.2
x.uncertainty // 0.2
To perform interval arithmetic with guaranteed bounds, interval types can be
used to represent values as ranges and automatically propagate uncertainty
through calculations, e.g.
x = interval(1.0, 1.1)
y = interval(2.0, 2.2)
z = x + y
z == (3.0, 3.3)
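For comparison with the uncertainty types, interval addition simply adds the lower bounds and the upper bounds. The Python sketch below uses exact fractions to sidestep floating-point rounding; a float-based implementation would additionally need to round bounds outward to keep the guarantee. The helper names are hypothetical.

```python
from fractions import Fraction


def interval_add(x, y):
    """[a, b] + [c, d] = [a + c, b + d]."""
    return (x[0] + y[0], x[1] + y[1])


def interval_mul(x, y):
    """The product's bounds are the extremes of all endpoint products."""
    products = [a * b for a in x for b in y]
    return (min(products), max(products))


x = (Fraction("1.0"), Fraction("1.1"))
y = (Fraction("2.0"), Fraction("2.2"))
print(interval_add(x, y))
print(interval_mul(x, y))
```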
Modular arithmetic is supported by modular types that are parameterized by
their modulus, automatically handling wraparound for all operations, e.g.
ring = modular[7]
x = ring(5)
y = ring(4)
z = x + y
z == 2 // 5 + 4 = 9, which wraps to 2 in mod 7
Similarly, Galois field arithmetic is supported by finite_field types, e.g.
prime fields work just like in modular arithmetic:
gf7 = finite_field[7]
x = gf7(5)
y = gf7(4)
z = x + y
z == 2 // 5 + 4 = 9, which wraps to 2 in mod 7
Finite field values support additional useful methods, such as calculating multiplicative inverses, e.g.
gf7 = finite_field[7]
x = gf7(3)
y = x.multiplicative_inverse()
y == 5 // 3 * 5 = 15, which wraps to 1 in mod 7
Extension fields support calculations using polynomials represented as integers, e.g.
gf256 = finite_field[256]
x = gf256(0b11000101) // x⁷ + x⁶ + x² + 1
y = gf256(0b01010110) // x⁶ + x⁴ + x² + x
z = x + y
z == 0b10010011 // x⁷ + x⁴ + x + 1
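Both results above can be reproduced by hand. In a prime field the multiplicative inverse is a modular inverse, and in GF(2⁸) adding two polynomials means adding their coefficients mod 2, which is a bitwise XOR of the integer representations. A short Python check of that arithmetic, not of Iroh's API:

```python
# Prime field GF(7): the inverse of 3 is the y with (3 * y) % 7 == 1.
p = 7
inverse = pow(3, -1, p)             # modular inverse (Python 3.8+)
inverse_fermat = pow(3, p - 2, p)   # Fermat's little theorem: x**(p-2) mod p

# Extension field GF(2^8): addition is coefficient-wise mod 2, i.e. XOR.
x = 0b11000101   # x^7 + x^6 + x^2 + 1
y = 0b01010110   # x^6 + x^4 + x^2 + x
z = x ^ y        # the x^6 and x^2 terms cancel

print(inverse, inverse_fermat, format(z, "08b"))
```

Multiplication in an extension field also depends on the choice of irreducible polynomial, which is why only addition is shown here.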
By natively supporting finite fields, Iroh makes it easy for libraries to be built to support things like AES, elliptic curves, Shamir’s secret sharing, error correcting codes, zero-knowledge proofs, etc.
For further security, the with construct can be used to enable constant time
operations, e.g.
with {
    .constant_time = true
} {
    // All calculations within here are done in constant time.
}
Numeric literals like 42, 3.14, and 1e6 are initially untyped and of
arbitrary precision in Iroh. They remain untyped when assigned to const
values, and are only typed when assigned to variables.
When numeric literals are not typed, by default:
- Values without decimal places, i.e. integer_literals, are inferred as int values.
- Values with decimal places, i.e. decimal_point_literals, are inferred as fixed128 values.
- Values with e or E exponents, i.e. exponential_literals, are inferred as float64 values.
For example:
// Untyped decimal-point value
const Pi = 3.14159265358979323846264338327950288419716939937510582097494459
// When explicit types aren't specified:
radius = 5.0 // inferred as a fixed128
area = Pi * radius * radius // Pi is treated as a fixed128
// When explicit types are specified:
radius = float32(5.0)
area = Pi * radius * radius // Pi is treated as a float32
The with statement can be used to control how numeric literals are inferred in
specific lexical scopes, e.g.
with {
    .decimal_point_literals = .float32
    .integer_literals = .int128
} {
    x = 3.14 // x is a float32
    y = 1234 // y is an int128
}
Numeric types are automatically upcasted if it can be done safely, e.g.
x = int32(20)
y = int64(10)
z = x + y // x is automatically upcasted to an int64
Otherwise, variables will have to be cast explicitly, e.g.
x = int32(20)
y = int64(10)
z = x + int32(y) // y is explicitly downcasted to an int32
When integers are cast to a floating-point type, a type conversion is performed that tries to fit the closest representation of the integer within the float.
Conversely, when a floating-point value is cast to an integer type, its fractional part is discarded, and the result is the integer part of the floating-point value.
Except for the cases when a type won’t fit, e.g. when adding an int256 and a
fixed128 value, integers are considered safe for upcasting to fixed-point
types.
Similarly, integers and fixed-point values are both considered safe for
upcasting into bigdecimal values, e.g.
x = bigdecimal(1.2)
y = fixed128(2.5)
z = 3 * (x / y) // results in a bigdecimal value of 1.44
By default, the compiler will error if literals are assigned to types outside of
the range for an integer, fixed-point, or bigdecimal type, e.g.
x = int8(1024) // ERROR!
Likewise, an error is generated at runtime if a value is downcast into a type that doesn’t fit, e.g.
x = int64(1024)
y = int8(x) // ERROR!
Integer types can take an optional policy parameter to instead either truncate
the value to fit the target type’s bit width, recast the bits, or clamp it to
the type’s min or max value, e.g.
x = int64(4431)
y = int8(x, policy: .truncate) // results in 79
z = int8(x, policy: .clamp) // results in 127
// Recast a uint8 as an int8
x = uint8(200)
y = int8(x, policy: .recast) // results in -56
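The three policies can be modelled in a few lines of Python. This is a sketch of the arithmetic behind the results above, under the assumption that .truncate keeps the low 8 bits as a two's complement value and that .recast reinterprets a same-width bit pattern; the function names are hypothetical.

```python
INT8_MIN, INT8_MAX = -128, 127


def to_int8_truncate(x: int) -> int:
    """Keep the low 8 bits and read them as two's complement."""
    low = x & 0xFF
    return low - 256 if low > INT8_MAX else low


def to_int8_clamp(x: int) -> int:
    """Pin out-of-range values to the nearest representable bound."""
    return max(INT8_MIN, min(INT8_MAX, x))


def uint8_recast_to_int8(x: int) -> int:
    """Reinterpret the 8 bits of a uint8 as an int8."""
    if not 0 <= x <= 0xFF:
        raise ValueError("expected a uint8 value")
    return x - 256 if x > INT8_MAX else x


print(to_int8_truncate(4431), to_int8_clamp(4431), uint8_recast_to_int8(200))
```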
Arithmetic operations on integer types are automatically checked, i.e. will raise an error on either overflow or underflow, e.g.
x = uint8(160)
y = uint8(160)
z = x + y // ERROR!
This behaviour can be changed by using the with statement to set the
.integer_arithmetic_policy to either .wrapping, .saturating, or
.checked, e.g.
with {
    .integer_arithmetic_policy = .wrapping
} {
    x = uint8(160)
    y = uint8(160)
    z = x + y // results in 64
}
The integer types also provide methods corresponding to operations using each of the policy variants for when you don’t want to change the setting for the whole scope, e.g.
x = uint8(160)
y = uint8(160)
z = x.wrapping_add(y) // results in 64
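The difference between the three policies is easiest to see side by side. The Python sketch below models uint8 addition under each one; the function names follow the wrapping_add naming above, while checked_add, saturating_add, and the use of OverflowError are assumptions for illustration.

```python
UINT8_MAX = 255


def checked_add(x: int, y: int) -> int:
    """Raise instead of silently producing an out-of-range value."""
    result = x + y
    if result > UINT8_MAX:
        raise OverflowError(f"{x} + {y} does not fit in a uint8")
    return result


def wrapping_add(x: int, y: int) -> int:
    """Discard the carry out of the top bit, i.e. compute mod 256."""
    return (x + y) & UINT8_MAX


def saturating_add(x: int, y: int) -> int:
    """Stick at the maximum value on overflow."""
    return min(x + y, UINT8_MAX)


print(wrapping_add(160, 160), saturating_add(160, 160))
```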
The errors mentioned above, along with the error caused by dividing by zero, can
be caught using the try keyword, e.g.
x = 1024
y = try int8(x)
Percentage values can be constructed using the % suffix, e.g.
total = 146.00
vat_rate = 20%
vat = total * vat_rate
vat == 29.20 // true
Numeric types can also be constructed from string values, e.g.
x = int64("910365")
The string value can be of any format that’s valid for a literal of that type, e.g.
x = int32("1234")
y = fixed128("28.50")
z = uint8("0xff") // Automatic base inferred from the 0x prefix
An optional base parameter can be specified when parsing strings to integer
types, e.g.
x = int64("deadbeef", base: .hex)
y = uint8("10101100", base: .binary)
Numbers can be rounded using the built-in round function. By default it will
round to the closest integer value using .half_even rounding, e.g.
x = 13.5
round(x) // 14
An alternative rounding mode can be specified if desired, e.g.
x = 13.5
round(x, .down) // 13
The following rounding modes are natively supported:
enum {
    half_even, // Round to nearest, .5 goes towards nearest even integer
    half_up,   // Round to nearest, .5 goes up towards positive infinity
    half_away, // Round to nearest, .5 goes away from zero
    half_down, // Round to nearest, .5 goes down towards zero
    up,        // Always round away from zero
    down,      // Always round towards zero
    ceiling,   // Always round towards positive infinity
    floor,     // Always round towards negative infinity
    truncate,  // Remove fractional part
}
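The modes differ mainly in how they treat ties and negative values. The sketch below maps most of them onto the rounding constants of Python's decimal module so the differences can be tried out. The mapping is our assumption based on the comments above. Note that Python's ROUND_HALF_UP breaks ties away from zero, so it corresponds to half_away here, and the half_up mode described above (ties towards positive infinity) has no direct equivalent in the module and is left out.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_DOWN,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
    ROUND_UP,
)

# Assumed correspondence between the mode names above and Python's constants.
MODES = {
    "half_even": ROUND_HALF_EVEN,
    "half_away": ROUND_HALF_UP,    # Python's HALF_UP: ties away from zero
    "half_down": ROUND_HALF_DOWN,  # ties towards zero
    "up": ROUND_UP,                # always away from zero
    "down": ROUND_DOWN,            # always towards zero
    "ceiling": ROUND_CEILING,
    "floor": ROUND_FLOOR,
    "truncate": ROUND_DOWN,        # dropping the fraction equals "down"
}


def round_to_int(value: str, mode: str) -> int:
    """Round a decimal string to an integer using the named mode."""
    return int(Decimal(value).quantize(Decimal(1), rounding=MODES[mode]))


for mode in MODES:
    print(mode, round_to_int("13.5", mode), round_to_int("-13.5", mode))
```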
The number of decimal places can be controlled by an optional scale parameter, e.g.
pi = 3.1415926
round(pi, scale: 4) // 3.1416
A negative scale makes the rounding occur to the left of the decimal point. This is useful for rounding to the nearest ten, hundred, thousand, etc.
x = 12345.67
round(x, scale: -2) // 12300
The optional significant_figures parameter can round to a specific number of
significant figures, e.g.
x = 123.456
round(x, significant_figures: 3) // 123
y = 0.001234
round(y, significant_figures: 2) // 0.0012
The round function works on all numeric types. Note that the rounding of
floating-point values might yield surprising results as most decimal fractions
can’t be represented exactly in floats.
Certain arithmetic operations on the decimal types are automatically rounded. The compiler will avoid re-ordering these so that outputs are deterministic.
By default, rounding for both fixed-point types and bigdecimal will be to 18
decimal places and using the .half_even rounding mode. This can be controlled
using the with statement, e.g.
with {
    .decimal_point_literals = .fixed128
    .integer_literals = .fixed128
    .decimal_round = .floor
    .decimal_scale = 4
} {
    x = 1/3 // 0.3333
}
Converting numeric values into strings can be done by just casting them into a
string type, e.g.
x = 1234
string(x) // "1234"
This can take an optional format parameter to control how the number is formatted, e.g.
x = 1234
string(x, format: .decimal) // "1234", the default
string(x, format: .hex) // "4d2"
string(x, format: .hex_upper) // "4D2"
string(x, format: .hex_prefixed) // "0x4d2"
string(x, format: .hex_upper_prefixed) // "0X4D2"
string(x, format: .octal) // "2322"
string(x, format: .octal_prefixed) // "0o2322"
string(x, format: .binary) // "10011010010"
string(x, format: .binary_prefixed) // "0b10011010010"
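The expected strings above can be cross-checked against Python's built-in format function, whose format specifiers happen to cover the same cases. The dictionary keys simply reuse the format names from the example; this checks the digits, not Iroh's formatter.

```python
x = 1234

# Python format specs: "x"/"X" hex, "o" octal, "b" binary, "#" adds the prefix.
formats = {
    "decimal": format(x, "d"),
    "hex": format(x, "x"),
    "hex_upper": format(x, "X"),
    "hex_prefixed": format(x, "#x"),
    "hex_upper_prefixed": format(x, "#X"),
    "octal": format(x, "o"),
    "octal_prefixed": format(x, "#o"),
    "binary": format(x, "b"),
    "binary_prefixed": format(x, "#b"),
}

for name, text in formats.items():
    print(f"{name}: {text}")
```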
The optional scale parameter will pad the output with trailing zeros to match
the desired number of decimal places, e.g.
x = 12.3
string(x, scale: 2) // "12.30"
If something other than the default .half_even rounding, i.e. banker’s
rounding, is desired, then the optional round parameter can be used:
x = 12.395
string(x, round: .floor, scale: 2) // "12.39"
The optional thousands parameter can be set to segment the number into
multiples of thousands, e.g.
x = 1234567
string(x, thousands: true) // "1,234,567"
For formatting numbers as they’re expected in different locales, the locale
parameter can be set, e.g.
x = 1234567.89
string(x, locale: .de) // "1.234.567,89"
A with statement can be used to apply the locale to a lexical scope, e.g.
with {
    .locale = .de
} {
    // all string formatting in this scope will use the specified locale
}
The specific separators for decimal and thousands can also be controlled
explicitly. The setting of thousands_separator implicitly sets thousands to
true.
x = 1234567.89
string(x, decimal_separator: ",", thousands_separator: ".") // "1.234.567,89"
For more complete numeric support, the standard library also provides
additional packages, e.g. the math package defines constants like Pi,
implements trig functions, etc.
Unit Values
Units can be defined using the built-in unit type and live within a custom
<unit> namespace, e.g.
<s> = unit(name: "second", plural: "seconds")
<km> = unit(name: "kilometre", plural: "kilometres")
Numeric values can be instantiated with a specific <unit>, e.g.
distance = 20<km>
timeout = 30<s>
Unit definitions are evaluated at compile-time, and can be related to each
other via the @relate function, e.g.
<s> = unit(name: "second", plural: "seconds")
<min> = unit(name: "minute", plural: "minutes")
<hour> = unit(name: "hour", plural: "hours")
@relate(<min>, 60<s>)
@relate(<hour>, 60<min>)
If the optional si_unit is set to true during unit definition, then variants
using SI prefixes will be automatically created, e.g.
<s> = unit(name: "second", plural: "seconds", si_unit: true)
// Units like ns, us, ms, ks, Ms, etc. are automatically created, e.g.
1<s> == 1000<ms> // true
Non-linear unit relationships can be defined too, e.g.
<C> = unit(name: "°C")
<F> = unit(name: "°F")
@relate(<C>, ((<F> - 32) * 5) / 9)
// The opposite is automatically inferred, i.e.
<F> == ((<C> * 9) / 5) + 32
Cyclical units with a wrap_at value automatically wrap-around on calculations,
e.g.
<degrees> = unit(name: "°", wrap_at: 360)
difference = 20<degrees> - 350<degrees>
difference == 30<degrees> // true
Logarithmic units can also be defined, e.g.
<dB> = unit(name: "decibel", plural: "decibels", logarithmic: true, base: 10)
20<dB> + 30<dB> == 30.4<dB> // true
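The wrap-around and decibel results above follow from ordinary arithmetic, sketched here in Python. The decibel helper assumes power quantities, i.e. a factor of 10 in front of the logarithm, which is consistent with the 30.4 shown above; both helper names are hypothetical.

```python
import math


def wrap(value: float, wrap_at: float) -> float:
    """Map a value into the range [0, wrap_at)."""
    return value % wrap_at  # Python's % is non-negative for a positive modulus


def add_decibels(a_db: float, b_db: float) -> float:
    """Add two power levels by summing in the linear domain."""
    linear = 10 ** (a_db / 10) + 10 ** (b_db / 10)
    return 10 * math.log10(linear)


print(wrap(20 - 350, 360))
print(round(add_decibels(20, 30), 1))
```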
Values with units are of the type quantity and can also be programmatically
defined. When a function expects a parameter of type unit, the <> can be
elided, e.g.
distance = quantity(20, km)
Computations with quantities propagate their units, e.g.
speed = 20<km> / 40<min>
speed == 0.5<km/min> // true
Quantities can be normalized to convertible units, e.g.
speed = 0.5<km/min>
speed.to(km/hour) == 30<km/hour>
The type system automatically prevents illegal calculations, e.g.
10<USD> + 20<min> // ERROR!
Units are also automatically derived on multiplication and division, e.g.
force = mass * acceleration // kg⋅m/s² (newtons)
energy = force * distance // kg⋅m²/s² (joules)
power = energy / time // kg⋅m²/s³ (watts)
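Unit propagation of this kind is commonly modelled by tracking an exponent for each base unit and adding or subtracting exponents on multiplication and division. The Python sketch below illustrates that general technique; it is an assumption about how such a system could work, not a description of Iroh's internals, and `combine` is a hypothetical helper.

```python
def combine(a, b, sign):
    """Multiply (sign=+1) or divide (sign=-1) two units given as exponent maps."""
    result = dict(a)
    for base, exponent in b.items():
        result[base] = result.get(base, 0) + sign * exponent
    # Drop base units whose exponents cancel out.
    return {base: exp for base, exp in result.items() if exp != 0}

mass = {"kg": 1}
acceleration = {"m": 1, "s": -2}
distance = {"m": 1}
time = {"s": 1}

force = combine(mass, acceleration, +1)  # kg⋅m/s²
energy = combine(force, distance, +1)    # kg⋅m²/s²
power = combine(energy, time, -1)        # kg⋅m²/s³

print(power)  # {'kg': 1, 'm': 2, 's': -3}
```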
Quantities default to using fixed128 values, but this can be customized by
using a different type during the value construction, e.g.
speed = bigdecimal(55.312)<km/s>
measurement = (1.5 ± 0.1)<m>
Quantities can be parsed from string values where a numeric value is suffixed with the unit, e.g.
timeout = quantity("30s")
timeout == 30<s> // true
By default all units that are available in the scope are supported. This can be
constrained by specifying the optional limit_to parameter, e.g.
block_size_limit = quantity("100MB", limit_to: [MB, GB])
block_size_limit == 0.1<GB> // true
Quantities can also be cast to strings, e.g.
timeout = 30<s>
string(timeout) == "30s"
Localized long form names for units can be used by setting long_form to
true. These default to the names given during unit definition, e.g.
string(1<s>, long_form: true) == "1 second" // true
string(30<s>, long_form: true) == "30 seconds" // true
When cast to a string, the most appropriate unit from a list can be
automatically selected by specifying closest_fit. This will find the largest
unit that gives a positive integer quantity, e.g.
time_taken = time.since(start) // 251<s>
string(time_taken, closest_fit: [s, min, hour]) == "4 minutes" // true
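The selection rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the unit table and the `closest_fit` function are hypothetical, and the sketch assumes that any remainder is simply discarded.

```python
# Hypothetical unit table: (seconds per unit, singular name, plural name).
UNITS = {
    "s": (1, "second", "seconds"),
    "min": (60, "minute", "minutes"),
    "hour": (3600, "hour", "hours"),
}

def closest_fit(seconds, candidates):
    """Format seconds using the largest candidate unit with a count >= 1."""
    by_size = sorted(candidates, key=lambda u: UNITS[u][0], reverse=True)
    for unit in by_size:
        size, singular, plural = UNITS[unit]
        count = seconds // size  # assumption: the remainder is discarded
        if count >= 1:
            return f"{count} {singular if count == 1 else plural}"
    # Nothing fits (e.g. zero seconds): fall back to the smallest unit.
    _, _, plural = UNITS[by_size[-1]]
    return f"0 {plural}"

print(closest_fit(150, ["s", "min", "hour"]))   # 2 minutes
print(closest_fit(45, ["s", "min", "hour"]))    # 45 seconds
print(closest_fit(7300, ["s", "min", "hour"]))  # 2 hours
```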
The humanize parameter automatically selects up to two of the largest units
that result in whole numbers, e.g.
string(
time.since(post.updated_time),
humanize: true,
limit_units_to: [s, min, hour, day, month, year]
)
// Outputs look something like:
// "2 minutes"
// "3 months and 10 days"
Types of a specific quantity can be referred to explicitly as quantity[unit],
e.g.
distance = 10<km>
type(distance) == quantity[km] // true
Some quantity types are aliased for convenience, e.g.
duration = quantity[s]
When parsing from a string to a quantity of a specific unit, it will be normalized from relatable units, e.g.
timeout = 30<s>
timeout = quantity("2min")
timeout == 120<s> // true
Quantity types using a specific numeric type and unit can be specified as
type<unit>, e.g.
func set_timeout(duration int<s>) {
...
}
set_timeout(10<km>) // ERROR! Invalid unit!
Variables holding numeric values must be explicitly multiplied by a unit so that the result can be clearly distinguished from a type with a unit, e.g.
duration = int<s> // type
type(duration) == type // true
value = 20
// To create a quantity from `value`, you must pass the
// variable and unit to the quantity constructor, e.g.
length = quantity(value, <km>)
// Or use the explicit type as a constructor:
length = int<km>(value)
// Or multiply the variable with the unit:
length = value * <km>
// But it would be an edit-time error to have the unit
// immediately after the variable name as it can be
// confused with a type definition:
length = value<km> // ERROR!
Relatedly, when a parameter is known to be of type unit, the surrounding angle
brackets can be elided, e.g.
func convert(length int<km>, to unit) {
...
}
convert(10<m>, ft)
Custom asset and currency units can be defined at runtime with a symbol
and name, e.g.
USDC = currency(symbol: "USDC", name: "USD Coin")
EURC = currency(symbol: "EURC", name: "EUR Coin")
Computations with currency units can be converted using live exchange rates at runtime, e.g.
fx_rate = 1.16<USDC/EURC>
total = 1000<EURC>
total.to(USDC, at: fx_rate) == 1160<USDC> // true
This allows type safety to be maintained, e.g. a GBP value can’t be
accidentally added to a USD value, while supporting explicit conversion, e.g.
gbp_subtotal = 200<GBP>
usd_subtotal = 100<USD>
total = gbp_subtotal + usd_subtotal // ERROR! Can't mix currencies
// This would work:
total = gbp_subtotal + usd_subtotal.to(GBP, at: fx_rate)
Range Values
Iroh provides a native range type for constructing a sequence of values
between a given start and end integer values, e.g.
x = range(start: 5, end: 9)
len(x) == 5 // true
// Prints: 5, 6, 7, 8, 9
for i in x {
print(i)
}
A next value can also be provided to deduce the “steps” to take between the
start and end, e.g.
x = range(start: 1, next: 3, end: 10)
// Prints: 1, 3, 5, 7, 9
for i in x {
print(i)
}
If next is not specified, it defaults to start + 1 if the start value is
less than or equal to end, otherwise it defaults to start - 1, e.g.
x = range(start: 5, end: 1)
// Prints: 5, 4, 3, 2, 1
for i in x {
print(i)
}
Likewise, if start is not specified, it defaults to 0, e.g.
x = range(end: 5)
// Prints: 0, 1, 2, 3, 4, 5
for i in x {
print(i)
}
Note that, unlike range in Python, our ranges are inclusive of the end
value. We believe this is much more intuitive — especially when it comes to
using ranges to slice values.
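The rules described so far (inclusive end, a step deduced from next, and the defaults for start and next) can be summarized in a short Python sketch. The `inclusive_range` function and its `nxt` parameter are hypothetical names used for illustration only.

```python
from typing import List, Optional

def inclusive_range(end: int, start: int = 0, nxt: Optional[int] = None) -> List[int]:
    """Build an inclusive range, deducing the step from start and nxt."""
    if nxt is None:
        # Default: count up when start <= end, otherwise count down.
        nxt = start + 1 if start <= end else start - 1
    step = nxt - start
    if step == 0:
        raise ValueError("next must differ from start")
    values = []
    current = start
    while (step > 0 and current <= end) or (step < 0 and current >= end):
        values.append(current)
        current += step
    return values

print(inclusive_range(start=5, end=9))          # [5, 6, 7, 8, 9]
print(inclusive_range(start=1, nxt=3, end=10))  # [1, 3, 5, 7, 9]
print(inclusive_range(start=5, end=1))          # [5, 4, 3, 2, 1]
print(inclusive_range(end=5))                   # [0, 1, 2, 3, 4, 5]
```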
Since ranges are frequently used, the shorthand start..end syntax from Perl is
available, e.g.
x = 5..10
x == range(start: 5, end: 10) // true
This is particularly useful in for loops, e.g.
// Prints: 5, 6, 7, 8, 9, 10
for i in 5..10 {
print(i)
}
Similar to Haskell, the next value can also be specified in the shorthand as
start,next..end, e.g.
// Prints: 5, 7, 9
for i in 5,7..10 {
print(i)
}
While range expressions do not support full sub-expressions, variable identifiers can be used in place of integer literals, e.g.
pos = 0
next = 2
finish = 6
// Prints: 0, 2, 4, 6
for i in pos,next..finish {
print(i)
}
When the start value is elided in range expressions, it defaults to 0, e.g.
finish = 6
// Prints: 0, 1, 2, 3, 4, 5, 6
for i in ..finish {
print(i)
}
Iroh does not provide syntax for ranges that are exclusive of its end value, as
having two separate syntaxes tends to confuse developers, e.g. .. and ... in
Ruby, .. and ..= in Rust, etc.
Array Data Types
Iroh supports arrays, i.e. ordered collections of elements of the same type with a length that is determined at compile-time, e.g.
// A 5-item array of ints:
x = [5]int{1, 2, 3, 4, 5}
// A 3-item array of strings:
y = [3]string{"Tav", "Alice", "Zeno"}
Array types are of the form [N]Type where N specifies the length and Type
specifies the type of each element. All unspecified array elements are
zero-initialized, e.g.
x = [3]int{}
x[0] == 0 and x[1] == 0 and x[2] == 0 // true
Elements at specific indices can also be explicitly specified for sparse initialization, e.g.
x = [3]int{2: 42}
x == {0, 0, 42} // true
The ; delimiter can be used within the length specifier to initialize with a
custom value, e.g.
x = [3; 20]int{2: 42}
x == {20, 20, 42} // true
For more complex defaults, a fill function can also be specified. This function is passed the index positions for all dimensions that can be optionally used, e.g.
x = [3]int{} << { $0 * 2 }
x == {0, 2, 4} // true
Using a custom fill function together with either a default value or with arrays that have elements initialized at specific indices will result in an error at edit time.
The length of an array can either be an integer literal or an expression that compile-time evaluates to an integer, e.g.
const n = 5
x = [n + 1]int{}
len(x) == 6 // true
The ... syntax can be used to let the compiler automatically evaluate the
length of the array from the number of elements, e.g.
x = [...]int{1, 2, 3}
len(x) == 3 // true
The element type can also be elided when it can be automatically inferred, e.g.
x = [3]{"Zeno", "Reia", "Zaia"} // type is [3]string
y = [...]{"Zeno", "Reia", "Zaia"} // type is [3]string
Array elements are accessed via [index], which is 0-indexed, e.g.
x = [3]int{1, 2, 3}
x[1] == 2 // true
Indexed access is always bounds checked for safety. Out of bounds access will generate a compile-time error for index values that are known at compile-time, and a runtime error otherwise.
Elements of an array can be iterated using for loops, e.g.
x = [3]int{1, 2, 3}
for elem in x {
print(elem)
}
If you want the index as well, destructure the iterator into 2 values, e.g.
x = [3]int{1, 2, 3}
for idx, elem in x {
print("Element at index ${idx} is: ${elem}")
}
Arrays can be compared for equality by using the == operator, which compares
each element of the array for equality, e.g.
x = [3]int{1, 2, 3}
if x == {1, 2, 3} {
// do something
}
Note that when the compiler can infer the type of something, e.g. on one side of a comparison where the type of the other side is already known, the type specification can be elided as above.
Array instances also have a bunch of utility methods, e.g.
- x.contains: checks if the array has a specific element.
- x.index_of: returns the first matching index of a specific element if it exists.
- x.reversed: returns a reversed copy of the array.
- x.sort: sorts the array as long as the elements are sortable.
Depending on the size of the array and how it’s used, the compiler will automatically allocate the array on the stack, on the heap, or even in registers.
By default, arrays are passed by value. If a function needs to mutate the array so that the changes are visible to the caller, it can specify the parameter type to be a pointer to an array, e.g.
func change_first_value(x *[3]int) {
x[0] = 21 // indexed access is automatically dereferenced
}
x = [3]int{1, 2, 3}
change_first_value(&x)
print(x) // [3]int{21, 2, 3}
If an array is defined as a const, it can’t be mutated, e.g.
const x = [3]int{1, 2, 3}
x[0] = 5 // ERROR!
Arrays with elements of the same type can be cast between different sizes, e.g.
x = [3]int{1, 2, 3}
y = [5]int(x) // when upsizing, missing values are zero-initialized
z = [2]int(x) // when downsizing, excess values are truncated
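The resizing behaviour of these casts can be modelled with a few lines of Python. This is a sketch of the described semantics only; the `resize` helper is a hypothetical name.

```python
def resize(values, length, zero=0):
    """Return a copy with exactly `length` items: truncate or zero-pad."""
    kept = values[:length]
    return kept + [zero] * (length - len(kept))

x = [1, 2, 3]
print(resize(x, 5))  # [1, 2, 3, 0, 0]
print(resize(x, 2))  # [1, 2]
```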
Multi-dimensional arrays can also be specified by stacking the array types, e.g.
x = [2][3]int{
{1, 2, 3},
{4, 5, 6}
}
This can also use the slightly clearer MxN syntax, e.g.
x = [2x3]int{
{1, 2, 3},
{4, 5, 6}
}
This can be extended to whatever depth is needed, e.g.
x = [2x3x4]int{}
y = [480x640x3]byte{}
Multi-dimensional arrays default to row-major layouts so as to be CPU cache-friendly. For use cases where a column-major layout is needed, this can be explicitly specified, e.g.
x = [2x3@col]int{}
This can also be set for an entire lexical scope using the with statement, e.g.
with {
.array_layout = .column_major
} {
x = [2x3]int{}
}
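The difference between the two layouts is only the order in which elements are placed in linear memory. The Python sketch below illustrates this using nested lists as a stand-in for a 2x3 array; `flatten` is a hypothetical helper, not an Iroh function.

```python
def flatten(matrix, column_major=False):
    """Return the elements in the order they would be laid out in memory."""
    rows, cols = len(matrix), len(matrix[0])
    if column_major:
        # Walk down each column before moving to the next one.
        return [matrix[r][c] for c in range(cols) for r in range(rows)]
    # Walk along each row before moving to the next one.
    return [matrix[r][c] for r in range(rows) for c in range(cols)]

m = [[1, 2, 3],
     [4, 5, 6]]

print(flatten(m))                     # [1, 2, 3, 4, 5, 6]
print(flatten(m, column_major=True))  # [1, 4, 2, 5, 3, 6]
```

Iterating in the same order as the layout touches adjacent memory, which is why the choice matters for cache behaviour.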
Such array layouts can be useful in domains like linear algebra calculations. The layout can also be transposed dynamically to a new array via casting, e.g.
x = [2x3]int{}
y = @col(x)
z = @row(y) // same layout as x
Or by using @transpose when you just want the opposite layout, e.g.
x = [2x3]int{}
y = @transpose(x)
z = @transpose(y) // same layout as x
Slice Data Types
Slices, like arrays, are used for handling ordered elements. Unlike arrays, which have a size that is fixed at compile-time, slices can be dynamically resized during runtime.
Slice types are of the form []Type where Type specifies the type of each
element. For example, a slice of ints can be initialized with:
x = []int{1, 2, 3}
Since slices are used often, a shorthand is available when the element type can be inferred, e.g.
x = [1, 2, 3]
Internally, slices point to a resizable array, keep track of their current length, i.e. the current number of elements, and specified capacity, i.e. allocated space:
slice = struct {
_data [*]array // pointer to an array
_length int
_capacity int
}
The built-in len and cap functions provide the current length and capacity
of a slice, e.g.
x = [1, 2, 3]
len(x) == 3 // true
cap(x) == 3 // true
Slices are dynamically resized as needed, e.g. if you append to a slice that is already at capacity:
x = [1, 2, 3]
x.append(4)
len(x) == 4 // true
cap(x) == 4 // true
When a slice’s capacity needs to grow, it is increased to the closest power of 2 that will fit the additional elements, e.g.
x = [1, 2, 3, 4]
x.append(5)
len(x) == 5 // true
cap(x) == 8 // true
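Both capacity examples are consistent with choosing the smallest power of 2 that is at least the required length. A Python sketch of that calculation, assuming this reading of "closest power of 2"; `grown_capacity` is a hypothetical name:

```python
def grown_capacity(needed):
    """Smallest power of 2 that is >= needed (for needed >= 1)."""
    return 1 << (needed - 1).bit_length()

print(grown_capacity(4))  # 4
print(grown_capacity(5))  # 8
print(grown_capacity(9))  # 16
```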
As slices are just views into arrays, they can be formed by “slicing” arrays with a range, e.g.
x = [5]int{1, 2, 3, 4, 5}
y = x[0..2]
y == [1, 2, 3] // true
Slices can also be sliced to form new slices, e.g.
x = [1, 2, 3, 4, 5]
y = x[0..2]
y == [1, 2, 3] // true
When a slice needs to grow beyond the underlying capacity, a new array is allocated and existing values are copied over. To keep things simple, Iroh uses copy-on-write on slice views.
For example, consider this Go code:
base := []int{1, 2, 3}
a := base[:2] // [1, 2]
b := base[1:] // [2, 3]
// If we now append to a, the underlying capacity is enough,
// so the base array is reused:
a = append(a, 100)
// At this point, a is [1, 2, 100]
// And b also sees the change, i.e. b is [2, 100]
// But if we append to a again, the underlying capacity is
// exceeded, so a new array is allocated:
a = append(a, 200)
// At this point, a is [1, 2, 100, 200]
// But b no longer sees the change, i.e. b is still [2, 100]
Iroh’s use of copy-on-write avoids this confusing behaviour, e.g.
base = []int{1, 2, 3}
a = base[..1] // [1, 2]
b = base[1..] // [2, 3]
// If we now append to a, it uses an independent array from
// base:
a.append(100)
// At this point, a is [1, 2, 100]
// And b is unaffected, i.e. b is [2, 3]
// If we append to a again, we get similar behaviour:
a.append(200)
// At this point, a is [1, 2, 100, 200]
// But b is still unaffected, i.e. b is [2, 3]
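To make the copy-on-write idea concrete, here is a toy Python model. It is a sketch of the general technique rather than Iroh's implementation: the `CowSlice` class is hypothetical, and it copies eagerly on the first write without any of the usage tracking a compiler could do.

```python
class CowSlice:
    """A toy copy-on-write view over a shared Python list."""

    def __init__(self, backing, start, end):
        self._backing = backing  # shared with other views until first write
        self._start = start
        self._end = end          # inclusive, like Iroh ranges
        self._owned = False

    def _detach(self):
        # Copy the viewed elements into a private list before mutating.
        if not self._owned:
            self._backing = self._backing[self._start:self._end + 1]
            self._start, self._end = 0, len(self._backing) - 1
            self._owned = True

    def append(self, value):
        self._detach()
        self._backing.append(value)
        self._end += 1

    def to_list(self):
        return self._backing[self._start:self._end + 1]

base = [1, 2, 3]
a = CowSlice(base, 0, 1)  # [1, 2]
b = CowSlice(base, 1, 2)  # [2, 3]

a.append(100)
a.append(200)

print(a.to_list())  # [1, 2, 100, 200]
print(b.to_list())  # [2, 3]
print(base)         # [1, 2, 3]
```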
Iroh tracks variable usage to minimize unnecessary copies. Slice operations can
be annotated as a @view to prevent copy-on-write and get Go-like behaviour,
e.g.
arr = [4]int{1, 2, 3, 4}
x = arr[2..3 @view] // [3, 4]
y = arr[.. @view] // [1, 2, 3, 4]
// Changes to the underlying array are reflected in other
// views, e.g.
x[0] = 10
x == [10, 4] // true
y == [1, 2, 10, 4] // true
// But if a slice exceeds the underlying capacity, it no
// longer points to the same array, e.g.
x.append(5)
// So changes to x are no longer reflected in other views,
// e.g.
x[0] = 11
x == [11, 4, 5] // true
y == [1, 2, 10, 4] // true
Individual elements of a slice can be accessed by indexing, e.g.
x = [1, 2, 3]
x[0] == 1 and x[1] == 2 and x[2] == 3 // true
Similar to Python, negative indices work as offsets from the end, e.g.
x = [1, 2, 3]
x[-1] == 3 // true
x[-2] == 2 // true
Slices can be iterated over by for loops, e.g.
x = [1, 2, 3]
// Prints: 1, 2, 3
for elem in x {
print(elem)
}
If you want the index as well, destructure the iterator into 2 values, just like arrays, e.g.
x = [1, 2, 3]
for idx, elem in x {
print("Element at index ${idx} is: ${elem}")
}
Using square brackets around a range expression acts as a shorthand for creating a slice from that range, e.g.
x = [1..5]
len(x) == 5 // true
x == [1, 2, 3, 4, 5] // true
If an actual slice consisting of range values is desired, then parentheses
need to be used, e.g.
x = [(1..5)]
len(x) == 1 // true
x[0] == range(start: 1, end: 5) // true
If the start value is omitted from the range when slicing, it defaults to 0, e.g.
x = [1, 2, 3, 4, 5]
y = x[..2]
y == [1, 2, 3] // true
If the end value is left off, it defaults to the index of the last element, i.e.
len(slice) - 1. If the start value is negative, it instead defaults to the
negative index of the first element, so the slice runs backwards, e.g.
x = [1, 2, 3, 4, 5]
y = x[2..]
z = x[-3..]
y == [3, 4, 5] // true
z == [3, 2, 1] // true
If both start and end are left off, it slices the whole range, e.g.
x = [1, 2, 3, 4, 5]
y = x[..]
y == [1, 2, 3, 4, 5] // true
Slicing with stepped ranges lets you get interleaved elements, e.g.
x = [10..20]
y = x[0,2..] // gets every other element
y == [10, 12, 14, 16, 18, 20] // true
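The slicing defaults shown in the examples above can be collected into one Python sketch. This is an interpretation inferred from those examples, not a specification: `inclusive_slice` and its `nxt` parameter are hypothetical names, and bounds checking is left out.

```python
def inclusive_slice(values, start=None, end=None, nxt=None):
    """Select elements using an inclusive start..end range with optional step."""
    n = len(values)
    if n == 0:
        return []
    if start is None:
        start = 0
    if end is None:
        # Non-negative start: run to the last element.
        # Negative start: run back to the first element.
        end = -n if start < 0 else n - 1

    def absolute(index):
        return index + n if index < 0 else index

    first, last = absolute(start), absolute(end)
    if nxt is None:
        step = 1 if first <= last else -1
    else:
        step = absolute(nxt) - first
        if step == 0:
            raise ValueError("next must differ from start")
    stop = last + (1 if step > 0 else -1)  # make Python's range inclusive
    return [values[i] for i in range(first, stop, step)]

x = [1, 2, 3, 4, 5]
print(inclusive_slice(x, end=2))     # [1, 2, 3]
print(inclusive_slice(x, start=2))   # [3, 4, 5]
print(inclusive_slice(x, start=-3))  # [3, 2, 1]
print(inclusive_slice(list(range(10, 21)), start=0, nxt=2))
# [10, 12, 14, 16, 18, 20]
```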
When initializing a slice, if you already know the minimum capacity that’s needed, it can be specified ahead of time so as to avoid reallocations, e.g.
x = []int(cap: 100)
len(x) == 0 // true
cap(x) == 100 // true
The length can also be specified, e.g.
x = []int(cap: 100, len: 5)
len(x) == 5 // true
cap(x) == 100 // true
If length is specified, but not capacity, then capacity defaults to the same value as length, e.g.
x = []int(len: 5)
len(x) == 5 // true
cap(x) == 5 // true
By default, elements are initialized to their zero value. An alternative default can be specified if desired, e.g.
x = []int(len: 5, default: 20)
len(x) == 5 // true
cap(x) == 5 // true
x[0] == 20 // true
x[4] == 20 // true
Slices have a broad range of utility methods, e.g.
- x.append: adds one or more elements to the end.
- x.all: returns true if all elements match the given predicate function.
- x.any: returns true if any element matches the given predicate function.
- x.choose: select n number of elements from the slice.
- x.chunk: split the slice into sub-slices of the specified length.
- x.clear: removes all elements.
- x.combinations: generate all possible combinations of the given length from the slice elements.
- x.contains: checks if the slice has a specific element.
- x.count: returns the number of elements that match the given predicate function.
- x.drop: remove the first n elements.
- x.drop_while: keep removing elements until the given predicate function fails.
- x.enumerated: returns a new slice where each element gives both the original element index and value.
- x.extend: add all elements from the given slice to the end.
- x.filter: return a new slice for elements matching the given predicate function.
- x.filter_map: applies a function to each element and keeps only the non-nil results as a new slice.
- x.first: return the first element, if any; also takes an optional predicate function, in which case, the first element, if any, that matches that predicate will be returned.
- x.flatten: flatten nested slices.
- x.flat_map: transform each element with the given function and then flatten nested slices.
- x.for_each: run a given function for each element in the slice.
- x.group_by: group elements by the given key function.
- x.index_of: returns the first matching index of a specific element if it exists.
- x.insert_at: inserts one or more elements at a given index.
- x.intersperse: insert a separator between all elements.
- x.join: join all elements into a string with the given separator.
- x.last: return the last element, if any.
- x.last_index_of: returns the last matching index of a specific element if it exists.
- x.map: return a new slice with each element transformed by a function.
- x.max: returns the maximum element; a custom comparison function can be given when the elements are non-comparable.
- x.min: returns the minimum element; a custom comparison function can be given when the elements are non-comparable.
- x.partition: split into two slices based on a predicate function.
- x.permutations: generate all possible permutations of the given length from the slice elements.
- x.pop: removes the last element from the slice and returns it.
- x.prepend: adds one or more elements to the start.
- x.reduce: reduce the elements into one value using a given accumulator function and initial value.
- x.remove_at: remove one or more elements at a given index.
- x.remove_if: remove elements matching the given predicate function.
- x.reverse: reverses the slice in-place.
- x.reversed: returns a reversed copy of the slice.
- x.scan: starting with an initial value and an accumulator function, keep applying to each element, and return a slice of all intermediary results.
- x.shift: removes the first element from the slice and returns it.
- x.shuffle: shuffles the slice in-place.
- x.shuffled: returns a shuffled copy of the slice.
- x.sort: sorts the slice in-place.
- x.sorted: returns a sorted copy of the slice.
- x.split_at: split into two separate slices at the given index.
- x.sum: for slices of elements that support the + operator, this returns the sum of all elements.
- x.take: return the first n elements.
- x.take_while: return all elements up to the point that the given predicate function fails.
- x.transpose: swap rows/columns for a slice of slices.
- x.unique: remove any duplicate elements.
- x.unzip: convert a slice of paired elements into two separate slices.
- x.window: create sub-slices that form a sliding window of the given length.
- x.zip: pair each element with elements from the given slice.
- x.zip_with: pair each element with elements from the given slice, and apply the given transformation function to each pair.
Methods like filter and map can use a Swift-like closure syntax for defining
their function parameters, e.g.
adult_names = people
.filter { $0.age > 18 }
.map { $0.name.upper() }
Likewise for sort, e.g.
people.sort { $0.age < $1.age }
// Or, if you want a stable sort:
people.sort(stable: true) { $0.age < $1.age }
Iroh fully inlines these functions, so calls like people.filter { $0.age > 18}
perform exactly like a hand-written for loop, while being much more
readable.
When combined with range initialization, the whole sequence effectively becomes lazy, minimizing unnecessary allocation, e.g.
result = [1..1000]
.filter { $0 % 2 == 0 }
.map { $0 * 2 }
.take(10)
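For comparison, the same pipeline can be written lazily in Python with generator expressions, where each stage pulls values on demand and no intermediate list of 1000 elements is built. This is an analogy for the behaviour described above, not a statement about how Iroh compiles the chain.

```python
from itertools import islice

evens = (n for n in range(1, 1001) if n % 2 == 0)  # filter stage
doubled = (n * 2 for n in evens)                   # map stage
result = list(islice(doubled, 10))                 # take(10)

print(result)  # [4, 8, 12, 16, 20, 24, 28, 32, 36, 40]
```

Only the first 20 source values are ever examined, since the pipeline stops as soon as 10 results have been produced.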
And since reuse analysis is done as part of memory management, Iroh can detect when it’ll be safe to reuse an existing allocation, e.g.
result = [1..1000]
.map { $0 * $0 } // Updates elements in place instead of creating a new slice.
Methods like append and insert_at support both inserting one element at a
time, e.g.
x = [1..3]
x.append(10)
x == [1, 2, 3, 10] // true
As well as appending multiple elements at once, e.g.
x = [1..3]
x.append(4, 5, 6)
x == [1, 2, 3, 4, 5, 6] // true
While all elements from another slice can be appended by using the ... splat
operator, e.g.
x = [1..3]
y = [5..7]
x.append(4, ...y)
x == [1, 2, 3, 4, 5, 6, 7] // true
The extend method should be used instead for most use cases, e.g.
x = [1..3]
y = [4..7]
x.extend(y)
x == [1, 2, 3, 4, 5, 6, 7] // true
When a slice is assigned to a new variable or passed in as a parameter, both refer to the same underlying slice, e.g.
x = [1..5]
y = x
y[0] = 20
x == [20, 2, 3, 4, 5] // true
y == [20, 2, 3, 4, 5] // true
The copy method needs to be used if an independent slice is required, e.g.
x = [1..5]
y = x.copy()
y[0] = 20
x == [1, 2, 3, 4, 5] // true
y == [20, 2, 3, 4, 5] // true
When copying slices where the elements are of a composite type, using the
deep_copy method will recursively call copy on all sub-elements that are
deep-copyable, e.g.
x = [
{"name": "Tav", "surname": "Siva"},
{"name": "Alice", "surname": "Fung"}
]
// Deep copies are independent of each other.
y = x.deep_copy()
y[1]["surname"] = "Siva"
x[1]["surname"] == "Fung" // true
y[1]["surname"] == "Siva" // true
// However, standard copying is shallow, so changes to
// sub-elements are visible through both slices.
z = x.copy()
z[1]["surname"] = "Siva"
x[1]["surname"] == "Siva" // true
z[1]["surname"] == "Siva" // true
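Python's standard `copy` module draws the same distinction between shallow and deep copies, which makes for a convenient runnable analogy. Iroh's own semantics may differ in detail; the data below simply mirrors the example above.

```python
import copy

x = [
    {"name": "Tav", "surname": "Siva"},
    {"name": "Alice", "surname": "Fung"},
]

# A deep copy duplicates the nested dictionaries as well.
y = copy.deepcopy(x)
y[1]["surname"] = "Siva"
print(x[1]["surname"])  # Fung

# A shallow copy duplicates only the outer list, so the nested
# dictionaries are still shared with the original.
z = copy.copy(x)
z[0]["name"] = "Zeno"
print(x[0]["name"])     # Zeno
```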
Slices can be compared for equality by using the == operator, which compares
each element of the slice for equality, e.g.
x = [1, 2, 3]
if x == [1, 2, 3] {
// do something
}
Element-wise comparisons use dotted operators like MATLAB and Julia. When
comparison operators like .== and .> are used, they produce boolean slices,
e.g.
x = [1..5]
x .== 2 // [false, true, false, false, false]
x .> 3 // [false, false, false, true, true]
These can then be used to do boolean/logical indexing, i.e. returning the slice
values matching the true values, as first introduced by MATLAB and made
popular by NumPy, e.g.
x = [1..10]
y = x[x .> 5]
y == [6, 7, 8, 9, 10] // true
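Boolean indexing amounts to building a mask and keeping the elements where the mask is true. A plain Python sketch of the two steps, written without NumPy so that it stays self-contained:

```python
x = list(range(1, 11))

# Element-wise comparison produces a boolean mask.
mask = [value > 5 for value in x]

# Indexing with the mask keeps only the elements marked True.
y = [value for value, keep in zip(x, mask) if keep]

print(mask[:6])  # [False, False, False, False, False, True]
print(y)         # [6, 7, 8, 9, 10]
```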
Full broadcasting is supported, where operations are applied element-wise to arrays, slices, and collections of different shapes and sizes, e.g.
x = [0..5]
y = [10..15]
z = x .+ y
z == [10, 12, 14, 16, 18, 20] // true
These operations will be automatically vectorized on architectures that support vectorization, e.g. SIMD on CPUs, GPUs, etc. This will happen transparently without needing custom hints.
Iroh supports Julia’s dotted function mechanism for broadcasting the function, i.e. applying it element-wise for each item in the input, e.g.
double = (n) => n * 2
x = [0..5]
y = double.(x)
y == [0, 2, 4, 6, 8, 10] // true
This also works with Iroh’s standard library, which provides functions like
sin and exp, linear algebra functions like inv and solve, stats
functions like mean and std, axis-aware reductions, etc.
Multi-dimensional slices can be created by nesting slice types, e.g.
x = [][]int{
{1, 2, 3},
{4, 5, 6}
}
Where the element type can be inferred, the shorthand syntax can be used, e.g.
x = [[1, 2, 3], [4, 5, 6]]
Or use ; to make it even shorter, e.g.
x = [1, 2, 3; 4, 5, 6]
x[0] == [1, 2, 3] // true
x[1] == [4, 5, 6] // true
Multiple ; delimiters can be used to make more nested structures, e.g.
x = [1, 2; 3, 4;; 5, 6; 7, 8]
len(x) == 2 // true
len(x[0]) == 2 // true
x[0][0][0] == 1 // true
Custom dimensions can also be specified to the constructor, e.g.
x = [2, 3]int{}
x == [0, 0, 0; 0, 0, 0] // true
By default, such multi-dimensional slices are zero-initialized. A different
default can be specified using ;, e.g.
x = [2, 3; 7]int{}
x == [7, 7, 7; 7, 7, 7] // true
A custom fill function can also be specified. This function is passed the index positions for all dimensions. This can be used, if needed, to derive the fill value, e.g.
x = [2, 3]int{} << { $0 + $1 }
x == [0, 1, 2; 1, 2, 3] // true
The default row-major ordering can be overridden with @col, e.g.
x = [2, 3@col]int{}
Or as explicit parameters to the constructor, e.g.
x = []int(shape: (2, 3), layout: .column_major, default: 7)
Existing arrays/slices can be reshaped easily, e.g.
x = [0..5]
// Cast it to a different shape to reshape it:
y = [2, 3]int(x)
// Or use the fluent syntax within method chains:
z = x.reshape(2, 3)
y == [0, 1, 2; 3, 4, 5] // true
y == z // true
Casting to a new slice will also do type conversion on the elements if the two slices have different element types. Any optional parameters will be passed along for the type conversion, e.g.
x = []int32{1, 2, 3}
// No parameters needed as upcasting is safe:
y = []int64(x)
// Specify type conversion parameter for downcasting, e.g.
z = []int8(x, policy: truncate)
Standard arithmetic operations on slices tend to do matrix operations, e.g.
x = [1, 2; 3, 4]
y = [5, 6; 7, 8]
// Matrix multiplication:
z = x * y
z == [19, 22; 43, 50] // true
Slices can be indexed using other slices to get the elements at the given indices, e.g.
x = [1, 2, 3, 4, 5]
y = x[[0, 2]]
y == [1, 3] // true
Multi-dimensional slices can also be sliced using ;, e.g.
x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
x[..; 1] == [2, 5, 8] // Column/Dimension slice.
x[..1; 1..2] == [2, 3; 5, 6] // Range slice.
x[0,2..; -1..] == [3, 2, 1; 9, 8, 7] // Step slice.
x[[0, 2]; [1, 0]] == [2, 7] // Index slice.
Slice ranges can also be assigned to, e.g.
x = [1..5]
x[..2] = [7, 8, 9]
x == [7, 8, 9, 4, 5] // true
Likewise for multi-dimensional slices, e.g.
x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
x[..; 1] = [23, 24, 25]
x == [1, 23, 3; 4, 24, 6; 7, 25, 9] // true
The .= broadcasting assignment operator can be used to assign a value to each
element in a range, e.g.
x = [1..5]
x[..2] .= 9
x == [9, 9, 9, 4, 5] // true
Slices support a broad range of operations that make them easy to use in domains such as linear algebra, machine learning, and physics, e.g.
c = a * b // Matrix/vector multiplication.
c = a · b // Inner/dot product.
c = a ⊗ b // Tensor product (Outer product for vectors; Kronecker product for matrices).
c = a × b // Cross product.
The operations work as one would expect when they are applied to different types, e.g. multiplying by a scalar value will automatically broadcast, use matrix-vector multiplication when needed, etc.
Slices also support matrix norms and determinants with similar syntax to what’s used in mathematics, e.g.
|a| // Determinant for square matrices
||a|| // Vector: 2-norm, Matrix: Frobenius norm
||a||_0 // Vector/Matrix: Count non-zeros
||a||_1 // Vector: 1-norm, Matrix: 1-norm
||a||_2 // Vector: 2-norm, Matrix: spectral norm
||a||_inf // Vector: max norm, Matrix: infinity norm
||a||_max // Vector: max norm, Matrix: max absolute element
||a||_nuc // Matrix: nuclear norm
||a||_p // Vector: p-norm, Matrix: Schatten p-norm
||a||_quad // Vector: quadratic norm
||a||_w // Vector: weighted norm, Matrix: weighted Frobenius
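For vectors, several of these norms reduce to short formulas. The Python sketch below covers only the vector cases for the 0, 1, 2, p, and infinity subscripts, using their standard mathematical definitions; `vector_norm` is a hypothetical helper, not an Iroh function.

```python
import math

def vector_norm(v, p=2):
    """Compute the p-norm of a vector given as a list of numbers."""
    if p == 0:
        return sum(1 for x in v if x != 0)   # count of non-zeros
    if p == math.inf:
        return max(abs(x) for x in v)        # max norm
    return sum(abs(x) ** p for x in v) ** (1 / p)

v = [3, -4, 0]
print(vector_norm(v, 0))         # 2
print(vector_norm(v, 1))         # 7.0
print(vector_norm(v, 2))         # 5.0
print(vector_norm(v, math.inf))  # 4
```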
Subscripts with {i,j,k,l} syntax can be used on slices for operations in
Einstein notation, e.g.
hidden{i,k} = input{i,j} * weights1{j,k} + bias1
output{i,l} = hidden{i,k} * weights2{k,l} + bias2
This is more readable than how frameworks like NumPy handle it, i.e.
hidden = np.einsum("ij,jk->ik", input, weights1) + bias1
output = np.einsum("ij,jk->ik", hidden, weights2) + bias2
This also removes the need for a number of functions, e.g. instead of having
a trace function, diagonal elements can be easily summed with:
x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
trace = x{i,i}
trace == 15 // true
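Under the usual Einstein summation convention, an index that is repeated is summed over. The Python sketch below spells out the loops behind two of the expressions above using nested lists; `contract_ij_jk` and `trace` are hypothetical helper names, and bias terms are left out.

```python
def contract_ij_jk(a, b):
    """Compute out[i][k] = sum over j of a[i][j] * b[j][k]."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(cols):
            for j in range(inner):  # the repeated index j is summed
                out[i][k] += a[i][j] * b[j][k]
    return out

def trace(x):
    """Compute x{i,i}: the repeated index i sums the diagonal."""
    return sum(x[i][i] for i in range(len(x)))

print(contract_ij_jk([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
print(trace([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # 15
```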
Iroh’s approach also results in higher quality code, e.g. dimension mismatches can be caught at compile-time, the compiler can automatically optimize for the best operation order, etc.
String Data Types
Iroh defaults to Unigraph text as it fixes many of the shortcomings of Unicode. Two specific Unigraph types are provided:
- string: the default type that’s suitable for most use cases.
- composable_string: a special type that includes the separately namespaced Unigraph Compose Element IDs that’s useful for building text editors.
Both data types are encoded in UEF (Unigraph Encoding Format). But, where
string only consists of Unigraph IDs, composable_string can include both
Unigraph IDs and Compose Element IDs.
A number of string types are also provided for dealing with Unicode text:
- utf8_string: UTF-8 encoded text.
- wtf8_string: WTF-8 for roundtripping between UTF-8 and broken UTF-16, e.g. on Windows where you can get unpaired surrogates in filenames.
In addition, the encoded_string type takes specific encodings, e.g.
x = encoded_string("Hello", encoding: .iso_8859_1)
The supported encodings include:
enum {
ascii, // ASCII
big5, // Big 5
cesu_8, // CESU-8
cp037, // IBM Code Page 37
cp437, // IBM Code Page 437
cp500, // IBM Code Page 500
cp850, // IBM Code Page 850
cp866, // IBM Code Page 866
cp1026, // IBM Code Page 1026
cp1047, // IBM Code Page 1047
cp1361, // IBM Code Page 1361
custom, // User-defined
euc_jp, // EUC-JP
euc_kr, // EUC-KR
gb2312, // GB2312
gb18030, // GB18030
gbk, // GBK
hz_gb2312, // HZ-GB2312
iscii, // ISCII
iso_2022_jp, // ISO-2022-JP
iso_2022_kr, // ISO-2022-KR
iso_8859_1, // ISO-8859-1
iso_8859_2, // ISO-8859-2
iso_8859_3, // ISO-8859-3
iso_8859_4, // ISO-8859-4
iso_8859_5, // ISO-8859-5
iso_8859_6, // ISO-8859-6
iso_8859_7, // ISO-8859-7
iso_8859_8, // ISO-8859-8
iso_8859_8_i, // ISO-8859-8-I
iso_8859_9, // ISO-8859-9
iso_8859_10, // ISO-8859-10
iso_8859_11, // ISO-8859-11
iso_8859_13, // ISO-8859-13
iso_8859_14, // ISO-8859-14
iso_8859_15, // ISO-8859-15
iso_8859_16, // ISO-8859-16
johab, // Johab (KS C 5601-1992)
koi8_r, // KOI8-R
koi8_u, // KOI8-U
mac_cyrillic, // Mac OS Cyrillic
mac_greek, // Mac OS Greek
mac_hebrew, // Mac OS Hebrew
mac_roman, // Mac OS Roman
mac_thai, // Mac OS Thai
mac_turkish, // Mac OS Turkish
shift_jis, // Shift_JIS
tis_620, // TIS-620
tscii, // TSCII
ucs_2, // UCS-2
ucs_4, // UCS-4
uef, // UEF
utf_7, // UTF-7
utf_8, // UTF-8
utf_16, // UTF-16
utf_16be, // UTF-16BE
utf_16le, // UTF-16LE
utf_32, // UTF-32
utf_32be, // UTF-32BE
utf_32le, // UTF-32LE
viscii, // VISCII
windows_874, // Windows-874
windows_1250, // Windows-1250
windows_1251, // Windows-1251
windows_1252, // Windows-1252
windows_1253, // Windows-1253
windows_1254, // Windows-1254
windows_1255, // Windows-1255
windows_1256, // Windows-1256
windows_1257, // Windows-1257
windows_1258, // Windows-1258
wtf_8, // WTF-8
}
Any other encoding can be used by specifying a .custom encoding and providing
an encoder that implements the built-in string_encoder interface, e.g.
x = encoded_string(text, encoding: .custom, encoder: scsu)
String literals default to the string type and are UEF-encoded. They are
constructed using double quoted text, e.g.
x = "Hello world!"
For convenience, pressing backslash, i.e. \, in the Iroh editor will provide a
mechanism for entering characters that may otherwise be difficult to type, e.g.
| Sequence | Definition |
|---|---|
| \a | ASCII Alert/Bell |
| \b | ASCII Backspace |
| \f | ASCII Form Feed |
| \n | ASCII Newline |
| \r | ASCII Carriage Return |
| \t | ASCII Horizontal Tab |
| \v | ASCII Vertical Tab |
| \\ | ASCII Backslash |
| \x + NN | Hex Bytes |
| \u + NNNN | Unicode Codepoint |
| \u{ + N... + } | Unigraph ID |
| \U + NNNNNNNN | Unicode Codepoint |
As the Iroh editor automatically converts text into bytes as it is typed, this is purely a convenience for typing and rendering strings. No escaping is needed internally.
String values can be easily cast between the different string types, e.g.
x = utf8_string("Hello world!")
y = encoded_string(x, encoding: .windows_1252)
z = string(y)
x == z // true
For safety, Unicode text is parsed in .strict mode by default, i.e. invalid
characters generate an error. But this can be changed if needed, e.g.
// Skip characters that can't be encoded/decoded:
x = utf8_string(value, errors: .ignore)
// Replace invalid characters with the U+FFFD replacement character:
y = utf8_string(value, errors: .replace)
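For readers coming from Python, these three error modes resemble the errors argument of bytes.decode. The sketch below is only an analogy written in Python, not Iroh's implementation, and the sample bytes are made up for illustration.

```python
# Analogy only: Python's decode error handlers, which resemble the
# .strict, .ignore, and .replace modes described above.
raw = b"caf\xff"  # hypothetical input: the byte 0xFF is never valid in UTF-8

# Strict handling raises an error on invalid input.
try:
    raw.decode("utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Ignore handling silently drops the invalid byte.
ignored = raw.decode("utf-8", errors="ignore")

# Replace handling substitutes U+FFFD for the invalid byte.
replaced = raw.decode("utf-8", errors="replace")

print(strict_failed, repr(ignored), repr(replaced))
```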
Iroh also provides a null-terminated c_string that is compatible with C. This
can be safely converted between the other types, e.g.
// This c_path value is automatically null-terminated when
// passed to C functions. The memory management for the
// value is also automatically handled by Iroh.
c_path = c_string("/tmp/file.txt")
// When converted back to any of the other string types,
// the null termination is automatically removed.
path = string(c_path)
path == "/tmp/file.txt" // true
The c_string type will generate errors when strings that are being converted
have embedded nulls in them. These need to be manually handled with try, e.g.
c_path = try c_string("/tmp/f\x00ile.txt")
The length of any string value can be found by calling len on it, e.g.
x = "Hello world"
len(x) == 11 // true
Calling len returns what most people would expect, e.g.
x = utf8_string("🤦🏼♂️")
len(x) == 1 // true
Except for C strings, where it returns the number of bytes, Iroh always defines the length of a string as the number of visually distinct graphemes:
- For Unicode strings, this is the number of extended grapheme clusters. Note that Unicode’s definition of grapheme clusters can change between different versions of the standard.
- For Unigraph strings, this is the number of Unigraph IDs which are already grapheme based. Unlike with Unicode strings, this number will never change with new versions of the standard.
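To see why a grapheme-based length is the least surprising choice, it helps to compare it with the lengths other representations report. The Python sketch below spells out a man-facepalming emoji with a skin tone modifier (the kind of grapheme used in the example above) as explicit code points. Python has no grapheme-aware len in its standard library, so this only shows the counts that a grapheme-based len avoids exposing.

```python
# One visible grapheme, written out as its five Unicode code points:
# U+1F926 person facepalming, U+1F3FC skin tone modifier,
# U+200D zero width joiner, U+2642 male sign, U+FE0F variation selector.
facepalm = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"

code_points = len(facepalm)                           # Python counts code points
utf8_bytes = len(facepalm.encode("utf-8"))            # bytes when UTF-8 encoded
utf16_units = len(facepalm.encode("utf-16-le")) // 2  # 16-bit units in UTF-16

print(code_points, utf8_bytes, utf16_units)  # 5 17 7
```

None of these three numbers matches what a reader sees on screen, which is a single grapheme.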
Strings can be concatenated by using the + operator, e.g.
x = "Hello"
y = " world"
z = x + y
z == "Hello world" // true
Multiplying strings with an integer using the * operator will duplicate the
string that number of times, e.g.
x = "a"
y = x * 5
y == "aaaaa" // true
Individual graphemes can be accessed from a string using the [] index
operator, e.g.
x = "Hello world"
y = x[0]
y == 'H' // true
Unigraph graphemes are represented by the grapheme type, Unicode ones by the
unicode_grapheme type, and all other encodings by the encoded_grapheme type
with a specific string_encoding.
Grapheme literals are quoted within single quotes, e.g.
char = '🤦🏼♂️'
When graphemes are added or multiplied, they produce strings, e.g.
x = 'G' + 'B'
x == "GB" // true
Strings can also be sliced similar to slice types, e.g.
x = "Hello world"
y = x[..4]
y == "Hello" // true
All string types support the typical set of methods, e.g.
- x.ends_with: checks if a string ends with a suffix.
- x.index_of: finds the position of a substring.
- x.join: join a slice of strings with the value of x.
- x.lower: lower case the string.
- x.pad_end: pad the end of a string.
- x.pad_start: pad the start of a string.
- x.replace: replace all occurrences of a substring.
- x.split: split the string on the given substring or grapheme.
- x.starts_with: checks if a string starts with a prefix.
- x.strip: remove whitespace from both ends of a string.
- x.upper: upper case the string.
For safety, all string values are immutable. They can be cast to a []byte or
[]grapheme to explicitly mutate individual bytes or graphemes, e.g.
x = "Hello world!"
y = []grapheme(x)
y[-1] = '.'
string(y) == "Hello world." // true
As a convenience, Iroh will automatically cast between strings and grapheme slices when values are passed in as parameters. The compiler will also try to minimize unnecessary allocations.
To control how values are formatted, we use explicit parameters, e.g.
x = 255
y = string(x, format: .hex_prefixed)
y == "0xff" // true
Unlike the cryptic format specifiers used in languages like Python, e.g.
amount = 9876.5432
y = f"{amount:>10,.2f}"
y == "  9,876.54" # true
We believe this is a lot clearer, i.e.
amount = 9876.5432
y = string(amount, scale: 2, thousands_separator: ",", width: 10, align: .right)
y == "  9,876.54" // true
A range of format specifiers are available, e.g. quoted strings can be generated
by specifying the optional quote parameter:
x = string("Tav", quote: true)
y = string(42, quote: true)
z = string("Hello \"World\"", quote: true)
print(x) // Outputs: "Tav"
print(y) // Outputs: "42"
print(z) // Outputs: "Hello \"World\""
When long composite values like slices, arrays, maps, and strings are being stringified, they can be truncated using:
- max_len: specify the maximum number of elements before truncation is applied.
- truncate: specify whether to apply the ... truncation at the end or the middle.
The optional to parameter can be used to emit information about the value,
e.g.
string(x, to: .address) // Pointer address, e.g. 0x7ffde4a3b210
string(x, to: .repr) // Syntax for construction, e.g. int32(456)
// When debugging complex/long values:
string(x, to: .repr_short)
// Is the same as:
string(x, to: .repr, max_len: 50, truncate: .middle)
Other format parameters include:
- accounting: to use accounting format for numeric values, i.e. enclose negative values in parentheses.
- currency, currency_position, and currency_space: for formatting monetary values.
- fill: custom padding character.
- indent: to control the amount and characters used for indentation.
- pad_with: pad the output with a given grapheme up to a given width.
- sign: to control the display of positive/negative signs.
- truncate_at: control how a value gets truncated, e.g. at word boundaries, with a trailing ellipsis, etc.
They can also be provided as parameters on the format method on string values,
e.g.
x = "Tav"
y = x.format(quote: true)
print(y) // Outputs: "Tav"
Sub-expressions can be interpolated into strings using 3 different forms:
- ${expr}: evaluates the expression and converts the result into a string.
- ={expr}: same as above, but also prefixes the result with the original expression followed by the = sign.
- #{expr}: validates the expression, but instead of evaluating it, passes through the “parsed” syntax.
Interpolation using ${expr} works as one would expect, e.g.
name = "Tav"
print("Hello ${name}!") // Outputs: Hello Tav!
It supports any Iroh expression, e.g.
name = "Tav"
print("Hello ${name.filter { $0 != "T" }}!") // Outputs: Hello av!
Interpolated expressions are wrapped within string( ... ) calls, so any
formatting parameters passed to string can be specified after a comma, e.g.
amount = 9876.5432
print("${amount, scale: 2, thousands_separator: ","}") // Outputs: 9,876.54
Interpolation using ={expr} is useful for debugging as it also prints the
expression being evaluated, e.g.
x = 100
print("={x + x}") // Outputs: x + x = 200
Interpolation using #{expr} is only valid within template strings, where it
allows for domain-specific evaluation of Iroh expressions, e.g.
time.now().format("#{weekday_short}, #{day} #{month_full} #{year4}")
// Outputs: Tue, 5 August 2025
Any function that takes a template value can be used to construct tagged
template literals. These functions evaluate literals at compile-time for optimal
performance, e.g.
content = html`<!doctype html>
<body>
<div>Hello!</div>
</body>`
The casing of identifiers can be changed with ident_casing, e.g.
x = "user_id"
x.ident_casing(from: .snake_case, to: .camel_case) // userID
x.ident_casing(from: .snake_case, to: .kebab_case) // user-id
x.ident_casing(from: .snake_case, to: .pascal_case) // UserID
There’s a default set of initialisms like API, ID, and HTTP, which are
always capitalized in camel and pascal cases. These can be overridden, e.g.
x = "user_id"
y = "json_response"
x.ident_casing(
from: .snake_case,
to: .camel_case,
override_initialisms: []
) // userId
y.ident_casing(
from: .snake_case,
to: .pascal_case,
extend_initialisms: ["JSON"]
) // JSONResponse
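The interaction between casing and initialisms can be modelled in a few lines. The following Python sketch is a hypothetical model of the behaviour shown above, not Iroh's implementation. The function name convert, the string names for the target cases, and the choice to keep the first word of a camel case identifier in lower case even when it is an initialism are all assumptions made for illustration.

```python
# Hypothetical model of snake_case conversion with initialisms.
# Only API, ID, and HTTP are named in the text; the real default
# set may be larger.
DEFAULT_INITIALISMS = frozenset({"API", "HTTP", "ID"})

def convert(ident, to, initialisms=DEFAULT_INITIALISMS):
    words = ident.split("_")
    if to == "kebab":
        return "-".join(word.lower() for word in words)
    out = []
    for i, word in enumerate(words):
        if to == "camel" and i == 0:
            out.append(word.lower())    # camelCase keeps the first word lower
        elif word.upper() in initialisms:
            out.append(word.upper())    # initialisms are fully capitalized
        else:
            out.append(word.capitalize())
    return "".join(out)

print(convert("user_id", "camel"))                          # userID
print(convert("user_id", "camel", initialisms=frozenset())) # userId
print(convert("json_response", "pascal",
              initialisms=DEFAULT_INITIALISMS | {"JSON"}))  # JSONResponse
```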
Finally, the []byte and string types support from and to methods to
convert between different binary-to-text formats, e.g.
x = "Hello world"
y = x.to(.hex)
z = y.from(.hex)
y == "48656c6c6f20776f726c64" // true
z == "Hello world" // true
The supported formats include:
enum {
ascii85, // Ascii85 (Adobe variant)
ascii85_btoa, // Ascii85 (btoa variant)
base2, // Base2 (binary string as text)
base16, // Base16 (alias of hex)
base32, // Base32 (RFC 4648, padded)
base32_unpadded, // Base32 (RFC 4648, unpadded)
base32_crockford, // Base32 (Crockford, padded)
base32_crockford_unpadded, // Base32 (Crockford, unpadded)
base32_hex, // Base32 (RFC 4648, extended hex alphabet, padded)
base32_hex_unpadded, // Base32 (RFC 4648, extended hex alphabet, unpadded)
base36, // Base36 (0-9, A-Z)
base45, // Base45 (RFC 9285, QR contexts)
base58_bitcoin, // Base58 (Bitcoin alphabet)
base58_check, // Base58 with 4-byte double-SHA256 checksum (no version)
base58_flickr, // Base58 (Flickr alphabet)
base58_ripple, // Base58 (Ripple alphabet)
base62, // Base62 (URL shorteners, compact IDs)
base64, // Base64 (RFC 4648, padded)
base64_unpadded, // Base64 (RFC 4648, unpadded)
base64_urlsafe, // Base64 (RFC 4648, URL-safe, padded)
base64_urlsafe_unpadded, // Base64 (RFC 4648, URL-safe, unpadded)
base64_wrapped_64, // Base64 (64-char wrapping, no headers)
base64_wrapped_76, // Base64 (76-char wrapping, no headers)
base85_rfc1924, // Base85 (RFC 1924 for IPv6 addresses)
base91, // Base91 (efficient binary-to-text)
base122, // Base122 (efficient binary-to-utf8-codepoints)
binary, // Binary string (alias of base2)
binary_prefixed, // Binary string (0b prefix)
hex, // Hexadecimal (lowercase)
hex_prefixed, // Hexadecimal (lowercase, 0x prefix)
hex_upper, // Hexadecimal (uppercase)
hex_upper_prefixed, // Hexadecimal (uppercase, 0x prefix)
hex_eip55, // Hexadecimal (EIP-55 mixed-case checksum)
html_attr_escape, // Escape HTML attribute values
html_escape, // Escape HTML text
json_escape, // JSON string escaping
modhex, // ModHex (YubiKey alphabet)
percent, // Percent/URL encode
percent_component, // Percent/URL encode (RFC 3986, component-safe set)
punycode, // Punycode (internationalized domains)
quoted_printable, // Quoted-printable (email)
rot13, // ROT13 letter substitution
rot47, // ROT47 ASCII substitution
uuencode, // UUEncoding (data body only, fixed line length)
xml_attr_escape, // Escape XML attribute values
xml_escape, // Escape XML character data
xxencode, // XXEncoding (data body only, fixed line length)
yenc, // yEnc (Usenet binary)
z85, // Z85 (ZeroMQ variant)
z_base32, // Base32 (Zooko, z-base-32 human-friendly variant)
}
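As a cross-check of what a few of these formats produce, the same "Hello world" input can be run through Python's standard library. This is only a reference sketch using Python's own APIs for the hex, base64, and base32 formats; it says nothing about how Iroh implements its to and from methods.

```python
import base64

data = "Hello world".encode("utf-8")

hex_text = data.hex()                              # lowercase hexadecimal
b64_text = base64.b64encode(data).decode("ascii")  # RFC 4648 Base64, padded
b32_text = base64.b32encode(data).decode("ascii")  # RFC 4648 Base32, padded

# Decoding reverses each transformation.
from_hex = bytes.fromhex(hex_text).decode("utf-8")
from_b64 = base64.b64decode(b64_text).decode("utf-8")
from_b32 = base64.b32decode(b32_text).decode("utf-8")

print(hex_text)  # 48656c6c6f20776f726c64
print(b64_text)  # SGVsbG8gd29ybGQ=
print(b32_text)  # JBSWY3DPEB3W64TMMQ======
```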
Optionals
Iroh supports optional types which can either be nil or a value of a
specific type, e.g.
x = optional[string](nil)
x == nil // true
x = "Hello"
x == "Hello" // true
A shorthand ?type syntax is available too and defaults to nil if no value is
provided, e.g.
x = ?string()
x == nil // true
x = "Hello"
x == "Hello" // true
For the following examples, let’s assume an optional name field within a
Person struct, e.g.
Person = struct {
name ?string
}
Optionals default to nil, e.g.
person = Person{}
person.name == nil // true
Values can be easily assigned, e.g.
person = Person{}
person.name = "Alice"
When assigning nil to nested optional types, it either needs to be explicitly
disambiguated, or the nil value applies to the top-most level, e.g.
x = ??string("Test")
x = ?string(nil) // Inner optional
x = nil // Outer optional
Optional values can be explicitly unwrapped by using the unwrap method, e.g.
name = person.name.unwrap()
type(name) == string // true
Unwrapping a nil value will generate an error. To safely unwrap, conditionals
can be used, e.g.
if person.name {
print(person.name.upper()) // person.name has been unwrapped into a string here
} else {
// Handle the case where person.name is nil
}
As Iroh automatically narrows types based on conditionals, and explicit unwraps
with the ! operator, it makes the usage of optionals both simple and safe.
Variables and fields that were declared as optional remain optional even after being type narrowed via an unwrap, so nil can still be assigned to them, e.g.
if person.name {
type(person.name) == string // true
person.name = nil // Valid assignment.
}
Multiple optionals can be unwrapped in the same conditional, e.g.
if person.name and vip_list {
// Both person.name and vip_list are unwrapped here.
}
Optional lookups can be chained with a ? where the value at the end of a chain
is always an optional, e.g.
display_name = person.name?.upper()
type(display_name) == ?string // true
The or operator can be used to provide a default if the left side is nil,
e.g.
display_name = person.name or "ANONYMOUS"
type(display_name) == string // true
Assignments can use ||= to only assign to a variable or field if it is currently
nil, e.g.
person.name ||= "Alice"
Optionals work wherever an expression returns a value, e.g. after map lookups:
users[id]?.update_last_seen()
And can be used wherever types are accepted, e.g. in function parameters:
func greet(name ?string) {
print("Hello ${name or "there"}!")
}
When comparisons are made to a typed value, optionals are automatically unwrapped as necessary, e.g.
// The person.name value is implicitly unwrapped safely
// before the comparison:
if person.name == "Alice" {
// person.name has been unwrapped to a string with
// value "Alice"
} else {
// person.name can be either nil or a string value
// other than "Alice"
}
The various facets of our optionals system eliminate the issues caused by nil pointers, while maintaining a clean, readable syntax with predictable behaviour.
Map & Set Data Types
Iroh’s map data types provide support for managing key/value data within hash
tables. Map types are generic over their key and value type, e.g.
profit = map[string]fixed128{
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
Where the types can be inferred, map values can be constructed using a {}
literal syntax, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
Duplicate keys in map literals will cause an edit-time error, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
"Consumer": 2239.50, // ERROR!
}
As types cannot be inferred from an empty {} map literal, empty maps can only
be initialized with their explicit type, e.g.
profit = {} // ERROR!
profit = map[int]fixed128{} // Valid.
Maps are transparently resized as keys are added to them. To minimize the amount of resizing, maps can be initialized with an initial capacity, e.g.
// This map will only get resized once there are more
// than 5,000 keys in it:
profit = map[int]fixed128(cap: 5000)
Keys can be set to a value by assigning using the [] index operator, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
profit["Developer"] = 9368.30
len(profit) == 3 // true
Values assigned to a key can also be retrieved using the [] operator, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
consumer_profit = profit["Consumer"]
consumer_profit == 1739.50 // true
Assigning a value to a key will automatically overwrite any previous value
associated with that key. To only assign a value if the key doesn’t exist, the
set_if_missing method can be used, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
// No effect as key exists.
profit.set_if_missing("Consumer", 0.0)
profit["Consumer"] == 1739.50
// Value set for key as it doesn't already exist.
profit.set_if_missing("Developer", 9368.30)
profit["Developer"] == 9368.30
When values are retrieved from a map using the [] operator, the zero value is
returned if the key has not been set, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
dev_profit = profit["Developer"]
dev_profit == 0.0 // true
If an alternative default value is desired, this can be specified using the
default parameter when constructing the map, e.g.
seen = map[string]bool(default: true)
tav_seen = seen["Tav"]
tav_seen == true // true
A custom initializer function can also be specified that derives a default value from the key, e.g.
// Users are fetched from the DB on first lookup.
user_cache = map[int]User(default: { db.get_user(id: $0) })
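The idea of deriving a default from the key can be modelled in Python with a dict subclass. This is a hypothetical sketch of the semantics only: the class name KeyDefaultMap and the stand-in fake_get_user are invented for illustration, and storing the derived value after the first lookup is an assumption based on the "fetched on first lookup" comment above.

```python
class KeyDefaultMap(dict):
    """Hypothetical map whose missing values are derived from the key."""

    def __init__(self, default):
        super().__init__()
        self._default = default

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent.
        value = self._default(key)
        self[key] = value  # assumption: the derived value is stored
        return value

calls = []

def fake_get_user(user_id):
    # Stand-in for the db.get_user call in the example above.
    calls.append(user_id)
    return {"id": user_id}

user_cache = KeyDefaultMap(fake_get_user)
first = user_cache[7]   # triggers the initializer
second = user_cache[7]  # served from the map, no second fetch
print(calls)  # [7]
```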
To check whether a key exists in a map, the in operator can be used, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
consumer_profit_exists = "Consumer" in profit
consumer_profit_exists == true // true
if "Developer" in profit {
// As "Developer" has not been set in the map, this code
// block will not execute.
}
To safely access a key’s value, the get method can be used. This will return
nil for non-existent keys, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
consumer_profit = profit.get("Consumer")
consumer_profit == 1739.50 // true
dev_profit = profit.get("Developer")
dev_profit == nil // true
To minimize any confusion caused by nested optionals being returned by the
get method, maps do not support using optional types as keys, e.g.
lookup = map[?string]int{} // ERROR!
The get method can be passed an optional default parameter that will be
returned if the given key is not found, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
dev_profit = profit.get("Developer", default: 0.0)
dev_profit == 0.0 // true
Keys can be removed using the delete method, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
profit.delete("Consumer")
len(profit) == 1 // true
Maps support equality checks, e.g.
x = {1: true}
y = {1: true}
x == y // true
Maps can be iterated using for loops, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
for key in profit {
print(key) // Outputs: Consumer, Enterprise in some order
}
For safety, iteration order is non-deterministic in all execution modes except
onchain-script, where it is deterministic and based on the transaction hash.
By default, iteration will return the map keys. To retrieve both the keys and values, the 2-variable iteration form can be used, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
for key, value in profit {
print("${key} = ${value}") // Outputs: Consumer = 1739.50, Enterprise = 4012.80 in some order
}
If just the values are desired, then the values method can be called, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
for value in profit.values() {
print(value) // Outputs: 1739.50, 4012.80 in some order
}
Maps provide a number of other utility methods, e.g.
- clear: removes all entries from the map.
- compute: update the specified key by applying the given function on the current value.
- copy: create a copy of the map.
- deep_copy: create a deep copy of the map.
- entries: returns a slice of key/value pairs.
- filter: create a new map with key/value pairs matching a given predicate function.
- filter_map: apply a function to each key/value pair and keep only the non-nil results as a new map.
- find: find a key/value pair matching the given predicate function.
- get_or_insert: get value for the given key, or insert the given default value if the key has not been set.
- group_by: group entries into nested maps.
- invert: create a new map with the keys and values swapped.
- keys: returns a slice of just the keys.
- map_values: transform all values according to the given transform function.
- merge: merge another map into this one.
- pop: delete the given key and return its value if it exists.
Any hashable type can be used as the key type within a map. Most built-ins like strings and numbers are hashable. User-defined types are automatically hashable if all of their fields are hashable.
If a user-defined type is unhashable, it can implement a __hash__ method to
make it hashable. This can also be useful to hash it on a unique value, e.g. a
primary key.
These methods can make use of the built-in hash function that is used to hash
built-in types, e.g.
Person = struct {
id int64
first_name string
last_name string
}
func (p Person) __hash__() int64 {
return hash(p.id)
}
// Person can now be used as a key within map types, e.g.
balances = map[Person]fixed128{}
Certain built-in types like slices and sets are not hashable as they are
mutable. To use them as keys, they first need to be converted into
immutable const values, e.g.
accessed = map[![]string]int{} // The ! in a type spec states that it is const
file_md = const ["path", "to", "file.md"]
accessed[file_md] = 1
The orderedmap data type behaves exactly like a map except that it keeps
track of insertion order and has deterministic iteration order, e.g.
fruits = orderedmap{
"apple": 10,
}
fruits["orange"] = 5
fruits["banana"] = 20
for fruit in fruits {
print(fruit) // Outputs: apple, orange, banana
}
The typemap data type allows for types to be mapped to values of that type,
e.g.
annotations = typemap{
string: "Hello",
int32: 456,
bool: false
}
sval = annotations[string]
bval = annotations[bool]
uval = annotations[uint64] // Returns default value for type
sval == "Hello" // true
bval == false // true
uval == 0 // true
Iroh also supports set data types for unordered collections without any
repeated elements, e.g.
x = set[int]{1, 2, 3}
Where the type can be inferred, sets can be constructed using a {} literal
syntax, e.g.
x = {1, 2, 3}
Duplicates in set literals will cause an edit-time error, e.g.
x = {1, 2, 3, 2} // ERROR!
Like with maps, empty sets can only be initialized with their explicit type, e.g.
x = {} // ERROR!
x = set[int]{} // Valid.
Elements can be added to a set using the add method, e.g.
x = {1, 2, 3}
x.add(4)
len(x) == 4 // true
Elements can be removed using the remove method, e.g.
x = {1, 2, 3}
x.remove(3)
len(x) == 2 // true
Sets can be iterated using for loops, e.g.
x = {1, 2, 3}
for elem in x {
print(elem) // Outputs: 1, 2, 3 in some order
}
Like with maps, the iteration order of sets is non-deterministic in all modes
except onchain-script where it will be deterministic and based on the
transaction state.
Checking if an element exists in a set can be done using the in keyword, e.g.
x = {1, 2, 3}
y = 3 in x
y == true // true
if 4 in x {
// As 4 is not in the set, this code block will not execute.
}
Sets can be combined using the typical operations, e.g.
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}
x.union(y) // {1, 2, 3, 4, 5, 6}
x.intersection(y) // {3, 4}
x.difference(y) // {1, 2}
x.symmetric_difference(y) // {1, 2, 5, 6}
For brevity, the equivalent operators can also be used, e.g.
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}
x | y // x.union(y)
x & y // x.intersection(y)
x - y // x.difference(y)
x ^ y // x.symmetric_difference(y)
These methods all return a new set and leave the original untouched. When
efficient memory usage is needed, in-place variants of the methods are also
available, prefixed with in_place_, e.g.
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}
x.in_place_union(y)
x == {1, 2, 3, 4, 5, 6} // true
Sets of the same type are also comparable to determine if a set is a subset or superset of another, e.g.
x = {1, 2, 3}
y = {1, 2, 3, 4, 5, 6}
if x <= y {
// x is a subset of y
}
if x < y {
// x is a proper subset of y
}
if y > x {
// y is a proper superset of x
}
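The difference between a subset and a proper subset is easy to miss. Python's built-in sets happen to use the same comparison operators, so the distinction can be checked with a short sketch. This is an analogy only, and assumes Iroh's operators carry the usual mathematical meaning described above.

```python
x = {1, 2, 3}
y = {1, 2, 3, 4, 5, 6}

subset = x <= y          # every element of x is also in y
proper_subset = x < y    # subset, and the two sets differ
proper_superset = y > x  # y contains all of x, and the two sets differ

# A set is always a subset of itself, but never a proper subset.
self_subset = x <= x
self_proper_subset = x < x

print(subset, proper_subset, proper_superset, self_subset, self_proper_subset)
```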
Boolean & Logic Types
Iroh supports basic boolean logic via the bool type which has the usual values:
- true
- false
The bool values can be negated using the ! operator, e.g.
x = true
y = !x
y == false // true
Boolean logic is applied by the and and or operators, which evaluate in a
short-circuiting manner, e.g.
debug = false
// When debug is false, the fetch_instance_id call is never made.
if debug and fetch_instance_id() {
...
}
Iroh also supports a number of other logic types. Three-valued logic is
supported by the bool3 type which has the following values:
- true
- false
- unknown
The unknown value expands the usual logic operations with the following
combinations:
| Expression | Result |
|---|---|
| true and unknown | unknown |
| false and unknown | false |
| true or unknown | true |
| false or unknown | unknown |
| !unknown | unknown |
While three-valued logic can be emulated via an optional bool, i.e. ?bool, the
explicit use of unknown instead of nil makes intent clearer in domains like
knowledge representation.
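The bool3 combinations listed above match what is commonly known as Kleene's strong three-valued logic, which can be modelled by ordering the values as false < unknown < true and taking the minimum for and and the maximum for or. The Python sketch below is a hypothetical model of that table, not Iroh's implementation; the names Bool3, and3, or3, and not3 are invented here, and the model does not short-circuit.

```python
from enum import IntEnum

class Bool3(IntEnum):
    # Ordered so that min() and max() implement "and" and "or".
    FALSE = 0
    UNKNOWN = 1
    TRUE = 2

def and3(a, b):
    return Bool3(min(a, b))

def or3(a, b):
    return Bool3(max(a, b))

def not3(a):
    # Swaps TRUE and FALSE; UNKNOWN maps to itself.
    return Bool3(2 - a)

print(and3(Bool3.TRUE, Bool3.UNKNOWN).name)  # UNKNOWN
print(or3(Bool3.TRUE, Bool3.UNKNOWN).name)   # TRUE
```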
Four-valued logic is supported by the bool4 type with the values:
- true
- false
- unknown
- conflicting
The conflicting value expands the usual logic operations with the following
combinations:
| Expression | Result |
|---|---|
| true and conflicting | conflicting |
| false and conflicting | false |
| unknown and conflicting | conflicting |
| true or conflicting | true |
| false or conflicting | conflicting |
| unknown or conflicting | conflicting |
| !conflicting | conflicting |
The use of bool4 can make logic much easier to follow, especially in domains
like distributed database conflicts, consensus algorithms, handling forks in
blockchains, etc.
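One way to read the bool3 and bool4 combinations together is as a priority rule: for and, false wins, then conflicting, then unknown; for or, true wins, then conflicting, then unknown. The Python sketch below is a hypothetical model that reproduces the listed rows under that reading. The function names are invented, and the results for pairs not listed in the text (such as unknown with unknown) are assumptions.

```python
FALSE, TRUE = "false", "true"
UNKNOWN, CONFLICTING = "unknown", "conflicting"

def and4(a, b):
    # The first of these values present in either operand decides the result.
    for value in (FALSE, CONFLICTING, UNKNOWN):
        if value in (a, b):
            return value
    return TRUE

def or4(a, b):
    for value in (TRUE, CONFLICTING, UNKNOWN):
        if value in (a, b):
            return value
    return FALSE

def not4(a):
    # Only true and false are swapped; the other two map to themselves.
    return {TRUE: FALSE, FALSE: TRUE}.get(a, a)

print(and4(UNKNOWN, CONFLICTING))  # conflicting
print(or4(FALSE, CONFLICTING))     # conflicting
```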
Boolean values can be safely upcast to higher dimensions, e.g.
txn_executed = true
initial_status = bool4(txn_executed)
initial_status == true // true
The fuzzy type supports fuzzy logic with fixed128 values between 0.0
(false) and 1.0 (true):
apply_brakes = fuzzy(1.0)
apply_brakes == true // true
Fuzzy logic is applied when evaluating fuzzy values, e.g.
| Expression | Mechanism | Result |
|---|---|---|
| fuzzy(0.7) and fuzzy(0.4) | min(a, b) | fuzzy(0.4) |
| fuzzy(0.7) or fuzzy(0.4) | max(a, b) | fuzzy(0.7) |
| !fuzzy(0.7) | 1 - value | fuzzy(0.3) |
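The three mechanisms in the table can be checked with a small Python model. The sketch uses Decimal rather than binary floats so that 1 - 0.7 comes out as exactly 0.3, loosely mirroring the decimal nature of fixed128; the function names are invented for illustration and this is not Iroh's implementation.

```python
from decimal import Decimal

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

def fuzzy_not(a):
    return Decimal(1) - a

a, b = Decimal("0.7"), Decimal("0.4")

print(fuzzy_and(a, b))  # 0.4
print(fuzzy_or(a, b))   # 0.7
print(fuzzy_not(a))     # 0.3
```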
Fuzzy membership is supported by various membership functions on slice data types that assign each element of the slice a value that represents its degree of membership, e.g.
- x.gaussian_membership
- x.s_shaped_membership
- x.sigmoidal_membership
- x.trapezoidal_membership
- x.triangular_membership
- x.z_shaped_membership
The in operator can then be used to return the fuzzy degree of membership of
a value, e.g.
// Temperature and humidity ranges.
temperature = [0..100]
humidity = [0..100]
// Temperature and humidity membership functions.
temp_low = temperature.trapezoidal_membership(0, 0, 20, 30)
temp_mid = temperature.trapezoidal_membership(25, 40, 60, 75)
temp_high = temperature.trapezoidal_membership(70, 80, 100, 100)
humid_low = humidity.triangular_membership(0, 0, 50)
humid_high = humidity.triangular_membership(50, 100, 100)
// Current conditions.
current_temp = 25
current_humidity = 60
// Get membership degrees.
temp_low_deg = current_temp in temp_low
temp_mid_deg = current_temp in temp_mid
temp_high_deg = current_temp in temp_high
humid_low_deg = current_humidity in humid_low
humid_high_deg = current_humidity in humid_high
// The membership degree values are fuzzy values:
type(temp_low_deg) == fuzzy // true
type(humid_low_deg) == fuzzy // true
// They can now be used to apply rules under fuzzy logic, e.g.
if temp_high_deg and humid_high_deg {
set_fan_speed(.high)
} else if temp_mid_deg and humid_low_deg {
set_fan_speed(.medium)
}
This is extremely useful in various domains like AI decision making, IoT, control systems, risk assessment, recommendation engines, autonomous vehicles, etc.
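To make the fan speed example concrete, the membership degrees can be worked out by hand or with a short Python model. The sketch below assumes the conventional parameter order for these shapes, i.e. trapezoidal(a, b, c, d) rises from a to b, stays at 1 until c, and falls to 0 at d, and triangular(a, b, c) peaks at b. That ordering is an assumption about the Iroh methods, and the Python function names are invented here.

```python
def trapezoidal(a, b, c, d):
    """Return a membership function: 0 outside [a, d], 1 within [b, c]."""
    def degree(x):
        if x < a or x > d:
            return 0.0
        if b <= x <= c:
            return 1.0
        if x < b:
            return (x - a) / (b - a)  # rising edge
        return (d - x) / (d - c)      # falling edge
    return degree

def triangular(a, b, c):
    # A triangle is a trapezoid whose flat top has zero width.
    return trapezoidal(a, b, b, c)

temp_low = trapezoidal(0, 0, 20, 30)
temp_mid = trapezoidal(25, 40, 60, 75)
temp_high = trapezoidal(70, 80, 100, 100)
humid_low = triangular(0, 0, 50)
humid_high = triangular(50, 100, 100)

current_temp, current_humidity = 25, 60

temp_low_deg = temp_low(current_temp)            # 0.5
temp_mid_deg = temp_mid(current_temp)            # 0.0
humid_high_deg = humid_high(current_humidity)    # 0.2

# Fuzzy "and" takes the minimum of the two degrees.
high_rule = min(temp_high(current_temp), humid_high_deg)
medium_rule = min(temp_mid_deg, humid_low(current_humidity))
print(temp_low_deg, humid_high_deg, high_rule, medium_rule)
```

Under these assumptions, neither rule fires for a temperature of 25 and a humidity of 60, since both rule strengths come out as 0.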
Probabilistic logic is supported by the probability data type, e.g.
rain = probability(0.3, .bernoulli)
A range of different distribution types are supported:
enum {
bernoulli,
beta,
beta_binomial,
binomial,
categorical,
cauchy,
chi_squared,
dirichlet,
discrete_uniform,
exponential,
f_distribution,
gamma,
geometric,
gumbel,
hypergeometric,
laplace,
logistic,
lognormal,
multinomial,
multivariate_normal,
negative_binomial,
normal,
pareto,
poisson,
power_law,
rayleigh,
student_t,
triangular,
uniform,
weibull,
zipf,
}
Each distribution type requires different parameters depending on its mathematical definition, e.g.
weather = probability([
("sunny", 0.6),
("rainy", 0.3),
("cloudy", 0.1)
], .categorical)
height = probability(175.0<cm>, .normal, std: 7.0<cm>)
customers_per_hour = probability(5.0, .poisson)
income = probability(50000.0, .lognormal, sigma: 0.5)
successful_trials = probability(0.3, .binomial, trials: 20)
The likelihood of a given value can be queried by using the p method. This
returns the concrete probability or, for continuous distributions, the
probability density, e.g.
dice = probability(1, .discrete_uniform, max: 6)
// The chance of a 3 being rolled:
chance = dice.p(3)
chance == 0.166666666666666667 // true
The probability that a value falls within a specific range can be calculated
using the p_between method, e.g.
dice = probability(1, .discrete_uniform, max: 6)
// The chance of a 4-6 being rolled:
chance = dice.p_between(4, 6)
chance == 0.5 // true
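The two dice results can be verified with exact fractions. The Python sketch below models a fair six-sided die as a discrete distribution over the integers 1 to 6, and assumes that the bounds passed to p_between are inclusive, which is what the 0.5 result above implies. The function names simply mirror the Iroh methods and are not Iroh's implementation.

```python
from fractions import Fraction

faces = range(1, 7)  # the six outcomes of a fair die

def p(value):
    # Each face is equally likely; anything else has probability zero.
    return Fraction(1, len(faces)) if value in faces else Fraction(0)

def p_between(low, high):
    # Assumption: both bounds are inclusive.
    return sum((p(v) for v in faces if low <= v <= high), Fraction(0))

print(p(3))             # 1/6
print(p_between(4, 6))  # 1/2
```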
The sample method draws a random value from the distribution and sample_n
returns a slice of multiple samples of the specified length, e.g.
dice = probability(1, .discrete_uniform, max: 6)
roll = dice.sample()
rolls = dice.sample_n(10)
len(rolls) == 10 // true
Some distributions are naturally boolean, e.g.
coin = probability(0.5, .bernoulli)
success = probability(0.3, .binomial, trials: 1)
Others can be converted to a boolean distribution with a predicate, or with the
in_range and in_set membership tests, e.g.
temperature = probability(22.0, .normal, std: 3.0)
weather = probability([
("sunny", 0.6),
("rainy", 0.3),
("cloudy", 0.1)
], .categorical)
// Convert with predicates, e.g.
is_hot = temperature > 25.0
is_sunny = weather == "sunny"
// Membership tests, e.g.
likely_fever = temperature.in_range(38.0, 42.0)
bad_weather = weather.in_set({"rainy", "cloudy"})
Boolean distributions can then be used within conditionals either using an explicit threshold, e.g.
if collision_risk.p(true) > 0.1 {
apply_emergency_brakes()
}
Or implicitly where a probability > 0.5 indicates true, e.g.
if likely_fever {
prescribe_medication()
}
Boolean distributions can be combined with normal bool values, e.g.
rain = probability(0.7, .bernoulli)
weekend = true
stay_inside = rain and weekend
go_out = !rain or weekend
maybe_picnic = !rain and weekend
if maybe_picnic {
print("Let's have a picnic!")
}
Assuming distributions are statistically independent, probabilistic logic operations can be applied to them, e.g.
// Garden watering logic.
rain = probability(0.7, .bernoulli)
sprinkler = probability(0.3, .bernoulli)
wet_from_rain_or_sprinkler = rain or sprinkler
// Autonomous driving.
safe_to_proceed = (visibility > 100.0) and weather.in_set({"sunny", "cloudy"})
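Under the independence assumption, the logic operators reduce to simple arithmetic on the probabilities of the two events. The Python sketch below shows the standard formulas applied to the garden watering numbers; the function names are invented, and it models only Bernoulli probabilities rather than Iroh's full probability type.

```python
from fractions import Fraction

def p_and(a, b):
    # Independent events: both happen.
    return a * b

def p_or(a, b):
    # Independent events: at least one happens.
    return 1 - (1 - a) * (1 - b)

def p_not(a):
    return 1 - a

rain = Fraction(7, 10)
sprinkler = Fraction(3, 10)

wet = p_or(rain, sprinkler)
both = p_and(rain, sprinkler)
print(wet, both, p_not(rain))  # 79/100 21/100 3/10
```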
Boolean distributions can also be compared against each other, e.g.
market_crash = probability(0.15, .bernoulli)
recession = probability(0.25, .bernoulli)
if recession > market_crash {
adjust_investment_strategy()
}
Constraints for Bayesian inference can be defined using the observe method on
probability types to create probability_observation values, e.g.
rain = probability(0.3, .bernoulli)
sprinkler = probability(0.1, .bernoulli)
wet_grass = rain or sprinkler
evidence_wet = wet_grass.observe(true)
These can then be used with the | and & operators to evaluate conditional
probabilities, e.g.
// Posterior distribution for rain given evidence of observed wet grass.
rain_given_wet = rain | wet_grass.observe(true)
// Posterior distribution for disease given evidence of fever and cough.
disease_given_symptoms = disease | fever.observe(true) & cough.observe(true)
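For the wet grass example, the posterior can be computed by enumerating the four possible worlds and applying Bayes' rule, P(rain | wet) = P(rain and wet) / P(wet). The Python sketch below does this for rain at 0.3 and sprinkler at 0.1, assuming the two are independent and that the grass is wet exactly when either occurs. It is a worked calculation, not a description of how Iroh performs inference.

```python
from fractions import Fraction
from itertools import product

p_rain = Fraction(3, 10)
p_sprinkler = Fraction(1, 10)

p_wet = Fraction(0)
p_rain_and_wet = Fraction(0)

# Enumerate every combination of rain and sprinkler.
for rain, sprinkler in product([True, False], repeat=2):
    weight = (p_rain if rain else 1 - p_rain) * (
        p_sprinkler if sprinkler else 1 - p_sprinkler
    )
    wet = rain or sprinkler
    if wet:
        p_wet += weight
        if rain:
            p_rain_and_wet += weight

posterior = p_rain_and_wet / p_wet
print(p_wet, posterior)  # 37/100 30/37
```

Observing wet grass raises the probability of rain from 0.3 to roughly 0.81 under these assumptions.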
For correlated variables, the joint_distribution type can be used, e.g.
financial_risks = joint_distribution([
("market_crash", probability(0.15, .bernoulli)),
("recession", probability(0.25, .bernoulli))
], correlation: 0.8)
Multiple variables can define correlations following the standard upper triangle order of a correlation matrix, e.g.
financial_risks = joint_distribution([
("market_crash", probability(0.15, .bernoulli)),
("recession", probability(0.25, .bernoulli)),
("inflation_spike", probability(0.20, .bernoulli))
], correlations: [0.8, 0.3, 0.6])
These support the standard operations while maintaining the correlations within the joint distribution, e.g.
crash, recession, inflation = financial_risks.sample()
The correlated probability values can be accessed via the dists method to do
various calculations, e.g.
market_crash, recession, inflation_spike = financial_risks.dists()
// Joint probabilities, e.g.
disaster = market_crash and recession
// Conditional probabilities, e.g.
recession_given_crash = recession | market_crash.observe(true)
Iroh’s support for rich probabilistic logic makes it useful in domains such as financial modelling, risk modelling, statistical modelling, Bayesian inference, medical diagnosis, and robotics.
Struct Data Types
Iroh provides struct types to group together related fields under a single
data type, e.g.
Person = struct {
name string
age int
height fixed128<cm>
}
Struct values can be initialized with braces and individual fields can be
accessed using . notation, e.g.
zeno = Person{
name: "Zeno",
age: 11,
height: 136<cm>
}
print(zeno.name) // Outputs: Zeno
Unspecified values are zero-initialized, e.g.
-
Numeric types default to 0.
-
Boolean types default to false.
-
String types default to "", the empty string.
-
Slice types default to [], the empty slice.
-
Optional types default to nil.
-
Struct types have all their fields zero-initialized.
For example:
alice = Person{name: "Alice"}
alice.age == 0 // true
alice.height == 0<cm> // true
Structs can mark individual fields as @required. Such fields must be specified
during initialization or an edit-time error will be generated, e.g.
Person = struct {
name string
age int @required
height fixed128<cm>
}
// Now, name and height will be zero-initialized,
// but age must always be set, e.g.
reia = Person{name: "Reia", age: 9}
// Omitting the required field will generate an error:
zaia = Person{name: "Zaia"} // ERROR! Missing required age field on Person!
Structs can specify the optional require_all parameter to mark all of their
fields as @required, e.g.
Person = struct(require_all: true) {
name string
age int
height fixed128<cm>
}
zeno = Person{name: "Zeno"} // ERROR! Missing required fields age and height on Person!
Struct definitions can specify non-zero default values, e.g.
Config = struct {
retries: 5
timeout: 60<s>
}
cfg = Config{retries: 3}
cfg.retries == 3 // true
cfg.timeout == 60<s> // true
Struct fields can be assigned to directly, e.g.
cfg = Config{}
cfg.retries = 7
cfg.retries == 7 // true
If a field being initialized matches an existing variable name, the {ident: ident}
initialization can be simplified to just {ident}, e.g.
retries = 3
cfg = Config{retries, timeout: 10<s>}
cfg.retries == 3 // true
Likewise, the ... splat operator can be used to assign all fields of an
existing struct to another, with later assignments taking precedence over
earlier ones, e.g.
ori = Config{retries: 3, timeout: 10<s>}
cfg1 = Config{...ori}
cfg1 == {retries: 3, timeout: 10<s>} // true
cfg2 = Config{...ori, retries: 5}
cfg2 == {retries: 5, timeout: 10<s>} // true
cfg3 = Config{retries: 5, ...ori}
cfg3 == {retries: 3, timeout: 10<s>} // true
cfg4 = Config{retries: 5, ...ori, timeout: 20<s>}
cfg4 == {retries: 3, timeout: 20<s>} // true
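For readers coming from Python, the same "later wins" rule applies to dictionary unpacking. The following is only an analogy using plain dictionaries rather than Iroh structs, with the timeout units dropped:

```python
ori = {"retries": 3, "timeout": 10}

# Splat only: an exact copy of the original fields.
cfg1 = {**ori}

# Splat first, then an explicit field: the explicit field wins.
cfg2 = {**ori, "retries": 5}

# Explicit field first, then the splat: the splatted value wins.
cfg3 = {"retries": 5, **ori}

# Mixed: retries comes from the splat, timeout from the later field.
cfg4 = {"retries": 5, **ori, "timeout": 20}

print(cfg1, cfg2, cfg3, cfg4)
```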
Structs can be marked as deprecated, e.g.
Config = struct(deprecated: "Use ConfigV2 instead") {
...
}
// Using the struct will then generate a compile-time
// warning, e.g.
cfg = Config{} // WARNING!
When non-zero default values are specified, the default expressions are evaluated afresh for each new struct value, so as to avoid accidentally creating globally shared values, e.g.
Person = struct {
mother: ?Person
father: ?Person
children: []Person(cap: 3)
}
tav = Person{}
alice = Person{}
cap(tav.children) == 3 // true
@pointer(tav.children) != @pointer(alice.children) // true
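The problem this avoids is familiar from other languages. As a loose Python analogy (not Iroh), a dataclass needs default_factory to get a fresh list per instance. The Person class here is a simplified stand-in for the struct in the example:

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str = ""
    # default_factory runs once per instance, so each Person gets its own
    # list instead of sharing a single global one.
    children: list = field(default_factory=list)


tav = Person()
alice = Person()
tav.children.append("Zeno")

print(tav.children)    # ['Zeno']
print(alice.children)  # []
```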
Struct definitions can avoid having to re-specify the same fields and methods
everywhere by copying an existing struct. For example, assuming an existing
Vehicle type:
Vehicle = struct {
make string
model string
fuel_level int
}
(v Vehicle) func gear_change() {
...
}
(v Vehicle) func refuel() {
...
}
One could create a new Car type that copies over its fields and methods:
Car = struct(Vehicle) {
doors int
seats int
}
(c Car) func gear_change() {
// Car-specific gear-change logic. Unlike in languages
// with inheritance, there's no access to Vehicle's
// gear_change method via `super` or `base` here.
}
Where the struct(Vehicle) syntax acts as a shortcut for copying over all the
fields and methods from Vehicle, i.e.
Car = struct {
make string
model string
fuel_level int
doors int
seats int
}
(c Car) func gear_change() {
// Car-specific gear-change logic.
}
(c Car) func refuel() {
...
}
The new Car type can then be used, e.g.
car = Car{make: "Ford", model: "Sierra", doors: 4, seats: 5}
car.gear_change()
This copying feature exists purely for developer convenience. There is no inheritance or prototype chaining of any kind. The two types are distinct and unrelated, e.g.
func print_model_name(v Vehicle) {
print("${v.make} ${v.model}")
}
v = Vehicle{make: "Ford", model: "Sierra"}
print_model_name(v) // Outputs: Ford Sierra
car = Car{make: "Ford", model: "Sierra"}
print_model_name(car) // ERROR: Car is not assignable to Vehicle
// To use a Car value, explicit conversion is needed, e.g.
v_from_car = Vehicle{make: car.make, model: car.model}
print_model_name(v_from_car)
When a struct is copied, any field defined on the new type overrides the field of the same name from the base type. Similarly, any method defined on the new type overrides the method of the same name from the base type.
When a function expects a concrete struct type, the type name can be elided in the parameter when the function is called, e.g.
func checkin(p Person) {
...
}
checkin({name: "Zeno", age: 11}) // Person is automatically inferred.
And if the parameter is the last parameter, even the braces can be elided, e.g.
checkin(name: "Zeno", age: 11)
Iroh automatically detects when a struct should be passed as a pointer or copied based on some heuristics, so there is no need to specify whether it should be a pointer or not, e.g.
-
If a function/method needs to mutate a struct value, it is passed as a pointer.
-
Where there is no mutation, and the struct value is small enough, it is just copied.
For example:
// Person is automatically passed as a pointer as it is mutated:
func update_age(person Person, new_age int) {
person.age = new_age
}
// Config value is copied as it is not mutated and small enough:
func get_retries(cfg Config) int {
return cfg.retries
}
Occasionally, for performance reasons, it may be necessary to annotate the exact form that’s passed in. But this should be rarely used, e.g.
// Person is passed in as a *pointer even though it's not mutated:
func process_application(person *Person, info *Submission) {
...
}
// The time value is passed in by !value and copied:
func get_local_time(t !UnixTime) {
...
}
All pointers, outside of the raw [*] pointers we have for allocators and
interoperating with C, are always non-null and automatically dereferenced on
lookups and assignments.
A copy method is auto-generated on struct types that allows for easy copying of
struct values, e.g.
tav = Person{name: "Tav"}
dupe = tav.copy()
dupe.name = "Tavino"
tav.name == "Tav" // true
dupe.name == "Tavino" // true
Specific fields can be overridden as parameters to the copy call, e.g.
tav = Person{name: "Tav", location: "London"}
dupe = tav.copy(location: "Singapore")
tav.location == "London" // true
dupe.name == "Tav" // true
dupe.location == "Singapore" // true
The no_copy parameter can be set on struct types to prevent them from being
copied. This also prevents the auto-generation of the copy method, e.g.
Mutex = struct(no_copy: true) {
...
}
mutex = Mutex{}
dupe = mutex.copy() // Edit-time ERROR! Unknown method!
Types with no_copy are always moved, not copied, on assignment, so the
original variable will no longer be accessible, e.g.
a = Mutex{}
b = a
a.lock() // Edit-time ERROR! Value has been moved to b!
This makes no_copy types perfect for data types that should never be copied,
e.g. mutexes, file handles, etc.
Tuple Data Types
Iroh supports tuples for grouping a fixed number of values together. Tuple
values are enclosed within () parentheses, and indexed with [] like slices,
e.g.
london = (51.5074, -0.1278)
lat = london[0]
lng = london[1]
lat == 51.5074 // true
lng == -0.1278 // true
Tuple values can be destructured easily, e.g.
lat, lng = (51.5074, -0.1278)
lat == 51.5074 // true
lng == -0.1278 // true
Like structs, tuples can contain values of different types, e.g.
tav = ("Tav", 186<cm>)
However, unlike structs, tuples are immutable, i.e. their elements cannot be re-assigned:
london = (51.5074, -0.1278)
london[0] = 42.12 // ERROR!
Individual fields of a tuple can also be optionally named and accessed via their name, e.g.
london = (lat: 51.5074, lng: -0.1278)
london.lat == 51.5074 // true
london.lng == -0.1278 // true
// Indexed access still works
london[0] == london.lat // true
Both named and unnamed fields can be mixed within the same tuple, e.g.
tav = ("Tav", height: 186<cm>)
tav[0] == "Tav" // true
tav.height == 186<cm> // true
But to avoid accidental bugs, once a field has been named either in the type definition or value construction, all subsequent fields must be named, e.g.
func get_coords() (lat float64, lng float64) {
return (lng: -0.1278, 51.5074) // ERROR!
}
All trailing elements of a tuple can be elided if the expected types are all optionals, in which case they will be filled with nils, e.g.
func get_coords() (lat float64, lng float64, altitude ?float64<km>, at ?time) {
return (51.5074, -0.1278)
}
coords = get_coords()
coords.altitude == nil // true
As tuples are iterable, len returns their size, and they can be iterated with
for loops, e.g.
london = (51.5074, -0.1278)
for point in london {
print(point) // Outputs: 51.5074 and then -0.1278
}
len(london) == 2 // true
The tuple type can also be passed an iterable to construct a tuple of all
elements of the iterable, e.g.
t = tuple("Tav")
t == ('T', 'a', 'v') // true
len(t) == 3 // true
t[0] == 'T' // true
Occasionally, single element tuples are useful, e.g.
-
Within APIs that are expecting tuple values.
-
To maintain consistency within data structures.
As single element tuples, e.g. (x), are indistinguishable from an expression
enclosed in parentheses, languages like Python let them be constructed with a
trailing comma, e.g.
# Constructing a single-element tuple in languages like
# Python and Rust:
t = (42,)
t[0] == 42 # True
This tends to cause confusion as trailing commas can be overlooked by even experienced developers. As such, Iroh disallows trailing commas inside parentheses.
Instead, the tuple{} constructor syntax needs to be used, e.g.
t = tuple{42}
len(t) == 1 // true
t[0] == 42 // true
If the tuple{} syntax is used for multiple elements, the Iroh editor will
automatically replace its rendering with the standard form, e.g.
// This:
t = tuple{42, 99}
// Gets automatically converted to:
t = (42, 99)
Enum Data Types
Iroh supports enum data types, e.g.
Colour = enum {
red
green
blue
}
Variants can be addressed with a leading . and matched with the match
keyword, e.g.
match colour {
.red: set_red_bg()
.green: set_green_bg()
.blue: set_blue_bg()
}
The use of . prefix for enum variants makes it less visually noisy than in
other languages, e.g.
// Compare this Rust:
set_bg(Colour::Red)
// To this Iroh:
set_bg(.red)
So the only time when an enum variant will need to be fully qualified is when it’s used to declare a new variable, e.g.
bg = Colour.red
// The variable can then be assigned another variant without
// needing to use the fully qualified form, e.g.
bg = .green
Enum variants without any data do not get any implicit values, e.g.
Colour = enum {red, green, blue}
Colour.red == Colour.green // false
Colour.red == 0 // ERROR! Not comparable
Variants can be given explicit values as long as they are all of the same type. This makes the variants comparable to values of those types, e.g.
Colour = enum {
red = 1
green = 2
blue = 3
}
HTTPMethod = enum {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
}
Colour.red == 1 // true
HTTPMethod.get == "GET" // true
Mapped enums such as these automatically support being assigned values of the variant type, e.g.
http_method = HTTPMethod.get
http_method = .post // Assigned an enum variant
http_method = "PUT" // Assigned a string value mapping to an enum variant
When variables of these types are assigned literal values, they are validated at edit-time. Otherwise, they are validated at runtime and may generate errors, e.g.
http_method = HTTPMethod.post
http_method = "JIBBER" // ERROR!
As runtime values will be automatically converted during comparison, it may be useful to safely type cast them explicitly first, e.g.
http_method = try HTTPMethod(user_input)
if http_method {
...
}
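For comparison, Python's standard enum module has a similar split between a validating constructor and a safe wrapper around it. This is only an analogy for the try conversion above, and parse_method is a hypothetical helper:

```python
from enum import Enum
from typing import Optional


class HTTPMethod(Enum):
    GET = "GET"
    HEAD = "HEAD"
    POST = "POST"
    PUT = "PUT"


def parse_method(raw: str) -> Optional[HTTPMethod]:
    """Convert untrusted input to a variant, or None if it maps to nothing."""
    try:
        # Looking up by value raises ValueError for unknown strings.
        return HTTPMethod(raw)
    except ValueError:
        return None


print(parse_method("PUT"))     # HTTPMethod.PUT
print(parse_method("JIBBER"))  # None
```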
When numeric values are being assigned to variants, they can use iota to
define a pattern to use:
-
The value of iota starts at 0 and auto-increments by one for each subsequent variant that doesn’t define its own value.
-
Each new use of iota resets its value back to 0.
For example:
ErrorCode = enum {
// General errors (start at 0)
ok = iota // 0
unknown_method // 1
invalid_argument // 2
// Authentication errors (start at 100)
auth_failed = iota + 100 // 100
token_expired // 101
permission_denied // 102
// Network errors (start at 200)
timeout = iota + 200 // 200
connection_lost // 201
protocol_mismatch // 202
}
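The numbering in this example can be reproduced with a small Python model (not Iroh). It assumes, based on the values shown in the comments, that variants without their own value reuse the most recent iota expression with the incremented counter. resolve_iota is a hypothetical helper:

```python
def resolve_iota(variants):
    """Assign values to (name, expression-or-None) pairs using iota rules."""
    values = {}
    expr = None
    iota = 0
    for name, new_expr in variants:
        if new_expr is not None:
            # A new use of iota resets the counter back to 0.
            expr = new_expr
            iota = 0
        if expr is None:
            raise ValueError(f"{name} has no value and no preceding iota expression")
        values[name] = expr(iota)
        # The counter auto-increments for each subsequent variant.
        iota += 1
    return values


error_codes = resolve_iota([
    ("ok", lambda iota: iota),
    ("unknown_method", None),
    ("invalid_argument", None),
    ("auth_failed", lambda iota: iota + 100),
    ("token_expired", None),
    ("permission_denied", None),
    ("timeout", lambda iota: iota + 200),
    ("connection_lost", None),
    ("protocol_mismatch", None),
])
print(error_codes)
```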
By default, an enum’s tag will be the smallest unsigned integer that can represent all the enum’s possible values, e.g.
// Iroh will use a uint2 for this enum's tag as the 2 bits
// will be able to represent all 4 possible values:
Status = enum {a, b, c, d}
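The width of the tag follows from the number of variants. A minimal Python sketch of that calculation, assuming variants are numbered from 0 (tag_bits is a hypothetical helper, and the single-variant case is not covered by the text above):

```python
def tag_bits(variant_count: int) -> int:
    """Return the minimum number of bits needed to number the variants from 0."""
    if variant_count < 1:
        raise ValueError("an enum needs at least one variant")
    # The largest tag value is variant_count - 1, and bit_length() gives
    # the number of bits needed to represent it.
    return (variant_count - 1).bit_length()


print(tag_bits(4))    # 2
print(tag_bits(5))    # 3
print(tag_bits(256))  # 8
```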
When the in-memory backing storage needs to be controlled, a custom size can
be specified instead, e.g.
// A 20-bit tag will be used here, even though only 4 variants
// have been defined so far:
Status = enum(backing: uint20) {a, b, c, d}
When the size exceeds 8 bits, a qualifier can be added to the custom size to control its endianness, e.g.
Status = enum(backing: uint20be) {a, b, c, d}
The following qualifiers are supported:
-
be: big-endian.
-
le: little-endian (the default).
-
ne: native-endian, matching whatever the current CPU/OS uses.
Like struct types, enums can also copy over variants and methods from an
existing enum, with new variants added to the end, e.g.
ServicesV1 = enum {
create_customer
get_customer
create_order
get_order
}
ServicesV2 = enum(ServicesV1) {
list_customers
remove_order
}
Enums can be marked as deprecated, e.g.
ServicesV1 = enum(deprecated: "Use ServicesV2 instead") {
...
}
Matches on variants must be exhaustive, i.e. they must match all possible variants, e.g.
Status = enum {a, b, c, d}
// This match will error at edit-time as the .d variant
// hasn't been handled:
match status {
.a:
.b:
.c:
}
This helps prevent any bugs caused by accidentally forgetting to handle certain
cases. The default keyword can be used to act as a catch-all when some cases
don’t need to be explicitly handled, e.g.
match http_method {
.get: handle_get()
.post: handle_post()
default: handle_everything_else()
}
The exhaustive nature of enum matches can make it difficult for package authors to evolve their APIs, e.g.
// For an enum like this defined by a package author:
HTTPMethod = enum {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
}
// The following will compile initially as the match
// handles all 4 cases:
match http_method {
.get: handle_get()
.head: handle_head()
.post: handle_post()
.put: handle_put()
}
// But, if the package author adds a new enum .trace variant,
// like below, the `match` will fail as it's not exhaustive
// any more:
HTTPMethod = enum {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
trace = "TRACE"
}
To ensure that user code continues to work if a package author decides to add
new variants, enums can be marked as extensible, e.g.
HTTPMethod = enum(extensible: true) {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
}
This will force all matches on those enums to always have a default case, so
that they will continue to be exhaustive even if package authors add new
variants, e.g.
match http_method {
.get: handle_get()
.head: handle_head()
.post: handle_post()
.put: handle_put()
default: handle_everything_else()
}
// When the enum gets extended with a new .trace variant, this
// match will continue to work thanks to the `default` case.
Multiple variants can be grouped with ; in a single case. Within such
branches, values are automatically type narrowed to only those variants, e.g.
match http_method {
.get:
handle_get()
.post; .put:
// Only .post and .put are valid variants inside here.
handle_common_code_for_post_and_put()
match http_method {
.post:
handle_post_only_stuff()
.put:
handle_put_only_stuff()
// No need to handle the other cases here.
}
.head:
handle_head()
default:
handle_anything_else()
}
Like in Rust, enum variants can also have associated structured data of different kinds, e.g.
Message = enum {
quit // No data
move{ // Struct data
x int32
y int32
}
write(text string) // Single-element tuple
set_colour(r uint8, g uint8, b uint8) // Multi-element tuple
}
These can be destructured within the match cases, e.g.
func process(msg Message) {
match msg {
.quit: // Handle quit
.move{x, y}: // Handle move
.write(text): // Handle write
.set_colour(r, g, b): // Handle set_colour
}
}
If the elements of a destructured variant aren’t needed, they can be elided by just specifying the name of each variant, e.g.
func process(msg Message) {
match msg {
.quit: // Handle quit
.move: // Handle move
.write: // Handle write
.set_colour: // Handle set_colour
}
}
As Iroh automatically type narrows within match arms and conditionals, the existing variable can be used with the narrowed type, e.g.
func process(msg Message) {
match msg {
.write:
print("Received message: ${msg.text}")
default: // Ignore everything else
}
}
// Or just:
if msg is .write {
print("Received message: ${msg.text}")
}
When exhaustiveness is not a concern, variants can be directly extracted via assignment, e.g.
func print_coords(msg Message) {
// A runtime error will be generated below if msg is
// not .move:
coords = msg.move
print("Got move instruction to ${coords.x}, ${coords.y}")
}
Direct extraction can be safely done using try, e.g.
coords = try msg.move or (x: 0, y: 0)
Similarly, a tag attribute can be specified on an enum to define an
auto-generated attribute name that identifies the active variant, e.g.
Shape = enum(tag: kind) { // The tag name cannot overlap with the name of any variant
circle(radius int)
square(side int)
}
shape = Shape.circle(10)
shape.kind == .circle // true
When the tag is used within conditional clauses, it automatically type narrows the original value, e.g.
if shape.kind == .circle {
print("Got a circle with radius ${shape.radius}")
}
This provides an easy way to have discriminated unions when you don’t need to
exhaustively handle all variants. But, for added safety, the tag can be used in
a match construct, e.g.
match shape.kind {
.circle:
print("Got a circle with radius ${shape.radius}")
.square:
print("Got a square with side ${shape.side}")
}
The match enforces exhaustiveness, while the automatic type narrowing makes it
easy to access variant fields without needing them to be destructured.
If an enum is made up of variants of the same type, i.e. all structs or all tuples, and all variants have a field with the same name, type, and offset, they can be directly accessed, e.g.
APIResponse = enum {
list_users{error string, status int, response []User}
get_user{error string, status int, response User}
delete_user{error string, status int}
}
func handle_response(resp APIResponse) {
// Accessing .status is safe as it's the same name, type
// and offset in all variants of APIResponse:
if resp.status == 0 {
...
}
}
This can make it both more performant and more ergonomic to handle various types of data, e.g. API responses, database results, event payloads, protocol messages, etc.
For the zero value of an enum, Iroh defaults to using the first variant. This
can be overridden by marking a specific variant as the @default_variant, e.g.
TextColor = enum {
blue
green
red
white @default_variant
}
Style = struct {
text_color TextColor
}
s = Style{}
s.text_color == .white // true
Like structs, enums support both type and instance methods. These can be defined on the enum as a whole, or on specific variants, e.g.
Shape = enum {
circle(radius float64)
rectangle(width float64, height float64)
triangle(base float64, height float64)
}
(s Shape) func area() {
return match s {
.circle: math.pi * s.radius * s.radius
.rectangle: s.width * s.height
.triangle: 0.5 * s.base * s.height
}
}
(r Shape.rectangle) func is_square() {
return r.width == r.height
}
s = Shape.rectangle(10, 20)
// This Shape value has access to the method
// common to all variants:
s.area() == 200 // true
// As well as the method that's unique to just
// this variant:
s.is_square() == false // true
Anonymous Sum Types
While enums already provide sum types, Iroh also supports anonymous sum types of the form:
TypeAorB = A | B
These can be used to easily create and use sum types without much ceremony thanks to Iroh’s automatic type narrowing based on flow-sensitive analysis, e.g.
func encode_value(encoder JSONEncoder | YAMLEncoder, value) []byte {
if encoder is JSONEncoder {
// encoder is type narrowed to JSONEncoder here:
return encoder.encode_json(value)
}
// encoder is YAMLEncoder here:
return encoder.write(value)
}
enc = JSONEncoder{}
encode_value(enc, value)
The is operator acts as a shortcut to calling type on a value and comparing
it to a type, e.g.
enc = JSONEncoder{}
type(enc) == JSONEncoder // true
enc is JSONEncoder // true
The match construct can also be used with anonymous sum types, e.g.
match enc {
JSONEncoder:
// enc is type-narrowed to a JSONEncoder here:
return enc.encode_json(value)
YAMLEncoder:
// enc is type-narrowed to a YAMLEncoder here:
return enc.write(value)
}
Anonymous sum types are only valid if:
-
All variant types are either struct or enum types, or
-
All variant types are constants of the same string or numeric type.
This allows for easy use of sum types in lots of real-world use cases without creating any ambiguity, e.g.
ConfigFormat = "json" | "yaml" | "toml"
HTTPMethod = "GET" | "POST" | "PUT" | "DELETE"
DBConnection = PostgresConn | MySQLConn | SQLiteConn
APIResponse = SuccessResponse | ErrorResponse | TimeoutResponse
// The `const` keyword marks this as a type definition
// rather than a bitwise-OR of the values:
Port = const 80 | 443 | 8080
Destructuring
Iroh supports destructuring for many of its data types, e.g.
// Tuples
(a, b, c) = (1, 2, 3)
// Arrays
[a, b, c] = [3]int{1, 2, 3}
// Slices
[a, b, c] = [1, 2, 3]
// Structs
{x, y} = Point{x: 10, y: 20}
// Strings
<<"Hello ", name>> = "Hello Tav"
For elements of iterable values, specific elements can be skipped with a _,
e.g.
[a, _, b] = [1, 2, 3]
The ... splat operator can be used to match any “remaining” elements of an
iterable, e.g.
[head, second, _, tail...] = [1, 2, 3, 4, 5]
head == 1 // true
second == 2 // true
tail == [4, 5] // true
The ... splat operator can be in any position as long as it’s only used once,
e.g.
[head, middle..., last] = [1, 2, 3, 4, 5]
head == 1 // true
middle == [2, 3, 4] // true
last == 5 // true
Or even use it to ignore a group of intermediate elements, e.g.
[head, ..., last] = [1, 2, 3, 4, 5]
head == 1 // true
last == 5 // true
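Python's starred assignment behaves much the same way and may be a useful point of comparison. Note that Python has no bare ... form, so a throwaway *_ is used to skip the middle elements:

```python
values = [1, 2, 3, 4, 5]

# Trailing splat collects the remaining elements.
head, second, _, *tail = values

# The splat can sit in the middle as long as it is used only once.
first, *middle, last = values

# Ignoring a group of intermediate elements.
start, *_, end = values

print(tail)        # [4, 5]
print(middle)      # [2, 3, 4]
print(start, end)  # 1 5
```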
As struct fields destructure by their field names, partial destructuring only needs to specify the desired field names, e.g.
{name} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}
name == "Tav" // true
Fields can be destructured to a different name with a :, e.g.
{name: user} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}
user == "Tav" // true
Nested elements can be destructured as needed, e.g.
{location: {lat, lng}, name} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}
name == "Tav" // true
lat == 51.5074 // true
lng == -0.1278 // true
Destructured elements can be given a default value with a :, which is used when
the field holds its zero value, e.g.
{name: "Anonymous"} = Person{}
name == "Anonymous" // true
This can often be clearer than manually assigning to a local variable and checking its value before setting a default, e.g.
// Compare:
name = person.name
if !name {
name = "Anonymous"
}
// Versus:
{name: "Anonymous"} = person
To assign an existing variable as a default value, the variable name needs to be
suffixed with a ' to distinguish it from renaming a field during
destructuring, e.g.
anon = "John Doe"
{name: anon'} = Person{}
name == "John Doe" // true
The (), [], and {} around destructured patterns can be elided when
multiple elements are being destructured without any nesting or field renaming,
e.g.
a, _, b = [1, 2, 3]
x, y = Point{x: 10, y: 20}
a == 1 // true
b == 3 // true
x == 10 // true
y == 20 // true
The Erlang-inspired <<x, y>> binary destructuring works well with both strings
and byte slices, e.g.
// Decode binary data into specific data types, e.g.
<<version::uint8, length::uint32, checksum::int16>> = data
// Multi-byte integer types can specify an alternative to the
// default little-endian decoding, e.g.
<<version::uint8, length::uint32be, checksum::int24>> = data
// The splat operator can be used as usual, e.g.
<<version::uint8, header...>> = data
// The number of bytes to destructure can be specified with
// expressions or integer literals, e.g.
<<header::56, payload...>> = data
// These can even refer to previously destructured data, e.g.
<<version::uint8, length::uint32, header::length, payload...>> = data
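To show what the last pattern would have to do by hand, here is a rough Python equivalent using the standard struct module. It assumes little-endian decoding with no padding between fields, and parse_frame is a hypothetical name:

```python
import struct

# "<" selects little-endian with no padding; B is uint8 and I is uint32.
PREFIX = "<BI"


def parse_frame(data: bytes):
    """Split data into (version, length, header, payload)."""
    version, length = struct.unpack_from(PREFIX, data, 0)
    offset = struct.calcsize(PREFIX)  # 5 bytes: 1 for version, 4 for length
    header = data[offset:offset + length]
    if len(header) != length:
        raise ValueError("data is shorter than the declared header length")
    # Everything after the header is the payload, like the trailing splat.
    payload = data[offset + length:]
    return version, length, header, payload


frame = bytes([1]) + (3).to_bytes(4, "little") + b"abc" + b"xyz"
print(parse_frame(frame))  # (1, 3, b'abc', b'xyz')
```

The single-line Iroh pattern replaces the manual offset bookkeeping and length check that the Python version has to spell out.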
Pattern Matching
The match keyword supports a broad range of pattern matching constructs. It
can be used to match against literal values, e.g.
match name {
"Alice": print("Hello Alice!")
"Tav": print("Hi, Tav!")
"Zeno": print("Greetings, Zeno!")
}
match colour {
.red: print("The colour is red")
.green: print("The colour is green")
.blue: print("The colour is blue")
}
Integer values can be matched against ranges, e.g.
match n {
1..5: print("Between 1 and 5")
6..10: print("Between 6 and 10")
default: print("Something else")
}
Multiple cases can be matched together using ;, e.g.
match char {
'a'; 'e'; 'i'; 'o'; 'u': print("Got a vowel!")
default: print("Got a consonant!")
}
When match expressions are used in assignments:
-
Single-line arms: the expression is automatically returned.
-
Multi-line arms: need to use
=>to specify the return value.
For example:
category = match char {
// Match arm and case are on the same line, so .vowel
// is automatically returned:
'a'; 'e'; 'i'; 'o'; 'u': letter.vowel
// Match arm and case are on multiple lines, so the =>
// specifies the return value:
default:
print("Got a consonant!")
=> letter.consonant
}
Match expressions must have consistent return types, i.e.
-
Same type: All arms return identical types.
-
Optional type: All arms return the same type or nil.
-
Sum type: Target variable is a sum type, arms return any variant.
When matching is combined with destructuring, sub-elements of a value can be bound to variables, e.g.
Message = enum {
quit
move{x int, y int}
}
msg = Message.move{3, 4}
match msg {
.quit:
print("Quit!")
.move{x, y}:
// Variables `x` and `y` are accessible here:
print("Move to ${x}, ${y}")
}
Full destructured patterns can be matched against, e.g.
Point = struct {x int, y int}
point = Point{0, 7}
match point {
{x: 0, y}: print("On the y-axis at ${y}")
{x, y: 0}: print("On the x-axis at ${x}")
{x, y}: print("At ${x}, ${y}")
}
Guards can be written by following the pattern with a simple if expression,
e.g.
pair = (2, -2)
match pair {
(x, y) if x == y: print("Elements are both equal")
(x, y) if (x + y) == 0: print("Elements sum to zero")
default: print("Other")
}
Note that guard clauses make it possible for a value to match more than one arm, so the order of the arms matters. In these cases, Iroh can’t apply certain optimizations such as perfect hashing.
Variables can be matched against their values directly, e.g.
origin = (0, 0)
point = (2, -2)
match point {
origin:
print("The point is at the origin!")
default:
print("The point is not at the origin")
}
To use existing variables in destructuring patterns, suffix the variable name
with ' to indicate that it should be matched against its value rather than
create a new binding, e.g.
x = 0
point = (2, -2)
match point {
// Match against the existing value of x:
(x', _):
print("The point is on the y-axis!")
// Bind matching values to x and y:
(x, y):
print("The point is at ${x}, ${y}")
}
Functions & Methods
The func keyword is used to define both functions and methods. Functions can
have both parameters and a return value, e.g.
// This takes 2 parameters and returns an int value:
func add(a int, b int) int {
return a + b
}
// This takes no parameters and returns a 2-tuple value:
func get_info() (name string, height int<cm>) {
return ("Tav", 186<cm>)
}
// This takes 2 parameters and returns nothing:
func set_info(id int, info (name string, height int<cm>)) {
cache.update(id, info)
}
Closures, i.e. anonymous functions with associated data, can be defined within
function and method bodies using the => syntax, e.g.
add = (a int, b int) => {
return a + b
}
If the anonymous function body is on the same line as the =>, then the
return is implicit, i.e.
// Return is implicit whether or not there are braces, e.g.
add = (a int, b int) => { a + b }
// Or not:
add = (a int, b int) => a + b
Anonymous functions cannot define their return type, and thus the type of the return value must be inferrable. Functions which don’t take any parameters can omit them, e.g.
newline = () => print("")
// Print 3 newlines:
newline()
newline()
newline()
If anonymous functions don’t specify any parameters, but do receive parameters,
then those parameters need to be inferrable, and can be referred to by their
position, e.g. $0, $1, etc.
people = ["Reia", "Zaia"]
// This explicit function passed to the `map` method:
people.map((person string) => {
return person.upper()
})
// Can be simplified by omitting the parameter:
people.map(() => {
return $0.upper()
})
// Can be further simplified by putting it all on one line:
people.map(() => $0.upper())
// And written even more simply as:
people.map { $0.upper() }
Functions, methods, and closures can all be passed as parameters in function calls, and even saved as values, e.g.
func calc(a int, b int, op (int, int) => int) {
return op(a, b)
}
// Pass the add function as a parameter:
calc(1, 2, add)
Instruction = struct {
op (int, int) => int
a int
b int
}
// Store the add function as a struct field:
next = Instruction{
op: add,
a: 1,
b: 2
}
Functions can be variadic if a parameter name is prefixed with ..., e.g.
func print_names(...names string) {
for name in names {
print(name)
}
}
The variadic parameter is automatically a slice of the given type, and can be called with zero or more values, e.g.
print_names() // Outputs: nothing
print_names("Zeno", "Reia", "Zaia") // Outputs: all 3 names
Slices can use the ... splat operator to expand their elements when calling
functions with variadic parameters, e.g.
names = ["Alice", "Tav"]
print_names(...names)
Parameters can be given default values by following the parameter name with a
: and the default value, e.g.
func greet(name string, greeting: "Hello") {
return "${greeting} ${name}"
}
greet("Alice") // Outputs: Hello Alice
greet("Tav", "Hi") // Outputs: Hi Tav
Iroh will generate edit-time warnings for certain function definitions, e.g.
-
Functions that have more than 6 named parameters, as these tend to result in unwieldy APIs.
-
Functions that use bool parameters instead of clearer enum ones, e.g. bar(123, true) vs. bar(123, .update).
Functions which take a function parameter at the last position can use block syntax for that parameter, e.g.
func calc(a int, b int, op (int, int) => int) {
return op(a, b)
}
// We call `calc` using block syntax for the `op` parameter:
calc(1, 2) { $0 + $1 }
When a parameter is a struct type, it can be inlined by eliding the struct
keyword, e.g.
func run_server(cfg {host string, port int}) {
...
}
Functions which take a struct parameter at the last position can
accept the struct fields as “named” arguments in the function call, e.g.
Config = struct {
log_level enum{info, debug, fatal}
port int
}
func run_server(cfg Config) {
...
}
// The Config fields can be passed in as "named" arguments, e.g.
run_server(log_level: .info, port: 8080)
// As Config fields will be default-initialized, only the
// necessary fields need to be specified, e.g.
run_server(port: 8080)
Function parameters can combine default values, variadic parameters, struct parameters, and trailing functions as long as they follow this order:
-
Positional parameters with types.
-
Positional parameters with default values, i.e. optional parameters, or a variadic parameter.
-
Trailing struct parameter (with optional default field values).
-
Trailing function or block builder parameter.
Parameters can also use destructuring syntax, e.g.
Config = struct {
log_level enum{ info, error, fatal }
port int
}
func run_server({port} = Config) {
// Only the `port` value is available in this scope.
}
func run_server({port: 8080} = Config) {
// The `port` value defaults to 8080 if it's not been specified.
}
For certain types of destructuring, types can be elided, e.g. when destructuring binary data:
func parse_packet(<<header::56, payload...>>) {
...
}
User-defined types “inherit” all of the methods of their underlying type and
have certain methods autogenerated for them, e.g. struct types have the
following methods created for them:
-
copy, deepcopy: for copying the value.
-
__hash__: for deriving the hash value.
-
__str__: for the default string representation of the value.
Custom methods can be defined on a type by prefixing the func definition with
a receiver for that type, e.g.
Config = struct {
host string
port int
}
// Define the `validate` method:
(c Config) func validate() bool {
if !c.host {
return false
}
if 1024 <= c.port <= 65535 {
return true
}
return false
}
// Use the `validate` method:
c = Config{host: "", port: 8080}
if c.validate() {
...
}
The receiver component of the method definition, i.e. (c Config), can use any
variable name to refer to the value of the specified type. There’s no implicit
this or self.
Static methods on the type can be defined by specifying type as the receiver
name, e.g.
(type Config) func web_server() Config {
return Config{
host: "localhost",
port: 8080
}
}
// The static method can now be called on the type:
cfg = Config.web_server()
Groups of methods can be implemented within receiver blocks, e.g.
(c Config) {
validate() bool {
...
}
merge(other Config) Config {
...
}
}
(type Config) {
from_file(path string) Config {
...
}
}
Instance methods and type methods cannot use the same name. This makes things clearer for users and allows for static and instance methods to be aliased, e.g.
// Copy a static method as a function:
from_file = Config.from_file
// Copy an instance method as a function that takes an
// instance value as the first parameter:
validate = Config.validate
cfg = from_file("/home/tav/iroh.cfg")
if validate(cfg) {
...
}
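For readers coming from Python, this aliasing behaves much like taking a reference to a method on a class. The following is a rough Python analogy rather than Iroh code; the Config class, its fields, and the port range check simply mirror the earlier examples and are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Config:
    host: str
    port: int

    @staticmethod
    def web_server() -> "Config":
        # Static method: no instance is needed to call it.
        return Config(host="localhost", port=8080)

    def validate(self) -> bool:
        # Instance method: needs a Config value to operate on.
        return bool(self.host) and 1024 <= self.port <= 65535


# Alias the static method as a plain function:
web_server = Config.web_server
# Alias the instance method; the instance becomes the first argument:
validate = Config.validate

cfg = web_server()
print(validate(cfg))  # True
```

Because instance and static methods can never share a name, an alias such as Config.validate is never ambiguous about which kind of function it refers to.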
Magic Methods
Iroh supports __magic__ methods similar to those in Python. These allow types
to implement or customize their behaviour for operators, built-in functions, and
certain language constructs.
Initialization and deinitialization of values are defined by:
| Method | Description |
|---|---|
| __init__ | Customize initialization with optional parameters |
| __drop__ | Customize cleanup on deinitialization |
Handle initialization from literals:
| Method | Example |
|---|---|
| __seq_literal__ | MyList{1, 2, 3} |
| __map_literal__ | MyMap{"Tav": .parent, "Alice": .parent} |
Define a type as an iterable:
| Method | Description |
|---|---|
| __iter__ | Support for a in x iteration |
| __iter2__ | Support for a, b in x iteration |
Implement the iterator interface:
| Method | Description |
|---|---|
| __next__ | Iterate over the value |
Support copying:
| Method | Description |
|---|---|
| __copy__ | Implement copying of values |
| __deepcopy__ | Implement deep-copying of values |
Provide responses for various built-in functions:
| Method | Description |
|---|---|
| __cap__ | The output for cap(x) |
| __hash__ | The output for hash(x) |
| __len__ | The output for len(x) |
Make the value callable like a function:
| Method | Description |
|---|---|
| __call__ | The function to call on x() |
Implement mem.allocator:
| Method | Description |
|---|---|
| __create__ | Allocate a value |
| __destroy__ | Deallocate a value |
Convert the value into a string, e.g. for use with print:
| Method | Description |
|---|---|
| __str__ | Coerce the value into a string value |
Convert the value into other types, e.g.
| Method | Description |
|---|---|
| __bool__ | Coerce the value into a bool value |
| __int__ | Coerce the value into an integer value |
| __decimal__ | Coerce the value into a decimal value |
| __float__ | Coerce the value into a floating-point value |
| __index__ | Coerce the value for indexing/slicing |
Convert from other types when being assigned, e.g.
| Method | Description |
|---|---|
| __from__ | Coerce from a generic type |
Implement comparisons:
| Method | Usage |
|---|---|
| x.__eq__(y) | x == y |
| x.__neq__(y) | x != y |
| x.__lt__(y) | x < y |
| x.__le__(y) | x <= y |
| x.__gt__(y) | x > y |
| x.__ge__(y) | x >= y |
Overload mathematical operators:
| Method | Usage |
|---|---|
| x.__add__(y) | x + y |
| x.__radd__(y) | y + x |
| x.__sub__(y) | x - y |
| x.__rsub__(y) | y - x |
| x.__mul__(y) | x * y |
| x.__rmul__(y) | y * x |
| x.__div__(y) | x / y |
| x.__rdiv__(y) | y / x |
| x.__mod__(y) | x % y |
| x.__rmod__(y) | y % x |
| x.__pow__(y) | x ** y |
| x.__rpow__(y) | y ** x |
| x.__inner_product__(y) | x · y |
| x.__rinner_product__(y) | y · x |
| x.__tensor_product__(y) | x ⊗ y |
| x.__rtensor_product__(y) | y ⊗ x |
| x.__cross_product__(y) | x × y |
| x.__rcross_product__(y) | y × x |
| x.__neg__ | -x |
| x.__pos__ | +x |
| x.__percent__ | x% |
| x.__abs__ | \|x\| |
| x.__norm__ | \|\|x\|\| |
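The reflected variants (the methods prefixed with r) cover the case where the value appears on the right-hand side of the operator. Since these methods are modelled on Python's, the Python sketch below shows how such a pair typically works. It is an analogy only: the Metres type is hypothetical, and whether Iroh dispatches in exactly the same order as Python is an assumption.

```python
class Metres:
    """A hypothetical length type used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Handles `Metres + x`.
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        return NotImplemented

    def __radd__(self, other):
        # Handles `x + Metres` when x does not know how to add a Metres.
        return self.__add__(other)


a = Metres(2) + 3  # Calls Metres.__add__
b = 3 + Metres(2)  # int cannot add a Metres, so Metres.__radd__ is tried
print(a.value, b.value)  # 5 5
```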
Overload dereferencing to use another value instead:
| Method | Usage |
|---|---|
| x.__deref__ | x |
Overload attribute access:
| Method | Usage |
|---|---|
| x.__getattr__(y) | x.y |
| x.__setattr__(y, value) | x.y = value |
Overload indexing operators:
| Method | Usage |
|---|---|
| x.__getitem__(y) | x[y] |
| x.__setitem__(y, value) | x[y] = value |
Overload bitwise operators:
| Method | Usage |
|---|---|
| x.__and__(y) | x & y |
| x.__rand__(y) | y & x |
| x.__or__(y) | x \| y |
| x.__ror__(y) | y \| x |
| x.__xor__(y) | x ^ y |
| x.__rxor__(y) | y ^ x |
| x.__not__ | ^x |
| x.__and_not__(y) | x &^ y |
| x.__rand_not__(y) | y &^ x |
| x.__lshift__(y) | x << y |
| x.__rlshift__(y) | y << x |
| x.__rshift__(y) | x >> y |
| x.__rrshift__(y) | y >> x |
Overload other operators:
| Method | Usage |
|---|---|
| x.__contains__(y) | y in x |
Iroh will generate edit-time errors if magic methods like __eq__ or __hash__
are not pure. This is done to avoid the java.net.URL mistake, where URL
comparisons trigger DNS lookups, e.g.
import java.net.URL;
URL a = new URL("http://example.com");
URL b = new URL("http://example.com");
// Comparing URL values will do a DNS lookup in Java!
System.out.println(a.equals(b));
Derived Types
Types can be derived from existing types using the @derive function, e.g.
MyString = @derive(string)
The memory layouts of derived types match those of the underlying types, and all existing methods of the underlying type are copied over to the new type so that it behaves similarly, e.g.
MyInt = @derive(int64)
a = MyInt(1)
b = MyInt(2)
c = a + b
c == 3 // true
Each derived type is distinct, e.g.
CustomerID = @derive(int64)
ProductID = @derive(int64)
func get_customer(id CustomerID) Customer {
...
}
func get_product(id ProductID) Product {
...
}
id = CustomerID(123)
product = get_product(id) // ERROR! Expected ProductID, got CustomerID
This allows for type safety to be enforced while still preserving the functionality of the underlying type and without any additional complexity or runtime overhead.
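To make the value of distinct ID types concrete, here is a rough Python analogy. It is not Iroh code, and it differs in two important ways that are worth stating up front: Python can only reject the wrong type at runtime, whereas Iroh reports it at edit time, and the isinstance check below is a hand-written assumption standing in for what the Iroh compiler does automatically.

```python
class CustomerID(int):
    """Hypothetical ID type: a distinct subclass of int."""


class ProductID(int):
    """Hypothetical ID type: a distinct subclass of int."""


def get_product(product_id):
    # Python can only reject the wrong ID type at runtime.
    if not isinstance(product_id, ProductID):
        raise TypeError(
            f"expected ProductID, got {type(product_id).__name__}"
        )
    return f"product #{int(product_id)}"


print(get_product(ProductID(123)))  # product #123

try:
    get_product(CustomerID(123))
except TypeError as err:
    rejected = str(err)
    print(rejected)  # expected ProductID, got CustomerID
```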
Derived types need to be explicitly converted, whether between different types or between derived types and their base types, e.g.
DatabaseID = @derive(int64)
ProductID = @derive(int64)
id = DatabaseID(123)
product_id = ProductID(id)
int_id = int64(product_id)
The only implicit conversion is when a value of the root base type is passed or assigned where a derived type is expected, e.g.
DatabaseID = @derive(int64)
ProductID = @derive(int64)
func get_product(id ProductID) {
...
}
int_id = int64(123)
db_id = DatabaseID(123)
product_id = ProductID(123)
get_product(1234) // Works!
get_product(int_id) // Works!
get_product(product_id) // Works!
get_product(db_id) // Error! Needs to be cast to ProductID.
get_product(ProductID(db_id)) // Works!
product_id = 456 // Works too!
Custom methods can be added to derived types, e.g.
ProductID = @derive(int64)
(p ProductID) func get_from_db(db DB) Product {
return db.get("/products/${p}", Product)
}
id = ProductID(123)
product = id.get_from_db(db)
Custom methods can even override methods that have been copied over from the base type, e.g.
ProductID = @derive(int64)
(p ProductID) func __add__(other ProductID) ProductID {
return 0
}
a = ProductID(123)
b = ProductID(456)
c = a + b
c == 0 // true
Methods from the base type are copied, not inherited. As such, there’s no
super or Parent::member access, e.g.
ProductID = @derive(int64)
(p ProductID) func __add__(other ProductID) ProductID {
// Since you cannot access int64.__add__ here, you need
// to convert the types explicitly and do things using
// publicly accessible operators, e.g.
result = int64(p) + int64(other)
return ProductID(2 * result)
}
a = ProductID(123)
b = ProductID(456)
c = a + b
c == 1158 // true
We believe this makes the code more explicit about what’s happening without the magic of inheritance. Derived types can control what methods get copied by specifying either:
-
with: explicitly specify which methods get copied.
-
without: explicitly specify which methods don’t get copied.
Both options take a slice of string method names, e.g.
ProductID = @derive(int64, without: ["__add__", "__radd__"])
a = ProductID(123)
b = ProductID(456)
// The following results in an edit-time error as the type
// does not support addition:
c = a + b
Derived types can be used as the base for new derived types, e.g.
// Only allow == and != comparisons:
ComparableID = @derive(int64, with: ["__eq__", "__neq__"])
// Derive ID types from ComparableID:
CustomerID = @derive(ComparableID)
ProductID = @derive(ComparableID)
// Instantiate some values:
cust1 = CustomerID(123)
cust2 = CustomerID(456)
// These operations work as intended:
if cust1 == cust2 {
...
}
customers = [cust1, cust2]
// But other operations will raise an edit-time error:
invalid_sum = cust1 + cust2 // ERROR!
Types that should no longer be used can be marked as deprecated, e.g.
CustomerID = @derive(uint64, deprecated: "Use CustomerUUID instead")
But perhaps the most interesting aspect of derived types is that they can “lock in” specific configurations for the underlying type, e.g.
UTF16 = @derive(encoded_string, params: {
encoding: {
overrideable: false,
value: .utf_16
}
})
ReadOnlyFile = @derive(MyFile, params: {
mode: {
overrideable: false,
value: .read_only
}
})
This strengthens type safety across various domains. As parameter locking happens entirely at compile time, it leads us to an important aspect of derived types: their performance characteristics.
With the exception of any poorly implemented overridden methods, derived types will have identical performance to their base types, i.e. the same size in memory, same operations, etc.
The only downside is an increase in binary size due to the copying of methods. However, since we only include code that actually gets used, we believe this is a reasonable trade-off.
Conditionals
Iroh uses if, else if, and else blocks like most languages, e.g.
if i%3 == 0 and i%5 == 0 {
print("FizzBuzz")
} else if i%3 == 0 {
print("Fizz")
} else if i%5 == 0 {
print("Buzz")
} else {
print(i)
}
The conditions for if and else if need to evaluate to a bool value. If a
value can’t be converted to a bool, it will generate an error, e.g.
Person = struct {
first_name string
last_name string
}
user = Person{}
// The following will generate a compile-time error as user
// cannot be converted to a bool.
if user {
...
}
For convenience, most built-in types support conversion to bool, e.g.
name = "Tav"
// Instead of this explicit conditional check:
if len(name) > 0 {
...
}
// The string can be used directly as string values only evaluate
// to true when they have a positive length.
if name {
...
}
The falsiness of values of built-in types is given by:
| Type | When False |
|---|---|
| Booleans | false |
| Numbers | 0 |
| Arrays | len(x) == 0 |
| Slices | len(x) == 0 |
| Strings | len(x) == 0 |
| Maps | len(x) == 0 |
Conditions can be negated with an !, e.g.
if !name {
...
}
One can check if a value is within a range using chained comparisons, e.g.
// Check if (age >= 21) and (age <= 35):
if 21 <= age <= 35 {
...
}
Assignments and destructuring can also be done within an if conditional as
long as a ; separated conditional is also checked, e.g.
if resp = api.get_user(id); resp.success {
// The resp variable does not pollute the outer scope.
}
User-defined types can define a __bool__ method if they want to opt into
automatic type coercion into bools, e.g.
State = struct {
command string
is_running bool
}
(s State) func __bool__() bool {
return s.is_running
}
state = State{}
if state { // Automatically checks state.is_running
...
}
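Python uses the same __bool__ hook, so the behaviour can be tried out there. The sketch below is a Python analogy, not Iroh code. One difference to keep in mind: a Python object without __bool__ is simply truthy, whereas Iroh rejects the conditional outright.

```python
from dataclasses import dataclass


@dataclass
class State:
    command: str = ""
    is_running: bool = False

    def __bool__(self) -> bool:
        # Truthiness follows the is_running field.
        return self.is_running


# Built-in values follow the same emptiness rules as the falsiness
# table earlier in this section:
print(bool(0), bool(""), bool([]), bool({}))  # False False False False

state = State()
print("running" if state else "stopped")  # stopped
state.is_running = True
print("running" if state else "stopped")  # running
```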
Loops
Most languages tend to provide multiple constructs for looping, e.g. for,
while, do, foreach, repeat, etc. This can be slightly confusing for
those new to programming.
So Iroh instead follows Go’s approach and only uses one keyword, for, for all
looping. Using for by itself results in an infinite loop, e.g.
for {
// Will keep running code in this block indefinitely.
}
Loops can be broken with the break keyword, e.g.
for {
now = time.now()
// This loop will stop as soon as the year ticks over.
if now.year > 2025 {
break
}
print(now)
time.sleep(1<s>)
}
Loops can be C-like, i.e.
for initialization; condition; increment {
...
}
For example:
for i = 0; i < 5; i++ {
print(i) // Prints 0, 1, 2, 3, 4
}
Loops can use the continue keyword to skip to the next iteration, e.g.
for i = 0; i < 5; i++ {
if i == 1 {
continue
}
print(i) // Prints 0, 2, 3, 4
}
To avoid common bugs, e.g. when loop variables are captured by closures, loop variables have per-iteration scope instead of per-loop scope, e.g.
for i = 0; i < 5; i++ {
func print_value() {
print(i)
}
print_value() // Prints 0, 1, 2, 3, 4
}
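The kind of bug that per-iteration scope avoids is easy to reproduce in a language with per-loop scope. The Python sketch below (an illustration of the problem, not Iroh code) stores closures for later use; every closure ends up seeing the final value of the loop variable unless the current value is bound explicitly.

```python
# Python shares one `i` across all iterations, so every closure
# sees the final value once the loop has finished.
shared = []
for i in range(3):
    shared.append(lambda: i)
shared_results = [f() for f in shared]
print(shared_results)  # [2, 2, 2]

# Binding the current value as a default argument gives each closure
# its own copy, which is what per-iteration scope does automatically.
per_iteration = []
for i in range(3):
    per_iteration.append(lambda i=i: i)
per_iteration_results = [f() for f in per_iteration]
print(per_iteration_results)  # [0, 1, 2]
```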
Loops can be nested, and labels can be used to exit specific loops, e.g.
outer: // a label for the outer loop, can be any identifier
for i = 0; i < 3; i++ {
for j = 0; j < 3; j++ {
print("={i}, ={j}")
if i*j == 4 {
print("Breaking out of both loops")
break outer
}
}
}
To execute code only when a loop has not been interrupted with a break, the
for loop can be followed by a fully branch, e.g.
for i = 0; i < 3; i++ {
if i == 2 {
break
}
print(i)
} fully {
print("All values got printed!")
}
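The closest equivalent in an existing language is probably Python's for/else, where the else branch only runs if the loop was not interrupted by a break. The following Python sketch is an analogy only, and the find_first_negative helper is a made-up example.

```python
def find_first_negative(values):
    for v in values:
        if v < 0:
            result = f"found {v}"
            break
    else:
        # Runs only when the loop finished without hitting `break`.
        result = "no negatives"
    return result


print(find_first_negative([1, 2, 3]))   # no negatives
print(find_first_negative([1, -2, 3]))  # found -2
```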
Loops can also be conditional, i.e. will keep looping while the condition is
true, e.g.
for len(x) > 0 {
print(x.pop())
}
To loop over ranges, the for keyword can be combined with the in keyword,
e.g.
for i in 0..5 {
print(i) // Prints 0, 1, 2, 3, 4, 5
}
This also works with collections, e.g. iterating over a slice:
users = [{"name": "Tav"}]
for user in users {
print(user["name"])
}
Most collections also support iterating using in with 2 variables, e.g. in
slices this will return each element’s index as well as the element itself:
users = [{"name": "Tav"}]
for idx, user in users {
print("${idx}: ${user["name"]}")
}
Similarly, iterating using just 1 variable over a map value gives just the
keys, e.g.
user = {"name": "Tav", "location": "London"}
for key in user {
print(key) // Prints name, location
}
While iterating using 2 variables gives both the key and the value, e.g.
user = {"name": "Tav", "location": "London"}
for key, value in user {
print("${key} = ${value}")
}
When the loop variables are not needed, they can be elided, e.g.
for 0..5 {
...
}
Unlike boolean conditions which are re-evaluated on every loop, iterable expressions are only evaluated once. This is particularly useful when the iterables yield lazily, e.g.
for rate_limiter.available_slots() {
handle_next_request()
}
User-defined types can add support for iteration by defining the __iter__
method which needs to return a type implementing the built-in iterator
interface.
Types implementing iterator need to have a __next__ method which returns the
next value in the sequence, or nil when the iteration is complete, e.g.
Counter = struct {
current int
max int
}
(c Counter) func __iter__() iterator {
return c
}
(c Counter) func __next__() ?int {
if c.current < c.max {
current = c.current
c.current++
return current
}
return nil
}
counter = Counter{current: 2, max: 5}
for i in counter {
print(i) // Prints 2, 3, 4
}
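For comparison, the same protocol exists in Python under the same method names. The sketch below is a Python analogy rather than Iroh code; the main difference is that Python signals the end of iteration by raising StopIteration instead of returning nil.

```python
class Counter:
    def __init__(self, current, limit):
        self.current = current
        self.limit = limit

    def __iter__(self):
        # The counter acts as its own iterator.
        return self

    def __next__(self):
        if self.current < self.limit:
            value = self.current
            self.current += 1
            return value
        # Python's equivalent of returning nil when iteration is complete.
        raise StopIteration


print(list(Counter(2, 5)))  # [2, 3, 4]
```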
Structured Flow Error Handling
As developers, we spend a large part of our time handling error conditions. Despite this, most languages do a terrible job of making this easy and robust:
-
C++ provides multiple error handling mechanisms: exceptions, error codes, std::optional, std::variant, std::expected, and even legacy C-style approaches.
This creates inconsistencies within projects, even within the C++ standard library, and interoperating between these paradigms adds a lot of unnecessary complexity.
-
Java got some things right with checked exceptions, but the implementation suffers from poor tooling and newer language features like generics weren’t designed with these in mind.
-
Go is famous for its verbose and repetitive if err != nil error handling that makes code hard to read and understand. Ironically, it’s still easy to accidentally ignore errors.
-
Rust gets a lot right, e.g. the built-in Result<T, E> type, rich error types with pattern matching, forcing all errors to be handled, the ? operator for propagating errors concisely, etc.
But, despite this, it’s still painful. Since libraries create their own errors for robustness, this leads to complex error hierarchies and extensive error conversion code.
You need to use libraries like thiserror and anyhow just to manage the boilerplate, leading to problems like increased compile times and loss of type information.
-
Swift has some fantastic ergonomics around error handling, but errors often miss contextual information, are tricky to handle across async boundaries, and use multiple styles like in C++.
-
Zig provides fast and predictable error handling, but it’s not expressive enough for many real world applications where you need to pass along contextual information to users.
The reality is errors can happen anywhere, e.g.
// Even creating a simple string can error due to running
// out of memory, e.g.
x = "Hello World!"
// Even basic addition can error due to overflow, e.g.
c = a + b
Given this, we believe that it is unreasonable to expect developers to:
-
Handle every error as it happens. This will result in extremely verbose code as most of it would be spent handling errors.
-
Manually document all the errors that a function might generate, as this will create a lot of noise in type signatures.
This is one of the reasons that the most used languages like JavaScript, Python, Java, and C# all use exceptions. They make life easy for developers.
But despite the boost in developer productivity, exception-based error handling is terrible:
-
It’s often unclear what exceptions may be thrown where, or how a program will behave in failure cases.
-
They can add significant overhead to runtime performance in comparison to just checking return values.
-
They can hide programming errors and make debugging difficult.
So Iroh takes a different approach that we call SFE (Structured Flow Errors). These are value-based errors similar to Rust, but where errors are automatically unwrapped and propagated like exceptions.
| | Iroh SFE | Return Values | Exceptions |
|---|---|---|---|
| Structured | ✅ | ✅ | ❌ |
| Flow | ✅ | ❌ | ✅ |
| Auto-Stacked | ✅ | ❌ | ❌ |
All functions that could error implicitly return an outcome type as their
result. This is generic over a SuccessType and an ErrorSet enum of the
specific errors a function might return.
For example, for this function:
func calc(a int, b int) int {
return a / b
}
The return type is implicitly inferred to be outcome[int]CalcErrorSet where
the error set is this auto-generated enum:
CalcErrorSet = enum {
std.DivisionByZero
}
As we do full-program analysis and have our own custom editor, we can:
-
Automatically annotate each function with all of the errors that it might return without needing developers to keep it updated.
-
Make it super easy for developers to see all the lines where errors might be generated and show which ones are handled or not handled.
-
Automatically generate an anonymous enum of all the potential errors.
-
Automatically wrap both the success and error returns as an outcome value.
-
Automatically check, unwrap, and propagate error values as return values.
This allows developers to focus on the happy path, e.g.
func process_order(user_id UserID, product_id ProductID) {
user = get_user(user_id)
product = get_product(product_id)
product.check_inventory()
user.create_order(product)
user.charge_payment(product.price)
user.send_confirmation_email()
}
The auto-imported std package defines a number of built-in errors, e.g.
| Error | Description |
|---|---|
| BrokenPipe(pipe) | Broken pipe or stream |
| ConnectionReset(conn) | Connection was reset by peer |
| DivisionByZero | Division by zero |
| IOError | Category for I/O errors |
| InvalidCharacter(src) | Invalid character encountered |
| EOF | End of file or stream reached |
| FileNotFound(path) | File could not be found |
| RetriableError | Category for errors that could be potentially retried |
| OutOfMemory | Allocation failed due to insufficient memory |
| Overflow | Calculation resulted in a value overflowing |
Packages can define their own errors at the package top-level, e.g.
errorset {
ConnectionTimedOut(duration int<s>)
InvalidURL(url string)
HTTPStatus(code int, message string)
}
When an errorset is defined, the compiler automatically:
-
Generates new types for each of the variants. So, if the above definition was in the network package, you could access them as network.ConnectionTimedOut, network.InvalidURL, etc.
-
Generates methods for error handling, e.g. string formatting, etc.
Each error can also specify an optional string format, e.g.
errorset {
ConnectionTimedOut(duration int<s>) = "Connection timed out after ${duration}"
}
So that when the errors are printed they emit something meaningful instead of cryptic error names, e.g.
print(ConnectionTimedOut(30<s>))
// Outputs:
// Connection timed out after 30 seconds
When a string format isn’t specified, Iroh automatically generates one based on the following:
-
The label of a variant, e.g. InvalidURL, is automatically converted from PascalCase to a sequence of space-separated words, i.e. Invalid URL.
-
For variants with no fields, the format is just ${label}.
-
For variants with just 1 field, ${label}: ${field1_value}.
-
For variants with 2+ fields, each field’s name and value are included, separated by commas.
For example, given this errorset:
errorset {
HostUnreachable
InvalidURL(url string)
HTTPStatus(code int, message string)
}
The auto-generated string formats would look like:
// Host Unreachable
HostUnreachable()
// Invalid URL: httpx://www.google.com
InvalidURL("httpx://www.google.com")
// HTTP Status: code: 404, message: not found
HTTPStatus(404, "not found")
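The formatting rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not Iroh's actual implementation: the split_label and default_format helpers are hypothetical names, and the exact word-splitting rule (treating a run of capitals as one acronym) is an assumption chosen to reproduce the three examples.

```python
import re


def split_label(label: str) -> str:
    # Insert a space between a lowercase letter or digit and an uppercase
    # letter, and between an acronym and the capitalized word after it.
    pattern = r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])"
    return re.sub(pattern, " ", label)


def default_format(label: str, fields: dict) -> str:
    words = split_label(label)
    if not fields:
        # No fields: just the label.
        return words
    if len(fields) == 1:
        # One field: the label followed by the field's value.
        (value,) = fields.values()
        return f"{words}: {value}"
    # Two or more fields: include each field's name and value.
    pairs = ", ".join(f"{name}: {value}" for name, value in fields.items())
    return f"{words}: {pairs}"


print(default_format("HostUnreachable", {}))
print(default_format("InvalidURL", {"url": "httpx://www.google.com"}))
print(default_format("HTTPStatus", {"code": 404, "message": "not found"}))
```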
Since Iroh recognizes error types automatically, they can be returned without needing to be wrapped in any way, e.g.
func connect(host string, timeout int<s>) socket {
...
return ConnectionTimedOut(timeout)
}
The only exception is when a function actually returns an error value on
successful execution. In these cases, return statements need to be explicitly
annotated with @success or @failure, e.g.
func convert_timeout(src error) error {
match src {
TimeoutError(duration):
return @success ConnectionTimedOut(duration)
default:
return @failure InvalidError(src)
}
}
An errorgroup can be used to define new error groupings. Error group names
must all end in Error, e.g.
errorgroup {
NetworkError
ValidationError
}
You can specify parent error groups for an errorgroup to create hierarchical
relationships, e.g.
errorgroup(std.IOError) {
NetworkError
}
Any number of these groups can be specified for an errorset to apply the
groupings for the errors being defined, e.g.
errorset(std.RetriableError, NetworkError) {
ConnectionTimedOut(duration int<s>)
}
errorset(std.RetriableError, postgres.Error) {
QueryTimedOut(duration int<s>)
}
Error groups act as tags, allowing developers to create domain-specific error taxonomies where errors can participate in multiple error handling strategies easily.
When the possible errors from a function are displayed within the Iroh editor, they are automatically structured into groups. This takes into account any overlaps and hierarchy between error types.
Errors can be matched against specific error types or error groups at the same time, with the matching taking place sequentially, e.g.
match err {
sql.QueryFailed:
// err is type-narrowed to sql.QueryFailed here:
log.error("Query failed ${err.sql_query}")
return
std.IOError:
// err could be any error type tagged with IOError,
// here, e.g. FileNotFound, NetworkError, etc.
//
// You can handle all I/O errors generically, or
// type-narrow it further:
if err is ConnectionTimedOut {
// err is type-narrowed to ConnectionTimedOut here:
log.warn("Connection timed out after ${err.duration}")
} else {
log.warn("I/O operation failed: ${err}")
}
}
The is operator matches against both the specific error type and all error
groups that it has been tagged with, e.g.
err = ConnectionTimedOut(10<s>)
err is ConnectionTimedOut // true
err is std.RetriableError // true
err is std.RetriableError and err is NetworkError // true
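One way to picture this tag-based matching is as a lookup over the groups an error has been tagged with, plus any parent groups. The Python sketch below models that idea with plain dictionaries. It is an assumption about the semantics rather than a description of the compiler: the registry layout and the is_a helper are hypothetical, and the transitive lookup through parent groups is inferred from the std.IOError example above.

```python
# Hypothetical registry: each group lists its parent groups, and each
# error type lists the groups it has been tagged with.
GROUP_PARENTS = {
    "NetworkError": {"IOError"},
    "IOError": set(),
    "RetriableError": set(),
}
ERROR_TAGS = {
    "ConnectionTimedOut": {"RetriableError", "NetworkError"},
}


def is_a(error_type: str, target: str) -> bool:
    """Report whether error_type is target or is tagged with it."""
    if error_type == target:
        return True
    pending = list(ERROR_TAGS.get(error_type, ()))
    seen = set()
    while pending:
        group = pending.pop()
        if group == target:
            return True
        if group in seen:
            continue
        seen.add(group)
        # Follow the hierarchy up through any parent groups.
        pending.extend(GROUP_PARENTS.get(group, ()))
    return False


print(is_a("ConnectionTimedOut", "RetriableError"))   # True
print(is_a("ConnectionTimedOut", "IOError"))          # True
print(is_a("ConnectionTimedOut", "ValidationError"))  # False
```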
The default behaviour of automatically unwrapping and propagating errors can be
prevented by using the try keyword, e.g.
sock = try connect(host)
This will return the implicit outcome value from the connect call. The
unwrap method can then be used to explicitly unwrap the outcome value, e.g.
sock = try connect(host)
s = sock.unwrap() // Propagates the error if the outcome contains an error.
type(sock) == outcome[socket] // true
type(s) == socket // true
As this is the default behaviour, the above is just equivalent to:
s = connect(host)
As outcome values are falsey when they have errors, it is more ergonomic to
handle them conditionally, e.g.
sock = try connect(host)
if !sock {
// The sock value is an error here, which is accessible
// via the .error attribute:
print("Failed to connect to ${host}: ${sock.error}")
return
}
sock.write("Hello!")
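To show the shape of an outcome value, here is a minimal Python sketch. It is an analogy under stated assumptions, not Iroh's implementation: the Outcome class, the try_ helper, and the connect function are all hypothetical, and Python exceptions stand in for Iroh's automatic propagation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Outcome:
    value: Any = None
    error: Optional[Exception] = None

    def __bool__(self) -> bool:
        # Falsey when the outcome carries an error.
        return self.error is None

    def unwrap(self) -> Any:
        # Propagate the error if there is one, else return the value.
        if self.error is not None:
            raise self.error
        return self.value


def try_(func, *args) -> Outcome:
    """Hypothetical stand-in for `try`: capture instead of propagating."""
    try:
        return Outcome(value=func(*args))
    except Exception as err:
        return Outcome(error=err)


def connect(host: str) -> str:
    if not host:
        raise ConnectionError("no host given")
    return f"socket:{host}"


sock = try_(connect, "")
if not sock:
    message = f"Failed to connect: {sock.error}"
else:
    message = "connected"
print(message)  # Failed to connect: no host given
```

Unlike this sketch, Iroh narrows the type inside each branch, so the success value is not reachable in the error branch at all.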
As Iroh automatically unwraps and narrows the type based on the conditional,
there’s no way to accidentally access a success value or .error value that
isn’t there, unlike with, say, std::expected in C++.
Also, while our outcome data type is heavily inspired by Haskell, we don’t
bake in chaining like the Either monad, despite its benefits, so as to limit
the number of ways that one can handle errors in Iroh.
As outcome values with errors are falsey and try binds more tightly than or,
the try <expr> or <fallback> pattern can be used to set
fallback values, e.g.
profile_image = try user.get_image() or default_profile_image
This is more ergonomic than:
profile_image = try user.get_image()
if !profile_image {
profile_image = default_profile_image
}
The or can also take an optional block of statements after it. When used with
try, a special $error value can be accessed from within this block.
The $error value represents the unwrapped error value from the preceding try
expression and can be used to handle the error, e.g.
sock = try connect(host) or {
print("Failed to connect to ${host}: ${$error}")
return
}
For explicit error handling, one can match against the unwrapped error with the particular set of errors or error groups, e.g.
req = try http.get(url)
if !req {
match req.error {
// Match an explicit error
http.InvalidURL:
send_response(status_code: http.BadRequest, contents: "Invalid URL")
return
// Match an error group
std.RetriableError:
retry_with_back_off()
}
}
Iroh provides a catch keyword to make this even simpler where the catch
automatically unwraps the error and doubles up as a match on it, e.g.
req = catch http.get(url) {
http.InvalidURL:
send_response(status_code: http.BadRequest, contents: "Invalid URL")
return
std.RetriableError:
retry_with_back_off()
}
Note that when matching against errors from methods on interface values, the set of errors includes all potential implementations of that interface at that call site.
A @`context message` can be added at any point
where errors are automatically propagated. This will automatically populate the
.stack property on error values, e.g.
func run_app() {
app = init_app() @`Failed to initialize the app`
app.serve()
}
func init_app() App {
cfg = init_cfg() @`Failed to initialize the config`
db = init_db(cfg.db_host) @`Failed to initialize the database`
return App{db: db}
}
func init_db(host string) DB {
sock = net.connect(host) @`Failed to connect to the database at ${host}`
return open_database(sock)
}
func main() {
try run_app() or {
print_err("ERROR:\n${$error.stack}")
}
}
For example, if net.connect() in the code above returns a
ConnectionTimedOut(30<s>) error, then printing $error.stack might
output something like:
// ERROR:
// 1. Failed to initialize the app · myapp:2
// 2. Failed to initialize the database · myapp:8
// 3. Failed to connect to the database at db.espra.com · myapp:12
// 4. Connection timed out after 30 seconds · net:630
Each element of the stack contains a:
-
message: either the stringified form of an error value or a @`context message`.
-
location: depending on the program size, either a uint32 or uint64 representing the exact package, version, and line of code that resulted in the new stack element.
We believe this is much more ergonomic than having to wrap errors everywhere.
The .stack property essentially creates a narrative of what went wrong.
The base of the stack always contains the string representation of the original
error. Additionally, when match or catch return new errors, the original
error stack is automatically propagated.
This can be a massive help during debugging, as it preserves the contextual information from the original error stack instead of it being accidentally swallowed.
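The way context messages accumulate as an error propagates can be approximated in Python with a context manager. This is a sketch of the idea only, not how Iroh implements it: StackedError and context are hypothetical names, locations are omitted, and the connect function always fails purely for demonstration.

```python
from contextlib import contextmanager


class StackedError(Exception):
    def __init__(self, message):
        super().__init__(message)
        # The base of the stack is the original error message.
        self.stack = [message]


@contextmanager
def context(message):
    try:
        yield
    except StackedError as err:
        # Outer context goes to the front so the stack reads top-down.
        err.stack.insert(0, message)
        raise


def connect(host):
    # Always fails, to demonstrate how the stack is built up.
    raise StackedError("Connection timed out after 30 seconds")


def init_db(host):
    with context(f"Failed to connect to the database at {host}"):
        connect(host)


def init_app():
    with context("Failed to initialize the database"):
        init_db("db.espra.com")


def run_app():
    with context("Failed to initialize the app"):
        init_app()


try:
    run_app()
except StackedError as err:
    stack = list(err.stack)
    report = "\n".join(f"{n}. {msg}" for n, msg in enumerate(stack, 1))
    print(report)
```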
The @with_error_stack special parameter can be used when constructing an error
to specify a custom error stack, e.g.
func init_db(primary_host string, secondary_host string) DB {
sock = catch net.connect(primary_host) {
net.ConnectionTimedOut:
primary_err = $error
=> catch net.connect(secondary_host) {
net.ConnectionTimedOut:
stack = $error.stack.merge(primary_err.stack, "Failed to establish connection to ${primary_host}")
return SecondaryDatabaseOffline(@with_error_stack: stack)
}
}
return open_database(sock)
}
Likewise, @reset_error_stack can be specified to tell the compiler not to
propagate the original error stack, e.g.
func login(username string, passphrase string) User {
user = catch auth_user(username, passphrase) {
default:
return AuthenticationFailed(@reset_error_stack: true)
}
return user
}
In short, we believe that SFE captures the ergonomics of exceptions with the safety and structure of explicit error values. It also adds automatic context stacking and flexible error hierarchies.
Const Values
Iroh supports const values of different kinds. If the const is on the
left-hand side, then the value is compile-time evaluated, e.g.
const x = factorial(5)
x == 120 // true
All dependencies of such evaluations need to be compile-time evaluatable and cannot depend on runtime input, e.g.
const private_key_modulus = read_modulus() // ERROR!
func read_modulus() uint4096 {
return uint4096(io.read_all(stdin))
}
The @read_file and @read_url compile-time functions allow for reading
various resources at compile-time, e.g.
const private_key = @read_file("~/.ssh/id_ed25519")
const logo_png = @read_url("https://assets.espra.com/logo.png")
Compile-time reads are cached on first read, and need to be explicitly uncached
by running a clean build or by marking the resource with watch, e.g.
const config_data = @read_file("config.json", watch: true)
This mechanism is what powers our build system, e.g.
import "build"
func build(b build.Config) {
glfw = b.add_static_library("glfw", sources: glob("glfw/src/*.c"))
.add_include_path("glfw/include")
b.add_executable(root: "src/main.iroh")
.link_library(glfw)
.install()
}
Instead of needing a separate build config and language, like CMake, Autotools, or Gradle, we have the full power of Iroh available in our compile-time build system.
Compile-time const values can be defined within the top-level scope of a package or within function bodies, e.g.
func serialize_value() {
const debug = false
if debug {
...
}
}
Optimizations are then done based on these compile-time values, e.g. in the
above example, the entire if debug block will be fully optimized away.
If const is on the right side of an assignment, i.e. at the head of an
expression, then it is not compile-time evaluated, and instead marks the result
of the expression as immutable, e.g.
path = "/home/tav"
split = const path.split("/")
split[1] = "alice" // ERROR! Cannot mutate an immutable value
When a value is marked as immutable, it can no longer be mutated. To support this, the compiler will try to re-use existing allocations wherever possible, and only make copies when necessary.
Interfaces
Interfaces in Iroh specify the exact methods that a type needs to implement to
satisfy a particular interface, e.g.
reader = interface {
// Read returns the number of bytes that have been read
// into the given buffer. If no further data is available
// to be read, then it must return 0.
read(buf []byte) int
}
Any type that implements the specified methods implicitly satisfies the
interface and can be tested using the is operator, e.g.
File = struct{}
(f File) read(buf []byte) int {
return 0
}
File is reader // true
Anything that needs an interface value can then be passed such values without much ceremony, e.g.
func process(r reader) {
...
}
process(File{})
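For readers more familiar with Python, implicit interface satisfaction is roughly comparable to structural typing with `typing.Protocol`. This is a Python analogy under that assumption, not Iroh code, and the check it performs happens at runtime rather than at edit time.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Reader(Protocol):
    def read(self, buf: bytearray) -> int:
        ...


class File:
    # File never mentions Reader; it only has a method of the right shape.
    def read(self, buf: bytearray) -> int:
        return 0


def process(r: Reader) -> int:
    return r.read(bytearray(16))


is_reader = isinstance(File(), Reader)
result = process(File())
```

Note that `runtime_checkable` only checks that a `read` attribute exists; it does not verify the method's signature.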
By default, interface values are stored as “fat pointers”, i.e. a pointer to the type descriptor for the concrete type and a pointer to the actual value (or a copy, depending on size/semantics).
This is then used to dynamically dispatch method calls on the interface value, e.g.
// Mix different readers in the same collection:
streams = []reader{file, socket}
// Process them uniformly:
for stream in streams {
process(stream)
}
But, if the concrete type is known at the call site, static dispatch is used instead. For example, given a function like:
func read_all(r reader) []byte {
b = []byte(cap: 512)
for {
n = r.read(b[len(b):cap(b)])
if n == 0 {
return b
}
b = b[:len(b)+n]
b.extend_if_at_capacity()
}
}
If it gets used with values of concrete types, e.g.
file_contents = read_all(file)
socket_data = read_all(socket)
Then the function will be automatically specialized at each call site for the respective types, e.g.
// Iroh automatically generates specialized versions:
func read_all_file(r File) []byte
func read_all_socket(r Socket) []byte
// Which then get used at the call sites:
file_contents = read_all_file(file)
socket_data = read_all_socket(socket)
Since Iroh performs whole-program analysis, it knows the exact set of concrete types that can be applicable at any given point. This lets us convert virtual calls to direct ones with very little overhead.
For example, as Iroh knows that stream can only be File or Socket in the
hot loop below:
buf = []byte(len: 1024)
streams = []io.reader{file, socket}
for stream in streams {
for 0..100 {
try stream.read(buf)
}
}
It can automatically generate an enum for the possible variants and directly
call the methods on the underlying concrete type, e.g.
StreamValue = enum {
file(File)
socket(Socket)
}
buf = []byte(len: 1024)
streams = []StreamValue{StreamValue.file(file), StreamValue.socket(socket)}
for stream in streams {
match stream {
.file(f):
for 0..100 {
try f.read(buf)
}
.socket(s):
for 0..100 {
try s.read(buf)
}
}
}
This allows us to have the simplicity of interfaces with minimal overhead for performance sensitive code. Developers don’t have to choose!
Interfaces can refer to __magic__ methods even though they’re not directly
accessible, e.g.
iterable = interface {
__iter__() iterator
}
comparable = interface {
__eq__(other T)
}
callable = interface {
__call__(...) T
}
indexable = interface {
__getitem__(index T)
__setitem__(index T, value U)
}
Interfaces can also define utility methods, e.g.
reader = interface {
read(buf []byte) int
}
(r reader) read_all() []byte {
// The implementation of read_all can call any methods
// defined on the interface, i.e. r.read(), or any other
// utility method that's been defined.
}
These methods are then accessible on the interface values, e.g.
func process(r reader) {
data = r.read_all()
...
}
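A comparable pattern in Python is an abstract base class that declares the required method and supplies the utility method on top of it. The sketch below is an analogy only; `BytesSource` and the 4-byte chunk size are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Reader(ABC):
    @abstractmethod
    def read(self, buf: bytearray) -> int:
        """Fill buf and return the number of bytes read, or 0 at the end."""

    def read_all(self) -> bytes:
        # Utility method written purely in terms of read().
        out = bytearray()
        chunk = bytearray(4)
        while True:
            n = self.read(chunk)
            if n == 0:
                return bytes(out)
            out += chunk[:n]


class BytesSource(Reader):
    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0

    def read(self, buf: bytearray) -> int:
        n = min(len(buf), len(self._data) - self._pos)
        buf[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n


data = BytesSource(b"hello world").read_all()
```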
Interfaces can mark methods with @allowed_errors to limit the type of errors
they can return, as well as @no_error if those methods should never return an
error, e.g.
allocator = interface {
__create__(T type) @allowed_errors(std.OutOfMemory)
__destroy__(value T) @no_error
}
Like struct and enum types, interfaces can be marked as deprecated to
generate compile-time warnings, e.g.
LogMessage = interface(deprecated: "Use LogEntry instead") {
...
}
Interfaces can also be composed together easily, e.g.
read_write_seeker = interface {
reader
writer
seek(pos int)
}
Thus allowing developers to easily create complex interfaces from simple building blocks.
Refined Types
Types can be constrained to specific sub-ranges using @limit, which pairs a base
type with a constraint expression. Constraint expressions can refer to an
instantiated value of the type as this, e.g.
Codepoint = @limit(uint, this <= 0x10ffff)
Port = @limit(uint16, this >= 1024 and this <= 65535)
Format = @limit(string, this in ["json", "toml", "yaml"])
Any compile-time evaluatable expression can be used as the constraint. When values of constrained types are initialized, literals are validated at compile-time, otherwise at runtime, e.g.
port1 = Port(8080) // Validated at compile-time (literal)
port2 = Port(user_input) // Validated at runtime (dynamic value)
Constraints are validated whenever there are any changes that could invalidate them, e.g.
StringList = @limit([]string, len(this) > 0)
x = StringList{"Tav"}
x.pop() // ERROR!
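The runtime half of this validation can be approximated in Python with a subclass that checks its constraint on construction. This is only a sketch of the idea behind the Port example; Python has no equivalent of the compile-time check for literals.

```python
class Port(int):
    """An int restricted to the range 1024..65535, checked on construction."""

    def __new__(cls, value):
        if not 1024 <= value <= 65535:
            raise ValueError(f"{value} is outside the range 1024..65535")
        return super().__new__(cls, value)


port = Port(8080)

try:
    Port(80)
    rejected = False
except ValueError:
    rejected = True
```

Unlike the behaviour described above, arithmetic on a `Port` in this sketch yields a plain `int`, so the constraint is not re-validated after later changes.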
Constraining a type to a set of specific values can be written by prefixing them
with a const and using | to separate the options, e.g.
Format = const "json" | "toml" | "yaml"
Priority = const 1 | 2 | 3
When constraining non-numeric types like strings to a set of values, the const
prefix can be elided as the | bitwise OR operator doesn’t apply to them, e.g.
Format = "json" | "toml" | "yaml"
But the const prefix will still be needed if only one value is possible, e.g.
Format = const "json"
As string values are already immutable, this creates a value which doubles as both a type value with a single string value as well as an immutable string value.
Consumable Types
Struct types can be annotated as being consumable in order to treat them as
linear types, e.g.
Transaction = struct(consumable: true) {
...
}
func (t Transaction) set_key(key string, value V) {
...
}
@consume
func (t Transaction) commit() {
...
}
@consume
func (t Transaction) rollback() {
...
}
All values of a consumable type must be discarded by calling a method that has
been marked with the @consume decorator, e.g.
txn = db.new_txn()
txn.set_key("admin", "Tav")
txn.commit()
Failure to do so will result in an edit-time error, e.g.
txn = db.new_txn()
txn.set_key("admin", "Alice")
// ERROR! Neither txn.commit() nor txn.rollback() were called!
Once a value has been consumed, it can no longer be used, e.g.
txn = db.new_txn()
txn.set_key("admin", "Alice")
txn.commit()
txn.set_key("admin", "Zeno") // ERROR! txn value already consumed!
In order to simplify semantics in the rest of the language, all @consume
methods must explicitly handle all errors that might be generated during the
execution of the method.
Up to one @consume method can be marked as default, e.g.
@consume(default: true)
func (f fs_file) close() {
...
}
This will automatically consume the value with this method if it hasn’t been explicitly consumed by the time it goes out of scope, e.g.
if update_contents {
f = fs_path("/home/tav/${filename}.md").create()
f.write(contents)
// The file is automatically consumed by f.close() here.
}
Values are consumed in LIFO (last-in, first-out) order, and consumption takes place even if a function generates an error, e.g.
func write_file(filename string) {
f = fs_path("/home/tav/${filename}.md").create()
v = 1 / 0
// The file is automatically consumed by f.close() here,
// before the function returns the division by zero error.
f.write(string(v))
}
This enables package authors to provide APIs that are ergonomic and safe, without needing any manual cleanup.
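The LIFO cleanup order, including cleanup on the error path, can be illustrated in Python with `contextlib.ExitStack`. This is an analogy rather than Iroh semantics; `write_files` and the log strings are made up for the example.

```python
from contextlib import ExitStack


def write_files(log):
    with ExitStack() as stack:
        stack.callback(log.append, "close a")  # registered first, runs last
        stack.callback(log.append, "close b")  # registered last, runs first
        return 1 / 0  # raises ZeroDivisionError before any normal return


log = []
try:
    write_files(log)
except ZeroDivisionError:
    log.append("error reached caller")
```

Both callbacks run before the error reaches the caller, with the most recently registered one running first.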
Compile-Time Evaluation
Iroh compile-time evaluates the following constructs:
-
All assignments to const variables.
-
All type, func, and <unit> definitions.
-
All usage of compile-time control flow, e.g. $if, $for, $match, etc.
-
Any variables or expressions that directly depend on the output of the @type_info function.
-
Any variables or expressions that directly depend on compile-time function parameters like T, U, etc.
-
Any calls to any of the built-in compile-time @ functions.
-
All top-level expressions in a package.
-
Any variables and expressions that the above depend on.
During compile-time, code can use all of the standard Iroh constructs and runtime functions except for any I/O related functionality, e.g. reading from stdin, changing the scheduler, etc.
The following built-in compile-time functions provide an escape hatch for certain I/O needs:
-
@list_directory
-
@read_file
-
@read_url
As compile-time evaluations are triggered by edit-time changes, any generated
errors will be shown at the evaluation entrypoints. These can also be manually
triggered using @compile_error, e.g.
MyInt = make_custom_type(int64)
if len(@type_info(MyInt).fields) == 0 {
@compile_error("MyInt must have at least 1 field: found 0")
}
Compile-time control flow works the same as normal control flow except that it
forces evaluation during compile-time. It is triggered by $-prefixed keywords,
e.g.
$if format == .json {
resp = json.encode(v)
}
The body of a compile-time block is only evaluated at compile-time if another compile-time value depends on it. Otherwise, the compiler simply emits the body as code to be executed at runtime.
So, in our previous example, if the format was .json and no other
compile-time value depended on resp, it would have been as if the author of
the code had just written:
resp = json.encode(v)
Similarly, functions can mark parameters as compile-time by using single capital letters as parameter names. Expressions referencing such parameters are evaluated at compile-time, e.g.
func unroll(N int) {
for 1..N {
// Code repeated N times at compile time.
}
}
This can be used to define type parameters to create generic datastructures,
e.g.
func Pair(T type, U type) type {
Pair = struct {
first T
second U
}
(p Pair) {
func __len__() int {
return 2
}
func __iter__() iterator {
return iter((p.first, p.second))
}
func __seq_literal__(first T, second U) {
p.first = first
p.second = second
}
}
return Pair
}
Generic types like this can then use a [] shortcut syntax to initialize a type
with specific parameters, e.g.
p = Pair[string]int{"Zeno", 12}
p.first == "Zeno" // true
p.second == 12 // true
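For comparison, a similar two-parameter generic type can be sketched in Python with `typing.Generic`. This is an analogy only: Python's type parameters are hints for type checkers and are not evaluated at compile-time in the way described here.

```python
from dataclasses import dataclass
from typing import Generic, Iterator, TypeVar, Union

T = TypeVar("T")
U = TypeVar("U")


@dataclass
class Pair(Generic[T, U]):
    first: T
    second: U

    def __len__(self) -> int:
        return 2

    def __iter__(self) -> Iterator[Union[T, U]]:
        return iter((self.first, self.second))


# Subscripting with concrete types mirrors the [] shortcut syntax.
p = Pair[str, int]("Zeno", 12)
```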
The [] shortcut syntax takes the following forms:
// Foo[T] for 1 type parameter, e.g.
x = set[string]{}
// Foo[T]U for 2 type parameters, e.g.
y = map[string]int{}
// Foo[T, U, V] for 3 or more type parameters, e.g.
z = BTree[string, int, 12]{}
Only functions which return a type can use this syntax. Other functions with compile-time type parameters have to use the standard function calling syntax, e.g.
func print_typename(T type) {
if T == int {
print("Got an int value")
}
if T == string {
print("Got a string value")
}
}
// A call to print_typename with a compile-time parameter, e.g.
print_typename(type(23))
// Would evaluate the compile-time parameter and create a
// specialized function that gets called at runtime and which
// effectively looks like:
func print_typename_1() {
print("Got an int value")
}
To have a type be inferred based on how it’s used, just the return type parameter can be specified, e.g.
func rand() T {
...
}
x = float32(rand())
type(x) == float32 // true
In some cases, it may not be possible to infer the return type. In this case its
value will be unknown_type, which can be used to construct a value of a
concrete type to avoid an edit-time error, e.g.
func rand() T {
if T == unknown_type {
return fixed128(...)
}
}
x = rand()
type(x) == fixed128 // true
To make a function specialized to different types of a value, parameters can use single-letter variable names for their types, e.g.
(db DB) func get(key string, value T) T {
if T == int {
// handle int value
}
if T == string {
// handle string value
}
}
// The call to db.get below is to a version that is specialized
// for when T is Profile:
profile = db.get("/profile/tav", Profile{})
In the above case, while the type is generic and needs to be known at compile-time, the value is passed in at runtime — enabling patterns like re-using pre-allocated values.
The @type_info function can be used for compile-time reflection and returns a
value that’s only accessible at compile-time, e.g.
Person = struct {
name string
age int
location (float64, float64)
}
info = @type_info(Person)
info.kind == .struct // true
sinfo = info.struct
len(sinfo.fields) == 3 // true
sinfo.fields["name"].kind == .stringy // true
sinfo.fields["age"].kind == .numeric // true
sinfo.fields["location"].kind == .tuple // true
For types that are being defined in the local scope, typeinfo values can be mutated to:
-
Add fields.
-
Add instance/static methods.
This can then be converted back to a type using the .value() method, e.g.
func CustomUser(A string, T type) type {
info = @type_info(struct{})
info.fields["name"] = {
...@type_info(string)
}
info.fields["location"] = {
...@type_info((float64, float64))
}
info.fields[A] = {
...@type_info(T)
}
return info.value()
}
Subscriber = CustomUser("subscription", Subscription)
user = Subscriber{
name: "Tav",
location: (51.5074, -0.1278),
subscription: .premium
}
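Python offers a runtime counterpart to this kind of type construction in `dataclasses.make_dataclass`. The sketch below is an analogy, not Iroh's mechanism; `custom_user` is a hypothetical helper, and the subscription is modelled as a plain string for simplicity.

```python
from dataclasses import fields, make_dataclass


def custom_user(extra_name: str, extra_type: type) -> type:
    """Build a record type with two fixed fields plus one caller-named field."""
    return make_dataclass(
        "CustomUser",
        [("name", str), ("location", tuple), (extra_name, extra_type)],
    )


Subscriber = custom_user("subscription", str)
user = Subscriber(name="Tav", location=(51.5074, -0.1278), subscription="premium")
field_names = [f.name for f in fields(Subscriber)]
```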
One can use this to easily create various re-usable constructs, e.g. to automatically convert from an array of structs to a struct of arrays, one could:
// Create a generic converter:
func SoA(src []T) U {
aos = @type_info(T)
if !aos.struct {
@compile_error("${T} is not a struct type")
}
field_names = aos.fields.keys()
// Construct the struct of arrays type dynamically:
soa = @type_info(struct{})
for name, field in aos.fields {
soa.fields[name] = {
...@type_info([]field.type)
}
}
// Define the return type dynamically:
U = soa.value()
// All lines up to here are compile-time evaluated.
dst = U{}
for elem in src {
for name in field_names {
dst_field = @get_field(dst, name)
dst_field.append(@get_field(elem, name))
}
}
// The above lines essentially compile down to the runtime:
//
// for elem in src {
// dst.x.append(elem.x)
// dst.y.append(elem.y)
// }
return dst
}
Point = struct {
x int
y int
}
// Create an array of structs:
points = []Point{
{x: 1, y: 2},
{x: 3, y: 4},
{x: 5, y: 6}
}
// Use the SoA function to convert it to a struct of arrays:
points_soa = SoA(points)
points_soa.x == [1, 3, 5] // true
points_soa.y == [2, 4, 6] // true
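The same array-of-structs to struct-of-arrays conversion can be sketched in Python using runtime reflection over dataclass fields. This is an analogy under the assumption of a non-empty input list; unlike the Iroh version, all of the reflection here happens at runtime, and `to_soa` is a hypothetical name.

```python
from dataclasses import dataclass, fields, make_dataclass


@dataclass
class Point:
    x: int
    y: int


def to_soa(src):
    """Convert a non-empty list of dataclass instances into a struct of lists."""
    elem_type = type(src[0])
    names = [f.name for f in fields(elem_type)]
    # Build the struct-of-arrays type dynamically, one list field per name.
    soa_type = make_dataclass(elem_type.__name__ + "SoA", [(n, list) for n in names])
    return soa_type(**{n: [getattr(elem, n) for elem in src] for n in names})


points_soa = to_soa([Point(1, 2), Point(3, 4), Point(5, 6)])
```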
For convenience, the is operator also matches on the kind of a type, i.e.
numeric, stringy, struct, enum, slice, array, etc.
(m MyInt) func __from__(v T) {
if !(T is type.numeric) {
@compile_error("Only numeric types can be cast to MyInt")
}
...
}
All this allows for functions and types to be specialized and optimized at compile-time, enabling generic programming, better performance, and early error detection.
The Iroh editor highlights compile-time evaluated lines differently to runtime ones, making it easy for developers to quickly understand how the code will be evaluated.
Date & Time Types
The built-in time value can be used to access the current time, i.e.
time.now()
By default, all time is in TAI, a continuous time scale that, unlike UTC, has no
leap seconds. A specific timezone can be specified to time.now, e.g.
// UTC:
time.now("UTC")
// IANA timezones:
time.now("Asia/Seoul")
time.now("Europe/London")
// Abbreviations for popular timezones:
time.now("EST")
// Same as the default:
time.now("TAI")
// Inferred from system settings:
time.now("local")
Timezones are represented by the built-in timezone enum, so any typos in
timezone literals will be caught at edit-time, e.g.
time.now("Europe/Londn") // Edit-time ERROR! Invalid timezone!
The value returned by time.now is a datetime value. These can be constructed
manually, e.g.
// Full date with nanosecond and timezone:
datetime(2025, 8, 1, 13, 30, 45, 0, tz: "Europe/London")
// Date without the timezone, which defaults to TAI:
datetime(2025, 8, 1, 13, 30, 45)
// Date without the time, with the time components defaulting to 0:
datetime(2025, 8, 1)
While they are immutable, the individual fields of datetime values can be directly accessed, e.g.
date = datetime(2025, 8, 1, 13, 30, 45, 0, tz: "Europe/London")
date.year == 2025 // true
date.month == 8 // true
date.day == 1 // true
date.hour == 13 // true
date.minute == 30 // true
date.second == 45 // true
date.nanosecond == 0 // true
date.tz == "Europe/London" // true
Various methods exist for computing information about the date, e.g.
date = datetime(2025, 8, 1, 13, 30, 45)
date.is_leap_year() == false // true
date.is_past() == true // true
date.is_future() == false // true
date.is_today() == false // true
date.weekday() == .friday // true
date.today() == datetime(2025, 8, 1) // true
date.yesterday() == datetime(2025, 7, 31) // true
date.tomorrow() == datetime(2025, 8, 2) // true
The is_anniversary method checks if a datetime value is the anniversary of a
date, e.g.
bday = datetime(1982, 3, 18)
// By default, the current time is used:
bday.is_anniversary()
// Or an explicit datetime value can be given:
bday.is_anniversary(datetime(2026, 3, 18)) // true
// When the date is February 29th, the anniversary
// date can be substituted during non-leap years, e.g.
wedding = datetime(2004, 2, 29)
wedding.is_anniversary(leap_substitute: .mar01) // Default: .feb28
Datetime values can be compared against each other, e.g.
start = datetime(2025, 8, 1)
end = datetime(2025, 8, 20)
end == start // false
end > start // true
While datetime values can’t be added, they can be subtracted from each other to give a time-based unit value, e.g.
start = datetime(2025, 8, 1)
end = datetime(2025, 8, 20)
diff = end - start
diff == 19<days> // true
Time-based unit values can be used to do datetime calculations, e.g.
start = datetime(2025, 8, 1)
end = start + 20<days> + 10<hours> + 30<mins> + 1<s>
end == datetime(2025, 8, 21, 10, 30, 1)
The <month>, <year>, <decade>, and <century> units are treated
specially, in that:
-
They can only be added or subtracted by themselves, i.e. they don’t mix with any other time-based units.
-
Calculations involving any of these special units need to be done in a separate expression to calculations with any other time unit.
-
Adding 1<month> increments the datetime value by 1 month, but if this creates an invalid date, e.g. 31st of February, it will clamp the date to the end of that month.
-
Adding <month> units keeps track of the original date, so that if a date got clamped, it won’t keep doing so, e.g.
subscription_start = datetime(2025, 1, 31)
first_month = subscription_start + 1<month>
second_month = first_month + 1<month>
first_month == datetime(2025, 2, 28) // Clamped to Feb 28th
second_month == datetime(2025, 3, 31) // Goes back to Mar 31st
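The clamping rule can be made concrete with a small Python sketch. It is one plausible way to get the described behaviour, not Iroh's implementation: `add_months` is a hypothetical helper that always computes from the original anchor date, which is how the clamped day is prevented from sticking.

```python
import calendar
from datetime import date


def add_months(anchor: date, months: int) -> date:
    """Add months to anchor, clamping the day to the end of the target month."""
    index = anchor.year * 12 + (anchor.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(anchor.day, last_day))


subscription_start = date(2025, 1, 31)
first_month = add_months(subscription_start, 1)   # clamped to Feb 28th
second_month = add_months(subscription_start, 2)  # back to the 31st

# Chaining from the clamped result instead would drift to the 28th:
drifted = add_months(first_month, 1)
```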
Elapsed time can be calculated using time.since, e.g.
start = time.now()
elapsed = time.since(start)
if elapsed > 30<s> {
...
}
The duration until some point can be calculated using time.until, e.g.
deadline = datetime(2026, 3, 18)
remaining = time.until(deadline)
if remaining < 1<day> {
...
}
Datetime values in one timezone can be converted into another, e.g.
meeting = datetime(2025, 9, 2, 18, tz: "Europe/London")
meeting_ny = meeting.in("America/New_York")
meeting_ny == datetime(2025, 9, 2, 13, tz: "America/New_York") // true
Likewise, the remaining aspects of a datetime can be preserved whilst switching
just the date components or the time components using the on or at methods,
e.g.
meeting = datetime(2025, 9, 2, 18)
meeting_day = meeting.at(0, 0, 0) // Reset the time components
meeting_tmrw = meeting.on(meeting.tomorrow()) // Change the date components
meeting_day == meeting.today() // true
meeting_tmrw == datetime(2025, 9, 3, 18) // true
The start_of, end_of, previous, and next methods can be used to
calculate relative datetime values at various boundaries, e.g.
date = datetime(2025, 9, 10, 13, 30)
date.start_of(.month) == datetime(2025, 9, 1) // true
date.start_of(.week).weekday() == .monday // true
date.end_of(.month) == datetime(2025, 9, 30, 23, 59, 59) // true
date.previous(.friday) == datetime(2025, 9, 5) // true
date.next(.monday) == datetime(2025, 9, 15) // true
A Unix timestamp can be obtained using the unix, unix_milli, unix_micro, and
unix_nano methods, e.g.
meeting = datetime(2025, 9, 2, 18, tz: "Europe/London")
unix = meeting.unix()
unix_nano = meeting.unix_nano()
unix == 1756832400 // true
unix_nano == 1756832400000000000 // true
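The timestamp values above can be cross-checked in Python, under the assumption that the meeting is at 18:00 British Summer Time (UTC+1), i.e. 17:00 UTC. The fixed-offset timezone below is a simplification used to keep the sketch independent of any installed timezone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 18:00 in London on this date is British Summer Time (UTC+1).
bst = timezone(timedelta(hours=1))
meeting = datetime(2025, 9, 2, 18, tzinfo=bst)

unix = int(meeting.timestamp())
unix_nano = unix * 1_000_000_000
```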
Some popular time formats are supported, e.g.
meeting = datetime(2025, 9, 2, 18)
meeting.rfc3339() == "2025-09-02T18:00:00Z" // true
meeting.iso8601() == "2025-09-02T18:00:00Z" // true
// Note that even though the timezone is technically
// meant to be represented as "[TAI]", we use "Z"
// as it's more in line with what most systems expect.
Custom datetime formats can also be specified, e.g.
date = datetime(2025, 9, 2)
s = date.format("#{weekday_short}, #{day} #{month_full} #{year4}")
s == "Tue, 2 September 2025" // true
The following format specifiers are supported:
| Specifier | Example |
|---|---|
| ad | BC |
| am_lower | pm |
| am_upper | PM |
| ce | BCE |
| day | 2 |
| day_padded | 02 |
| day_space_padded | 2 |
| day_ordinal | 2nd |
| hour | 15 |
| hour_padded | 15 |
| hour12 | 3 |
| hour12_padded | 03 |
| micro | 009640 |
| milli | 072 |
| month | 9 |
| month_padded | 09 |
| month_full | September |
| month_short | Sep |
| month_ordinal | 9th |
| nano | 002912716 |
| offset2 | -07 |
| offset2z | -07 or Z |
| offset4 | -0700 |
| offset4z | -0700 or Z |
| offset4_colon | -07:00 |
| offset4z_colon | -07:00 or Z |
| offset6 | -070000 |
| offset6z | -070000 or Z |
| offset6_colon | -07:00:00 |
| offset6z_colon | -07:00:00 or Z |
| quarter | 2 |
| quarter_ordinal | 2nd |
| second | 1 |
| second_padded | 01 |
| tz_name | Europe/London |
| tz3 | MST (limited set) |
| weekday_full | Tuesday |
| weekday_short | Tue |
| weekday_2_letter | Tu |
| weekday_0_index | 1 |
| weekday_1_index | 2 |
| year2 | 25 |
| year4 | 2025 |
The same formatting mechanism can also be used to parse datetime values from
strings using time.parse, e.g.
date_str = "Tue, 2 September 2025"
date = time.parse(date_str, "#{weekday_short}, #{day} #{month_full} #{year4}")
date == datetime(2025, 9, 2) // true
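The same format-then-parse round trip looks like this in Python, as a point of comparison. Python uses `%`-style directives instead of named specifiers, and the sketch assumes the default C locale so that day and month names are in English.

```python
from datetime import datetime

d = datetime(2025, 9, 2)

# strftime's %d is zero-padded, so the day number is formatted separately.
s = f"{d:%a}, {d.day} {d:%B} {d.year}"

# strptime accepts an unpadded day for %d when parsing.
parsed = datetime.strptime(s, "%a, %d %B %Y")
```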
Common formats are pre-defined on the time_format enum, e.g.
.c // Mon Jan 2 15:04:05 2006
.date_only // 2006-01-02
.date_time // 2006-01-02 15:04:05 (the default)
.iso8601 // 2006-01-02T15:04:05Z
.kitchen // 3:04PM
.rfc822 // 02 Jan 06 15:04 MST
.rfc822_numeric_offset // 02 Jan 06 15:04 -0700
.rfc850 // Monday, 02-Jan-06 15:04:05 MST
.rfc1123 // Mon, 02 Jan 2006 15:04:05 MST
.rfc1123_numeric_offset // Mon, 02 Jan 2006 15:04:05 -0700
.rfc3339 // 2006-01-02T15:04:05Z
.rfc3339_nano // 2006-01-02T15:04:05.999999999Z
.ruby // Mon Jan 02 15:04:05 -0700 2006
.time_only // 15:04:05
.timestamp // Jan 2 15:04:05
.timestamp_milli // Jan 2 15:04:05.000
.timestamp_micro // Jan 2 15:04:05.000000
.timestamp_nano // Jan 2 15:04:05.000000000
.unix // Mon Jan 2 15:04:05 MST 2006
As timezone abbreviations in date strings can be ambiguous or incorrect, the
timezone can be overridden by specifying the optional tz parameter, e.g.
date_str = "02 Jan 06 15:04 GST"
date = time.parse(
date_str,
"#{day_padded} #{month_short} #{year2} #{hour_padded}:#{minute_padded} #{tz}",
tz: "UTC"
)
date == datetime(2006, 1, 2, 15, 4, 0) // true
The individual date_component and time_component values can be accessed via
the date and time attributes, e.g.
dt = datetime(2025, 8, 1, 13, 30, 45, 0, tz: "Europe/London")
dt.date == date_component(2025, 8, 1)
dt.time == time_component(13, 30, 45, 0)
// These can be operated on individually, e.g.
dt.date.tomorrow() == date_component(2025, 8, 2)
dt.time + 12<hours> == time_component(1, 30, 45, 0)
// Or used to reconstruct datetime values, e.g.
dt == datetime.from(dt.date, dt.time, tz: dt.tz) // true
The date_range type can be used to define a range between two datetime values,
e.g.
start = datetime(2025, 6, 1, 12, 0, 0)
end = datetime(2025, 8, 31, 18, 0, 0)
summer = date_range(start, end)
The in operator can be used to check if a datetime value is within a
date_range, e.g.
bday = datetime(2025, 6, 4)
if bday in summer {
...
}
Date ranges can be iterated over by specifying an interval to the every
method, e.g.
for day in summer.every(2<days>) {
...
}
The every method knows to intelligently handle <month> units, e.g.
for quarter in year.every(3<months>) {
...
}
subscription_start = time.now()
subscription_end = subscription_start + 1<year>
billing_period = date_range(subscription_start, subscription_end)
for month in billing_period.every(1<month>) {
// Automatically clamps to the end of the month if
// the calculation would produce an invalid date
// like the 30th of February.
...
}
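Fixed-interval iteration over a range can be sketched in Python as a generator. This is a simplified illustration: `every` here is a hypothetical free function, the end of the range is assumed to be inclusive, and month-aware stepping with clamping is not handled.

```python
from datetime import datetime, timedelta


def every(start, end, step):
    """Yield datetimes from start to end (inclusive) at a fixed step."""
    current = start
    while current <= end:
        yield current
        current += step


days = list(every(datetime(2025, 6, 1), datetime(2025, 6, 7), timedelta(days=2)))
```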
Similarly, the time_span type can be used to specify a range that’s not specific to
any calendar date, e.g.
// 3-6PM
time_span((15, 0), (18, 0))
// 9:30:20PM-4AM
time_span((21, 30, 20), (4, 0))
The in operator can be used to check if a datetime value is within a
time_span, e.g.
family_time = time_span((18, 0), (21, 0))
meeting = datetime(2025, 9, 2, 19, 30, 0)
if meeting in family_time {
...
}
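A membership check of this kind has to cope with spans that cross midnight, such as 9:30:20PM-4AM. The Python sketch below shows one way to do that; `in_span` is a hypothetical helper, and treating both ends as inclusive is an assumption.

```python
from datetime import time


def in_span(moment: time, start: time, end: time) -> bool:
    """Return True if moment falls within the span, which may wrap midnight."""
    if start <= end:
        return start <= moment <= end
    # The span wraps past midnight, e.g. 21:30:20 to 04:00.
    return moment >= start or moment <= end


family_time = (time(18, 0), time(21, 0))
overnight = (time(21, 30, 20), time(4, 0))

meeting_in_family_time = in_span(time(19, 30), *family_time)
late_in_overnight = in_span(time(23, 0), *overnight)
noon_in_overnight = in_span(time(12, 0), *overnight)
```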
Like date ranges, time spans can also be iterated over, e.g.
workday = time_span((9, 0), (17, 0))
today = time.now().today()
for moment in workday.every(1<hour>, from: time.now()) {
dt = today.at(moment)
if !cal.have_meetings() {
cal.mark_as_busy(start: dt, duration: 1<hour>)
}
}
The time value also provides various utility methods for interacting with the
system, e.g.
start = time.now()
time.sleep(20<s>) // Sleep for 20 seconds.
round(time.since(start), to: <s>) == 20<s> // true
Values constructed using time.now contain both system time as well as
monotonic time values:
-
System time can change, or even go backwards.
-
Monotonic time only ever goes forward.
When time calculations are done using functions like time.since:
-
If the datetime value was constructed using time.now, then the monotonic time value is used.
-
Otherwise, the system time value is used.
This allows programs to be robust against any issues caused by adjustments to the system clock, e.g. due to daylight saving time, NTP, manual changes, etc.
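Python exposes the same distinction as two separate clocks, which makes the difference easy to see. This is an analogy only: in Python the caller has to pick the monotonic clock explicitly for elapsed-time measurements.

```python
import time

wall_start = time.time()       # system time: can jump if the clock is adjusted
mono_start = time.monotonic()  # monotonic time: never goes backwards

time.sleep(0.05)

# Elapsed time measured on the monotonic clock is immune to clock changes.
elapsed = time.monotonic() - mono_start
```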
Filesystem Access
Iroh abstracts away all filesystem access via the filesystem interface:
filesystem = interface {
chmod(path string, perm fs_perm)
chown(path string, uid int, gid int)
chtimes(path string, atime time, mtime time)
create(path string) fs_file
mkdir(path string, perm fs_perm)
mkdir_all(path string, perm fs_perm)
name() string
open(path string, opts {flags fs_flags, perm fs_perm, rw fs_rw_mode}) fs_file
path_handler() fs_path_handler // For platform-specific path manipulation.
readlink(path string) string
remove(path string)
remove_all(path string)
rename(old_path string, new_path string)
symlink(target string, link string)
stat(path string) fs_info
watch(path string) fs_watcher
}
The io.filesystem value defines which filesystem is used by builtins
like fs_path methods, e.g.
f = fs_path("/home/tav/hello.txt").create()
f.write("Hello World!")
f.close()
By default, the platform-specific os_filesystem is used, but this can be
overridden using the with statement, e.g.
with {
io.filesystem = s3_filesystem(bucket: "my-bucket")
} {
f = fs_path("image/logo.png").create()
f.write(data)
f.close()
}
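The scoped override can be approximated in Python with a context variable and a context manager. Everything below is a hypothetical sketch of the pattern, not Iroh's mechanism: `MemoryFilesystem`, `use_filesystem`, and `write_file` are names invented for the example.

```python
from contextlib import contextmanager
from contextvars import ContextVar


class MemoryFilesystem:
    """A toy in-memory filesystem used as a stand-in for a real backend."""

    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data


default_fs = MemoryFilesystem()
current_fs = ContextVar("current_fs", default=default_fs)


@contextmanager
def use_filesystem(fs):
    token = current_fs.set(fs)
    try:
        yield fs
    finally:
        # Restore the previous filesystem when the block exits.
        current_fs.reset(token)


def write_file(path, data):
    current_fs.get().write(path, data)


bucket = MemoryFilesystem()
with use_filesystem(bucket):
    write_file("image/logo.png", b"data")
```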
When the os_filesystem is used, fs_path values manipulate paths according to
the local system rules, e.g.
// On Windows:
path = fs_path("C:\Users\tav\document.pdf")
path.parent() == "C:\Users\tav" // true
// On Linux/macOS:
path = fs_path("/home/tav/document.pdf")
path.parent() == "/home/tav" // true
path.file_extension() == "pdf" // true
path.exists() == true // true
git_cfg = path.parent().join(".gitconfig")
git_cfg == "/home/tav/.gitconfig" // true
The slash_path type can be used to handle paths that use / as a separator,
e.g.
api_base = slash_path("/api/v1")
user_url = api_base.join("users", "tav")
settings_key = slash_path("/app/settings")
db_settings_key = settings_key.join("database")
user_url == "/api/v1/users/tav" // true
db_settings_key == "/app/settings/database" // true
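Python's `pathlib` draws a similar line between platform-specific and slash-separated paths, which may help as a reference point. The pure path classes below only manipulate strings and never touch the filesystem. Note that `suffix` includes the leading dot, unlike the file_extension example above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Windows-style path manipulation works on any platform with the pure classes.
win = PureWindowsPath(r"C:\Users\tav\document.pdf")
win_parent = str(win.parent)

posix = PurePosixPath("/home/tav/document.pdf")
git_cfg = posix.parent / ".gitconfig"
extension = posix.suffix  # includes the leading dot

# Slash-separated keys or URL paths can reuse the POSIX flavour.
user_url = PurePosixPath("/api/v1").joinpath("users", "tav")
```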
Regular Expressions
While Iroh makes it easy to write custom parsers, it also provides a regex
type for compiling quick and dirty regular expressions, e.g.
email_re = regex("^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
We support the typical syntax for regular expressions, e.g.
| Syntax | Meaning |
|---|---|
| . | Any character except newlines |
| ^ | Start of a string |
| $ | End of a string |
| * | 0 or more |
| + | 1 or more |
| ? | 0 or 1 |
| [...] | Character set |
| [^...] | Negated character set |
| \| | Alternation, e.g. cat\|dog |
| (...) | Capturing group |
| (?:...) | Non-capturing group |
| (?=...) | Positive lookahead |
| (?!...) | Negative lookahead |
| (?<=...) | Positive lookbehind |
| (?<!...) | Negative lookbehind |
| (?P<name>...) | Named capturing group |
| (?(name)yes\|no) | Conditional match: if the group name matches, use yes, else no |
| {n} | Exactly n repetitions |
| {n,} | n or more repetitions |
| {n,m} | Between n and m repetitions |
| \b | Word boundary |
| \B | Non-word boundary |
| \d | Digit character [0-9] |
| \D | Non-digit character |
| \s | Whitespace character |
| \S | Non-whitespace character |
| \w | Word character [A-Za-z0-9_] |
| \W | Non-word character |
| \<N> | Back-references to a captured group, e.g. \1 |
| \... | Escaped characters like \( or special characters like \n, \t, etc. |
Compiled regular expressions support a range of methods, e.g.
-
x.find: return the text of the leftmost match in a string.
-
x.find_all: return a slice of all matches in a string.
-
x.match: return true if the string contains a match, false otherwise.
-
x.replace: return a string with all matches substituted with the replacement text.
-
x.split: return a slice of the string split into substrings separated by matches of the regular expression.
The #`pattern` literal syntax can be used to compile regexes too,
e.g.
uk_postcode = #`[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}`
Regular expression patterns can span multiple lines and include // comments,
e.g.
uk_number = #`
^ // Start of the string
(\+44\s?7\d{3} // Country code +44, optional space, '7' and 3 digits
|\(?07\d{3}\)?) // OR: (optional '(') '07' and 3 digits (optional ')')
\s? // Optional space
\d{3} // 3 digits
\s? // Optional space
\d{3}$ // 3 digits, end of string
`
if uk_number.match("+44 7123 456 789") {
...
}
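For comparison, the same commented pattern can be written with Python's `re.VERBOSE` flag, which ignores whitespace and treats `#` as the comment marker. This is a Python rendering of the pattern above, not Iroh syntax.

```python
import re

uk_number = re.compile(
    r"""
    ^                  # Start of the string
    (\+44\s?7\d{3}     # Country code +44, optional space, '7' and 3 digits
    |\(?07\d{3}\)?)    # OR: optional '(', '07' and 3 digits, optional ')'
    \s?                # Optional space
    \d{3}              # 3 digits
    \s?                # Optional space
    \d{3}$             # 3 digits, end of string
    """,
    re.VERBOSE,
)

international = uk_number.match("+44 7123 456 789")
national = uk_number.match("07123 456789")
invalid = uk_number.match("12345")
```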
Certain features of regular expressions can result in super-linear, and in the worst case exponential, execution time, e.g.
-
Back-references, e.g. (\w+)\s+\1
-
Lookbehinds, e.g. (?<=foo)bar
-
Arbitrary recursion, e.g. ((a|b)+)\1
When regular expressions with such patterns are compiled, the nfa field of the
compiled regex is true, e.g.
pat_a = #`(\w+)\s+\1`
pat_b = #`^ERROR:.*$`
pat_a.nfa == true // true
pat_b.nfa == false // true
For patterns that use NFA-level features, all method calls need to specify a
timeout, e.g.
pattern = #`(\w+)\s+\1`
text = "this is is a test"
pattern.find(text) // ERROR! No timeout specified for an NFA pattern!
pattern.find(text, timeout: 1<s>) // Outputs: "is is"
Thus we ensure that all DFA regular expressions evaluate in linear time, and that NFA regular expressions are bounded so that they’re not vulnerable to ReDoS attacks.
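The timeout rule can be modelled in a few lines of Python. This is only a sketch of the API shape, not Iroh's implementation: the `Pattern` class and the feature-detection heuristic are assumptions made for illustration, and the deadline itself is not enforced because Python's `re` module has no timeout support.

```python
import re

# Hypothetical heuristic: treat back-references and lookbehinds as
# "NFA-level" features. A real compiler would inspect the parsed pattern.
_NFA_FEATURES = re.compile(r"\\[1-9]|\(\?<[=!]|\(\?P=")


class Pattern:
    """Toy stand-in for a compiled regex with an `nfa` flag."""

    def __init__(self, source):
        self.source = source
        self.nfa = bool(_NFA_FEATURES.search(source))
        self._compiled = re.compile(source)

    def find(self, text, timeout=None):
        # Only the rule "NFA patterns must be given a timeout" is modelled;
        # the timeout value is accepted but not enforced in this sketch.
        if self.nfa and timeout is None:
            raise ValueError("no timeout specified for an NFA pattern")
        match = self._compiled.search(text)
        return match.group(0) if match else None


pat_a = Pattern(r"(\w+)\s+\1")
pat_b = Pattern(r"^ERROR:.*$")
```

Making the timeout a required argument only for flagged patterns keeps the common linear-time case free of ceremony.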
Sensitive Data Types
In order to protect security-critical data like private keys, passwords, and
tokens, Iroh provides a built-in sensitive data type, e.g.
password = sensitive[string]("hunter2")
print(password) // Outputs: <redacted>
Standard library functions like print, log, and string formatting
automatically replace the value with <redacted> so as to avoid accidental
leakage of sensitive information.
To get the underlying value, it needs to be explicitly accessed via the leak
method, e.g.
password = sensitive[string]("hunter2")
print(password.leak()) // Outputs: hunter2
The show_last optional parameter can be used to show the last N characters,
e.g.
cc_number = sensitive[string]("371449635398431", show_last: 4)
print(cc_number) // Outputs: 8431
The compiler adds various additional safeguards to help protect the data, e.g.
-
The memory used by sensitive values is always zeroed when the values are deallocated or evicted from a register.
-
The memory is allocated using secure methods on supported platforms, e.g. using guard pages, cryptographic canaries, isolated heaps, prevention of swapping to disk, etc.
-
On platforms with secure enclaves, they are used to automatically encrypt and decrypt the memory used by sensitive values.
-
To prevent timing-based side-channel attacks, all comparisons and copying operations automatically execute in constant time.
-
Compiler optimizations are restricted to ensure that protections like zeroing memory are never removed or leak timing info, e.g. through dead code elimination, code folding, etc.
-
Sensitive values are excluded from debug information and core dumps.
-
Sensitive values need to be explicitly converted to other types.
Additionally, standard library functions automatically handle sensitive values
securely, preventing accidental exposure through printing, JSON serialization,
etc.
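The redaction behaviour can be approximated in Python as follows. This is a rough sketch under stated assumptions: the `Sensitive` class is hypothetical, and it covers only redacted printing, `leak`, `show_last`, and constant-time comparison. Memory zeroing, secure allocation, and compiler-level protections cannot be reproduced in plain Python.

```python
import hmac


class Sensitive:
    """Toy wrapper that redacts its value unless explicitly leaked."""

    def __init__(self, value, show_last=0):
        self._value = value
        self._show_last = show_last

    def __str__(self):
        # print(), str(), and f-strings all end up here.
        if self._show_last > 0:
            return self._value[-self._show_last:]
        return "<redacted>"

    __repr__ = __str__

    def __eq__(self, other):
        if not isinstance(other, Sensitive):
            return NotImplemented
        # compare_digest avoids content-dependent timing differences
        # (it may still reveal whether the lengths differ).
        return hmac.compare_digest(self._value.encode(), other._value.encode())

    def leak(self):
        """Explicitly access the underlying value."""
        return self._value


password = Sensitive("hunter2")
cc_number = Sensitive("371449635398431", show_last=4)
```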
The related no_print data type can be used for data that isn’t particularly
sensitive, but just shouldn’t be printed as it would clog up things like logs.
This is particularly useful in errors, where it could be helpful to have access to the raw data that generated an error, but without printing that data, e.g.
errorset {
InvalidToken(src no_print[string], token string, line int, col int)
}
By default, no_print values stringify to <unprintable>, e.g.
err = InvalidToken(src: src, token: "func", line: 42, col: 15)
print(err)
// Outputs:
// Invalid Token: src: <unprintable>, token: func, line: 42, col: 15
The .unwrap method can be used to explicitly access the underlying value, e.g.
if err is InvalidToken {
print("Source: ${err.src.unwrap()}")
}
Atom Values
Like Lisp, Iroh supports unique and immutable #atom values that can be used to
represent identifiers, states, and named values, e.g.
status = #ok
match status {
#ok:
// Handle okay status ...
#err:
// Handle error status ...
}
Atoms support a broader range of identifiers than variable names:
-
Must start and end with ASCII letters, numbers, or $. -
Can contain _, ., -, / in the middle. -
Pattern:
#[A-Za-z0-9\$](?:[A-Za-z0-9\$_./-]*[A-Za-z0-9\$])?
This makes atoms useful in broad contexts, e.g.
config = #database.url
namespace = #com.espra.scaffold
color = #c6e2ff
hashtag = #self-care
action = #video/pause
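As a quick check of the naming rules, the atom pattern can be exercised with Python's `re` module. The `is_atom` helper is a name introduced here for illustration. Note that the hyphen is placed last in the character class so that it is read as a literal `-` rather than as part of a range.

```python
import re

# Start and end with a letter, digit, or "$"; allow "_", ".", "/", and "-"
# only in the middle. The hyphen comes last so it is a literal character.
ATOM = re.compile(r"#[A-Za-z0-9$](?:[A-Za-z0-9$_./-]*[A-Za-z0-9$])?")


def is_atom(text):
    """Return True if the whole of `text` is a valid atom literal."""
    return ATOM.fullmatch(text) is not None


examples = [
    "#database.url",
    "#com.espra.scaffold",
    "#c6e2ff",
    "#self-care",
    "#video/pause",
]
valid = [is_atom(example) for example in examples]
```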
While usually atoms will be used as constants, they can be converted into
string values if needed, e.g.
color = #c6e2ff
string(color) == "c6e2ff" // true
Valid identifiers can be converted back to atom values, e.g.
color = "c6e2ff"
atom(color) == #c6e2ff // true
The __from__ magic method can be used to convert an atom to a custom datatype
when assigned. For example, the built-in color data type uses this to support
hex colours:
// Add magic method to the color data type:
(c color) __from__(value T) color {
if T == atom {
...
}
}
style.color = #ff0000 // Triggers __from__
As atoms are immutable, they make excellent keys for maps, sets, and other data structures that require stable identifiers, e.g.
transitions = {
(#idle, #start): #running,
(#running, #pause): #paused,
(#paused, #resume): #running
}
When you use a new atom literal in your code, the Iroh editor will prompt you to explicitly declare it for the package. This prevents typos and provides better type safety than string literals.
Other Data Types
Iroh provides a broader range of built-in data types than most languages. We believe this will make life easier as developers won’t need to reach for custom packages all the time.
Regional data:
-
country— Country identification and metadata, e.g.
c = country("US")
c == .us // true
c.name() == "United States" // true
c.official_name() == "The United States of America" // true
c.currency_code() == "USD" // true
c.flag_emoji() == "🇺🇸" // true
c.neighbors() == [.ca, .mx] // true
-
language— Language identification and metadata, e.g.
lang = language("Tamil")
lang == .ta // true
lang.native_name() == "தமிழ்" // true
-
locale— Cultural formatting rules for things like numbers, dates, and currencies, e.g.
loc = locale("en-US")
loc == .en_us // true
loc.country() == .us // true
loc.language() == .en // true
loc.format_number(1234.56) == "1,234.56" // true
loc.format_currency(1234.56) == "$1,234.56" // true
-
phone_number— International phone number validation and formatting with region support, e.g.
number = phone_number("07123 456 789", country: .uk)
number.canonical() == "+447123456789" // true
Network protocols:
-
ip_address— Validated IPv4 and IPv6 addresses, e.g.
ipv4 = ip_address("192.168.1.1")
ipv6 = ip_address("2001:db8::1")
ipv6_with_zone = ip_address("fe80::2%eth0")
-
cidr— CIDR network ranges with subnet operations, e.g.
network = cidr("192.168.1.0/24")
ip_address("192.168.1.50") in network // true
-
uri— Resource identifier supporting any scheme and internationalized resources, i.e. IRIs, e.g.
isbn = uri("urn:isbn:9780593358825")
admin_link = uri("http://人民网.中国/admin")
path = uri("file:///home/tav/document.pdf")
mail_link = uri("mailto:tav@example.com")
custom = uri("spotify:track:0VjIjW4GlUZAMYd2vXMi3b")
-
web_url- HTTP/HTTPS web addresses with methods for query parameters, paths, and web-specific operations, e.g.
link1 = web_url("https://google.com")
link2 = web_url("http://人民网.中国/")
-
domain_name- Domain name validation with internationalized domain support, e.g.
host1 = domain_name("espra.com")
host2 = domain_name("新华网.cn")
-
email_address- Email address validation supporting both ASCII and internationalized addresses, e.g.
addr1 = email_address("tav@example.com")
addr2 = email_address("tav@人民网.中国")
-
tilde_path— Espra tilde path validation and parsing, e.g.
link = tilde_path("~1.tav/journal/introducing-iroh")
link.tilde == 1 // true
link.parent() == "~1.tav/journal" // true
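For comparison, the `ip_address` and `cidr` behaviour described above closely mirrors what Python's standard `ipaddress` module already offers. The snippet below is only an analogue in another language, not a description of how Iroh implements these types.

```python
import ipaddress

network = ipaddress.ip_network("192.168.1.0/24")

# Membership tests check whether an address falls inside the CIDR range.
inside = ipaddress.ip_address("192.168.1.50") in network
outside = ipaddress.ip_address("192.168.2.50") in network

# The same constructor validates IPv6 addresses.
ipv6 = ipaddress.ip_address("2001:db8::1")
```

Invalid input is rejected at construction time, e.g. `ipaddress.ip_address("999.1.1.1")` raises `ValueError`, which is the kind of guarantee a validated data type provides over a plain string.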
Collections:
-
deque— A double-ended queue supporting efficient insertion and removal from both ends, e.g.
x = deque[int]{1, 2, 3}
x.append(4)
x.append_left(0)
x == [0, 1, 2, 3, 4] // true
x.pop() == 4 // true
x.pop_left() == 0 // true
-
sorted_list— A collection that automatically maintains elements in sorted order and allows duplicate elements, e.g.
x = sorted_list[int]{3, 1, 4, 1, 5}
x.append(2)
x == [1, 1, 2, 3, 4, 5] // true
-
ring_buffer— A fixed-size circular queue that overwrites old elements when full, e.g.
x = ring_buffer[int](capacity: 3)
x.append(1)
x.append(2)
x.append(3)
x.append(4) // overwrites the 1
x.pop() == 2 // removes oldest
x.pop() == 3 // true
-
bitset— A fixed-size collection of bits with efficient ways to manipulate them, e.g.
x = bitset(8)
x.set(3)
x.set(7)
x.clear(3)
x.flip(3)
x.get(3) == true // true
x.count() == 2 // true
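The ring buffer and sorted list semantics above can be reproduced with Python's standard library, which may help readers coming from that ecosystem. This is an analogue only: `deque(maxlen=...)` and `bisect.insort` are Python facilities, not Iroh APIs.

```python
import bisect
from collections import deque

# A bounded deque behaves like a ring buffer: once full, appending
# discards the oldest element.
ring = deque(maxlen=3)
for n in (1, 2, 3, 4):
    ring.append(n)  # appending 4 discards 1
oldest = ring.popleft()

# insort keeps a list sorted on insertion and permits duplicates.
sorted_items = sorted([3, 1, 4, 1, 5])
bisect.insort(sorted_items, 2)
```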
Process signals:
-
signal— Handle incoming operating system signals, e.g.
c = chan[signal]{}
signal.notify(c, .sigint, .sigterm)
sig = <-c
match sig {
    .sigint:
        print("User interrupted with Ctrl+C")
    .sigterm:
        print("Got termination request")
}
Color data types accessible via color:
| Type | Description | Bit Width |
|---|---|---|
| Basic RGB | | |
| color.hex | Hexadecimal notation (#RRGGBB, #RRGGBBAA, #RGB, #RGBA) | 24/32-bit |
| color.rgb | RGB (Red, Green, Blue) in sRGB color space | Variable |
| color.rgba | RGB with Alpha transparency in sRGB color space | Variable |
| Color Spaces | | |
| color.hsi | Hue, Saturation, Intensity | Variable |
| color.hsl | Hue, Saturation, Lightness | Variable |
| color.hsla | Hue, Saturation, Lightness, Alpha | Variable |
| color.hsv | Hue, Saturation, Value | Variable |
| color.hwb | Hue, Whiteness, Blackness | Variable |
| color.ictcp | ITU-R BT.2100 ICtCp color space for HDR broadcast | Variable |
| color.lab | CIELAB perceptually uniform color space | Variable |
| color.lch | Lightness, Chroma, Hue cylindrical | Variable |
| color.luv | CIELUV alternative uniform color space | Variable |
| color.oklab | Modern perceptual color space | Variable |
| color.oklch | Cylindrical version of Oklab | Variable |
| color.xyz | CIE XYZ tristimulus values with customizable white point | Variable |
| HDR Formats | | |
| color.dolby_vision | Dolby Vision HDR format | Variable |
| color.hdr10 | High Dynamic Range for modern displays | Variable |
| color.scrgb | Extended sRGB with values beyond 0-1 for HDR | Variable |
| Legacy & Low Bit Depth | | |
| color.grayscale | Grayscale/monochrome (single luminance channel) | Variable |
| color.grayscale16 | 16-bit grayscale | 16-bit |
| color.grayscale8 | 8-bit grayscale | 8-bit |
| color.indexed | 8-bit palette/lookup table colors | 8-bit |
| color.monochrome | 1-bit monochrome (black and white only) | 1-bit |
| color.rgb_332 | 3-3-2 RGB (early PC graphics) | 8-bit |
| color.rgb_444 | 4-4-4 RGB (12-bit color) | 12-bit |
| color.rgb_555 | 5-5-5 RGB (15-bit high color) | 15-bit |
| color.rgb_565 | 5-6-5 RGB (16-bit high color, common in embedded) | 16-bit |
| Platform Byte Orders | | |
| color.abgr | Alpha-Blue-Green-Red byte order | Variable |
| color.argb | Alpha-Red-Green-Blue byte order | Variable |
| color.bgra | Blue-Green-Red-Alpha byte order | Variable |
| Premultiplied Alpha | | |
| color.premul_argb | Premultiplied alpha in ARGB byte order for graphics APIs | Variable |
| color.premul_rgba | RGBA with premultiplied alpha for compositing | Variable |
| color.premul_rgba8 | 8-bit RGBA with premultiplied alpha | 32-bit |
| color.cmyk | Cyan, Magenta, Yellow, Key (black) | Variable |
| Professional & Film | | |
| color.aces | Academy Color Encoding System (film/VFX) | Variable |
| color.dci_p3 | DCI-P3 digital cinema standard | Variable |
| color.prophoto_rgb | ProPhoto RGB wide gamut for professional photography | Variable |
| RGB Color Spaces | | |
| color.adobe_rgb | Adobe RGB wide gamut for professional photography | Variable |
| color.bt601 | ITU-R BT.601 standard definition TV | Variable |
| color.display_p3 | Apple’s wide gamut for modern displays | Variable |
| color.eci_rgb | European Color Initiative RGB v2 for print | Variable |
| Specialized | | |
| color.linear_rgb | RGB without gamma correction for 3D rendering | Variable |
| color.named | Named colors (CSS, X11, etc.) | Variable |
| Specific Bit Depths | | |
| color.rgb8 | 8-bit RGB (24-bit total, 8-bit per channel) in sRGB | 24-bit |
| color.rgb10 | 10-bit RGB (deep color) | 30-bit |
| color.rgb12 | 12-bit RGB (cinema/professional) | 36-bit |
| color.rgb16 | 16-bit RGB for high precision | 48-bit |
| color.rgb_float32 | RGB with 32-bit float values per channel | 96-bit |
| color.rgba8 | 8-bit RGBA in sRGB | 32-bit |
| color.rgba16 | 16-bit RGBA in sRGB | 64-bit |
| color.rgba_float16 | Half-precision float RGBA for modern GPU rendering | 64-bit |
| Video & Broadcast | | |
| color.bt709 | ITU-R BT.709 HDTV standard (sRGB for video) | Variable |
| color.rec_2020 | Ultra HD TV standard wide gamut | Variable |
| color.rec_2100 | ITU-R BT.2100 HDR standard | Variable |
| color.yiq | NTSC color encoding standard | Variable |
| color.yuv | Luma + chroma for video encoding | Variable |
| color.yuv420 | YUV 4:2:0 chroma subsampling | Variable |
| color.yuv422 | YUV 4:2:2 chroma subsampling | Variable |
| color.yuv444 | YUV 4:4:4 no chroma subsampling | Variable |
Spatial co-ordinates:
-
geopt— Represents a point on the surface or near surface of any celestial body, e.g.
london = geopt(51.5074, -0.1278)
paris = geopt(48.8566, 2.3522)
// By default, geopt values are at sea level on Earth:
london.altitude == 0<km> // true
london.body == earth // true
// Calculations can then be done on geopt values, e.g.
london.distance_to(paris)
london.compass_bearing_to(paris)
// The geopt value takes several optional parameters:
// altitude, reference frame, coordinate epoch, and
// any corrections that have been applied, e.g.
gale_crater = geopt(
    -5.4, 137.8,
    altitude: -4.4<km>,
    body: mars,
    at: datetime(2025, 3, 18, 9, 12, 30),
    corrections: .relativity.atmospheric_delay.tidal_effects
)
-
celestial_body— Defines the physical, geometric, and geodetic reference properties of a celestial body, e.g.
earth = celestial_body{
    name: "Earth",
    altitude_reference: .ellipsoid,
    dynamic_frame: false,
    equatorial_radius: 6378.137<km>,
    flattening: 0.0033528106647474805,
    kind: .planet,
    gravity: 9.80665<m/s²>,
    motion_model: "Earth/WGS84/PlateMotion",
    polar_radius: 6356.752<km>,
    prime_meridian: {
        name: "Royal Observatory, Greenwich",
        angle: 0<deg>,
        coords: {
            latitude: 51.477928,
            longitude: 0.0
        },
        epoch: datetime(2000, 1, 1, 12, 0, 0),
        reference_frame: "WGS84"
    },
    rotation_axis: {
        declination: 90<deg>,
        right_ascension: 0<deg>
    },
    rotation_period: 23.9344696<h>
}
Iroh comes with various built-in definitions: earth, moon, mars, venus, mercury, jupiter, europa, ceres, bennu, etc.
Random data:
-
rand— Generate random data, e.g.
// Generate a random fixed128 value:
x = rand()
// Generate a non-cryptographically secure random fixed128 value:
y = rand(weak: true)
// Generate a random int64 value between 0 and 1,000:
z = int64(rand(from: 0, to: 1000))
// Generate an array of 10 random elements:
w = [10]int64{} << rand()
Version data:
-
version— Handle software version numbers, e.g.
version = enum {
    alpha_numeric(value string)
    calver(year int, month int, day ?int, suffix ?string)
    four_part(major int, minor int, patch ?int, revision ?int)
    major_minor(major int, minor int)
    old_school(major int, minor int, patch ?int, revision ?int, suffix ?string)
    semver(major int, minor int, patch int, pre_release ?string, build ?string)
    serial(number int, prefix ?string, suffix ?string)
    timecode(moment datetime)
    yyyymmdd(date int)
}
tzdb_ver = version.alpha_numeric("2025g") // 2025g
ubuntu_ver = version.calver(2025, 4) // 2025.04
macos_ver = version.four_part(0, 15, 7, 1) // 0.15.7.1
http_ver = version.major_minor(1, 1) // 1.1
gtk_ver = version.old_school(4, 1, 0) // 4.1.0
pkg_ver = version.semver(1, 2, 3, "beta") // 1.2.3-beta
build_ver = version.serial(1234, "v") // v1234
espra_ver = version.timecode(time.now()) // 2025-GBCY
openssl_ver = version.yyyymmdd(20250321) // 20250321
// Versions can be parsed from strings, e.g.
new_ver = version.parse("4.5.6-alpha", .semver)
// Versions can be incremented, e.g.
pkg_v2 = pkg_ver.increment_major() // 2.0.0
pkg_v2 = pkg_v2.increment_minor() // 2.1.0
// Versions can be compared, e.g.
if pkg_v2 > pkg_ver {
    ...
}
Unique IDs:
-
uuid— Generates unique IDs in various formats, e.g.
id = uuid(.v7)
id == "061cb26a-54b8-7a52-8000-2124e7041024" // true
The supported formats are:
enum {
    // Standard UUID Versions:
    .v1 // Time-based
    .v2 // DCE Security
    .v3 // Name-based (MD5)
    .v4 // Random
    .v5 // Name-based (SHA-1)
    .v6 // Time-ordered v1
    .v7 // Unix time-ordered with random data
    .v8 // Custom
    // Alternative Unique ID Formats:
    .cuid // Collision-resistant Unique ID
    .flake // Flake ID
    .ksuid // K-Sortable Unique ID
    .nanoid // Nano ID
    .objectid // ObjectID
    .pushid // PushID
    .snowflake // Snowflake ID
    .tsid // TSID
    .ulid // Universally Unique Lexicographically Sortable Identifier
    .xid // XID
}
Existing values can be validated and normalized using uuid.parse, e.g.
id = uuid.parse("017F21CF-D130-7CC3-98C4-DC0C0C07398F", .v7)
id == "017f21cf-d130-7cc3-98c4-dc0c0c07398f" // true
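The validate-and-normalize step can be illustrated with Python's standard `uuid` module, which parses the same example value and exposes the version encoded in it. This is an analogue in another language rather than Iroh's `uuid.parse`.

```python
import uuid

# Parsing validates the text and normalizes it to lowercase
# hyphenated form.
parsed = uuid.UUID("017F21CF-D130-7CC3-98C4-DC0C0C07398F")
canonical = str(parsed)

# The version number is read from the UUID's version bits.
version = parsed.version
```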
Implicit Contexts
Programs often need access to various contextual information, e.g. the specific logger to use, the current session id, config settings, etc. Typically, this would be done using:
-
Global values, which are problematic because they can’t be easily overridden for specific contexts.
-
Thread-local storage values, which are isolated per thread, but can’t be easily constrained to specific segments of code.
-
Dependency injection frameworks, which add a lot of unnecessary complexity and make it hard to trace dependencies and their effects.
Even Go, which has one of the better approaches, requires a context.Context
value to be manually threaded through all functions in a call stack just so that
it’s available somewhere deep in the stack.
To simplify all this, Iroh takes an approach similar to Jai and Odin, with
implicit context values. Packages can define top-level variables as a
context_tag, e.g.
// Inside app/config.iroh:
// Ensure that the context value is a UserID:
root_user = context_tag(UserID)
These can then be defined within with constructs and accessed anywhere within
the call stack, e.g.
// Inside the app code:
import "app/config"
func is_root_user(user UserID) bool {
return user == config.root_user
}
func handle_user_action(user UserID, action Action) {
match action {
.delete_project:
if !is_root_user(user) {
return OperationNotPermitted("Only root users can delete projects")
}
...
}
}
func main() {
with {
config.root_user = config.UserID(123)
} {
for {
user = get_next_user()
action = get_next_action()
handle_user_action(user, action)
}
}
}
This eliminates the need for:
-
Passing context parameters through every function.
-
Global state management.
-
Complex dependency injection setups.
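Readers familiar with Python may find it helpful to compare this with `contextvars`, which offers a loosely similar scoped-value mechanism. The sketch below is an approximation only: `with_context` is a helper written for this example, and Python performs none of the compile-time tracking that Iroh is described as doing.

```python
from contextlib import contextmanager
from contextvars import ContextVar

# Rough analogue of `root_user = context_tag(UserID)`.
root_user = ContextVar("root_user")


@contextmanager
def with_context(var, value):
    """Bind `var` to `value` for the duration of the block."""
    token = var.set(value)
    try:
        yield
    finally:
        var.reset(token)  # restore whatever was bound before


def is_root_user(user):
    # Reads the contextual value without it being passed as a parameter.
    return user == root_user.get()


def handle_user_action(user, action):
    if action == "delete_project" and not is_root_user(user):
        return "denied"
    return "ok"


with with_context(root_user, 123):
    results = [
        handle_user_action(123, "delete_project"),
        handle_user_action(456, "delete_project"),
    ]
```

Outside the `with` block the variable is unbound again, so a stray `root_user.get()` raises `LookupError` instead of silently seeing a stale value.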
Interfaces can also be used as a context tag as long as they don’t have any type methods, e.g.
// Within the `llm` package:
// The Provider interface is defined:
Provider = interface {
query(msg string) string
tools() []Tool
}
func query_with_tools(msg string) string {
tools = Provider.tools()
// Prepare message for things like tool
// calling, etc.
...
// Call the contextually specified provider
// by calling a method on the interface as if
// it were a value:
resp = Provider.query(msg)
// Do other processing.
...
return resp
}
// Within application code:
import "llm"
func handle_input(msg string) {
msg = msg.strip()
if not msg {
return
}
resp = llm.query_with_tools(msg)
print(resp)
}
func run_cli_app() {
for {
// Get user input:
input = ...
handle_input(input)
}
}
with {
llm.Provider = openai.API()
} {
run_cli_app()
}
The with construct allows for an implementation of the interface to be
assigned to the interface. At any point within the call stack, method calls on
the interface are done on the defined value.
Package authors can also mark any struct or enum type as being a
context_tag, e.g.
// Inside say an OpenTelemetry `trace` package:
Provider = struct(context_tag: true) {
...
}
These can then be set using <pkg>.<identifier> within with constructs, e.g.
import "trace"
with {
trace.Provider = ...
} {
...
}
And accessed the same way too, e.g.
span = trace.new_span()
...
trace.Provider.export_span(span)
Compared to solutions in other languages, our implicit context values:
-
Make it easy to override contextual values.
-
Clearly mark the scopes in which context values are being overridden.
-
Don’t pollute function signatures with context parameters.
The context values used by functions are tracked and propagated up call stacks. These are made visible in the Iroh editor and enforced to ensure that necessary context values are set.
Various context tags are baked into the language, e.g.
-
mem.allocator— specifies the memory allocator to use when allocating values. -
io.done— indicates if the current context has been cancelled. -
io.logger— specifies the logger for emitting logs. -
io.scheduler— specifies the scheduler to use when spawning tasks. -
sys.locale— specifies the locale to use for things like string formatting.
For added convenience, any context tags within the mem, io, and sys
built-in packages can have the package name elided within with constructs,
e.g.
// Full form:
with {
sys.locale = .en
} {
...
}
// With package name elided:
with {
.locale = .en
} {
...
}
Memory Management
Iroh tries to provide the ergonomics of a managed memory language like Go or Python, while enabling fine-grained control like in C, along with safety like in Rust.
Thanks to the fact that Iroh does whole-program analysis, it can:
-
Do fairly comprehensive escape analysis and allocate a lot more on the stack without having to resort to the heap all the time.
-
Automatically switch to an arena allocator for certain patterns, e.g. where many small objects are allocated with the same short lifetime, for temporary data structures in a scope, etc.
-
Automatically detect when values are no longer needed, e.g. when they go out of scope, and try to batch deallocate these.
-
Automatically wrap values with reference counting, i.e. the equivalent of Rc<RefCell<T>> in Rust, when values need to be shared. -
Optimize away unnecessary reference counting operations at compile-time, minimizing their runtime cost and batching the remainder wherever possible.
-
Automatically wrap values with thread-safe reference counting and locks, i.e. the equivalent of Arc<Mutex<T>> in Rust, when values are shared across threads. -
Automatically switch to lock-free concurrent data structures in certain contexts, e.g. for maps that are concurrently read by multiple tasks, etc.
-
Transparently switch to using indices instead of pointers for cyclical data structures so that they can be created without any ceremony.
Thus Iroh is able to provide the safety of Rust without the borrow checker complexity. We also avoid the latency overhead of garbage collection, allowing us to approach C-like performance.
The Iroh editor makes all of the compiler decisions visible so that developers can understand the performance implications of their design without being constrained by the compiler.
Depending on the execution mode, each Iroh program has a root allocator, e.g.
- The standard mode uses the page_allocator, which maps memory directly through OS-specific calls like mmap and VirtualAlloc.
The with construct can then be used to override the allocator in specific
contexts, e.g.
func handle_http_request(req Request) {
with {
.allocator = mem.arena_allocator()
} {
...
}
}
The auto-imported mem package provides various implementations that satisfy
the built-in mem.allocator interface, e.g.
-
general_allocator— The recommended allocator for typical use cases. It takes inspiration from the likes of jemalloc, mimalloc, rpmalloc, snmalloc, tcmalloc, etc. -
arena_allocator— Fast allocator with bulk deallocation. -
c_allocator— Wrapper around the C allocator for interop with C code. -
failing_allocator— Wraps another allocator and can be configured to fail under certain conditions. This is useful for injecting faults, testing, etc. -
fixed_buffer_allocator— Allocates from a pre-allocated buffer for bounded memory usage. -
monotonic_allocator— An allocator that never frees for processes that don’t care to release any memory. -
pool_allocator— Free list allocator for values that are frequently created and destroyed. -
slab_allocator— A size-classed segregated allocator for fast allocation with reduced fragmentation. -
stack_allocator— Fast LIFO allocator that follows strict deallocation order. -
tracking_allocator— Wraps another allocator to track usage, e.g. for profiling.
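To make the allocator descriptions more concrete, here is a toy Python model of the idea behind a fixed-buffer allocator with arena-style bulk deallocation. Everything here is hypothetical: the class name, the `create` and `reset` methods, and the default alignment are illustrative choices, not the `mem.allocator` interface.

```python
class FixedBufferAllocator:
    """Toy bump allocator over a pre-allocated buffer (illustrative only)."""

    def __init__(self, capacity):
        self.buffer = bytearray(capacity)
        self.offset = 0

    def create(self, size, align=8):
        # Round the current offset up to the requested power-of-two alignment.
        start = (self.offset + align - 1) & ~(align - 1)
        if start + size > len(self.buffer):
            raise MemoryError("out of memory")
        self.offset = start + size
        return memoryview(self.buffer)[start:start + size]

    def reset(self):
        # Bulk deallocation: every previous allocation is released at once.
        self.offset = 0


allocator = FixedBufferAllocator(32)
first = allocator.create(5)   # occupies bytes 0..4
second = allocator.create(8)  # aligned up, occupies bytes 8..15
```

Allocation is a single pointer bump and memory use is strictly bounded by the buffer, which is why this style suits request-scoped or embedded workloads.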
Allocators can be stored and re-used by values, e.g.
MyMap = struct {
_allocator allocator
...
}
(m MyMap) {
__init__() {
// Save the allocator that was used to create
// this instance.
m._allocator = mem.allocator
}
__setitem__() {
with {
mem.allocator = m._allocator
} {
// Use the same allocator if we need to get
// resized, etc.
}
}
}
When a struct value has an allocator field, the Iroh editor will explicitly
mark lines within methods that don’t use the stored allocator, or any allocators
derived from it, for allocations.
As a shortcut for specifying the mem.allocator, it’s possible to use the
special @allocator function parameter instead, e.g.
// Full-form approach:
with {
mem.allocator = mem.pool_allocator(User, 100)
} {
user = User{}
}
// Condensed form:
allocator = mem.pool_allocator(User, 100)
user = User(@allocator: allocator){}
Developers can define their own custom allocators as long as they implement the
mem.allocator interface:
allocator = interface {
__create__(T type) @allowed_errors(std.OutOfMemory)
__destroy__(value T) @no_error
...
}
Implementations can make use of compile-time functions like @size_of to
determine the necessary amount of memory. And since __destroy__ must use
@pointer, it will be marked as unsafe.
Custom allocators can be used just like built-in ones, e.g.
allocator = MyAllocator{}
user = User(@allocator: allocator){}
...
drop(user) // Automatically deallocates using the same allocator.
The compiler, along with the automatic runtime checks for refcounted values, ensures that values aren’t used improperly, e.g. no double frees, no use-after-free, etc.
Dropping Values
Values are deinitialized once they’re no longer needed:
-
For values which are not shared, this happens when the scope that references them ends.
-
For shared values, this happens when their reference count gets to zero.
Before the memory used by a value gets deallocated, the value is first “dropped”
like in Rust. This will call the magic __drop__ method on the value if it’s
been specified for that type.
Dropping takes place in the following order:
-
First, __drop__ is called on the value being dropped. -
Then, for composite types, each element of the value is processed in sequence or field order:
-
Non-shared values will be dropped.
-
Shared values will have their reference count decremented.
-
-
Finally, the memory used by the value is deallocated with a call to the corresponding allocator’s __destroy__ method.
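The drop order above can be traced with a small Python simulation. This is a sketch under stated assumptions: the `Value`, `Shared`, and `Allocator` classes and the `on_drop` and `destroy` names are stand-ins invented for this example, and a shared value is assumed to be dropped once its count reaches zero.

```python
log = []


class Allocator:
    def destroy(self, value):  # stand-in for `__destroy__`
        log.append(f"destroy {value.name}")


class Shared:
    """A reference-counted handle around a value."""

    def __init__(self, value, refs=1):
        self.value = value
        self.refs = refs


class Value:
    def __init__(self, name, allocator, fields=()):
        self.name = name
        self.allocator = allocator
        self.fields = list(fields)

    def on_drop(self):  # stand-in for `__drop__`
        log.append(f"drop {self.name}")


def drop(value):
    value.on_drop()                      # 1. the value's own drop hook
    for field in value.fields:           # 2. fields, in field order
        if isinstance(field, Shared):
            field.refs -= 1
            log.append(f"decref {field.value.name}")
            if field.refs == 0:
                drop(field.value)
        else:
            drop(field)
    value.allocator.destroy(value)       # 3. release the memory


alloc = Allocator()
cache = Shared(Value("cache", alloc), refs=2)  # also referenced elsewhere
conn = Value("conn", alloc)
session = Value("session", alloc, fields=[conn, cache])
drop(session)
```

After the call, `log` records the session's hook first, then the owned `conn` being dropped and destroyed, then the shared `cache` being decremented but kept alive, and finally the session's own memory being released.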
The __drop__ method can be used to cleanup resources, e.g.
PooledDBConnection = struct {
conn ?RawDBConnection
pool DBConnectionPool
}
(c PooledDBConnection) func __drop__() {
if !c.conn {
return
}
// Explicitly move the raw connection from being owned,
// leaving behind `nil` as the `c.conn` value.
conn = <- c.conn
if conn.is_active() {
// Return the connection to the pool.
c.pool.return_connection(conn)
}
}
Iroh enforces two guarantees to make dropping robust:
-
Any values that need to be dropped are always dropped before a function returns. This happens even if that function generates an error, thus ensuring proper cleanup of resources.
-
Drop methods must be infallible, i.e. they cannot generate any errors. This avoids the issue in other languages where finalizers can error and leave systems in an unknown state.
As Iroh needs access to the specific allocator that was used to create a value
in order to destroy it, it keeps track of this information:
-
If the specific allocator goes out of scope, then Iroh will automatically wrap the value into a tuple containing both itself and the allocator, so that the allocator is always available.
-
In contexts where a value will definitely not get dropped, Iroh will elide any wrapping.
-
For allocators that don’t do any work in their __destroy__ methods, e.g. the built-in arena_allocator, Iroh will elide any wrapping.
Developers can force a value to be dropped at a specific point by calling the
built-in drop function, e.g.
func save_user(user User) {
user.save()
drop(user)
}
As the value will no longer be accessible once it has been dropped, Iroh will generate edit-time errors for any subsequent uses of the value, e.g.
save_user(user)
print("Saved User: ${user.id}") // ERROR! The `user` value was dropped in `save_user`!
Values that Iroh shouldn’t automatically drop can be marked with @forget. This
is useful in contexts like calling C functions, e.g.
func(extern: true) EVP_PKEY_free(key EVP_PKEY)
EVP_PKEY_free(key) // Pass the key to C to free.
@forget(key) // Tell Iroh to forget about it.
Field Annotations
Composite types, i.e. struct and enum types, can be annotated with
structured data. The annotations can be on individual fields or the type as a
whole, e.g.
Person = struct {
dob date json.Field{date_format: .rfc3339}
name string sql.Field{name: "user_name"}
updated_at ?datetime json.Field{omit_empty: true}
} sql.Table{name: "people"}
Unlike in Go, where annotations have to be shoved into string values, e.g.
type Person struct {
Name string `json:"name" db:"user_name"`
}
Iroh annotations can be any compile-time evaluatable value. This can be used to easily add things like custom encodings, serialization, validation, etc.
BlogPost = struct {
contents encoded_string string_encoding.iso_8859_1
}
Application code can introspect the specific annotations at compile-time to drive behaviour, e.g.
annotation = @type_info(Person).struct.fields["updated_at"].annotation[json.Field]
if annotation {
if annotation.omit_empty {
// skip empty value ...
}
}
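A comparable pattern exists in Python through `typing.Annotated`, where structured values are attached to field types and read back by introspection. The sketch below is an analogue only: `JsonField` and `annotation_for` are hypothetical names, and the lookup happens at runtime rather than at compile-time as in Iroh.

```python
from dataclasses import dataclass
from typing import Annotated, Optional, get_type_hints


@dataclass(frozen=True)
class JsonField:  # hypothetical analogue of `json.Field`
    omit_empty: bool = False


@dataclass
class Person:
    name: str
    updated_at: Annotated[Optional[str], JsonField(omit_empty=True)]


def annotation_for(cls, field, kind):
    """Return the first annotation of type `kind` on `field`, if any."""
    hint = get_type_hints(cls, include_extras=True)[field]
    for meta in getattr(hint, "__metadata__", ()):
        if isinstance(meta, kind):
            return meta
    return None


annotation = annotation_for(Person, "updated_at", JsonField)
```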
Iroh provides several built-in @-annotations. For example, a struct field can
be marked as @required:
Person = struct {
name string @required
location string
}
// Person values must now provide a `name` value during
// initialization, e.g.
tav = Person{name: "Tav"}
// Not doing so will result in an edit-time error, e.g.
zaia = Person{location: "London"} // ERROR!
Similarly, fields can be marked as @deprecated, e.g.
Config = struct {
host string
port uint16
verify_ssl bool @deprecated
}
// Usage of deprecated fields will result in a compile-time
// warning, e.g.
if cfg.verify_ssl { // WARNING!
...
}
The @deprecated annotation can also specify a custom message to explain the
deprecation, e.g.
Config = struct {
...
verify_ssl bool @deprecated("Unnecessary as we always verify for better security")
}
Field numbers can be assigned to struct fields for use in serialization formats like Protobuf, e.g.
Request = struct {
method string @1
args []string @2
}
The compiler automatically ensures that the field numbers are unique, e.g.
Request = struct {
method string @1
args []string @1 // ERROR! Field number already used for `method`!
}
Layout Control
Iroh provides manual control over the memory layout of struct and enum
types. The backing parameter specifies the underlying storage type for:
-
Enum tags - controls the integer type used to store enum variants, e.g.
Color = enum(backing: uint20be) {
    red
    blue
    green
}
-
Boolean fields in all-bool structs, enabling compact bitset storage when combined with packed, e.g.
Flags = struct(backing: uint32, packed: true) {
    read: false,
    write: false,
    appendable: false,
}
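The effect of packing an all-bool struct into a single backing integer can be sketched in Python. The bit assignments and the little-endian encoding below are assumptions made for illustration; the source does not specify how Iroh orders the bits.

```python
import struct

FLAG_BITS = {"read": 0, "write": 1, "appendable": 2}  # assumed bit order


def pack_flags(**flags):
    """Pack boolean fields into one uint32, one bit per field."""
    word = 0
    for name, bit in FLAG_BITS.items():
        if flags.get(name, False):
            word |= 1 << bit
    return struct.pack("<I", word)


def unpack_flags(data):
    """Recover the boolean fields from the packed uint32."""
    (word,) = struct.unpack("<I", data)
    return {name: bool((word >> bit) & 1) for name, bit in FLAG_BITS.items()}


packed = pack_flags(read=True, appendable=True)
```

Three booleans that would otherwise take a byte each fit in three bits of one 4-byte word, and up to 32 flags fit in the same space.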
The mem parameter allows for the in-memory representation of the fields to be
specified, e.g.
// Use the Arrow columnar storage format:
Events = struct(mem: .arrow) {
timestamps []int64
events []string
}
This allows for data to be represented with zero-copy interoperability with various formats:
-
.arrow— Apache Arrow columnar format. -
.ebb— Espra Binary Buffer memory layout. -
.c— matches the C ABI for interoperating with C libraries (default for extern types). -
.capnproto— Cap’n’Proto memory layout. -
.flat_buffers— FlatBuffers memory layout. -
.native— Architecture-specific native layout (the default).
Specifying .custom allows for custom memory layouts through types that
implement the mem_layout interface, e.g.
// Use a custom Simple Binary Encoding implementation:
Order = struct(mem: .custom, mem_layout: SBE) {
order_id uint64
price float64
amount uint32
side byte
symbol [8]byte
}
Native support for different memory layouts makes high-performance serialization formats like EBB and FlatBuffers much more ergonomic to use, e.g.
// Automatically stored as a FlatBuffer in memory:
order = mypkg.Order{
order_id: 123,
price: 1.2345,
amount: 100,
side: 'B',
symbol: [8]byte("GBPUSD ")
}
// Get an immutable view of the in-memory representation:
buf = mem_view(order)
Compare this to the equivalent Go code using traditional FlatBuffers:
import (
"github.com/google/flatbuffers/go"
"mypkg/order"
)
builder := flatbuffers.NewBuilder(0)
symbol := builder.CreateByteVector([]byte("GBPUSD "))
order.OrderStart(builder)
order.OrderAddOrderId(builder, 123)
order.OrderAddPrice(builder, 1.2345)
order.OrderAddAmount(builder, 100)
order.OrderAddSide(builder, 'B')
order.OrderAddSymbol(builder, symbol)
orderOffset := order.OrderEnd(builder)
builder.Finish(orderOffset)
buf := builder.FinishedBytes()
The mem_view function can also access individual field representations, e.g.
buf = mem_view(order.price)
The compiler automatically applies custom memory layouts to arrays and slices, ensuring consistent formatting throughout, e.g.
orders = []Order{order}
// Get the entire array/slice in the custom format:
buf = mem_view(orders)
// Or even individual elements:
elem_buf = mem_view(orders[0])
The alignment of types can be controlled through the align parameter, e.g.
// Align struct to a 64-byte boundary:
Vector3 = struct(align: 64) {
x float64
y float64
z float64
}
Likewise, the @align annotation can be used to control alignment on individual
fields, e.g.
PacketHeader = struct {
version uint8
flags uint8
length uint32
checksum uint32
timestamp uint64 @align(16)
}
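To make the effect of @align more concrete, here is a rough Python sketch of how field offsets might be computed for the PacketHeader example. It assumes C-like rules (each field is placed at the next multiple of its alignment, and the struct size is rounded up to the largest field alignment), which this document does not confirm for Iroh's native layout. The compute_layout helper and the per-field sizes are assumptions made for illustration.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def compute_layout(fields):
    """Return ({name: offset}, total_size, struct_alignment).

    fields is a list of (name, size, alignment) tuples in declaration order.
    """
    offsets = {}
    offset = 0
    struct_align = 1
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, alignment)
    # Pad the tail so that arrays of the struct stay aligned.
    return offsets, align_up(offset, struct_align), struct_align


# PacketHeader from the example above; timestamp carries @align(16).
packet_header = [
    ("version", 1, 1),
    ("flags", 1, 1),
    ("length", 4, 4),
    ("checksum", 4, 4),
    ("timestamp", 8, 16),
]

offsets, size, alignment = compute_layout(packet_header)
print(offsets)          # timestamp lands at offset 16 rather than 12
print(size, alignment)  # 32 16
```

Under these assumptions, the annotation moves timestamp from offset 12 to offset 16 and raises the alignment of the whole struct to 16 bytes.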
Function Decorators
Similar to decorators in Python, Iroh allows decorators to extend or modify the behaviour of functions and methods in a clean, reusable, and expressive way, e.g.
app = http.Router{}
@app.get("/items/#{item_id}", {response: .json})
func get_item(item_id int) {
// fetch item from the database
return {"item_id": item_id, "item": item}
}
Decorators are evaluated at compile-time, enabling extensibility without any
runtime overhead. The built-in @decorator specifies if a function or method
can be used as a decorator.
The first parameter to a decorator is always the function that is being decorated. The decorator can wrap the function or replace it entirely, e.g.
@decorator
func (r Router) get(handler function, path template, config Config) function {
// register path with the router and handle parameter conversion
return (handler.parameters) => {
response = handler(handler.parameters)
match config.response {
.json:
r.encode_json(response)
default:
// handle everything else
}
}
}
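For readers more familiar with Python, the same wrapping pattern can be sketched as below. This is only an analogy: Python decorators run when the module is loaded rather than at compile time, and the Router class here is a hypothetical stand-in, not Iroh's http.Router.

```python
import json


class Router:
    """Toy router that maps paths to wrapped handlers."""

    def __init__(self):
        self.routes = {}

    def get(self, path, response="json"):
        def decorate(handler):
            def wrapped(*args, **kwargs):
                result = handler(*args, **kwargs)
                # Encode the handler's return value based on the config.
                if response == "json":
                    return json.dumps(result)
                return result

            # Register the wrapped handler and replace the original with it.
            self.routes[path] = wrapped
            return wrapped

        return decorate


app = Router()


@app.get("/items/{item_id}")
def get_item(item_id):
    return {"item_id": item_id}


print(app.routes["/items/{item_id}"](42))  # prints {"item_id": 42}
```

As in the Iroh example, the decorator receives the handler, registers it with the router, and returns a replacement that post-processes the handler's result.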
Inlining
The inline parameter tells Iroh to try and inline a function or loop at the
call site for better performance, e.g.
func(inline: true) add(a int, b int) int {
return a + b
}
As a result of the inline hint, Iroh will directly insert the code of add
wherever it’s called instead of doing a regular function call, e.g.
// This code:
c = add(a, b)
// Gets transformed into:
c = a + b
This can be beneficial in performance critical code as the overhead of the
function call is eliminated. Similarly, func(inline: false) can be used to
prevent a function from being inlined, e.g.
func(inline: false) something_complex() {
...
}
This can be useful in a number of cases, e.g. better debugging thanks to improved stack traces, minimizing instruction cache misses caused by ineffective inlining, etc.
The for(inline) hint can be used to inline loops, e.g.
elems = [1, 2, 3]
for(inline) elem in elems {
process(elem)
}
If the length of the iterable is known at compile time, this will act as a hint to unroll the loop, i.e.
// This code:
for(inline) elem in elems {
process(elem)
}
// Gets transformed into:
process(elems[0])
process(elems[1])
process(elems[2])
// Or if the element values are known, perhaps even:
process(1)
process(2)
process(3)
If the length of the iterable isn’t known at compile time, then the
for(inline) will act as a hint to the compiler to more aggressively optimize
the loop, e.g.
-
Inline any small function calls within the loop body.
-
Move any loop-invariant code outside of the loop.
-
If possible, convert the loop to use SIMD instructions.
Function Annotations
Functions can be annotated within func() to control their behavior. For
example, to specify a calling convention:
// Use the System V AMD64 ABI:
func(.x64_sysv) add(a int, b int) int {}
// Use the ARM64 SVE calling convention for vector parameters:
func(.arm64_sve) simd_multiply(a *[8]float32, b *[8]float32) *[8]float32
Various architecture and OS-specific calling conventions are supported, e.g.
-
.c -
.arm64 -
.arm64_darwin -
.arm64_windows -
.arm64_sve -
.x64_sysv -
.x64_vectorcall -
.x64_windows
The .naked calling convention can be specified when the function needs to have
no prologue or epilogue, e.g. when integrating with external assembly code.
Recursive functions can avoid stack overflow by using the .tail calling
convention, which converts tail calls into efficient jumps rather than new stack
frames, e.g.
func(.tail) fib(n uint, a uint, b uint) uint {
return match n {
0: a
1: b
default: fib(n - 1, b, a + b)
}
}
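The effect of turning a tail call into a jump can be illustrated in Python, which has no tail-call elimination of its own. The sketch below hand-writes the loop that such a transformation would conceptually produce for fib; it is an illustration of the idea, not a description of what the Iroh compiler emits.

```python
def fib(n, a=0, b=1):
    """Loop form of the tail-recursive fib shown above."""
    while True:
        if n == 0:
            return a
        if n == 1:
            return b
        # The tail call fib(n - 1, b, a + b) becomes a parameter
        # update followed by a jump back to the top of the function.
        n, a, b = n - 1, b, a + b


print(fib(10))           # 55
print(fib(100_000) > 0)  # True, with no growth in the call stack
```

Because each iteration reuses the same frame, the depth of the "recursion" no longer affects stack usage.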
The specific stack size needed for a function can be specified with
stack_size, e.g.
func(stack_size: 16<KB>) process() {}
If the compiler determines that the size will be insufficient, this will raise
an edit-time error. So it’s mostly useful on externally defined functions, i.e.
extern functions.
Within function bodies, the @stack_size compile-time function can be used to
determine the stack size needed by a function, e.g.
func worker1() {}
func worker2() {}
func run_workers() {
// Get the stack size needed by worker1 and everything
// it calls:
worker1_size = @stack_size(worker1)
// Get the stack size needed by just worker2:
worker2_size = @stack_size(worker2, local_only: true)
// Get the stack size needed by us:
our_size = @stack_size(run_workers, local_only: true)
}
The @incr_func_stack_size and @decr_func_stack_size compile-time functions
can be used to tell Iroh to increase or decrease the inferred stack size, e.g.
due to assembly calls:
func hash_input(data []byte) {
@incr_func_stack_size(4<KB>)
genasm {
// Inline assembly
}
if @arch == .x64 {
@decr_func_stack_size(2<KB>)
}
}
As Iroh does whole-program analysis, it is able to identify the exact entrypoint calls that will create a cycle, e.g.
A → B → C → D → A
Iroh will not allow these calls unless the entrypoint call is annotated so that stack overflows cannot be possible, e.g.
// Assume a function that could recurse and potentially
// cause a stack overflow:
func parse(data []byte) {
...
}
func main() {
// The entrypoint call to parse will either need to
// specify a `max_depth` that will insert a runtime
// check to ensure it doesn't recurse more than the
// specified number of times, e.g.
parse(data, @max_depth: 1000)
// Or a `max_stack` limit that will insert a runtime
// check to ensure that it doesn't use up more than
// the specified amount of stack space, e.g.
parse(data, @max_stack: 10<MB>)
// Or even both, e.g.
parse(data, @max_depth: 1000, @max_stack: 10<MB>)
}
For safety, Iroh uses conservative stack analysis, assuming worst-case scenarios
such as the deepest possible recursive calls through interface dispatch.
Confidential computing on TEEs (Trusted Execution Environments) can be configured through function annotations, e.g.
// SGX enclave entry with custom stack and security policy:
func(tee: .sgx(allow_untrusted: true), stack_size: 8<KB>) process_secret(data []byte) []byte {
...
}
// SEV/SNP, mark as requiring attestation, VM-level isolation:
func(tee: .sev_snp(require_attestation: true)) handle_vm_request(input []byte) []byte {
...
}
// TrustZone, Secure World:
func(tee: .trustzone(world: .secure)) tz_handle_command(cmd uint32) uint32 {
...
}
Functions can also be marked as deprecated, e.g.
func(deprecated: "Use escape_uri_component instead. This will be removed in v2026-E") escape() {
...
}
Compile-Time Mutability
Iroh does not allow global mutable state as it makes programs unpredictable, difficult to maintain, and error-prone. All global state at runtime needs to be immutable.
However, we allow for mutability during compile time. This is made possible by
marking functions as being compile-time only by using the @compile_time
decorator.
For example, if an image package wanted to define an internal registry of
codecs so that imported packages could register new image formats, it could
define something like:
registry = map[string]Codec{}
@compile_time
func register(format_name string, codec Codec) {
registry[format_name] = codec
}
func get_codec(format_name string) {
return registry[format_name]
}
Other packages could then use image.register within their top-level package
scope at compile time, e.g.
// Within the png package:
import "image"
Codec = struct {
...
}
image.register("PNG", Codec) // Called at compile time.
Any function that’s marked as @compile_time will not be callable during
runtime, and any attempt to mutate a global variable at runtime will result in
an edit-time error.
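The two-phase idea (mutable while building, immutable while running) can be approximated in Python as follows. This is a loose runtime analogy only: Iroh enforces the rule at edit-time, whereas this sketch raises errors at runtime. The seal function and the string stand-in for a codec are hypothetical.

```python
from types import MappingProxyType

_registry = {}   # mutable only during the "compile-time" phase
_sealed = False


def register(format_name, codec):
    """Stand-in for a @compile_time function."""
    if _sealed:
        raise RuntimeError("register() is not callable at runtime")
    _registry[format_name] = codec


def seal():
    """Hypothetical phase switch: freeze global state before runtime."""
    global _sealed, registry
    _sealed = True
    registry = MappingProxyType(dict(_registry))  # read-only view


def get_codec(format_name):
    return registry[format_name]


register("PNG", "png-codec")  # allowed: still in the build phase
seal()
print(get_codec("PNG"))       # png-codec
```

After seal() runs, both calling register and assigning into registry fail, which mirrors the restriction described above.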
Function Purity & Parallelism
Iroh automatically keeps track of whether functions are:
-
Deterministic, i.e. given the same inputs they will always return the same outputs.
-
Side-effect free, i.e. they don’t modify external state or interact with the outside world.
Built-in functions like rand and the os implementation of time.now are
automatically marked as non-deterministic. Developers can also mark their
functions as non-deterministic, e.g.
NetworkEntropy = struct {
...
}
func(non_deterministic: true) (n NetworkEntropy) rand_bytes() []byte {
...
}
This information, along with the tracking done by the compiler, is used to determine whether a function is pure, e.g.
a = (x int, y int) => {
return x + y
}
b = (x int, y int) => {
print("Got input: ${x}, ${y}")
return x + y
}
@type_info(a).func.is_pure == true // true
@type_info(b).func.is_pure == false // true
This can be used by developers to enable various automatic optimizations, e.g. as pure functions have no side effects, they can be executed in parallel:
func my_mapper(data []T, f (elem T) U) []U {
info = @type_info(f)
if info.func.is_pure {
with {
.parallel = true
.parallel_threads = 4
} {
return data.map(f)
}
}
return data.map(f)
}
// This gets run in parallel:
output = my_mapper(data) {
$0 * 2
}
// This doesn't:
output = my_mapper(data) {
print("Got input: ${$0}")
$0 * 2
}
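A rough Python equivalent of this pattern is sketched below. Python cannot infer purity, so the sketch uses a hypothetical @pure marker that the developer applies by hand; in Iroh this information comes from the compiler. The thread pool is also just one possible way to run the calls in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def pure(func):
    """Manual purity marker; Iroh infers this, Python cannot."""
    func.is_pure = True
    return func


def my_mapper(data, f, threads=4):
    # Only fan out when f is known to be free of side effects.
    if getattr(f, "is_pure", False):
        with ThreadPoolExecutor(max_workers=threads) as pool:
            return list(pool.map(f, data))  # pool.map preserves input order
    return [f(elem) for elem in data]


@pure
def double(x):
    return x * 2


def noisy_double(x):
    print(f"Got input: {x}")
    return x * 2


print(my_mapper([1, 2, 3], double))        # [2, 4, 6]
print(my_mapper([1, 2, 3], noisy_double))  # same result, run sequentially
```

Both calls return the same values; only the marked function is dispatched to the pool.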
Unlike other languages which force developers to choose between eager and lazy evaluation, Iroh allows for selective laziness, i.e. lazy for pure functions and eager for impure ones.
Built-in Functions
Besides built-in types like time, Iroh provides various built-in functions to
make certain common operations easier, e.g.
-
cd(path)
Change the working directory to the given path.
cd("/home/tav")
-
cap(slice)
Return the capacity of the given slice.
x = []int(len: 100)
cap(x) == 100 // true
-
drop(value)
Force a value to be dropped immediately.
-
exit(code: 0)
Exit the process with the given status code.
-
fprint(writer, ...args, end: "")
Write the given arguments to the writer using the same formatting as the print function.
fprint(my_file, "Hello world!")
-
glob(pattern)
Return files and directories matching the given pattern within the current working directory.
for path in glob("*.ts") {
// Do something with each .ts file
}
-
hash(value)
Return the hash of the given value.
-
iter(iterable)
Call __iter__ on an iterable and return the corresponding iterator.
-
len(iterable)
Return the length of the given iterable.
len(["a", "b", "c"]) == 3 // true
-
max(...args)
Return the maximum of the given values.
max(1, 12, 8) == 12 // true
-
mem_view(value)
Return an immutable view of the in-memory representation of a value.
-
min(...args)
Return the minimum of the given values.
min(1, 12, 8) == 1 // true
-
print(...args, end: "\n")
Print the given arguments to the standard output.
// Output with a newline at the end:
print("Hello world!")
// Output without a newline at the end:
print("Hello world!", end: "")
-
print_err(...args, end: "\n")
Print the given arguments to the standard error.
// Output with a newline at the end:
print_err("ERROR: Failed to read file: /home/tav/source.txt")
// Output without a newline at the end:
print_err("ERROR: ", end: "")
-
read_input(prompt: "", raw: false)
Read input from the standard input.
name = read_input("Enter your name: ")
-
read_passphrase(prompt: "", mask: "*")
Read masked passphrase from the standard input.
passphrase = read_passphrase("Passphrase: ")
-
type(value)
Return the type of the given value.
type("Hello") == string // true
match type("Hello") {
string: print("Got a string value!")
int: print("Got an int value!")
default: print_err("Got an unknown value!")
}
Along with various compile-time functions, e.g.
-
@align_of(type)
Return the alignment requirements of a type.
-
@col
Convert an array/slice from row-major ordering to column-major.
-
@compile_error(msg)
Emit a compile-time error message.
-
@compile_warn(msg)
Emit a compile-time warning.
-
@enum_from_variant_names(enum_type)
Generate a new enum type from just the variant names of the given type, i.e. leaving out any associated data from each variant.
API = enum {
get_order(id uint64)
get_user(user string)
}
Methods = @enum_from_variant_names(API)
// This would generate:
Methods = enum {
get_order
get_user
}
-
@get_field(src, field_name)
Get the value of the given field from src.
-
@list_directory(path, watch: false)
Return the contents of the given directory path.
-
@pointer(val)
Return the raw [*] pointer for the given value.
The call site that calls @pointer and all lines that use the returned pointer value will be marked as unsafe.
-
@print(...args, end: "\n")
Print the given arguments at compile-time.
-
@read_file(filepath, watch: false)
Return the byte slice contents of reading the given file.
-
@read_url(url, watch: false)
Return the byte slice contents of reading the given URL.
-
@row
Convert an array/slice from column-major ordering to row-major.
-
@set_field(src, field_name, value)
Set the given field on src to the given value.
-
@size_of(type)
Return the byte size of a type.
-
@stack_size(function, local_only: false)
Return the inferred stack size used by the given function. When local_only is true, return only the function’s own stack usage, and not the stack usage from any called functions.
-
@transpose
Transpose the layout of an array/slice, e.g. row-major to column-major, and vice-versa.
-
@type_info(type)
Return the compile-time metadata about a type.
Compile-time information:
-
@arch
The CPU architecture of the build target.
-
@os
The OS of the build target.
-
@target
Full details about the build target, e.g. CPU architecture, OS, GPU, debug build, optimization levels, etc.
Compiler instructions:
-
@decr_func_stack_size(amount)
Decrease the inferred stack size of the enclosing function by the given amount.
-
@forget(variable_identifiers...)
Instruct the compiler to forget about releasing the memory for the given variables, e.g. when they’ve been handed over to C code.
-
@incr_func_stack_size(amount)
Increase the inferred stack size of the enclosing function by the given amount.
-
@preserve(variable_identifiers...)
Instruct the compiler to preserve the given variables ensuring that they’re not removed or optimized away.
Core type constructors:
-
@derive(type, params: {}, with: [], without: [], deprecated: "")
Create a derived type.
-
@limit(type, constraints...)
Create a refined type which is validated against the specified constraints.
-
@relate(unit, relationships...)
Relate units to each other through equality definitions.
Field annotations:
-
@align(size)
Set the alignment of a struct field to the specified byte boundary.
-
@allowed_errors(errors...)
Specify the set of errors that can be returned from an interface method.
-
@deprecated(msg: "")
Mark a field as deprecated.
-
@no_error
Specify that an interface method must not return an error.
-
@required
Mark a field as needing to be initialized with an explicit value.
-
@1, @2, etc.
Annotate a field with a field number for use in serialization, etc.
Function/method decorators:
-
@compile_time
Mark a function/method as only being callable at compile time.
-
@consume(default: false)
Mark methods that “consume” the value.
-
@decorator
Mark a function/method as a compile-time decorator.
-
@undo
Mark a specialized method as the variant to use when undoing a method call within an atomic block.
Other annotations:
-
@failure
Annotate a return as returning an error value on the failure path for an outcome.
-
@success
Annotate a return as returning an error value on the success path for an outcome.
-
@view
Annotate a slice operation so that the produced slice is considered a “view” and shares the underlying array without copy-on-write protection.
Special variables:
-
$elems
Access the aggregated elements within a block builder’s append and finalize methods.
-
$error
Access the current error within the or block of a try, or within a match arm of a catch.
-
$resume
Access the resume function within an effect handler.
Comments
The Iroh editor supports 3 types of comments:
-
Typing // will start a line comment.
-
Typing /// will start a documentation comment.
-
Selecting a block of code and pressing a keyboard shortcut, e.g. ⌘+/ on macOS, will comment out the entire block.
Comments preserve whitespace and can span multiple lines. As the Iroh editor is
aware when users are typing within comments, there’s no need to keep typing //
or /// at the start of every line.
Comments also support styling:
-
Headings (levels 1-3).
-
Bold/Italic.
-
Lists — both ordered and unordered.
-
Links — Web, Espra, as well as to Iroh type definitions.
-
Emojis and inline images.
-
Code blocks, including Iroh code for examples.
-
Basic tables.
Documentation comments can be standalone or attached to a particular code
definition, e.g. a func, struct, enum, const, @derive, struct fields,
enum variants, etc.
These are used to auto-generate documentation for packages. They can also be
accessed through the metadata returned by @type_info calls, e.g.
/// DatabaseConfig holds configuration settings for the
/// application database.
DatabaseConfig = struct {
/// db_host specifies the database server hostname or IP address.
db_host string
/// db_port specifies the port number used to connect to the database.
db_port int
/// timeout defines the timeout duration, in seconds.
timeout int<s>
}
info = @type_info(DatabaseConfig)
// Get the documentation for the type:
info.doc.as_text()
// Get the documentation for each of the fields:
for field in info.struct.fields {
field.doc.as_text()
}
This can be used to auto-generate UIs, create API documentation tools, add runtime validation with contextual error messages, etc.
Reserved Keywords
Most languages tend to have a number of keywords that are reserved and can’t be used as identifiers:
| Language | Number of Keywords |
|---|---|
| Go | 25 |
| Python | 35 |
| Ruby | 41 |
| JavaScript | 47 |
| Rust | 52 |
| C | 54 |
| Java | 54 |
| Swift | 72 |
| C# | 77 |
| C++ | 98 |
Iroh is towards the smaller end at just 36 reserved keywords:
actor else import spec
and enum in struct
as errorgroup interface sym
atomic errorset is try
break for match when
catch from onchange with
const func or
continue genasm return
default handle select
do if spawn
Trying to use them as variable names will result in an edit-time error, e.g.
if = 5 // ERROR! Cannot use keyword `if` as a variable name!
However, they can be used as field names, e.g.
Conditional = struct {
if Expression
else Expression
}
cond = Conditional{}
// It's perfectly legal to access these fields with
// dot notation, e.g.
if_cond = cond.if
Besides the core reserved keywords, Iroh also has a few contextual keywords, e.g.
-
fully - only reserved in for { ... } fully { ... }.
-
retry - only reserved in atomic(retry).
-
this - only reserved in @limit definitions.
In scopes where both:
-
A variable has been named with such a contextual keyword, e.g.
retry = false
-
And the keyword is also used contextually, e.g.
atomic(retry) { ... }
A compile-time warning will be generated, as the keyword usage might be confused with the variable, e.g.
// Perhaps this was the intended meaning:
atomic(false) {
...
}
Similarly, Iroh emits a warning whenever built-in types, functions, and values
like int, print, and true are redefined by user code, as that could easily
cause confusion.
Expression-Based Assignment
Languages with a strong emphasis on expressions can easily lead to code with poor readability and high cognitive load, e.g. consider this Rust code:
let final_price = {
let base_price = if item.category == "premium" { if user.is_vip { item.price * 0.8 } else { item.price * 0.9 } } else { item.price };
let shipping_cost = if base_price > 50.0 { if user.location == "remote" { 15.0 } else { 0.0 } } else { 8.0 };
let tax_amount = if user.state == "CA" { if base_price > 100.0 { base_price * 0.08 } else { base_price * 0.06 } } else { base_price * 0.05 };
let total = base_price + shipping_cost + tax_amount;
if total > 200.0 {
if user.membership.is_some() {
total - user.membership.unwrap().discount_amount
} else {
total * if user.first_time_buyer { 0.95 } else { 1.0 }
}
} else {
if user.has_coupon {
if coupon.min_purchase <= total { total - coupon.amount } else { total }
} else {
if user.loyalty_points > 1000 { total - 10.0 } else { total }
}
}
};
An accidental semicolon somewhere can easily change the meaning of the entire calculation. To minimize such issues, Iroh takes a more pragmatic approach to expression-based assignments.
Expressions beginning with certain keywords like if and else can implicitly
assign their block value to a variable as long as there are no further nested
constructs, e.g.
base_price = if user.is_vip { price * 0.8 } else { price }
If multi-line computations are needed, or if nested constructs need to be used,
the block is auto-indented and the value being assigned needs to be explicitly
prefixed with a =>, e.g.
base_price =
if item.category == "premium" {
if user.is_vip {
=> item.price * 0.8
} else {
=> item.price * 0.9
}
} else {
=> item.price
}
The do keyword can be used to evaluate multi-line expression blocks for
assignment, e.g.
value = do {
temp = expensive_calculation()
=> temp * 2 + 1
}
Similar to how return works within function bodies, => ends computation
within a block, i.e.
value = do {
temp = expensive_calculation()
=> temp * 2 + 1
// The following code will be unreachable, just like after a return.
print("This won't execute")
}
Variables declared within do blocks do not pollute the outer scope.
Expression-based assignment can also be nested, e.g.
base_price = do {
discount =
if item.category == "premium" {
if user.is_vip {
=> 0.2
} else {
=> 0.1
}
} else {
=> 0.0
}
=> item.price * (1 - discount)
}
For nested expressions where assignment is to an outer block, labels can be used
and assignments can use the form =>label value, e.g.
base_price = outer: do {
discount =
if item.category == "premium" {
if item.flash_sale {
=>outer item.flash_sale_price
} else if user.is_vip {
=> 0.2
} else {
=> 0.1
}
} else {
=> 0.0
}
=> item.price * (1 - discount)
}
Within for expressions, >> can be used to append elements to the result
slice, e.g.
high_value_sale_items = for item in items {
if item.flash_sale && item.flash_sale_price > 500<USD> {
>> item
}
}
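For comparison, the closest Python equivalent builds the result list explicitly. The sample items below are made up for illustration, and plain numbers stand in for Iroh's unit-typed values such as 500<USD>.

```python
# Hypothetical sample data; prices are plain numbers rather than <USD> values.
items = [
    {"name": "lamp", "flash_sale": True, "flash_sale_price": 650},
    {"name": "desk", "flash_sale": True, "flash_sale_price": 120},
    {"name": "sofa", "flash_sale": False, "flash_sale_price": 900},
]

high_value_sale_items = []
for item in items:
    if item["flash_sale"] and item["flash_sale_price"] > 500:
        high_value_sale_items.append(item)  # plays the role of `>> item`

print([item["name"] for item in high_value_sale_items])  # ['lamp']
```

The >> form removes the need to declare the result slice up front and append to it by name.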
Transmuting Values
The as keyword allows for a value to be reinterpreted as another type, e.g.
x, y = int128(1234) as [2]int64
This allows zero-copy reinterpretation of a value’s bits, e.g.
req = buf as APIRequest
Transmutations are checked for safety at edit-time. When edit-time verification isn’t possible (e.g., with dynamic length slices), runtime checks ensure safe conversion.
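As a rough illustration of what reinterpreting bits means, the Python sketch below views the 16 bytes of a 128-bit integer as two 64-bit integers. It assumes little-endian byte order, which this document does not specify for Iroh, and it copies the bytes, whereas the as keyword is described as zero-copy.

```python
import struct

value = 1234
# 16 bytes holding the 128-bit integer (little-endian byte order assumed).
raw = value.to_bytes(16, byteorder="little", signed=True)
# Read the same 16 bytes as two signed 64-bit integers.
x, y = struct.unpack("<2q", raw)
print(x, y)  # 1234 0
```

With little-endian order the low 64 bits come first, so x holds 1234 and y holds the (zero) high bits.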
Symbolic Programming
Iroh has first-class support for symbolic programming. New symbols are declared
using the sym keyword, e.g.
sym x, y, z
Symbols and the expressions based on them are all of the type sym_expr, e.g.
sym x
y = x⁶ + 4x² + 1
type(y) == sym_expr // true
Expressions can be evaluated using the eval method, e.g.
sym x, w
y = x⁶ + 4x² + 1
z = 7w + 5x + 3y³
z.eval(x: 3) == 7w + 1348365303 // true
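The constant in that result can be checked with ordinary arithmetic. The Python below is a plain numeric check of the substitution x = 3, not symbolic evaluation; the w term is left out because it stays symbolic.

```python
x = 3
y = x**6 + 4 * x**2 + 1      # 729 + 36 + 1
constant = 5 * x + 3 * y**3  # the part of z that no longer depends on x
print(y, constant)           # 766 1348365303
```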
The Iroh editor provides keyboard shortcuts for increasing and decreasing the script level to make it easy to write complex mathematical expressions.
These can then be used with functions from the sym package in the standard
library to do things like symbolic differentiation, e.g.
import * from "sym"
sym x
expr = x³ + sin(x) + exp(x)
y = diff(expr, x)
y == 3x² + cos(x) + exp(x) // true
Symbolic integration, e.g.
sym x
integrate(x² * sin(x), x) // -x²*cos(x) + 2x*sin(x) + 2*cos(x)
Solve equations algebraically, e.g.
sym x
solve(x² + 5x + 6 == 0, x) // [-3, -2]
Do algebraic simplification, e.g.
sym x
simplify((x² - 1)/(x - 1)) // x + 1
Over time, more and more functions will be added to the standard library, so that Iroh is competitive with existing systems like Mathematica and SymPy.
Software Transactional Memory
Iroh’s atomic blocks provide an easy way to take advantage of software
transactional memory, e.g.
atomic {
if sender.balance >= amount {
sender.balance -= amount
receiver.balance += amount
}
risky_operation(sender)
}
In the example above, if risky_operation should generate an error, then the
whole atomic block will be rolled back, i.e. it’ll be as if the entire block
of code had never been run.
Atomic blocks naturally compose, e.g.
func transfer(sender Account, receiver Account, amount currency) {
atomic {
if sender.balance >= amount {
sender.balance -= amount
receiver.balance += amount
}
}
}
atomic {
transfer(alice, tav, 100)
transfer(tav, zaia, 150)
}
Behind the scenes, the use of atomic essentially operates on “shadow”
variables. The “real” variables are only overwritten if the entire block
succeeds.
To make this efficient for certain data structures, e.g. large maps, the compiler will automatically switch them to use immutable variants that support efficient copies and rollbacks.
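The shadow-variable idea can be sketched in Python with a context manager that works on deep copies and only copies them back when the block finishes without an error. The atomic helper and Account class here are hypothetical, and the sketch ignores concurrency and conflict detection entirely, so it only illustrates commit and rollback.

```python
import copy
from contextlib import contextmanager


class Account:
    def __init__(self, balance):
        self.balance = balance


@contextmanager
def atomic(*objects):
    """Yield shadow copies; copy them back only if the block succeeds."""
    shadows = [copy.deepcopy(obj) for obj in objects]
    yield shadows
    # Reached only when the block raised nothing: commit the shadows.
    for obj, shadow in zip(objects, shadows):
        obj.__dict__.update(shadow.__dict__)


sender, receiver = Account(100), Account(0)

try:
    with atomic(sender, receiver) as (s, r):
        s.balance -= 60
        r.balance += 60
        raise RuntimeError("risky_operation failed")
except RuntimeError:
    pass
print(sender.balance, receiver.balance)  # 100 0 (rolled back)

with atomic(sender, receiver) as (s, r):
    s.balance -= 60
    r.balance += 60
print(sender.balance, receiver.balance)  # 40 60 (committed)
```

Deep-copying everything is the naive approach; the immutable variants mentioned above exist to avoid exactly this cost for large data structures.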
For even greater control, types can define @undo variants of methods that will
be called in reverse order to roll back aborted changes, e.g.
func (d Document) insert_text(text string, pos int) {
...
}
@undo
func (d Document) insert_text(text string, pos int) {
// This variant automatically gets called to rollback changes.
}
Atomic blocks provide optimistic concurrency without needing any explicit locks. The outermost atomic block essentially forms a transaction, and all changes that it makes are tracked automatically.
If transactions conflict by accessing the same data concurrently, Iroh will roll back all conflicting ones except the first transaction that successfully completes.
When transactions are rolled back, they can be configured to automatically retry, e.g.
atomic(retry) {
counter += 1
}
By default, retries will go on indefinitely. These can be constrained if needed, e.g.
// Retry up to 10 times:
atomic(retry, max_retries: 10) {
counter += 1
}
// Retry for up to 5 seconds:
atomic(retry, timeout: 5<s>) {
counter += 1
}
An else branch is executed in case of such failures, e.g.
atomic(retry, timeout: 5<s>) {
counter += 1
} else {
print("Failed to increment the counter!")
}
Instead of retrying immediately after conflicts, you can use atomic.wait() to
wait until the values accessed by the transaction change, e.g.
atomic(retry) {
if queue.is_empty() {
atomic.wait()
}
job = queue.pop()
...
}
While transactions can roll back internal state changes, it’s generally impossible to roll back external effects such as writing output, e.g.
atomic {
if sender.balance < amount {
print("ERROR: Insufficient balance!")
}
}
In these cases, a good pattern is to only create side effects once an atomic
block has finished running, e.g.
result = atomic {
if sender.balance >= amount {
sender.balance -= amount
receiver.balance += amount
=> .successful_transfer
}
=> .insufficient_funds
}
if result == .insufficient_funds {
print("ERROR: Insufficient balance!")
}
Similarly, while atomic blocks support asynchronous I/O calls, idempotency
keys should be used when calling external services, e.g.
idempotency_key = ...
atomic {
ok = <- call_transfer_api({amount, idempotency_key})
if ok {
mirrored.balance -= amount
}
}
Reactive Programming
Iroh supports reactivity natively. Any variable defined using :=
automatically updates whenever any variable that it depends on is updated, e.g.
a = 10
b = 20
c := a + b
b = 30
c == 40 // true
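The observable behaviour can be imitated in Python with a value that is re-evaluated each time it is read. This is a simplified, pull-based sketch with hypothetical Cell and Computed classes; it says nothing about how Iroh actually tracks dependencies or schedules updates.

```python
class Cell:
    """A plain mutable value."""

    def __init__(self, value):
        self.value = value


class Computed:
    """A value derived from other cells, re-evaluated on each read."""

    def __init__(self, compute):
        self._compute = compute

    @property
    def value(self):
        return self._compute()


a = Cell(10)
b = Cell(20)
c = Computed(lambda: a.value + b.value)  # c := a + b

print(c.value)  # 30
b.value = 30
print(c.value)  # 40
```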
When a computed value can no longer be updated, i.e. when all the variables that it depends on are no longer updatable, it is no longer tracked, e.g.
func add(a int, b int) int {
c := a + b
return c // c is now untracked, as a and b can no longer be updated
}
To automatically perform side-effects whenever values are updated, the built-in
onchange function can be given a handler function to run, e.g.
a = 10
b = 20
c := a + b
onchange {
print("c = ${c}")
}
b = 30 // Outputs: c = 40
b = 30 // No output as no change to affect computed value
The onchange function returns a callback that can be used to explicitly remove
the registered handler, e.g.
a = 10
b = 20
c := a + b
remove_printer = onchange {
print("c = ${c}")
}
b = 40 // Outputs: c = 50
remove_printer()
b = 30 // No output as handler has been removed
As onchange automatically tracks all the computed values that a handler
depends on, the handlers are automatically cleaned up whenever the values they
depend on are dropped.
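A push-based Python sketch of the handler behaviour is shown below. Unlike Iroh, where dependencies are tracked automatically, this hypothetical onchange helper takes its dependencies as an explicit list, and handlers are not cleaned up automatically when values are dropped.

```python
class Cell:
    def __init__(self, value):
        self._value = value
        self._listeners = []

    def get(self):
        return self._value

    def set(self, value):
        if value == self._value:
            return  # unchanged: nothing to propagate
        self._value = value
        for listener in list(self._listeners):
            listener()


def onchange(deps, handler):
    """Run handler when any dep changes; return a remover callback."""
    for dep in deps:
        dep._listeners.append(handler)

    def remove():
        for dep in deps:
            dep._listeners.remove(handler)

    return remove


a, b = Cell(10), Cell(20)
log = []
remove_printer = onchange([a, b], lambda: log.append(f"c = {a.get() + b.get()}"))

b.set(40)         # handler runs and records "c = 50"
b.set(40)         # no change, so the handler does not run
remove_printer()
b.set(30)         # handler removed, nothing is recorded
print(log)        # ['c = 50']
```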
This primitive is what powers many mechanisms in Iroh, e.g. our component-based UIs use this mechanism to automatically re-render the UI whenever dependent state values change.
When computed variables are defined, their definition is evaluated lazily, allowing for circular definitions, e.g.
temp_celsius := 0
temp_celsius := (temp_fahrenheit - 32) * 5/9
temp_fahrenheit := 32
temp_fahrenheit := (temp_celsius * 9/5) + 32
In such cases, the first := defines the initial value, and the second
definition defines how the value should be computed. This is cleaner than in
other languages and frameworks, e.g. Vue:
import { ref, computed } from 'vue'
const celsius = ref(0)
const fahrenheit = ref(32)
const celsiusComputed = computed({
get: () => (fahrenheit.value - 32) * 5/9,
set: (value) => fahrenheit.value = (value * 9/5) + 32
})
const fahrenheitComputed = computed({
get: () => (celsius.value * 9/5) + 32,
set: (value) => celsius.value = (value - 32) * 5/9
})
As the compiler knows which methods mutate underlying data structures,
reactivity works automatically for complex data structures like slices and
maps, e.g.
users = [{name: "Tav"}, {name: "Alice"}]
user_count := len(users)
users.append({name: "Zeno"})
user_count == 3 // true
As the compiler tracks dependencies and data flow, it’s able to optimally update values without much overhead, and in pretty much the same way that a developer would manually do so.
For certain operations, such as filtering a slice, the compiler can avoid unnecessary work by only filtering new elements and adding them to the computed value, e.g.
users = [{name: "Tav", admin: true}, {name: "Alice", admin: false}]
admins := users.filter { $0.admin == true }
// As users is mutated, admins would need to be recomputed.
// However, the filter is only run on the newly appended item
// to the slice.
users.append({name: "Zeno", admin: false})
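The incremental update can be sketched by hand in Python. The FilteredView class below is a hypothetical illustration of the idea, not the code the compiler generates; its calls counter exists only to show how often the predicate runs.

```python
class FilteredView:
    """Keeps a filtered copy of a list up to date as items are appended."""

    def __init__(self, items, predicate):
        self.predicate = predicate
        self.calls = 0  # counts predicate evaluations, for illustration
        self.source = []
        self.result = []
        for item in items:
            self.append(item)

    def append(self, item):
        self.source.append(item)
        self.calls += 1
        # Only the new item is tested; earlier results are reused as-is.
        if self.predicate(item):
            self.result.append(item)


users = FilteredView(
    [{"name": "Tav", "admin": True}, {"name": "Alice", "admin": False}],
    lambda user: user["admin"],
)
print(users.calls)  # 2

users.append({"name": "Zeno", "admin": False})
print(users.calls)                        # 3: only the new item was checked
print([u["name"] for u in users.result])  # ['Tav']
```

A full recomputation would have evaluated the predicate five times in total instead of three.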
This mechanism is applied even to nested collections, e.g.
users = [...]
active_admins := users
.filter { $0.admin }
.filter { len($0.recent_messages) > 0 }
Fields on struct types can also be reactive, e.g.
Person = struct {
first_name string
last_name string
full_name := "${first_name} ${last_name}"
}
alice = Person{
first_name: "Alice"
last_name: "Fung"
}
alice.full_name == "Alice Fung" // true
Whenever a reactive computation can generate errors, the errors must be explicitly handled so that computed variables don’t need to propagate errors, e.g.
channel = "#espra"
msgs := try fetch_recent_messages(channel) or []
starred := msgs.filter { $0.starred }
The compiler will automatically batch updates based on dataflow analysis. This analysis is aware of I/O boundaries, so it can apply updates atomically even when dealing with async calls.
However, for cases where explicit control is needed, an atomic block can
be used, e.g.
channel = "#espra"
atomic {
msgs := <- try fetch_recent_messages(channel) or []
starred := msgs.filter { $0.starred }
pinned := <- try fetch_pinned_messages(channel) or []
first_pinned := pinned.get(0)
}
Block Builders
In some domains, the typical imperative style can get quite verbose and repetitive, e.g.
func create_profile(name string, tags []string) dom.Element {
profile = dom.create_element("div", class: "profile")
name_elem = dom.create_element("div", class: "profile__name")
name_elem.text_content = name
tags_elem = dom.create_element("ul", class: "profile__tags")
for tag in tags {
li = dom.create_element("li", class: "profile__tag")
li.text_content = tag
tags_elem.append_child(li)
}
profile.append_child(name_elem)
profile.append_child(tags_elem)
return profile
}
create_profile("Max Kampik", ["#crypto", "#art"])
Even abstracting out some functionality like:
func div(attrs {class string, children []dom.Element}) dom.Element {
elem = dom.create_element("div")
if attrs.class {
elem.class_name = attrs.class
}
for child in attrs.children {
elem.append_child(child)
}
return elem
}
Can still be awkward to use, e.g.
func create_profile(name string, tags []string) dom.Element {
return (
div(class: "profile", children: [
div(class: "profile__name", children: text(name)),
ul(class: "profile__tags", children: tags.map {
return li(class: "profile__tag", text($0))
})
])
)
}
What if it could be cleaner? What if it could look like:
func create_profile(name string, tags []string) dom.Element {
return div.profile {
div.profile__name { name }
ul.profile__tags {
for tag in tags {
li.profile__tag { tag }
}
}
}
}
While in the context of HTML, something like JSX might be more familiar, e.g.
function createProfile(name: string, tags: string[]): JSX.Element {
return (
<div className="profile">
<div className="profile__name">{name}</div>
<ul className="profile__tags">
{tags.map(tag =>
<li className="profile__tag">{tag}</li>
)}
</ul>
</div>
)
}
Iroh provides a generic mechanism called block builders that can be used in any context, not just to create HTML-esque UI, e.g.
// Define a vector path, e.g.
Path {
Move to: (50, 50)
Line to: (100, 100)
Line to: (0, 100)
Close
}
// Define a workflow, e.g.
Workflow(City) {
Fetch "https://example.com/cities.csv"
Filter { $0.region == "UK" }
Transform { $0.name.upper() }
SaveResults to: "city_names.json"
}
This is made possible thanks to compile-time defined block_builder types, e.g.
Strings = block_builder(elem_type: string)
// Strings can now be used to construct a slice
// of strings within blocks, e.g.
fruits = Strings {
"Apple"
"Banana"
"Mango"
}
fruits == ["Apple", "Banana", "Mango"] // true
// Standard constructs like `if` and `for` can be
// used within these blocks, e.g.
hosts = Strings {
"node-1.espra.com"
if is_prod {
"node-5.espra.com"
"node-7.espra.com"
}
for node_id in default_hosts {
"node-${node_id}.espra.com"
}
}
Block builders work by executing each line within their block:
- If the line assigns to a variable or is a construct like if, match, or for, execution continues as usual.
- Otherwise, the result of the line’s expression is appended to the block builder by calling its append method.
- Once all lines have been executed, the finalize method on the block builder is called.
By default, append only adds elements of the block_builder type, and
finalize returns the accumulated slice without any changes, i.e.
// Defining:
Strings = block_builder(elem_type: string)
// Is equivalent to:
Strings = block_builder(elem_type: string) {
append(elem string) {
// $elems is a special variable that contains
// all the accumulated values in the block
// builder's slice.
$elems.append(elem)
}
finalize() {
return $elems
}
}
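As a rough mental model, the default behaviour can be imitated in Python with a generator, where each yield stands in for an expression line inside the block and ordinary control flow runs as usual. This is a sketch under that assumption only. The BlockBuilder class, the build helper, and hosts_block are hypothetical names, and Iroh does this at compile time rather than at runtime.

```python
from typing import Callable, Generic, Iterator, List, TypeVar

T = TypeVar("T")


class BlockBuilder(Generic[T]):
    """Default builder: accumulate elements, return them unchanged."""

    def __init__(self) -> None:
        self.elems: List[T] = []

    def append(self, elem: T) -> None:
        self.elems.append(elem)

    def finalize(self) -> List[T]:
        return self.elems


def build(block: Callable[[], Iterator[T]]) -> List[T]:
    builder: BlockBuilder[T] = BlockBuilder()
    # Every value produced by the block is passed to append ...
    for value in block():
        builder.append(value)
    # ... and finalize is called once the block has finished.
    return builder.finalize()


def hosts_block(is_prod: bool, default_hosts: List[int]) -> Iterator[str]:
    yield "node-1.espra.com"
    if is_prod:  # Control flow executes normally.
        yield "node-5.espra.com"
        yield "node-7.espra.com"
    for node_id in default_hosts:
        yield f"node-{node_id}.espra.com"


hosts = build(lambda: hosts_block(is_prod=True, default_hosts=[2, 3]))
print(hosts)
```

Customising a builder then amounts to swapping in different append and finalize methods.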
These can be overridden to customize the behaviour, e.g.
div = block_builder(elem_type: dom.Element) {
append(elem T) {
if T is dom.Element {
$elems.append(elem)
return
}
if T is string {
$elems.append(dom.text_node(elem))
return
}
if T.kind is .slice {
if T.elem is dom.Element {
for sub_elem in elem {
$elems.append(sub_elem)
}
return
}
}
@compile_error("Only dom.Element and string values are currently supported")
}
finalize() {
div = dom.create_element("div")
div.children = $elems
return div
}
}
This can then be used to construct elements using a mix of other elements, strings, and slices, e.g.
content = div {
"Hello "
strong { "world!" }
div {
span {
"Tav was here!"
}
}
}
As all calls to append and finalize will typically get inlined, and the type
checks will all get compile-timed away, the resulting code should be as fast as
manual construction.
Finally, the default view:
- Elides braces for the first block_builder or control flow expression on a line and shows its contents as indented blocks.
- Elides the outer parentheses for the first function/method call on a line.
This is intended to reduce the visual noise when viewing deeply hierarchical structures such as UI views. As this is just a presentation layer config, this can be turned off if desired.
For clarity, when a developer starts editing a line, any elided parentheses and braces in the enclosing and child block are automatically rendered. That is:
// What's actually there (shown when editing):
div.profile {
div.profile__name { name }
ul.profile__tags {
for tag in tags {
li.profile__tag { tag }
}
}
}
// What you see by default, i.e. without any of
// the noise of the trailing braces:
div.profile
div.profile__name { name }
ul.profile__tags
for tag in tags
li.profile__tag { tag }
Block builders enable declarative DSLs with minimal fuss. Sort of like REBOL’s dialects but with type safety, and like Swift’s result builders but much simpler.
Package Imports
Iroh supports referencing code from other packages using the import keyword.
Like Go and JavaScript, we use static strings for referencing the package paths
which can be:
- Tilde links to refer to packages on Espra:
    import "~1.tav/some/package"
- URL paths to refer to packages on some git repo that’s accessible over HTTPS:
    import "github.com/tav/package"
- Relative paths starting with either ./ or ../ refer to local packages:
    import "./some/package"
- Absolute paths to refer to packages within the standard library:
    import "os/exec"
Imported packages are automatically aliased to a package identifier. This
defaults to the package_name as specified by the package’s metadata, which in
turn defaults to the package’s file name.
Package names must satisfy the following regular expression:
^[a-z]([a-z0-9_]*[a-z0-9])?$
To support predictable naming, the package_name must match one of the last two
slash-separated elements of the import path, e.g.
- If the package is imported as "~1.tav/json/v2", then the package_name must be either json or v2.
An automatic conversion from hyphens in an import path to underscores in the
package_name is also supported for those wanting to use hyphens for aesthetic
purposes, e.g.
import "github.com/tav/chart-widget" // imported as chart_widget
Likewise, any dots in the matching element of the import path are converted to underscores in the package_name, e.g.
import "~1.tav/package/csv.lib" // imported as csv_lib
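The naming rules above can be checked mechanically. The following Python sketch validates a name against the regular expression and lists the names that an import path would permit. The function names are hypothetical, and it assumes that the hyphen and dot conversions apply to both of the last two path elements and that elements which still fail the pattern are simply skipped.

```python
import re
from typing import List

# Same pattern as above; fullmatch makes the ^ and $ anchors implicit.
PACKAGE_NAME = re.compile(r"[a-z]([a-z0-9_]*[a-z0-9])?")


def is_valid_package_name(name: str) -> bool:
    return PACKAGE_NAME.fullmatch(name) is not None


def candidate_package_names(import_path: str) -> List[str]:
    """Returns the package names allowed for the given import path."""
    elements = [e for e in import_path.split("/") if e]
    candidates = []
    for element in elements[-2:]:
        # Hyphens and dots in the path become underscores in the name.
        name = element.replace("-", "_").replace(".", "_")
        if is_valid_package_name(name):
            candidates.append(name)
    return candidates


print(candidate_package_names("~1.tav/json/v2"))               # ['json', 'v2']
print(candidate_package_names("github.com/tav/chart-widget"))  # ['tav', 'chart_widget']
print(candidate_package_names("~1.tav/package/csv.lib"))       # ['package', 'csv_lib']
```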
In the case of conflicts, or even just for personal preference, imported
packages can be bound to a different package identifier by using the as
keyword, e.g.
import "~1.tav/json/v2" as fastjson
For importing non-code resources like images, custom importers can be specified
via the using keyword on import statements, e.g.
import "github.com/micrypt/image"
import "~1.tav/app/hero.jpg" using image.optimize
Custom importers need to implement the built-in importer interface and are
evaluated at compile-time. They are passed the resource data and metadata from
the Espra tilde link.
As with all package dependencies, custom imports can also be updated by refreshing the package manifest, e.g. to fetch and use a newer version of a resource.
To use the same custom importer for multiple packages, the with statement can
be used, e.g.
import "github.com/micrypt/bundle"
with {
.importer = bundle.assets
} {
import "~1.tav/logo/espra.png"
import "~1.tav/logo/ethereum.png"
import "~1.tav/logo/bitcoin.png"
}
Code within imported packages is referenced using dot syntax, e.g.
fastjson.encode([1, 2, 3])
Explicitly referencing the packages at call sites generally makes code easier to understand, rather than importing references from packages, e.g.
- We believe that it’s easier to know what’s going on in:
    import (
        "github.com/tav/encode/json"
        "github.com/tav/encode/yaml"
    )
    json.encode([1, 2, 3])
    yaml.encode([1, 2, 3])
  Than in something like:
    use serde_json::to_string;
    use serde_yaml::to_string as to_string_yaml;
    to_string([1, 2, 3])
    to_string_yaml([1, 2, 3])
However, there are certain use cases where constantly referencing the package name will add unnecessary noise, e.g. when referencing JSX-esque components, unit definitions, enums, etc.
For these cases, a special * import form can be used, e.g.
import * from "github.com/tav/calendar"
By default, this will import all exported references from the package that start with an upper-case letter, any units, as well as the package itself. This makes it cleaner to use some packages, e.g.
import * from "github.com/tav/calendar"
func App(initial_date string) {
selected_date = calendar.parse_rfc3339(initial_date)
return (
<CalendarWidget selected_date>
<Heading>"Event Start"</Heading>
</CalendarWidget>
)
}
Package authors can define a top-level __all__ to control which symbols are
available when other packages import with *, e.g.
__all__ = [CalendarWidget, parse_rfc3339]
In script mode, it’ll be possible to import all exported identifiers into the
current package using the special ** import syntax, e.g.
import ** from "github.com/tav/math"
This feature can be enabled as a config option in script mode. For example,
the Iroh REPL will have this always on, as it can get tedious to keep typing the
package name repeatedly inside the REPL, e.g.
// The standard approach involves a lot more typing in the REPL:
y = math.sqrt(math.sin(x) + math.cos(x))
// The ** imports make it shorter:
y = sqrt(sin(x) + cos(x))
If imported identifiers conflict in ** imports, identifiers from later imports
will override those with the same name from previous ones.
Package Management
Visibility Control
Within packages, everything defined within the package’s global scope, i.e. variables, functions, types, and even the fields within the types, are fully visible to the rest of the code in the package.
However, outside of the package, i.e. in code that imports a package, visibility is constrained:
- For a type value to be accessible, its identifier must start with a Latin capital letter, i.e. A to Z.
- For a non-type value to be accessible, i.e. a const value, function, a field within a public type, etc., its identifier must not start with an _.
For example, if a model package defined:
supported_countries = ["China", "UK", "USA"]
_offensive_names = []
location = struct {
lat int
lng int
}
Person = struct {
name string
country string
_date_of_birth time.Date
}
func (p Person) is_over_21() bool {
...
}
func (p Person) _has_offensive_name() bool {
...
}
func _calc_age(p Person) int {
...
}
Then, in a package that imported it:
// Accessible
model.Person
model.supported_countries
person.name
person.country
person.is_over_21()
// Inaccessible
model.location
model._offensive_names
model._calc_age
person._date_of_birth
person._has_offensive_name()
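The two visibility rules are simple enough to express as a single predicate. The Python sketch below is illustrative only. The function name is hypothetical, and it checks an identifier in isolation, so it does not model the extra requirement that a field's enclosing type must itself be public.

```python
def is_visible_outside_package(identifier: str, is_type: bool) -> bool:
    """Applies the naming-based visibility rules to a single identifier."""
    if is_type:
        # Types must start with a Latin capital letter, A to Z.
        first = identifier[:1]
        return "A" <= first <= "Z"
    # Non-type values must not start with an underscore.
    return not identifier.startswith("_")


print(is_visible_outside_package("Person", is_type=True))               # True
print(is_visible_outside_package("location", is_type=True))             # False
print(is_visible_outside_package("supported_countries", is_type=False)) # True
print(is_visible_outside_package("_calc_age", is_type=False))           # False
```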
Iroh’s approach reflects established norms within the programming community of
prefixing private fields with underscores, and avoids the need for public and
private visibility modifiers.
Environment Variables
Iroh provides built-in functions like $env.get and $env.lookup to get
the values of environment variables, e.g.
log_level = $env.get("LOG_LEVEL")
Similarly, $env.set can be used to set these values, e.g. when you need to
spawn external commands with that env value:
$env.set("LOG_LEVEL", "warn")
All environment variable values can be iterated over using $env.all, e.g.
for env_name, env_value in $env.all() {
// Do something with each environment variable.
}
As this can get noisy in scripts, Iroh also provides syntactic sugar for
variables that start with $ and are followed by a sequence of upper case Latin
letters, numbers, or underscores, e.g.
log_level = $LOG_LEVEL
Besides being a shortcut for getting environ values, they can also be used to update values easily, e.g.
$LOG_LEVEL = "warn"
Well-established environment variables which need to be treated as lists, such as
$PATH and $CFLAGS are transparently converted into string slices, e.g.
$PATH == [
"/usr/local/bin",
"/usr/bin",
"/bin",
"/usr/sbin",
"/sbin"
]
When this value gets manipulated, the underlying environment variable gets updated, e.g.
$PATH.prepend("/home/tav/bin")
These can be cast to a string to get the encoded form, e.g.
path = string($PATH)
path == "/home/tav/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin" // true
Other well-established environment variables are also appropriately typed, e.g.
$VERIFY_SSL and $NO_COLOR are treated as booleans, $HTTPS_PORT and
$TIMEOUT are treated as integers, etc.
This typed nature allows for default values to be set easily, e.g.
timeout = $TIMEOUT or 60
Environment variable values are of the environ data type. Custom registrations
can be defined at compile-time, e.g.
environ.register("ESPRA_DEBUG", type: bool)
environ.register("KICKASS_IMPORT_PATH", type: []string, delimiter: ";")
Any $ references to those environment variable names will then be treated as
expected, e.g.
$KICKASS_IMPORT_PATH.prepend("/home/tav/kickass")
string($KICKASS_IMPORT_PATH) == "/home/tav/kickass;/usr/local/kickass" // true
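To make the list-typed behaviour concrete, here is a minimal Python sketch of a delimiter-aware view over an environment variable. It operates on a plain dictionary rather than the real process environment, and the ListVar class is a hypothetical stand-in for whatever the environ type does internally.

```python
from typing import Dict, List


class ListVar:
    """A list-typed view over one delimited environment variable."""

    def __init__(self, env: Dict[str, str], name: str, delimiter: str = ":"):
        self.env = env
        self.name = name
        self.delimiter = delimiter

    def values(self) -> List[str]:
        raw = self.env.get(self.name, "")
        return raw.split(self.delimiter) if raw else []

    def prepend(self, value: str) -> None:
        # Mutating the view writes the re-encoded string back to the env.
        self.env[self.name] = self.delimiter.join([value] + self.values())

    def __str__(self) -> str:
        # The string form is the encoded value as stored in the env.
        return self.env.get(self.name, "")


env = {"KICKASS_IMPORT_PATH": "/usr/local/kickass"}
import_path = ListVar(env, "KICKASS_IMPORT_PATH", delimiter=";")
import_path.prepend("/home/tav/kickass")

print(import_path.values())  # ['/home/tav/kickass', '/usr/local/kickass']
print(str(import_path))      # /home/tav/kickass;/usr/local/kickass
```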
An environment variable can also be marked as sensitive, i.e. it should not be
logged or implicitly inherited by sub-processes, e.g.
environ.register("DB_PASSWORD", type: string, sensitive: true)
print($DB_PASSWORD) // Outputs: <redacted>
By default, changes to environment variables are applied globally and inherited
by all sub-processes. To limit a value to a specific lexical scope, the with
statement can be used, e.g.
with {
$TIMEOUT = 120
} {
// Execute external commands.
}
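A comparable effect can be approximated in Python with a context manager that sets a variable on entry and restores the previous state on exit. This is only an analogy: the scoped_env helper and the IROH_DEMO_TIMEOUT variable name are made up for illustration, and the change here is dynamically scoped and process-wide while the block runs, whereas Iroh's with statement is described as lexically scoped.

```python
import os
from contextlib import contextmanager
from typing import Iterator

_MISSING = object()  # Sentinel to distinguish "unset" from an empty value.


@contextmanager
def scoped_env(name: str, value: str) -> Iterator[None]:
    previous = os.environ.get(name, _MISSING)
    os.environ[name] = value
    try:
        yield
    finally:
        # Restore the earlier state, even if the block raised an error.
        if previous is _MISSING:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous


os.environ.pop("IROH_DEMO_TIMEOUT", None)
with scoped_env("IROH_DEMO_TIMEOUT", "120"):
    inside = os.environ["IROH_DEMO_TIMEOUT"]  # Visible to sub-processes here.
after = os.environ.get("IROH_DEMO_TIMEOUT")

print(inside)  # 120
print(after)   # None
```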
System & Process Info
Common system and process-related info can also be found in some $-prefixed
variables:
- $arch: The CPU architecture, e.g. arm64, x64, etc.
- $args: List of command-line arguments without the binary and script names.
  - In standard mode on “legacy” platforms, i.e. all current operating systems, this returns a list of wtf8_string values instead of string values.
- $argv: List of command-line arguments including the binary and script names.
  - In standard mode on “legacy” platforms, i.e. all current operating systems, this returns a list of wtf8_string values instead of string values.
- $available_memory: Currently available memory in bytes.
- $boot_time: Timestamp of when the system was last booted.
- $cloud_info: Access cloud metadata (on supported platforms).
- $container_info: Access container metadata and runtime info (on supported platforms).
- $cpu_count: Number of available CPU cores/threads.
- $cpu_info: Details on the system CPUs.
- $cwd: The current working directory.
- $disk_info: Details on the system disks.
- $effective_gid: The effective group ID (on supported platforms).
- $effective_uid: The effective user ID (on supported platforms).
- $env: Access environment variables.
- $exit_code: Exit code of the last executed command/process.
- $groups: List of group IDs that the user belongs to (on supported platforms).
- $gpu_info: Details on the system GPUs.
- $interactive: Boolean indicating whether the process is running within an interactive session.
- $iroh_mode: The current Iroh execution mode.
- $iroh_version: Version of the Iroh runtime/compiler.
- $locale: The current locale setting.
- $home: The current user’s home directory path.
- $hostname: The hostname of the system.
- $machine_id: Persistent identifier for the machine.
- $max_memory: Memory limit for the process.
- $max_open_files: File descriptor limit for the process.
- $max_processes: Processes limit for the process.
- $mem_info: Details on the system memory.
- $network_info: Details on the system network interfaces.
- $os: The current operating system, e.g. linux, macos, windows, etc.
- $page_size: The memory page size of the underlying system.
- $parent_pid: The parent process ID.
- $pid: The process ID of the current process.
- $process_limits: Details on any limits that apply to the current process.
- $process_start_time: Timestamp of when the current process started.
- $real_gid: The real group ID of the current process (on supported platforms).
- $real_uid: The real user ID of the current process (on supported platforms).
- $saved_gid: The saved group ID for privilege restoration (on supported platforms).
- $saved_uid: The saved user ID for privilege restoration (on supported platforms).
- $session_id: The process session ID.
- $stderr_tty: Checks whether standard error is attached to a TTY.
- $stdin_tty: Checks whether standard input is attached to a TTY.
- $stdout_tty: Checks whether standard output is attached to a TTY.
- $term_colors: Number of colours supported by the terminal.
- $term_height: The current terminal height.
- $term_info: Details about the terminal capabilities and type.
- $term_width: The current terminal width.
- $temp_dir: The default root directory for temporary files.
- $timezone: The system timezone.
- $total_memory: Total memory in bytes.
- $user: The current user’s username.
- $user_cache_dir: The default root directory for user-specific cache data.
- $user_config_dir: The default root directory for user-specific config data.
- $virtualization_info: Details about the virtualization environment (on supported platforms).
Shell Interaction
Iroh provides programmatic access to running external commands via the built-in
process data type. It also provides various syntactic sugar to make this
easier.
A process value, representing an external command to run, can be constructed
by prefixing a slice of strings with $, e.g.
cmd = $["git", "checkout", commit]
type(cmd) == process // true
The returned value can be configured as needed, e.g. to set custom environment variables, use a custom reader for the command’s stdin, and a custom writer as its stderr:
cmd = $["ffmpeg", "-i", "-", "-c:v", "libx264", "-f", "mp4", "-"]
cmd.env = {
"AV_LOG_LEVEL": "verbose"
}
cmd.stdin = mkv_file
cmd.stderr = err_buf
To inherit the current environment variable values, $env.with can be used,
e.g.
cmd.env = $env.with {
"AV_LOG_LEVEL": "verbose"
}
Methods on process values allow for fine-grained control over command
execution, e.g.
- cmd.output: start the command, wait for it to finish, and return the contents of its standard output.
- cmd.run: start the command and wait for it to finish.
- cmd.start: start the command without waiting for it to finish.
A started command, i.e. a running_process, has additional methods, e.g.
- cmd.kill: send the terminate signal to the process.
- cmd.is_running: check if the process is still alive.
- cmd.memory_usage: details about the memory used by the process.
- cmd.pid: the process identifier.
- cmd.signal: send a signal to the process.
- cmd.wait: wait for the started command to finish.
Iroh provides shell-like syntax for running commands, piping output, etc. These can be executed by quoting the command within backticks, e.g.
commit = `git rev-parse HEAD`
This starts the given command, waits for it to finish, and if successful, i.e.
gets a 0 exit code from the sub-process, returns the standard output after
trimming.
Like most shells, whitespace is treated as a separator between arguments, and normally needs to be escaped, e.g.
output = `/home/tav/test\ code/chaos-test --threads 4`
The \ escape of the whitespace in the command above makes it equivalent to:
cmd = $["/home/tav/test code/chaos-test", "--threads", "4"]
output = cmd.output()
The usual ' and " quote marks can be used to avoid the need for escaped
whitespace, e.g.
output = `chaos-test --select "byzantine node"`
String interpolation can be used within backtick commands, e.g.
output = `git checkout ${commit}`
Values being interpolated are escaped automatically. If it’s a slice of strings, then it is treated as multiple space separated arguments. Otherwise, as a string value.
This helps to prevent a range of security vulnerabilities, e.g.
dangerous_input = "'; rm -rf /; echo '"
output = `echo ${dangerous_input}` // Safely escaped.
While still allowing for multiple arguments to be passed in safely, e.g.
files = ["file 1.md", "file 2.md"]
output = `cat ${files}` // Becomes: cat "file 1.md" "file 2.md"
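One way to picture why this is safe is that interpolated values go straight into the argument vector rather than being spliced into a string that a shell later re-parses. The Python sketch below models that idea. The build_argv helper is hypothetical, and shlex.join is used only to display the resulting command in a shell-quoted form (Python's quoting style differs from the double quotes shown in the comment above).

```python
import shlex
from typing import List, Sequence, Union

Arg = Union[str, Sequence[str]]


def build_argv(*parts: Arg) -> List[str]:
    """Flattens command parts into an argument vector."""
    argv: List[str] = []
    for part in parts:
        if isinstance(part, str):
            argv.append(part)  # A string is always exactly one argument.
        else:
            argv.extend(part)  # A slice contributes one argument per item.
    return argv


dangerous_input = "'; rm -rf /; echo '"
echo_argv = build_argv("echo", dangerous_input)
print(len(echo_argv))  # 2: the whole input is a single argument to echo

files = ["file 1.md", "file 2.md"]
cat_argv = build_argv("cat", files)
print(shlex.join(cat_argv))  # cat 'file 1.md' 'file 2.md'
```

Because no shell ever parses the interpolated text, the embedded quotes and semicolons in dangerous_input have no special meaning.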
Commands can be pipelined, e.g.
output = `cat file.txt | grep "Error"`
Outputs can also be redirected, e.g. to write to a file:
`cat file1.txt file2.txt > new.txt`
Or to append to a file:
`cat file1.txt file2.txt >> new.txt`
By default, only the standard output is redirected. As most shells, like Bash, use complex syntax for controlling what gets redirected or piped, e.g.
ls /nonexistent 2>&1 1>/dev/null | grep "No such"
Iroh uses more explicit @keywords in front of the | pipe, or > and >> redirect operators, e.g.
`ls /nonexistent @stderr | grep "No such"`
This takes the following values for piping or redirecting:
- @all: all streams, including standard output and error.
- @both: both the standard output and error.
- @stderr: just the standard error.
- @stdout: just the standard output (default behaviour).
- @stream:N: a specific file descriptor, e.g. @stream:3.
Iroh doesn’t support input redirection or heredocs within backtick commands as we believe that linear pipelines are easier to understand, i.e. when they go left to right.
Instead, when data needs to be piped in, the output of cat can be used, or any
suitably typed value, i.e. a string, []byte, or io.Reader, can be piped
into a backtick command using |, e.g.
mp4_file = mkv_file | `ffmpeg -i - -c:v libx264 -f mp4 -`
This acts as syntactic sugar for:
cmd = $["ffmpeg", "-i", "-", "-c:v", "libx264", "-f", "mp4", "-"]
cmd.stdin = mkv_file
mp4_file = cmd.output()
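For readers more familiar with other ecosystems, the same pattern of feeding an in-memory value to a command's standard input and collecting its output looks roughly like this with Python's subprocess module. The child command here is a made-up stand-in that upper-cases its input, used instead of ffmpeg so that the example is self-contained.

```python
import subprocess
import sys

# A stand-in child process: reads stdin and writes it back upper-cased.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

# input= supplies the data for the child's stdin, and capture_output=True
# collects its stdout, mirroring `data | command` followed by cmd.output().
completed = subprocess.run(child, input=b"hello iroh", capture_output=True, check=True)
print(completed.stdout)  # b'HELLO IROH'
```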
Likewise, output can be redirected to an io.Writer, e.g.
`git rev-parse HEAD` > file("commit.txt")
// Or even appended:
`git rev-parse HEAD` >> file("commits.txt")
Conditional execution within backtick commands can be controlled using and and
or, e.g.
`command1 and command2` // Only run command2 if command1 succeeds.
`command1 or command2` // Only run command2 if command1 fails.
Iroh supports automatic globbing when wildcard patterns are specified, e.g.
output = `cat *.log`
The following syntax is supported for globbing:
- *: matches any string, including empty.
- ?: matches any single character.
- [abc]: matches any one character in the set.
- [a-z]: matches any one character in the range.
- {foo,bar}: matches alternates, i.e. either foo or bar.
- **: matches directories recursively.
For example:
| Pattern | Example Matches |
|---|---|
| *.md | iroh.md, _doc.md |
| file?.md | file1.md, fileA.md |
| [abc]*.py | a.py, car.py |
| [a-z]*.py | a.py, car.py, test.py |
| {foo,bar}.sh | foo.sh, bar.sh |
| **/*.ts | lib/main.ts, tests/fmt_test.ts |
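The matching rules in the table can be approximated by translating a pattern into a regular expression. The following Python sketch does this for the six forms listed above. It is a simplified model with hypothetical function names: it assumes that * and ? never match a path separator, that a leading **/ may also match zero directories, and it ignores escaping, negated sets, and nested braces.

```python
import re


def glob_to_regex(pattern: str) -> str:
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # Zero or more directories.
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif ch == "*":
            out.append("[^/]*")  # Any string within one path segment.
            i += 1
        elif ch == "?":
            out.append("[^/]")  # Any single character.
            i += 1
        elif ch == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            out.append("[" + pattern[i + 1:end] + "]")  # Set or range.
            i = end + 1
        elif ch == "{" and pattern.find("}", i + 1) != -1:
            end = pattern.find("}", i + 1)
            alternates = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(glob_to_regex(a) for a in alternates) + ")")
            i = end + 1
        else:
            out.append(re.escape(ch))  # Everything else is literal.
            i += 1
    return "".join(out)


def glob_match(pattern: str, path: str) -> bool:
    return re.fullmatch(glob_to_regex(pattern), path) is not None


print(glob_match("*.md", "_doc.md"))                 # True
print(glob_match("[abc]*.py", "test.py"))            # False
print(glob_match("{foo,bar}.sh", "bar.sh"))          # True
print(glob_match("**/*.ts", "tests/fmt_test.ts"))    # True
```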
When the ' single quote is used within backtick commands, globbing is not
applied, e.g.
output = `cat log | grep '*error'`
In the interests of safety and predictability, globbing is also not applied to any interpolated values, e.g.
filename = "*.log"
output = `cat ${filename}`
If explicit globbing is desired, then the built-in glob function can be used,
e.g.
files = glob("*.log")
output = `cat ${files}`
This can also be used for quickly iterating over matching patterns, e.g.
for file in glob("*.md") {
// Do something with each of the Markdown files.
}
Iroh supports command substitution in arguments, e.g.
output = `echo ${`date`}`
But since the interpolated value is treated as a single argument, something like the following won’t work:
output = `grep "ERROR" ${`find . -name "*.log"`}`
The inner command output will need to be turned into a slice of strings first, e.g.
output = `grep "ERROR" ${`find . -name "*.log"`.split("\n")}`
Or, even better:
files = `find . -name "*.log"`.split("\n")
output = `grep "ERROR" ${files}`
We believe this makes code more readable. It also makes life safer than in
shells like Bash where inputs can cause unexpected outcomes depending on the
IFS value and how it’s quoted.
Errors generated when running commands, e.g. when they return a non-zero exit
code, can be handled by using the try keyword, e.g.
output = try `ls /nonexistent`
If, instead of generating errors, an explicit response object is preferred, then
the backtick command can be prefixed with a $. This returns a value with the
exit_code, stdout, stderr, etc.
response = $`ls /nonexistent`
response.exit_code == 1 // true
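The distinction between the error-generating form and the response-object form has a close analogue in Python's subprocess module, which may help as a point of comparison. In this sketch the failing child command is a made-up stand-in, and the mapping to Iroh's try and $ forms is only an analogy.

```python
import subprocess
import sys

# A stand-in command that writes to stderr and exits with code 1.
failing = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('boom'); sys.exit(1)",
]

# Error style: a non-zero exit code surfaces as an exception.
raised = False
try:
    subprocess.run(failing, capture_output=True, check=True)
except subprocess.CalledProcessError as err:
    raised = True
    error_code = err.returncode

# Response style: the caller inspects the exit code and streams directly.
response = subprocess.run(failing, capture_output=True, check=False)

print(raised, error_code)                    # True 1
print(response.returncode, response.stderr)  # 1 b'boom'
```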
By default, all backtick commands are waited on to finish running. The &
operator can be used after a backtick command to return a background job
instead, e.g.
job = `sleep 10` &
Backgrounded processes can be signalled with platform-supported signals like
.sigterm, e.g.
if job.is_running() {
job.signal(.sigkill)
}
If a command needs to be run so that the user can directly type in any input,
and see the output as it happens, then the $() form can be used:
$(ls -al)
Besides calling external processes, Iroh also supports running local commands defined within Iroh. These need to satisfy the interface:
localcmd = interface {
__init__((stdin: io.Reader, stdout: io.Writer, stderr: io.Writer))
__call__(args ...string) exit_code
}
They can be registered with a specific name, e.g.
localcmd.register("chaos-test", ChaosTest)
And can then be used like any external command, e.g.
output = `chaos-test --threads 4`
Certain built-in commands like cd are implemented like this, and can thus also
be called as plain functions, e.g.
cd("silo/espra")
commit = `git rev-parse HEAD`
As changing working directories is a common need in shell scripting, the with
statement can be used to change the working directory for the lexical scope,
e.g.
with {
.cwd = "silo/espra"
} {
// Do things in the silo/espra sub-directory here.
}
Finally, in script mode, if interactive shell support is enabled, Iroh allows
for shell commands to be run without needing to be encapsulated within $(),
e.g.
cd silo/espra
git rev-parse HEAD > commit.txt
This works by:
- First, trying to interpret a line as if it were non-shell code.
- Otherwise, it tries to treat the line as if it were encapsulated within $().
- If neither succeeds, an error is generated.
For example:
cd silo/espra
commit = `cat commit.txt`
if commit.starts_with("abcdef") {
// Celebrate!
}
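The fallback order described above can be sketched in Python, using Python's own parser as a stand-in for the Iroh parser. The classify_line function is hypothetical, and a real implementation would presumably also need name resolution, since a bare word such as ls parses as valid code in this simplified model.

```python
import shlex
from typing import List, Tuple, Union


def classify_line(line: str) -> Tuple[str, Union[str, List[str]]]:
    """Decides whether a line is regular code or a shell command."""
    # Step 1: try to interpret the line as regular (non-shell) code.
    try:
        compile(line, "<script>", "exec")
        return ("code", line)
    except SyntaxError:
        pass
    # Step 2: otherwise, try to treat it as a shell command.
    try:
        return ("shell", shlex.split(line))
    except ValueError as err:
        # Step 3: neither interpretation succeeded.
        raise ValueError(f"neither code nor a shell command: {line!r}") from err


print(classify_line('commit = "abcdef"'))  # ('code', 'commit = "abcdef"')
print(classify_line("cd silo/espra"))      # ('shell', ['cd', 'silo/espra'])
```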
This allows the “normal” programming aspects of the Iroh language to be seamlessly interwoven with shell code within scripts, the Iroh REPL/Shell, etc.
System Interface
Iroh defines a system interface to abstract away some OS-related
functionality:
system = interface {
args() []string
chdir(path string)
cwd() string
exec(cmd string, args []string) process
exit(code int)
getenv(key string) ?string
lookup_cmd(cmd string) ?string
now() time
read_random(buf []byte) int
setenv(key string, value string)
signal(sig int)
sleep(duration int<s>)
}
By default, the platform-specific os_system is used by builtins like $args,
$env, etc.
$args == ["--threads", "4"] // true
$env.get("SHELL") == "/bin/zsh" // true
But this can be overridden using the with construct for testing purposes, e.g.
with {
io.system = MySys(args: ["arg1"], subprocesses: .fail)
} {
// The following sub-process call internally calls
// io.system.exec(), which in this case will fail
// on sub-process calls:
resp = $`cat /home/tav/.zshrc`
resp.exit_code == 1 // true
$args == ["arg1"] // true
}
Thanks to this system interface mocking, developers can create more reliable software by ensuring comprehensive code coverage, injecting faults deterministically, etc.
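The same testing pattern can be illustrated in Python: code reads OS facilities through an interface, and tests swap in a fake implementation for the duration of a block. All names below (System, OSSystem, FakeSystem, use_system, thread_count) are hypothetical and cover only two of the methods from the interface above.

```python
import os
import sys
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Protocol


class System(Protocol):
    def args(self) -> List[str]: ...
    def getenv(self, key: str) -> Optional[str]: ...


class OSSystem:
    """The default implementation, backed by the real OS."""

    def args(self) -> List[str]:
        return sys.argv[1:]

    def getenv(self, key: str) -> Optional[str]:
        return os.environ.get(key)


@dataclass
class FakeSystem:
    """A deterministic implementation for tests."""

    fake_args: List[str] = field(default_factory=list)
    fake_env: Dict[str, str] = field(default_factory=dict)

    def args(self) -> List[str]:
        return list(self.fake_args)

    def getenv(self, key: str) -> Optional[str]:
        return self.fake_env.get(key)


_current: System = OSSystem()


def current_system() -> System:
    return _current


@contextmanager
def use_system(system: System) -> Iterator[None]:
    global _current
    previous, _current = _current, system
    try:
        yield
    finally:
        _current = previous  # Always restore the real system.


def thread_count() -> int:
    """Example code under test: reads --threads N from the arguments."""
    args = current_system().args()
    if "--threads" in args[:-1]:
        return int(args[args.index("--threads") + 1])
    return 1


with use_system(FakeSystem(fake_args=["--threads", "4"])):
    print(thread_count())  # 4
```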
Safety & Performance
Iroh pays attention to lots of little details so as to maximize safety and performance, e.g.
- Perfect hash functions are used to construct values such as compile-time maps so that lookups are collision-free and space efficient.
- When immutable values like string types are cast to something like []byte, additional copies are avoided wherever possible.
- The with construct can be used to enable automatic parallelization of code within a lexical scope, e.g.
    with {
    .parallel = true
    } {
    total = data.filter { $0.revenue > 150_000 }
    .sum { $0.revenue }
    }
  The specific number of threads can also be specified, e.g.
    with {
    .parallel_threads = 4
    } {
    total = data.filter { $0.revenue > 150_000 }
    .sum { $0.revenue }
    }
- Variable names must match the pattern ^[A-Za-z_][A-Za-z0-9_]*$ so as to make homoglyph attacks impossible.
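To illustrate the first point in the list above, here is a small Python sketch of one simple way to build a perfect hash for a fixed key set: search for a seed under which every key lands in its own slot. This is a brute-force illustration and not necessarily the construction Iroh uses. The function names and the use of SHA-256 as the underlying hash are assumptions made for the example.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Entry = Optional[Tuple[str, int]]


def slot(key: str, seed: int, size: int) -> int:
    data = seed.to_bytes(4, "big") + key.encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % size


def build_perfect_table(items: Dict[str, int], max_seed: int = 100_000) -> Tuple[int, List[Entry]]:
    """Finds a seed that maps every key to a distinct slot."""
    size = len(items)
    for seed in range(max_seed):
        table: List[Entry] = [None] * size
        for key, value in items.items():
            idx = slot(key, seed, size)
            if table[idx] is not None:
                break  # Collision: try the next seed.
            table[idx] = (key, value)
        else:
            return seed, table  # No collisions, and no empty slots.
    raise ValueError("no collision-free seed found")


def lookup(table: List[Entry], seed: int, key: str) -> Optional[int]:
    if not table:
        return None
    entry = table[slot(key, seed, len(table))]
    # A single probe suffices; compare keys to reject non-members.
    if entry is not None and entry[0] == key:
        return entry[1]
    return None


methods = {"get": 1, "post": 2, "put": 3, "delete": 4}
seed, table = build_perfect_table(methods)
print(lookup(table, seed, "put"))    # 3
print(lookup(table, seed, "patch"))  # None
```

Since the key set is known up front, the seed search happens once, and every later lookup is a single hash and a single comparison.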
C Interoperability
Iroh aims to match the high bar set by Zig for C interoperability with zero overhead. Like Zig, we ship with a C compiler and linker so that C code can be imported and used just like Iroh packages, e.g.
import "github.com/micrypt/glfw-iroh" as c
if c.glfwInit() == 0 {
...
}
window = c.glfwCreateWindow(800, 600, "Hello GLFW from Iroh", nil, nil)
if window == nil {
...
}
c.glfwMakeContextCurrent(window)
for c.glfwWindowShouldClose(window) == 0 {
...
}
While complex macros don’t get translated, constants using #define get
imported automatically, e.g.
import "github.com/tav/limits" as c
// C: #define INT_MAX 2147483647
max_int = c.INT_MAX
Iroh supports a number of data types that match whatever the C compiler would produce for a target platform:
| Iroh | C Type | Typical Size | Notes |
|---|---|---|---|
| c_char | char | 8 bits | Platform-dependent signedness |
| c_schar | signed char | 8 bits | |
| c_uchar | unsigned char | 8 bits | |
| c_short | short int | 16 bits | |
| c_ushort | unsigned short int | 16 bits | |
| c_int | int | 32 bits | |
| c_uint | unsigned int | 32 bits | |
| c_long | long int | 32/64 bits | 32-bit on Windows, 64-bit on Unix |
| c_ulong | unsigned long int | 32/64 bits | 32-bit on Windows, 64-bit on Unix |
| c_longlong | long long int | 64 bits | |
| c_ulonglong | unsigned long long int | 64 bits | |
| c_size_t | size_t | Pointer size | Same as uint typically |
| c_ssize_t | ssize_t | Pointer size | Same as int typically |
| c_ptrdiff_t | ptrdiff_t | Pointer size | For pointer arithmetic |
| c_float | float | 32 bits | |
| c_double | double | 64 bits | |
| c_longdouble | long double | 64/80/128 bits | Platform-dependent precision |
| c_string | char* | - | Alternatively: [*]c_char |
| !c_string | const char* | - | Alternatively: ![*]c_char |
| c_union | union | - | |
| c_void | void | - | |
| opaque | void* | - | |
C function signatures can be specified with an extern and called directly from
Iroh code, e.g.
func(extern: true) process_data(buf [*]c_char, len c_size_t, scale c_double) c_int
result = process_data(buf, len(buf), 1.23)
Variadic functions can be called as expected, e.g.
func(extern: true) printf(format c_string, args opaque...) c_int
printf("Hello %s, number: %d\n", "world", 42)
By default, all extern functions are assumed to follow the C calling convention.
This can be overridden if needed, e.g.
func(.x86_stdcall, extern: true) MessageBoxA(hWnd win32.HWND, lpText c_string, lpCaption c_string, uType c_uint) c_int
External functions also default to a platform-specific stack size, e.g. 8MB on
Linux. This can also be overridden via the stack_size parameter, e.g.
func(extern: true, stack_size: 48<KB>) grpc_init()
In order to match ABI compatibility with C without any overhead, C code can only
be called from within the .single_threaded and .multi_threaded schedulers
with non-gc allocators.
In those instances, there is no marshalling overhead and the native C calling convention is followed without any runtime interference, e.g.
// This compiles to the same assembly as the equivalent C function.
func add(a c_int, b c_int) c_int {
return a + b
}
To support callbacks from C, the type signature of Iroh functions can specify that they use the C calling convention, e.g.
func(extern: true) register_callback(cb func(c_int) callconv(.c) c_void) c_void
func my_callback(x c_int) callconv(.c) c_void {
// Handle callback value.
}
register_callback(my_callback)
To match C’s memory layout for structs, the extern parameter needs to be
specified, e.g.
Point = struct(extern: true) {
x c_int
y c_int
}
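As a rough illustration of what matching C's layout means in practice, the sketch below uses Python's ctypes module, which lays out Structure fields by the platform's C rules. It is not Iroh code: the Point class only mirrors the struct above, and the printed sizes assume a typical platform where int has no extra padding.

```python
import ctypes

class Point(ctypes.Structure):
    # Same two fields as the Iroh struct above, laid out by the platform's C rules.
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]

int_size = ctypes.sizeof(ctypes.c_int)

# Two ints placed back to back: x at offset 0, y directly after it.
print(ctypes.sizeof(Point))
print(Point.x.offset, Point.y.offset)
```

A struct declared without the extern parameter makes no such promise, which is why the parameter is needed before passing values across the C boundary.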
The c_union data type matches unions in C, e.g.
Value = c_union {
i c_int
f c_float
bytes [4]c_char
}
These are untagged and you must know which field is active. Accessing an inactive field can result in garbage data or even trigger a hardware trap, e.g.
Value = c_union {
i c_int
f c_float
bytes [4]c_char
}
v = Value{i: 42} // The .i field is active
x = v.i // This is OK
y = v.f // This is not OK
While the compiler can detect certain accesses as unsafe and generate edit-time errors, this may not always be possible, e.g. when calling external libraries. Such accesses will thus be marked as unsafe.
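To make the "garbage data" case concrete, here is a small sketch using Python's ctypes.Union, which follows the same untagged C semantics. It is an analogy rather than Iroh code, and it assumes a platform where int and float are both 4 bytes.

```python
import ctypes
import struct

class Value(ctypes.Union):
    # All three fields share the same four bytes of storage.
    _fields_ = [
        ("i", ctypes.c_int),
        ("f", ctypes.c_float),
        ("bytes", ctypes.c_char * 4),
    ]

v = Value()
v.i = 42                 # the i field is now the active one
raw = bytes(v)           # the raw storage shared by every field
as_float = v.f           # the same bits reinterpreted as a float

print(v.i)               # the value that was stored
print(as_float)          # a tiny denormal float, not 42.0
```

Nothing in the union records that i was the field last written, so reading f silently reinterprets the bits instead of failing.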
Alignment can be forced if needed with @align, e.g.
Point = struct(extern: true) {
x c_int @align(16)
y c_int
}
Exact bit control can be done using backing and packed, e.g.
ColorWriteMaskFlags = struct(extern: true, backing: uint32, packed: true) {
red: false,
green: false,
blue: false,
alpha: false,
}
The mem.c_allocator function lets Iroh code use the C allocator, e.g.
with {
.allocator = mem.c_allocator()
} {
...
}
As calling C code is inherently unsafe, Iroh makes limited safety guarantees when C code is called:
-
If the C code being called was compiled by Iroh and is well-defined and does not cause any undefined behaviour according to the C23 standard, it is marked as safe.
-
Otherwise, the call to C is marked as unsafe. Such unsafe code will not be allowed in contexts like onchain-script mode, and will need to be explicitly approved otherwise.
Relatedly, as the [*] raw pointers can be null pointers, they explicitly need
to be checked for nil before potentially unsafe operations like accessing
members, indexing, etc.
Finally, like Zig, Iroh also ships with multiple sets of libc headers that allow for easy cross-compilation to various target platforms.
Dynamically Generating Assembly
Compilers are not perfect. There will always be edge cases where a developer will be able to get better performance from a machine by writing raw assembly themselves.
Iroh provides a genasm keyword and an associated asm package that provides
rich support for generating assembly code programmatically. These can be used
inside function bodies, e.g.
import * from "asm"
func popcount(v uint32) uint32 {
result = uint32(0)
temp = uint32(0)
genasm {
XORL(result, result)
loop:
TESTL(v, v)
JZ(#done)
MOVL(temp, v)
SUBL(temp, 1)
ANDL(v, temp)
INCL(result)
JMP(#loop)
done:
}
return result
}
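Reading the first operand of each instruction as the destination, the loop above looks like Kernighan's bit-clearing trick: each pass removes the lowest set bit and bumps the counter. A plain Python model of that reading, handy for checking the assembly by hand, might look like this (the mapping of instructions to lines is our interpretation):

```python
def popcount(v):
    """Count the set bits of a 32-bit value by clearing the lowest set bit each pass."""
    v &= 0xFFFFFFFF
    result = 0             # XORL(result, result)
    while v != 0:          # TESTL(v, v) / JZ(#done)
        v &= v - 1         # MOVL, SUBL, ANDL: drop the lowest set bit
        result += 1        # INCL(result)
    return result          # falls through to done:

print(popcount(0b1011))
```

The loop runs once per set bit rather than once per bit position, which is why this form is popular for sparse values.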
When genasm blocks are at the top-level of a package, they are used to define
functions in assembly code and need to specify the complete function signature.
While languages like C, C++, D, Rust, and Zig support inline assembly, Iroh lets
you generate assembly code at compile-time using standard control structures
such as if conditions and for loops.
This makes dealing with assembly code a lot easier, e.g. these 100 lines of code, ported from avo, generate the 1,500 lines of assembly code to support SHA-1 hashing:
import * from "asm"
genasm block(h *[5]uint32, m []byte) {
w = asm.stack_alloc(64)
w_val = (r) => w.offset((r % 16) * 4)
asm.comment("Load initial hash.")
hash = [GP32(), GP32(), GP32(), GP32(), GP32()]
for i, r in hash {
MOVL(h.offset(4*i), r)
}
asm.comment("Initialize registers.")
a, b, c, d, e = GP32(), GP32(), GP32(), GP32(), GP32()
for i, r in [a, b, c, d, e] {
MOVL(hash[i], r)
}
steps = [
(f: choose, k: 0x5a827999),
(xor, 0x6ed9eba1),
(majority, 0x8f1bbcdc),
(xor, 0xca62c1d6),
]
for r in 0..79 {
asm.comment("Round ${r}.")
q = steps[r/20]
// Load message value.
u = GP32()
if r < 16 {
MOVL(m.offset(4*r), u)
BSWAPL(u)
} else {
MOVL(w_val(r-3), u)
XORL(w_val(r-8), u)
XORL(w_val(r-14), u)
XORL(w_val(r-16), u)
ROLL(U8(1), u)
}
MOVL(u, w_val(r))
// Compute the next state register.
t = GP32()
MOVL(a, t)
ROLL(U8(5), t)
ADDL(q.f(b, c, d), t)
ADDL(e, t)
ADDL(U32(q.k), t)
ADDL(u, t)
// Update registers.
ROLL(U8(30), b)
a, b, c, d, e = t, a, b, c, d
}
asm.comment("Final add.")
for i, r in [a, b, c, d, e] {
ADDL(r, hash[i])
}
asm.comment("Store results back.")
for i, r in hash {
MOVL(r, h.offset(4*i))
}
RET()
}
func choose(b, c, d Register) Register {
r = GP32()
MOVL(d, r)
XORL(c, r)
ANDL(b, r)
XORL(d, r)
return r
}
func majority(b, c, d Register) Register {
t, r = GP32(), GP32()
MOVL(b, t)
ORL(c, t)
ANDL(d, t)
MOVL(b, r)
ANDL(c, r)
ORL(t, r)
return r
}
func xor(b, c, d Register) Register {
r = GP32()
MOVL(b, r)
XORL(c, r)
XORL(d, r)
return r
}
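When writing a generator like this, it helps to have a plain reference model to compare against. The sketch below mirrors the structure of the listing in Python: the same 16-word circular buffer, the same four-entry steps table, and the same three round functions. It assumes the range in the listing covers all 80 rounds. The sha1 wrapper and its padding are our additions so that the result can be checked against hashlib.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def rol(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def choose(b, c, d):
    return ((d ^ c) & b) ^ d            # same four operations as the listing

def majority(b, c, d):
    return (b & c) | ((b | c) & d)

def xor(b, c, d):
    return b ^ c ^ d

STEPS = [(choose, 0x5A827999), (xor, 0x6ED9EBA1),
         (majority, 0x8F1BBCDC), (xor, 0xCA62C1D6)]

def block(h, m):
    """Fold one 64-byte block m into the five-word state h, in place."""
    w = [0] * 16                        # circular buffer, like w_val
    a, b, c, d, e = h
    for r in range(80):
        f, k = STEPS[r // 20]
        if r < 16:
            u = struct.unpack_from(">I", m, 4 * r)[0]   # big-endian load (MOVL + BSWAPL)
        else:
            u = rol(w[(r - 3) % 16] ^ w[(r - 8) % 16]
                    ^ w[(r - 14) % 16] ^ w[(r - 16) % 16], 1)
        w[r % 16] = u
        t = (rol(a, 5) + f(b, c, d) + e + k + u) & MASK
        a, b, c, d, e = t, a, rol(b, 30), c, d
    for i, v in enumerate((a, b, c, d, e)):
        h[i] = (h[i] + v) & MASK

def sha1(data):
    """Pad data as SHA-1 requires and run block over each 64-byte chunk."""
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    padded = (data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
              + struct.pack(">Q", 8 * len(data)))
    for off in range(0, len(padded), 64):
        block(h, padded[off:off + 64])
    return struct.pack(">5I", *h)

# The model agrees with the standard library implementation.
assert sha1(b"abc") == hashlib.sha1(b"abc").digest()
```

Because the round loop is ordinary code, the generator can unroll all 80 rounds at compile time while the source stays about as short as this model.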
As assembly-generating code within Iroh can be packaged up and reused, we expect
genasm to be more heavily used than inline assembly in other languages.
In particular, we expect to see genasm used in performance-critical code, e.g.
to take advantage of specific SIMD instructions when the compiler can’t
automatically vectorize code.
The asm support package takes inspiration from projects like PeachPy, AsmJit,
and avo to enable programmatic assembly code generation, and takes care of some
complex aspects, e.g.
-
Supporting unlimited virtual registers which are transparently mapped to physical registers.
-
Automatically taking care of correct memory offsets for complex data structures.
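One classic way to map virtual registers onto a fixed set of physical ones is linear-scan allocation over live intervals. The source does not say which algorithm the asm package uses, so treat the following Python sketch as an illustration of the idea only. The allocate function, the interval format, and the register names are all hypothetical, and spilling is left out.

```python
def allocate(intervals, physical):
    """Map virtual registers to physical ones.

    intervals: {vreg: (first_use, last_use)} in instruction indices.
    physical:  list of available physical register names.
    """
    free = list(physical)
    active = []          # (last_use, vreg) pairs that currently hold a register
    mapping = {}
    for vreg, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers whose virtual register is dead before this one starts.
        for item in [a for a in active if a[0] < start]:
            active.remove(item)
            free.append(mapping[item[1]])
        if not free:
            raise RuntimeError(f"no register left for {vreg}; a real allocator would spill")
        mapping[vreg] = free.pop(0)
        active.append((end, vreg))
    return mapping

# Three virtual registers, two physical ones: c can reuse a's register
# because a is dead by the time c becomes live.
print(allocate({"a": (0, 2), "b": (1, 5), "c": (3, 6)}, ["eax", "ebx"]))
```

This is what lets generator code such as the SHA-1 listing request a fresh GP32() in every round without worrying about running out of registers.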
Initially, this package will support currently popular architectures:
-
x64, i.e. x86-64, the 64-bit version of Intel’s x86 architecture as developed by AMD.
-
arm64, i.e. AArch64, the 64-bit version of the ARM architecture.
Assembly-generating code can match on @arch to generate different assembly
code for different architectures:
match @arch {
.x64:
// x64-specific assembly code
.arm64:
// arm64-specific assembly code
default:
// everything else
}
Support will be added over time for instructions in newer versions of
architectures, as well as other architectures as they gain adoption within the
broader market, e.g. .risc_v.
To LLVM or Not
For many new languages, LLVM has been the go-to choice for compiler infrastructure. Rust. Julia. Swift. They all use LLVM to generate and optimize machine code.
This is with good reason. LLVM is hard to match. It has battle-tested code generation for multiple architectures, with decades of work on code optimization!
But, while LLVM is definitely an amazing piece of engineering, we will not be using it to build Iroh’s official compiler:
-
LLVM is slow moving. For almost a decade, the Rust team had to maintain and ship their own fork of LLVM as the official releases didn’t include the features and bug fixes that they needed.
-
It imposes a lot of cost on language designers, e.g. working around the constraints imposed by LLVM, working around the regressions introduced by new versions, etc.
-
LLVM was never built with developer productivity in mind. If you want your language to have features like fast compilation, LLVM is painful to work around.
Instead, we will be following in the footsteps of Go, and more recently Zig, and do the generation and optimization of machine code ourselves.
-
For starters, to cover most use cases, we only need to support 6 platforms:
android-arm64, ios-arm64, linux-arm64, linux-x64, macos-arm64, windows-x64.
-
While hardware speeds double roughly every 18 months, progress in compiler optimizations is much slower — you only get speed doublings every few decades.
-
After all, Frances Allen catalogued most of the big impact optimizations way back in 1971: inlining, unrolling, CSE, code motion, constant folding, DCE, and peephole optimization.
-
As evidenced by Go, even a simple compiler can produce reasonably fast code. Besides improving developer productivity, we’re also likely to have fewer miscompilations.
-
Most code in an executable is only run a few times. By providing our users with the ability to dynamically generate assembly, they’d be able to optimize hotspots better than most compilers.
So, while it’ll be a challenge to build a decent code generator, we believe the payoff will be well worth it.
IASM (Iroh Assembly)
Besides compiling to native architectures like .x64 and .arm64, Iroh also
compiles to IASM (Iroh Assembly), our custom instruction format for an abstract
virtual machine.
IASM borrows heavily from WASM (WebAssembly), but differs in various ways, e.g.
-
Instead of being stack-based, IASM is register-based in SSA form with embedded liveness analysis. This allows for streaming compilers to both be simpler and generate better quality code.
-
The memory layout and host function calling is Iroh-specific and limited to Iroh hosts. This allows us to avoid the overhead that WASM has when communicating with host languages, e.g.
WASM → JavaScript → Native → JavaScript → WASM
-
Support for dynamic memory management, e.g. allowing for memory to be reserved in a virtual address space, allowing for memory to both grow and shrink at runtime, etc.
-
Support for zero-copy moves of values from one IASM instance to another without memory having to be explicitly shared between the two.
-
Various opcodes, e.g. floating-point types and operations, are not available in modes like onchain-script. This allows for much cheaper computational integrity and zero-knowledge proofs.
-
The ability to pass object-capability references between IASM instances.
-
Built-in metering support for modes like onchain-script. Metering is automatically aggregated and special host support makes the overhead minimal.
-
Support for executing code not just on the CPU but also on accelerators like GPUs.
-
Support for encoding control flow graphs using new jump and jump_if instructions with typed blocks instead of the ineffective mish-mash of control flow instructions in WASM.
-
Removal of locals, as their mutability prevents conversion to SSA form and their function-wide scope prevents liveness analysis.
-
Support for arguments to be passed to blocks, and for functions to return multiple values.
-
Support for marking certain globals as reserved, e.g. for a global that represents a stack pointer, so that more optimal code can be generated when manipulating the stack.
-
Faster binary encoding of IASM instructions, e.g. using prefixed varints instead of LEB128, zigzag encoding instead of sign extension, etc.
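To give a feel for the encoding trade-offs in the last point, here is a Python sketch of three schemes. Zigzag and unsigned LEB128 are the standard definitions. The prefix varint layout is a hypothetical one chosen for illustration (the trailing 1 bits of the first byte give the number of extra bytes), since the source does not specify IASM's actual format.

```python
def zigzag_encode(n):
    """Map a signed 64-bit integer to unsigned: 0, -1, 1, -2 become 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF

def zigzag_decode(z):
    return (z >> 1) ^ -(z & 1)

def uleb128_encode(n):
    """Standard unsigned LEB128: 7 payload bits per byte, high bit means 'more'."""
    out = bytearray()
    while True:
        byte, n = n & 0x7F, n >> 7
        out.append(byte | (0x80 if n else 0))
        if not n:
            return bytes(out)

def prefix_encode(n):
    """Hypothetical prefix varint: the length is readable from the first byte alone."""
    length = max(1, -(-n.bit_length() // 7))       # ceil(bits / 7), at least 1
    if length > 8:
        raise ValueError("this sketch only covers values below 2**56")
    word = (n << length) | ((1 << (length - 1)) - 1)
    return word.to_bytes(length, "little")

def prefix_length(first_byte):
    """Total encoded length: one plus the number of trailing 1 bits."""
    length = 1
    while first_byte & 1:
        first_byte >>= 1
        length += 1
    return length

def prefix_decode(buf):
    length = prefix_length(buf[0])
    return int.from_bytes(buf[:length], "little") >> length

print(zigzag_encode(-1), uleb128_encode(300).hex(), prefix_encode(300).hex())
```

With LEB128 a decoder has to test the continuation bit of every byte, whereas a prefix scheme lets it read the length once and then load the payload in a single step. Zigzag keeps small negative numbers short without a sign-extension pass.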
These changes will allow us to generate significantly better optimized code than what’s possible with even the best optimizing WASM compilers, while maintaining its portability and security benefits.
Other languages like C, Rust, and Zig could compile down to IASM. But this will not be an initial priority for us, as we primarily use IASM for Iroh itself and to build onchain scripts and apps on Espra.
Performance Optimizations
Iroh supports the typical optimizations that one would find in most low-level languages, e.g.
-
Compiler optimizations based on heuristics, e.g. inlining, loop unrolling, register allocation, branch prediction, constant propagation, etc.
-
Compile-time optimizations based on target CPU features, e.g.
// Referencing @arch is done at compile-time, e.g.
if @arch == .x64 {
match @arch.features {
.avx512f:
// AVX-512 implementation.
.avx2:
// AVX2 implementation.
.avx:
// AVX implementation.
default:
// Default implementation.
}
}
-
Runtime optimizations based on the system architecture, e.g.
// Referencing $arch is done at runtime, e.g.
if $arch == .x64 {
match $arch.features {
.avx512f:
...
}
}
-
Optimizations based on our genasm support.
-
Profile-guided optimizations where processes like inlining, branch prediction hints, and memory layout changes are guided by saved profiles of actual program execution.
But, for maximum effect, Iroh can also be configured with an Iroh Performance Evaluation Cluster that:
-
Provides a collection of machines with the different CPU, GPU, OS, storage, and network configurations that compilation should target.
-
Receives code to optimize from the compiler and runs it against static or dynamically generated test data.
-
Tries out different optimizations while verifying that the generated code is equivalent to the original. This is done using e-graphs for side-effect-free code.
-
Caches the results and sends them back to the compiler.
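The workflow in this list is, at its core, an autotuning loop. The toy Python sketch below shows the shape of it on a single machine: try each candidate, discard any that disagree with the reference, time the rest, and cache the winner. All names here are hypothetical, and checking agreement on test inputs is a much weaker stand-in for the e-graph based equivalence described above.

```python
import time

def tune(reference, variants, test_inputs, cache, key):
    """Return the fastest candidate that matches reference on every test input."""
    if key in cache:                       # reuse an earlier result
        return cache[key]
    expected = [reference(x) for x in test_inputs]
    best, best_time = reference, None
    for candidate in [reference, *variants]:
        # Reject any variant whose output differs from the reference.
        if [candidate(x) for x in test_inputs] != expected:
            continue
        start = time.perf_counter()
        for x in test_inputs:
            candidate(x)
        elapsed = time.perf_counter() - start
        if best_time is None or elapsed < best_time:
            best, best_time = candidate, elapsed
    cache[key] = best
    return best

def slow_sum(n):
    return sum(i * i for i in range(n))

def closed_form(n):
    return (n - 1) * n * (2 * n - 1) // 6   # algebraic rewrite of slow_sum

def buggy(n):
    return n * n                            # not equivalent, must be rejected

cache = {}
best = tune(slow_sum, [closed_form, buggy], [0, 1, 10, 1000], cache, "sum_squares")
print(best.__name__)
```

In the real cluster the timing step would run on each target configuration, so the cache would be keyed by hardware as well as by code.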
We expect this to yield better optimizations than most compilers, especially for math-heavy code like GPU kernels for AI models, where the cluster can generate and test variants by:
-
Differing memory access patterns.
-
Applying rewrite rules, including algebraic rewrites, in all possible combinations.
-
Varying work distribution across GPU threads and blocks.
By intelligently searching through all possible solutions, the cluster should yield:
-
Algorithmic optimizations like FlashAttention where the output is equivalent to standard attention but with better computational and memory characteristics, enabling longer contexts.
-
More optimal re-orderings of computations to improve cache locality and avoid shared memory bank conflicts.
-
More optimal tile sizes for different levels of the memory hierarchy, i.e. registers, shared memory, and global memory.
The compiler can be configured with different evaluation profiles that specify:
-
The configurations to target, e.g. a specific set of GPUs.
-
The maximum amount of time to spend on optimizations in any given run.
These evaluation profiles can be applied to specific blocks of code, and the compiler will automatically use the most effective optimization that the cluster has discovered for specific hardware.
Bootstrapping AI Support
New programming languages are at a disadvantage with LLMs:
-
There won’t be much training data for LLMs to reference since new languages have little to no existing code on sites like GitHub.
-
Since people increasingly use LLMs instead of sites like StackOverflow for help, less new content gets created for future LLM training cycles.
-
Since training data for state-of-the-art LLMs lags behind by up to a year, any content that does get created is likely to be out of date for a fast-moving new language.
To combat this, we will be generating synthetic training data by:
-
Going through sites like StackOverflow for popular languages like Python and creating our own Q&A dataset with equivalent answers written in Iroh.
-
Finding popular blog posts and replacing examples in other languages with equivalent Iroh examples and explanations.
We will augment this with the creation of automatic “transpilers” that convert code from languages like Python, TypeScript, and Rust to Iroh, including usage of popular libraries, and vice-versa.
This will be used within the Iroh editor to automatically translate from Iroh to an appropriate intermediary language for interacting with LLMs, e.g.
Iroh → Rust → LLM → Rust → Iroh
While this won’t be perfect, it’ll allow us to make use of LLMs while they catch up with our training data.