Iroh: A Unified Programming Language for the Entire Stack

Why Iroh?

Building production apps today tends to involve multiple languages, e.g. Go for the backend, SQL for the database, TypeScript for the frontend, CSS for styling, etc.

Worse, despite there being hundreds of languages, if you want high performance with strong memory safety and minimal runtime overhead, your options are limited, e.g. to Rust and Ada.

But what if we could have it all? A single, minimal language that:

  • Was comparable to C, C++, Rust, and Zig in terms of performance.

  • Provided the memory safety of Rust but without having to constantly work around the constraints of the borrow checker.

  • Was designed for fast compilation and developer productivity like Pascal, Go, and Zig.

  • Was as easy to learn as Python and TypeScript.

  • Could be used across the entire stack: infrastructure, apps, data engineering, styling, and even smart contracts.

This is what we’re creating with Iroh. We will be using it to build both the Espra browser and Hyperchain. Developers will be able to use it to write both onchain scripts and user-facing apps on Espra.

Our only non-goal is running directly on hardware without an OS, e.g. on embedded systems or when writing kernels or drivers. We’ll leave that to Ada, C, C++, Forth, Rust, and Zig.

Execution Modes

Iroh is intended to support 6 primary execution modes:

  • standard

    • The default mechanism that produces an executable binary for an OS and target architecture, e.g. macos-arm64.

    • This is pretty similar to what you would expect if you were to write some code in a language like Go.

    • This mode has full access to make whatever calls the underlying OS allows, e.g. calling into C functions, making system calls, etc.

    • The main function within the root package serves as the entrypoint for the program being run.

  • onchain-script

    • A sandboxed execution mechanism with automatic metering of every operation.

    • This mode will have restricted data types, e.g. floating-point types won’t be available as they will necessitate complex circuits when producing zero-knowledge proofs.

    • This mode will also set certain defaults, e.g. treat all decimal-point literals as fixed128 values, check arithmetic operations for wrapping, default to an arena allocator, etc.

  • onchain-ui-app

    • A sandboxed execution mechanism that can only call into the underlying system based on explicit functions that have been passed in by the “host”, e.g. to load and save data.

    • This is somewhat similar to how WASM modules are executed by web browsers.

    • The App defined within the root package serves as the entrypoint for the onchain app being run.

  • declarative

    • A constrained mode that only allows the declarative aspects of the language.

    • This allows for Iroh to be used in settings like config files, whilst still retaining the richness of expressivity, type validation, IDE support, etc.

  • gpu-compute

    • A constrained mode for executing code on a GPU or AI chip.

    • This mode will not allow for certain operations, e.g. dynamic allocation of memory, and will have restricted data types to match what the underlying chip supports.

  • script

    • An interpreted mode where Iroh behaves like a dynamic “scripting” language.

    • This will be available when the iroh executable is run with no subcommands. This provides the user with a REPL combined with an Iroh-based shell interface.

    • The fast feedback mechanism of the REPL will allow for high developer productivity, rapid prototyping, as well as make it easier for people to learn and discover features.

    • Scripts can be run with no compilation by just passing the script name, e.g.

      iroh <script-name>.iroh
      

      Or if the script files start with a shebang, i.e.

      #!/usr/bin/env iroh
      

      They can be made executable with chmod +x and run directly, e.g.

      ./script-name.iroh
      
    • This mode will also be useful in embedded contexts, e.g. scripting for games, notebooks, etc.

While the execution mechanism and available functionality will be slightly different in each mode, they will all support the same core language, making life easy for developers everywhere.

Structured Code Editing

Iroh code is edited through a custom editor that gives the impression of editing text. For example, here’s what “hello world” looks like:

func main() {
    print("Hello world")
}

But despite looking like text, behind the scenes, Iroh’s editor updates various data structures about the code as it is being edited. In fact this is where Iroh gets its name from:

  • Iroh stands for “intermediate representation of hypergraphs”.

Hypergraphs are just graphs where edges can connect any number of nodes, i.e. are not limited to just connecting two nodes. This gives us a powerful base data structure that can capture:

  • Data flow dependencies.

  • Control flow relationships.

  • Type relationships.

  • Package dependencies.

  • Multi-dimensional relationships that typical structures like trees and normal graphs can’t express.

Perhaps most importantly, where the typical compiler loses semantic information at each step:

Source → Parse → AST → Semantic Analysis → IR → Optimize → Machine Code

Iroh maintains full semantic information throughout. This allows us to improve the developer experience in ways that are not possible with other languages:

  • Instant semantic feedback as you type — going well beyond what IDEs can do today.

  • Refactoring that understands intent, not just syntax.

  • Intelligent error messages and debugging based on the full context.

It also makes Iroh itself a lot simpler to develop:

  • Outside of single-line expressions, we don’t need to worry about any ambiguous or conflicting grammars in the language. We can use simpler syntax without worrying about parsing.

  • There’s no need to build bespoke tooling like LSPs, code formatters, etc. Everything from how the code is presented and edited is just a transformation of the hypergraph.

This gives Iroh a massive competitive advantage:

  • Compiling becomes superfast as most of the work that a typical compiler would do is already done at “edit time”. In essence, compilation is just:

    Hypergraph → Optimize → Machine Code
    
  • All tools work from the same source of truth — providing consistency across the board.

  • As new tools are just new hypergraph transformations, the system can be easily extended with richer and richer functionality.

  • There’s a lot less maintenance work as it’s all effectively just one system instead of lots of separate tools.

The fact that a developer can only be editing one bit of code at any point in time allows us to:

  • Do deep analysis on just the specific area that is being edited. Since we’re not starting from scratch like a typical compiler, we have the full context to help guide the developer.

  • Provide real-time inference, e.g. create sets of all the different errors that might be generated within a function.

  • Automatically refactor, e.g. detect that a deeply nested component needs an extra field to be passed to it, and make sure that all of its parent callers pass it through.

  • Provide custom editing views for different contexts, e.g. a live simulation interface for UI statecharts, a graph editor for dataflow systems, a picker for changing colour values, etc.

All in all, Iroh can provide a fundamentally better development experience, whilst still keeping it accessible and familiar with its default text-like representation.

Edit Calculus

Iroh’s editor builds on the fantastic work that Jonathan Edwards has been doing from Subtext onwards. At the heart of our editor, we have an edit calculus that:

  • Codifies a broad set of operations on the underlying hypergraph that preserves the intent of changes, whilst providing efficient in-memory and on-disk representations of the data.

  • Allows for “time-travelling” backwards and forwards across changes, whilst maintaining consistency with any changes made concurrently by others.

  • Unlike CRDTs, natively ensures semantic validity and coherence when concurrent changes are merged.

  • Unlike git’s text-based diffs, provides semantic diffs that preserve the meaning of changes made by developers.

This allows for the kind of collaboration on code, intelligent merging, enriched code reviews, and advanced debugging that’s not available in mainstream languages.

Numeric Data Types

Iroh implements the full range of numeric data types that one would expect:

  • Integer types
  • Floating-point types
  • Complex number types
  • Decimal types
  • Uncertainty types
  • Fraction types

All numeric values support the typical arithmetic operators:

  • Addition: x + y
  • Subtraction: x - y
  • Multiplication: x * y
  • Division: x / y
  • Negation: -x
  • Modulus/Remainder: x % y
  • Exponentiation: x ** y
  • Percentage: x%

Signed integer types are available in the usual bit-widths:

Type Min Value Max Value
int8 -128 127
int16 -32768 32767
int32 -2147483648 2147483647
int64 -9223372036854775808 9223372036854775807
int128 -170141183460469231731687303715884105728 170141183460469231731687303715884105727

Likewise for the unsigned integer types:

Type Min Value Max Value
uint8 0 255
uint16 0 65535
uint32 0 4294967295
uint64 0 18446744073709551615
uint128 0 340282366920938463463374607431768211455

Some integer types have aliases like in Go:

  • byte is an alias for uint8 to indicate that the data being processed represents bytes, e.g. in a byte slice: []byte.

  • int is aliased to the signed integer type corresponding to the underlying architecture’s bit-width, e.g. int64 on 64-bit platforms, int32 on 32-bit platforms, etc.

  • uint is aliased to the unsigned integer type corresponding to the underlying architecture’s bit-width, e.g. uint64 on 64-bit platforms, uint32 on 32-bit platforms, etc.

Like Zig, Iroh supports arbitrary-width integers for bit-widths between 1 and 65535 when the type name int or uint is followed by a bit-width, e.g.

// 7-bit signed integer
a = int7(20)

// 4096-bit unsigned integer
b = uint4096(10715086071862641821530)

If an arbitrary-precision integer is desired, then that is available via the built-in bigint type that automatically expands to fit the necessary precision, e.g.

a = bigint(10715086071862641821530) ** 5000 // would need 365,911 bits
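
The bit count quoted above can be checked with any arbitrary-precision integer implementation. The sketch below uses Python's built-in int as a stand-in for bigint. It is not Iroh code, and bit_length here is Python's method, which is only assumed to behave like the Iroh method of the same name.

```python
# Python's int is arbitrary precision, so it can stand in for Iroh's bigint here.
base = 10715086071862641821530
value = base ** 5000

# bit_length() returns the minimum number of bits needed to represent the magnitude.
bits = value.bit_length()
print(bits)
```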

Integer literals can be represented in various formats:

// Decimal
24699848519483

// With underscores for legibility
24_699_848_519_483

// Hex
0x1676e1b26f3b
0X1676E1B26F3B

// Octal
0o547334154467473

// Binary
0b101100111011011100001101100100110111100111011
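
Python happens to accept the same prefixes and underscore separators, so it can serve as a quick check that the literals listed above all denote the same value. This is only a cross-check in another language, not a statement about how Iroh parses literals.

```python
# The same value written in each of the literal formats listed above.
decimal_value = 24699848519483
with_underscores = 24_699_848_519_483
hex_lower = 0x1676e1b26f3b
hex_upper = 0X1676E1B26F3B
octal_value = 0o547334154467473
binary_value = 0b101100111011011100001101100100110111100111011

literals = [decimal_value, with_underscores, hex_lower, hex_upper, octal_value, binary_value]
all_equal = all(v == decimal_value for v in literals)
print(all_equal)
```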

Integer types support the typical bit operators:

  • Bitwise AND: a & b
  • Bitwise OR: a | b
  • Bitwise XOR: a ^ b
  • Bitwise NOT: ^a
  • Bitwise AND NOT: a &^ b
  • Left shift: a << b
  • Right shift: a >> b

Various methods exist on integer types to support bit manipulation, e.g.

  • x.count_one_bits — count the number of bits with the value 1.

  • x.leading_zero_bits — count the number of leading zeros.

  • x.bit_length — the minimum number of bits needed to represent the value.

  • x.reverse_bits — reverses the bits.

  • x.rotate_bits_left — rotate the value left by n bits.

  • x.to_big_endian — converts the value from little endian to big endian.

  • x.to_little_endian — converts the value from big endian to little endian.

  • x.trailing_zero_bits — count the number of trailing zeros.

Iroh provides the typical floating-point types:

Type Implementation
float16 IEEE-754-2008 binary16
float32 IEEE-754-2008 binary32
float64 IEEE-754-2008 binary64
float80 IEEE-754-2008 80-bit extended precision
float128 IEEE-754-2008 binary128

As well as some additional floating-point types for use cases like machine learning:

Type Implementation
float8_e4 E4M3 8-bit floating-point format
float8_e5 E5M2 8-bit floating-point format
bfloat16 Brain floating-point format

Non-finite floating-point values can be constructed using methods on the floating-point types, e.g.

float64.nan()     // NaN
float64.inf()     // Positive infinity
float64.neg_inf() // Negative infinity

Additional methods on the floating-point values support further manipulation:

  • x.is_nan — checks if a floating-point value is a NaN.

  • x.is_inf — checks if a floating-point value is infinity.

  • x.is_finite — checks if a floating-point value is not a NaN or infinity.

  • x.copy_sign — copies the sign of a given value.

Complex numbers can be represented using Iroh’s complex number types:

Type Description
complex64 Real and imaginary parts are float32
complex128 Real and imaginary parts are float64

Complex number values can be constructed using the complex constructor or literals, i.e.

// Using the complex constructor
x = complex(1.2, 3.4)

// Using literals
x = 1.2 + 3.4i

Similarly, quaternion types are provided for use in domains like computer graphics and physics:

Type Description
quaternion128 Real and imaginary parts are float32
quaternion256 Real and imaginary parts are float64

Quaternion values can be constructed using the quaternion constructor or literals, i.e.

// Using the quaternion constructor
x = quaternion(1.2, 3.4, 5.6, 7.8)

// Using literals
x = 1.2 + 3.4i + 5.6j + 7.8k

Iroh provides 2 fixed-point signed data types for dealing with things like monetary values:

Type Max Value
fixed128 170141183460469231731.687303715884105727
fixed256 57896044618658097711785492504343953926634992332820282019728.792003956564819967

Unlike floating-point values, these fixed-point data types can represent decimal values exactly, and support up to 18 decimal places of precision.

The smallest positive value that can be represented is:

0.000000000000000001
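
The maximum values in the table are consistent with a signed 128-bit or 256-bit integer whose last 18 digits are read as decimal places. The text does not state the storage layout, so the Python sketch below treats that as an assumption, and fixed_max is a hypothetical helper name.

```python
from decimal import Decimal, getcontext

getcontext().prec = 100  # enough digits to hold the 256-bit maximum exactly

SCALE = 18  # decimal places, as stated in the text

def fixed_max(bits):
    """Largest signed `bits`-wide integer, read with SCALE decimal places."""
    raw_max = 2 ** (bits - 1) - 1
    return Decimal(raw_max).scaleb(-SCALE)

fixed128_max = fixed_max(128)
fixed256_max = fixed_max(256)
smallest_step = Decimal(1).scaleb(-SCALE)

print(fixed128_max)
print(fixed256_max)
print(format(smallest_step, "f"))
```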

The fixed-point types are augmented with a third decimal type:

Type Implementation
bigdecimal Supports arbitrary-precision decimal calculations

These are a lot slower than the fixed-point types as they are allocated on the heap, but they allow for unbounded scale, i.e. the number of digits after the decimal point, and unlimited range.

When even that’s not enough, and you need exact calculations, there’s a fraction type for representing rational numbers, i.e. a quotient a/b of arbitrary-precision integers, e.g.

x = fraction(1, 3)
y = fraction(2, 5)
z = x + y // results in 11/15 exactly

The numerator and denominator of the fraction can be directly accessed:

x = fraction(1, 3)
x.num // 1
x.den // 3

The fraction value can be converted to a decimal value, e.g. using bigdecimal:

x = fraction(1, 3)
bigdecimal(x) // 0.333333333333333333

The exact scale can also be specified, e.g.

x = fraction(1, 3)
bigdecimal(x, scale: 5) // 0.33333
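
The fraction behaviour above can be modelled with Python's fractions and decimal modules. This is a sketch of the arithmetic only: to_decimal is a hypothetical helper, and its default of 18 places with half-even rounding is an assumption taken from the defaults described elsewhere in this document.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from fractions import Fraction

x = Fraction(1, 3)
y = Fraction(2, 5)
z = x + y  # exact rational arithmetic, no rounding involved

def to_decimal(value, scale=18):
    """Convert a fraction to a decimal rounded half-even to `scale` places."""
    quantum = Decimal(1).scaleb(-scale)
    approx = Decimal(value.numerator) / Decimal(value.denominator)
    return approx.quantize(quantum, rounding=ROUND_HALF_EVEN)

print(z)
print(to_decimal(x))
print(to_decimal(x, scale=5))
```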

Iroh also supports uncertainty values that are useful for things like simulations, financial risk analysis, scientific calculations, engineering tolerance analysis, etc. These types:

  • Include a component that represents the level of uncertainty of its value.

  • Propagate the uncertainty level through calculations.

Each of the floating-point and decimal types has an uncertainty variant where the name of the variant is the underlying type’s name prefixed with u, e.g. ufloat64, ufixed128, etc.

Uncertainty values can be instantiated using the type constructor or using the ± literal, e.g.

x = ufloat64(1.2, 0.2)
y = 1.2 ± 0.2

The uncertainty bounds are propagated across calculations, e.g.

x = 1.0 ± 0.1
y = 2.0 ± 0.05
z = x + y       // 3.0 ± ~0.11
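
The text does not say which propagation rule is used. The ~0.11 result is what you get when independent uncertainties are combined in quadrature (simple addition would give 0.15), so the Python sketch below assumes that rule. The Uncertain class is a hypothetical model, not an Iroh API.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Uncertain:
    value: float
    uncertainty: float

    def __add__(self, other):
        # Assumes the two inputs are independent, so uncertainties add in quadrature.
        return Uncertain(
            self.value + other.value,
            math.hypot(self.uncertainty, other.uncertainty),
        )

x = Uncertain(1.0, 0.1)
y = Uncertain(2.0, 0.05)
z = x + y
print(z.value, round(z.uncertainty, 4))
```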

The value and uncertainty of the resulting value can be accessed directly, e.g.

x = 1.2 ± 0.2
x.value          // 1.2
x.uncertainty    // 0.2

Numeric literals like 42, 3.14, and 1e6 are initially untyped and of arbitrary precision in Iroh. They remain untyped when assigned to const values, and are only typed when assigned to variables.

When numeric literals are not typed, by default:

  • Values without decimal places, i.e. integer_literals, are inferred as int values.

  • Values with decimal places, i.e. decimal_point_literals, are inferred as fixed128 values.

  • Values with e or E exponents, i.e. exponential_literals, are inferred as float64 values.

For example:

// Untyped decimal-point value
const Pi = 3.14159265358979323846264338327950288419716939937510582097494459

// When explicit types aren't specified:
radius = 5.0 // inferred as a fixed128
area = Pi * radius * radius // Pi is treated as a fixed128

// When explicit types are specified:
radius = float32(5.0)
area = Pi * radius * radius // Pi is treated as a float32

The with statement can be used to control how numeric literals are inferred in specific lexical scopes, e.g.

with {
    .decimal_point_literals = .float32
    .integer_literals = .int128
} {
    x = 3.14 // x is a float32
    y = 1234 // y is an int128
}

Numeric types are automatically upcasted if it can be done safely, e.g.

x = int32(20)
y = int64(10)
z = x + y // x is automatically upcasted to an int64

Otherwise, variables will have to be cast explicitly, e.g.

x = int32(20)
y = int64(10)
z = x + int32(y) // y is explicitly downcasted to an int32

When integers are cast to a floating-point type, a type conversion is performed that tries to fit the closest representation of the integer within the float.

Conversely, when a floating-point value is cast to an integer type, its fractional part is discarded, and the result is the integer part of the floating-point value.

Except for the cases when a type won’t fit, e.g. when adding an int256 and a fixed128 value, integers are automatically upcasted to fixed-point types when needed.

Similarly, integers and fixed-point values are both considered safe for upcasting into bigdecimal values, e.g.

x = bigdecimal(1.2)
y = fixed128(2.5)
z = 3 * (x / y) // results in a bigdecimal value of 1.44

By default, the compiler will error if literals are assigned to types outside of the range for an integer, fixed-point, or bigdecimal type, e.g.

x = int8(1024) // ERROR!

Likewise, an error is generated at runtime if a value is downcast into a type that doesn’t fit, e.g.

x = int64(1024)
y = int8(x) // ERROR!

Integer types can take an optional policy parameter to instead either truncate the value to fit the target type’s bit width, recast the bits, or clamp it to the type’s min or max value, e.g.

x = int64(4431)
y = int8(x, policy: .truncate) // results in 79
z = int8(x, policy: .clamp)    // results in 127

// Recast a uint8 as an int8
x = uint8(200)
y = int8(x, policy: .recast) // results in -56
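
The three results above can be reproduced with plain bit arithmetic. The Python sketch below models a cast to a signed integer of a given width; cast_int is a hypothetical helper. In this model .truncate and .recast reduce to the same operation (keep the low bits and reinterpret them as two's complement), whereas Iroh may treat them as distinct policies.

```python
def cast_int(value, bits, policy="checked"):
    """Model casting `value` to a signed integer type that is `bits` wide."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if lo <= value <= hi:
        return value
    if policy == "clamp":
        return hi if value > hi else lo
    if policy in ("truncate", "recast"):
        # Keep the low `bits` bits and reinterpret them as two's complement.
        low_bits = value & ((1 << bits) - 1)
        return low_bits - (1 << bits) if low_bits > hi else low_bits
    raise OverflowError(f"{value} does not fit in int{bits}")

print(cast_int(4431, 8, "truncate"))
print(cast_int(4431, 8, "clamp"))
print(cast_int(200, 8, "recast"))
```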

Non-integer numeric types do not support a custom policy:

  • Fixed-point and bigdecimal values will always generate an error when a value doesn’t fit within its range.

  • Floating-point values will silently overflow into either +inf or -inf, or if a value is smaller than the smallest representable subnormal, it will become either +0 or -0.

Arithmetic operations on integer and fixed-point types are automatically checked, i.e. will raise an error on either overflow or underflow, e.g.

x = uint8(160)
y = uint8(160)
z = x + y // ERROR!

This behaviour can be changed by using the with statement to set the .integer_arithmetic_policy to either .wrapping, .saturating, or .checked, e.g.

with {
    .integer_arithmetic_policy = .wrapping
} {
    x = uint8(160)
    y = uint8(160)
    z = x + y // results in 64
}

The integer types also provide methods corresponding to operations using each of the policy variants for when you don’t want to change the setting for the whole scope, e.g.

x = uint8(160)
y = uint8(160)
z = x.wrapping_add(y) // results in 64
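
To make the difference between the three policies concrete, here is a minimal Python model of uint8 addition under each one. The function names mirror the wrapping_add method shown above; checked_add and saturating_add are assumed by analogy and are not confirmed Iroh method names. Inputs are assumed to already be valid uint8 values.

```python
UINT8_MAX = 255

def checked_add(a, b):
    """Raise instead of producing an out-of-range result."""
    total = a + b
    if total > UINT8_MAX:
        raise OverflowError("uint8 addition overflowed")
    return total

def wrapping_add(a, b):
    """Keep only the low 8 bits, i.e. reduce modulo 256."""
    return (a + b) & UINT8_MAX

def saturating_add(a, b):
    """Stop at the maximum representable value."""
    return min(a + b, UINT8_MAX)

print(wrapping_add(160, 160))
print(saturating_add(160, 160))
```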

The errors mentioned above, along with the error caused by dividing by zero, can be caught using the try keyword, e.g.

x = 1024
y = try int8(x)

Percentage values can be constructed using the % suffix, e.g.

total = 146.00
vat_rate = 20%
vat = total * vat_rate

vat == 29.20 // true

Numeric types can also be constructed from string values, e.g.

x = int64("910365")

The string value can be of any format that’s valid for a literal of that type, e.g.

x = int32("1234")
y = fixed128("28.50")
z = uint8("0xff") // Automatic base inferred from the 0x prefix

An optional base parameter can be specified when parsing strings to integer types, e.g.

x = int64("deadbeef", base: .hex)
y = uint8("10101100", base: .binary)

Numbers can be rounded using the built-in round function. By default it will round to the closest integer value using .half_even rounding, e.g.

x = 13.5
round(x) // 14

An alternative rounding mode can be specified if desired, e.g.

x = 13.5
round(x, .down) // 13

The following rounding modes are natively supported:

enum {
    half_even, // Round to nearest, .5 goes towards nearest even integer
    half_up,   // Round to nearest, .5 goes away from zero
    half_down, // Round to nearest, .5 goes towards zero
    up,        // Round away from zero
    down,      // Round towards zero
    ceiling,   // Round towards positive infinity
    floor,     // Round towards negative infinity
    truncate,  // Remove fractional part
}
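
The differences between these modes only show up on ties and on negative values. Python's decimal module has rounding constants whose documented behaviour matches the comments above, so the sketch below maps each Iroh mode name onto one of them. The mapping itself is an assumption, and round_to_int is a hypothetical helper.

```python
from decimal import (
    Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
    ROUND_HALF_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP, ROUND_UP,
)

MODES = {
    "half_even": ROUND_HALF_EVEN,
    "half_up": ROUND_HALF_UP,
    "half_down": ROUND_HALF_DOWN,
    "up": ROUND_UP,
    "down": ROUND_DOWN,
    "ceiling": ROUND_CEILING,
    "floor": ROUND_FLOOR,
    "truncate": ROUND_DOWN,  # dropping the fractional part equals rounding towards zero
}

def round_to_int(text, mode):
    """Round a decimal string to an integer using the named mode."""
    return int(Decimal(text).quantize(Decimal(1), rounding=MODES[mode]))

# Print each mode's result for a positive and a negative tie.
for mode in MODES:
    print(mode, round_to_int("12.5", mode), round_to_int("-12.5", mode))
```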

The number of decimal places can be controlled by an optional scale parameter, e.g.

pi = 3.1415926
round(pi, scale: 4) // 3.1416

A negative scale makes the rounding occur to the left of the decimal point. This is useful for rounding to the nearest ten, hundred, thousand, etc.

x = 12345.67
round(x, scale: -2) // 12300

The optional significant_figures parameter can round to a specific number of significant figures, e.g.

x = 123.456
round(x, significant_figures: 3) // 123

y = 0.001234
round(y, significant_figures: 2) // 0.0012
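
The scale and significant_figures examples above can be reproduced with Python's decimal module. The sketch assumes the default .half_even mode, and round_scale and round_sig are hypothetical helper names rather than Iroh functions.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

def round_scale(text, scale):
    """Round to `scale` decimal places; a negative scale rounds left of the point."""
    quantum = Decimal(1).scaleb(-scale)
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_EVEN)

def round_sig(text, figures):
    """Round to `figures` significant figures."""
    return Context(prec=figures, rounding=ROUND_HALF_EVEN).create_decimal(text)

print(round_scale("3.1415926", 4))
print(int(round_scale("12345.67", -2)))
print(round_sig("123.456", 3))
print(round_sig("0.001234", 2))
```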

The round function works on all numeric types. Note that the rounding of floating-point values might yield surprising results as most decimal fractions can’t be represented exactly in floats.

Certain arithmetic operations on the decimal types are automatically rounded. The compiler will avoid re-ordering these so that outputs are deterministic.

By default, rounding for both fixed-point types and bigdecimal will be to 18 decimal places and using the .half_even rounding mode. This can be controlled using the with statement, e.g.

with {
    .decimal_point_literals = .fixed128
    .integer_literals = .fixed128
    .decimal_round = .floor
    .decimal_scale = 4
} {
    x = 1/3 // 0.3333
}

Converting numeric values into strings can be done by just casting them into a string type, e.g.

x = 1234
string(x) // "1234"

This can take an optional format parameter to control how the number is formatted, e.g.

x = 1234
string(x, format: .decimal)            // "1234", the default
string(x, format: .hex)                // "4d2"
string(x, format: .hex_upper)          // "4D2"
string(x, format: .hex_prefixed)       // "0x4d2"
string(x, format: .hex_upper_prefixed) // "0X4D2"
string(x, format: .octal)              // "2322"
string(x, format: .octal_prefixed)     // "0o2322"
string(x, format: .binary)             // "10011010010"
string(x, format: .binary_prefixed)    // "0b10011010010"
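
The expected strings above can be cross-checked against Python's format mini-language, which produces the same digits and prefixes for 1234. The dictionary keys reuse the Iroh format names purely as labels. This is a check in another language, not Iroh's implementation.

```python
x = 1234

# Iroh format name -> the equivalent Python format() result.
formats = {
    "decimal": format(x, "d"),
    "hex": format(x, "x"),
    "hex_upper": format(x, "X"),
    "hex_prefixed": format(x, "#x"),
    "hex_upper_prefixed": format(x, "#X"),
    "octal": format(x, "o"),
    "octal_prefixed": format(x, "#o"),
    "binary": format(x, "b"),
    "binary_prefixed": format(x, "#b"),
}

for name, text in formats.items():
    print(name, text)
```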

The optional scale parameter will pad the output with trailing zeros to match the desired number of decimal places, e.g.

x = 12.3
string(x, scale: 2) // "12.30"

If something other than the default .half_even rounding, i.e. banker’s rounding, is desired, then the optional round parameter can be used:

x = 12.395
string(x, round: .floor, scale: 2) // "12.39"

The optional thousands parameter can be set to segment the number into multiples of thousands, e.g.

x = 1234567
string(x, thousands: true) // "1,234,567"

For formatting numbers as they’re expected in different locales, the locale parameter can be set, e.g.

x = 1234567.89
string(x, locale: .de) // "1.234.567,89"

A with statement can be used to apply the locale to a lexical scope, e.g.

with {
    .locale = .de
} {
    // all string formatting in this scope will use the specified locale
}

The specific separators for decimal and thousands can also be controlled explicitly. The setting of thousands_separator implicitly sets thousands to true.

x = 1234567.89
string(x, decimal_separator: ",", thousands_separator: ".") // "1.234.567,89"

For more complete numeric support, the standard library also provides additional packages, e.g. the math package defines constants like Pi, implements trig functions, etc.

Unit Values

Units can be defined using the built-in unit type and live within a custom <unit> namespace, e.g.

<s> = unit(name: "second", plural: "seconds")

<km> = unit(name: "kilometre", plural: "kilometres")

Numeric values can be instantiated with a specific <unit>, e.g.

distance = 20<km>
timeout = 30<s>

Unit definitions are evaluated at compile-time, and can be related to each other via the @relate function, e.g.

<s> = unit(name: "second", plural: "seconds")
<min> = unit(name: "minute", plural: "minutes")
<hour> = unit(name: "hour", plural: "hours")

@relate(<min>, 60<s>)
@relate(<hour>, 60<min>)

If the optional si_unit is set to true during unit definition, then variants using SI prefixes will be automatically created, e.g.

<s> = unit(name: "second", plural: "seconds", si_unit: true)

// Units like ns, us, ms, ks, Ms, etc. are automatically created, e.g.
1<s> == 1000<ms> // true

Non-linear unit relationships can be defined too, e.g.

<C> = unit(name: "°C")
<F> = unit(name: "°F")

@relate(<C>, ((<F> - 32) * 5) / 9)

// The opposite is automatically inferred, i.e.
<F> == ((<C> * 9) / 5) + 32

Cyclical units with a wrap_at value automatically wrap-around on calculations, e.g.

<degrees> = unit(name: "°", wrap_at: 360)

difference = 20<degrees> - 350<degrees>
difference == 30<degrees> // true
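
The wrap-around result is what a floored modulo gives. The Python sketch below assumes values are normalized into the range from 0 up to, but not including, wrap_at, which the text implies but does not state. The wrap function is a hypothetical helper.

```python
def wrap(value, wrap_at):
    """Normalize `value` into the range [0, wrap_at)."""
    # Python's % takes the sign of the divisor, so negative inputs wrap upwards.
    return value % wrap_at

difference = wrap(20 - 350, 360)
print(difference)
```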

Logarithmic units can also be defined, e.g.

<dB> = unit(name: "decibel", plural: "decibels", logarithmic: true, base: 10)

20<dB> + 30<dB> == 30.4<dB> // true
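
Adding two decibel values means converting each to a linear quantity, summing, and converting back. The Python sketch below assumes decibels of power (a factor of 10 in the formula); the text only gives base: 10, so that reading is an assumption. Under it the sum is roughly 30.41, which agrees with the 30.4 shown above to one decimal place.

```python
import math

def add_db(a_db, b_db):
    """Add two power levels expressed in decibels."""
    linear_sum = 10 ** (a_db / 10) + 10 ** (b_db / 10)
    return 10 * math.log10(linear_sum)

total = add_db(20, 30)
print(round(total, 2))
```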

Values with units are of the type quantity and can also be programmatically defined. When a function expects a parameter of type unit, the <> can be elided, e.g.

distance = quantity(20, km)

Computations with quantities propagate their units, e.g.

speed = 20<km> / 40<min>

speed == 0.5<km/min> // true

Quantities can be normalized to convertible units, e.g.

speed = 0.5<km/min>

speed.to(km/hour) == 30<km/hour>

The type system automatically prevents illegal calculations, e.g.

10<USD> + 20<min> // ERROR!

While units are automatically calculated on multiplication and division, e.g.

force = mass * acceleration  // kg⋅m/s²  (newtons)
energy = force * distance    // kg⋅m²/s² (joules)
power = energy / time        // kg⋅m²/s³ (watts)

Quantities default to using fixed128 values, but this can be customized by using a different type during the value construction, e.g.

speed = bigdecimal(55.312)<km/s>
measurement = (1.5 ± 0.1)<m>

Quantities can be parsed from string values where a numeric value is suffixed with the unit, e.g.

timeout = quantity("30s")

timeout == 30<s> // true

By default all units that are available in the scope are supported. This can be constrained by specifying the optional limit_to parameter, e.g.

block_size_limit = quantity("100MB", limit_to: [MB, GB])

block_size_limit == 0.1<GB> // true

Quantities can also be cast to strings, e.g.

timeout = 30<s>

string(timeout) == "30s"

Localized long form names for units can be used by setting long_form to true. These default to the names given during unit definition, e.g.

string(1<s>, long_form: true) == "1 second"     // true
string(30<s>, long_form: true) == "30 seconds"  // true

When cast to a string, the most appropriate unit from a list can be automatically selected by specifying closest_fit. This will find the largest unit that gives a positive integer quantity, e.g.

time_taken = time.since(start) // 251<s>

string(time_taken, closest_fit: [s, min, hour]) == "4 minutes" // true
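
One reading of the closest_fit rule is sketched below in Python: try the units from largest to smallest and keep the first one that yields a whole count of at least 1. It assumes any remainder is dropped rather than rounded, which the text does not specify. SECONDS_PER_UNIT and closest_fit are hypothetical names used only for this illustration.

```python
SECONDS_PER_UNIT = {"second": 1, "minute": 60, "hour": 3600}  # hypothetical table

def closest_fit(seconds, units):
    """Pick the largest unit in `units` that yields a whole count of at least 1."""
    ordered = sorted(units, key=lambda name: SECONDS_PER_UNIT[name], reverse=True)
    for name in ordered:
        count = seconds // SECONDS_PER_UNIT[name]  # drops any remainder
        if count >= 1:
            return f"{count} {name}" + ("" if count == 1 else "s")
    # Nothing fits: report zero of the smallest unit.
    return f"0 {ordered[-1]}s"

print(closest_fit(150, ["second", "minute", "hour"]))
print(closest_fit(7384, ["second", "minute", "hour"]))
```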

The humanize parameter automatically selects up to two of the largest units that result in whole numbers, e.g.

string(
  time.since(post.updated_time),
  humanize: true,
  limit_units_to: [s, min, hour, day, month, year]
)
// Outputs look something like:
//   "2 minutes"
//   "3 months and 10 days"

Types of a specific quantity can be referred to explicitly as quantity[unit], e.g.

distance = 10<km>

type(distance) == quantity[km] // true

Some quantity types are aliased for convenience, e.g.

duration = quantity[s]

When parsing from a string to a quantity of a specific unit, it will be normalized from relatable units, e.g.

timeout = 30<s>
timeout = quantity("2min")

timeout == 120<s> // true

Custom asset and currency units can be defined at runtime with a symbol and name, e.g.

USDC = currency(symbol: "USDC", name: "USD Coin")
EURC = currency(symbol: "EURC", name: "EUR Coin")

Computations with currency units can be converted using live exchange rates at runtime, e.g.

fx_rate = 1.16<USDC/EURC>
total = 1000<EURC>

total.to(USDC, at: fx_rate) == 1160<USDC> // true

This allows type safety to be maintained, e.g. a GBP value can’t be accidentally added to a USD value, while supporting explicit conversion, e.g.

gbp_subtotal = 200<GBP>
usd_subtotal = 100<USD>
total = gbp_subtotal + usd_subtotal // ERROR! Can't mix currencies

// This would work:
total = gbp_subtotal + usd_subtotal.to(GBP, at: fx_rate)

Range Values

Iroh provides a native range type for constructing a sequence of values between given start and end integer values, e.g.

x = range(start: 5, end: 9)

len(x) == 5 // true

// Prints: 5, 6, 7, 8, 9
for i in x {
    print(i)
}

A next value can also be provided to deduce the “steps” to take between the start and end, e.g.

x = range(start: 1, next: 3, end: 10)

// Prints: 1, 3, 5, 7, 9
for i in x {
    print(i)
}

If next is not specified, it defaults to start + 1 if the start value is less than or equal to end, otherwise it defaults to start - 1, e.g.

x = range(start: 5, end: 1)

// Prints: 5, 4, 3, 2, 1
for i in x {
    print(i)
}

Likewise, if start is not specified, it defaults to 0, e.g.

x = range(end: 5)

// Prints: 0, 1, 2, 3, 4, 5
for i in x {
    print(i)
}
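
The defaulting rules for start and next can be summarized in a short Python generator. This is a model of the behaviour described above, not Iroh's implementation. iroh_range and next_value are hypothetical names, and rejecting a zero step is an assumption, since the text does not cover that case.

```python
def iroh_range(end, start=0, next_value=None):
    """Yield integers from start to end inclusive, stepping by next_value - start."""
    if next_value is None:
        # Default direction depends on whether the range ascends or descends.
        next_value = start + 1 if start <= end else start - 1
    step = next_value - start
    if step == 0:
        raise ValueError("next must differ from start")
    value = start
    while (step > 0 and value <= end) or (step < 0 and value >= end):
        yield value
        value += step

print(list(iroh_range(9, start=5)))
print(list(iroh_range(10, start=1, next_value=3)))
print(list(iroh_range(1, start=5)))
print(list(iroh_range(5)))
```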

Note that, unlike range in Python, our ranges are inclusive of the end value. We believe this is much more intuitive — especially when it comes to using ranges to slice values.

Since ranges are frequently used, the shorthand start..end syntax from Perl is available, e.g.

x = 5..10

x == range(start: 5, end: 10) // true

This is particularly useful in for loops, e.g.

// Prints: 5, 6, 7, 8, 9, 10
for i in 5..10 {
    print(i)
}

Similar to Haskell, the next value can also be specified in the shorthand as start,next..end, e.g.

// Prints: 5, 7, 9
for i in 5,7..10 {
    print(i)
}

While range expressions do not support full sub-expressions, variable identifiers can be used in place of integer literals, e.g.

pos = 0
next = 2
finish = 6

// Prints: 0, 2, 4, 6
for i in pos,next..finish {
    print(i)
}

When the start value is elided in range expressions, it defaults to 0, e.g.

finish = 6

// Prints: 0, 1, 2, 3, 4, 5, 6
for i in ..finish {
    print(i)
}

Iroh does not provide syntax for ranges that are exclusive of their end value, as having two separate syntaxes tends to confuse developers, e.g. .. and ... in Ruby, .. and ..= in Rust, etc.

Array Data Types

Iroh supports arrays, i.e. ordered collections of elements of the same type with a length that is determined at compile-time.

// A 5-item array of ints:
x = [5]int{1, 2, 3, 4, 5}

// A 3-item array of strings:
y = [3]string{"Tav", "Alice", "Zeno"}

Array types are of the form [N]Type where N specifies the length and Type specifies the type of each element. All unspecified array elements are zero-initialized, e.g.

x = [3]int{}

x[0] == 0 and x[1] == 0 and x[2] == 0 // true

Elements at specific indexes can also be explicitly specified for sparse initialization, e.g.

x = [3]int{2: 42}

x == {0, 0, 42} // true

The ; delimiter can be used within the length specifier to initialize with a custom value, e.g.

x = [3; 20]int{2: 42}

x == {20, 20, 42} // true

For more complex defaults, a fill function can also be specified. This function is passed the index positions for all dimensions that can be optionally used, e.g.

x = [3]int{} << { $0 * 2 }

x == {0, 2, 4} // true

Using a custom fill function together with either a default value or with arrays that have elements initialized at specific indexes will result in an error at edit time.

The length of an array can either be an integer literal or an expression that compile-time evaluates to an integer, e.g.

const n = 5

x = [n + 1]int{}

len(x) == 6 // true

The ... syntax can be used to let the compiler automatically evaluate the length of the array from the number of elements, e.g.

x = [...]int{1, 2, 3}

len(x) == 3 // true

The element type can also be elided when it can be automatically inferred, e.g.

x = [3]{"Zeno", "Reia", "Zaia"} // type is [3]string

y = [...]{"Zeno", "Reia", "Zaia"} // type is [3]string

Array elements are accessed via [index], which is 0-indexed, e.g.

x = [3]int{1, 2, 3}

x[1] == 2 // true

Indexed access is always bounds checked for safety. Out of bounds access will generate a compile-time error for index values that are known at compile-time, and a runtime error otherwise.

Elements of an array can be iterated using for loops, e.g.

x = [3]int{1, 2, 3}

for elem in x {
    print(elem)
}

If you want the index as well, destructure the iterator into 2 values, i.e.

x = [3]int{1, 2, 3}

for idx, elem in x {
    print("Element at index ${idx} is: ${elem}")
}

Arrays can be compared for equality by using the == operator, which compares each element of the array for equality, e.g.

x = [3]int{1, 2, 3}

if x == {1, 2, 3} {
    // do something
}

Note that when the compiler can infer the type of something, e.g. on one side of a comparison where the type of the other side is already known, the type specification can be elided as above.

Array instances also have a bunch of utility methods, e.g.

  • x.contains — checks if the array has a specific element.

  • x.index_of — returns the first matching index of a specific element if it exists.

  • x.reversed — returns a reversed copy of the array.

  • x.sort — sorts the array as long as the elements are sortable.

Depending on the size of the array and how it’s used, the compiler will automatically place the array on the stack, on the heap, or even in registers.

By default, arrays are passed by value. If a function needs to mutate the array so that the changes are visible to the caller, it can specify the parameter type to be a pointer to an array, e.g.

func change_first_value(x *[3]int) {
    x[0] = 21 // indexed access is automatically dereferenced
}

x = [3]int{1, 2, 3}
change_first_value(&x)
print(x) // [3]int{21, 2, 3}

If an array is defined as a const, it can’t be mutated, e.g.

const x = [3]int{1, 2, 3}

x[0] = 5 // ERROR!

Arrays with elements of the same type can be cast between different sizes, e.g.

x = [3]int{1, 2, 3}

y = [5]int(x) // when upsizing, missing values are zero-initialized

z = [2]int(x) // when downsizing, excess values are truncated
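The resizing rules for such casts can be modelled with a small Python sketch. The function name resize and its zero parameter are assumptions made for illustration, and a Python list stands in for a fixed-size array.

```python
def resize(values, length, zero=0):
    """Return a copy of `values` with exactly `length` elements."""
    if length <= len(values):
        # Downsizing: excess values are truncated.
        return values[:length]
    # Upsizing: missing values are filled with the zero value.
    return values + [zero] * (length - len(values))


x = [1, 2, 3]
print(resize(x, 5))  # [1, 2, 3, 0, 0]
print(resize(x, 2))  # [1, 2]
```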

Multi-dimensional arrays can also be specified by stacking the array types, e.g.

x = [2][3]int{
    {1, 2, 3},
    {4, 5, 6}
}

This can also use the slightly clearer MxN syntax, e.g.

x = [2x3]int{
    {1, 2, 3},
    {4, 5, 6}
}

This can be extended to whatever depth is needed, e.g.

x = [2x3x4]int{}
y = [480x640x3]byte{}

Multi-dimensional arrays default to row-major layouts so as to be CPU cache-friendly. For use cases where a column-major layout is needed, this can be explicitly specified, e.g.

x = [2x3@col]int{}

This can also be set for an entire lexical scope using the with statement, e.g.

with {
    .array_layout = .column_major
} {
    x = [2x3]int{}
}

Such array layouts can be useful in domains like linear algebra calculations. The layout can also be transposed dynamically to a new array via casting, e.g.

x = [2x3]int{}

y = @col(x)

z = @row(y) // same layout as x

Or by using @transpose when you just want the opposite layout, e.g.

x = [2x3]int{}

y = @transpose(x)

z = @transpose(y) // same layout as x
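To make the difference between the two layouts concrete, here is a Python sketch of how a 2x3 array could be laid out in flat memory under each ordering. The function names flatten and offset are hypothetical, and the sketch says nothing about how Iroh actually stores arrays.

```python
def flatten(matrix, layout="row"):
    """Return the elements in the order they would sit in flat memory."""
    rows, cols = len(matrix), len(matrix[0])
    if layout == "row":
        # Row-major: each row is contiguous.
        return [matrix[r][c] for r in range(rows) for c in range(cols)]
    if layout == "col":
        # Column-major: each column is contiguous.
        return [matrix[r][c] for c in range(cols) for r in range(rows)]
    raise ValueError("layout must be 'row' or 'col'")


def offset(r, c, rows, cols, layout="row"):
    """Flat position of element (r, c) under the given layout."""
    return r * cols + c if layout == "row" else c * rows + r


m = [[1, 2, 3],
     [4, 5, 6]]

print(flatten(m, "row"))  # [1, 2, 3, 4, 5, 6]
print(flatten(m, "col"))  # [1, 4, 2, 5, 3, 6]
```

Walking along a row touches adjacent memory in the row-major layout, while walking down a column touches adjacent memory in the column-major one.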

Slice Data Types

Slices, like arrays, are used for handling ordered elements. Unlike arrays, which have a size that is fixed at compile-time, slices can be dynamically resized during runtime.

Slice types are of the form []Type where Type specifies the type of each element. For example, a slice of ints can be initialized with:

x = []int{1, 2, 3}

Since slices are used often, a shorthand is available when the element type can be inferred, e.g.

x = [1, 2, 3]

Internally, a slice points to a resizable array and keeps track of both its current length, i.e. the current number of elements, and its capacity, i.e. the allocated space:

slice = struct {
    _data      [*]array // pointer to an array
    _length    int
    _capacity  int
}

The built-in len and cap functions provide the current length and capacity of a slice, e.g.

x = [1, 2, 3]

len(x) == 3 // true
cap(x) == 3 // true

Slices are dynamically resized as needed, e.g. if you append to a slice that is already at capacity:

x = [1, 2, 3]
x.append(4)

len(x) == 4 // true
cap(x) == 4 // true

When a slice’s capacity needs to grow, it is increased to the closest power of 2 that will fit the additional elements, e.g.

x = [1, 2, 3, 4]
x.append(5)

len(x) == 5 // true
cap(x) == 8 // true
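The growth rule can be sketched in Python as follows, reading "closest power of 2 that will fit" as the smallest power of 2 that is at least the required length. That reading, and the function name grown_capacity, are assumptions.

```python
def grown_capacity(required):
    """Smallest power of 2 that can hold `required` elements."""
    if required <= 1:
        return 1
    # (required - 1).bit_length() is the exponent of the next power of 2.
    return 1 << (required - 1).bit_length()


print(grown_capacity(4))  # 4
print(grown_capacity(5))  # 8
```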

As slices are just views into arrays, they can be formed by “slicing” arrays with a range, e.g.

x = [5]int{1, 2, 3, 4, 5}
y = x[0..2]

y == [1, 2, 3] // true

Slices can also be sliced to form new slices, e.g.

x = [1, 2, 3, 4, 5]
y = x[0..2]

y == [1, 2, 3] // true

When a slice needs to grow beyond the underlying capacity, a new array is allocated and existing values are copied over. This allows for multiple independent views, e.g.

arr = [4]int{1, 2, 3, 4}

x = arr[2..3]       // [3, 4]
y = arr[..]         // [1, 2, 3, 4]

// Changes to the underlying array are reflected in other views, e.g.
x[0] = 10

x == [10, 4]        // true
y == [1, 2, 10, 4]  // true

// But if a slice is grown, it no longer points to the same array, e.g.
x.append(5)

// So changes are not reflected in other views, e.g.
x[0] = 11

x == [11, 4, 5]     // true
y == [1, 2, 10, 4]  // true
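The behaviour above can be imitated with a toy Python class. This is only a sketch: the SliceView name, its fields, and the rule that capacity runs to the end of the backing storage (borrowed from Go) are all assumptions rather than a description of Iroh's internals.

```python
class SliceView:
    """Toy model of a slice: a window onto shared backing storage."""

    def __init__(self, backing, offset, length):
        self.backing = backing  # shared list standing in for the array
        self.offset = offset    # position of the first visible element
        self.length = length    # number of visible elements

    @property
    def capacity(self):
        # Assumption: capacity runs to the end of the backing storage.
        return len(self.backing) - self.offset

    def to_list(self):
        return self.backing[self.offset:self.offset + self.length]

    def _check(self, index):
        if not 0 <= index < self.length:
            raise IndexError("slice index out of range")

    def __getitem__(self, index):
        self._check(index)
        return self.backing[self.offset + index]

    def __setitem__(self, index, value):
        self._check(index)
        self.backing[self.offset + index] = value

    def append(self, value):
        if self.length == self.capacity:
            # Full: copy the visible elements into fresh storage sized to the
            # next power of 2, detaching this view from all the others.
            new_capacity = 1 << self.length.bit_length()
            padding = [None] * (new_capacity - self.length)
            self.backing = self.to_list() + padding
            self.offset = 0
        self.backing[self.offset + self.length] = value
        self.length += 1


arr = [1, 2, 3, 4]
x = SliceView(arr, offset=2, length=2)  # models arr[2..3]
y = SliceView(arr, offset=0, length=4)  # models arr[..]

x[0] = 10    # storage is still shared, so y sees the change
x.append(5)  # x is at capacity, so it moves to its own storage
x[0] = 11    # no longer visible through y

print(x.to_list())  # [11, 4, 5]
print(y.to_list())  # [1, 2, 10, 4]
```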

Individual elements of a slice can be accessed by indexing, e.g.

x = [1, 2, 3]

x[0] == 1 and x[1] == 2 and x[2] == 3 // true

Similar to Python, negative indexes work as offsets from the end, e.g.

x = [1, 2, 3]

x[-1] == 3 // true
x[-2] == 2 // true

Slices can be iterated over by for loops, e.g.

x = [1, 2, 3]

// Prints: 1, 2, 3
for elem in x {
    print(elem)
}

If you want the index as well, destructure the iterator into 2 values, just like arrays, e.g.

x = [1, 2, 3]

for idx, elem in x {
    print("Element at index ${idx} is: ${elem}")
}

Using square brackets around a range expression acts as a shorthand for creating a slice from that range, e.g.

x = [1..5]

len(x) == 5           // true
x == [1, 2, 3, 4, 5]  // true

If an actual slice consisting of range values is desired, then parentheses need to be used, e.g.

x = [(1..5)]

len(x) == 1                      // true
x[0] == range(start: 1, end: 5)  // true

If the start value is left off by the range when slicing, it defaults to 0, e.g.

x = [1, 2, 3, 4, 5]
y = x[..2]

y == [1, 2, 3] // true

If the end value is left off, it defaults to the index of the last element, i.e. len(slice) - 1, or, if the start value is negative, to the negative index of the first element, i.e. -len(slice), e.g.

x = [1, 2, 3, 4, 5]
y = x[2..]
z = x[-3..]

y == [3, 4, 5] // true
z == [3, 2, 1] // true

If both start and end are left off, it slices the whole range, e.g.

x = [1, 2, 3, 4, 5]
y = x[..]

y == [1, 2, 3, 4, 5] // true

Slicing with stepped ranges lets you get interleaved elements, e.g.

x = [10..20]
y = x[0,2..] // gets every other element

y == [10, 12, 14, 16, 18, 20] // true
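One way to read the defaulting rules for range-based slicing is sketched below in Python. The function slice_inclusive and its parameters are hypothetical. The sketch also assumes that, when no next value is given, the direction is inferred from the resolved start and end positions, which the text above only implies through the x[-3..] example.

```python
def slice_inclusive(items, start=None, end=None, nxt=None):
    """Model of items[start,nxt..end] with inclusive bounds."""
    n = len(items)
    if n == 0:
        return []
    if start is None:
        start = 0
    if end is None:
        # Elided end: the last element, or the first element when the
        # start is given as a negative offset.
        end = -n if start < 0 else n - 1
    # Resolve negative offsets to absolute positions.
    lo = start + n if start < 0 else start
    hi = end + n if end < 0 else end
    if nxt is None:
        step = 1 if lo <= hi else -1
    else:
        step = (nxt + n if nxt < 0 else nxt) - lo
        if step == 0:
            raise ValueError("next must differ from start")
    # Python's range excludes its stop value, so move one step past hi.
    stop = hi + 1 if step > 0 else hi - 1
    return [items[i] for i in range(lo, stop, step)]


x = [1, 2, 3, 4, 5]
print(slice_inclusive(x, end=2))     # [1, 2, 3]
print(slice_inclusive(x, start=2))   # [3, 4, 5]
print(slice_inclusive(x, start=-3))  # [3, 2, 1]
```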

When initializing a slice, if you already know the minimum capacity that’s needed, it can be specified ahead of time so as to avoid reallocations, e.g.

x = make([]int, cap: 100)

len(x) == 0    // true
cap(x) == 100  // true

The length can also be specified, e.g.

x = make([]int, cap: 100, len: 5)

len(x) == 5    // true
cap(x) == 100  // true

If length is specified, but not capacity, then capacity defaults to the same value as length, e.g.

x = make([]int, len: 5)

len(x) == 5  // true
cap(x) == 5  // true

By default, elements are initialized to their zero value. An alternative default can be specified if desired, e.g.

x = make([]int, len: 5, default: 20)

len(x) == 5  // true
cap(x) == 5  // true

x[0] == 20   // true
x[4] == 20   // true

Slices have a broad range of utility methods, e.g.

  • x.append — adds one or more elements to the end.

  • x.all — returns true if all elements match the given predicate function.

  • x.any — returns true if any element matches the given predicate function.

  • x.choose — select n number of elements from the slice.

  • x.chunk — split the slice into sub-slices of the specified length.

  • x.clear — removes all elements.

  • x.combinations — generate all possible combinations of the given length from the slice elements.

  • x.contains — checks if the slice has a specific element.

  • x.count — returns the number of elements that match the given predicate function.

  • x.drop — remove the first n elements.

  • x.drop_while — keep removing elements until the given predicate function fails.

  • x.enumerated — returns a new slice where each element gives both the original element index and value.

  • x.extend — add all elements from the given slice to the end.

  • x.filter — return a new slice for elements matching the given predicate function.

  • x.filter_map — applies a function to each element and keeps only the non-nil results as a new slice.

  • x.first — return the first element, if any; also takes an optional predicate function, in which case, the first element, if any, that matches that predicate will be returned.

  • x.flatten — flatten nested slices.

  • x.flat_map — transform each element with the given function and then flatten nested slices.

  • x.for_each — run a given function for each element in the slice.

  • x.group_by — group elements by the given key function.

  • x.index_of — returns the first matching index of a specific element if it exists.

  • x.insert_at — inserts one or more elements at a given index.

  • x.intersperse — insert a separator between all elements.

  • x.join — join all elements into a string with the given separator.

  • x.last — return the last element, if any.

  • x.last_index_of — returns the last matching index of a specific element if it exists.

  • x.map — return a new slice with each element transformed by a function.

  • x.max — returns the maximum element; a custom comparison function can be given when the elements are non-comparable.

  • x.min — returns the minimum element; a custom comparison function can be given when the elements are non-comparable.

  • x.partition — split into two slices based on a predicate function.

  • x.permutations — generate all possible permutations of the given length from the slice elements.

  • x.pop — removes the last element from the slice and returns it.

  • x.prepend — adds one or more elements to the start.

  • x.reduce — reduce the elements into one value using a given accumulator function and initial value.

  • x.remove_at — remove one or more elements at a given index.

  • x.remove_if — remove elements matching the given predicate function.

  • x.reverse — reverses the slice in-place.

  • x.reversed — returns a reversed copy of the slice.

  • x.scan — starting with an initial value and an accumulator function, keep applying to each element, and return a slice of all intermediary results.

  • x.shift — removes the first element from the slice and returns it.

  • x.shuffle — shuffles the slice in-place.

  • x.shuffled — returns a shuffled copy of the slice.

  • x.sort — sorts the slice in-place.

  • x.sorted — returns a sorted copy of the slice.

  • x.split_at — split into two separate slices at the given index.

  • x.sum — for slices of elements that support the + operator, this returns the sum of all elements.

  • x.take — return the first n elements.

  • x.take_while — return all elements up to the point that the given predicate function fails.

  • x.transpose — swap rows/columns for a slice of slices.

  • x.unique — remove any duplicate elements.

  • x.unzip — convert a slice of paired elements into two separate slices.

  • x.window — create sub-slices that form a sliding window of the given length.

  • x.zip — pair each element with elements from the given slice.

  • x.zip_with — pair each element with elements from the given slice, and apply the given transformation function to each pair.

Methods like filter and map can use a Swift-like closure syntax for defining their function parameters, e.g.

adult_names = people
              .filter { $0.age > 18 }
              .map { $0.name.upper() }

Likewise for sort, e.g.

people.sort { $0.age < $1.age }

// Or, if you want a stable sort:
people.sort(stable: true) { $0.age < $1.age }

Iroh fully inlines these functions, so calls like people.filter { $0.age > 18 } perform exactly like a hand-written for loop, while being much more readable.

When combined with range initialization, the whole sequence effectively becomes lazy, minimizing unnecessary allocation, e.g.

result = [1..1000]
    .filter { $0 % 2 == 0 }
    .map { $0 * 2 }
    .take(10)

Methods like append and insert_at support both inserting one element at a time, e.g.

x = [1..3]
x.append(10)

x == [1, 2, 3, 10] // true

As well as appending multiple elements at once, e.g.

x = [1..3]
x.append(4, 5, 6)

x == [1, 2, 3, 4, 5, 6] // true

While all elements from another slice can be appended by using the ... splat operator, e.g.

x = [1..3]
y = [5..7]
x.append(4, ...y)

x == [1, 2, 3, 4, 5, 6, 7] // true

The extend method should be used instead for most use cases, e.g.

x = [1..3]
y = [4..7]
x.extend(y)

x == [1, 2, 3, 4, 5, 6, 7] // true

When a slice is assigned to a new variable or passed in as a parameter, both refer to the same underlying data, e.g.

x = [1..5]
y = x

y[0] = 20

x == [20, 2, 3, 4, 5] // true
y == [20, 2, 3, 4, 5] // true

The copy method needs to be used when an independent slice is needed, e.g.

x = [1..5]
y = x.copy()

y[0] = 20

x == [1, 2, 3, 4, 5]   // true
y == [20, 2, 3, 4, 5]  // true

When copying slices where the elements are of a composite type, using the deep_copy method will recursively call copy on all sub-elements that are deep-copyable, e.g.

x = [
  {"name": "Tav", "surname": "Siva"},
  {"name": "Alice", "surname": "Fung"}
]

// Deep copies are independent of each other.
y = x.deep_copy()
y[1]["surname"] = "Siva"

x[1]["surname"] == "Fung" // true
y[1]["surname"] == "Siva" // true

// However, standard copying is shallow.
z = x.copy()
z[1]["surname"] = "Siva"

x[1]["surname"] == "Siva" // true
z[1]["surname"] == "Siva" // true
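Python's standard copy module draws the same distinction, so it can serve as a rough analogy for the shallow and deep copies described above. The variable names below are illustrative, and Python lists of dicts only approximate Iroh slices of composite values.

```python
import copy

people = [
    {"name": "Tav", "surname": "Siva"},
    {"name": "Alice", "surname": "Fung"},
]

# A deep copy duplicates the nested values too.
deep = copy.deepcopy(people)
deep[1]["surname"] = "Siva"
print(people[1]["surname"])  # Fung

# A shallow copy duplicates only the outer list, so nested values are shared.
shallow = copy.copy(people)
shallow[1]["surname"] = "Lee"
print(people[1]["surname"])  # Lee

# The outer list itself is independent in both cases.
shallow.append({"name": "Zeno", "surname": "Reia"})
print(len(people), len(shallow))  # 2 3
```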

Slices can be compared for equality by using the == operator, which compares each element of the slice for equality, e.g.

x = [1, 2, 3]

if x == [1, 2, 3] {
    // do something
}

Element-wise comparisons use dotted operators like MATLAB and Julia. When comparison operators like .== and .> are used, they produce boolean slices, e.g.

x = [1..5]

x .== 2  // [false, true, false, false, false]

x .> 3   // [false, false, false, true, true]

These can then be used to do boolean/logical indexing, i.e. returning the slice values matching the true values, as first introduced by MATLAB and made popular by NumPy, e.g.

x = [1..10]
y = x[x .> 5]

y == [6, 7, 8, 9, 10] // true
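The two steps involved, building a boolean mask and then selecting with it, can be spelled out in plain Python. The helper names dot_gt and select are made up for this sketch and do not correspond to Iroh functions.

```python
def dot_gt(values, threshold):
    """Element-wise greater-than, modelling `values .> threshold`."""
    return [v > threshold for v in values]


def select(values, mask):
    """Keep the values whose matching mask entry is True."""
    if len(values) != len(mask):
        raise ValueError("mask must have the same length as values")
    return [v for v, keep in zip(values, mask) if keep]


x = list(range(1, 11))  # models [1..10]
mask = dot_gt(x, 5)
print(select(x, mask))  # [6, 7, 8, 9, 10]
```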

Full broadcasting is supported, where operations are applied element-wise to arrays, slices, and collections of different shapes and sizes, e.g.

x = [0..5]
y = [10..15]
z = x .+ y

z == [10, 12, 14, 16, 18, 20] // true

These operations will be automatically vectorized on architectures that support vectorization, e.g. SIMD on CPUs, GPUs, etc. This will happen transparently without needing custom hints.

Iroh supports Julia’s dotted function mechanism for broadcasting the function, i.e. applying it element-wise for each item in the input, e.g.

double = (n) => n * 2
x = [0..5]
y = double.(x)

y == [0, 2, 4, 6, 8, 10] // true

This also works with Iroh’s standard library, which provides functions like sin and exp, linear algebra functions like inv and solve, stats functions like mean and std, axis-aware reductions, etc.

Multi-dimensional slices can be created by nesting slice types, e.g.

x = [][]int{
    {1, 2, 3},
    {4, 5, 6}
}

Where the element type can be inferred, the shorthand syntax can be used, e.g.

x = [[1, 2, 3], [4, 5, 6]]

Or use ; to make it even shorter, e.g.

x = [1, 2, 3; 4, 5, 6]

x[0] == [1, 2, 3] // true
x[1] == [4, 5, 6] // true

Multiple ; delimiters can be used to make more nested structures, e.g.

x = [1, 2; 3, 4;; 5, 6; 7, 8]

len(x) == 2     // true
len(x[0]) == 2  // true
x[0][0] == 1    // true

Custom dimensions can also be specified to the constructor, e.g.

x = [2, 3]int{}

x == [0, 0, 0; 0, 0, 0] // true

By default, such multi-dimensional slices are zero-initialized. A different default can be specified using ;, e.g.

x = [2, 3; 7]int{}

x == [7, 7, 7; 7, 7, 7] // true

A custom fill function can also be specified. This function is passed the index positions for all dimensions. This can be used, if needed, to derive the fill value, e.g.

x = [2, 3]int{} << { $0 + $1 }

x == [0, 1, 2; 1, 2, 3] // true

The default row-major ordering can be overridden with @col, e.g.

x = [2, 3@col]int{}

Or as explicit parameters to make, e.g.

x = make([]int, shape: (2, 3), layout: .column_major, default: 7)

Existing arrays/slices can be reshaped easily, e.g.

x = [0..5]

// Cast it to a different shape to reshape it:
y = [2, 3]int(x)

// Or use the fluent syntax within method chains:
z = x.reshape(2, 3)

y == [0, 1, 2; 3, 4, 5]   // true
y == z                    // true

Casting to a new slice will also do type conversion on the elements if the two slices have different element types. Any optional parameters will be passed along for the type conversion, e.g.

x = []int32{1, 2, 3}

// No parameters needed as upcasting is safe:
y = []int64(x)

// Specify type conversion parameter for downcasting, e.g.
z = []int8(x, policy: truncate)

Standard arithmetic operations on slices tend to perform matrix operations, e.g.

x = [1, 2; 3, 4]
y = [5, 6; 7, 8]

// Matrix multiplication:
z = x * y

z == [19, 22; 43, 50] // true

Slices can be indexed using other slices to get the elements at the given indexes, e.g.

x = [1, 2, 3, 4, 5]
y = x[[0, 2]]

y == [1, 3] // true

Multi-dimensional slices can also be sliced using ;, e.g.

x = [1, 2, 3; 4, 5, 6; 7, 8, 9]

x[..; 1] == [2, 5, 8]                // Column/Dimension slice.
x[..1; 1..2] == [2, 3; 5, 6]         // Range slice.
x[0,2..; -1..] == [3, 2, 1; 9, 8, 7] // Step slice.
x[[0, 2]; [1, 0]] == [2, 7]          // Index slice.

Slice ranges can also be assigned to, e.g.

x = [1..5]
x[..2] = [7, 8, 9]

x == [7, 8, 9, 4, 5] // true

Likewise for multi-dimensional slices, e.g.

x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
x[..; 1] = [23, 24, 25]

x == [1, 23, 3; 4, 24, 6; 7, 25, 9] // true

The .= broadcasting assignment operator can be used to assign a value to each element in a range, e.g.

x = [1..5]
x[..2] .= 9

x == [9, 9, 9, 4, 5] // true

Slices support a broad range of operations that make them easy to use in domains like linear algebra, machine learning, physics, etc.

c = a * b           // Matrix/vector multiplication.
c = a · b           // Inner/dot product.
c = a ⊗ b           // Tensor product (Outer product for vectors; Kronecker product for matrices).
c = a × b           // Cross product.

The operations work as one would expect when they are applied to different types, e.g. multiplying by a scalar value will automatically broadcast, use matrix-vector multiplication when needed, etc.

Slices also support matrix norms and determinants with similar syntax to what’s used in mathematics, e.g.

|a|            // Determinant for square matrices

||a||          // Vector: 2-norm, Matrix: Frobenius norm
||a||_0        // Vector/Matrix: Count non-zeros
||a||_1        // Vector: 1-norm, Matrix: 1-norm
||a||_2        // Vector: 2-norm, Matrix: spectral norm
||a||_inf      // Vector: max norm, Matrix: infinity norm
||a||_max      // Vector: max norm, Matrix: max absolute element
||a||_nuc      // Matrix: nuclear norm
||a||_p        // Vector: p-norm, Matrix: Schatten p-norm
||a||_quad     // Vector: quadratic norm
||a||_w        // Vector: weighted norm, Matrix: weighted Frobenius

Subscripts with {i,j,k,l} syntax can be used on slices for operations in Einstein notation, e.g.

hidden{i,k} = input{i,j} * weights1{j,k} + bias1
output{i,l} = hidden{i,k} * weights2{k,l} + bias2

This is more readable than how frameworks like NumPy handle it, i.e.

hidden = np.einsum("ij,jk->ik", input, weights1) + bias1
output = np.einsum("ij,jk->ik", hidden, weights2) + bias2

This also simplifies the need for a number of functions, e.g. instead of having a trace function, diagonal elements can be easily summed with:

x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
trace = x{i,i}

trace == 15 // true
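For readers unfamiliar with the convention, an index that is repeated within a term is summed over. The plain Python sketch below writes out those sums explicitly for a two-index contraction and for the trace. The function names contract and trace are illustrative and make no claim about how Iroh compiles such expressions.

```python
def contract(a, b):
    """Model of out{i,k} = a{i,j} * b{j,k}: the repeated index j is summed."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    return [
        [sum(a[i][j] * b[j][k] for j in range(inner)) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


def trace(x):
    """Model of x{i,i}: the repeated index i walks the diagonal."""
    return sum(x[i][i] for i in range(len(x)))


print(contract([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
print(trace([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))      # 15
```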

Iroh’s approach also results in higher quality code, e.g. dimension mismatches can be caught at compile-time, the compiler can automatically optimize for the best operation order, etc.

String Data Types

Iroh defaults to Unigraph text as it fixes many of the shortcomings of Unicode. Two specific Unigraph types are provided:

  • string — the default type that’s suitable for most use cases.

  • composable_string — a special type that includes the separately namespaced Unigraph Compose Element IDs that’s useful for building text editors.

Both data types are encoded in UEF (Unigraph Encoding Format). But, where string only consists of Unigraph IDs, composable_string can include both Unigraph IDs and Compose Element IDs.

A number of string types are also provided for dealing with Unicode text:

  • utf8_string — UTF-8 encoded text.

  • wtf8_string — WTF-8 for roundtripping between UTF-8 and broken UTF-16, e.g. on Windows where you can get unpaired surrogates in filenames.

In addition, the encoded_string type takes specific encodings, e.g.

x = encoded_string("Hello", .iso_8859_1)

The supported encodings include:

enum {
    ascii,          // ASCII
    big5,           // Big 5
    cesu_8,         // CESU-8
    cp037,          // IBM Code Page 37
    cp437,          // IBM Code Page 437
    cp500,          // IBM Code Page 500
    cp850,          // IBM Code Page 850
    cp866,          // IBM Code Page 866
    cp1026,         // IBM Code Page 1026
    cp1047,         // IBM Code Page 1047
    cp1361,         // IBM Code Page 1361
    custom,         // User-defined
    euc_jp,         // EUC-JP
    euc_kr,         // EUC-KR
    gb2312,         // GB2312
    gb18030,        // GB18030
    gbk,            // GBK
    hz_gb2312,      // HZ-GB2312
    iscii,          // ISCII
    iso_2022_jp,    // ISO-2022-JP
    iso_2022_kr,    // ISO-2022-KR
    iso_8859_1,     // ISO-8859-1
    iso_8859_2,     // ISO-8859-2
    iso_8859_3,     // ISO-8859-3
    iso_8859_4,     // ISO-8859-4
    iso_8859_5,     // ISO-8859-5
    iso_8859_6,     // ISO-8859-6
    iso_8859_7,     // ISO-8859-7
    iso_8859_8,     // ISO-8859-8
    iso_8859_8_i,   // ISO-8859-8-I
    iso_8859_9,     // ISO-8859-9
    iso_8859_10,    // ISO-8859-10
    iso_8859_11,    // ISO-8859-11
    iso_8859_13,    // ISO-8859-13
    iso_8859_14,    // ISO-8859-14
    iso_8859_15,    // ISO-8859-15
    iso_8859_16,    // ISO-8859-16
    johab,          // Johab (KS C 5601-1992)
    koi8_r,         // KOI8-R
    koi8_u,         // KOI8-U
    mac_cyrillic,   // Mac OS Cyrillic
    mac_greek,      // Mac OS Greek
    mac_hebrew,     // Mac OS Hebrew
    mac_roman,      // Mac OS Roman
    mac_thai,       // Mac OS Thai
    mac_turkish,    // Mac OS Turkish
    shift_jis,      // Shift_JIS
    tis_620,        // TIS-620
    tscii,          // TSCII
    ucs_2,          // UCS-2
    ucs_4,          // UCS-4
    uef,            // UEF
    utf_7,          // UTF-7
    utf_8,          // UTF-8
    utf_16,         // UTF-16
    utf_16be,       // UTF-16BE
    utf_16le,       // UTF-16LE
    utf_32,         // UTF-32
    utf_32be,       // UTF-32BE
    utf_32le,       // UTF-32LE
    viscii,         // VISCII
    windows_874,    // Windows-874
    windows_1250,   // Windows-1250
    windows_1251,   // Windows-1251
    windows_1252,   // Windows-1252
    windows_1253,   // Windows-1253
    windows_1254,   // Windows-1254
    windows_1255,   // Windows-1255
    windows_1256,   // Windows-1256
    windows_1257,   // Windows-1257
    windows_1258,   // Windows-1258
    wtf_8,          // WTF-8
}

Any other encoding can be used by specifying a .custom encoding and providing an encoder that implements the built-in string_encoder interface, e.g.

x = encoded_string(text, .custom, encoder: scsu)

String literals default to the string type and are UEF-encoded. They are constructed using double quoted text, e.g.

x = "Hello world!"

For convenience, pressing backslash, i.e. \, in the Iroh editor will provide a mechanism for entering characters that may otherwise be difficult to type, e.g.

Sequence          Definition
\a                ASCII Alert/Bell
\b                ASCII Backspace
\f                ASCII Form Feed
\n                ASCII Newline
\r                ASCII Carriage Return
\t                ASCII Horizontal Tab
\v                ASCII Vertical Tab
\\                ASCII Backslash
\x + NN           Hex Bytes
\u + NNNN         Unicode Codepoint
\u{ + N... + }    Unigraph ID
\U + NNNNNNNN     Unicode Codepoint

As the Iroh editor automatically converts text into bytes as they are typed, this is purely a convenience for typing and rendering the strings. No escaping is needed internally.

String values can be cast between each type easily, e.g.

x = utf8_string("Hello world!")
y = encoded_string(x, .windows_1252)
z = string(y)

x == z // true

For safety, Unicode text is parsed in .strict mode by default, i.e. invalid characters generate an error. But this can be changed if needed, e.g.

// Skip characters that can't be encoded/decoded:
x = utf8_string(value, errors: .ignore)

// Replace invalid characters with the U+FFFD replacement character:
y = utf8_string(value, errors: .replace)

Iroh also provides a null-terminated c_string that is compatible with C. This can be safely converted between the other types, e.g.

// This c_path value is automatically null-terminated when
// passed to C functions. The memory management for the
// value is also automatically handled by Iroh.
c_path = c_string("/tmp/file.txt")

// When converted back to any of the other string types,
// the null termination is automatically removed.
path = string(c_path)

path == "/tmp/file.txt" // true

The c_string type will generate errors when strings that are being converted have embedded nulls in them. These need to be manually handled with try, e.g.

c_path = try c_string("/tmp/f\x00ile.txt")

The length of all string types can be found by calling len on it, e.g.

x = "Hello world"

len(x) == 11 // true

Calling len returns what most people would expect, e.g.

x = utf8_string("🤦🏼‍♂️")

len(x) == 1 // true

Except for C strings, where it returns the number of bytes, Iroh always defines the length of a string as the number of visually distinct graphemes:

  • For Unicode strings, this is the number of extended grapheme clusters. Note that Unicode’s definition of grapheme clusters can change between different versions of the standard.

  • For Unigraph strings, this is the number of Unigraph IDs which are already grapheme based. Unlike with Unicode strings, this number will never change with new versions of the standard.
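To see why a grapheme-based length differs from other common definitions, the Python sketch below measures the facepalm emoji used above in three other units. It assumes the emoji is encoded as the five code points listed in the comment. Counting extended grapheme clusters, which would give 1, needs a Unicode segmentation library that is not in Python's standard library, so it is left out here.

```python
# U+1F926 FACE PALM, U+1F3FC skin tone modifier, U+200D ZERO WIDTH JOINER,
# U+2642 MALE SIGN, U+FE0F VARIATION SELECTOR-16
facepalm = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"

code_points = len(facepalm)                           # Python counts code points
utf8_bytes = len(facepalm.encode("utf-8"))            # bytes in UTF-8
utf16_units = len(facepalm.encode("utf-16-le")) // 2  # 16-bit units in UTF-16

print(code_points, utf8_bytes, utf16_units)  # 5 17 7
```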

Strings can be concatenated by using the + operator, e.g.

x = "Hello"
y = " world"
z = x + y

z == "Hello world" // true

Multiplying strings with an integer using the * operator will duplicate the string that number of times, e.g.

x = "a"
y = x * 5

y == "aaaaa" // true

Individual graphemes can be accessed from a string using the [] index operator, e.g.

x = "Hello world"
y = x[0]

y == 'H' // true

Unigraph graphemes are represented by the grapheme type, and Unicode ones by the unicode_grapheme type. Grapheme literals are quoted within single quotes, e.g.

char = '🤦🏼'

When graphemes are added or multiplied, they produce strings, e.g.

x = 'G' + 'B'

x == "GB" // true

Strings can also be sliced similar to slice types, e.g.

x = "Hello world"
y = x[..4]

y == "Hello" // true

All string types support the typical set of methods, e.g.

  • x.ends_with — checks if a string ends with a suffix.

  • x.index_of — finds the position of a substring.

  • x.join — join a slice of strings with the value of x.

  • x.lower — lower case the string.

  • x.pad_end — pad the end of a string.

  • x.pad_start — pad the start of a string.

  • x.replace — replace all occurrences of a substring.

  • x.split — split the string on the given substring or grapheme.

  • x.starts_with — checks if a string starts with a prefix.

  • x.trim — remove whitespace from both ends of a string.

  • x.upper — upper case the string.

For safety, all string values are immutable. They can be cast to a []byte or []grapheme to explicitly mutate individual bytes or graphemes, e.g.

x = "Hello world!"
y = []grapheme(x)
y[-1] = '.'

string(y) == "Hello world." // true

As a convenience, Iroh will automatically cast between strings and grapheme slices when values are passed in as parameters. The compiler will also try to minimize unnecessary allocations.

To control how values are formatted, we use explicit parameters, e.g.

x = 255
y = string(x, format: .hex_prefixed)

y == "0xff" // true

Unlike the cryptic format specifiers used in languages like Python, e.g.

amount = 9876.5432
y = f"{amount:>10,.2f}"

y == "  9,876.54" # true

We believe this is a lot clearer, i.e.

amount = 9876.5432
y = string(amount, scale: 2, thousands_separator: ",", width: 10, align: .right)

y == "  9,876.54" // true

A range of format specifiers are available, e.g. quoted strings can be generated by specifying the optional quote parameter:

x = string("Tav", quote: true)
y = string(42, quote: true)
z = string("Hello \"World\"", quote: true)

print(x) // Outputs: "Tav"
print(y) // Outputs: "42"
print(z) // Outputs: "Hello \"World\""

Other format parameters include:

  • casing — change the casing of a string value to snake_case, kebab-case, camelCase, PascalCase, etc.

  • format — to encode into different formats like base64, hex, etc.

  • indent — to control the amount and characters used for indentation.

  • pad_with — pad the output with a given grapheme up to a given width.

  • truncate — truncate a string to a given width, with additional options to control how it gets truncated, e.g. at word boundaries, with a trailing ellipsis, etc.

They can also be provided as parameters on the format method on string values, e.g.

x = "Tav"
y = x.format(quote: true)

print(y) // Outputs: "Tav"

Sub-expressions can be interpolated into strings using 3 different forms:

  • ${expr} — evaluates the expression and converts the result into a string.

  • ={expr} — same as above, but also prefixes the result with the original expression followed by the = sign.

  • #{expr} — validates the expression, but instead of evaluating it, passes through the “parsed” syntax.

Interpolation using ${expr} works as one would expect, e.g.

name = "Tav"

print("Hello ${name}!") // Outputs: Hello Tav!

It supports any Iroh expression, e.g.

name = "Tav"

print("Hello ${name.filter { $0 != 'T' }}!") // Outputs: Hello av!

Interpolated expressions are wrapped within string( ... ) calls, so any formatting parameters passed to string can be specified after a comma, e.g.

amount = 9876.5432

print("${amount, scale: 2, thousands_separator: ","}") // Outputs: 9,876.54

Interpolation using ={expr} is useful for debugging as it also prints the expression being evaluated, e.g.

x = 100

print("={x + x}") // Outputs: x + x = 200

Interpolation using #{expr} is only valid within template strings, where it allows for domain-specific evaluation of Iroh expressions, e.g.

time.now().format("#{weekday_short}, #{day} #{month_full} #{year4}")

// Outputs: Tue, 5 August 2025

Any function that takes a template value can be used to construct tagged template literals. These functions evaluate literals at compile-time for optimal performance, e.g.

content = html`<!doctype html>
<body>
  <div>Hello!</div>
</body>`

Finally, the []byte and string types support from and to methods to convert between different binary-to-text formats, e.g.

x = "Hello world"
y = x.to(.hex)
z = y.from(.hex)

y == "48656c6c6f20776f726c64"  // true
z == "Hello world"             // true
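For comparison, the same round trip can be reproduced in Python. It is used here only as a runnable check of the hex value shown above and is not Iroh code.

```python
# Round-trip "Hello world" through lowercase hex, mirroring x.to(.hex)
# and y.from(.hex) in the Iroh example above.
x = "Hello world"
y = x.encode("utf-8").hex()            # text -> bytes -> hex digits
z = bytes.fromhex(y).decode("utf-8")   # hex digits -> bytes -> text

print(y)  # 48656c6c6f20776f726c64
print(z)  # Hello world
```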

The supported formats include:

enum {
    ascii85,                    // Ascii85 (Adobe variant)
    ascii85_btoa,               // Ascii85 (btoa variant)
    base2,                      // Base2 (binary string as text)
    base16,                     // Base16 (alias of hex)
    base32,                     // Base32 (RFC 4648, padded)
    base32_unpadded,            // Base32 (RFC 4648, unpadded)
    base32_crockford,           // Base32 (Crockford, padded)
    base32_crockford_unpadded,  // Base32 (Crockford, unpadded)
    base32_hex,                 // Base32 (RFC 4648, extended hex alphabet, padded)
    base32_hex_unpadded,        // Base32 (RFC 4648, extended hex alphabet, unpadded)
    base36,                     // Base36 (0-9, A-Z)
    base45,                     // Base45 (RFC 9285, QR contexts)
    base58_bitcoin,             // Base58 (Bitcoin alphabet)
    base58_check,               // Base58 with 4-byte double-SHA256 checksum (no version)
    base58_flickr,              // Base58 (Flickr alphabet)
    base58_ripple,              // Base58 (Ripple alphabet)
    base62,                     // Base62 (URL shorteners, compact IDs)
    base64,                     // Base64 (RFC 4648, padded)
    base64_unpadded,            // Base64 (RFC 4648, unpadded)
    base64_urlsafe,             // Base64 (RFC 4648, URL-safe, padded)
    base64_urlsafe_unpadded,    // Base64 (RFC 4648, URL-safe, unpadded)
    base64_wrapped_64,          // Base64 (64-char wrapping, no headers)
    base64_wrapped_76,          // Base64 (76-char wrapping, no headers)
    base85_rfc1924,             // Base85 (RFC 1924 for IPv6 addresses)
    base91,                     // Base91 (efficient binary-to-text)
    base122,                    // Base122 (efficient binary-to-utf8-codepoints)
    binary,                     // Binary string (alias of base2)
    binary_prefixed,            // Binary string (0b prefix)
    hex,                        // Hexadecimal (lowercase)
    hex_prefixed,               // Hexadecimal (lowercase, 0x prefix)
    hex_upper,                  // Hexadecimal (uppercase)
    hex_upper_prefixed,         // Hexadecimal (uppercase, 0x prefix)
    hex_eip55,                  // Hexadecimal (EIP-55 mixed-case checksum)
    html_attr_escape,           // Escape HTML attribute values
    html_escape,                // Escape HTML text
    json_escape,                // JSON string escaping
    modhex,                     // ModHex (YubiKey alphabet)
    percent,                    // Percent/URL encode
    percent_component,          // Percent/URL encode (RFC 3986, component-safe set)
    punycode,                   // Punycode (internationalized domains)
    quoted_printable,           // Quoted-printable (email)
    rot13,                      // ROT13 letter substitution
    rot47,                      // ROT47 ASCII substitution
    uuencode,                   // UUEncoding (data body only, fixed line length)
    xml_attr_escape,            // Escape XML attribute values
    xml_escape,                 // Escape XML character data
    xxencode,                   // XXEncoding (data body only, fixed line length)
    yenc,                       // yEnc (Usenet binary)
    z85,                        // Z85 (ZeroMQ variant)
    z_base32,                   // Base32 (Zooko, z-base-32 human-friendly variant)
}

Optionals

Iroh supports optional types which can either be nil or a value of a specific type, e.g.

x = optional[string](nil)
x == nil     // true

x = "Hello"
x == "Hello" // true

A shorthand ?type syntax is available too and defaults to nil if no value is provided, e.g.

x = ?string()
x == nil     // true

x = "Hello"
x == "Hello" // true

For the following examples, let’s assume an optional name field within a Person struct, e.g.

Person = struct {
    name  ?string
}

Optionals default to nil, e.g.

person = Person{}

person.name == nil // true

Values can be easily assigned, e.g.

person = Person{}

person.name = "Alice"

When assigning nil to nested optional types, it either needs to be explicitly disambiguated, or the nil value applies to the top-most level, e.g.

x = ??string("Test")

x = ?string(nil)  // Inner optional
x = nil           // Outer optional

Optional values can be explicitly unwrapped by using the ! operator, e.g.

name = person.name!

type(name) == string // true

Unwrapping a nil value will generate an error. To safely unwrap, conditionals with the ? operator can be used, e.g.

if person.name? {
    print(person.name.upper()) // person.name has been unwrapped into a string here
} else {
    // Handle the case where person.name is nil
}

Because Iroh automatically narrows types based on conditionals with the ? operator and on explicit unwraps with the ! operator, optionals are both simple and safe to use.

Variables and fields that were initially typed as optional retain their optional type even after being narrowed via an unwrap, so nil can still be assigned to them, e.g.

if person.name? {
    type(person.name) == string  // true
    person.name = nil            // Valid assignment.
}

Multiple optionals can be unwrapped in the same conditional, e.g.

if person.name? and vip_list? {
    // Both person.name and vip_list are unwrapped here.
}

Optional lookups can be chained, e.g.

display_name = person.name?.upper()

type(display_name) == ?string // true

The value at the end of an optional chain is always an optional. The ?? operator can be used to provide a default if the left side is nil, e.g.

display_name = person.name?.upper() ?? "ANONYMOUS"

type(display_name) == string // true

Assignments can use ?= to only assign to a variable or field if it’s not nil, e.g.

person.name ?= "Alice"

Optionals work wherever an expression returns a value, e.g. after map lookups:

users[id]?.update_last_seen()

And can be used wherever types are accepted, e.g. in function parameters:

func greet(name ?string) {
    print("Hello ${name ?? "there"}!")
}

When comparisons are made to a typed value, optionals are automatically unwrapped as necessary, e.g.

// The person.name value is implicitly unwrapped safely
// before the comparison:
if person.name == "Alice" {
    // person.name has been unwrapped to a string with
    // value "Alice"
} else {
    // person.name can be either nil or a string value
    // other than "Alice"
}

The various facets of our optionals system eliminate the issues caused by nil pointers, while maintaining a clean, readable syntax with predictable behaviour.

Map & Set Data Types

Iroh’s map data types provide support for managing key/value data within hash tables. Map types are generic over their key and value type, e.g.

profit = map[string]fixed128{
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

Where the types can be inferred, map values can be constructed using a {} literal syntax, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

Duplicate keys in map literals will cause an edit-time error, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
    "Consumer": 2239.50,    // ERROR!
}

As types cannot be inferred from an empty {} map literal, empty maps can only be initialized with their explicit type, e.g.

profit = {} // ERROR!

profit = map[int]fixed128{} // Valid.

Maps are transparently resized as keys are added to them. To minimize the amount of resizing, maps can be initialized with an initial capacity, e.g.

// This map will only get resized once there are more
// than 5,000 keys in it:
profit = make(map[int]fixed128, cap: 5000)

Keys can be set to a value by assigning using the [] index operator, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

profit["Developer"] = 9368.30

len(profit) == 3 // true

Values assigned to a key can also be retrieved using the [] operator, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

consumer_profit = profit["Consumer"]
consumer_profit == 1739.50 // true

Assigning a value to a key will automatically overwrite any previous value associated with that key. To only assign a value if the key doesn’t exist, the set_if_missing method can be used, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

// No effect as key exists.
profit.set_if_missing("Consumer", 0.0)
profit["Consumer"] == 1739.50 // true

// Value set for key as it doesn't already exist.
profit.set_if_missing("Developer", 9368.30)
profit["Developer"] == 9368.30 // true

When values are retrieved from a map using the [] operator, the zero value is returned if the key has not been set, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

dev_profit = profit["Developer"]
dev_profit == 0.0 // true

If an alternative default value is desired, this can be specified using the default parameter when calling make, e.g.

seen = make(map[string]bool, default: true)
tav_seen = seen["Tav"]

tav_seen == true // true

A custom initializer function can also be specified that derives a default value from the key, e.g.

// Users are fetched from the DB on first lookup.
user_cache = make(map[int]User, default: { db.get_user(id: $0) })
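One way to picture a key-derived default is as a lookup that calls the initializer only when the key is missing. The Python sketch below models that idea. The LazyMap class and fake_db_get_user function are hypothetical, and whether Iroh stores the computed value back into the map is an assumption made here.

```python
class LazyMap(dict):
    """Dict whose missing keys are filled in by a key-based initializer."""

    def __init__(self, initializer):
        super().__init__()
        self._initializer = initializer

    def __missing__(self, key):
        # Called by dict only when `key` is absent. Assumption: the
        # computed value is stored so that later lookups reuse it.
        value = self._initializer(key)
        self[key] = value
        return value


calls = []

def fake_db_get_user(user_id):
    # Hypothetical stand-in for db.get_user(id: $0); records each simulated DB hit.
    calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


user_cache = LazyMap(fake_db_get_user)
first = user_cache[42]    # triggers the initializer
second = user_cache[42]   # served from the map, no second DB hit

print(first["name"], len(calls))  # user-42 1
```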

To check whether a key exists in a map, the in operator can be used, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

consumer_profit_exists = "Consumer" in profit
consumer_profit_exists == true // true

if "Developer" in profit {
    // As "Developer" has not been set in the map, this code
    // block will not execute.
}

To safely access a key’s value, the get method can be used. This will return nil for non-existent keys, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

consumer_profit = profit.get("Consumer")
consumer_profit == 1739.50 // true

dev_profit = profit.get("Developer")
dev_profit == nil // true

To minimize any confusion caused by nested optionals being returned by the get method, maps do not support using optional types as keys, e.g.

lookup = map[?string]int{} // ERROR!

The get method can be passed an optional default parameter that will be returned if the given key is not found, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

dev_profit = profit.get("Developer", default: 0.0)
dev_profit == 0.0 // true

Keys can be removed using the delete method, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

profit.delete("Consumer")
len(profit) == 1 // true

Maps support equality checks, e.g.

x = {1: true}
y = {1: true}

x == y // true

Maps can be iterated using for loops, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

for key in profit {
    print(key) // Outputs: Consumer, Enterprise
}

For safety, iteration order is non-deterministic in all execution modes except onchain-script, where it is deterministic and based on the transaction hash.

By default, iteration will return the map keys. To retrieve both the keys and values, the 2-variable iteration form can be used, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

for key, value in profit {
    print("${key} = ${value}") // Outputs: Consumer = 1739.50, Enterprise = 4012.80
}

If just the values are desired, then the values method can be called, e.g.

profit = {
    "Consumer": 1739.50,
    "Enterprise": 4012.80,
}

for value in profit.values() {
    print(value) // Outputs: 1739.50, 4012.80
}

Maps provide a number of other utility methods, e.g.

  • clear — removes all entries from the map.

  • compute — update the specified key by applying the given function on the current value.

  • copy — create a copy of the map.

  • deep_copy — create a deep copy of the map.

  • entries — returns a slice of key/value pairs.

  • filter — create a new map with key/value pairs matching a given predicate function.

  • filter_map — apply a function to each key/value pair and keep only the non-nil results as a new map.

  • find — find a key/value pair matching the given predicate function.

  • get_or_insert — get value for the given key, or insert the given default value if the key has not been set.

  • group_by — group entries into nested maps.

  • invert — create a new map with the keys and values swapped.

  • keys — returns a slice of just the keys.

  • map_values — transform all values according to the given transform function.

  • merge — merge another map into this one.

  • pop — delete the given key and return its value if it exists.

Any hashable type can be used as the key type within a map. Most built-ins like strings and numbers are hashable. User-defined types are automatically hashable if all of their fields are hashable.

If a user-defined type is unhashable, it can implement a __hash__ method to make it hashable. This can also be useful to hash it on a unique value, e.g. a primary key.

These methods can make use of the built-in hash function that is used to hash built-in types, e.g.

Person = struct {
    id          int64
    first_name  string
    last_name   string
}

func (p Person) __hash__() int64 {
    return hash(p.id)
}

// Person can now be used as a key within map types, e.g.
balances = map[Person]fixed128{}

Certain built-in types like slices and sets are not hashable as they are mutable. To use them as keys, they first need to be converted into immutable const values, e.g.

accessed = map[![]string]int{} // The ! in a type spec states that it is const
file_md = const ["path", "to", "file.md"]
accessed[file_md] = 1

The orderedmap data type behaves exactly like a map except that it keeps track of insertion order and has deterministic iteration order, e.g.

fruits = orderedmap{
    "apple": 10,
}

fruits["orange"] = 5
fruits["banana"] = 20

for fruit in fruits {
    print(fruit) // Outputs: apple, orange, banana
}

Iroh also supports set data types for unordered collections without any repeated elements, e.g.

x = set[int]{1, 2, 3}

Where the type can be inferred, sets can be constructed using a {} literal syntax, e.g.

x = {1, 2, 3}

Duplicates in set literals will cause an edit-time error, e.g.

x = {1, 2, 3, 2} // ERROR!

Like with maps, empty sets can only be initialized with their explicit type, e.g.

x = {} // ERROR!

x = set[int]{} // Valid.

Elements can be added to a set using the add method, e.g.

x = {1, 2, 3}
x.add(4)

len(x) == 4 // true

Elements can be removed using the remove method, e.g.

x = {1, 2, 3}
x.remove(3)

len(x) == 2 // true

Sets can be iterated using for loops, e.g.

x = {1, 2, 3}

for elem in x {
    print(elem) // Outputs: 1, 2, 3 in some order
}

Like with maps, the iteration order of sets is non-deterministic in all modes except onchain-script where it will be deterministic and based on the transaction state.

Checking if an element exists in a set can be done using the in keyword, e.g.

x = {1, 2, 3}
y = 3 in x

y == true // true

if 4 in x {
    // As 4 is not in the set, this code block will not execute.
}

Sets can be combined using the typical operations, e.g.

x = {1, 2, 3, 4}
y = {3, 4, 5, 6}

x.union(y)                 // {1, 2, 3, 4, 5, 6}
x.intersection(y)          // {3, 4}
x.difference(y)            // {1, 2}
x.symmetric_difference(y)  // {1, 2, 5, 6}

For brevity, the equivalent operators can also be used, e.g.

x = {1, 2, 3, 4}
y = {3, 4, 5, 6}

x | y   // x.union(y)
x & y   // x.intersection(y)
x - y   // x.difference(y)
x ^ y   // x.symmetric_difference(y)

These methods all return a new set and leave the original untouched. When efficient memory usage is needed, in-place variants of the methods are also available, prefixed with in_place_, e.g.

x = {1, 2, 3, 4}
y = {3, 4, 5, 6}
x.in_place_union(y)

x == {1, 2, 3, 4, 5, 6} // true

Sets of the same type are also comparable to determine if a set is a subset or superset of another, e.g.

x = {1, 2, 3}
y = {1, 2, 3, 4, 5, 6}

if x <= y {
    // x is a subset of y
}

if x < y {
    // x is a proper subset of y
}

if y > x {
    // y is a proper superset of x
}
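Python's built-in set type uses the same comparison operators, so the difference between a subset and a proper subset can be checked there. This is only an analogy, and Iroh's exact semantics are assumed to match.

```python
x = {1, 2, 3}
y = {1, 2, 3, 4, 5, 6}

print(x <= y)  # True: every element of x is in y
print(x < y)   # True: x is a subset of y and x != y
print(y > x)   # True: y is a proper superset of x
print(x <= x)  # True: a set is a subset of itself
print(x < x)   # False: a set is never a proper subset of itself
```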

Boolean & Logic Types

Iroh supports basic boolean logic via the bool type which has the usual values:

  • true

  • false

The bool values can be negated using the ! operator, e.g.

x = true
y = !x

y == false // true

Boolean logic is applied by the and and or operators, which evaluate in a short-circuiting manner, e.g.

debug = false

// When debug is false, the fetch_instance_id call is never made.
if debug and fetch_instance_id() {
    ...
}

Iroh also supports a number of other logic types. Three-valued logic is supported by the bool3 type which has the following values:

  • true

  • false

  • unknown

The unknown value expands the usual logic operations with the following combinations:

Expression           Result
true and unknown     unknown
false and unknown    false
true or unknown      true
false or unknown     unknown
!unknown             unknown
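The rows above match Kleene's strong three-valued logic. As a rough model outside Iroh, the Python sketch below reproduces them, using None to stand in for unknown. The helper names and3, or3, and not3 are made up for illustration and are not part of Iroh.

```python
# Model of the bool3 rows above, with None standing in for `unknown`.
# The helper names are hypothetical; Iroh itself uses `and`, `or`, and `!`.

def and3(a, b):
    if a is False or b is False:
        return False   # a definite false decides the result
    if a is None or b is None:
        return None    # otherwise unknown propagates
    return True

def or3(a, b):
    if a is True or b is True:
        return True    # a definite true decides the result
    if a is None or b is None:
        return None    # otherwise unknown propagates
    return False

def not3(a):
    return None if a is None else not a

print(and3(True, None))   # None  (unknown)
print(and3(False, None))  # False
print(or3(True, None))    # True
print(or3(False, None))   # None  (unknown)
print(not3(None))         # None  (unknown)
```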

While three-valued logic can be emulated via an optional bool, i.e. ?bool, the explicit use of unknown instead of nil makes intent clearer in domains like knowledge representation.

Four-valued logic is supported by the bool4 type with the values:

  • true

  • false

  • unknown

  • conflicting

The conflicting value expands the usual logic operations with the following combinations:

Expression                 Result
true and conflicting       conflicting
false and conflicting      false
unknown and conflicting    conflicting
true or conflicting        true
false or conflicting       conflicting
unknown or conflicting     conflicting
!conflicting               conflicting

The use of bool4 can make logic much easier to follow, especially in domains like distributed database conflicts, consensus algorithms, handling forks in blockchains, etc.

Boolean values can be safely upcast to higher dimensions, e.g.

txn_executed = true
initial_status = bool4(txn_executed)

initial_status == true // true

The fuzzy type supports fuzzy logic with fixed128 values between 0.0 (false) and 1.0 (true):

apply_brakes = fuzzy(1.0)

apply_brakes == true // true

Fuzzy logic is applied when evaluating fuzzy values, e.g.

Expression                   Mechanism    Result
fuzzy(0.7) and fuzzy(0.4)    min(a, b)    fuzzy(0.4)
fuzzy(0.7) or fuzzy(0.4)     max(a, b)    fuzzy(0.7)
!fuzzy(0.7)                  1 - value    fuzzy(0.3)
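The three mechanisms in the table can be modelled in a few lines of Python. Decimal is used as a stand-in for Iroh's fixed128 so that 1 - 0.7 gives exactly 0.3, and the function names are hypothetical.

```python
from decimal import Decimal

# Hypothetical helpers modelling the table: and -> min, or -> max, ! -> 1 - value.
def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

def fuzzy_not(a):
    return 1 - a

a = Decimal("0.7")
b = Decimal("0.4")

print(fuzzy_and(a, b))  # 0.4
print(fuzzy_or(a, b))   # 0.7
print(fuzzy_not(a))     # 0.3
```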

Fuzzy membership is supported by various membership functions on slice data types that assign each element of the slice a value that represents its degree of membership, e.g.

  • x.gaussian_membership

  • x.s_shaped_membership

  • x.sigmoidal_membership

  • x.trapezoidal_membership

  • x.triangular_membership

  • x.z_shaped_membership

The in operator can then be used to return the fuzzy degree of membership of a value, e.g.

// Temperature and humidity ranges.
temperature = [0..100]
humidity = [0..100]

// Temperature and humidity membership functions.
temp_low = temperature.trapezoidal_membership(0, 0, 20, 30)
temp_mid = temperature.trapezoidal_membership(25, 40, 60, 75)
temp_high = temperature.trapezoidal_membership(70, 80, 100, 100)

humid_low = humidity.triangular_membership(0, 0, 50)
humid_high = humidity.triangular_membership(50, 100, 100)

// Current conditions.
current_temp = 25
current_humidity = 60

// Get membership degrees.
temp_low_deg = current_temp in temp_low
temp_mid_deg = current_temp in temp_mid
temp_high_deg = current_temp in temp_high

humid_low_deg = current_humidity in humid_low
humid_high_deg = current_humidity in humid_high

// The membership degree values are fuzzy values:
type(temp_low_deg) == fuzzy // true
type(humid_low_deg) == fuzzy // true

// They can now be used to apply rules under fuzzy logic, e.g.
if temp_high_deg and humid_high_deg {
    set_fan_speed(.high)
} else if temp_mid_deg and humid_low_deg {
    set_fan_speed(.medium)
}
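To make the membership degrees concrete, the sketch below evaluates the same parameters in Python using the common textbook definitions of trapezoidal and triangular membership functions. It assumes that Iroh's parameters are the usual (a, b, c, d) and (a, b, c) breakpoints. The function names are illustrative only.

```python
def trapezoidal(a, b, c, d):
    """Membership that rises on [a, b], is 1 on [b, c], and falls on [c, d]."""
    def degree(x):
        if x < a or x > d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)   # rising edge (only reached when a < b)
        if x <= c:
            return 1.0                 # plateau
        return (d - x) / (d - c)       # falling edge (only reached when c < d)
    return degree

def triangular(a, b, c):
    """A triangle is a trapezoid whose plateau is the single point b."""
    return trapezoidal(a, b, b, c)

temp_low = trapezoidal(0, 0, 20, 30)
temp_mid = trapezoidal(25, 40, 60, 75)
temp_high = trapezoidal(70, 80, 100, 100)
humid_low = triangular(0, 0, 50)
humid_high = triangular(50, 100, 100)

current_temp = 25
current_humidity = 60

print(temp_low(current_temp))        # 0.5
print(temp_mid(current_temp))        # 0.0
print(temp_high(current_temp))       # 0.0
print(humid_low(current_humidity))   # 0.0
print(humid_high(current_humidity))  # 0.2
```

Under these assumptions, and with and evaluated as min, both rules in the example above have a strength of 0 for the current conditions.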

This is extremely useful in various domains like AI decision making, IoT, control systems, risk assessment, recommendation engines, autonomous vehicles, etc.

Probabilistic logic is supported by the probability data type, e.g.

rain = probability(0.3, .bernoulli)

A range of different distribution types are supported:

enum {
    bernoulli,
    beta,
    beta_binomial,
    binomial,
    categorical,
    cauchy,
    chi_squared,
    dirichlet,
    discrete_uniform,
    exponential,
    f_distribution,
    gamma,
    geometric,
    gumbel,
    hypergeometric,
    laplace,
    logistic,
    lognormal,
    multinomial,
    multivariate_normal,
    negative_binomial,
    normal,
    pareto,
    poisson,
    power_law,
    rayleigh,
    student_t,
    triangular,
    uniform,
    weibull,
    zipf,
}

Each distribution type requires different parameters depending on its mathematical definition, e.g.

weather = probability([
    ("sunny", 0.6),
    ("rainy", 0.3),
    ("cloudy", 0.1)
], .categorical)

height = probability(175.0<cm>, .normal, std: 7.0<cm>)

customers_per_hour = probability(5.0, .poisson)

income = probability(50000.0, .lognormal, sigma: 0.5)

successful_trials = probability(0.3, .binomial, trials: 20)

The likelihood of a given value can be queried by using the p method. This returns the concrete probability or, for continuous distributions, the probability density, e.g.

dice = probability(1, .discrete_uniform, max: 6)

// The chance of a 3 being rolled:
chance = dice.p(3)

chance == 0.166666666666666667 // true

The probability that a value falls within a specific range can be calculated using the p_between method, e.g.

dice = probability(1, .discrete_uniform, max: 6)

// The chance of a 4-6 being rolled:
chance = dice.p_between(4, 6)

chance == 0.5 // true
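The two results for the die can be checked by hand by treating it as a discrete uniform distribution over the outcomes 1 to 6. The Python sketch below does this with exact fractions. The function names mirror the Iroh methods but are otherwise hypothetical, and inclusive range bounds are an assumption.

```python
from fractions import Fraction

def p(value, low=1, high=6):
    """Probability of a single outcome of a fair discrete uniform on low..high."""
    if low <= value <= high:
        return Fraction(1, high - low + 1)
    return Fraction(0)

def p_between(start, end, low=1, high=6):
    """Probability that the outcome lies in start..end (both ends inclusive, assumed)."""
    return sum((p(v, low, high) for v in range(start, end + 1)), Fraction(0))

print(p(3))             # 1/6
print(p_between(4, 6))  # 1/2
```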

The sample method draws a random value from the distribution and sample_n returns a slice of multiple samples of the specified length, e.g.

dice = probability(1, .discrete_uniform, max: 6)

roll = dice.sample()
rolls = dice.sample_n(10)

len(rolls) == 10 // true

Some distributions are naturally boolean, e.g.

coin = probability(0.5, .bernoulli)

success = probability(0.3, .binomial, trials: 1)

Others can be converted to a boolean distribution with a predicate, or with the in_range and in_set membership tests, e.g.

temperature = probability(22.0, .normal, std: 3.0)

weather = probability([
    ("sunny", 0.6),
    ("rainy", 0.3),
    ("cloudy", 0.1)
], .categorical)

// Convert with predicates, e.g.
is_hot = temperature > 25.0
is_sunny = weather == "sunny"

// Membership tests, e.g.
likely_fever = temperature.in_range(38.0, 42.0)
bad_weather = weather.in_set({"rainy", "cloudy"})

Boolean distributions can then be used within conditionals either using an explicit threshold, e.g.

if collision_risk.p(true) > 0.1 {
    apply_emergency_brakes()
}

Or implicitly where a probability > 0.5 indicates true, e.g.

if likely_fever {
    prescribe_medication()
}

Boolean distributions can be combined with normal bool values, e.g.

rain = probability(0.7, .bernoulli)
weekend = true

stay_inside = rain and weekend
go_out = !rain or weekend

maybe_picnic = !rain and weekend
if maybe_picnic {
    print("Let's have a picnic!")
}

Assuming distributions are statistically independent, probabilistic logic operations can be applied to them, e.g.

// Garden watering logic.
rain = probability(0.7, .bernoulli)
sprinkler = probability(0.3, .bernoulli)

wet_from_rain_or_sprinkler = rain or sprinkler

// Autonomous driving.
safe_to_proceed = (visibility > 100.0) and weather.in_set({"sunny", "cloudy"})
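For independent events, the usual rules are P(A and B) = P(A) * P(B), P(A or B) = P(A) + P(B) - P(A) * P(B), and P(not A) = 1 - P(A). The Python sketch below applies them to the garden watering numbers. It assumes Iroh combines Bernoulli distributions this way, and the helper names are hypothetical.

```python
from fractions import Fraction

# Standard rules for independent events; assumed to be what Iroh applies.
def p_and(a, b):
    return a * b

def p_or(a, b):
    return a + b - a * b

def p_not(a):
    return 1 - a

rain = Fraction(7, 10)
sprinkler = Fraction(3, 10)

print(p_or(rain, sprinkler))   # 79/100
print(p_and(rain, sprinkler))  # 21/100
print(p_not(rain))             # 3/10
```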

Boolean distributions can also be compared against each other, e.g.

market_crash = probability(0.15, .bernoulli)
recession = probability(0.25, .bernoulli)

if recession > market_crash {
    adjust_investment_strategy()
}

Constraints for Bayesian inference can be defined using the observe method on probability types to create probability_observation values, e.g.

rain = probability(0.3, .bernoulli)
sprinkler = probability(0.1, .bernoulli)
wet_grass = rain or sprinkler
evidence_wet = wet_grass.observe(true)

These can then be used with the | and & operators to evaluate conditional probabilities, e.g.

// Posterior distribution for rain given evidence of observed wet grass.
rain_given_wet = rain | wet_grass.observe(true)

// Posterior distribution for disease given evidence of fever and cough.
disease_given_symptoms = disease | fever.observe(true) & cough.observe(true)
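As a worked check of the rain example, the posterior can be computed by enumerating every combination of rain and sprinkler, keeping only the cases consistent with the evidence, and renormalizing. The Python sketch below assumes that rain and sprinkler are independent with the priors 0.3 and 0.1 used earlier. It illustrates the arithmetic only, not how Iroh performs inference internally.

```python
from fractions import Fraction
from itertools import product

p_rain = Fraction(3, 10)
p_sprinkler = Fraction(1, 10)

p_wet = Fraction(0)           # P(wet_grass)
p_rain_and_wet = Fraction(0)  # P(rain and wet_grass)

for rain, sprinkler in product([True, False], repeat=2):
    # Probability of this combination under independence.
    weight = (p_rain if rain else 1 - p_rain) * (
        p_sprinkler if sprinkler else 1 - p_sprinkler
    )
    wet_grass = rain or sprinkler
    if wet_grass:  # keep only cases matching the observed evidence
        p_wet += weight
        if rain:
            p_rain_and_wet += weight

# Bayes: P(rain | wet_grass) = P(rain and wet_grass) / P(wet_grass)
rain_given_wet = p_rain_and_wet / p_wet
print(rain_given_wet)  # 30/37, roughly 0.81
```

Observing wet grass raises the probability of rain from the prior of 0.3 to about 0.81.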

For correlated variables, the joint_distribution type can be used, e.g.

financial_risks = joint_distribution([
    ("market_crash", probability(0.15, .bernoulli)),
    ("recession", probability(0.25, .bernoulli))
], correlation: 0.8)

Multiple variables can define correlations following the standard upper triangle order of a correlation matrix, e.g.

financial_risks = joint_distribution([
    ("market_crash", probability(0.15, .bernoulli)),
    ("recession", probability(0.25, .bernoulli)),
    ("inflation_spike", probability(0.20, .bernoulli))
], correlations: [0.8, 0.3, 0.6])
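Assuming that "upper triangle order" means reading the correlation matrix row by row above the diagonal, the three values map to the pairs (market_crash, recession), (market_crash, inflation_spike), and (recession, inflation_spike). The Python sketch below expands such a list into a full symmetric matrix. The correlation_matrix helper is hypothetical.

```python
from itertools import combinations

def correlation_matrix(names, correlations):
    """Expand upper-triangle correlations (row by row) into a symmetric matrix."""
    n = len(names)
    pairs = list(combinations(range(n), 2))  # (0, 1), (0, 2), (1, 2), ...
    if len(correlations) != len(pairs):
        raise ValueError(f"expected {len(pairs)} correlations, got {len(correlations)}")
    # Start from the identity: every variable correlates perfectly with itself.
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (i, j), rho in zip(pairs, correlations):
        matrix[i][j] = rho
        matrix[j][i] = rho  # mirror into the lower triangle
    return matrix

names = ["market_crash", "recession", "inflation_spike"]
matrix = correlation_matrix(names, [0.8, 0.3, 0.6])

for name, row in zip(names, matrix):
    print(name, row)
```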

These support the standard operations while maintaining the correlations within the joint distribution, e.g.

crash, recession, inflation = financial_risks.sample()

The correlated probability values can be accessed via the dists method to do various calculations, e.g.

market_crash, recession, inflation_spike = financial_risks.dists()

// Joint probabilities, e.g.
disaster = market_crash and recession

// Conditional probabilities, e.g.
recession_given_crash = recession | market_crash.observe(true)

Our support for rich probabilistic logic makes it useful in various domains like financial modelling, risk modelling, statistical modelling, Bayesian inference, medical diagnosis, robotics, etc.

Struct Data Types

Iroh provides struct types to group together related fields under a single data type, e.g.

Person = struct {
    name     string
    age      int
    height   fixed128<cm>
    siblings []Person
}

Struct values can be initialized with braces and individual fields can be accessed using . notation, e.g.

zeno = Person{
    name: "Zeno",
    age: 11,
    height: 136<cm>
}

print(zeno.name) // Outputs: Zeno

Unspecified values are zero-initialized, e.g.

  • Numeric types default to 0.

  • Boolean types default to false.

  • String types default to "", the empty string.

  • Slice types default to [], the empty slice.

  • Optional types default to nil.

  • Struct types have all their values zero initialized.

For example:

alice = Person{name: "Alice"}

alice.age == 0        // true
alice.siblings == []  // true

Struct definitions can specify non-zero default values, e.g.

Config = struct {
    retries: 5
    timeout: 60<s>
}

cfg = Config{retries: 3}

cfg.retries == 3      // true
cfg.timeout == 60<s>  // true

Struct fields can be assigned to directly, e.g.

cfg = Config{}
cfg.retries = 7

cfg.retries == 7 // true

If a field being initialized matches an existing variable name, the {ident: ident} initialization can be simplified to just {ident}, e.g.

retries = 3
cfg = Config{retries, timeout: 10<s>}

cfg.retries == 3 // true

Likewise, the ... splat operator can be used to assign all fields of an existing struct to another, with later assignments taking precedence over earlier ones, e.g.

ori = Config{retries: 3, timeout: 10<s>}

cfg1 = Config{...ori}
cfg1 == {retries: 3, timeout: 10<s>} // true

cfg2 = Config{...ori, retries: 5}
cfg2 == {retries: 5, timeout: 10<s>} // true

cfg3 = Config{retries: 5, ...ori}
cfg3 == {retries: 3, timeout: 10<s>} // true

cfg4 = Config{retries: 5, ...ori, timeout: 20<s>}
cfg4 == {retries: 3, timeout: 20<s>} // true
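The "later wins" rule behaves like dictionary unpacking in Python, where entries are applied left to right. The sketch below replays the four cases with plain dicts standing in for Config values and seconds written as bare integers. It is an analogy, not Iroh code.

```python
ori = {"retries": 3, "timeout": 10}

cfg1 = {**ori}                               # plain copy
cfg2 = {**ori, "retries": 5}                 # explicit field after the splat wins
cfg3 = {"retries": 5, **ori}                 # splat after the explicit field wins
cfg4 = {"retries": 5, **ori, "timeout": 20}  # each key keeps its last assignment

print(cfg1)  # {'retries': 3, 'timeout': 10}
print(cfg2)  # {'retries': 5, 'timeout': 10}
print(cfg3)  # {'retries': 3, 'timeout': 10}
print(cfg4)  # {'retries': 3, 'timeout': 20}
```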

Iroh automatically detects when a struct should be passed as a pointer or copied based on some heuristics, so there is no need to specify whether it should be a pointer, e.g.

  • If a function/method needs to mutate a struct value, it is passed as a pointer.

  • Where there is no mutation, and the struct value is small enough, it is just copied.

For example:

// Person is automatically passed as a pointer as it is mutated:
func update_age(person Person, new_age int) {
    person.age = new_age
}

// Config value is copied as it is not mutated and small enough:
func get_retries(cfg Config) int {
    return cfg.retries
}

Occasionally, for performance reasons, it may be necessary to annotate the exact form that’s passed in, though this should rarely be needed, e.g.

// Person is passed in as a *pointer even though it's not mutated:
func process_application(person *Person, info *Submission) {
    ...
}

// The time value is passed in by !value and copied:
func get_local_time(t !UnixTime) {
    ...
}

All pointers, outside of the [*] pointers we have for interoperating with C, are always non-null and automatically dereferenced on lookups and assignments.

Tuple Data Types

Iroh supports tuples for grouping a fixed number of values together. Tuple values are enclosed within () parentheses, and indexed with [] like slices, e.g.

london = (51.5074, -0.1278)
lat = london[0]
lng = london[1]

lat == 51.5074 // true
lng == -0.1278 // true

Tuple values can be destructured easily, e.g.

lat, lng = (51.5074, -0.1278)

lat == 51.5074 // true
lng == -0.1278 // true

Like structs, tuples can contain values of different types, e.g.

tav = ("Tav", 186<cm>)

However, unlike structs, tuples are immutable, i.e. their elements cannot be re-assigned:

london = (51.5074, -0.1278)
london[0] = 42.12 // ERROR!

Individual fields of a tuple can also be optionally named and accessed via their name, e.g.

london = (lat: 51.5074, lng: -0.1278)

london.lat == 51.5074 // true
london.lng == -0.1278 // true

// Indexed access still works
london[0] == london.lat // true

Both named and unnamed fields can be mixed within the same tuple, e.g.

tav = ("Tav", height: 186<cm>)

tav[0] == "Tav"        // true
tav.height == 186<cm>  // true

To avoid accidental bugs, the order of named fields must always match an expected tuple type, e.g.

func get_coords() (lat: float64, lng: float64) {
    return (lng: -0.1278, 51.5074) // ERROR!
}

As tuples are iterable, len returns their size, and they can be iterated with for loops, e.g.

london = (51.5074, -0.1278)

for point in london {
    print(point) // Outputs: 51.5074 and then -0.1278
}

len(london) == 2 // true

Occasionally, single element tuples are useful, e.g.

  • Within APIs that are expecting tuple values.

  • To maintain consistency within data structures.

As single element tuples, e.g. (x), are indistinguishable from an expression enclosed in parentheses, languages like Python let them be constructed with a trailing comma, e.g.

# Constructing a single-element tuple in languages like
# Python and Rust:
t = (42,)

t[0] == 42 # True

This tends to cause confusion as trailing commas can be overlooked by even experienced developers. As such, we don’t provide a literal syntax for constructing single-element tuples.

Instead, they will need to be constructed using the tuple constructor. This constructor expands any iterable, e.g.

t = tuple("Tav")

len(t) == 3  // true
t[0] == "T"  // true

Therefore, to construct single-element tuples, the value will need to be embedded within a single-element slice, e.g.

t = tuple(["Tav"])

len(t) == 1    // true
t[0] == "Tav"  // true
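Python's built-in tuple constructor expands iterables in the same way, so the behaviour described above can be tried out there. This is only a comparison, and the Iroh constructor is assumed to behave identically for these inputs.

```python
# A string is iterable, so it is expanded character by character.
expanded = tuple("Tav")
print(len(expanded), expanded[0])  # 3 T

# Wrapping the value in a one-element list yields a single-element tuple.
single = tuple(["Tav"])
print(len(single), single[0])      # 1 Tav
```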

Enum Data Types

Iroh supports enum data types, e.g.

Colour = enum {
    red
    green
    blue
}

Variants can be addressed with a leading . and matched with the match keyword, e.g.

match colour {
    .red: set_red_bg()
    .green: set_green_bg()
    .blue: set_blue_bg()
}

The use of a . prefix makes enum variants less visually noisy than in other languages, e.g.

// Compare this Rust:
set_bg(Colour::Red)

// To this Iroh:
set_bg(.red)

So the only time when an enum variant will need to be fully qualified is when it’s used to declare a new variable, e.g.

bg = Colour.red

// The variable can then be assigned another variant without
// needing to use the fully qualified form, e.g.
bg = .green

Enum variants without any data do not get any implicit values, e.g.

Colour = enum {red, green, blue}

Colour.red == Colour.green  // false
Colour.red == 0             // ERROR! Not comparable

Variants can be given explicit values as long as they are all of the same type. This makes the variants comparable to values of those types, e.g.

Colour = enum {
    red = 1
    green = 2
    blue = 3
}

HTTPMethod = enum {
    get = "GET"
    head = "HEAD"
    post = "POST"
    put = "PUT"
}

Colour.red == 1      // true
HTTPMethod.get == "GET"  // true

Mapped enums such as these automatically support being assigned values of the variant type, e.g.

http_method = HTTPMethod.get
http_method = .post        // Assigned an enum variant
http_method = "PUT"        // Assigned a string value mapping to an enum variant

When variables of these types are assigned literal values, they are validated at edit-time. Otherwise, they are validated at runtime and may generate errors, e.g.

http_method = HTTPMethod.post
http_method = "JIBBER"     // ERROR!

As runtime values will be automatically converted during comparison, it may be useful to safely type cast them explicitly first, e.g.

http_method = try HTTPMethod(user_input)
if http_method == .get {
    ...
}

When numeric values are being assigned to variants, they can use iota to define a pattern to use:

  • The value of iota starts at 0 and auto-increments by one for each subsequent variant that doesn’t define its own value.

  • Each new use of iota resets its value back to 0.

For example:

ErrorCode = enum {
    // General errors (start at 0)
    ok = iota                  // 0
    unknown_method             // 1
    invalid_argument           // 2

    // Authentication errors (start at 100)
    auth_failed = iota + 100   // 100
    token_expired              // 101
    permission_denied          // 102

    // Network errors (start at 200)
    timeout = iota + 200       // 200
    connection_lost            // 201
    protocol_mismatch          // 202
}
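
The two iota rules can be modelled in a few lines of Python. This is a rough sketch rather than how Iroh implements it: the assign_iota helper and its (name, offset) input format are hypothetical, and it assumes that variants without their own value keep the offset of the most recent iota expression, as the ErrorCode example suggests:

```python
def assign_iota(variants):
    """Assign values to (name, offset) pairs following the iota rules above.

    An offset of None means the variant defines no value of its own, so it
    takes the next auto-incremented value. An integer offset models a fresh
    `iota + offset` expression, which resets iota back to 0.
    """
    values = {}
    iota = 0
    offset = 0
    for name, new_offset in variants:
        if new_offset is not None:
            iota = 0
            offset = new_offset
        values[name] = iota + offset
        iota += 1
    return values

error_codes = assign_iota([
    ("ok", 0),
    ("unknown_method", None),
    ("invalid_argument", None),
    ("auth_failed", 100),
    ("token_expired", None),
    ("permission_denied", None),
    ("timeout", 200),
    ("connection_lost", None),
    ("protocol_mismatch", None),
])
print(error_codes["token_expired"])      # 101
print(error_codes["protocol_mismatch"])  # 202
```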

By default, an enum’s tag will be the smallest unsigned integer that can represent all the enum’s possible values, e.g.

// Iroh will use a uint2 for this enum's tag as the 2 bits
// will be able to represent all 4 possible values:
Status = enum {a, b, c, d}
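
The tag width follows from the number of variants. As a quick sanity check, here is a small Python sketch of that calculation. The tag_bits name is hypothetical, it assumes variants are numbered from 0, and treating a single-variant enum as needing 1 bit is an assumption of the sketch rather than something stated above:

```python
def tag_bits(variant_count):
    """Smallest number of bits that can number `variant_count` variants from 0."""
    if variant_count < 1:
        raise ValueError("an enum needs at least one variant")
    # The largest tag value is variant_count - 1; bit_length() gives its width.
    return max(1, (variant_count - 1).bit_length())

print(tag_bits(4))    # 2
print(tag_bits(5))    # 3
print(tag_bits(256))  # 8
```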

When the in-memory layout needs to be controlled, a custom size can be specified instead, e.g.

// A 20-bit tag will be used here, even though only 4 variants
// have been defined so far:
Status = enum(uint20) {a, b, c, d}

When the size exceeds 8 bits, a qualifier can be added to the custom size to control its endianness, e.g.

Status = enum(uint20be) {a, b, c, d}

The following qualifiers are supported:

  • be — big-endian.

  • le — little-endian (the default).

  • ne — native-endian matching whatever the current CPU/OS uses.

Matches on variants must be exhaustive, i.e. they must match all possible variants, e.g.

Status = enum {a, b, c, d}

// This match will error at edit-time as the .d variant
// hasn't been handled:
match status {
    .a:
    .b:
    .c:
}

This helps prevent any bugs caused by accidentally forgetting to handle certain cases. The default keyword can be used to act as a “catch-all”, e.g.

match http_method {
    .get: handle_get()
    .post: handle_post()
    default: handle_everything_else()
}

This is useful for enums that might grow in the future as APIs evolve, and ensures that existing code won’t break.

The exhaustive nature of enum matches can make it difficult for package authors to evolve their APIs, e.g.

// For an enum like this defined by a package author:
HTTPMethod = enum {
    get = "GET"
    head = "HEAD"
    post = "POST"
    put = "PUT"
}

// The following will compile as the match handles all 4 cases:
match http_method {
    .get: handle_get()
    .head: handle_head()
    .post: handle_post()
    .put: handle_put()
}

// But, if the package author adds a new .trace variant, like
// below, the `match` will fail as it's no longer exhaustive:
HTTPMethod = enum {
    get = "GET"
    head = "HEAD"
    post = "POST"
    put = "PUT"
    trace = "TRACE"
}

To ensure that user code continues to work if a package author decides to add new variants, enums can be marked as extensible, e.g.

HTTPMethod = enum(extensible) {
    get = "GET"
    head = "HEAD"
    post = "POST"
    put = "PUT"
}

This will force all matches on those enums to always have a default case, so that they will continue to be exhaustive even if package authors add new variants, e.g.

match http_method {
    .get: handle_get()
    .head: handle_head()
    .post: handle_post()
    .put: handle_put()
    default: handle_everything_else()
}

// When the enum gets extended with a new .trace variant, this
// match will continue to work thanks to the `default` case.
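
The rules for exhaustive and extensible matches can be summarised as a small checker. The following Python sketch only models the rules described above. The check_match function and its parameters are hypothetical, and Iroh performs this validation at edit-time rather than at runtime:

```python
def check_match(variants, handled, has_default=False, extensible=False):
    """Return a list of problems with a match, or an empty list if it is accepted."""
    problems = []
    if extensible and not has_default:
        problems.append("extensible enum requires a default case")
    missing = [v for v in variants if v not in handled]
    if missing and not has_default:
        problems.append("unhandled variants: " + ", ".join(missing))
    return problems

methods = ["get", "head", "post", "put"]

print(check_match(methods, ["get", "head", "post"]))
# ['unhandled variants: put']

print(check_match(methods, methods, extensible=True))
# ['extensible enum requires a default case']

# Adding a new variant does not break a match that has a default case:
print(check_match(methods + ["trace"], methods, has_default=True, extensible=True))
# []
```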

Multiple variants can be grouped with ; in a single case. Within such branches, values are automatically type narrowed to only those variants, e.g.

match http_method {
    .get:
        handle_get()
    .post; .put:
        // Only .post and .put are valid variants inside here.
        handle_common_code_for_post_and_put()
        match http_method {
            .post:
                handle_post_only_stuff()
            .put:
                handle_put_only_stuff()
            // No need to handle the other cases here.
        }
    .head:
        handle_head()
    default:
        handle_anything_else()
}

Like in Rust, enum variants can also have associated structured data of different kinds, e.g.

Message = enum {
    quit                                        // No data
    move{                                       // Struct data
        x int32
        y int32
    }
    write(string)                               // Single-element tuple
    set_colour(r: uint8, g: uint8, b: uint8)    // Multi-element tuple
}

These can be destructured within the match cases, e.g.

func process(msg Message) {
    match msg {
        .quit:                 // Handle quit
        .move{x, y}:           // Handle move
        .write(text):          // Handle write
        .set_colour(r, g, b):  // Handle set_colour
    }
}

Field Annotations

Composite types, i.e. struct and enum types, can be annotated with structured data. The annotations can be on individual fields or the type as a whole, e.g.

Person = struct {
    dob        date         json.Field{date_format: .rfc3339}
    name       string       sql.Field{name: "user_name"}
    updated_at ?timestamp   json.Field{omit_empty: true}
} sql.Table{name: "people"}

Unlike in Go, where annotations have to be shoved into string values, e.g.

type Person struct {
    Name string `json:"name" db:"user_name"`
}

Iroh annotations can be any compile-time evaluatable value. This can be used to easily add things like custom encodings, serialization, validation, etc.

BlogPost = struct {
    contents  encoded_string  string_encoding.iso_8859_1
}

Application code can introspect the specific annotations at compile-time to drive behaviour, e.g.

annotation = Person.fields["updated_at"].annotation[json.Field]
if annotation {
    if annotation.omit_empty {
        // skip empty value ...
    }
}
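
Python offers a comparable mechanism through typing.Annotated, which may help illustrate the idea of structured, introspectable annotations. In this sketch, JsonField and json_annotation are hypothetical names standing in for json.Field and the lookup above, and the introspection happens at runtime rather than at compile-time:

```python
from dataclasses import dataclass
from typing import Annotated, Optional, get_type_hints

@dataclass(frozen=True)
class JsonField:
    # Hypothetical annotation type, standing in for json.Field above.
    omit_empty: bool = False

@dataclass
class Person:
    name: str
    updated_at: Annotated[Optional[str], JsonField(omit_empty=True)]

def json_annotation(cls, field_name):
    """Return the JsonField attached to a field, or None if there isn't one."""
    hints = get_type_hints(cls, include_extras=True)
    metadata = getattr(hints[field_name], "__metadata__", ())
    for item in metadata:
        if isinstance(item, JsonField):
            return item
    return None

annotation = json_annotation(Person, "updated_at")
print(annotation.omit_empty)            # True
print(json_annotation(Person, "name"))  # None
```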

Destructuring

Iroh supports destructuring for many of its data types, e.g.

// Tuples
(a, b, c) = (1, 2, 3)

// Arrays
[a, b, c] = [3]int{1, 2, 3}

// Slices
[a, b, c] = [1, 2, 3]

// Structs
{x, y} = Point{x: 10, y: 20}

// Strings
<<"Hello ", name>> = "Hello Tav"

For elements of iterable values, specific elements can be skipped with a _, e.g.

[a, _, b] = [1, 2, 3]

The ... splat operator can be used to match any “remaining” elements of an iterable, e.g.

[head, second, _, tail...] = [1, 2, 3, 4, 5]

head == 1       // true
second == 2     // true
tail == [4, 5]  // true

The ... splat operator can be in any position as long as it’s only used once, e.g.

[head, middle..., last] = [1, 2, 3, 4, 5]

head == 1            // true
middle == [2, 3, 4]  // true
last == 5            // true

It can even be used to ignore a group of intermediate elements, e.g.

[head, ..., last] = [1, 2, 3, 4, 5]

head == 1 // true
last == 5 // true
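
Python's starred assignment behaves much like the splat patterns above, so the three cases can be compared side by side. The main difference in this analogue is that Python has no bare splat, so a throwaway name is used to collect the ignored elements:

```python
head, second, _, *tail = [1, 2, 3, 4, 5]
print(head, second, tail)   # 1 2 [4, 5]

first, *middle, last = [1, 2, 3, 4, 5]
print(first, middle, last)  # 1 [2, 3, 4] 5

# A throwaway name collects the intermediate elements that are being ignored.
start, *_, end = [1, 2, 3, 4, 5]
print(start, end)           # 1 5
```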

As struct fields destructure by their field names, partial destructuring only needs to specify the desired field names, e.g.

{name} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}

name == "Tav" // true

Fields can be destructured to a different name with a :, e.g.

{name: user} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}

user == "Tav" // true

Nested elements can be destructured as needed, e.g.

{location: {lat, lng}, name} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}

name == "Tav"   // true
lat == 51.5074  // true
lng == -0.1278  // true

Destructured elements can set a default value with a : if the value is a zero-value, e.g.

{name: "Anonymous"} = Person{}

name == "Anonymous" // true

This can often be clearer than manually assigning to a local variable and checking its value before setting a default, e.g.

// Compare:
name = person.name
if !name {
    name = "Anonymous"
}

// Versus:
{name: "Anonymous"} = person

The (), [], and {} around destructured patterns can be elided when multiple elements are being destructured without any nesting or field renaming, e.g.

a, _, b = [1, 2, 3]
x, y = Point{x: 10, y: 20}

a == 1   // true
b == 3   // true
x == 10  // true
y == 20  // true

The Erlang-inspired <<x, y>> binary destructuring works well with both strings and byte slices, e.g.

// Decode binary data into specific data types, e.g.
<<version::uint8, length::uint32, checksum::int16>> = data

// Multi-byte integer types can specify an alternative to the
// default little-endian decoding, e.g.
<<version::uint8, length::uint32be, checksum::int24>> = data

// The splat operator can be used as usual, e.g.
<<version::uint8, header...>> = data

// The number of bytes to destructure can be specified with
// expressions or integer literals, e.g.
<<header::56, payload...>> = data

// These can even refer to previously destructured data, e.g.
<<version::uint8, length::uint32, header::length, payload...>> = data
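
To make the last pattern concrete, here is roughly what the same decoding looks like when written by hand in Python with the struct module. The parse_packet name and the sample packet are made up for illustration, and the sketch assumes the default little-endian layout described above:

```python
import struct

def parse_packet(data):
    """Split data into (version, header, payload) using a length-prefixed header."""
    # "<BI" reads a uint8 and a little-endian uint32 with no padding (5 bytes).
    version, length = struct.unpack_from("<BI", data, 0)
    fixed = struct.calcsize("<BI")
    if len(data) < fixed + length:
        raise ValueError("packet is shorter than its declared header length")
    header = data[fixed:fixed + length]
    payload = data[fixed + length:]  # Everything left over, like `payload...`.
    return version, header, payload

packet = struct.pack("<BI", 1, 3) + b"abc" + b"hello"
print(parse_packet(packet))  # (1, b'abc', b'hello')
```

The manual offset bookkeeping here is exactly what the destructuring pattern is meant to remove.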

Functions & Methods

The func keyword is used to define both functions and methods. Functions can have both parameters and a return value, e.g.

// This takes 2 parameters and returns an int value:
func add(a int, b int) int {
    return a + b
}

// This takes no parameters and returns a 2-tuple value:
func get_info() (name: string, height: int<cm>) {
    return ("Tav", 186<cm>)
}

// This takes 2 parameters and returns nothing:
func set_info(id int, info (name: string, height: int<cm>)) {
    cache.update(id, info)
}

Closures, i.e. anonymous functions with associated data, can be defined within function and method bodies using the => syntax, e.g.

add = (a int, b int) => {
    return a + b
}

If the anonymous function body is on the same line as the =>, then the return is implicit, i.e.

// The return is implicit whether or not there are braces, e.g.
add = (a int, b int) => { a + b }

// Or not:
add = (a int, b int) => a + b

Anonymous functions cannot define their return type, and thus the type of the return value must be inferrable. Functions which don’t take any parameters can omit them, e.g.

newline = () => print("")

// Print 3 newlines:
newline()
newline()
newline()

If anonymous functions don’t specify any parameters, but do receive parameters, then those parameters need to be inferrable, and can be referred to by their position, e.g. $0, $1, etc.

people = ["Reia", "Zaia"]

// This explicit function passed to the `map` method:
people.map((person string) => {
    return person.upper()
})

// Can be simplified by omitting the parameter:
people.map(() => {
    return $0.upper()
})

// Can be further simplified by putting it all on one line:
people.map(() => $0.upper())

// And even more clearly to just:
people.map { $0.upper() }

Functions, methods, and closures can all be passed as parameters in function calls, and even saved as values, e.g.

func calc(a int, b int, op (int, int) => int) int {
    return op(a, b)
}

// Pass the add function as a parameter:
calc(1, 2, add)

Instruction = struct {
    op   (int, int) => int
    a    int
    b    int
}

// Store the add function as a struct field:
next = Instruction{
    op: add,
    a: 1,
    b: 2
}

Functions can be variadic if a parameter name is prefixed with ..., e.g.

func print_names(...names string) {
    for name in names {
        print(name)
    }
}

The variadic parameter is automatically a slice of the given type, and can be called with zero or more values, e.g.

print_names()                        // Outputs: nothing
print_names("Zeno", "Reia", "Zaia")  // Outputs: all 3 names

Slices can use the ... splat operator to expand their elements when calling functions with variadic parameters, e.g.

names = ["Alice", "Tav"]
print_names(...names)

Parameters can be given default values by following the parameter name with a : and the default value, e.g.

func greet(name string, greeting: "Hello") string {
    return "${greeting} ${name}"
}

greet("Alice")      // Returns: Hello Alice
greet("Tav", "Hi")  // Returns: Hi Tav

Iroh will generate edit-time warnings for certain function definitions, e.g.

  • Functions that have more than 6 named parameters as these tend to result in unwieldy APIs.

  • Functions that use bool parameters instead of clearer enum ones, e.g. bar(123, true) vs. bar(123, .update).

Functions which take a function parameter at the last position can use block syntax for that parameter, e.g.

func calc(a int, b int, op (int, int) => int) int {
    return op(a, b)
}

// We call `calc` using block syntax for the `op` parameter:
calc(1, 2) { $0 + $1 }

When a parameter is a struct type, it can be inlined by eliding the struct keyword, e.g.

func run_server(cfg {host string, port int}) {
    ...
}

Functions which take a struct parameter at the last position can accept the struct fields as “named” arguments in the function call, e.g.

Config = struct {
    log_level   enum{info, debug, fatal}
    port        int
}

func run_server(cfg Config) {
    ...
}

// The Config fields can be passed in as "named" arguments, e.g.
run_server(log_level: .info, port: 8080)

// As Config fields will be default-initialized, only any
// necessary fields need to be specified, e.g.
run_server(port: 8080)

Function parameters can combine default values, variadic parameters, struct parameters, and trailing functions as long as they follow this order:

  1. Positional parameters with types.

  2. Positional parameters with default values, i.e. optional parameters, or a variadic parameter.

  3. Trailing struct parameter (with optional default field values).

  4. Trailing function parameter.

Parameters can also use destructuring syntax, e.g.

Config = struct {
    log_level   enum{ info, error, fatal }
    port        int
}

func run_server({port} = Config) {
    // Only the `port` value is available in this scope.
}

func run_server({port: 8080} = Config) {
    // The `port` value defaults to 8080 if it's not been specified.
}

For certain types of destructuring, types can be elided, e.g. when destructuring binary data:

func parse_packet(<<header::56, payload...>>) {
    ...
}

User-defined types “inherit” all of the methods of their underlying type and have certain methods autogenerated for them, e.g. struct types have the following methods created for them:

  • copy, copy_with, deepcopy — for copying the value.

  • __hash__ — for deriving the hash value.

  • __str__ — for the default string representation of the value.

Custom methods can be defined on a type by prefixing the type name before the func definition, e.g.

Config = struct {
    host   string
    port   int
}

// Define the `validate` method:
(c Config) func validate() bool {
    if !c.host {
        return false
    }
    if 1024 <= c.port <= 65535 {
        return true
    }
    return false
}

// Use the `validate` method:
c = Config{host: "", port: 8080}
if c.validate() {
    ...
}

The receiver component of the method definition, i.e. (c Config), can use any variable name to refer to the value of the specified type. There’s no implicit this or self.

Static methods on the type can be defined by specifying type as the receiver name, e.g.

(type Config) func web_server() Config {
    return Config{
        host: "localhost",
        port: 8080
    }
}

// The static method can now be called on the type:
cfg = Config.web_server()

Conditionals

Iroh uses if, else if, and else blocks like most languages, e.g.

if i%3 == 0 and i%5 == 0 {
    print("FizzBuzz")
} else if i%3 == 0 {
    print("Fizz")
} else if i%5 == 0 {
    print("Buzz")
} else {
    print(i)
}

The conditions for if and else if need to evaluate to a bool value. If a value can’t be converted to a bool, it will generate an error, e.g.

Person = struct {
    first_name string
    last_name  string
}

user = Person{}

// The following will generate a compile-time error as user
// cannot be converted to a bool.
if user {
    ...
}

For convenience, most built-in types support conversion to bool, e.g.

name = "Tav"

// Instead of this explicit conditional check:
if len(name) > 0 {
    ...
}

// The string can be used directly as string values only evaluate
// to true when they have a positive length.
if name {
    ...
}

The falsiness of values of built-in types is given by:

Type       When False
Booleans   false
Numbers    0
Arrays     len(x) == 0
Slices     len(x) == 0
Strings    len(x) == 0
Maps       len(x) == 0

Within conditionals, one can check if a value is within a range, e.g.

// Check if (age >= 21) and (age <= 35):
if 21 <= age <= 35 {
    ...
}
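
Python happens to follow very similar falsiness rules and also supports chained range comparisons, so both behaviours can be tried out there. This is only an analogue: Python lists stand in for both arrays and slices, and in_range is a hypothetical helper:

```python
# Empty or zero values are falsy in Python too, mirroring the falsiness rules above.
for value in (False, 0, [], "", {}):
    assert not value

# Non-empty or non-zero values are truthy.
for value in (True, 7, [0], "Tav", {"name": "Tav"}):
    assert value

def in_range(age):
    # Chained comparison: equivalent to (21 <= age) and (age <= 35).
    return 21 <= age <= 35

print(in_range(21), in_range(36))  # True False
```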

Assignments and destructuring can also be done within an if conditional as long as a ; separated conditional is also checked, e.g.

if resp = api.get_user(id); resp.success {
    // The resp variable does not pollute the outer scope.
}

User-defined types can define a __bool__ method if they want to opt into automatic type coercion into bools, e.g.

State = struct {
    command     string
    is_running  bool
}

(s State) func __bool__() bool {
    return s.is_running
}

state = State{}
if state { // Automatically checks state.is_running
    ...
}
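
Python uses the same __bool__ hook for this purpose, so the example translates almost directly. The describe helper below is hypothetical and only there to exercise the conversion:

```python
class State:
    def __init__(self, command="", is_running=False):
        self.command = command
        self.is_running = is_running

    def __bool__(self):
        # Truthiness of a State follows its is_running flag.
        return self.is_running

def describe(state):
    return "running" if state else "stopped"

print(describe(State()))                          # stopped
print(describe(State("serve", is_running=True)))  # running
```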

Loops

Most languages tend to provide multiple constructs for looping, e.g. for, while, do, foreach, repeat, etc. This can be slightly confusing for those new to programming.

So Iroh instead follows Go’s approach and only uses one keyword, for, for all looping. Using for by itself results in an infinite loop, e.g.

for {
    // Will keep running code in this block indefinitely.
}

Loops can be broken with the break keyword, e.g.

for {
    now = time.now()
    // This loop will stop as soon as the year ticks over.
    if now.year > 2025 {
        break
    }
    print(now)
    time.sleep(1<s>)
}

Loops can be C-like, i.e.

for initialization; condition; increment {
    ...
}

For example:

for i = 0; i < 5; i++ {
    print(i) // Prints 0, 1, 2, 3, 4
}

Loops can use the continue keyword to skip to the next iteration, e.g.

for i = 0; i < 5; i++ {
    if i == 1 {
        continue
    }
    print(i) // Prints 0, 2, 3, 4
}

To avoid common bugs, e.g. when loop variables are captured by closures, loop variables have per-iteration scope instead of per-loop scope, e.g.

for i = 0; i < 5; i++ {
    func print_value() {
        print(i)
    }
    print_value() // Prints 0, 1, 2, 3, 4
}
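
To see the kind of bug that per-iteration scope avoids, consider Python, where a loop variable is shared across iterations. The snippet below demonstrates Python's behaviour, not Iroh's, along with the usual workaround:

```python
# Python shares one loop variable across iterations, so closures created in
# the loop all see its final value.
shared = [lambda: i for i in range(3)]
print([f() for f in shared])         # [2, 2, 2]

# Binding the current value as a default argument gives each closure its own copy.
per_iteration = [lambda i=i: i for i in range(3)]
print([f() for f in per_iteration])  # [0, 1, 2]
```

With per-iteration scope, the first form would already behave like the second.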

Loops can be nested, and labels can be used to exit specific loops, e.g.

outer: // a label for the outer loop, can be any identifier
    for i = 0; i < 3; i++ {
        for j = 0; j < 3; j++ {
            print("={i}, ={j}")
            if i*j == 4 {
                print("Breaking out of both loops")
                break outer
            }
        }
    }

To execute code only when a loop has not been interrupted with a break, the for loop can be followed by a fully branch, e.g.

for i = 0; i < 3; i++ {
    if i == 2 {
        break
    }
    print(i)
} fully {
    // Not reached in this example, as the loop breaks when i == 2.
    print("All values got printed!")
}
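
The closest equivalent in Python is the else clause on a for loop, which also runs only when the loop was not interrupted by a break. The collect_until helper below is hypothetical and shows both outcomes:

```python
def collect_until(limit, stop_at=None):
    """Collect values below limit, noting whether the loop ran to completion."""
    seen = []
    for i in range(limit):
        if i == stop_at:
            break
        seen.append(i)
    else:
        # Runs only when the loop was not interrupted by break.
        seen.append("fully")
    return seen

print(collect_until(3, stop_at=2))  # [0, 1]
print(collect_until(3))             # [0, 1, 2, 'fully']
```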

Loops can also be conditional, i.e. will keep looping while the condition is true, e.g.

for len(x) > 0 {
    print(x.pop())
}

To loop over ranges, the for keyword can be combined with the in keyword, e.g.

for i in 0..5 {
    print(i) // Prints 0, 1, 2, 3, 4, 5
}

This also works with collections, e.g. iterating over a slice:

users = [{"name": "Tav"}]

for user in users {
    print(user["name"])
}

Most collections also support iterating using in with 2 variables, e.g. in slices this will return each element’s index as well as the element itself:

users = [{"name": "Tav"}]

for idx, user in users {
    print("${idx}: ${user["name"]}")
}

Similarly, iterating using just 1 variable over a map value gives just the keys, e.g.

user = {"name": "Tav", "location": "London"}

for key in user {
    print(key) // Prints name, location
}

While iterating using 2 variables gives both the key and the value, e.g.

user = {"name": "Tav", "location": "London"}

for key, value in user {
    print("${key} = ${value}")
}

When the loop variables are not needed, they can be elided, e.g.

for 0..5 {
    ...
}

Unlike boolean conditions which are re-evaluated on every iteration, iterable expressions are only evaluated once. This is particularly useful when the iterables yield lazily, e.g.

for rate_limiter.available_slots() {
    handle_next_request()
}
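
The difference between the two kinds of loop can be observed by counting calls. Python draws the same distinction between for and while, so the following sketch uses it as a stand-in. The function names are hypothetical:

```python
calls = {"iterable": 0, "condition": 0}

def available_slots():
    calls["iterable"] += 1
    return iter([1, 2, 3])  # A lazy iterator, produced by a single call.

def has_work(queue):
    calls["condition"] += 1
    return len(queue) > 0

for slot in available_slots():  # The call happens once, before looping.
    pass

queue = [1, 2, 3]
while has_work(queue):          # The condition runs before every iteration.
    queue.pop()

print(calls)  # {'iterable': 1, 'condition': 4}
```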

User-defined types can add support for iteration by defining the __iter__ method which needs to return a type implementing the built-in iterator interface.

Types implementing iterator need to have a __next__ method which returns the next value in the sequence, or nil when the iteration is complete, e.g.

Counter = struct {
    current   int
    max       int
}

(c Counter) func __iter__() iterator {
    return c
}

(c Counter) func __next__() ?int {
    if c.current < c.max {
        current = c.current
        c.current++
        return current
    }
    return nil
}

counter = Counter{current: 2, max: 5}
for i in counter {
    print(i) // Prints 2, 3, 4
}
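
The protocol closely mirrors Python's iterator protocol, which uses the same method names. For comparison, here is the same counter in Python, where the end of iteration is signalled by raising StopIteration instead of returning nil:

```python
class Counter:
    def __init__(self, current, maximum):
        self.current = current
        self.maximum = maximum

    def __iter__(self):
        return self

    def __next__(self):
        if self.current < self.maximum:
            current = self.current
            self.current += 1
            return current
        # Python signals completion with StopIteration rather than nil.
        raise StopIteration

print(list(Counter(current=2, maximum=5)))  # [2, 3, 4]
```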

Const Values

Iroh supports const values of different kinds. If the const is on the left hand side, then the value is compile-time evaluated, e.g.

const x = factorial(5)

x == 120 // true

All dependencies of such evaluations need to be compile-time evaluatable and cannot depend on runtime input, e.g.

const private_key_modulus = read_modulus() // ERROR!

func read_modulus() uint4096 {
    return uint4096(io.read_all(stdin))
}

The @read_file and @read_url compile-time functions allow for reading various resources at compile-time, e.g.

const private_key = @read_file("~/.ssh/id_ed25519")

const logo_png = @read_url("https://assets.espra.com/logo.png")

Compile-time reads are cached on first read. The cache is only refreshed by running a clean build or by marking the resource with watch, e.g.

const config_data = @read_file("config.json", watch: true)

This mechanism is what powers our build system, e.g.

import "build"

func build(b build.Config) {
    glfw = b.add_static_library("glfw", sources: glob("glfw/src/*.c"))
            .add_include_path("glfw/include")

    b.add_executable(root: "src/main.iroh")
     .link_library(glfw)
     .install()
}

Instead of needing a separate build config and language, like CMake, Autotools, or Gradle, we have the full power of Iroh available in our compile-time build system.

Compile-time const values can be defined within the top-level scope of a package or within function bodies, e.g.

func serialize_value() {
    const debug = false
    if debug {
        ...
    }
}

Optimizations are then done based on these compile-time values, e.g. in the above example, the entire if debug block will be fully optimized away.

If const is on the right side of an assignment, i.e. at the head of an expression, then it is not compile-time evaluated, and instead marks the result of the expression as immutable, e.g.

path = "/home/tav"
split = const path.split("/")
split[1] = "alice" // ERROR! Cannot mutate an immutable value

When a value is marked as immutable, it can no longer be mutated. To support this, the compiler will try to re-use existing allocations wherever possible, and only make copies when necessary.

Refined Types

Types can be constrained to specific sub-ranges using @limit which limits a type with a constraint. Constraint expressions can refer to an instantiated value of the type as this, e.g.

Codepoint = @limit(uint, this <= 0x10ffff)
Port = @limit(uint16, this >= 1024 and this <= 65535)
Format = @limit(string, this in ["json", "toml", "yaml"])

Any compile-time evaluatable expression can be used as the constraint. When values of constrained types are initialized, literals are validated at compile-time, otherwise at runtime, e.g.

port1 = Port(8080)        // Validated at compile-time (literal)
port2 = Port(user_input)  // Validated at runtime (dynamic value)
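
As a rough mental model, a refined type can be thought of as a constructor that wraps a base type with a predicate. The Python sketch below takes that view. The limit function here is a hypothetical runtime helper, so unlike Iroh it cannot validate literals at compile-time:

```python
def limit(base, predicate, name):
    """Build a constructor that converts to `base` and enforces `predicate`."""
    def construct(raw):
        value = base(raw)
        if not predicate(value):
            raise ValueError(f"{value!r} is not a valid {name}")
        return value
    return construct

# Hypothetical refined types mirroring the Port and Format definitions above.
Port = limit(int, lambda this: 1024 <= this <= 65535, "Port")
Format = limit(str, lambda this: this in ("json", "toml", "yaml"), "Format")

print(Port("8080"))    # 8080
print(Format("toml"))  # toml

try:
    Port(80)
except ValueError as err:
    print(err)         # 80 is not a valid Port
```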

Constraints are validated whenever there are any changes that could invalidate them, e.g.

StringList = @limit([]string, len(this) > 0)

x = StringList{"Tav"}
x.pop() // ERROR!

Constraining a type to a set of specific values can be written by prefixing them with a const and using | to separate the options, e.g.

Format = const "json" | "toml" | "yaml"
Priority = const 1 | 2 | 3

When constraining non-numeric types like strings to a set of values, the const prefix can be elided as the | bitwise OR operator doesn’t apply to them, e.g.

Format = "json" | "toml" | "yaml"

But the const prefix will still be needed if only one value is possible, e.g.

Format = const "json"

As string values are already immutable, this creates a value which doubles as both a type value with a single string value as well as an immutable string value.

Consumable Types

Struct types can be annotated as being consumable in order to treat them as linear types, e.g.

Transaction = struct(consumable) {
    ...
}

(t Transaction) func set_key(key string, value V) {
    ...
}

@consume
(t Transaction) func commit() {
    ...
}

@consume
(t Transaction) func rollback() {
    ...
}

All values of a consumable type must be discarded by calling a method that has been marked with the @consume decorator, e.g.

txn = db.new_txn()
txn.set_key("admin", "Tav")
txn.commit()

Failure to do so will result in an edit-time error, e.g.

txn = db.new_txn()
txn.set_key("admin", "Alice")
// ERROR! Neither txn.commit() nor txn.rollback() were called!

Once a value has been consumed, it can no longer be used, e.g.

txn = db.new_txn()
txn.set_key("admin", "Alice")
txn.commit()
txn.set_key("admin", "Zeno") // ERROR! txn value already consumed!
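
In languages without linear types, the use-after-consume rule is usually approximated with a runtime flag. The Python sketch below shows that approach for contrast. The Transaction class here is a hypothetical stand-in, and it can only catch the mistake when the code runs, and cannot detect a transaction that is never committed or rolled back, both of which Iroh reports at edit-time:

```python
class Transaction:
    def __init__(self):
        self.pending = {}
        self.consumed = False

    def _require_live(self):
        if self.consumed:
            raise RuntimeError("txn value already consumed")

    def set_key(self, key, value):
        self._require_live()
        self.pending[key] = value

    def commit(self):
        self._require_live()
        self.consumed = True  # Consuming method: no further use allowed.
        return dict(self.pending)

    def rollback(self):
        self._require_live()
        self.consumed = True
        self.pending.clear()

txn = Transaction()
txn.set_key("admin", "Alice")
print(txn.commit())  # {'admin': 'Alice'}

try:
    txn.set_key("admin", "Zeno")
except RuntimeError as err:
    print(err)       # txn value already consumed
```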

Up to one @consume method can be marked as default, e.g.

@consume(default: true)
(f File) func close() {
    ...
}

This will automatically consume the value with this method if it hasn’t been explicitly consumed by the time it goes out of scope, e.g.

if update_contents {
    f = create_file("/home/tav/${filename}.md")
    f.write(contents)
    // The file is automatically consumed by f.close() here.
}
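
Python's with statement provides a similar guarantee, although it has to be opted into at each call site rather than being implied by the type. In the sketch below, TrackedFile is a hypothetical in-memory stand-in for a file:

```python
class TrackedFile:
    """Hypothetical file-like resource that records whether it was closed."""

    def __init__(self):
        self.lines = []
        self.closed = False

    def write(self, text):
        if self.closed:
            raise ValueError("write to closed file")
        self.lines.append(text)

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()  # Runs automatically when the with block ends.
        return False  # Do not suppress exceptions.

with TrackedFile() as f:
    f.write("contents")

print(f.closed)  # True
```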

This enables package authors to provide APIs that are ergonomic and safe, without needing any manual cleanup.

Expression-Based Assignment

Languages with a strong emphasis on expressions can easily lead to code with poor readability and high cognitive load, e.g. consider this Rust code:

let final_price = {
    let base_price = if item.category == "premium" { if user.is_vip { item.price * 0.8 } else { item.price * 0.9 } } else { item.price };
    let shipping_cost = if base_price > 50.0 { if user.location == "remote" { 15.0 } else { 0.0 } } else { 8.0 };
    let tax_amount = if user.state == "CA" { if base_price > 100.0 { base_price * 0.08 } else { base_price * 0.06 } } else { base_price * 0.05 };
    let total = base_price + shipping_cost + tax_amount;
    if total > 200.0 {
        if user.membership.is_some() {
            total - user.membership.unwrap().discount_amount
        } else {
            total * if user.first_time_buyer { 0.95 } else { 1.0 }
        }
    } else {
        if user.has_coupon {
            if coupon.min_purchase <= total { total - coupon.amount } else { total }
        } else {
            if user.loyalty_points > 1000 { total - 10.0 } else { total }
        }
    }
};

An accidental semicolon somewhere can easily change the meaning of the entire calculation. To minimize such issues, Iroh takes a more pragmatic approach to expression-based assignments.

Expressions beginning with certain keywords like if and else can implicitly assign their block value to a variable as long as there are no further nested constructs, e.g.

base_price = if user.is_vip { price * 0.8 } else { price }

If multi-line computations are needed, or if nested constructs need to be used, the block is auto-indented and the value being assigned needs to be explicitly prefixed with a =>, e.g.

base_price =
    if item.category == "premium" {
        if user.is_vip {
            => item.price * 0.8
        } else {
            => item.price * 0.9
        }
    } else {
        => item.price
    }

The do keyword can be used to evaluate multi-line expression blocks for assignment, e.g.

value = do {
    temp = expensive_calculation()
    => temp * 2 + 1
}

Similar to how return works within function bodies, => ends computation within a block, i.e.

value = do {
    temp = expensive_calculation()
    => temp * 2 + 1
    // The following code will be unreachable, just like after a return.
    print("This won't execute")
}

Variables declared within do blocks do not pollute the outer scope. Expression-based assignment can also be nested, e.g.

base_price = do {
    discount =
        if item.category == "premium" {
            if user.is_vip {
                 => 0.2
            } else {
                 => 0.1
            }
        } else {
            => 0.0
        }
    => item.price * (1 - discount)
}

For nested expressions where assignment is to an outer block, labels can be used and assignments can use the form =>label value, e.g.

base_price = outer: do {
    discount =
        if item.category == "premium" {
            if user.is_genesis_member {
                 =>outer 0
            } else if user.is_vip {
                 => 0.2
            } else {
                 => 0.1
            }
        } else {
            => 0.0
        }
    => item.price * (1 - discount)
}

Built-in Functions

Besides built-in types like time, Iroh provides various built-in functions to make certain common operations easier, e.g.

  • cd(path)

    • Change the working directory to the given path.

      cd("/home/tav")
      
  • cap(slice)

    • Return the capacity of the given slice.

      x = make([]int, len: 100)
      
      cap(x) == 100 // true
      
  • exit(code: 0)

    • Exit the process with the given status code.

  • fprint(writer, ...args, end: "")

    • Writes the given arguments to the writer using the same formatting as the print function.

      fprint(my_file, "Hello world!")
      
  • glob(pattern)

    • Returns files and directories matching the given pattern within the current working directory.

      for path in glob("*.ts") {
          // Do something with each .ts file
      }
      
  • len(iterable)

    • Return the length of the given iterable.

      len(["a", "b", "c"]) == 3 // true
      
  • max(...args)

    • Returns the maximum of the given values.

      max(1, 12, 8) == 12 // true
      
  • min(...args)

    • Returns the minimum of the given values.

      min(1, 12, 8) == 1 // true
      
  • print(...args, end: "\n")

    • Prints the given arguments to the standard output.

      // Output with a newline at the end:
      print("Hello world!")
      
      // Output without a newline at the end:
      print("Hello world!", end: "")
      
  • print_err(...args, end: "\n")

    • Prints the given arguments to the standard error.

      // Output with a newline at the end:
      print_err("ERROR: Failed to read file: /home/tav/source.txt")
      
      // Output without a newline at the end:
      print_err("ERROR: ", end: "")
      
  • read_input(prompt: "", raw: false)

    • Read input from the standard input.

      name = read_input("Enter your name: ")
      
  • read_passphrase(prompt: "", mask: "*")

    • Read masked passphrase from the standard input.

      passphrase = read_passphrase("Passphrase: ")
      
  • type(value)

    • Return the type of the given value. The fields of the response, i.e. the type value, can only be accessed at compile-time.

      type("Hello") == string // true
      
      match type("hello") {
          string: print("Got a string value!")
          int: print("Got an int value!")
          default: print_err("Got an unknown value!")
      }
      

Along with various compile-time functions, e.g.

  • @col

    • Convert an array/slice from row-major ordering to column-major.
  • @consume

    • Mark type methods that “consume” the type value.
  • @decorator

    • Mark a function/method as a compile-time decorator.
  • @limit(type, constraints...)

    • Create a refined type which is validated against the specified constraints.
  • @read_file(filepath, watch: false)

    • Returns the contents of the given file as a byte slice.
  • @read_url(url, watch: false)

    • Returns the contents fetched from the given URL as a byte slice.
  • @relate(unit, relationships...)

    • Relate units to each other through equality definitions.
  • @row

    • Convert an array/slice from column-major ordering to row-major.
  • @transpose

    • Transpose the layout of an array/slice, e.g. row-major to column-major, and vice versa.
  • @undo

    • Mark a specialized method as the variant to use when undoing a method call within an atomic block.

Inlining

The inline keyword tells Iroh to try to inline a function or loop at the call site for better performance, e.g.

inline func add(a int, b int) int {
    return a + b
}

As a result of the inline func hint, Iroh will directly insert the code of add wherever it’s called instead of doing a regular function call, e.g.

// This code:
c = add(a, b)

// Gets transformed into:
c = a + b

This can be beneficial in performance-critical code as the overhead of the function call is eliminated. Similarly, noinline func can be used to prevent a function from being inlined, e.g.

noinline func something_complex() {
    ...
}

This can be useful in a number of cases, e.g. better debugging thanks to improved stack traces, minimizing instruction cache misses caused by ineffective inlining, etc.

The inline for mechanism can be used to inline loops, e.g.

elems = [1, 2, 3]

inline for elem in elems {
    process(elem)
}

If the length of the iterable is known at compile time, this will act as a hint to unroll the loop, i.e.

// This code:
inline for elem in elems {
    process(elem)
}

// Gets transformed into:
process(elems[0])
process(elems[1])
process(elems[2])

// Or if the element values are known, perhaps even:
process(1)
process(2)
process(3)

If the length of the iterable isn’t known at compile time, then the inline for will act as a hint to the compiler to more aggressively optimize the loop, e.g.

  • Inline any small function calls within the loop body.

  • Move any loop-invariant code outside of the loop.

  • If possible, convert the loop to use SIMD instructions.

Transmuting Values

The as keyword allows for a value to be reinterpreted as another type, e.g.

x, y = int128(1234) as [2]int64

This allows zero-copy reinterpretation of a value’s bits, e.g.

req = buf as APIRequest

Transmutations are checked for safety at edit-time. When edit-time verification isn’t possible (e.g., with dynamic length slices), runtime checks ensure safe conversion.

Symbolic Programming

Iroh has first-class support for symbolic programming. New symbols are declared using the sym keyword, e.g.

sym x, y, z

These can then be used with functions from the sym package in the standard library to do things like symbolic differentiation, e.g.

import * from "sym"

sym x

expr = x³ + sin(x) + exp(x)
y = diff(expr, x)

y == 3x² + cos(x) + exp(x) // true

Symbolic integration, e.g.

sym x

integrate(x² * sin(x), x) // -x²*cos(x) + 2x*sin(x) + 2*cos(x)

Solve equations algebraically, e.g.

sym x

solve(x² + 5x + 6 == 0, x) // [-3, -2]

Do algebraic simplification, e.g.

sym x

simplify((x² - 1)/(x - 1)) // x + 1

Over time, more and more functions will be added to the standard library, so that Iroh is competitive with existing systems like Mathematica and SymPy.

Function Decorators

Similar to decorators in Python, Iroh allows decorators to extend or modify the behaviour of functions and methods in a clean, reusable, and expressive way, e.g.

app = http.Router{}

@app.get("/items/#{item_id}", {response: .json})
func get_item(item_id int) {
    // fetch item from the database
    return {"item_id": item_id, "item": item}
}

Decorators are evaluated at compile-time, enabling extensibility without any runtime overhead. The built-in @decorator specifies if a function or method can be used as a decorator.

The first parameter to a decorator is always the function that is being decorated. The decorator can wrap the function or replace it entirely, e.g.

@decorator
func (r Router) get(handler function, path template, config Config) function {
    // register path with the router and handle parameter conversion
    return (handler.parameters) => {
        response = handler(handler.parameters)
        match config.response {
            .json:
                r.encode_json(response)
            default:
                // handle everything else
        }
    }
}

Software Transactional Memory

Iroh’s atomic blocks provide an easy way to take advantage of software transactional memory, e.g.

atomic {
    if sender.balance >= amount {
        sender.balance -= amount
        receiver.balance += amount
    }
    risky_operation(sender)
}

In the example above, if risky_operation should generate an error, then the whole atomic block will be rolled back, i.e. it’ll be as if the entire block of code had never been run.

Atomic blocks naturally compose, e.g.

func transfer(sender Account, receiver Account, amount currency) {
    atomic {
        if sender.balance >= amount {
            sender.balance -= amount
            receiver.balance += amount
        }
    }
}

atomic {
    transfer(alice, tav, 100)
    transfer(tav, zaia, 150)
}

Behind the scenes, the use of atomic essentially operates on “shadow” variables. The “real” variables are only overwritten once the entire block succeeds.

To make this efficient for certain data structures, e.g. large maps, the compiler will automatically switch them to use immutable variants that support efficient copies and rollbacks.
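The shadow-variable idea can be illustrated outside of Iroh with a small Python sketch. This is only an illustration under simplifying assumptions: the atomic helper, the dict-based accounts, and the deep copy are all hypothetical stand-ins, the code is single-threaded, and it does no conflict detection or retrying.

```python
import copy


def atomic(state, body):
    """Run body against a shadow copy of state and commit only on success."""
    shadow = copy.deepcopy(state)  # hypothetical "shadow" variables
    body(shadow)                   # an exception here leaves state untouched
    state.clear()
    state.update(shadow)           # overwrite the "real" values on success


def transfer(accounts, sender, receiver, amount):
    if accounts[sender] >= amount:
        accounts[sender] -= amount
        accounts[receiver] += amount


def risky_operation(accounts):
    raise RuntimeError("simulated failure")


def failing_block(shadow):
    transfer(shadow, "alice", "tav", 30)
    risky_operation(shadow)


accounts = {"alice": 100, "tav": 50}

try:
    atomic(accounts, failing_block)
except RuntimeError:
    pass  # the block was abandoned, so accounts is still {"alice": 100, "tav": 50}

# A block that succeeds is committed in one step.
atomic(accounts, lambda shadow: transfer(shadow, "alice", "tav", 30))
# accounts is now {"alice": 70, "tav": 80}
```

Copying the whole state is the naive version of this approach, which is why efficient copies matter for large data structures.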

For even greater control, types can define @undo variants of methods that will be called in reverse order to roll back aborted changes, e.g.

func (d Document) insert_text(text string, pos int) {
    ...
}

@undo
func (d Document) insert_text(text string, pos int) {
    // This variant automatically gets called to rollback changes.
}

Atomic blocks provide optimistic concurrency without needing any explicit locks. The outermost atomic block essentially forms a transaction, and all changes that it makes are tracked automatically.

If any two transactions conflict, i.e. concurrently write to the same values, then one will be automatically rolled back and retried indefinitely until it either completes or errors.

Instead of blindly retrying, transactions can be forced to retry only when the values that they’ve read have actually changed. This can be controlled by using the retry keyword, e.g.

atomic {
    if queue.is_empty() {
        retry
    }
    job = queue.pop()
    ...
}

While transactions can roll back internal state changes, it’s generally impossible to roll back external effects such as writing output, e.g.

atomic {
    if sender.balance < amount {
        print("ERROR: Insufficient balance!")
    }
}

In these cases, a good pattern is to only create side effects once an atomic block has finished running, e.g.

result = atomic {
    if sender.balance >= amount {
        sender.balance -= amount
        receiver.balance += amount
        => .successful_transfer
    }
    => .insufficient_funds
}

if result == .insufficient_funds {
    print("ERROR: Insufficient balance!")
}

Similarly, while atomic blocks support asynchronous I/O calls, idempotency keys should be used when calling external services, e.g.

idempotency_key = ...

atomic {
    ok = <- call_transfer_api({amount, idempotency_key})
    if ok {
        mirrored.balance -= amount
    }
}

Reactive Programming

Iroh supports reactivity natively. Any variable defined using := automatically updates whenever any variable that it depends on is updated, e.g.

a = 10
b = 20
c := a + b

b = 30
c == 40 // true

When a computed value can no longer be updated, i.e. when all the variables that it depends on are no longer updatable, it is no longer tracked, e.g.

func add(a int, b int) int {
  c := a + b
  return c // c is now untracked, as a and b can no longer be updated
}

To automatically perform side-effects whenever values are updated, the built-in onchange function can be given a handler function to run, e.g.

a = 10
b = 20
c := a + b

onchange {
  print("c = ${c}")
}

b = 30 // Outputs: c = 40

b = 30 // No output as no change to affect computed value

The onchange function returns a callback that can be used to explicitly remove the registered handler, e.g.

a = 10
b = 20
c := a + b

remove_printer = onchange {
  print("c = ${c}")
}

b = 40               // Outputs: c = 50
remove_printer()
b = 30               // No output as handler has been removed
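For readers who want a mental model of this behaviour, here is a rough Python sketch of computed values and change handlers. It is not how Iroh implements reactivity: the Cell and Computed classes are hypothetical, and dependencies are passed in by hand here, whereas the Iroh compiler works them out itself.

```python
class Cell:
    """A plain mutable value that notifies its dependents when it changes."""

    def __init__(self, value):
        self._value = value
        self._dependents = []

    def get(self):
        return self._value

    def set(self, value):
        if value != self._value:
            self._value = value
            for dependent in list(self._dependents):
                dependent.refresh()


class Computed:
    """A value derived from cells, recomputed when any input changes."""

    def __init__(self, compute, inputs):
        self._compute = compute
        self._handlers = []
        self._value = compute()
        for cell in inputs:
            cell._dependents.append(self)

    def get(self):
        return self._value

    def refresh(self):
        new_value = self._compute()
        if new_value != self._value:  # handlers only run on a real change
            self._value = new_value
            for handler in list(self._handlers):
                handler(new_value)

    def onchange(self, handler):
        self._handlers.append(handler)
        return lambda: self._handlers.remove(handler)  # removal callback


a, b = Cell(10), Cell(20)
c = Computed(lambda: a.get() + b.get(), inputs=[a, b])

seen = []
remove_printer = c.onchange(lambda value: seen.append(f"c = {value}"))

b.set(40)         # handler runs and records "c = 50"
b.set(40)         # no change to b, so nothing is recorded
remove_printer()
b.set(30)         # c becomes 40, but the handler has been removed
```

At the end of this run, seen holds the single entry "c = 50" and c.get() returns 40, mirroring the Iroh example above.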

As onchange automatically tracks all the computed values that a handler depends on, the handlers are automatically cleaned up whenever the values they depend on are dropped.

This primitive is what powers many mechanisms in Iroh, e.g. our component-based UIs use this mechanism to automatically re-render the UI whenever dependent state values change.

When computed variables are defined, their definition is evaluated lazily, allowing for circular definitions, e.g.

temp_celsius := 0
temp_celsius := (temp_fahrenheit - 32) * 5/9

temp_fahrenheit := 32
temp_fahrenheit := (temp_celsius * 9/5) + 32

In such cases, the first := defines the initial value, and the second definition defines how the value should be computed. This is cleaner than in other languages and frameworks, e.g. Vue:

import { ref, computed } from 'vue'

const celsius = ref(0)
const fahrenheit = ref(32)

const celsiusComputed = computed({
  get: () => (fahrenheit.value - 32) * 5/9,
  set: (value) => fahrenheit.value = (value * 9/5) + 32
})

const fahrenheitComputed = computed({
  get: () => (celsius.value * 9/5) + 32,
  set: (value) => celsius.value = (value - 32) * 5/9
})

As the compiler knows which methods mutate underlying data structures, reactivity works automatically for complex data structures like slices and maps, e.g.

users = [{name: "Tav"}, {name: "Alice"}]
user_count := len(users)

users.append({name: "Zeno"})
user_count == 3 // true

As the compiler tracks dependencies and data flow, it’s able to optimally update values without much overhead, and in pretty much the same way that a developer would manually do so.

For certain operations, such as filtering a slice, the compiler is able to avoid unnecessary work by only filtering new elements and adding any matches to the computed value, e.g.

users = [{name: "Tav", admin: true}, {name: "Alice", admin: false}]
admins := users.filter { $0.admin == true }

// As users is mutated, admins would need to be recomputed.
// However, the filter is only run on the newly appended item
// to the slice.
users.append({name: "Zeno", admin: false})

This mechanism is applied even to nested collections, e.g.

users = [...]

active_admins := users
    .filter { $0.admin }
    .filter { len($0.recent_messages) > 0 }

Fields on struct types can also be reactive, e.g.

Person = struct {
    first_name  string
    last_name   string
    full_name   := "${first_name} ${last_name}"
}

alice = Person{
    first_name: "Alice"
    last_name: "Fung"
}

alice.full_name == "Alice Fung" // true

Whenever a reactive computation can generate errors, the errors must be explicitly handled so that computed variables don’t need to propagate errors, e.g.

channel = "#espra"
msgs := try fetch_recent_messages(channel) or []
starred := msgs.filter { $0.starred }

The compiler will automatically batch updates based on dataflow analysis. This analysis is aware of I/O boundaries, so can atomically update changes even when dealing with async calls.

However, for cases where explicit control is needed, an atomic block can be used, e.g.

channel = "#espra"
atomic {
    msgs := <- try fetch_recent_messages(channel) or []
    starred := msgs.filter { $0.starred }
    pinned := <- try fetch_pinned_messages(channel) or []
    first_pinned := pinned.get(0)
}

Package Imports

Iroh supports referencing code from other packages using the import keyword. Like Go and JavaScript, we use static strings for referencing the package paths which can be:

  • Tilde links to refer to packages on Espra:

    import "~1.tav/some/package"
    
  • URL paths to refer to packages on some git repo that’s accessible over HTTPS:

    import "github.com/tav/package"
    
  • Relative paths starting with either ./ or ../ to refer to local packages:

    import "./some/package"
    
  • Absolute paths to refer to packages within the standard library:

    import "os/exec"
    

Imported packages are automatically aliased to a package identifier. This defaults to the package_name as specified by the package’s metadata, which in turn defaults to the package’s file name.

Package names must satisfy the following regular expression:

^[a-z]([a-z0-9_]*[a-z0-9])?$

To support predictable naming, the package_name must match one of the last two slash-separated elements of the import path, e.g.

  • If the package is imported as "~1.tav/json/v2", then the package_name must be either json or v2.

An automatic conversion from hyphens in an import path to underscores in the package_name is also supported for those wanting to use hyphens for aesthetic purposes, e.g.

import "github.com/tav/chart-widget" // imported as chart_widget

Likewise, any dots in the package_name are converted to underscores, e.g.

import "~1.tav/package/csv.lib" // imported as csv_lib
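To make the naming rules concrete, the following Python sketch checks a name against the regular expression above and lists the names that a package at a given import path could declare. The function names are hypothetical, and the sketch assumes the hyphen and dot conversions are applied to each path element before matching.

```python
import re

# The package name pattern quoted above.
PACKAGE_NAME = re.compile(r"^[a-z]([a-z0-9_]*[a-z0-9])?$")


def is_valid_package_name(name):
    return PACKAGE_NAME.fullmatch(name) is not None


def allowed_package_names(import_path):
    """Return the names a package at import_path may use as its package_name."""
    # Only the last two slash-separated elements are candidates.
    elements = import_path.strip("/").split("/")[-2:]
    # Hyphens and dots are converted to underscores.
    normalized = [e.replace("-", "_").replace(".", "_") for e in elements]
    return [name for name in normalized if is_valid_package_name(name)]


print(allowed_package_names("~1.tav/json/v2"))
print(allowed_package_names("github.com/tav/chart-widget"))
print(allowed_package_names("~1.tav/package/csv.lib"))
```

Under these assumptions, the three calls yield the candidates json and v2, tav and chart_widget, and package and csv_lib respectively.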

In the case of conflicts, or even just for personal preference, imported packages can be bound to a different package identifier by using the as keyword, e.g.

import "~1.tav/json/v2" as fastjson

For importing non-code resources like images, custom importers can be specified via the using keyword on import statements, e.g.

import "github.com/micrypt/image"
import "~1.tav/app/hero.jpg" using image.optimize

Custom importers need to implement the built-in importer interface and are evaluated at compile-time. They are passed the resource data and metadata from the Espra tilde link.

As with all package dependencies, custom imports can also be updated by refreshing the package manifest, e.g. to fetch and use a newer version of a resource.

To use the same custom importer for multiple packages, the with statement can be used, e.g.

import "github.com/micrypt/bundle"

with {
    .importer = bundle.assets
} {
    import "~1.tav/logo/espra.png"
    import "~1.tav/logo/ethereum.png"
    import "~1.tav/logo/bitcoin.png"
}

Code within imported packages is referenced using dot syntax, e.g.

fastjson.encode([1, 2, 3])

Explicitly referencing the packages at call sites generally makes code easier to understand, rather than importing references from packages, e.g.

  • We believe that it’s easier to know what’s going on in:

    import (
      "github.com/tav/encode/json"
      "github.com/tav/encode/yaml"
    )
    
    json.encode([1, 2, 3])
    yaml.encode([1, 2, 3])
    

    Than in something like:

    use serde_json::to_string;
    use serde_yaml::to_string as to_string_yaml;
    
    to_string([1, 2, 3])
    to_string_yaml([1, 2, 3])
    

However, there are certain use cases where constantly referencing the package name will add unnecessary noise, e.g. when referencing JSX-esque components, unit definitions, enums, etc.

For these cases, a special * import form can be used, e.g.

import * from "github.com/tav/calendar"

By default, this will import all exported references from the package that start with an upper case letter, any units, and the package itself. This makes it cleaner to use some packages, e.g.

import * from "github.com/tav/calendar"

func App(initial_date string) {
    selected_date = calendar.parse_rfc3339(initial_date)
    return (
        <CalendarWidget selected_date>
            <Heading>"Event Start"</Heading>
        </CalendarWidget>
    )
}

Package authors can use the starexport keyword to explicitly control what exports will be available directly in other packages when they import via *, e.g.

starexport CalendarWidget, parse_rfc3339

In script mode, it’ll be possible to import all exported identifiers into the current package using the special ** import syntax, e.g.

import ** from "github.com/tav/math"

This feature can be enabled as a config option in script mode. For example, the Iroh REPL will have this always on, as it can get tedious to keep typing the package name repeatedly inside the REPL, e.g.

// The standard approach involves a lot more typing in the REPL:
y = math.sqrt(math.sin(x) + math.cos(x))

// The ** imports make it shorter:
y = sqrt(sin(x) + cos(x))

If imported identifiers conflict in ** imports, identifiers from later imports will override those with the same name from previous ones.

Visibility Control

Within packages, everything defined within the package’s global scope, i.e. variables, functions, types, and even the fields within the types, is fully visible to the rest of the code in the package.

However, outside of the package, i.e. in code that imports a package, visibility is constrained:

  • For a type value to be accessible, its identifier must start with a Latin capital letter, i.e. A to Z.

  • For a non-type value to be accessible, i.e. a const value, function, a field within a public type, etc., its identifier must not start with an _.

For example, if a model package defined:

supported_countries = ["China", "UK", "USA"]

_offensive_names = []

location = struct {
    lat  int
    lng  int
}

Person = struct {
    name            string
    country         string
    _date_of_birth  time.Date
}

func (p Person) is_over_21() bool {
    ...
}

func (p Person) _has_offensive_name() bool {
    ...
}

func _calc_age(p Person) int {
    ...
}

Then, in a package that imported it:

// Accessible
model.Person
model.supported_countries

person.name
person.country
person.is_over_21()

// Inaccessible
model.location
model._offensive_names
model._calc_age

person._date_of_birth
person._has_offensive_name()

Iroh’s approach reflects established norms within the programming community of prefixing private fields with underscores, and avoids the need for public and private visibility modifiers.

Environment Variables

Iroh provides built-in functions like $env.get and $env.lookup to get the values of environment variables, e.g.

log_level = $env.get("LOG_LEVEL")

Similarly, $env.set can be used to set these values, e.g. when you need to spawn external commands with that env value:

$env.set("LOG_LEVEL", "warn")

All environment variable values can be iterated over using $env.all, e.g.

for env_name, env_value in $env.all() {
    // Do something with each environment variable.
}

As this can get noisy in scripts, Iroh also provides syntactic sugar for variables that start with $ and are followed by a sequence of upper case Latin letters, numbers, or underscores, e.g.

log_level = $LOG_LEVEL

Besides being a shortcut for getting environ values, they can also be used to update values easily, e.g.

$LOG_LEVEL = "warn"

Well-established environment variables which need to be treated as lists, such as $PATH and $CFLAGS, are transparently converted into string slices, e.g.

$PATH == [
    "/usr/local/bin",
    "/usr/bin",
    "/bin",
    "/usr/sbin",
    "/sbin"
]

When this value gets manipulated, the underlying environment variable gets updated, e.g.

$PATH.prepend("/home/tav/bin")

These can be cast to a string to get the encoded form, e.g.

path = string($PATH)

path == "/home/tav/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin" // true

Other well-established environment variables are also appropriately typed, e.g. $VERIFY_SSL and $NO_COLOR are treated as booleans, $HTTPS_PORT and $TIMEOUT are treated as integers, etc.

This typed nature allows for default values to be set easily, e.g.

timeout = $TIMEOUT or 60

Environment variable values are of the environ data type. Custom registrations can be defined at compile-time, e.g.

environ.register("ESPRA_DEBUG", type: bool)

environ.register("KICKASS_IMPORT_PATH", type: []string, delimiter: ";")

Any $ references to those environment variable names will then be treated as expected, e.g.

$KICKASS_IMPORT_PATH.prepend("/home/tav/kickass")

string($KICKASS_IMPORT_PATH) == "/home/tav/kickass;/usr/local/kickass" // true
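The way a list-typed variable stays in sync with its encoded string form can be pictured with a short Python sketch. The ListVar class is purely hypothetical and is backed by a plain dict here rather than the real process environment.

```python
import os


class ListVar:
    """A delimited environment variable exposed as a list of strings."""

    def __init__(self, name, delimiter=os.pathsep, environ=None):
        self.name = name
        self.delimiter = delimiter
        # Fall back to the real environment only when no mapping is given.
        self.environ = os.environ if environ is None else environ

    def values(self):
        raw = self.environ.get(self.name, "")
        return raw.split(self.delimiter) if raw else []

    def prepend(self, value):
        # Every mutation writes the re-encoded string back to the variable.
        self.environ[self.name] = self.delimiter.join([value, *self.values()])

    def __str__(self):
        return self.environ.get(self.name, "")


env = {"KICKASS_IMPORT_PATH": "/usr/local/kickass"}
import_path = ListVar("KICKASS_IMPORT_PATH", delimiter=";", environ=env)

import_path.prepend("/home/tav/kickass")
print(import_path.values())
print(str(import_path))
```

The list view and the encoded string are two readings of the same underlying value, so anything that inherits the environment sees the update.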

By default, changes to environment variables are applied globally and inherited by all sub-processes. To limit a value to a specific lexical scope, the with statement can be used, e.g.

with {
    $TIMEOUT = 120
} {
    // Execute external commands.
}

System & Process Info

Common system and process-related info can also be found in some $-prefixed variables:

  • $arch

    • The CPU architecture, e.g. arm64, x64, etc.
  • $args

    • List of command-line arguments without the binary and script names.
  • $argv

    • List of command-line arguments including the binary and script names.
  • $available_memory

    • Currently available memory in bytes.
  • $boot_time

    • Timestamp of when the system was last booted.
  • $cloud_info

    • Access cloud metadata (on supported platforms).
  • $container_info

    • Access container metadata and runtime info (on supported platforms).
  • $cpu_count

    • Number of available CPU cores/threads.
  • $cpu_info

    • Details on the system CPUs.
  • $cwd

    • The current working directory.
  • $debug_build

    • Whether this is a debug or release build.
  • $disk_info

    • Details on the system disks.
  • $effective_gid

    • The effective group ID (on supported platforms).
  • $effective_uid

    • The effective user ID (on supported platforms).
  • $env

    • Access environment variables.
  • $exit_code

    • Exit code of the last executed command/process.
  • $groups

    • List of group IDs that the user belongs to (on supported platforms).
  • $gpu_info

    • Details on the system GPUs.
  • $interactive

    • Boolean indicating whether the process is running within an interactive session.
  • $iroh_mode

    • The current Iroh execution mode.
  • $iroh_version

    • Version of the Iroh runtime/compiler.
  • $locale

    • The current locale setting.
  • $home

    • The current user’s home directory path.
  • $hostname

    • The hostname of the system.
  • $machine_id

    • Persistent identifier for the machine.
  • $max_memory

    • Memory limit for the process.
  • $max_open_files

    • File descriptor limit for the process.
  • $max_processes

    • Processes limit for the process.
  • $mem_info

    • Details on the system memory.
  • $network_info

    • Details on the system network interfaces.
  • $os

    • The current operating system, e.g. linux, macos, windows, etc.
  • $page_size

    • The memory page size of the underlying system.
  • $parent_pid

    • The parent process ID.
  • $pid

    • The process ID of the current process.
  • $process_limits

    • Details on any limits that apply to the current process.
  • $process_start_time

    • Timestamp of when the current process started.
  • $real_gid

    • The real group ID of the current process (on supported platforms).
  • $real_uid

    • The real user ID of the current process (on supported platforms).
  • $saved_gid

    • The saved group ID for privilege restoration (on supported platforms).
  • $saved_uid

    • The saved user ID for privilege restoration (on supported platforms).
  • $session_id

    • The process session ID.
  • $stderr_tty

    • Checks whether standard error is attached to a TTY.
  • $stdin_tty

    • Checks whether standard input is attached to a TTY.
  • $stdout_tty

    • Checks whether standard output is attached to a TTY.
  • $term_colors

    • Number of colours supported by the terminal.
  • $term_height

    • The current terminal height.
  • $term_info

    • Details about the terminal capabilities and type.
  • $term_width

    • The current terminal width.
  • $temp_dir

    • The default root directory for temporary files.
  • $timezone

    • The system timezone.
  • $total_memory

    • Total memory in bytes.
  • $user

    • The current user’s username.
  • $user_cache_dir

    • The default root directory for user-specific cache data.
  • $user_config_dir

    • The default root directory for user-specific config data.
  • $virtualization_info

    • Details about the virtualization environment (on supported platforms).

Shell Interaction

Iroh provides programmatic access to running external commands via the os/exec package in the standard library. It also provides various syntactic sugar to make this easier.

An exec.Cmd value, representing an external command to run, can be constructed by prefixing a slice of strings with $, e.g.

cmd = $["git", "checkout", commit]

type(cmd) == exec.Cmd // true

The returned value can be configured as needed, e.g. to set custom environment variables, use a custom reader for the command’s stdin, and a custom writer as its stderr:

cmd = $["ffmpeg", "-i", "-", "-c:v", "libx264", "-f", "mp4", "-"]

cmd.env = {
    "AV_LOG_LEVEL": "verbose"
}

cmd.stdin = mkv_file
cmd.stderr = err_buf

To inherit the current environment variable values, $env.with can be used, e.g.

cmd.env = $env.with {
    "AV_LOG_LEVEL": "verbose"
}

Methods on exec.Cmd values allow for fine-grained control over command execution, e.g.

  • cmd.output — start the command, wait for it to finish, and return the contents of its standard output.

  • cmd.run — start the command and wait for it to finish.

  • cmd.start — start the command without waiting for it to finish.

A started command has additional methods, e.g.

  • cmd.memory_usage — details about the memory used by the process.

  • cmd.pid — the process identifier.

  • cmd.wait — wait for the started command to finish.

Iroh provides shell-like syntax for running commands, piping output, etc. These can be executed by quoting the command within backticks, e.g.

commit = `git rev-parse HEAD`

This starts the given command, waits for it to finish, and if successful, i.e. gets a 0 exit code from the sub-process, returns the standard output after trimming.

As in most shells, whitespace is treated as a separator between arguments, and normally needs to be escaped, e.g.

output = `/home/tav/test\ code/chaos-test --threads 4`

The \ escape of the whitespace in the command above makes it equivalent to:

cmd = $["/home/tav/test code/chaos-test", "--threads", "4"]
output = cmd.output()

The usual ' and " quote marks can be used to avoid the need for escaped whitespace, e.g.

output = `chaos-test --select "byzantine node"`

String interpolation can be used within backtick commands, e.g.

output = `git checkout ${commit}`

Values being interpolated are escaped automatically. If it’s a slice of strings, then it is treated as multiple space separated arguments. Otherwise, as a string value.

This helps to prevent a range of security vulnerabilities, e.g.

dangerous_input = "'; rm -rf /; echo '"
output = `echo ${dangerous_input}` // Safely escaped.

While still allowing for multiple arguments to be passed in safely, e.g.

files = ["file 1.md", "file 2.md"]
output = `cat ${files}` // Becomes: cat "file 1.md" "file 2.md"
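The safety property here comes from building an argument list rather than a string that a shell re-parses. The following Python sketch illustrates the idea. The build_argv helper is hypothetical and simplified: it only supports interpolated values at the end of the command, and uses the standard shlex module to split the literal part.

```python
import shlex


def build_argv(command, *interpolated):
    """Build an argument list where interpolated values are never re-split."""
    # Literal command text is split on whitespace, honouring quotes.
    argv = shlex.split(command)
    for value in interpolated:
        if isinstance(value, str):
            argv.append(value)   # a string is always exactly one argument
        else:
            argv.extend(value)   # a list contributes one argument per element
    return argv


dangerous_input = "'; rm -rf /; echo '"
echo_argv = build_argv("echo", dangerous_input)
# echo_argv has two elements: the whole input is a single argument to echo.

files = ["file 1.md", "file 2.md"]
cat_argv = build_argv("cat", files)
# cat_argv is ["cat", "file 1.md", "file 2.md"]

# shlex.join shows how the list would look if written out as a shell command.
print(shlex.join(cat_argv))
```

Because the arguments are passed as a list, no shell ever re-parses the interpolated value, so characters like ; and ' have no special meaning.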

Commands can be pipelined, e.g.

output = `cat file.txt | grep "Error"`

Outputs can also be redirected, e.g. to write to a file:

`cat file1.txt file2.txt > new.txt`

Or to append to a file:

`cat file1.txt file2.txt >> new.txt`

By default, only the standard output is redirected. Most shells, like Bash, use complex syntax for controlling what gets redirected or piped, e.g.

ls /nonexistent 2>&1 1>/dev/null | grep "No such"

Iroh instead uses more explicit @keywords in front of the | pipe, or the > and >> redirect operators, e.g.

`ls /nonexistent @stderr | grep "No such"`

This takes the following values for piping or redirecting:

  • @all — all streams, including standard output and error.

  • @both — both the standard output and error.

  • @stderr — just the standard error.

  • @stdout — just the standard output (default behaviour).

  • @stream:N — a specific file descriptor, e.g. @stream:3.

Iroh doesn’t support input redirection or heredocs within backtick commands as we believe that linear pipelines are easier to understand, i.e. when they go left to right.

Instead, when data needs to be piped in, the output of cat can be used, or any suitably typed value, i.e. a string, []byte, or io.Reader, can be piped into a backtick command using |, e.g.

mp4_file = mkv_file | `ffmpeg -i - -c:v libx264 -f mp4 -`

This acts as syntactic sugar for:

cmd = $["ffmpeg", "-i", "-", "-c:v", "libx264", "-f", "mp4", "-"]
cmd.stdin = mkv_file

mp4_file = cmd.output()

Likewise, output can be redirected to an io.Writer, e.g.

`git rev-parse HEAD` > file("commit.txt")

// Or even appended:
`git rev-parse HEAD` >> file("commits.txt")

Conditional execution within backtick commands can be controlled using and and or, e.g.

`command1 and command2` // Only run command2 if command1 succeeds.
`command1 or command2`  // Only run command2 if command1 fails.

Iroh supports automatic globbing when wildcard patterns are specified, e.g.

output = `cat *.log`

The following syntax is supported for globbing:

  • * — matches any string, including empty.

  • ? — matches any single character.

  • [abc] — matches any one character in the set.

  • [a-z] — matches any one character in the range.

  • {foo,bar} — matches alternates, i.e. either foo or bar.

  • ** — matches directories recursively.

For example:

Pattern        Example Matches
*.md           iroh.md, _doc.md
file?.md       file1.md, fileA.md
[abc]*.py      a.py, car.py
[a-z]*.py      a.py, car.py, test.py
{foo,bar}.sh   foo.sh, bar.sh
**/*.ts        lib/main.ts, tests/fmt_test.ts

When the ' single quote is used within backtick commands, globbing is not applied, e.g.

output = `cat log | grep '*error'`

In the interests of safety and predictability, globbing is also not applied to any interpolated values, e.g.

filename = "*.log"
output = `cat ${filename}`

If explicit globbing is desired, then the built-in glob function can be used, e.g.

files = glob("*.log")
output = `cat ${files}`

This can also be used for quickly iterating over matching patterns, e.g.

for file in glob("*.md") {
    // Do something with each of the Markdown files.
}

Iroh supports command substitution in arguments, e.g.

output = `echo ${`date`}`

But since the interpolated value is treated as a single argument, something like the following won’t work:

output = `grep "ERROR" ${`find . -name "*.log"`}`

The inner command output will need to be turned into a slice of strings first, e.g.

output = `grep "ERROR" ${`find . -name "*.log"`.split("\n")}`

Or, even better:

files = `find . -name "*.log"`.split("\n")
output = `grep "ERROR" ${files}`

We believe this makes code more readable. It also makes life safer than in shells like Bash where inputs can cause unexpected outcomes depending on the IFS value and how it’s quoted.
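
The difference between interpolating a string and interpolating a slice can be modelled as building an argument vector with no word splitting. This Python sketch only illustrates that rule. build_argv and the sample file names are assumptions, not part of Iroh:

```python
def build_argv(*parts):
    """Flatten command parts into an argv list without any word splitting."""
    argv = []
    for part in parts:
        if isinstance(part, (list, tuple)):
            argv.extend(str(item) for item in part)  # a slice spreads into many arguments
        else:
            argv.append(str(part))  # a string always stays one argument
    return argv

found = "a.log\nb log.log"  # hypothetical output of the inner find command

single = build_argv("grep", "ERROR", found)              # one (wrong) file argument
spread = build_argv("grep", "ERROR", found.split("\n"))  # one argument per file
```

Because nothing is ever re-split on whitespace, a file name containing a space, such as "b log.log", stays a single argument in both cases.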

Errors generated when running commands, e.g. when they return a non-zero exit code, can be handled by using the try keyword, e.g.

output = try `ls /nonexistent`

If, instead of generating errors, an explicit response object is preferred, then the backtick command can be prefixed with a $. This returns a value with the exit_code, stdout, stderr, etc.

response = $`ls /nonexistent`

response.exit_code != 0 // true (the exact code varies by platform, e.g. 2 with GNU ls)
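
The two styles resemble the check=True and check=False modes of Python's subprocess.run. The sketch below is an analogy only. run_or_raise and run_response are hypothetical names, and a child Python process stands in for a failing command so that the exit code is the same on every platform:

```python
import subprocess
import sys

# A portable stand-in for a failing command: a child process that exits with status 3.
FAILING = [sys.executable, "-c", "import sys; sys.exit(3)"]

def run_or_raise(argv):
    """Analogue of try `cmd`: a non-zero exit becomes an error."""
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout

def run_response(argv):
    """Analogue of $`cmd`: always return an object describing the outcome."""
    return subprocess.run(argv, capture_output=True, text=True, check=False)

def demo():
    try:
        run_or_raise(FAILING)
        raised = False
    except subprocess.CalledProcessError as err:
        raised = err.returncode == 3
    return raised, run_response(FAILING).returncode
```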

By default, all backtick commands are waited on to finish running. The & operator can be used after a backtick command to return a background job instead, e.g.

job = `sleep 10` &

Backgrounded processes can be signalled with platform-supported signals like .sigterm, e.g.

if job.is_running() {
    job.signal(.sigkill)
}
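
A comparable pattern in Python uses subprocess.Popen, which returns immediately with a handle to the running process. This is a sketch of the idea under that analogy. start_background and stop are hypothetical helper names:

```python
import subprocess
import sys

def start_background(argv):
    """Start a child process without waiting for it."""
    return subprocess.Popen(argv)

def stop(job):
    """Kill the job if it is still running and return its exit status."""
    if job.poll() is None:  # None means the process has not exited yet
        job.kill()          # SIGKILL on POSIX, TerminateProcess on Windows
    return job.wait()

def demo():
    job = start_background([sys.executable, "-c", "import time; time.sleep(10)"])
    was_running = job.poll() is None
    status = stop(job)
    return was_running, status
```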

If a command needs to be run so that the user can directly type in any input, and see the output as it happens, then the $() form can be used:

$(ls -al)

Besides calling external processes, Iroh also supports running local commands defined within Iroh. These need to satisfy the interface:

localcmd = interface {
    __init__((stdin: io.Reader, stdout: io.Writer, stderr: io.Writer))
    __call__(args ...string) exit_code
}

They can be registered with a specific name, e.g.

localcmd.register("chaos-test", ChaosTest)

And can then be used like any external command, e.g.

output = `chaos-test --threads 4`
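
One way to picture the mechanism is a registry that is consulted before falling back to external programs. The Python model below is a sketch under that assumption. LOCAL_COMMANDS, register, run, and the behaviour of ChaosTest are all invented for illustration:

```python
import io

LOCAL_COMMANDS = {}  # hypothetical registry: command name -> command class

def register(name, cmd_class):
    LOCAL_COMMANDS[name] = cmd_class

class ChaosTest:
    """Toy local command mirroring the __init__/__call__ shape of the interface."""

    def __init__(self, stdin, stdout, stderr):
        self.stdin, self.stdout, self.stderr = stdin, stdout, stderr

    def __call__(self, *args):
        if "--threads" not in args[:-1]:  # the flag must be followed by a value
            self.stderr.write("missing --threads\n")
            return 2
        threads = args[args.index("--threads") + 1]
        self.stdout.write(f"running with {threads} threads\n")
        return 0

def run(argv):
    """Dispatch to a registered local command; return (exit_code, stdout, stderr)."""
    name, args = argv[0], argv[1:]
    if name not in LOCAL_COMMANDS:
        raise LookupError(f"{name} is not a local command")  # a real shell would search PATH here
    out, err = io.StringIO(), io.StringIO()
    exit_code = LOCAL_COMMANDS[name](io.StringIO(), out, err)(*args)
    return exit_code, out.getvalue(), err.getvalue()

register("chaos-test", ChaosTest)
```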

Certain built-in commands like cd are implemented like this, and can thus also be called as plain functions, e.g.

cd("silo/espra")
commit = `git rev-parse HEAD`

As changing working directories is a common need in shell scripting, the with statement can be used to change the working directory for the lexical scope, e.g.

with {
    .cwd = "silo/espra"
} {
    // Do things in the silo/espra sub-directory here.
}
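
A rough Python counterpart is a context manager that restores the previous directory on exit. It is only an approximation: os.chdir changes the working directory for the whole process, whereas the with statement above is described as scoped to the lexical block. working_directory is a hypothetical helper:

```python
import contextlib
import os
import tempfile

@contextlib.contextmanager
def working_directory(path):
    """Switch to path for the duration of the block, then restore the old cwd."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # restored even if the block raises

def demo():
    before = os.getcwd()
    target = os.path.realpath(tempfile.mkdtemp())
    with working_directory(target):
        inside = os.path.realpath(os.getcwd())
    return inside == target, os.getcwd() == before
```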

Finally, in script mode, if interactive shell support is enabled, Iroh allows for shell commands to be run without needing to be encapsulated within $(), e.g.

cd silo/espra
git rev-parse HEAD > commit.txt

This works by:

  • First, trying to interpret the line as non-shell code.

  • If that fails, trying to treat the line as if it were encapsulated within $().

  • If neither succeeds, generating an error.

For example:

cd silo/espra
commit = `cat commit.txt`

if commit.starts_with("abcdef") {
    // Celebrate!
}

This allows the “normal” programming aspects of the Iroh language to be seamlessly interwoven with shell code within scripts, the Iroh REPL/Shell, etc.

C Interoperability

Iroh aims to match the high bar set by Zig for C interoperability with zero overhead. Like Zig, we ship with a C compiler and linker so that C code can be imported and used just like Iroh packages, e.g.

import "github.com/micrypt/glfw-iroh" as c

if c.glfwInit() == 0 {
    ...
}

window = c.glfwCreateWindow(800, 600, "Hello GLFW from Iroh", nil, nil)
if window == nil {
    ...
}

c.glfwMakeContextCurrent(window)

for c.glfwWindowShouldClose(window) == 0 {
    ...
}

While complex macros don’t get translated, constants using #define get imported automatically, e.g.

import "github.com/tav/limits" as c

// C: #define INT_MAX 2147483647
max_int = c.INT_MAX

Iroh supports a number of data types that match whatever the C compiler would produce for a target platform:

Iroh            C Type                    Typical Size
c_char          char                      8 bits
 ⤷ Platform-dependent signedness
c_schar         signed char               8 bits
c_uchar         unsigned char             8 bits
c_short         short int                 16 bits
c_ushort        unsigned short int        16 bits
c_int           int                       32 bits
c_uint          unsigned int              32 bits
c_long          long int                  32/64 bits
 ⤷ 32-bit on Windows, 64-bit on Unix
c_ulong         unsigned long int         32/64 bits
 ⤷ 32-bit on Windows, 64-bit on Unix
c_longlong      long long int             64 bits
c_ulonglong     unsigned long long int    64 bits
c_size_t        size_t                    Pointer size
 ⤷ Same as uint typically
c_ssize_t       ssize_t                   Pointer size
 ⤷ Same as int typically
c_ptrdiff_t     ptrdiff_t                 Pointer size
 ⤷ For pointer arithmetic
c_float         float                     32 bits
c_double        double                    64 bits
c_longdouble    long double               64/80/128 bits
 ⤷ Platform-dependent precision
c_string        char*                     -
 ⤷ Alternatively: [*]c_char
!c_string       const char*               -
 ⤷ Alternatively: ![*]c_char
c_union         union                     -
c_void          void                      -
opaque          void*                     -
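
The "typical" sizes can be checked on any given machine with Python's ctypes module, which exposes similarly named types that follow the platform's C ABI. This is just a convenient way to inspect the host platform and says nothing about how Iroh derives its own sizes:

```python
import ctypes

# Sizes in bytes of a few C types, as seen by the C ABI of the running platform.
SIZES = {
    "c_char": ctypes.sizeof(ctypes.c_char),
    "c_short": ctypes.sizeof(ctypes.c_short),
    "c_int": ctypes.sizeof(ctypes.c_int),
    "c_long": ctypes.sizeof(ctypes.c_long),
    "c_longlong": ctypes.sizeof(ctypes.c_longlong),
    "c_size_t": ctypes.sizeof(ctypes.c_size_t),
    "c_double": ctypes.sizeof(ctypes.c_double),
    "pointer": ctypes.sizeof(ctypes.c_void_p),
}

for name, size in SIZES.items():
    print(f"{name:12} {size * 8} bits")
```

On 64-bit Windows the c_long row prints 32 bits, while on 64-bit Linux and macOS it prints 64 bits, which is the platform difference noted in the table.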

C function signatures can be specified with an extern and called directly from Iroh code, e.g.

extern func process_data(buf [*]c_char, len c_size_t, scale c_double) c_int

result = process_data(buf, len(buf), 1.23)

Variadic functions can be called as expected, e.g.

extern func printf(format c_string, ...) c_int

printf("Hello %s, number: %d\n", "world", 42)

To preserve ABI compatibility with C without any overhead, C code can only be called from within the .single_threaded and .multi_threaded schedulers with non-gc allocators.

In those instances, there is no marshalling overhead and the native C calling convention is followed without any runtime interference, e.g.

// This compiles to the same assembly as the equivalent C function.
func add(a c_int, b c_int) c_int {
    return a + b
}

To support callbacks from C, the type signature of Iroh functions can specify that they use the C calling convention, e.g.

extern func register_callback(cb func(c_int) callconv(.c) c_void) c_void

func my_callback(x c_int) callconv(.c) c_void {
    // Handle callback value.
}

register_callback(my_callback)

To match C’s memory layout for structs, the extern struct keyword needs to be used, e.g.

Point = extern struct {
    x  c_int
    y  c_int
}

The c_union data type matches unions in C, e.g.

Value = c_union {
    i      c_int
    f      c_float
    bytes  [4]c_char
}

These are untagged and you must know which field is active. Accessing an inactive field can result in garbage data or even trigger a hardware trap, e.g.

Value = c_union {
    i      c_int
    f      c_float
    bytes  [4]c_char
}

v = Value{i: 42} // The .i field is active
x = v.i          // This is OK
y = v.f          // This is not OK
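
What "garbage data" means here can be seen with an equivalent union built using Python's ctypes: reading the float field simply reinterprets the bytes written through the int field. The sketch mirrors the Value layout above, and the struct round-trip is added only to show that the result is a plain bit reinterpretation:

```python
import ctypes
import struct

class Value(ctypes.Union):
    _fields_ = [
        ("i", ctypes.c_int),
        ("f", ctypes.c_float),
        ("bytes", ctypes.c_char * 4),
    ]

v = Value(i=42)  # write through the .i field
as_int = v.i     # reads back 42
as_float = v.f   # reinterprets the same 4 bytes as a float

# The same reinterpretation done explicitly with struct, in native byte order.
expected_float = struct.unpack("f", struct.pack("i", 42))[0]

print(ctypes.sizeof(Value), as_int, as_float)
```

The float read is a tiny subnormal number rather than 42.0, since the integer's bit pattern is interpreted as an IEEE 754 value.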

While the compiler can detect certain accesses as unsafe and generate edit-time errors, this is not always possible, e.g. when calling external libraries, so such accesses are marked as unsafe.

Alignment can be forced if needed with align, e.g.

Point = extern struct {
    x  c_int  align(16)
    y  c_int
}

Exact bit control can be done using packed, e.g.

ColorWriteMaskFlags = extern struct(uint32, packed) {
    red: false,
    green: false,
    blue: false,
    alpha: false,
}

The C allocator can be used from Iroh via mem.c_allocator, e.g.

with {
    ctx.allocator = mem.c_allocator()
} {
    ...
}

As calling C code is inherently unsafe, Iroh makes limited safety guarantees when C code is called:

  • If the C code being called was compiled by Iroh and is well-defined and does not cause any undefined behaviour according to the C23 standard, it is marked as safe.

  • Otherwise, the call to C is marked as unsafe. Such unsafe code will not be allowed in contexts like onchain-script mode, and will need to be explicitly approved otherwise.

Relatedly, as the [*] C-style pointers can be null pointers, they explicitly need to be checked for nil before potentially unsafe operations like accessing members, indexing, etc.

Finally, like Zig, Iroh also ships with multiple sets of libc headers that allow for easy cross-compilation for various target platforms.

Dynamically Generating Assembly

Compilers are not perfect. There will always be edge cases where a developer will be able to get better performance from a machine by writing raw assembly themselves.

Iroh provides a genasm keyword and an associated asm package that provides rich support for generating assembly code programmatically. These can be used inside function bodies, e.g.

import * from "asm"

func popcount(v uint32) uint32 {
    result = uint32(0)
    genasm {
        XORL(result, result)
        loop:
            TESTL(v, v)
            JZ(:done)
            SHRL(v, 1)
            ADCL(result, 0)
            JMP(:loop)
        done:
    }
    return result
}
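
The loop's logic can be modelled in a few lines of Python, which is handy as a reference when testing generated code. This is a plain simulation of the shift-and-add-carry idea, not output from genasm, and popcount_shift_carry is a hypothetical name:

```python
def popcount_shift_carry(v):
    """Count set bits by shifting right and adding the bit that falls off."""
    v &= 0xFFFFFFFF      # treat the input as a uint32
    result = 0           # XORL zeroes the result register
    while v != 0:        # TESTL + JZ(:done) exit once no bits remain
        carry = v & 1    # SHRL moves the lowest bit into the carry flag
        v >>= 1
        result += carry  # ADCL with 0 adds just the carry
    return result

print(popcount_shift_carry(0b1011))
```

The loop runs once per bit position up to the highest set bit, so it takes at most 32 iterations for a uint32.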

When genasm blocks are at the top-level of a package, they are used to define functions in assembly code and need to specify the complete function signature.

While languages like C, C++, D, Rust, and Zig support inline assembly, Iroh lets you generate assembly code at compile-time using standard control structures such as if conditions and for loops.

This makes dealing with assembly code a lot easier, e.g. these 100 lines of code, ported from avo, generate the 1,500 lines of assembly code to support SHA-1 hashing:

import * from "asm"

genasm block(h *[5]uint32, m []byte) {

    w = asm.stack_alloc(64)
    w_val = (r) => w.offset((r % 16) * 4)

    asm.comment("Load initial hash.")
    hash = [GP32(), GP32(), GP32(), GP32(), GP32()]
    for i, r in hash {
        MOVL(h.offset(4*i), r)
    }

    asm.comment("Initialize registers.")
    a, b, c, d, e = GP32(), GP32(), GP32(), GP32(), GP32()
    for i, r in [a, b, c, d, e] {
        MOVL(hash[i], r)
    }

    steps = [
        (f: choose, k: 0x5a827999),
        (xor, 0x6ed9eba1),
        (majority, 0x8f1bbcdc),
        (xor, 0xca62c1d6),
    ]

    for r in 0..79 {
        asm.comment("Round ${r}.")
        s = steps[r/20]

        // Load message value.
        u = GP32()
        if r < 16 {
            MOVL(m.offset(4*r), u)
            BSWAPL(u)
        } else {
            MOVL(w_val(r-3), u)
            XORL(w_val(r-8), u)
            XORL(w_val(r-14), u)
            XORL(w_val(r-16), u)
            ROLL(U8(1), u)
        }
        MOVL(u, w_val(r))

        // Compute the next state register.
        t = GP32()
        MOVL(a, t)
        ROLL(U8(5), t)
        ADDL(s.f(b, c, d), t)
        ADDL(e, t)
        ADDL(U32(s.k), t)
        ADDL(u, t)

        // Update registers.
        ROLL(Imm(30), b)
        a, b, c, d, e = t, a, b, c, d
    }

    asm.comment("Final add.")
    for i, r in [a, b, c, d, e] {
        ADDL(r, hash[i])
    }

    asm.comment("Store results back.")
    for i, r in hash {
        MOVL(r, h.offset(4*i))
    }

    RET()

}

func choose(b, c, d Register) Register {
    r = GP32()
    MOVL(d, r)
    XORL(c, r)
    ANDL(b, r)
    XORL(d, r)
    return r
}

func majority(b, c, d Register) Register {
    t, r = GP32(), GP32()
    MOVL(b, t)
    ORL(c, t)
    ANDL(d, t)
    MOVL(b, r)
    ANDL(c, r)
    ORL(t, r)
    return r
}

func xor(b, c, d Register) Register {
    r = GP32()
    MOVL(b, r)
    XORL(c, r)
    XORL(d, r)
    return r
}
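
When generating assembly like this, it helps to have a slow but obviously correct model to test against. The Python sketch below follows the same round structure (a 16-word message window, four step functions with their constants, and the same boolean formulas as choose, majority and xor) and can be compared with hashlib. The sha1 padding wrapper is an addition for testing and has no counterpart in the listing above:

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def rol(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def choose(b, c, d):
    return ((d ^ c) & b) ^ d        # same as (b & c) | (~b & d)

def majority(b, c, d):
    return ((b | c) & d) | (b & c)

def xor(b, c, d):
    return b ^ c ^ d

STEPS = [
    (choose, 0x5A827999),
    (xor, 0x6ED9EBA1),
    (majority, 0x8F1BBCDC),
    (xor, 0xCA62C1D6),
]

def block(h, m):
    """Process one 64-byte block m, updating the 5-word state h in place."""
    w = [0] * 16  # rolling window over the message schedule
    a, b, c, d, e = h
    for r in range(80):
        f, k = STEPS[r // 20]
        if r < 16:
            u = struct.unpack_from(">I", m, 4 * r)[0]  # big-endian load
        else:
            u = rol(w[(r - 3) % 16] ^ w[(r - 8) % 16]
                    ^ w[(r - 14) % 16] ^ w[(r - 16) % 16], 1)
        w[r % 16] = u
        t = (rol(a, 5) + f(b, c, d) + e + k + u) & MASK
        a, b, c, d, e = t, a, rol(b, 30), c, d
    for i, v in enumerate((a, b, c, d, e)):
        h[i] = (h[i] + v) & MASK

def sha1(data):
    """Pad data per the SHA-1 rules and return the hex digest."""
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    padded = (data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
              + struct.pack(">Q", 8 * len(data)))
    for off in range(0, len(padded), 64):
        block(h, padded[off:off + 64])
    return struct.pack(">5I", *h).hex()

print(sha1(b"abc") == hashlib.sha1(b"abc").hexdigest())
```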

As assembly-generating code within Iroh can be packaged up and reused, we expect genasm to be more heavily used than inline assembly in other languages.

In particular, we expect to see genasm used in performance-critical code, e.g. to take advantage of specific SIMD instructions when the compiler can’t automatically vectorize code.

The asm support package takes inspiration from projects like PeachPy, AsmJit, and avo to enable programmatic assembly code generation, and takes care of some complex aspects, e.g.

  • Supporting unlimited virtual registers which are transparently mapped to physical registers.

  • Automatically taking care of correct memory offsets for complex data structures.

Initially, this package will support currently popular architectures:

  • x64, i.e. x86-64, the 64-bit version of Intel’s x86 architecture as developed by AMD.

  • arm64, i.e. AArch64, the 64-bit version of the ARM architecture.

Assembly generating code can match on $arch to generate different assembly code for different architectures:

match $arch {
    .x64:
        // x64-specific assembly code
    .arm64:
        // arm64-specific assembly code
    default:
        // everything else
}

Support will be added over time for instructions in newer versions of architectures, as well as other architectures as they gain adoption within the broader market.

To LLVM or Not

For many new languages, LLVM has been the go-to choice for compiler infrastructure. Rust. Julia. Swift. They all use LLVM to generate and optimize machine code.

This is with good reason. LLVM is hard to match. It has battle-tested code generation for multiple architectures, with decades of work on code optimization!

But, while LLVM is definitely an amazing piece of engineering, we will not be using it to build Iroh’s official compiler:

  • LLVM is slow moving. For almost a decade, the Rust team had to maintain and ship their own fork of LLVM as the official releases didn’t include the features and bug fixes that they needed.

  • It imposes a lot of cost on language designers, e.g. working around the constraints imposed by LLVM, working around the regressions introduced by new versions, etc.

  • LLVM was never built with developer productivity in mind. If you want your language to have features like fast compilation, LLVM is painful to work around.

Instead, we will be following in the footsteps of Go, and more recently Zig, and do the generation and optimization of machine code ourselves.

  • For starters, to support most use cases, we only need to support 6 platforms at the start: android-arm64, ios-arm64, linux-arm64, linux-x64, macos-arm64, and windows-x64.

  • While hardware speeds double roughly every 18 months, progress in compiler optimizations is much slower — you only get speed doublings every few decades.

  • After all, Frances Allen catalogued most of the big impact optimizations way back in 1971: inlining, unrolling, CSE, code motion, constant folding, DCE, and peephole optimization.

  • As evidenced by Go, even a simple compiler can produce reasonably fast code. Besides improving developer productivity, we’re also likely to have fewer miscompilations.

  • Most code in an executable is only run a few times. By providing our users with the ability to dynamically generate assembly, they’d be able to optimize hotspots better than most compilers.

So, while it’ll be a challenge to build a decent code generator, we believe the payoff will be well worth it.